SQL Server : data type to use in a table


Might be a silly question to ask, but what data type should I give a column so I can enter multiple values?
Example: I have two tables, one called Application_users and the other Products.
Application_Users has an id column.
What I want is to have a column in Products called Application_Users_id, in which I enter 1,2,3,4.
The idea is that if an Application_User_id is, say, 3, they would only see products where Products.Application_Users_ID contains a 3.
So what data type do I use so I can enter values such as 1,2,3,4 in a column?
I have tried NVARCHAR and INTEGER but neither work (NVARCHAR works but won't let me amend it e.g. add numbers).
Let me know what everyone thinks is the best approach here please.
Thanks
John

It might be a silly question, but you would be surprised how many developers make the very same mistake. It happens so often that I have a ready-to-paste comment to address it:
Read "Is storing a delimited list in a database column really that bad?", where you will see a lot of reasons why the answer to this question is Absolutely yes!
And if you actually go and read this link, you'll see that it's so wrong and so frequently used that Bill Karwin addressed it in the first chapter of his book, SQL Antipatterns: Avoiding the Pitfalls of Database Programming.
Having said that, SQL Server does support XML columns, in which you can store multiple values, but this is not a case where you would want to use them. XML columns are good for storing things like property bags, where, for example, you can't tell in advance which data types you'll have to deal with.
tl;dr; - So what should you do?
What you probably want is a many-to-many relationship between Application_users and Products.
The way to create a many-to-many relationship is to add another table that will be the "bridge" between the two tables. Let's call it Application_users_to_products.
This table will have only two columns, application_user_id and product_id. Each of them is a foreign key to its respective table, and the combination of both columns is the primary key of the bridge table.
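To make the bridge table concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the DDL shows the shape of the design rather than exact T-SQL. The name columns, the sample rows and the products_for_user helper are assumptions made up for the illustration; only the table names and the two bridge columns come from the answer above.

```python
import sqlite3

# SQLite stands in for SQL Server here; the table layout is the point.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE Application_users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE Products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Bridge table: one row per (user, product) pair the user may see.
CREATE TABLE Application_users_to_products (
    application_user_id INTEGER NOT NULL REFERENCES Application_users(id),
    product_id          INTEGER NOT NULL REFERENCES Products(id),
    PRIMARY KEY (application_user_id, product_id)
);
""")

# Hypothetical sample data: Widget is visible to users 1-4, Gadget to 1 and 2.
conn.executemany("INSERT INTO Application_users (id, name) VALUES (?, ?)",
                 [(1, "ann"), (2, "bob"), (3, "cy"), (4, "dee")])
conn.executemany("INSERT INTO Products (id, name) VALUES (?, ?)",
                 [(10, "Widget"), (20, "Gadget")])
conn.executemany("INSERT INTO Application_users_to_products VALUES (?, ?)",
                 [(1, 10), (2, 10), (3, 10), (4, 10), (1, 20), (2, 20)])


def products_for_user(user_id):
    """Return the names of the products linked to the given user."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM Products AS p
        JOIN Application_users_to_products AS up ON up.product_id = p.id
        WHERE up.application_user_id = ?
        ORDER BY p.name
        """,
        (user_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(products_for_user(3))
```

Granting or revoking access is then a plain INSERT or DELETE of one bridge row, instead of string surgery on a "1,2,3,4" value.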

Related

Data Modeling: Is it bad practice to store IDs from various sources in the same column?

I am attempting to merge data from various sources into an existing data model. Each source uses different types of IDs (such as GUID, Salesforce IDs, etc.). For example, if I were to merge data from two different sources, the table may look like the following (where the first two SalesPersonIDs are GUID IDs and the second two are Salesforce IDs):
Is this a bad practice? I could also imagine a table where each ID type was its own column and could be left blank if it was not applicable. Something like the following:
I apologize, I am a bit new to this. Thanks in advance for any insight, I greatly appreciate it!

The big roles of an ID column are to act as a key connecting data in different tables, and to help indexing - quickly find rows so your queries run fast.
The second solution wouldn't work well for these purposes, and will lead to big headaches in queries: every time you want to group by the ID, you'll have to combine the info from 2 columns in some way, hopefully getting a correct unique result every time.
On the one hand, all you might ever need from an ID is for it to be unique. The first solution might be fine in this respect, but are you sure you'll never, ever get data about one SalesPerson from more than one source?
I'd suggest keeping all the IDs in one column, and adding a column to say what kind of ID this is. At least this way, you won't lose any information and can do other things in the future.
One thing you might consider is making a separate table of SalesPerson with all their possible IDs, and have this keyed to other (Sales?) data by a unique ID used only in your database.
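A rough sketch of that last suggestion, combined with the "ID plus kind of ID" idea, could look like the following. It uses Python's sqlite3 module purely for illustration; the table name SalesPersonExternalId, the column names, the sample IDs and the find_sales_person helper are all hypothetical, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Internal key used only inside this database.
CREATE TABLE SalesPerson (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Every ID a source system knows the person by, tagged with its kind.
CREATE TABLE SalesPersonExternalId (
    source          TEXT NOT NULL,      -- e.g. 'guid' or 'salesforce'
    external_id     TEXT NOT NULL,
    sales_person_id INTEGER NOT NULL REFERENCES SalesPerson(id),
    PRIMARY KEY (source, external_id)
);
""")

# Made-up example: one person known under two different source IDs.
conn.execute("INSERT INTO SalesPerson (id, name) VALUES (1, 'Example Person')")
conn.executemany(
    "INSERT INTO SalesPersonExternalId VALUES (?, ?, ?)",
    [("guid", "00000000-0000-0000-0000-000000000001", 1),
     ("salesforce", "SF-EXAMPLE-1", 1)],
)


def find_sales_person(source, external_id):
    """Map a source-specific ID to the internal SalesPerson id, or None."""
    row = conn.execute(
        """
        SELECT sales_person_id
        FROM SalesPersonExternalId
        WHERE source = ? AND external_id = ?
        """,
        (source, external_id),
    ).fetchone()
    return row[0] if row else None


print(find_sales_person("salesforce", "SF-EXAMPLE-1"))
```

Other tables would then reference SalesPerson.id, so grouping by sales person never has to combine two ID columns.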

INVENTORY SYSTEM! Problems with tables

Hello guys! The picture attached is a screenshot of our current database design in MySQL Workbench. We presented it to our professor, who said that our schema was wrong, particularly in the product categories (Cakes, Cupcakes, Pies), because they should have been inside the products table.
Can you help me improve this kind of schema by adding more tables and not just 5?

Since your other tables (cakes, pies, etc.) don't add any new attributes, it's hard to envision why you need them. Wouldn't a row in cakes just be an exact duplicate of the row it references in products?
If you need to distinguish between product category, just add a column to your products table for the category.
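As a small sketch of the category column, assuming made-up product names and a fixed list of three categories (neither is visible in the question), and using Python's sqlite3 module instead of MySQL for brevity:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE products (
    id       INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    -- One column replaces the separate cakes / cupcakes / pies tables.
    category TEXT NOT NULL CHECK (category IN ('cake', 'cupcake', 'pie'))
)
""")
conn.executemany(
    "INSERT INTO products (name, category) VALUES (?, ?)",
    [("Chocolate cake", "cake"), ("Carrot cake", "cake"), ("Apple pie", "pie")],
)

# Per-category reporting becomes a plain GROUP BY on one table.
counts = dict(conn.execute(
    "SELECT category, COUNT(*) FROM products GROUP BY category"
))
print(counts)
```

If the list of categories is expected to grow, a separate categories lookup table referenced by a foreign key would be the usual alternative to the CHECK constraint.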
P.S. Please don't get into the habit of prefixing all your table names with "tbl." This is redundant, because it's obviously a table if you can query from it. Just skip the "tbl." Likewise don't encode the data type in your column names. What if you need to change the data type someday? It would break all your queries.
P.P.S. Don't use FLOAT for currency—use NUMERIC.
See https://twitter.com/billkarwin/status/347561901460447232
Also read What Every Computer Scientist Should Know About Floating-Point Arithmetic.
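The FLOAT problem is easy to reproduce outside the database. In this sketch Python's float (a binary floating-point double) plays the role of a FLOAT column and decimal.Decimal plays the role of NUMERIC; the prices are arbitrary.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.10 or 0.20 exactly,
# so the sum is not exactly 0.30.
float_total = 0.10 + 0.20
print(float_total)          # 0.30000000000000004
print(float_total == 0.30)  # False

# Exact decimal arithmetic, which is what NUMERIC/DECIMAL columns provide.
decimal_total = Decimal("0.10") + Decimal("0.20")
print(decimal_total)                     # 0.30
print(decimal_total == Decimal("0.30"))  # True
```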

Parent child design to easily identify child type

In our database design we have a couple of tables that describe different objects but which are of the same basic type. As describing the actual tables and what each column is doing would take a long time I'm going to try to simplify it by using a similar structured example based on a job database.
So say we have following tables:
These tables have no connections between each other but share identical columns. So the first step was to unify the identical columns and introduce a unique personId:
Now we have the "header" columns in person that are then linked to the more specific job tables using a 1 to 1 relation using the personId PK as the FK. In our use case a person can only ever have one job so the personId is also unique across the Taxi driver, Programmer and Construction worker tables.
While this structure works we now have the use case where in our application we get the personId and want to get the data of the respective job table. This gets us to the problem that we can't immediately know what kind of job the person with this personId is doing.
A few options we came up with to solve this issue:
Deal with it in the backend
This means just leaving the architecture as it is and looking for the right table in the backend code. This could mean looking through every table present and/or constructing a semi-complicated join select in which we have to sift through all columns to find the ones that are filled.
All in all: possible, but it means a lot of unnecessary selects. We would also like to keep such database-oriented logic in the actual database.
Using a Type Field
This means adding a Type column to the Person table, filled for example with numbers, to determine the correct child table like:
So you could add a 0 in Type if it's a taxi driver, a 1 if it's a programmer and so on...
While this greatly reduces the amount of backend logic, we then have to make sure that the numbers we use in the Type field are known in the backend and don't ever change.
Use separate IDs for each table
That means every job gets its own ID (has to be nullable) in Person like:
Now it's easy to find out which job each person has due to the others having an empty ID.
So my question is: which one of these designs is the best practice? Am I missing an obvious solution here?

Bill Karwin made a good explanation on a problem similar to this one. https://stackoverflow.com/a/695860/7451039

We've now decided to go with the second option because it seems to come with the fewest drawbacks, as described by the other commenters and posters. As there was no actual answer portraying the second option as a solution, I will try to summarize our reasoning:
Against Option 1:
There is no way to distinguish the type from looking at the parent table. As a result the backend would have to include all the logic, which includes scanning all tables for the one that contains the id. While you can compress most of the logic into a single big join select, it would still be a lot more logic than the other options need.
Against Option 3:
As @yuri-g said, this one is technically not possible, as the separate IDs could not be set up as primary keys. They would have to be nullable and as a result can't be indexed, essentially rendering the parent table useless, as one of the reasons for it was to have a unique personID across the tables.
Against a single table containing all columns:
For smaller use cases like the one I described in the question this might be viable, but we are talking about a bunch of tables, each having roughly 2-6 columns. This option would turn into a column mess really quickly.
Against a flat design with a key-value table:
Our properties have completely different data types, different constraints and foreign key relations. All of this would be difficult or impossible in this design.
Against custom database objects containing the child-specific properties:
While this option, which @Matthew McPeak suggested, might be viable for a lot of people, our database design never really used objects, so introducing them to the mix would likely cause more confusion than it would help us.
In favor of the second option:
This option is easy to use in our table oriented database structure, makes it easy to distinguish the proper child table and does not need a lot of reworking to introduce. Especially since we already have something similar to a Type table that we can easily use for this purpose.
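For readers who want to see the second option end to end, here is a minimal sketch using Python's sqlite3 module. The JobType table, all column names other than personId, and the job_details helper are assumptions for illustration; the real tables were not shown. The lookup takes two queries, one to read the type and one to read the matching child table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Lookup table so the meaning of each type number lives in the database.
CREATE TABLE JobType (
    typeId    INTEGER PRIMARY KEY,
    tableName TEXT NOT NULL UNIQUE
);
CREATE TABLE Person (
    personId INTEGER PRIMARY KEY,
    name     TEXT NOT NULL,
    typeId   INTEGER NOT NULL REFERENCES JobType(typeId)
);
-- Child tables share the parent's key (1:1 relation).
CREATE TABLE TaxiDriver (
    personId     INTEGER PRIMARY KEY REFERENCES Person(personId),
    licensePlate TEXT NOT NULL
);
CREATE TABLE Programmer (
    personId     INTEGER PRIMARY KEY REFERENCES Person(personId),
    mainLanguage TEXT NOT NULL
);
INSERT INTO JobType VALUES (0, 'TaxiDriver'), (1, 'Programmer');
INSERT INTO Person VALUES (1, 'Ann', 0), (2, 'Bob', 1);
INSERT INTO TaxiDriver VALUES (1, 'ABC-123');
INSERT INTO Programmer VALUES (2, 'SQL');
""")

# Table names cannot be bound as parameters, so check against a whitelist.
CHILD_TABLES = {"TaxiDriver", "Programmer"}


def job_details(person_id):
    """Return (child table name, row as dict) for a person, or None."""
    row = conn.execute(
        """
        SELECT jt.tableName
        FROM Person AS p
        JOIN JobType AS jt ON jt.typeId = p.typeId
        WHERE p.personId = ?
        """,
        (person_id,),
    ).fetchone()
    if row is None:
        return None
    table = row[0]
    if table not in CHILD_TABLES:
        raise ValueError(f"unexpected child table: {table}")
    cur = conn.execute(f"SELECT * FROM {table} WHERE personId = ?", (person_id,))
    data = cur.fetchone()
    if data is None:
        return None
    columns = [col[0] for col in cur.description]
    return table, dict(zip(columns, data))


print(job_details(2))
```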

The third option, as you describe it, is impossible: no RDBMS (at least none that I personally know about) would allow you to use NULLs in a PK (even a composite one).
Second is realistic.
And yes, first would take up to N queries to poll relatives in order to determine the actual type (where N is the number of types).
Although you won't get away with one query in the second case either: there would always be two of them, because you can't JOIN unless you know exactly what you should be joining.
So basically there are flaws in your design, and you should consider other options there.
Like denormalization: put the non-shared attributes into the parent table anyway, so that the fields are NULL for the types they don't apply to.
Or flexible, flat list of attribute-value pairs related through primary key (yes, schema enforcement is a trade-off).
Or switch to column-oriented DB: that's a case for it.

Does this database model make sense?

I am new to modelling databases, but I do my best to learn as much as possible on my own. Therefore I want to ask you whether my first attempt makes sense for the following example:
So I modeled the database as follows:
The database is about medicine. There are several medicine items which should be dosed depending on the age of the patient. Every medicine item can belong to one or more groups (or none).
This is just a test case to show what I learned so far. So every tip to improve my skills is welcome!
Thanks a lot!

The relationtable table name is just a placeholder, right? It should be more descriptive, maybe dosage?
Something tells me that age ranges will vary greatly. Some medicines have different rules for children under 3 years, others under 5, 10, and so on. Instead of creating a separate table, just include two extra columns (start and end) in relationtable. It will be much easier to query and I wouldn't consider this a denormalization.
Talking about age and dose tables - get rid of unit column and use normalized, fixed unit. Years for age and mg for doses. This will make querying much simpler. Don't be afraid to use floating numbers, e.g. 0.5 to represent six months.
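A sketch of the start/end idea, using Python's sqlite3 module. The table is called dosage here, the range columns are named age_from_years and age_to_years (END is a reserved word in SQL, so a bare end column would need quoting), and the medicine name, age ranges and dose values are invented numbers for the example, not medical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE medicine (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE dosage (
    medicine_id    INTEGER NOT NULL REFERENCES medicine(id),
    age_from_years REAL NOT NULL,   -- inclusive lower bound, always in years
    age_to_years   REAL NOT NULL,   -- exclusive upper bound, always in years
    dose_mg        REAL NOT NULL,   -- always in mg, so no unit column
    CHECK (age_from_years < age_to_years)
);
INSERT INTO medicine VALUES (1, 'ExampleMed');
-- Invented ranges: 0.5 represents six months.
INSERT INTO dosage VALUES (1, 0.5, 3, 50), (1, 3, 12, 100), (1, 12, 150, 200);
""")


def dose_for(medicine_id, age_years):
    """Return the dose in mg for a patient of the given age, or None."""
    row = conn.execute(
        """
        SELECT dose_mg
        FROM dosage
        WHERE medicine_id = ?
          AND age_from_years <= ?
          AND ? < age_to_years
        """,
        (medicine_id, age_years, age_years),
    ).fetchone()
    return row[0] if row else None


print(dose_for(1, 4))
```

Because every row uses the same units, the lookup is a single range comparison with no unit conversion.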

I agree with what Tomasz wrote and would like to add:
Whether the relationtable is the correct way to go depends on some knowledge not contained in the table. It sounds strange that one medicine can be part of different groups and that the dosage depends on that relation. I would expect that a medicine can belong to different groups (resulting in a medicine2group mapping table) and that there exist different dosages depending on the age for a medicine (so you get a dosage4age table, combining the existing age and dose tables. That new table would directly reference the medicine).
Which version is correct can not be told from the table alone.
As a rule of thumb: I get skeptical when a table without a proper name and concept links more than two other tables. It is possible, but it often hints at a concept hiding somewhere.
In order to check if the proposed model is correct, ask the business experts if the table is still correct if you replace Antibiotika with Superantibiotika in one of the first three rows. If it is, this means that the dosage does not depend on the group and should not be linked to it, so the model proposed by me would be more correct.
If the altered table is not correct, your model might be the better one, but I would listen carefully to the explanation of why it isn't correct.
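The alternative model described above can be sketched as follows, again with Python's sqlite3 module. The table names medicine2group and dosage4age come from this answer; med_group (GROUP is a reserved word), the column names and the sample values are assumptions. The last lines mimic the Antibiotika to Superantibiotika check: in this model, regrouping a medicine leaves its dosages untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE medicine  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE med_group (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- Pure many-to-many mapping: which medicine belongs to which groups.
CREATE TABLE medicine2group (
    medicine_id INTEGER NOT NULL REFERENCES medicine(id),
    group_id    INTEGER NOT NULL REFERENCES med_group(id),
    PRIMARY KEY (medicine_id, group_id)
);
-- Dosage depends only on medicine and age, not on the group.
CREATE TABLE dosage4age (
    medicine_id    INTEGER NOT NULL REFERENCES medicine(id),
    age_from_years REAL NOT NULL,
    age_to_years   REAL NOT NULL,
    dose_mg        REAL NOT NULL,
    PRIMARY KEY (medicine_id, age_from_years)
);
INSERT INTO medicine VALUES (1, 'ExampleMed');
INSERT INTO med_group VALUES (1, 'Antibiotika'), (2, 'Superantibiotika');
INSERT INTO medicine2group VALUES (1, 1);
INSERT INTO dosage4age VALUES (1, 0, 12, 100), (1, 12, 150, 200);
""")

dosages_before = conn.execute(
    "SELECT * FROM dosage4age ORDER BY age_from_years").fetchall()

# Move the medicine to another group; no dosage row needs to change.
conn.execute(
    "UPDATE medicine2group SET group_id = 2 WHERE medicine_id = 1 AND group_id = 1")

dosages_after = conn.execute(
    "SELECT * FROM dosage4age ORDER BY age_from_years").fetchall()
groups = [name for (name,) in conn.execute(
    """
    SELECT g.name
    FROM med_group AS g
    JOIN medicine2group AS mg ON mg.group_id = g.id
    WHERE mg.medicine_id = 1
    """)]
print(groups, dosages_before == dosages_after)
```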

Table "Inheritance" in SQL Server

I am currently looking at restructuring our contact management database and I wanted to hear people's opinions on solving the problem of a number of contact types having shared attributes.
Basically we have 6 contact types, which include Person, Company and Position @ Company.
In the current structure all of these have an address. However, in the address table you must store their type in order to join to the contact.
This consistent requirement to join on contact type gets frustrating after a while.
Today I stumbled across a post discussing "Table Inheritance" (http://www.sqlteam.com/article/implementing-table-inheritance-in-sql-server).
Basically you have a parent table and a number of sub tables (in this case one per contact type). From there you enforce integrity so that a sub table row must have a master equivalent where its type is defined.
The way I see it, by this method I would no longer need to store the type in tables like address, as the id is unique across all types.
I just wanted to know if anybody had any feelings on this method, whether it is a good way to go, or perhaps alternatives?
I'm using SQL Server 05 & 08 should that make any difference.
Thanks
Ed

I designed a database just like the link you provided suggests. The case was to store the data for many different technical reports. The number of report types is undefined and will probably grow to about 40 different types.
I created one master report table, that has an autoincrement primary key. That table contains all common information like customer, testsite, equipmentid, date etc.
Then I have one table for each report type that contains the specific information relating to that report type. That table has the same primary key as the master and references the master as well.
My reason for splitting this into different tables with a 1:1 relation (which normally would be a no-no) was to avoid getting one single table with a huge number of columns, which gets very difficult to maintain as you're constantly adding columns.
My design with table inheritance gave me segmented data and expandability without being difficult to maintain. The only thing I had to do was to write a special save method to handle writing to two tables automatically. So far I'm very happy with the design and haven't really found any drawbacks, except for a slightly more complicated save method.
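The answer does not show its schema or save method, so the following is only a guess at the general shape, written with Python's sqlite3 module rather than SQL Server. The report table stands for the master; pressure_test_report, its max_pressure_bar column and the save_pressure_test function are hypothetical names for one report type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Master table: columns common to every report type.
CREATE TABLE report (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    report_type TEXT NOT NULL,
    customer    TEXT NOT NULL,
    report_date TEXT NOT NULL
);
-- One table per report type; its PK is also an FK to the master (1:1).
CREATE TABLE pressure_test_report (
    id               INTEGER PRIMARY KEY REFERENCES report(id),
    max_pressure_bar REAL NOT NULL
);
""")


def save_pressure_test(customer, report_date, max_pressure_bar):
    """Write the master row and the type-specific row in one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        cur = conn.execute(
            "INSERT INTO report (report_type, customer, report_date) "
            "VALUES ('pressure_test', ?, ?)",
            (customer, report_date),
        )
        report_id = cur.lastrowid
        conn.execute(
            "INSERT INTO pressure_test_report (id, max_pressure_bar) "
            "VALUES (?, ?)",
            (report_id, max_pressure_bar),
        )
    return report_id


first_id = save_pressure_test("Example Customer", "2024-01-15", 12.5)
print(first_id)
```

Wrapping both inserts in one transaction means a failed subtype insert cannot leave an orphaned master row behind.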

Google on "gen-spec relational modeling". You'll find a lot of articles discussing exactly this pattern. Some of them focus on table design, while others focus on an object oriented approach.
Table inheritance pops up in a few of them.

I know this won't help much now, but initially it may have been better to have an Entity table rather than 6 different contact types. Then each Entity could have as many addresses as necessary and there would be no need for type in the join.

You'll still have the problem that if you want the sub-type fields and you have only the master contact, you'll have to know what table to go looking at - or else join to all of them. But otherwise this is a workable solution to a common problem.
Another possibility (fairly similar in structure, but different in how you think of it) is to simply put all your contacts into one table. Then for the more specific fields (birthday, say, for people and department for Position @ Company) create separate tables that are associated with that contact.
Contact Table
--------------
Name
Phone Number

Address Table
-------------
Street / state, etc
ContactId

ContactBirthday Table
--------------
Birthday
ContactId

Departments Table
-----------------
Department
ContactId
It requires a different way of thinking of things though - instead of thinking of people vs. companies, you think of the various functional requirements for the task at hand - if you want to send out birthday cards, get all the contacts that have birthdays associated with them, etc..
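The birthday-card example can be sketched like this with Python's sqlite3 module. The ContactId key on the Contact table is an assumption (the outline above only lists Name and Phone Number), as are the sample rows; the Address table is left out to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Contact (
    ContactId   INTEGER PRIMARY KEY,   -- assumed key, implied by the other tables
    Name        TEXT NOT NULL,
    PhoneNumber TEXT
);
-- Optional facts live in their own tables, keyed by ContactId.
CREATE TABLE ContactBirthday (
    ContactId INTEGER PRIMARY KEY REFERENCES Contact(ContactId),
    Birthday  TEXT NOT NULL
);
CREATE TABLE Departments (
    ContactId  INTEGER PRIMARY KEY REFERENCES Contact(ContactId),
    Department TEXT NOT NULL
);
INSERT INTO Contact VALUES (1, 'Ann Person', '555-0100'),
                           (2, 'Example Corp', '555-0101');
INSERT INTO ContactBirthday VALUES (1, '1980-05-17');
INSERT INTO Departments VALUES (2, 'Sales');
""")

# "Send birthday cards": every contact that has a birthday, whatever its type.
birthday_contacts = conn.execute(
    """
    SELECT c.Name, b.Birthday
    FROM Contact AS c
    JOIN ContactBirthday AS b ON b.ContactId = c.ContactId
    ORDER BY c.Name
    """
).fetchall()
print(birthday_contacts)
```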

I'm going to go out on a limb here and suggest you should rethink your normalization strategy (as you seem to be lucky enough to be able to rethink your schema quite fundamentally). If you typically store an address for each contact, then your contact table should have the address fields in it. Alternatively if the address is stored per company then the address should be stored in the company table and your contacts linked to that company.
If your contacts only have one address, or one (or even 3, just not 'many') instance of the other fields, think about rationalizing them into a single table. In my experience having a few null fields is a far better alternative than needing left joins to data you aren't sure exists.
Fortunately for anyone who vehemently disagrees with me you did ask for opinions! :) IMHO you should only normalize when you really need to. Where you are rethinking schemas, denormalization should be considered at every opportunity.

When you have a 7th type, you'll have to create another table.

I'm going to try this approach. Yes, you have to create new tables when you have a new type, but since this table will probably have different columns, you'll end up doing this anyway if you don't use this scheme.
If the tables that inherit from the master don't differ much from one another, I'd recommend you try another approach.

May I suggest that we just add a Type table, i.e. a person has an address, name, etc. Then, as each use case (student, teacher) presents itself, a PersonType table links an entry in the person table to n types, and new tables (teacher, alien, singer) are added as the system evolves...
