Multiple elements in one database cell - database

The question is how database design should I apply for this situation:
main table:
ID | name | number_of_parameters | parameters
parameters table:
parameter | parameter | parameter
Number of elements in parameters table does not change. number_of_parameters cell defines how many parameters tables should be stored in next cell.
I have problems to move from object thinking to database design. So when we talk about object one row has as much parameters as number_of_parameters says.
I hope that description of requirements is clear. What is the correct way to design such database. If someone can provide some SQL statments to obtain it it would be nice. But the main goal of this question is to understand how to make such architecture.
I want to use SQLite to create this database.

The relational way is to have two tables. The main table has an ID, name and as many other universally-present parameters as possible. The parameters table contains a mapping from an ID in the main table to a parameter name and a parameter value; the main table ID should be a foreign key, and the combination of ID and name should be unique.
The number of parameters can be found by just counting the number of rows with a particular ID.

If you can serialize the data whiile saving to the database and deserialize it back when you get the record it will work. You can get total number of objects in serialized container and save the count to the number_of_parameters field and serialized data in parameters field.

There isn't one perfect correct way, but if you want to use a relational database, you preferably have relational tables.
If you have a key-value database, you place your serialized data as a document attached to your key.
If you want a hybrid solution, both human editable and single table, you can serialize your data to a human-readable format such as yaml, which sees heavy usage in configuration sections of open source projects.

Related

Need help in a MS SQL DB design

Hi I have 2 type of data entry which needs to be stored in db so it can be used for calculations later. Each entry has a unique id for it. The data entry are -
1.
2.
So I have to save this data in DB. With my understanding I thought of the following -
Create 3 tables - Common, Entry1 and Entry2(multiple tables with unique id as name)
The Common table will have a unique entry of each data and which table refer to for the value (Entry1/Entry2).
The Entry1 data is a single line so it can be inserted. But the Entry2 data will require a complete table because of its structure. So whenever we add a type 2 entry then a new table has to be created, which will create a lot of tables.
Or I could save the type2 values in another database and fetch the values from there. So please suggest me a way which is better than this.
I believe that you have 2 entry types with identical structure, but one containing a single row and one containing many.
In this case, I would suggest a single table containing the data for all entries, wtih a second table grouping them together. Even if your input contains a single row, it should still gain an EntryID. Perhaps something like the below:

Is it a good idea to create a db with a generic table entity that can be decorated with a role and metadatas?

I've been thinking about creating a database that, instead of having a table per object I want to represent, would have a series of generic tables that would allow me to represent anything I want and even modifying (that's actually my main interest) the data associated with any kind of object I represent.
As an example, let's say I'm creating a web application that would let people make appointments with hairdressers. What I would usually do is having the following tables in my database :
clients
hairdressers: FK: id of the company the hairdresser works for
companies
appointments: FK: id of the client and the hairdresser for that appointment
But what happens if we deal with scientific hairdressers that want to associate more data to an appointment (e.g. quantity of shampoo used, grams of hair cut, number of scissor's strokes,...) ?
I was thinking instead of that, I could use the following tables:
entity: represents anything I want. PK(entity_id)
group: is an entity (when I create a group, I first create an entity which
id is then referred to by the FK of the group). PK(group_id), FK(entity_id)
entity_group: each group can contain multiple entity (thus also other groups): PK(entity_id, group_id).
role: e.g. Administrator, Client, HairDresser, Company. PK(role_id)
entity_role: each entity can have multiple roles: PK(entity_id, role_id)
metadata: contains the name and type of the metadata aswell as the associated role and a flag that describes if its mandatory or not. PK(metadata_id), FK(metadata_type_id, role_id)
metadata_type: contains information about available metadata types. PK(metadata_type_id)
metadata_value: PK(metadata_value_id), FK(metadata_id)
metadata_: different tables for the different types e.g. char, text, integer, double, datetime, date. PK(metadata__id), FK(metadata_value_id) which contain the actual value of a metadata associated with an entity.
entity_metadata: contains data associated with an entity e.g. name of a client, address of a company,... PK(entity_id, metadata_value_id). Using the type of the metadata, its possible to select the actual value of a metadata for this entity in the corresponding table.
This would allow me to have a completely flexible data structure but has a few drawbacks:
Selecting the metadatas associated with an entity returns multiple rows that I have to process in my code to create the representation of the entity in my code.
Selecting metadatas of multiple entities requires to loop over the same process as above.
Selecting metadatas will also require me to do a select for each one of the metadata_* table that I have.
On the other hand, it has some advantages. For example, instead of having a client table with a lot of fields that will almost never be filled, I just use the exact number of rows that I need.
Is this a good idea at all?
I hope that I've expressed clearly what I'm trying to achieve. I guess that I'm not the first one who wonders how to achieve that but I was not able to find the right keywords to find an answer to that question :/
Thanks!

Database design query

I'm trying to work out a sensible approach for designing a database where I need to store a wide range of continuously changing information about pets. The categories of data can be broken down into, for example, behaviour, illness etc. Data will be submitted on a regular basis relating to these categories, so i need to find a good way to design the db to efficiently accommodate this. A simple approach would just to store multiple records for each pet within each relevant table - e.g the behaviour table would store the behaviour data and would simply have a timestamp for each record along with the identifier for that pet. When querying the db, it would be straightforward to query the one table with the pet id, using the timestamps to output the correct history of submissions. Is there a more sensible way around this or does that make sense?
I would use a combination of lookup tables with a strong use of foreign keys. I think what you are suggesting is very common. For example, get me all the reported illnesses for a particluar pet during this data range would look something like:
Select *
from table_illness
where table_illness.pet_id = <value>
and date between table_illness.start_date and table_illness.finish_date
You could do that for any of the tables. The lookup tables will be a link between, for example, table_illness.illness_type and illness_types.illness_type. The illness_types table is where you would store the details on the types of illnesses.
When designing a database you should build your tables to mimic real-life objects or concepts. So in that sense the design you suggest makes sense. Each pet should have its own record in a pet table which doesn't change. Changing information should then be placed into the appropriate table which has the pet's id. The time stamp method you suggest is probably what I would do -- unless of course this is for a vet or something. Then I'd create an appointment table with the date and connect the illness or behavior to the appointment as well.

Is using multiple tables an advisable solution to dealing with user defined fields?

I am looking at a problem which would involve users uploading lists of records with various field structures into an application. The 2nd part of this would be to also allow the users to specify fields to capture information.
This is a step beyond anything ive done up to this point where i would have designed a static RDMS structure myself. In some respects all records will be treated the same so there will be some common fields required for each. Almost all queries will be run on these common fields.
My first thought would be to dynamically generate a new table for each import and another for each data capture field spec.Then have a master table with a guid for every record in the application along with the common fields and then fields that specify the name of the table the data was imported to and name of table with the data capture fields.
Further information (metadata?) about the fields in the dynamically generated tables could be stored in xml or in a 'property' table.
This would mean as users log into the application i would be dynamically choosing which table of data to presented to the user, and there would be a large number of tables in the database if it was say not only multiuser but then multitennant.
My question is are there other methods to solving this kind of varaible field issue, im i going down an unadvised path here?
I believe that EAV would require me to have a table defining the fields for each import / data capture spec and then another table with the import - field - values data and that seems impracticle.
I hate storing XML in the database, but this is a perfect example of when it makes sense. Store the user imports in XML initially. As your data schema matures, you can later decide which tables to persist for your larger clients. When the users pick which fields they want to query, that's when you come back and build a solid schema.
What kind is each field? Could the type of field be different for each record?
I am working on a program now that does this sorta and the way we handle it is basically a record table which points to a recordfield table. the recordfield table contains all of the fields along with the field name of the actual field in the database(the column name). We then have a recorddata table which is where all the data goes for each record. We also store a record_id telling it which record it is holding.
This is how we do it where if each column for the record is the same type, then we don't need to add new columns to the table, and if it has more fields or fields of a different type, then we add fields as appropriate to the data table.
I think this is what you are talking about.. correct me if I'm wrong.
I think that one additional table for each type of user defined field for the table that the user can add the fields to is a good way to go.
Say you load your records into user_records(id), that table would have an id column which is a foreign key in the user defined fields tables.
user defined string fields would go in user_records_string(id, name), where id is a foreign key to user_records(id), and name is a string, or a foreign key to a list of user defined string fields.
Searching on them requires joining them in to the base table, probably with a sub-select to filter down to one field based on the user meta-data, so that the right field can be added to the query.
To simulate the user creating multiple tables, you can have a foreign key in the user_records table that points at a table list, and filter on that when querying for a single table.
This would allow your schema to be static while allowing the user to arbitrarily add fields and tables.

How do you manage "pick lists" in a database

I have an application with multiple "pick list" entities, such as used to populate choices of dropdown selection boxes. These entities need to be stored in the database. How do one persist these entities in the database?
Should I create a new table for each pick list? Is there a better solution?
In the past I've created a table that has the Name of the list and the acceptable values, then queried it to display the list. I also include a underlying value, so you can return a display value for the list, and a bound value that may be much uglier (a small int for normalized data, for instance)
CREATE TABLE PickList(
ListName varchar(15),
Value varchar(15),
Display varchar(15),
Primary Key (ListName, Display)
)
You could also add a sortOrder field if you want to manually define the order to display them in.
It depends on various things:
if they are immutable and non relational (think "names of US States") an argument could be made that they should not be in the database at all: after all they are simply formatting of something simpler (like the two character code assigned). This has the added advantage that you don't need a round trip to the db to fetch something that never changes in order to populate the combo box.
You can then use an Enum in code and a constraint in the DB. In case of localized display, so you need a different formatting for each culture, then you can use XML files or other resources to store the literals.
if they are relational (think "states - capitals") I am not very convinced either way... but lately I've been using XML files, database constraints and javascript to populate. It works quite well and it's easy on the DB.
if they are not read-only but rarely change (i.e. typically cannot be changed by the end user but only by some editor or daily batch), then I would still consider the opportunity of not storing them in the DB... it would depend on the particular case.
in other cases, storing in the DB is the way (think of the tags of StackOverflow... they are "lookup" but can also be changed by the end user) -- possibly with some caching if needed. It requires some careful locking, but it would work well enough.
Well, you could do something like this:
PickListContent
IdList IdPick Text
1 1 Apples
1 2 Oranges
1 3 Pears
2 1 Dogs
2 2 Cats
and optionally..
PickList
Id Description
1 Fruit
2 Pets
I've found that creating individual tables is the best idea.
I've been down the road of trying to create one master table of all pick lists and then filtering out based on type. While it works, it has invariably created headaches down the line. For example you may find that something you presumed to be a simple pick list is not so simple and requires an extra field, do you now split this data into an additional table or extend you master list?
From a database perspective, having individual tables makes it much easier to manage your relational integrity and it makes it easier to interpret the data in the database when you're not using the application
We have followed the pattern of a new table for each pick list. For example:
Table FRUIT has columns ID, NAME, and DESCRIPTION.
Values might include:
15000, Apple, Red fruit
15001, Banana, yellow and yummy
...
If you have a need to reference FRUIT in another table, you would call the column FRUIT_ID and reference the ID value of the row in the FRUIT table.
Create one table for lists and one table for list_options.
# Put in the name of the list
insert into lists (id, name) values (1, "Country in North America");
# Put in the values of the list
insert into list_options (id, list_id, value_text) values
(1, 1, "Canada"),
(2, 1, "United States of America"),
(3, 1, "Mexico");
To answer the second question first: yes, I would create a separate table for each pick list in most cases. Especially if they are for completely different types of values (e.g. states and cities). The general table format I use is as follows:
id - identity or UUID field (I actually call the field xxx_id where xxx is the name of the table).
name - display name of the item
display_order - small int of order to display. Default this value to something greater than 1
If you want you could add a separate 'value' field but I just usually use the id field as the select box value.
I generally use a select that orders first by display order, then by name, so you can order something alphabetically while still adding your own exceptions. For example, let's say you have a list of countries that you want in alpha order but have the US first and Canada second you could say "SELECT id, name FROM theTable ORDER BY display_order, name" and set the display_order value for the US as 1, Canada as 2 and all other countries as 9.
You can get fancier, such as having an 'active' flag so you can activate or deactivate options, or setting a 'x_type' field so you can group options, description column for use in tooltips, etc. But the basic table works well for most circumstances.
Two tables. If you try to cram everything into one table then you break normalization (if you care about that). Here are examples:
LIST
---------------
LIST_ID (PK)
NAME
DESCR
LIST_OPTION
----------------------------
LIST_OPTION_ID (PK)
LIST_ID (FK)
OPTION_NAME
OPTION_VALUE
MANUAL_SORT
The list table simply describes a pick list. The list_ option table describes each option in a given list. So your queries will always start with knowing which pick list you'd like to populate (either by name or ID) which you join to the list_ option table to pull all the options. The manual_sort column is there just in case you want to enforce a particular order other than by name or value. (BTW, whenever I try to post the words "list" and "option" connected with an underscore, the preview window goes a little wacky. That's why I put a space there.)
The query would look something like:
select
b.option_name,
b.option_value
from
list a,
list_option b
where
a.name="States"
and
a.list_id = b.list_id
order by
b.manual_sort asc
You'll also want to create an index on list.name if you think you'll ever use it in a where clause. The pk and fk columns will typically automatically be indexed.
And please don't create a new table for each pick list unless you're putting in "relationally relevant" data that will be used elsewhere by the app. You'd be circumventing exactly the relational functionality that a database provides. You'd be better off statically defining pick lists as constants somewhere in a base class or a properties file (your choice on how to model the name-value pair).
Depending on your needs, you can just have an options table that has a list identifier and a list value as the primary key.
select optionDesc from Options where 'MyList' = optionList
You can then extend it with an order column, etc. If you have an ID field, that is how you can reference your answers back... of if it is often changing, you can just copy the answer value to the answer table.
If you don't mind using strings for the actual values, you can simply give each list a different list_id in value and populate a single table with :
item_id: int
list_id: int
text: varchar(50)
Seems easiest unless you need multiple things per list item
We actually created entities to handle simple pick lists. We created a Lookup table, that holds all the available pick lists, and a LookupValue table that contains all the name/value records for the Lookup.
Works great for us when we need it to be simple.
I've done this in two different ways:
1) unique tables per list
2) a master table for the list, with views to give specific ones
I tend to prefer the initial option as it makes updating lists easier (at least in my opinion).
Try turning the question around. Why do you need to pull it from the database? Isn't the data part of your model but you really want to persist it in the database? You could use an OR mapper like linq2sql or nhibernate (assuming you're in the .net world) or depending on the data you could store it manually in a table each - there are situations where it would make good sense to put it all in the same table but do consider this only if you feel it makes really good sense. Normally putting different data in different tables makes it a lot easier to (later) understand what is going on.
There are several approaches here.
1) Create one table per pick list. Each of the tables would have the ID and Name columns; the value that was picked by the user would be stored based on the ID of the item that was selected.
2) Create a single table with all pick lists. Columns: ID; list ID (or list type); Name. When you need to populate a list, do a query "select all items where list ID = ...". Advantage of this approach: really easy to add pick lists; disadvantage: a little more difficult to write group-by style queries (for example, give me the number of records that picked value X".
I personally prefer option 1, it seems "cleaner" to me.
You can use either a separate table for each (my preferred), or a common picklist table that has a type column you can use to filter on from your application. I'm not sure that one has a great benefit over the other generally speaking.
If you have more than 25 or so, organizationally it might be easier to use the single table solution so you don't have several picklist tables cluttering up your database.
Performance might be a hair better using separate tables for each if your lists are very long, but this is probably negligible provided your indexes and such are set up properly.
I like using separate tables so that if something changes in a picklist - it needs and additional attribute for instance - you can change just that picklist table with little effect on the rest of your schema. In the single table solution, you will either have to denormalize your picklist data, pull that picklist out into a separate table, etc. Constraints are also easier to enforce in the separate table solution.
This has served us well:
SQL> desc aux_values;
Name Type
----------------------------------------- ------------
VARIABLE_ID VARCHAR2(20)
VALUE_SEQ NUMBER
DESCRIPTION VARCHAR2(80)
INTEGER_VALUE NUMBER
CHAR_VALUE VARCHAR2(40)
FLOAT_VALUE FLOAT(126)
ACTIVE_FLAG VARCHAR2(1)
The "Variable ID" indicates the kind of data, like "Customer Status" or "Defect Code" or whatever you need. Then you have several entries, each one with the appropriate data type column filled in. So for a status, you'd have several entries with the "CHAR_VALUE" filled in.

Resources