I'm designing a DB and would like to know the best way to store this:
I have a user table, and let's say each user can have 100 item slots, with an item ID stored in each slot.
Should I use JSON ({slot1:123232,slot20:123123123,slot82:23123}) or create 100+ fields (slot1, slot2, ..., slotN)?
Thank you.
Third alternative: create another table for slots and have a one-to-many relationship between users and slots. Your business logic would enforce the 100-slot limit.
I would recommend against embedding the JSON in the database. I'm not sure which DB you are using, but it will likely be very difficult to query the actual slot data for a given user without pulling out and parsing the entire JSON blob of all 100 slots.
To create a one-to-many relationship, you'll need a second table
Slots
id (primary key)
user_id (mapping to user table)
item_id (the item ID you want to store in that slot)
Now, you can do useful SQL queries like
SELECT * FROM Users JOIN Slots ON Slots.user_id = Users.id WHERE Slots.item_id = 12345
which would give you a list of all users who have item #12345 in one of their slots.
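A minimal sketch of that table (the column types are assumptions; adjust for your database, and this assumes Users has an integer id primary key):

CREATE TABLE Slots (
    id      INT PRIMARY KEY,                    -- surrogate key for the slot row
    user_id INT NOT NULL REFERENCES Users(id),  -- which user owns this slot
    item_id INT NOT NULL                        -- the item stored in the slot
);

If you also need to remember which of the 100 slot positions the item sits in, a slot_number column (again, just a suggested name) would go in this table as well.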
Under database normalization rules, you should not have multivalued attributes.
You might want something like this:
Users
=====
UserId
UserSlots
=========
UserId
SlotId
Value
Slots
=====
SlotId
Value
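A rough DDL sketch of that structure (column types are assumptions):

CREATE TABLE Users (
    UserId INT PRIMARY KEY
);

CREATE TABLE Slots (
    SlotId INT PRIMARY KEY,
    Value  VARCHAR(50)                  -- describes the slot itself
);

CREATE TABLE UserSlots (
    UserId INT NOT NULL REFERENCES Users(UserId),
    SlotId INT NOT NULL REFERENCES Slots(SlotId),
    Value  INT,                         -- the item ID held in that slot
    PRIMARY KEY (UserId, SlotId)        -- each user fills a given slot at most once
);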
You should not create 100 fields.
Create a table with two fields, the ID and your "JSON Data", which you could store in a string or other field type depending on size.
You could normalize it as others have suggested, but that would increase your save and retrieve time.
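A minimal sketch of that two-field table; the table and column names are just placeholders, and you'd swap the TEXT type for a CLOB/VARCHAR/JSON type depending on your database and the expected size:

CREATE TABLE UserSlotData (
    user_id   INT PRIMARY KEY,   -- the user's ID
    slot_json TEXT               -- e.g. '{"slot1":123232,"slot20":123123123,"slot82":23123}'
);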
Related
We have two tables. One is a normal users table where we store the normal user information; we can call this the base table. The other is a user_staff table, a separate table that uses user_id as a reference back to the users table.
Effectively, users and user_staff hold almost the same set of data, but we keep them in different tables since the users table holds master data and the staff table holds user_staff data.
While indexing this data in Solr, we would like to keep all of these users and user_staff records in a single collection, adding an extra field to each document to indicate the type (master, sub-account, user). But if we use a single collection, how can we keep the connection from user_staff to the users table in Solr the way we do in MySQL?
I'm creating a rather large APEX application which allows managers to go in and record statistics for associates in the company. Currently we have a database in Oracle with data from AD which holds all the associates' information: name, manager, employee ID, etc.
Now I'm responsible for creating and modeling a table that will house all the stats for each employee. The table I have created has over 90 columns. Some contain data such as:
Documents Processed
Calls Received
Amount of Doc 1 Processed
Amount of Doc 2 Processed
and the list goes on for well over 90 attributes. So here is my question:
When creating this table in my application with so many different columns, how would I go about choosing a primary key that's appropriate? Should I link it to our employee table using the employee's identification, which is unique (each has an associate number)?
Secondly, how can I create these tables (and possibly form) to allow me to associate the statistic I am entering for an individual to the actual individual?
I have ordered two books from Amazon on data modeling since I am new to APEX and DBA design. Not a fresh chicken, but new enough to need some guidance. An additional problem I am running into is that each form can have only 60 fields, so I had thought about splitting my 90+ columns into separate tables by function.
Thanks
APEX 4.2 allows for 200 items per page. See the Oracle APEX component limits documentation.
A couple of questions come to mind:
Are you sure that the employee IDs are not recyclable? If these IDs are unique and not recycled, you've found yourself a good primary key.
What do you plan on doing when you decide to add a new metric? Seems like you might have to add a new column to your rather large and likely not normalized table.
I'd recommend a vertical table for your metrics; you can use Oracle's PIVOT function to make your data appear more like a horizontal table.
If you went this route, you would store your employee ID in one column, your metric key in another, and the value in a third.
I'd recommend that you create a metric table consisting of a primary key, a metric label, an active indicator, a creation timestamp, a creation user ID, a modified timestamp, and a modified user ID.
This metric table will allow you to add new metrics, change the name of the metric, deactivate a metric, and determine who changed what and when.
This would be a much more flexible approach in my opinion. You may also want to think about audit logs.
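A rough sketch of that vertical layout and the pivot back to a horizontal view; all of the table and column names here are assumptions:

CREATE TABLE metric (
    metric_id    NUMBER PRIMARY KEY,
    metric_label VARCHAR2(100) NOT NULL,
    active_ind   CHAR(1) DEFAULT 'Y',
    created_ts   TIMESTAMP,
    created_by   VARCHAR2(30),
    modified_ts  TIMESTAMP,
    modified_by  VARCHAR2(30)
);

CREATE TABLE employee_metric (
    employee_id  NUMBER NOT NULL,    -- the associate number from your employee table
    metric_id    NUMBER NOT NULL REFERENCES metric(metric_id),
    metric_value NUMBER,
    PRIMARY KEY (employee_id, metric_id)
);

-- Pivot a couple of metrics back into columns (metric IDs 1 and 2 are made up):
SELECT *
FROM (SELECT employee_id, metric_id, metric_value FROM employee_metric)
PIVOT (MAX(metric_value) FOR metric_id IN (1 AS docs_processed, 2 AS calls_received));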
Are tables with lots of columns indicative of bad design? For example say I have the following table that stores user information and user settings:
[Users table]
userId
name
address
somesetting1
...
somesetting50
As the site requires more settings, the table gets larger. In my mind this table is normalized: all the settings depend on the userId.
I have a thing against tables with lots of columns; it just seems wrong to me. But then I remembered that you can select which columns to return from the table, so if the table is large I could still break it into several different objects in code. For example:
[User object]
[UserSetting object]
and return only the data to fill those objects.
Is the above common practice, or are there other techniques that deal with tables with lots of columns that are more suitable to use?
I think you should use multiple tables like this:
[Users table]
userId
name
address
[Settings table]
settingId
userId
settingKey
settingValue
The tables are related by the userId column, which you can use to retrieve the settings for the user you need.
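A minimal sketch of that layout and the lookup (column types are assumptions):

CREATE TABLE Users (
    userId  INT PRIMARY KEY,
    name    VARCHAR(100),
    address VARCHAR(255)
);

CREATE TABLE Settings (
    settingId    INT PRIMARY KEY,
    userId       INT NOT NULL REFERENCES Users(userId),
    settingKey   VARCHAR(50),
    settingValue VARCHAR(255)
);

-- All settings for one user:
SELECT settingKey, settingValue
FROM Settings
WHERE userId = 42;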
I would say that it is bad table design. If a user doesn't have an entry for 47 of those 50 settings then you will have a large number of NULLs in the table, which isn't good practice and will also slow down performance (NULLs have to be handled in a special way).
Instead, have the following:
USER TABLE
Id,
FirstName
LastName
etc
SETTINGS
Id,
SettingName
USER SETTINGS
Id,
SettingId,
UserId,
SettingValue
You then have a many-to-many join, and eliminate the NULLs.
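Roughly, with the spaces dropped from the table names (column types are assumptions):

CREATE TABLE UserSettings (
    Id           INT PRIMARY KEY,
    SettingId    INT NOT NULL REFERENCES Settings(Id),
    UserId       INT NOT NULL REFERENCES Users(Id),
    SettingValue VARCHAR(255)
);

-- Only the settings a user actually has; no NULL padding:
SELECT s.SettingName, us.SettingValue
FROM UserSettings us
JOIN Settings s ON s.Id = us.SettingId
WHERE us.UserId = 42;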
First, don't put spaces in table names! All the [braces] will be a real pain!
If you have 50 columns, how meaningful will all that data be for each user? Will there be lots of NULLs? Most of the data may not even apply to any given user. Think 1-to-1 tables, where you break down the "settings" into logical groups:
Users: --main table where most values will be stored
userId
name
address
somesetting1 ---please note that I'm using "somesetting1", don't
... --- name the columns like this, use meaningful names!!
somesetting5
UserWidgets --all widget settings for the user
userId
somesetting6
....
somesetting12
UserAccounting --all accounting settings for the user
userId
somesetting13
....
somesetting23
--etc..
You only need to have a Users row for each user, and then a row in each table where that data applies to the given user. If a user doesn't have any widget settings, then there is no row for that user. You can LEFT JOIN each table as necessary to get all the settings as needed. Usually you only need to work on a subset of settings based on which part of the application is running, which means you won't need to join in all of the tables, just the one or two that you need at that time.
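A sketch of that LEFT JOIN (column names just mirror the placeholder names above):

-- Pull a user's main row plus their widget settings, if any exist:
SELECT u.userId, u.name, w.somesetting6, w.somesetting7
FROM Users u
LEFT JOIN UserWidgets w ON w.userId = u.userId
WHERE u.userId = 42;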
You could consider an attributes table. As long as your indexes are good, you wouldn't have too much of a performance issue:
[AttributeDef]
AttributeDefId int (primary key)
GroupKey varchar(50)
ItemKey varchar(50)
...
[AttributeVal]
AttributeValId int (primary key)
AttributeDefId int (FK -> AttributeDef.AttributeDefId)
UserId int (probably FK to users table?)
Val varchar(255)
...
Basically you're "pivoting" your table with many columns into two tables with fewer columns. You can write views and table functions around this structure to give you data for a group of related items or just a specific item, etc. You could also add other things to the attribute definition table to indicate required data elements, restrictions on the data elements, etc.
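For example, reading one user's attributes back out might look like this (assuming AttributeVal.UserId really is the FK to your users table):

-- All attribute values for one user, labelled by group and item:
SELECT d.GroupKey, d.ItemKey, v.Val
FROM AttributeVal v
JOIN AttributeDef d ON d.AttributeDefId = v.AttributeDefId
WHERE v.UserId = 42;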
What's your thought on this type of design?
Use several tables, with indexes on the columns you join on, to get the best SELECT speed. Use those indexed columns to relate the information between tables with a JOIN.
I have a user table which has about 50-ish pieces of data: religion, political party, ethnicity, city, favorite movies, etc. Each of these items is a lookup value, either from its own lookup table or from a common lookup table I use for the small items like gender, sexual preference, etc. Even favorite movie comes from a movie lookup table.
The question is: I assume that in the member table all of these will be stored as IDs and not text? So, first:
1) Should they or should they not have FKs to the lookup tables?
2) If we store IDs, then how do we get the actual answer text (e.g. ID 6 in the city table = New York, ID 10 in the nationality table = American) for output on the page? Do we need to SELECT from each lookup table in read mode to output the text value? This scares me, because out of the 50 pieces of data about 40 are lookup based, which would mean 40 different SELECTs on 40 tables in read mode, and again in edit mode when the user edits the values.
How is this implemented on real-world sites with detailed user profiles? (I have search and analytics on each value, so I need to store them as IDs.)
Depends on the scope, but this sounds like a sync process - set up a weekly/daily/hourly process to resync extended user information into a master table with a foreign key to the "user"-related table (username, password, email, update stamps, etc.).
What you've described is the big tradeoff between normalized DB design and more of a flat-table design: the queries are a lot more complicated with the normalized design, which it sounds like you have.
I'd think that you'd be reading from the table a lot more than you'd be writing to it (how often does a person's religion, gender, city, etc. change?). In this case, and only if you're running into performance issues on the read end, you might maintain two representations of the table: the extensible, normalized one like you have, and a plain-text, flat version that's fast and a piece of cake to query and read. When you update the record in the normalized one, you update the record in the flat one.
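A sketch of that two-representation idea; every table and column name here is an assumption, and the delete-then-insert rebuild is just one portable way to refresh the flat row:

-- Flat, denormalized copy kept alongside the normalized tables:
CREATE TABLE user_profile_flat (
    user_id     INT PRIMARY KEY,
    city        VARCHAR(100),
    religion    VARCHAR(100),
    nationality VARCHAR(100)
    -- ...one plain-text column per lookup-based attribute
);

-- Whenever the normalized record changes, rebuild its flat row.
-- One joined SELECT resolves all the lookups at once (no 40 separate selects):
DELETE FROM user_profile_flat WHERE user_id = 42;

INSERT INTO user_profile_flat (user_id, city, religion, nationality)
SELECT u.id, c.name, r.name, n.name
FROM users u
JOIN city c        ON c.id = u.city_id
JOIN religion r    ON r.id = u.religion_id
JOIN nationality n ON n.id = u.nationality_id
WHERE u.id = 42;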
I am looking at a problem which would involve users uploading lists of records with various field structures into an application. The second part of this would be to also allow the users to specify fields to capture information.
This is a step beyond anything I've done up to this point, where I would have designed a static RDBMS structure myself. In some respects all records will be treated the same, so there will be some common fields required for each. Almost all queries will be run on these common fields.
My first thought would be to dynamically generate a new table for each import and another for each data capture field spec, then have a master table with a GUID for every record in the application, along with the common fields and fields that specify the name of the table the data was imported into and the name of the table with the data capture fields.
Further information (metadata?) about the fields in the dynamically generated tables could be stored in XML or in a 'property' table.
This would mean that as users log into the application I would be dynamically choosing which table of data to present to the user, and there would be a large number of tables in the database if it were, say, not only multi-user but multi-tenant.
My question is: are there other methods for solving this kind of variable-field issue? Am I going down an ill-advised path here?
I believe that EAV would require me to have a table defining the fields for each import / data capture spec, and then another table with the import-field-value data, and that seems impractical.
I hate storing XML in the database, but this is a perfect example of when it makes sense. Store the user imports in XML initially. As your data schema matures, you can later decide which tables to persist for your larger clients. When the users pick which fields they want to query, that's when you come back and build a solid schema.
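A minimal sketch of that staging approach; the table and column names are placeholders, and the generic TEXT column would become your database's native XML type if it has one:

CREATE TABLE user_imports (
    import_id   INT PRIMARY KEY,
    user_id     INT NOT NULL,     -- who uploaded it
    imported_at TIMESTAMP,
    payload     TEXT              -- the raw import, serialized as XML
);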
What kind is each field? Could the type of field be different for each record?
I am working on a program now that does this, sort of, and the way we handle it is basically a record table which points to a recordfield table. The recordfield table contains all of the fields along with the name of the actual field in the database (the column name). We then have a recorddata table, which is where all the data goes for each record. We also store a record_id telling it which record it is holding.
This way, if each column for the record is the same type, we don't need to add new columns to the table; if it has more fields, or fields of a different type, we add columns to the data table as appropriate.
I think this is what you are talking about... correct me if I'm wrong.
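One possible reading of that layout, with all names and types being assumptions:

CREATE TABLE record (
    record_id INT PRIMARY KEY
);

CREATE TABLE recordfield (
    field_id   INT PRIMARY KEY,
    field_name VARCHAR(50),      -- the actual column name in recorddata
    field_type VARCHAR(20)
);

CREATE TABLE recorddata (
    record_id INT NOT NULL REFERENCES record(record_id),
    field1    VARCHAR(255),      -- columns added here as new fields/types appear
    field2    VARCHAR(255)
    -- ...
);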
I think that one additional table per type of user-defined field, for each table the user can add fields to, is a good way to go.
Say you load your records into user_records(id); that table would have an id column which is a foreign key in the user-defined field tables.
User-defined string fields would go in user_records_string(id, name), where id is a foreign key to user_records(id) and name is a string, or a foreign key to a list of user-defined string fields.
Searching on them requires joining them into the base table, probably with a sub-select to filter down to one field based on the user metadata, so that the right field can be added to the query.
To simulate the user creating multiple tables, you can have a foreign key in the user_records table that points at a table list, and filter on that when querying for a single table.
This would allow your schema to be static while allowing the user to arbitrarily add fields and tables.
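A sketch of that structure and a search on one user-defined string field; the field_id and value columns and the table_id foreign key are assumptions added to illustrate the sub-select:

CREATE TABLE user_records (
    id       INT PRIMARY KEY,
    table_id INT                 -- points at a table list, to simulate multiple tables
    -- ...plus the common fields shared by every record
);

CREATE TABLE user_records_string (
    id       INT NOT NULL REFERENCES user_records(id),
    field_id INT NOT NULL,       -- which user-defined string field this row holds
    value    VARCHAR(255)
);

-- Records in "table" 3 where user-defined field 7 equals 'foo':
SELECT r.*
FROM user_records r
WHERE r.table_id = 3
  AND 'foo' = (SELECT s.value
               FROM user_records_string s
               WHERE s.id = r.id AND s.field_id = 7);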