Is using multiple tables an advisable solution to dealing with user defined fields? - sql-server

I am looking at a problem which would involve users uploading lists of records with various field structures into an application. The 2nd part of this would be to also allow the users to specify fields to capture information.
This is a step beyond anything I've done up to this point, where I would have designed a static RDBMS structure myself. In some respects all records will be treated the same, so there will be some common fields required for each. Almost all queries will be run on these common fields.
My first thought would be to dynamically generate a new table for each import and another for each data capture field spec. Then have a master table with a GUID for every record in the application, along with the common fields, plus fields that specify the name of the table the data was imported to and the name of the table with the data capture fields.
Further information (metadata?) about the fields in the dynamically generated tables could be stored in xml or in a 'property' table.
This would mean that as users log into the application I would be dynamically choosing which table of data to present to the user, and there would be a large number of tables in the database if it was, say, not only multiuser but also multitenant.
My question is: are there other methods of solving this kind of variable field issue? Am I going down an unadvised path here?
I believe that EAV would require me to have a table defining the fields for each import / data capture spec and then another table with the import - field - values data, and that seems impractical.

I hate storing XML in the database, but this is a perfect example of when it makes sense. Store the user imports in XML initially. As your data schema matures, you can later decide which tables to persist for your larger clients. When the users pick which fields they want to query, that's when you come back and build a solid schema.

What kind is each field? Could the type of field be different for each record?
I am working on a program now that does something like this. The way we handle it is basically a record table which points to a recordfield table. The recordfield table contains all of the fields along with the name of the actual column in the database that holds each one. We then have a recorddata table, which is where all the data goes for each record. We also store a record_id there, telling it which record it is holding.
Done this way, if each column for the record is the same type, we don't need to add new columns to the table; if a record has more fields, or fields of a different type, we add columns to the data table as appropriate.
I think this is what you are talking about.. correct me if I'm wrong.

I think a good way to go is one additional table per type of user-defined field, alongside the table that the user can add fields to.
Say you load your records into user_records(id). That table would have an id column which the user-defined field tables reference as a foreign key.
User-defined string fields would go in user_records_string(id, name), where id is a foreign key to user_records(id), and name is a string, or a foreign key to a list of user-defined string fields.
Searching on them requires joining them in to the base table, probably with a sub-select to filter down to one field based on the user meta-data, so that the right field can be added to the query.
To simulate the user creating multiple tables, you can have a foreign key in the user_records table that points at a table list, and filter on that when querying for a single table.
This would allow your schema to be static while allowing the user to arbitrarily add fields and tables.
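A minimal sketch of that static schema, using Python's built-in sqlite3 module so it can be run as-is. The user_records and user_records_string names come from the answer above; the user_tables and user_string_fields tables, the value column, and the find_records helper are assumptions made up for illustration, not something the answer spells out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Hypothetical "table list": one row per table the user thinks they created.
CREATE TABLE user_tables (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- Base records; table_id says which simulated table a record belongs to.
CREATE TABLE user_records (
    id       INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES user_tables(id)
);
-- Hypothetical metadata: the list of user-defined string fields.
CREATE TABLE user_string_fields (
    id       INTEGER PRIMARY KEY,
    table_id INTEGER NOT NULL REFERENCES user_tables(id),
    name     TEXT NOT NULL,
    UNIQUE (table_id, name)
);
-- One row per (record, string field); the value column is an assumption.
CREATE TABLE user_records_string (
    id       INTEGER NOT NULL REFERENCES user_records(id),
    field_id INTEGER NOT NULL REFERENCES user_string_fields(id),
    value    TEXT,
    PRIMARY KEY (id, field_id)
);
""")

conn.execute("INSERT INTO user_tables (id, name) VALUES (1, 'contacts')")
conn.execute(
    "INSERT INTO user_string_fields (id, table_id, name) VALUES (1, 1, 'city')"
)
conn.executemany(
    "INSERT INTO user_records (id, table_id) VALUES (?, 1)", [(1,), (2,)]
)
conn.executemany(
    "INSERT INTO user_records_string (id, field_id, value) VALUES (?, 1, ?)",
    [(1, "Oslo"), (2, "Lima")],
)


def find_records(conn, table_name, field_name, value):
    """Return ids of records in one simulated table whose string field matches."""
    rows = conn.execute(
        """
        SELECT r.id
        FROM user_records AS r
        JOIN user_tables AS t ON t.id = r.table_id
        JOIN user_records_string AS s ON s.id = r.id
        WHERE t.name = ?
          -- Sub-select picks the one field to filter on from the metadata.
          AND s.field_id = (SELECT f.id FROM user_string_fields AS f
                            WHERE f.table_id = t.id AND f.name = ?)
          AND s.value = ?
        ORDER BY r.id
        """,
        (table_name, field_name, value),
    ).fetchall()
    return [row[0] for row in rows]


matches = find_records(conn, "contacts", "city", "Oslo")
print(matches)  # [1]
```

Other field types (integers, dates) would presumably get their own tables of the same shape, each with a suitably typed value column.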

Related

Indexing two user type information in one collection in solr

We have two tables. One is the normal users table, where we store the normal user information; we can call this the base table. The other is the user_staff table, a separate table that references the users table through user_id.
The users and user_staff tables hold almost the same set of data, but we keep them separate because the users table holds the master data and the user_staff table holds the staff data.
When indexing this data in Solr, we would like to keep users and user_staff in a single collection, with an additional key on each entry to indicate the type, like master, sub-account, or user. But if we use a single collection, how can we keep the connection from user_staff to the users table in Solr, as we do in MySQL?

Oracle APEX - Data Modeling & Primary Keys

I'm creating a rather large APEX application which allows managers to go in and record statistics for associates in the company. Currently we have a database in oracle with data from AD which hold all the associates information. Name, Manager, Employee ID, etc.
Now I'm responsible for creating and modeling a table that will house all the stats for each employee. The table I have created has 90+ columns in it. Some contain data such as:
Documents Processed
Calls Received
Amount of Doc 1 Processed
Amount of Doc 2 Processed
and the list goes on for well over 90 attributes. So here is my question:
When creating this table in my application with so many different columns, how would I go about choosing an appropriate primary key? Should I link it to our employee table using the employee's identification, which is unique (each has an associate number)?
Secondly, how can I create these tables (and possibly a form) so that I can associate the statistic I am entering for an individual with the actual individual?
I have ordered two books from Amazon on data modeling since I am new to APEX and DBA design. Not a fresh chicken, but new enough to need some guidance. An additional problem I am running into is that each form can have only 60 fields. So I had thought about splitting my 90+ columns into separate tables by function.
Thanks
APEX 4.2 allows for 200 items per page (see: Oracle APEX component limits).
A couple of questions come to mind:
Are you sure that the employee IDs are not recyclable? If these IDs are unique and not recycled, you've found yourself a good primary key.
What do you plan on doing when you decide to add a new metric? It seems like you would have to add a new column to your rather large and likely not normalized table.
I'd recommend a vertical table for your metrics. You can use Oracle's PIVOT to make your data appear more like a horizontal table.
If you went this route you would store your employee ID in one column, your metric key in another, and value...
I'd recommend that you create a metric table consisting of a primary key, a metric label, an active indicator, creation timestamp, creation user id, modified timestamp, modified user id.
This metric table will allow you to add new metrics, change the name of the metric, deactivate a metric, and determine who changed what and when.
This would be a much more flexible approach in my opinion. You may also want to think about audit logs.
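A rough sketch of the vertical layout, run against SQLite through Python's sqlite3 module rather than Oracle. The table and column names (metric, employee_metric, and so on) are hypothetical, and because SQLite has no PIVOT, the sketch swaps in conditional aggregation (MAX over CASE) to produce the same horizontal shape. In a real schema, employee_id would reference the existing employee table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Metric definitions: label, active flag, and basic audit columns.
CREATE TABLE metric (
    metric_id   INTEGER PRIMARY KEY,
    label       TEXT NOT NULL UNIQUE,
    active      INTEGER NOT NULL DEFAULT 1,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    created_by  TEXT NOT NULL,
    modified_at TEXT,
    modified_by TEXT
);
-- Vertical table: one row per (employee, metric) instead of 90+ columns.
CREATE TABLE employee_metric (
    employee_id INTEGER NOT NULL,
    metric_id   INTEGER NOT NULL REFERENCES metric(metric_id),
    value       INTEGER NOT NULL,
    PRIMARY KEY (employee_id, metric_id)
);
""")

conn.executemany(
    "INSERT INTO metric (metric_id, label, created_by) VALUES (?, ?, 'admin')",
    [(1, "Documents Processed"), (2, "Calls Received")],
)
conn.executemany(
    "INSERT INTO employee_metric (employee_id, metric_id, value) VALUES (?, ?, ?)",
    [(1001, 1, 40), (1001, 2, 12), (1002, 1, 25)],
)


def pivot_metrics(conn):
    """Return (labels, rows): one row per employee, one column per active metric."""
    labels = [
        row[0]
        for row in conn.execute(
            "SELECT label FROM metric WHERE active = 1 ORDER BY metric_id"
        )
    ]
    # One aggregated CASE expression per metric; labels are bound as parameters.
    select_list = ["em.employee_id"] + [
        "MAX(CASE WHEN m.label = ? THEN em.value END)" for _ in labels
    ]
    sql = (
        "SELECT " + ", ".join(select_list) + " "
        "FROM employee_metric AS em "
        "JOIN metric AS m ON m.metric_id = em.metric_id "
        "GROUP BY em.employee_id ORDER BY em.employee_id"
    )
    return labels, conn.execute(sql, labels).fetchall()


labels, rows = pivot_metrics(conn)
print(labels)  # ['Documents Processed', 'Calls Received']
print(rows)    # [(1001, 40, 12), (1002, 25, None)]
```

Adding a new metric is then an INSERT into metric rather than an ALTER TABLE, which is the flexibility argued for above.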

Multiple elements in one database cell

The question is what database design I should apply in this situation:
main table:
ID | name | number_of_parameters | parameters
parameters table:
parameter | parameter | parameter
The number of elements in a parameters table does not change. The number_of_parameters cell defines how many parameters tables should be stored in the next cell.
I have trouble moving from object thinking to database design. In object terms, one row has as many parameters objects as number_of_parameters says.
I hope the description of the requirements is clear. What is the correct way to design such a database? If someone can provide some SQL statements to achieve it, that would be nice, but the main goal of this question is to understand how to build such an architecture.
I want to use SQLite to create this database.
The relational way is to have two tables. The main table has an ID, name and as many other universally-present parameters as possible. The parameters table contains a mapping from an ID in the main table to a parameter name and a parameter value; the main table ID should be a foreign key, and the combination of ID and name should be unique.
The number of parameters can be found by just counting the number of rows with a particular ID.
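Since the question asks for SQLite, here is a small sketch of the two-table design using Python's sqlite3 module. The column names (main_id, value) and the parameter_count helper are illustrative assumptions; only the overall shape comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE main (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE parameters (
    main_id INTEGER NOT NULL REFERENCES main(id) ON DELETE CASCADE,
    name    TEXT NOT NULL,
    value   TEXT,
    UNIQUE (main_id, name)  -- each parameter name appears once per main row
);
""")

conn.execute("INSERT INTO main (id, name) VALUES (1, 'widget')")
conn.executemany(
    "INSERT INTO parameters (main_id, name, value) VALUES (1, ?, ?)",
    [("width", "10"), ("height", "20"), ("color", "red")],
)


def parameter_count(conn, main_id):
    """Count parameters by counting rows; no number_of_parameters column needed."""
    return conn.execute(
        "SELECT COUNT(*) FROM parameters WHERE main_id = ?", (main_id,)
    ).fetchone()[0]


count = parameter_count(conn, 1)
print(count)  # 3
```

Trying to insert a second 'width' row for the same main_id raises sqlite3.IntegrityError, which is the uniqueness rule doing its job.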
If you can serialize the data when saving it to the database and deserialize it when you read the record back, that will work too. You can get the total number of objects in the serialized container, save that count to the number_of_parameters field, and store the serialized data in the parameters field.
There isn't one perfect correct way, but if you want to use a relational database, you preferably have relational tables.
If you have a key-value database, you place your serialized data as a document attached to your key.
If you want a hybrid solution, both human editable and single table, you can serialize your data to a human-readable format such as yaml, which sees heavy usage in configuration sections of open source projects.
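For comparison, a sketch of the single-table, serialized variant. It uses JSON instead of YAML only because Python ships a json module in its standard library; the save and load helpers and the sample data are made up for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE main (
    id                   INTEGER PRIMARY KEY,
    name                 TEXT NOT NULL,
    number_of_parameters INTEGER NOT NULL,
    parameters           TEXT NOT NULL  -- JSON text, still readable by a human
)
""")


def save(conn, name, params):
    """Serialize the list of parameter objects and store it with its count."""
    cur = conn.execute(
        "INSERT INTO main (name, number_of_parameters, parameters) VALUES (?, ?, ?)",
        (name, len(params), json.dumps(params)),
    )
    return cur.lastrowid


def load(conn, row_id):
    """Read one row back and deserialize its parameters column."""
    name, count, blob = conn.execute(
        "SELECT name, number_of_parameters, parameters FROM main WHERE id = ?",
        (row_id,),
    ).fetchone()
    return name, count, json.loads(blob)


row_id = save(conn, "widget", [{"a": 1, "b": 2, "c": 3}, {"a": 4, "b": 5, "c": 6}])
loaded = load(conn, row_id)
print(loaded)  # ('widget', 2, [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}])
```

The trade-off is that the database can no longer filter or join on individual parameters without parsing the text, which is what the relational two-table version gives you.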

Database Normalization and Nested Lists -- Cannot Think of a Solution

I am trying to implement a system on my website similar to Facebook's "Like" feature, where users can click a button which counter++'s. However, I have run into a problem in terms of efficiently storing data in my DB.
Each story has its own row in the stories table in my DB with the columns like and users_like.
I want each person to only be able to like the story once. Therefore I need to somehow store data that shows that the user has, in fact, like++'d the post.
All I could think of was to have a column named users_like and then add each user, followed by a comma, to the column using CONCAT, and then use PHP's explode function to split the data.
However, this method, as far as I know, goes in the opposite direction of database normalization.
What is the best way to do this? I understand that "best" is subjective.
I cannot add a liked flag to the user table because there will be a vast number of stories the person could 'like.'
Thanks
You need a many to many table in your database that will store a foreign key to the stories table and a foreign key to the user table. You put a constraint on this table saying that the story fk - user fk combo must be unique.
You now don't even have to have a like column, you just count the number of rows in the many to many table corresponding to your story.
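A small sketch of that many-to-many table, using Python's sqlite3 module so it runs standalone. The story_likes table name and the like and like_count helpers are hypothetical; the composite primary key plays the role of the uniqueness constraint on the story and user pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users   (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
-- Many-to-many table: at most one row per (story, user) pair.
CREATE TABLE story_likes (
    story_id INTEGER NOT NULL REFERENCES stories(id),
    user_id  INTEGER NOT NULL REFERENCES users(id),
    PRIMARY KEY (story_id, user_id)
);
""")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", [(1, "ann"), (2, "bob")])
conn.execute("INSERT INTO stories (id, title) VALUES (1, 'First story')")


def like(conn, story_id, user_id):
    """Record a like; return False if this user already liked the story."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO story_likes (story_id, user_id) VALUES (?, ?)",
        (story_id, user_id),
    )
    return cur.rowcount == 1  # 0 rows changed means the pair already existed


def like_count(conn, story_id):
    """The like counter is just the number of rows for the story."""
    return conn.execute(
        "SELECT COUNT(*) FROM story_likes WHERE story_id = ?", (story_id,)
    ).fetchone()[0]


first = like(conn, 1, 1)
repeat = like(conn, 1, 1)  # same user again: ignored
other = like(conn, 1, 2)
print(first, repeat, other, like_count(conn, 1))  # True False True 2
```

Checking whether a given user has liked a story is a lookup on the same table, so no comma-separated users_like column is needed.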

Organizing database tables - large number of properties

I have a database that stores some users in it. Each user has its account settings, privacy settings and lots of other properties to set. The number of those properties started to grow and I could end up with 30 properties or so.
Till now, I used to keep it in "UserInfo" table having User and UserInfo related as One-To-Many (keeping a log of all changes). Putting it in a single "UserInfo" table doesn't sound nice and, at least in the database model, it would look messy. What's the solution?
Separating privacy settings, account settings and other "groups" of settings in separate tables and have 1-1 relations between UserInfo and each group of settings table is one solution, but would that be too slow (or much slower) when retrieving the data? I guess all data would not be presented on a single page at the same moment. So maybe having one-to-many relationships to each table is a solution too (keeping log of each group separately)?
If it's only 30 properties, I'd recommend just creating 30 columns. That's not too much for a modern database to handle.
But I would guess that if you have 30 properties today, you will continue to invent new properties as time goes on, and the number of columns will keep growing. Restructuring your table to add columns every day may become time-consuming as you get lots of rows.
For an alternative solution check out this blog for a nifty solution for storing lots of dynamic attributes in a "schemaless" way: How FriendFeed Uses MySQL.
Basically, collect all the properties into some format and store it in a single TEXT column. The format is semi-structured, that is your application can separate the properties if needed but you can also add more at any time, or even have different properties per row. XML or YAML or JSON are example formats, or some object serialization format supported by your application code language.
CREATE TABLE Users (
user_id SERIAL PRIMARY KEY,
user_properties TEXT
);
This makes it hard to search for a given value in a given property. So in addition to the TEXT column, create an auxiliary table for each property you want to be searchable, with two columns: the value of the given property, and a foreign key back to the row in the main table where that particular value is found. Now you can index the value column so lookups are quick.
CREATE TABLE UserBirthdate (
user_id BIGINT UNSIGNED PRIMARY KEY,
birthdate DATE NOT NULL,
FOREIGN KEY (user_id) REFERENCES Users(user_id),
KEY (birthdate)
);
SELECT u.* FROM Users AS u INNER JOIN UserBirthdate b USING (user_id)
WHERE b.birthdate = '2001-01-01';
This means as you insert or update a row in Users, you also need to insert or update into each of your auxiliary tables, to keep it in sync with your data. This could grow into a complex chore as you add more auxiliary tables.
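One way to keep that chore contained is to route every write through a single function that updates the TEXT column and the auxiliary table in the same transaction. The sketch below is an assumption-laden illustration, not the FriendFeed implementation: it uses SQLite through Python's sqlite3 module instead of MySQL, JSON as the serialization format, and hypothetical helpers save_user and users_born_on.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (
    user_id         INTEGER PRIMARY KEY,
    user_properties TEXT NOT NULL
);
CREATE TABLE UserBirthdate (
    user_id   INTEGER PRIMARY KEY REFERENCES Users(user_id),
    birthdate TEXT NOT NULL
);
CREATE INDEX idx_userbirthdate_birthdate ON UserBirthdate (birthdate);
""")


def save_user(conn, user_id, properties):
    """Write the serialized properties and resync the birthdate table together."""
    blob = json.dumps(properties)
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute(
            "UPDATE Users SET user_properties = ? WHERE user_id = ?",
            (blob, user_id),
        )
        if cur.rowcount == 0:  # no existing row, so this is a new user
            conn.execute(
                "INSERT INTO Users (user_id, user_properties) VALUES (?, ?)",
                (user_id, blob),
            )
        # Drop any stale index row, then re-add it if the property is present.
        conn.execute("DELETE FROM UserBirthdate WHERE user_id = ?", (user_id,))
        if "birthdate" in properties:
            conn.execute(
                "INSERT INTO UserBirthdate (user_id, birthdate) VALUES (?, ?)",
                (user_id, properties["birthdate"]),
            )


def users_born_on(conn, date):
    """Look users up through the indexed auxiliary table, not the TEXT column."""
    rows = conn.execute(
        "SELECT u.user_id FROM Users AS u "
        "JOIN UserBirthdate AS b ON b.user_id = u.user_id "
        "WHERE b.birthdate = ? ORDER BY u.user_id",
        (date,),
    ).fetchall()
    return [row[0] for row in rows]


save_user(conn, 1, {"name": "Ann", "birthdate": "2001-01-01"})
save_user(conn, 2, {"name": "Bob", "birthdate": "1999-05-17"})
before = users_born_on(conn, "2001-01-01")
save_user(conn, 1, {"name": "Ann"})  # birthdate property removed
after = users_born_on(conn, "2001-01-01")
print(before, after)  # [1] []
```

Each further searchable property would add another delete-and-insert pair inside the same transaction, which is where the growing complexity mentioned above comes from.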
