Oracle APEX - Data Modeling & Primary Keys - data-modeling

I'm creating a rather large APEX application which allows managers to go in and record statistics for associates in the company. Currently we have a database in oracle with data from AD which hold all the associates information. Name, Manager, Employee ID, etc.
Now I'm responsible for creating and modeling a table that will house all their stats for each employee. The table I have created has over 90+ columns in it. Some contain data such as:
Documents Processed
Calls Received
Amount of Doc 1 Processed
Amount of Doc 2 Processed
and the list goes on for well over 90 attributes. So here is my question:
When creating this table in my application with so many different columns how would I go about choosing a primary key that's appropriate? Should I link it to our employee table using the employees identification which is unique (each have a associate number)?
Secondly, how can I create these tables (and possibly form) to allow me to associate the statistic I am entering for an individual to the actual individual?
I have ordered two books from amazon on data modeling since I am new to APEX and DBA design. Not a fresh chicken, but new enough to need some guidance. An additional problem I am running into is that each form can have only 60 fields to it. So I had thought about creating tables for different functions out of my 90+ I have.
Thanks

4.2 allows for 200 items per page.
oracle apex component limits
A couple of questions come to mind:
Are you sure that the employee Ids are not recyclable? If these ids are unique and not recycled.. you've found yourself a good primary key.
What do you plan on doing when you decide to add a new metric? Seems like you might have to add a new column to your rather large and likely not normalized table.
I'd recommend a vertical table for your metrics.. you can use oracle's pivot function to make your data appear more like a horizontal table.
If you went this route you would store your employee Id in one column, your metric key in another, and value...
I'd recommend that you create a metric table consisting of a primary key, a metric label, an active indicator, creation timestamp, creation user id, modified timestamp, modified user id.
This metric table will allow you to add new metrics, change the name of the metric, deactivate a metric, and determine who changed what and when.
This would be a much more flexible approach in my opinion. You may also want to think about audit logs.

Related

SQL Server Employee table with existing ID number

I am attempting to create an Employee table in SQL Server 2016 and I want to use EmpID as the Primary Key and Identity. Here is what I believe to be true and my question: When I create the Employee table with EmpID as the Primary Key and an Identity(100, 1) column, each time I add a new employee, SQL Server will auto create the EmpID starting with 100 and increment by 1 with each new employee. What happens if I want to import a list of existing employees from another company and those employees already have an existing EmpID? I haven't been able to figure out how I would import those employees with the existing EmpID. If there is a way to import the employee list with the existing EmpID, will SQL Server check to make sure the EmpID's from the new list does not exist for a current employee? Or is there some code I need to write in order to make that happen?
Thanks!
You are right about primary keys, but about importing employees from another company and Merging it with your employee list, you have to ask these things:
WHY? Sure there are ways to solve this problem, but why will you merge other company employees into your company employee?
Other company ID structure: Most of the time, companies have different ID structure, some have 4 characters others have only numbers so on and so forth. But you have to know the differences of the companies ID Structure.
If the merging can't be avoided, then you have to tell the higher ups about the concern, and you have to tell them that you have to give the merging company new employee ID's which is a must. With this in my, simply appending your database with the new data is the solution.
This is an extremely normal data warehousing issue where a table has data sources from multiple places. Also comes up in migration, acquisitions, etc.
There is no way to keep the existing IDs as a primary key if there are multiple people with the same ID.
In the data warehouse world we would always create a new surrogate key, which is the primary key to the table, and include the original key and a source system identifier as two attributes.
In your scenario you will probably keep the existing keys for the original company, and create new IDs for the new employees, and save the oldID in an additional column for historical use.
Either of these choices also means that as you migrate other associated data such as leave information imported from the old system, you can translate it to the new key by looking up OldID in the employee table, and finding the associated newID to associate it with when saving your lave records in the new system.
At the end of the day there is no alternative to this, as you simply cant have two employees with the same primary key.
I have never seen any company that migrate employees from another company and keep their existed employee id. Usually, they'll give them a new ID and keep the old one in the employee file for references uses. But they never uses the old one as an active ID ever.
Large companies usually uses serial of special identities that are already defined in the system to distinguish employees based on field, specialty..etc.
Most companies they don't do the same as large ones, but instead, they stick with one identifier, and uses dimensions as an identity. These dimensions specify areas of work for employees, projects, vendors ..etc. So, they're used in the system globally, and affected on company financial reports (which is the main point of using it).
So, what you need to do is to see the company ID sequence requirements, then, play your part on that. As depending on IDENTITY alone won't be enough for most companies. If you see that you can depend on identity alone, then use it, if not, then see if you can use dimensions as an identity (you could create five dimensions - Company, Project, Department, Area, Cost Center - it will be enough for any company).
if you used identity alone, and want to migrate, then in your insert statement do :
SET IDENTITY_INSERT tableName ON
INSRT INTO tableName (columns)
...
this will allow you to insert inside identity column, however, doing this might require you to reset the identity to a new value, to avoid having issues. read DBCC CHECKIDENT
If you end up using dimensions, you could make the dimension and ID both primary keys, which will make sure that both are unique in the table (treated as one set).

Where should I store repetitive data in Access?

I'm creating this little Access DB, for the HR department to store all data related to all the training sessions that the company organizes for all the employees.
So, I have a Training Session table with information like date, subject, place, observations, trainer, etc, and the unique ID number.
Then there's the Personnel table, with employer ID (which is also the unique table number), names and working department.
So, after that I need another table that keeps a record of all the attendants of each training session. And here's the question, should I use a table for that in the first place? Does it have to be one table for each training session to store the attendants?
I've used excel for quite some time now, but I'm very new to Access and databases (even small ones like this). Any information will be highly appreciated.
Thanks in advance!
It should be one table for persons, one table for trainings, and one for participation/attendance, to minimize (or best: avoid) repetition. Your tables should use primary and foreign keys, so that there are one-to-many relationships between trainings and attendances as well as people and attendances (the attendances table would then have a column referring to the person who attended, and another column referring to the training session).
Google "database normalization" for more detail and variations of that principle (https://en.wikipedia.org/wiki/Database_normalization).

How can I store an indefinite amount of stuff in a field of my database table?

Heres a simple version of the website I'm designing: Users can belong to one or more groups. As many groups as they want. When they log in they are presented with the groups the belong to. Ideally, in my Users table I'd like an array or something that is unbounded to which I can keep on adding the IDs of the groups that user joins.
Additionally, although I realize this isn't necessary, I might want a column in my Group table which has an indefinite amount of user IDs which belong in that group. (side question: would that be more efficient than getting all the users of the group by querying the user table for users belonging to a certain group ID?)
Does my question make sense? Mainly I want to be able to fill a column up with an indefinite list of IDs... The only way I can think of is making it like some super long varchar and having the list JSON encoded in there or something, but ewww
Please and thanks
Oh and its a mysql database (my website is in php), but 2 years of php development I've recently decided php sucks and I hate it and ASP .NET web applications is the only way for me so I guess I'll be implementing this on whatever kind of database I'll need for that.
Your intuition is correct; you don't want to have one column of unbounded length just to hold the user's groups. Instead, create a table such as user_group_membership with the columns:
user_id
group_id
A single user_id could have multiple rows, each with the same user_id but a different group_id. You would represent membership in multiple groups by adding multiple rows to this table.
What you have here is a many-to-many relationship. A "many-to-many" relationship is represented by a third, joining table that contains both primary keys of the related entities. You might also hear this called a bridge table, a junction table, or an associative entity.
You have the following relationships:
A User belongs to many Groups
A Group can have many Users
In database design, this might be represented as follows:
This way, a UserGroup represents any combination of a User and a Group without the problem of having "infinite columns."
If you store an indefinite amount of data in one field, your design does not conform to First Normal Form. FNF is the first step in a design pattern called data normalization. Data normalization is a major aspect of database design. Normalized design is usually good design although there are some situations where a different design pattern might be better adapted.
If your data is not in FNF, you will end up doing sequential scans for some queries where a normalized database would be accessed via a quick lookup. For a table with a billion rows, this could mean delaying an hour rather than a few seconds. FNF guarantees a direct access lookup path for each item of data.
As other responders have indicated, such a design will involve more than one table, to be joined at retrieval time. Joining takes some time, but it's tiny compared to the time wasted in sequential scans, if the data volume is large.

Indicating primary/default record in database

I've been struggling with how I should indicate that a certain record in a database is the "fallback" or default entry. I've also been struggling with how to reduce my problem to a simple problem statement. I'm going to have to provide an example.
Suppose that you are building a very simple shipping application. You'll take orders and will need to decide which warehouse to ship them from.
Let's say that you have a few cities that have their own dedicated warehouses*; if an order comes in from one of those cities, you'll ship from that city's warehouse. If an order comes in from any other city, you want to ship from a certain other warehouse. We'll call that certain other warehouse the fallback warehouse.
You might decide on a schema like this:
Warehouses
WarehouseId
Name
WarehouseCities
WarehouseId
CityName
The solution must enforce zero or one fallback warehouses.
You need a way to indicate which warehouse should be used if there aren't any warehouses specified for the city in question. If it really matters, you're doing this on SQL Server 2008.
EDIT: To be clear, all valid cities are NOT present in the WarehouseCities table. It is possible for an order to be received for a City not listed in WarehouseCities. In such a case, we need to be able to select the fallback warehouse.
If any number of default warehouses were allowed, or if I was assigning default warehouses to, say, states, I would use a DefaultWarehouse table. I could use such a table here, but I would need to limit it to exactly one row, which doesn't feel right.
How would you indicate the fallback warehouse?
*Of course, in this example we discount the possibility that there might be multiple cities with the same name. The country you are building this application for rigorously enforces a uniqueness constraint on all city names.
I understand your problem, but have questions about parts of it, so I'll be a bit more general.
If at all possible I would store warehouse/backup warehouse data with your inventory data (either directly hanging of warehouses, or if it's product specific off the inventory tables).
If the setup has to be calculated through your business logic then the records should hang off the order/order_item table
In terms how to implement the structure in SQL, I'll assume that all orders ship out of a single warehouse and that the shipping must be hung off the orders table (but the ideas should be applicable elsewhere):
The older way to enforce zero/one backup warehouses would be to hang a Warehouse_Source record of the Orders table and include an "IsPrimary" field or "ShippingPriority" then include a composite unique index that includes OrderID and IsPrimary/ShippingPriority.
if you will only ever have one backup warehouse you could add ShippingSource_WareHouseID and ShippingSource_Backup_WareHouseID fields to the order. Although, this isn't the route I would go.
In SQL 2008 and up we have the wonderful addition of Filtered Indexes. These allow you to add a WHERE clause to your index -- resulting in a more compact index. It also has the added benefit of allowing you to accomplish some things that could only be done through triggers in the past.
You could put a Unique filtered index on OrderID & IsPrimary/ShippingPriority (WHERE IsPrimary = 0).
Add a comment or such if you want me to explain further.
re: how I should indicate that a certain record in a database is the "fallback" or default entry
use another column, isFallback, holding a binary value. I'm assuming your fallback warehouse won't have any cities associated with it.
As I see it, the fallback warehouse after all is just another warehouse and if I understood, every record in WarehouseCities has a reference to one record in Warehouses:
WarehouseCities(*)...(1)Warehouses
Which means that if there are a hundred cities without a dedicated warehouse they all will reference the id of an specific fallback warehouse. So I don't see any problem (which makes me thing I didn't understand the problem), even the model looks well defined.
Now you could identify if a warehouse is fallback warehouse with an attribute like type_warehouse on Warehouses.
EDIT after comment
Assuming there is only one fallback warehouse for the cities not present in WarehouseCities, I suggest to keep the fallback warehouse as just another warehouse and keep its Id (WarehouseId) as an application parameter (a table for parameters maybe?), of course, this solution is programmatically and not attached to your database platform.

Organizing database tables - large number of properties

I have a database that stores some users in it. Each user has its account settings, privacy settings and lots of other properties to set. The number of those properties started to grow and I could end up with 30 properties or so.
Till now, I used to keep it in "UserInfo" table having User and UserInfo related as One-To-Many (keeping a log of all changes). Putting it in a single "UserInfo" table doesn't sound nice and, at least in the database model, it would look messy. What's the solution?
Separating privacy settings, account settings and other "groups" of settings in separate tables and have 1-1 relations between UserInfo and each group of settings table is one solution, but would that be too slow (or much slower) when retrieving the data? I guess all data would not be presented on a single page at the same moment. So maybe having one-to-many relationships to each table is a solution too (keeping log of each group separately)?
If it's only 30 properties, I'd recommend just creating 30 columns. That's not too much for a modern database to handle.
But I would guess that if you ahve 30 properties today, you will continue to invent new properties as time goes on, and the number of columns will keep growing. Restructuring your table to add columns every day may become time-consuming as you get lots of rows.
For an alternative solution check out this blog for a nifty solution for storing lots of dynamic attributes in a "schemaless" way: How FriendFeed Uses MySQL.
Basically, collect all the properties into some format and store it in a single TEXT column. The format is semi-structured, that is your application can separate the properties if needed but you can also add more at any time, or even have different properties per row. XML or YAML or JSON are example formats, or some object serialization format supported by your application code language.
CREATE TABLE Users (
user_id SERIAL PRIMARY KEY,
user_proerties TEXT
);
This makes it hard to search for a given value in a given property. So in addition to the TEXT column, create an auxiliary table for each property you want to be searchable, with two columns: values of the given property, and a foreign key back to the main table where that particular value is found. Now you have can index the column so lookups are quick.
CREATE TABLE UserBirthdate (
user_id BIGINT UNSIGNED PRIMARY KEY,
birthdate DATE NOT NULL,
FOREIGN KEY (user_id) REFERENCES Users(user_id),
KEY (birthdate)
);
SELECT u.* FROM Users AS u INNER JOIN UserBirthdate b USING (user_id)
WHERE b.birthdate = '2001-01-01';
This means as you insert or update a row in Users, you also need to insert or update into each of your auxiliary tables, to keep it in sync with your data. This could grow into a complex chore as you add more auxiliary tables.

Resources