SQL Server Employee table with existing ID number

SQL Server Employee table with existing ID number - sql-server

I am attempting to create an Employee table in SQL Server 2016 and I want to use EmpID as the Primary Key and Identity. Here is what I believe to be true and my question: When I create the Employee table with EmpID as the Primary Key and an Identity(100, 1) column, each time I add a new employee, SQL Server will auto create the EmpID starting with 100 and increment by 1 with each new employee. What happens if I want to import a list of existing employees from another company and those employees already have an existing EmpID? I haven't been able to figure out how I would import those employees with the existing EmpID. If there is a way to import the employee list with the existing EmpID, will SQL Server check to make sure the EmpID's from the new list does not exist for a current employee? Or is there some code I need to write in order to make that happen?
Thanks!

You are right about primary keys, but about importing employees from another company and Merging it with your employee list, you have to ask these things:
WHY? Sure there are ways to solve this problem, but why will you merge other company employees into your company employee?
Other company ID structure: Most of the time, companies have different ID structure, some have 4 characters others have only numbers so on and so forth. But you have to know the differences of the companies ID Structure.
If the merging can't be avoided, then you have to tell the higher ups about the concern, and you have to tell them that you have to give the merging company new employee ID's which is a must. With this in my, simply appending your database with the new data is the solution.

This is an extremely normal data warehousing issue where a table has data sources from multiple places. Also comes up in migration, acquisitions, etc.
There is no way to keep the existing IDs as a primary key if there are multiple people with the same ID.
In the data warehouse world we would always create a new surrogate key, which is the primary key to the table, and include the original key and a source system identifier as two attributes.
In your scenario you will probably keep the existing keys for the original company, and create new IDs for the new employees, and save the oldID in an additional column for historical use.
Either of these choices also means that as you migrate other associated data such as leave information imported from the old system, you can translate it to the new key by looking up OldID in the employee table, and finding the associated newID to associate it with when saving your lave records in the new system.
At the end of the day there is no alternative to this, as you simply cant have two employees with the same primary key.

I have never seen any company that migrate employees from another company and keep their existed employee id. Usually, they'll give them a new ID and keep the old one in the employee file for references uses. But they never uses the old one as an active ID ever.
Large companies usually uses serial of special identities that are already defined in the system to distinguish employees based on field, specialty..etc.
Most companies they don't do the same as large ones, but instead, they stick with one identifier, and uses dimensions as an identity. These dimensions specify areas of work for employees, projects, vendors ..etc. So, they're used in the system globally, and affected on company financial reports (which is the main point of using it).
So, what you need to do is to see the company ID sequence requirements, then, play your part on that. As depending on IDENTITY alone won't be enough for most companies. If you see that you can depend on identity alone, then use it, if not, then see if you can use dimensions as an identity (you could create five dimensions - Company, Project, Department, Area, Cost Center - it will be enough for any company).
if you used identity alone, and want to migrate, then in your insert statement do :
SET IDENTITY_INSERT tableName ON
INSRT INTO tableName (columns)
...
this will allow you to insert inside identity column, however, doing this might require you to reset the identity to a new value, to avoid having issues. read DBCC CHECKIDENT
If you end up using dimensions, you could make the dimension and ID both primary keys, which will make sure that both are unique in the table (treated as one set).

Related

Oracle APEX - Data Modeling & Primary Keys

I'm creating a rather large APEX application which allows managers to go in and record statistics for associates in the company. Currently we have a database in oracle with data from AD which hold all the associates information. Name, Manager, Employee ID, etc.
Now I'm responsible for creating and modeling a table that will house all their stats for each employee. The table I have created has over 90+ columns in it. Some contain data such as:
Documents Processed
Calls Received
Amount of Doc 1 Processed
Amount of Doc 2 Processed
and the list goes on for well over 90 attributes. So here is my question:
When creating this table in my application with so many different columns how would I go about choosing a primary key that's appropriate? Should I link it to our employee table using the employees identification which is unique (each have a associate number)?
Secondly, how can I create these tables (and possibly form) to allow me to associate the statistic I am entering for an individual to the actual individual?
I have ordered two books from amazon on data modeling since I am new to APEX and DBA design. Not a fresh chicken, but new enough to need some guidance. An additional problem I am running into is that each form can have only 60 fields to it. So I had thought about creating tables for different functions out of my 90+ I have.
Thanks

4.2 allows for 200 items per page.
oracle apex component limits
A couple of questions come to mind:
Are you sure that the employee Ids are not recyclable? If these ids are unique and not recycled.. you've found yourself a good primary key.
What do you plan on doing when you decide to add a new metric? Seems like you might have to add a new column to your rather large and likely not normalized table.
I'd recommend a vertical table for your metrics.. you can use oracle's pivot function to make your data appear more like a horizontal table.
If you went this route you would store your employee Id in one column, your metric key in another, and value...
I'd recommend that you create a metric table consisting of a primary key, a metric label, an active indicator, creation timestamp, creation user id, modified timestamp, modified user id.
This metric table will allow you to add new metrics, change the name of the metric, deactivate a metric, and determine who changed what and when.
This would be a much more flexible approach in my opinion. You may also want to think about audit logs.

Best Table Relationship Design for Similar Entities

I am trying to figure out the best way to set up my Entity Diagram. I will explain based on the image below.
tblParentCustomer: This table stores information for our Primary Customers, which can either be a Business or Consumer.(They are identified using a lookup table tblCustomerType.)
tblChildCustomer: This table stores customers that are under the Primary Customer. The Primary Business customers can have Authorized Employees and Authorized Reps. The Primary Consumer customers can have Authorized Users. (They are identified using a lookup table tblCustomerType.)
tblChildAccountNumber: This table stores AccountNumbers for tblChildCustomer. These account numbers are mainly for the Child Business Customers. I may be adding Account Numbers for the Child Consumer customers, I am not sure yet, but I believe this design will allow for that if/when necessary.
Going back to tblParentCustomer : If this customer is a Consumer, I will need to add account numbers for them. My question is, do I create a 1 - Many relationship between tblParentCustomer and tblParentAccountNumber? This option would give me 2 different Account Number Tables.
Or would it make sense to create a Junction Account Table that intersects tblParentCustomer and tblChildCustomer?
The first option doesn't really make sense to me because what if there is only 1 Account number for a customer but multiple childCustomers?
Does it make sense to have 2 similar Account Tables that serve a different purpose?

Creating a many-to-many the way you want it to be, you need a link table that will make the whole thing go from 1-* and then *-1
That link table will have two FK, one linking to the parentTable and one linking to the childTable. Combination of those two FK will give you a composite PK (this is important to avoid duplicates). It will allow for any customer to be part of as many accounts as possible (duh.. it'll make the parent/child table a many-to-many relationship).
This approach is extremely common with regards to CRM or any Accounts containing people. Bring it one step further and in that table, you might want to add a "is primary contact" in the AccountMembers table. Drop the childAccountNumber table; you don't need it.

Two separated tables vs One table with two columns

I am creating a windows forms application that must control the entry and exit of people in an office building. These people may be visitors or employees, and everybody must use an access card at the building entrance. A card will be programmed temporarily when the person is a visitor, and the employees should use their own cards. My doubt is about my database. Is there a way to do this nicely? The cards (no matter if it is an employee or visitor ) has a strong key to the ratchet can identify it wich comes from the manufacturer of turnstiles and I can't change it. So, my structure is:
In my database, I have a table where I keep the cards. When someone try to get inside the building, the turnstile sends to my system the date and time of access and the card code. Now I do not know how to separate the employees and visitors. Should I have a separated flow table to employees and visitors? For the employees flow table, I get the employees card from the card table using the same ID. In the visitor flow table, I need to know who is the visitor using the temporary card (the key to the temporary card never changes, so I can not rely on me only in the key). Or should I only have a flow table with a Visitor_ID and Worker_ID column , one of which will always be null (so I know if it was an employee or a visitor by the field with a value).
Can anyone tell me which of the two is more applicable and why?

Employees and visitors are both people. Specifically people that may (or may not) have an access card assigned to them.
I would have one People table that has a foreign key relationship to the AccessCard table. If you only care about whether the person is an employee or visitor, but the information you store is otherwise identical, a boolean column is fine. If your system stores additional information for employees and/or visitors, create an Employees and Visitors table, and have a foreign key relationship from People to each of those.

I would create single table to store both employee and visitor then add an extra column for Type(E, V).

Organizing database tables - large number of properties

I have a database that stores some users in it. Each user has its account settings, privacy settings and lots of other properties to set. The number of those properties started to grow and I could end up with 30 properties or so.
Till now, I used to keep it in "UserInfo" table having User and UserInfo related as One-To-Many (keeping a log of all changes). Putting it in a single "UserInfo" table doesn't sound nice and, at least in the database model, it would look messy. What's the solution?
Separating privacy settings, account settings and other "groups" of settings in separate tables and have 1-1 relations between UserInfo and each group of settings table is one solution, but would that be too slow (or much slower) when retrieving the data? I guess all data would not be presented on a single page at the same moment. So maybe having one-to-many relationships to each table is a solution too (keeping log of each group separately)?

If it's only 30 properties, I'd recommend just creating 30 columns. That's not too much for a modern database to handle.
But I would guess that if you ahve 30 properties today, you will continue to invent new properties as time goes on, and the number of columns will keep growing. Restructuring your table to add columns every day may become time-consuming as you get lots of rows.
For an alternative solution check out this blog for a nifty solution for storing lots of dynamic attributes in a "schemaless" way: How FriendFeed Uses MySQL.
Basically, collect all the properties into some format and store it in a single TEXT column. The format is semi-structured, that is your application can separate the properties if needed but you can also add more at any time, or even have different properties per row. XML or YAML or JSON are example formats, or some object serialization format supported by your application code language.
CREATE TABLE Users (
user_id SERIAL PRIMARY KEY,
user_proerties TEXT
);
This makes it hard to search for a given value in a given property. So in addition to the TEXT column, create an auxiliary table for each property you want to be searchable, with two columns: values of the given property, and a foreign key back to the main table where that particular value is found. Now you have can index the column so lookups are quick.
CREATE TABLE UserBirthdate (
user_id BIGINT UNSIGNED PRIMARY KEY,
birthdate DATE NOT NULL,
FOREIGN KEY (user_id) REFERENCES Users(user_id),
KEY (birthdate)
);
SELECT u.* FROM Users AS u INNER JOIN UserBirthdate b USING (user_id)
WHERE b.birthdate = '2001-01-01';
This means as you insert or update a row in Users, you also need to insert or update into each of your auxiliary tables, to keep it in sync with your data. This could grow into a complex chore as you add more auxiliary tables.

What is the best practices in db design when I want to store a value that is either selected from a dropdown list or user-entered?

I am trying to find the best way to design the database in order to allow the following scenario:
The user is presented with a dropdown list of Universities (for example)
The user selects his/her university from the list if it exists
If the university does not exist, he should enter his own university in a text box (sort of like Other: [___________])
how should I design the database to handle such situation given that I might want to sort using the university ID for example (probably only for the built in universities and not the ones entered by users)
thanks!
I just want to make it similar to how Facebook handles this situation. If the user selects his Education (by actually typing in the combobox which is not my concern) and choosing one of the returned values, what would Facebook do?
In my guess, it would insert the UserID and the EducationID in a many-to-many table. Now what if the user is entering is not in the database at all? It is still stored in his profile, but where?

CREATE TABLE university
(
id smallint NOT NULL,
name text,
public smallint,
CONSTRAINT university_pk PRIMARY KEY (id)
);
CREATE TABLE person
(
id smallint NOT NULL,
university smallint,
-- more columns here...
CONSTRAINT person_pk PRIMARY KEY (id),
CONSTRAINT person_university_fk FOREIGN KEY (university)
REFERENCES university (id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
public is set to 1 for the Unis in the system, and 0 for user-entered-unis.

You could cheat: if you're not worried about the referential integrity of this field (i.e. it's just there to show up in a user's profile and isn't required for strictly enforced business rules), store it as a simple VARCHAR column.
For your dropdown, use a query like:
SELECT DISTINCT(University) FROM Profiles
If you want to filter out typos or one-offs, try:
SELECT University FROM PROFILES
GROUP BY University
HAVING COUNT(University) > 10 -- where 10 is an arbitrary threshold you can tweak
We use this code in one of our databases for storing the trade descriptions of contractor companies; since this is informational only (there's a separate "Category" field for enforcing business rules) it's an acceptable solution.

Keep a flag for the rows entered through user input in the same table as you have your other data points. Then you can sort using the flag.

One way this was solved in a previous company I worked at:
Create two columns in your table:
1) a nullable id of the system-supplied string (stored in a separate table)
2) the user supplied string
Only one of these is populated. A constraint can enforce this (and additionally that at least one of these columns is populated if appropriate).
It should be noted that the problem we were solving with this was a true "Other:" situation. It was a textual description of an item with some preset defaults. Your situation sounds like an actual entity that isn't in the list, s.t. more than one user might want to input the same university.

This isn't a database design issue. It's a UI issue.
The Drop down list of universities is based on rows in a table. That table must have a new row inserted when the user types in a new University to the text box.
If you want to separate the list you provided from the ones added by users, you can have a column in the University table with origin (or provenance) of the data.

I'm not sure if the question is very clear here.
I've done this quite a few times at work and just select between either the drop down list of a text box. If the data is entered in the text box then I first insert into the database and then use IDENTITY to get the unique identifier of that inserted row for further queries.
INSERT INTO MyTable Name VALUES ('myval'); SELECT ##SCOPE_IDENTITY()
This is against MS SQL 2008 though, I'm not sure if the ##SCOPE_IDENTITY() global exists in other versions of SQL, but I'm sure there's equivalents.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight