Ensuring that two column values are related in SQL Server - sql-server

I'm using Microsoft SQL Server 2017 and was curious about how to constrain a specific relationship. I'm having a bit of trouble articulating so I'd prefer to share through an example.
Consider the following hypothetical database.
Customers
+---------------+
| Id | Name |
+---------------+
| 1 | Sam |
| 2 | Jane |
+---------------+
Addresses
+----------------------------------------+
| Id | CustomerId | Address |
+----------------------------------------+
| 1 | 1 | 105 Easy St |
| 2 | 1 | 9 Gale Blvd |
| 3 | 2 | 717 Fourth Ave |
+------+--------------+------------------+
Orders
+-----------------------------------+
| Id | CustomerId | AddressId |
+-----------------------------------+
| 1 | 1 | 1 |
| 2 | 2 | 3 |
| 3 | 1 | 3 | <--- Invalid Customer/Address Pair
+-----------------------------------+
Notice that the final Order links a customer to an address that isn't theirs. I'm looking for a way to prevent this.
(You may ask why I need the CustomerId in the Orders table at all. To be clear, I recognize that the Address already offers me the same information without the possibility of invalid pairs. However, I'd prefer to have an Order flattened such that I don't have to channel through an address to retrieve a customer.)
From the related reading I was able to find, it seems that one method may be to enable a CHECK constraint targeting a User-Defined Function. This User-Defined Function would be something like the following:
WHERE EXISTS (SELECT 1 FROM Addresses WHERE Id = Order.AddressId AND CustomerId = Order.CustomerId)
While I imagine this would work, given the somewhat "generality" of the articles I was able to find, I don't feel entirely confident that this is my best option.
An alternative might be to remove the CustomerId column from the Addresses table entirely, and instead add another table with Id, CustomerId, AddressId. The Order would then reference this Id instead. Again, I don't love the idea of having to channel through an auxiliary table to get a Customer or Address.
Is there a cleaner way to do this? Or am I simply going about this all wrong?

Good question, however at the root it seems you are struggling with creating a foreign key constraint to something that is not a foreign key:
Orders.CustomerId -> Addresses.CustomerId
There is no simple built-in way to do this because it is normally not done. In ideal RDBMS practices you should strive to encapsulate data of specific types in their own tables only. In other words, try to avoid redundant data.
In the example case above the address ownership is redundant in both the address table and the orders table, because of this it is requiring additional checks to keep them synchronized. This can easily get out of hand with bigger datasets.
You mentioned:
However, I'd prefer to have an Order flattened such that I don't have to channel through an address to retrieve a customer.
But that is why a relational database is relational. It does this so that distinct data can be kept distinct and referenced with relative IDs.
I think the best solution would be to simply drop this requirement.
In other words, just go with:
Customers
+---------------+
| Id | Name |
+---------------+
| 1 | Sam |
| 2 | Jane |
+---------------+
Addresses
+----------------------------------------+
| Id | CustomerId | Address |
+----------------------------------------+
| 1 | 1 | 105 Easy St |
| 2 | 1 | 9 Gale Blvd |
| 3 | 2 | 717 Fourth Ave |
+------+--------------+------------------+
Orders
+--------------------+
| Id | AddressId |
+--------------------+
| 1 | 1 |
| 2 | 3 |
| 3 | 3 | <--- Valid Order/Address Pair
+--------------------+
With that said, to accomplish your purpose exactly, you do have views available for this kind of thing:
create view CustomerOrders
as
select o.Id OrderId,
a.CustomerId,
o.AddressId
from Orders
join Addresses a on a.Id = o.AddressId
I know this is a pretty trivial use-case for a view but I wanted to put in a plug for it because they are often neglected and come in handy with organizing big data sets. Using WITH SCHEMABINDING they can also be indexed for performance.

You may ask why I need the CustomerId in the Orders table at all. To be clear, I recognize that the Address already offers me the same information without the possibility of invalid pairs. However, I'd prefer to have an Order flattened such that I don't have to channel through an address to retrieve a customer.
If you face performance problems, the first thing is to create or amend proper indexes. And DBMS are usually good at join operations (with proper indexes). But yes normalization can sometimes help in performance tuning. But it should be a last resort. And if that route is taken, one should really know what one is doing and be very careful not to damage more at the end of a day, that one has gained. I have doubts, that you're out of options here and really need to go that path. You're likely barking up the wrong tree. Therefore I recommend you take the "normal", "sane" way and just drop customerid in orders and create proper indexes.
But if you really insist, you can try to make (id, customerid) a key in addresses (with a unique constraint) and then create a foreign key based on that.
ALTER TABLE addresses
ADD UNIQUE (id,
customerid);
ALTER TABLE orders
ADD FOREIGN KEY (addressid,
customerid)
REFERENCES addresses
(id,
customerid);

Related

Issue with new table in oracle

I have to add a new table according to some new requirements, the model currently consists of two tables: DETAIL and SUMMARY.
The relation is that every detail has associated one summary, so now I need to add a new table called SUMMARY_ESP, which has the a FK ( SUMMARY) and two more columns, something like this:
ID | SUMMARY_ID | ESP_ID | PRIORITY_ESP | PTY_ID | PRIORITY_PTY
1 | 123 | 34 | 1 | 122 | 1
2 | 123 | 35 | 2 | 111 | 2
3 | 123 | 30 | 3 | null | null
4 | 1111 | 34 | 4 | null | null
Other tables info:
DETAIL TABLE
ID_DET | AMOUNT | DATE | ID_SUMMARY | EXTERNAL_ID
1 | 1000 | 14/05/2018 | 1111 | 4
2 | 2000 | 18/07/2016 | 1111 | 4
3 | 1200 | 11/07/2017 | 123 | 1
4 | 1300 | 21/09/2018 | 123 | 2
SUMMARY TABLE
ID_SUMMARY | PRIORITY| PROFILE | CLASS | AREA
123 | 1 | 1 | 5 | 3
1111 | 2 | 1 | 5 | 3
33 | 3 | 2 | 5 | 9
4 | 4 | 8 | 5 | 10
So according to this, SUMMARY_ID , ESP_ID and PTY_ID are unique, the thing is at some point to know the what is the ESP_ID of certain detail, but since the relation is with SUMMARY table, I have no idea which one was when it was added, so I was asked to create a new column to the DETAIL table called EXTERNAL_ID, so I can know what is the code from the SUMMARY_ESP.
So if the row is the first one, it can be either 24 or 122 in the new column according to some previous logic, but I'm worried about the implications this might have in the future, because somehow I might be duplicating information, also I would need to make some weird logic in order to get the priority depending on whether it's ESP_ID or PTY_ID.
The new table along with SUMMARY are somehow parameters table, their values do not change that often and only the PRIORITY column would change, DETAIL instead is more transactional, and it has insert and update everyday according to some business logic.
I was thinking of adding the ID of the new table as a FK to the DETAIL table, but at the end would be the same, because it'll be hard to maintain and update would be harder, also it's like a circular dependency, so I'm kind of stuck with this , so any kind of help would be really helpful, below the complete model, with the current idea.
Also I can't add those new columns to the table SUMMARY, because there could be more than one associated to the same code in that table and since it's the PK I cant add two rows with the same code.
The relation is that every detail has associated one summary
You need to represent that relationship in your database layout : if you have a 1-N relationship between SUMMARY and DETAIL, you want to create another column in DETAIL that holds the primary key of the SUMMARY record that it is related to.
With this relation in place, you can start from a DETAIL row, relate a row from SUMMARY and identifiy all SUMMARY_ESP records that are linked to it.
Now if you need to uniquely relate a DETAIL record to a SUMMARY_ESP record, then you want to either add a foreign key to SUMMARY_ESP in DETAIL, or the other way around (add a foreign key to DETAIL in SUMMARY_ESP), depending on the way your data flows.

Condensing Row Data into a View

I have data in my PeopleInfo table where there are some people that have multiple records that I am trying to combine together into one record for a view.
All people data is the almost the same except for the PlanId and PlanName. So:
| FirstName | LastName | SSN | PlanId | PlanName | Status | Price1 | Price2 |
|-----------|----------|-----------|--------|----------|-----------|---------|--------|
| John | Doe | 123456789 | 1 | Plan A | Primary | 9.00 | NULL |
|-----------|----------|-----------|--------|----------|-----------|---------|--------|
| John | Doe | 123456789 | 2 | Plan B | Secondary | NULL | 5.00 |
I would like to only to have one John Doe record in my view that looked like this:
| FirstName | LastName | SSN | PlanId | PlanName | Status | Price1 | Price2 |
|-----------|----------|-----------|--------|----------|-----------|---------|--------|
| John | Doe | 123456789 | 1 | Plan A | Primary | 9.00 | 5.00 |
Where the Primary status determines which PlanId and PlanName to show. Can anyone help me with this query?
declare #t table ( FNAME varchar(10), LNAME varchar(10), SSN varchar(10), PLANID INT,PLANNAME varchar(10),stat varchar(10),Price1 decimal(18,2),Price2 decimal(18,2))
insert into #t (FNAME,LNAME,SSN,PLANID,PLANNAME,stat,Price1,Price2)values ('john','doe','12345',1,'PlanA','primary',9.00,NULL),('john','doe','12345',1,'PlanB','secondary',Null,8.00)
select
FNAME,
LNAME,
SSN,
MAX(PLANID)PLANID,
MIN(PLANNAME)PLANNAME,
MIN(stat)stat,
MIN(Price1)Price1,
MIN(Price2)Price2 from #t
GROUP BY FNAME,LNAME,SSN
(I can't yet add a comment, so have an answer.)
The only thing that troubles me here is that i am also determining which PlanId and PlanName since they are different and i want to show a specific one based off of the Status field of both records.
Then you don't even need GROUPing. It would be much simpler. Just SELECT WHERE 'Primary' = PlanName. Assuming that (A) there will always be this PlanName for each user, and (B) You are happy to ignore all others.
P.S. If you will only be using Primary and Secondary PlanNames, you might want to change the column to a bit named something like isPrimaryPlan where 1 indicates true and 0 false. However, if you might bring in e.g. Bronze and Consolation Prize Plans later, then you'll need to retain a more variable datatype. Perhaps store the plans in a separate table and have an int FOREIGN KEY to it... I could go on!
OK, I'm back after having a sleep, which has improved my brain slightly,
First, let the record reflect that I don't like the database design here. The People and Plans should be separate tables, linked by foreign keys - via a 3rd table, e.g. PeoplePlans. That takes me to another point: the people here have no primary key (at least not that you have specified). So when writing the below, I had to pick the SSN, assuming that will always be present and unique.
Anyway, something like this should work, with the caveat that I'm not going to replicate the database structure to test it.
select
FirstName,
LastName,
SSN,
PlanId,
PlanName,
Status,
_ca._sum_Price1,
_ca._sum_Price2
from
PeopleInfo as _Primary
cross apply (
select
sum(Price1) as _sum_Price1,
sum(Price2) as _sum_Price2
from
PeopleInfo
where
_Primary.SSN = SSN
) as _ca
where
'Primary' = Status;
This SELECTs all People with Primary status in order to get you those rows. It then CROSS APPLYs their Primary and any other rows and takes the summed Prices.
Hopefully this makes sense. If not, you'll have to do some reading about CROSS APPLY, in addition to about good relational database design. ;-)

Best structure for this database?

The Problem: I want to design a website and I need a database for that, however I don't know which structure is better!
What will happen: Users will add some URIs to their favorites.
Possible structures:
Structure one:
TABLE "USERS":
=====================================================================
id | name | last_name | urls
1 | John | Smith | [google.com,stackoverflow.com,yahoo.com,...]
=====================================================================
Structure two:
TABLE "USERS":
=============================================
id | name | last_name
1 | John | Smith
2 | Joe | Roth
==============================================
TABLE "URLS":
==============================================
id | user_id | url
1 | 1 | google.com
2 | 1 | stackoverflow.com
3 | 2 | ask.com
4 | 1 | yahoo.com
5 | 2 | being.com
==============================================
Which structure is best? Thanks in advance!
The second schema (with a foreign key constraint on URLS.user_id) will be easier to manage. A select query will need to use a join and you'll be performing more inserts, but you won't need to perform string parsing to figure out what the urls are (which, with the first schema, you would need for every single select and update).
One of the tables in the production database at my current job has a schema similar to the first case, and it makes my entire department cringe and complain when they have to write code working with it. Yet the table remains, because fixing it would be a massive overhaul. (The table was created by a manager who no longer works here.) If you're creating your schema now, do it right, while you still can.
Absolutely create another table that save urls + user_id. Your first type is not in Normal form.

Relationships Between Tables in MS Access

I'm new in DataBases at all and have some difficulties with setting relationships between 3 tables in MS Access 2013.
The idea is that I have a table with accounts info, a table with calls related to this accounts and also one table with all the possible call responses. I tried different combinations between them but nothing works.
1st table - Accounts : AccountID(PK) | AccountName | Language | Country | Email
2nd table - Calls : CallID(PK) | Account | Response | Comment | Date
3rd table - Responses: ResponseID(PK) | Response
When you have a table, it usually has a Primary Key field that is the main index of the table. In order for you to connect it with other tables, you usually do that by setting Foreign Key on the other table.
Let's say you have your Accounts table, and it has AccountID field as Primary Key. This field is unique (meaning no duplicate value for this field).
Now, you have the other table called Calls and you have a Foreign Key field called AccountID there, which points to the Accounts table.
Essentially you have Accounts with the following data:
AccountID| AccountName | Language | Country | Email
1 | FirstName | EN | US | some#email.com
2 | SecondName | EN | US | some#email.com
Now you have the other table Calls with Many calls
CallID(PK) | AccountID(FK) | ResponseID(FK) | Comment | Date
1 | 1 | 1 | a comment | 26/10
2 | 1 | 1 | a comment | 26/10
3 | 2 | 3 | a comment | 26/10
4 | 2 | 3 | a comment | 26/10
You can see the One to Many relationship: One accountID (in my example AccountID=1) to Many Calls (in my example 2 rows with AccountID=1 as foreign keys, rows 1 & 2) and AccountID=2 has also 2 rows of Calls (rows 3 and 4)
Same goes for the Responses table
Using this table structure:
Accounts : AccountID(PK) | AccountName | Language | Country | Email
Calls : CallID(PK) | AccountID(FK) | ResponseID(FK) | Comment | Date
Responses: ResponseID(PK) | Response
Accounts.AccountID is referenced by Calls.AccountID. 1:n – many calls for one account possible, but each call concerns just one account.
Responses.ResponseID is referenced by Calls.ResponseID. 1:n – many calls can get the same response from the prepared set, but each call gets exactly one of them.
To actually define the Relationships in Access, open the Relationships window...
... then follow the detailed instructions here:
How to define relationships between tables in an Access database

Schema to support dynamic properties

I'm working on an editor that enables its users to create "object" definitions in real-time. A definition can contain zero or more properties. A property has a name a type. Once a definition is created, a user can create an object of that definition and set the property values of that object.
So by the click of a mouse-button, the user should ie. be able to create a new definition called "Bicycle", and add the property "Size" of type "Numeric". Then another property called "Name" of type "Text", and then another property called "Price" of type "Numeric". Once that is done, the user should be able to create a couple of "Bicycle" objects and fill in the "Name" and "Price" property values of each bike.
Now, I've seen this feature in several software products, so it must be a well-known concept. My problem started when I sat down and tried to come up with a DB schema to support this data structure, because I want the property values to be stored using the appropriate column types. Ie. a numeric property value is stored as, say, an INT in the database, and a textual property value is stored as VARCHAR.
First, I need a table that will hold all my object definitions:
Table obj_defs
id | name |
----------------
1 | "Bicycle" |
2 | "Book" |
Then I need a table for holding what sort of properties each object definition should have:
Table prop_defs
id | obj_def_id | name | type |
------------------------------------
1 | 1 | "Size" | ? |
2 | 1 | "Name" | ? |
3 | 1 | "Price" | ? |
4 | 2 | "Title" | ? |
5 | 2 | "Author" | ? |
6 | 2 | "ISBN" | ? |
I would also need a table that holds each object:
Table objects
id | created | updated |
------------------------------
1 | 2011-05-14 | 2011-06-15 |
2 | 2011-05-14 | 2011-06-15 |
3 | 2011-05-14 | 2011-06-15 |
Finally, I need a table that will hold the actual property values of each object, and one solution is for this table to have one column for each possible value type, such as this:
Table prop_vals
id | prop_def_id | object_id | numeric | textual | boolean |
------------------------------------------------------------
1 | 1 | 1 | 27 | | |
2 | 2 | 1 | | "Trek" | |
3 | 3 | 1 | 1249 | | |
4 | 1 | 2 | 26 | | |
5 | 2 | 2 | | "GT" | |
6 | 3 | 2 | 159 | | |
7 | 4 | 3 | | "It" | |
8 | 5 | 3 | | "King" | |
9 | 6 | 4 | 9 | | |
If I implemented this schema, what would the "type" column of the prop_defs table hold? Integers that each map to a column name, varchars that simply hold the column name? Any other possibilities? Would a stored procedure help me out here in some way? And what would the SQL for fetching the "name" property of object 2 look like?
You are implementing something called Entity-Attribute-Value model http://en.wikipedia.org/wiki/Entity-attribute-value_model.
Lots of folks will say it's a bad idea (usually I am one of those) because the answer to your last question, "What would the SQL for fetching..." tends to be "thick hairy and nasty, and gettting worse."
These criticisms tend to hold once you allow users to start nesting objects inside of other objects, if you do not allow that, the situation will remain manageable.
For your first question, "what would the "type" column of the prop_defs table hold", everything will be simpler if you have a table of types and descriptions that holds {"numeric","Any Number"}, {"textual","String"}, etc. The first value is the primary key. Then in prop_defs your column "type" is a foreign key to that table and holds values "numeric", "textual", etc. Some will tell you incorrectly to always use integer keys because they JOIN faster, but if you use the values "numeric", "textual" etc. you don't have to JOIN and the fastest JOIN is the one you don't do.
The query to grab a single value will have a CASE statement:
SELECT case when pd.type = "numeric" then pv.numeric
when pd.type = "textual" then pv.textual
when pd.type = "boolean" then pv.boolean
from prov_vals pv
JOIN prop_defs pd ON pv.prop_def_id = pv.id
WHERE pv.object_id = 2
AND pd.name = "Name"
You must accept that relational databases are not good at providing this kind of functionality. They CAN provide it, but are not good at it. (I hope I'm wrong). Relational databases lend themselves better to defined interfaces, not changing interfaces.
--EAV tables give dynamic fields but suck on performance. Sucks on indexing. And it is complex to query. It gets the job done in many situations, but can fall apart on big tables with lots of users hitting the system.
--"Regular" tables with several place holder columns are OK for performance, but you get non-descriptive column names and are limited in the number of columns you can "add". Also it does not support sub-type separation.
--Typically you create/modify tables at development time, not run time. Should we really discriminate against modifying the database at run time? maybe, maybe not. Creating new tables, foreign keys, and columns at run-time can achieve true dynamic objects, while giving the performance benefits of "regular" tables. But you would have to query the schema of the database, then dynamically generate all of your queries. That would suck. It would totally break the concept of tables as an interface.

Resources