Here is a simplification of my database:
Table: Property
Fields: ID, Address
Table: Quote
Fields: ID, PropertyID, BespokeQuoteFields...
Table: Job
Fields: ID, PropertyID, BespokeJobFields...
Then we have other tables that relate to the Quote and Job tables individually.
I now need to add a Message table where users can record telephone messages left by customers regarding Jobs and Quotes.
I could create two identical tables (QuoteMessage and JobMessage), but this violates the DRY principle and seems messy.
I could create one Message table:
Table: Message
Fields: ID, RelationID, RelationType, OtherFields...
But this stops me from using constraints to enforce referential integrity. I can also foresee it causing problems on the development side when using LINQ to SQL later on.
Is there an elegant solution to this problem, or am I ultimately going to have to hack something together?
Burns
Create one Message table, containing a unique MessageId and the various properties you need to store for a message.
Table: Message
Fields: Id, TimeReceived, MessageDetails, WhateverElse...
Create two link tables - QuoteMessage and JobMessage. These will just contain two fields each, foreign keys to the Quote/Job and the Message.
Table: QuoteMessage
Fields: QuoteId, MessageId
Table: JobMessage
Fields: JobId, MessageId
In this way you have defined the data properties of a Message in one place only (making it easy to extend, and to query across all messages), but you also have the referential integrity linking Quotes and Jobs to any number of messages. Indeed, both a Quote and Job could be linked to the same message (I'm not sure if that is appropriate to your business model, but at least the data model gives you the option).
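For what it's worth, here is a minimal sketch of the link-table layout, written in Python against an in-memory SQLite database rather than SQL Server, just so it can be run as-is. The column types, the sample rows and the helper `try_link` are assumptions made for illustration; only the table and key names come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE Quote   (Id INTEGER PRIMARY KEY, PropertyId INTEGER);
CREATE TABLE Job     (Id INTEGER PRIMARY KEY, PropertyId INTEGER);
CREATE TABLE Message (Id INTEGER PRIMARY KEY, TimeReceived TEXT, MessageDetails TEXT);

-- Link tables: nothing but two foreign keys each.
CREATE TABLE QuoteMessage (
    QuoteId   INTEGER NOT NULL REFERENCES Quote(Id),
    MessageId INTEGER NOT NULL REFERENCES Message(Id),
    PRIMARY KEY (QuoteId, MessageId)
);
CREATE TABLE JobMessage (
    JobId     INTEGER NOT NULL REFERENCES Job(Id),
    MessageId INTEGER NOT NULL REFERENCES Message(Id),
    PRIMARY KEY (JobId, MessageId)
);

INSERT INTO Quote   VALUES (1, 100);
INSERT INTO Job     VALUES (1, 100);
INSERT INTO Message VALUES (1, '2010-01-01 09:00', 'Please call back');
""")


def try_link(table, parent_id, message_id):
    """Hypothetical helper: returns False when a constraint rejects the link.

    `table` is one of the two fixed link-table names, never user input.
    """
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (parent_id, message_id))
        return True
    except sqlite3.IntegrityError:
        return False


quote_linked = try_link("QuoteMessage", 1, 1)  # message 1 attached to quote 1
job_linked = try_link("JobMessage", 1, 1)      # the same message attached to job 1
dangling = try_link("QuoteMessage", 1, 999)    # no such message, so the FK rejects it
```

The same message can be linked to both a Quote and a Job here, which matches the option mentioned above; if that is not wanted, it would need an extra rule on top of this layout.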
About the only other way I can think of is to have a base Message table, with both an Id and a TypeId. Your subtables (QuoteMessage and JobMessage) then reference the base table on both MessageId and TypeId, but also have CHECK constraints on them to enforce only the appropriate MessageTypeId.
Table: Message
Fields: Id, MessageTypeId, Text, ...
Primary Key: Id, MessageTypeId
Unique: Id
Table: MessageType
Fields: Id, Name
Values: 1, "Quote" : 2, "Job"
Table: QuoteMessage
Fields: Id, MessageId, MessageTypeId, QuoteId
Constraints: MessageTypeId = 1
References: (MessageId, MessageTypeId) = (Message.Id, Message.MessageTypeId)
QuoteId = Quote.QuoteId
Table: JobMessage
Fields: Id, MessageId, MessageTypeId, JobId
Constraints: MessageTypeId = 2
References: (MessageId, MessageTypeId) = (Message.Id, Message.MessageTypeId)
JobId = Job.JobId
What does this buy you, as compared to just a JobMessage and QuoteMessage table? It elevates a Message to a first-class citizen, so that you can read all Messages from a single table. In exchange, your query path from a Message to its relevant Quote or Job is one more join away. Whether that's a good tradeoff depends on your app flow.
As for 2 identical tables violating DRY - I wouldn't get hung up on that. In DB design, it's less about DRY, and more about normalization. If the 2 things you're modeling have the same attributes (columns), but are actually different things (tables) - then it's reasonable to have multiple tables with similar schemas. Much better than the reverse of munging different things together.
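A rough sketch of the base-table-plus-CHECK variant, again in Python with an in-memory SQLite database so it can be run directly (SQL Server syntax would differ slightly). Only the QuoteMessage side is shown; the sample rows and the helper `attach_to_quote` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
conn.executescript("""
CREATE TABLE MessageType (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
INSERT INTO MessageType VALUES (1, 'Quote'), (2, 'Job');

CREATE TABLE Quote (QuoteId INTEGER PRIMARY KEY);

CREATE TABLE Message (
    Id            INTEGER NOT NULL UNIQUE,
    MessageTypeId INTEGER NOT NULL REFERENCES MessageType(Id),
    Text          TEXT,
    PRIMARY KEY (Id, MessageTypeId)
);

CREATE TABLE QuoteMessage (
    Id            INTEGER PRIMARY KEY,
    MessageId     INTEGER NOT NULL,
    MessageTypeId INTEGER NOT NULL CHECK (MessageTypeId = 1),  -- quote messages only
    QuoteId       INTEGER NOT NULL REFERENCES Quote(QuoteId),
    FOREIGN KEY (MessageId, MessageTypeId) REFERENCES Message(Id, MessageTypeId)
);

INSERT INTO Quote   VALUES (10);
INSERT INTO Message VALUES (1, 1, 'call back about the quote'), (2, 2, 'job is running late');
""")


def attach_to_quote(message_id, type_id, quote_id):
    """Hypothetical helper: reports whether the constraints accept the row."""
    try:
        conn.execute(
            "INSERT INTO QuoteMessage (MessageId, MessageTypeId, QuoteId) VALUES (?, ?, ?)",
            (message_id, type_id, quote_id),
        )
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


results = [
    attach_to_quote(1, 1, 10),  # a quote-type message: accepted
    attach_to_quote(2, 1, 10),  # message 2 is a job message, so the composite FK fails
    attach_to_quote(2, 2, 10),  # type 2 is not allowed in QuoteMessage, so the CHECK fails
]
```

The composite foreign key is what stops a row from claiming a type the message does not have; the CHECK is what stops the wrong type from landing in the wrong subtable.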
@burns
Ian's answer (+1) is correct [see note]. Using a many-to-many table QUOTEMESSAGE to join QUOTE to MESSAGE is the most correct model, but will leave orphaned MESSAGE records.
This is one of those rare cases where a trigger can be used. However, caution needs to be applied to ensure that a single MESSAGE record cannot be associated with both a QUOTE and a JOB.
create trigger quotemessage_trg
on quotemessage
for delete
as
begin
    delete
    from [message]
    where [message].[msg_id] in
        (select [msg_id] from Deleted);
end
Note to Ian, I think there is a typo in the table definition for JobMessage, where the columns should be JobId, MessageId (?). I would edit your quote but it might take me a few years to gain that level of reputation!
Why not just have both QuoteId and JobId fields in the message table? Or does a message have to be regarding either a quote or a job and not both?
I have here some database schema with tables having long fields (in MS-SQL-Server of type "text", in Sybase of type "text" too) and I need to retrieve distinct rows.
The tables look like
create table node (id int primary key, … a few more fields … data text);
create table ref (id int primary key, node_id int, … a few more fields);
For one row in "node", there may be zero or more rows in "ref".
Now I have a query like
SELECT node.* FROM node, ref WHERE node.id = ref.node_id AND ... some more restrictions.
This query returns duplicate and triplicate rows when there is more than a single row in "ref" for some "node_id".
But I need unique rows!
Using SELECT DISTINCT node.* does not work because of the columns of type "text" :-(
In Sybase there is a trick: just add "GROUP BY node.id" to the query and, voilà, you get unique rows returned.
Is there a similarly simple trick for MS SQL Server?
I already have a solution with temporary tables, but it seems to be a lot slower. Maybe that is just because of the larger number of statements sent to the database?
It looks like you are approaching this problem from the wrong direction. Joins are typically used to expand on keys where relevant data is stored in different tables. So it's no surprise you are getting more than one row per node_id.
In your query, you join the two tables together, but then you ignore everything from ref. It looks like you're just trying to filter out ids from node that are not referenced in ref. If that is the case, then you don't want to use a join. The following will work much better
select *
from node
where id in (
    select node_id
    from ref
    where [any restrictions placed on the ref table go here]
)
and [any restrictions placed on the node table go here]
Furthermore, at the risk of teaching you bad join practices, the same thing can be accomplished the way you were trying to do it originally, but it's more painful to write and it's not good practice:
select node.col1, node.col2, ... , node.last_col
FROM node
inner join ref on node.id = ref.node_id
where [some restrictions.]
group by node.col1, node.col2, ... , node.last_col
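To make the difference concrete, here is a small sketch using Python's sqlite3 module with made-up rows. SQLite does not have the "text" column restriction that SQL Server has, so this only illustrates the row counts, not the DISTINCT problem itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE node (id INTEGER PRIMARY KEY, data TEXT);
CREATE TABLE ref  (id INTEGER PRIMARY KEY, node_id INTEGER);
INSERT INTO node VALUES (1, 'first'), (2, 'second'), (3, 'unreferenced');
INSERT INTO ref  VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

# The original join: one output row per matching ref row, so node 1 appears three times.
joined = conn.execute(
    "SELECT node.id FROM node, ref WHERE node.id = ref.node_id ORDER BY node.id"
).fetchall()

# The IN (subquery) filter: each node row is returned at most once.
filtered = conn.execute(
    "SELECT id FROM node WHERE id IN (SELECT node_id FROM ref) ORDER BY id"
).fetchall()
```

Because the subquery only decides whether a node row qualifies, no DISTINCT or GROUP BY over the long column is needed.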
I want to begin by stating I'm an SQL noob, so I'd appreciate any suggestions or comments on my workflow and/or mindset when trying to solve this issue.
What I'm doing is gathering usage statistics about several applications, in several categories (not all categories necessarily apply to all applications), storing them in a database.
I've set up a few tables to do that, and then one table to link everything together that's structured like so (from now on: Dtable):
(column name - details)
UserID - foreign key to another table which stores users data
ApplicationID - foreign key to another table which stores applications data
CategoryID - foreign key to another table which holds a list of different categories
Value - the actual data
Each application gathers the data, then submits it to the database using a stored procedure. As the amount of data can be different based on actual usage (not always sending every category) and for each application, I was thinking of sending the data as a DataTable with a list of CategoryID and Value so I won't have to call a procedure for every individual category (Ptable).
I need to update each record in Dtable to the correct value in Ptable according to CategoryID, but also filtered by UserID and ApplicationID. UserID and ApplicationID will be given as two other parameters to the Stored Procedure. Ptable only contains a list of CategoryID / Value records.
Now, I read about Cursors (for each record in the table parameter set the relevant data in the database table), but the consensus seems to be "Avoid at all costs".
How would I go about updating the table, then, based on the varying records in Ptable?
P.S.
The tables are structured like so to keep agility and scalability in adding more categories/applications in the future. If there's a better way to do it I'll be happy to know.
I believe the update statement would look something like this, where @ApplicationID and @UserID are the stored proc's other parameters:
update Dtable
set Dtable.Value = p.Value
from Ptable p
where Dtable.UserID = @UserID
and Dtable.ApplicationID = @ApplicationID
and Dtable.CategoryID = p.CategoryID;
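As a rough illustration of the same set-based update without a cursor, here is a sketch in Python with SQLite. Ptable is an ordinary table here rather than a table-valued parameter, the update is written with a correlated subquery instead of UPDATE ... FROM, and the sample rows and the function name `apply_batch` are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dtable (
    UserID INTEGER, ApplicationID INTEGER, CategoryID INTEGER, Value INTEGER,
    PRIMARY KEY (UserID, ApplicationID, CategoryID)
);
CREATE TABLE Ptable (CategoryID INTEGER PRIMARY KEY, Value INTEGER);

INSERT INTO Dtable VALUES (1, 7, 100, 0), (1, 7, 101, 0), (1, 8, 100, 0), (2, 7, 100, 0);
INSERT INTO Ptable VALUES (100, 42), (101, 5);
""")


def apply_batch(user_id, application_id):
    """Update every matching Dtable row in one statement, no cursor needed."""
    conn.execute(
        """
        UPDATE Dtable
        SET Value = (SELECT p.Value FROM Ptable p WHERE p.CategoryID = Dtable.CategoryID)
        WHERE UserID = ?
          AND ApplicationID = ?
          AND CategoryID IN (SELECT CategoryID FROM Ptable)
        """,
        (user_id, application_id),
    )


apply_batch(1, 7)
rows = conn.execute(
    "SELECT UserID, ApplicationID, CategoryID, Value FROM Dtable "
    "ORDER BY UserID, ApplicationID, CategoryID"
).fetchall()
```

Only the rows for user 1 and application 7 change; rows for other users or applications keep their old values even when the category matches.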
Due to non-disclosure at my work, I have created an analogy of the situation. Please try to focus on the problem and not "Why don't you rename this table, merge those tables etc.", because the actual problem is much more complex.
Here's the deal.
Let's say I have an "Employee Pay Rise" record that has to be approved.
There is a table with single "Users".
There are tables that group Users together, for example, "Managers", "Executives", "Payroll", "Finance". These groupings are different types with different properties.
When creating a "PayRise" record, the user who is creating the record also selects both a number of these groups (managers, executives etc) and/or single users who can 'approve' the pay rise.
What is the best way to relate a single "EmployeePayRise" record to 0 or more user records, and 0 or more of each of the groupings?
I would assume that the users are linked to the groups? If so in this case I would just link the employeePayRise record to one user that it applies to and the user that can approve. So basically you'd have two columns representing this. The EmployeePayRise.employeeId and EmployeePayRise.approvalById columns. If you need to get to groups, you'd join the EmployeePayRise.employeeId = Employee.id records. Keep it simple without over-complicating your design.
My first thought was to create a table that relates individual approvers to pay rise rows.
create table pay_rise_approvers (
pay_rise_id integer not null references some_other_pay_rise_table (pay_rise_id),
pay_rise_approver_id integer not null references users (user_id),
primary key (pay_rise_id, pay_rise_approver_id)
);
You can't have good foreign keys that reference managers sometimes, and reference payroll some other times. Users seems the logical target for the foreign key.
If the person creating the pay rise rows (not shown) chooses managers, then the user interface is responsible for inserting one row per manager into this table. That part's easy.
A person that appears in more than one group might be a problem. I can imagine a vice-president appearing in both "Executive" and "Finance" groups. I don't think that's particularly hard to handle, but it does require some forethought. Suppose the person who entered the data changed her mind, and decided to remove all the executives from the table. Should an executive who's also in finance be removed?
Another problem is that there's a pretty good chance that not every user should be allowed to approve a pay rise. I'd give some thought to that before implementing any solution.
I know it looks ugly, but I think sometimes the solution can be to have the table_name in the table and a union query:
create table approve_pay_rise (
rise_proposal varchar2(10) -- foreign key to payrise table
, approver varchar2(10) -- key of record in table named in other_table
, other_table varchar2(15) );
insert into approve_pay_rise values ('prop000001', 'e0009999', 'USERS');
insert into approve_pay_rise values ('prop000001', 'm0002200', 'MANAGERS');
Then, in code, use either a case statement, repeated statements for each other_table value (select ... where other_table = '' .. select ... where other_table = ''), or a union select.
I have to admit I shudder when I encounter it, and I'll now go wash my hands after typing a recommendation to do it, but it works.
Sounds like you might need two tables ("ApprovalUsers" and "ApprovalGroups"). The SELECT statement(s) would be a UNION of the UserIDs from "ApprovalUsers" and the UserIDs from any groups of users listed in "ApprovalGroups" for the PayRiseId.
SELECT UserID
INTO #TempApprovers
FROM ApprovalUsers
WHERE PayRiseId = 12345
IF EXISTS (SELECT GroupName FROM ApprovalGroups WHERE GroupName = 'Executives' AND PayRiseId = 12345)
BEGIN
    -- #TempApprovers already exists at this point, so append with INSERT ... SELECT
    -- rather than a second SELECT ... INTO
    INSERT INTO #TempApprovers (UserID)
    SELECT UserID
    FROM Executives
END
....
EDIT: this would/could duplicate UserIds, so you would probably want to GROUP BY UserID (i.e. SELECT UserID FROM #TempApprovers GROUP BY UserID)
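The same idea can also be written as a single UNION query, which removes duplicate UserIDs by itself. Below is a sketch in Python with SQLite; the group tables, sample rows and the hard-coded group names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ApprovalUsers  (PayRiseId INTEGER, UserID INTEGER);
CREATE TABLE ApprovalGroups (PayRiseId INTEGER, GroupName TEXT);
CREATE TABLE Executives (UserID INTEGER PRIMARY KEY);
CREATE TABLE Finance    (UserID INTEGER PRIMARY KEY);

INSERT INTO ApprovalUsers  VALUES (12345, 1), (12345, 2);
INSERT INTO ApprovalGroups VALUES (12345, 'Executives');
INSERT INTO Executives VALUES (2), (3);   -- user 2 is also an individual approver
INSERT INTO Finance    VALUES (4);        -- Finance was not selected for this pay rise
""")

query = """
SELECT UserID FROM ApprovalUsers WHERE PayRiseId = :id
UNION
SELECT e.UserID FROM Executives e
WHERE EXISTS (SELECT 1 FROM ApprovalGroups g
              WHERE g.PayRiseId = :id AND g.GroupName = 'Executives')
UNION
SELECT f.UserID FROM Finance f
WHERE EXISTS (SELECT 1 FROM ApprovalGroups g
              WHERE g.PayRiseId = :id AND g.GroupName = 'Finance')
ORDER BY 1
"""

# UNION (as opposed to UNION ALL) drops the duplicate entry for user 2.
approvers = [row[0] for row in conn.execute(query, {"id": 12345})]
```

Each group still needs its own branch, so adding a new group type means touching the query, which is the price of keeping the groups in separate tables.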
Ok I have a question and it is probably very easy but I can not find the solution.
I have 3 tables plus one main tbl.
tbl_1 - tbl_1Name_id
tbl_2- tbl_2Name_id
tbl_3 - tbl_3Name_id
I want to connect the Name_id fields to the main tbl fields below.
main_tbl
___________
tbl_1Name_id
tbl_2Name_id
tbl_3Name_id
The main tbl has a unique key on these fields; in the other tables they are ordinary NOT NULL fields.
What I would like is that any time a record is entered in tbl_1, tbl_2 or tbl_3, the value from the main table shows up in that field, or the other way around.
The relationship is many-to-one, the one side being the main tbl of course.
I have a feeling this should be very simple, but I cannot get it to work.
Take a look at SQL Server triggers. This will allow you to perform an action when a record is inserted into any one of those tables.
If you provide some more information like:
An example of an insert
The resulting change you would like to see as a result of that insert
I can try and give you some more details.
UPDATE
Based on your new comments I suspect that you are working with a denormalized database schema. Below is how I would suggest you structure your tables in the Employee-Medical visit scenario you discussed:
Employee
--------
EmployeeId
fName
lName
EmployeeMedicalVisit
--------------------
VisitId
EmployeeId
Date
Cost
Some important things:
Note that I am not entering the employee's name into the EmployeeMedicalVisit table, just the EmployeeId. This helps to maintain data integrity and complies with First Normal Form.
You should read up on 1st, 2nd and 3rd normal forms. Database normalization is a very important subject and it will make your life easier if you can grasp them.
With the above structure, when an employee visited a medical office you would insert a record into EmployeeMedicalVisit. To select all medical visits for an employee you would use the query below:
SELECT e.fName, e.lName, emv.VisitId, emv.Date, emv.Cost
FROM Employee e
INNER JOIN EmployeeMedicalVisit as emv
ON e.EmployeeId = emv.EmployeeId
Hope this helps!
Here is a sample trigger that may show you what you need to have:
Create trigger mytabletrigger ON mytable
For INSERT
AS
INSERT MYOTHERTABLE (MytableId, insertdate)
select mytableid, getdate() from inserted
In a trigger you have two pseudotables available, inserted and deleted. The inserted table contains the data that is being inserted into the table you have the trigger on, including any autogenerated id. That is how you get the data to the other table, assuming you don't need other data at the same time. You can get other data from system stored procedures or joins to other tables, but not from a form in the application.
If you do need other data that isn't available in a trigger (such as other values from a form), then you need to write a stored procedure to insert into one table, return the id value through an OUTPUT clause or using scope_identity(), and then use that value to build the insert for the next table.
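The "insert, grab the new id, insert again" flow looks roughly like this when sketched in Python with SQLite, where `cursor.lastrowid` plays the part that scope_identity() plays in SQL Server. The `name` and `note` columns and the function `insert_with_note` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE mytable (
    mytableid INTEGER PRIMARY KEY AUTOINCREMENT,
    name      TEXT NOT NULL
);
CREATE TABLE myothertable (
    mytableid  INTEGER NOT NULL REFERENCES mytable(mytableid),
    insertdate TEXT NOT NULL,
    note       TEXT            -- stands in for a value that only the form knows
);
""")


def insert_with_note(name, note):
    """Insert the parent row, then use its generated id for the child row."""
    with conn:  # both inserts commit together, or neither does
        cur = conn.execute("INSERT INTO mytable (name) VALUES (?)", (name,))
        new_id = cur.lastrowid  # id generated by the insert above
        conn.execute(
            "INSERT INTO myothertable (mytableid, insertdate, note) "
            "VALUES (?, datetime('now'), ?)",
            (new_id, note),
        )
    return new_id


first = insert_with_note("alpha", "entered on the form")
second = insert_with_note("beta", "another form value")
rows = conn.execute(
    "SELECT mytableid, note FROM myothertable ORDER BY mytableid"
).fetchall()
```

The point is only that the second insert can carry data (here `note`) that a trigger would never see.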
I have a postgres database with a user table (userid, firstname, lastname) and a usermetadata table (userid, code, content, created datetime). I store various information about each user in the usermetadata table by code and keep a full history. so for example, a user (userid 15) has the following metadata:
15, 'QHS', '20', '2008-08-24 13:36:33.465567-04'
15, 'QHE', '8', '2008-08-24 12:07:08.660519-04'
15, 'QHS', '21', '2008-08-24 09:44:44.39354-04'
15, 'QHE', '10', '2008-08-24 08:47:57.672058-04'
I need to fetch a list of all my users and the most recent value of each of various usermetadata codes. I did this programmatically and it was, of course, godawful slow. The best I could figure out in SQL was to join sub-selects, which was also slow, and I had to do one for each code.
This is actually not that hard to do in PostgreSQL because it has the "DISTINCT ON" clause in its SELECT syntax (DISTINCT ON isn't standard SQL).
SELECT DISTINCT ON (code) code, content, createtime
FROM metatable
WHERE userid = 15
ORDER BY code, createtime DESC;
That will limit the returned results to the first result per unique code, and if you sort the results by the create time descending, you'll get the newest of each.
I suppose you're not willing to modify your schema, so I'm afraid my answer might not be of much help, but here goes...
One possible solution would be to have the time field empty until it was replaced by a newer value, when you insert the 'deprecation date' instead. Another way is to expand the table with an 'active' column, but that would introduce some redundancy.
The classic solution would be to have both 'Valid-From' and 'Valid-To' fields where the 'Valid-To' fields are blank until some other entry becomes valid. This can be handled easily by using triggers or similar. Using constraints to make sure there is only one item of each type that is valid will ensure data integrity.
Common to these is that there is a single way of determining the set of current fields. You'd simply select all entries with the active user and a NULL 'Valid-To' or 'deprecation date' or a true 'active'.
You might be interested in taking a look at the Wikipedia entry on temporal databases and the article A consensus glossary of temporal database concepts.
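A small sketch of the Valid-From/Valid-To idea, in Python with SQLite. The column names `valid_from` and `valid_to`, the helper `set_value` and the use of a partial unique index as the "only one valid item" constraint are my assumptions, and the timestamps are shortened from the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usermetadata (
    userid     INTEGER NOT NULL,
    code       TEXT    NOT NULL,
    content    TEXT    NOT NULL,
    valid_from TEXT    NOT NULL,
    valid_to   TEXT              -- NULL while the row is the current value
);
-- At most one current (valid_to IS NULL) row per user and code.
CREATE UNIQUE INDEX one_current_value
    ON usermetadata (userid, code) WHERE valid_to IS NULL;
""")


def set_value(userid, code, content, now):
    """Close the previous current row, if any, then insert the new one."""
    with conn:
        conn.execute(
            "UPDATE usermetadata SET valid_to = ? "
            "WHERE userid = ? AND code = ? AND valid_to IS NULL",
            (now, userid, code),
        )
        conn.execute(
            "INSERT INTO usermetadata (userid, code, content, valid_from) "
            "VALUES (?, ?, ?, ?)",
            (userid, code, content, now),
        )


set_value(15, "QHE", "10", "2008-08-24 08:47:57")
set_value(15, "QHS", "21", "2008-08-24 09:44:44")
set_value(15, "QHS", "20", "2008-08-24 13:36:33")

# Current values need no MAX() or subselect, just the NULL test.
current = conn.execute(
    "SELECT code, content FROM usermetadata "
    "WHERE userid = ? AND valid_to IS NULL ORDER BY code",
    (15,),
).fetchall()
history_rows = conn.execute("SELECT COUNT(*) FROM usermetadata").fetchone()[0]
```

The full history is still there (three rows), but the current set is a plain filter.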
A subselect is the standard way of doing this sort of thing. You just need a Unique Constraint on UserId, Code, and Date - and then you can run the following:
SELECT *
FROM Table
JOIN (
    SELECT UserId, Code, MAX(Date) as LastDate
    FROM Table
    GROUP BY UserId, Code
) as Latest ON
    Table.UserId = Latest.UserId
    AND Table.Code = Latest.Code
    AND Table.Date = Latest.LastDate
WHERE
    Table.UserId = @userId
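Here is the same subselect pattern as a runnable sketch in Python with SQLite, using the table and column names from the question and its four sample rows (timestamps shortened, stored as ISO text so that MAX() compares them correctly).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usermetadata (
    userid  INTEGER,
    code    TEXT,
    content TEXT,
    created TEXT,
    UNIQUE (userid, code, created)
);
INSERT INTO usermetadata VALUES
    (15, 'QHS', '20', '2008-08-24 13:36:33'),
    (15, 'QHE', '8',  '2008-08-24 12:07:08'),
    (15, 'QHS', '21', '2008-08-24 09:44:44'),
    (15, 'QHE', '10', '2008-08-24 08:47:57');
""")

latest_rows = conn.execute(
    """
    SELECT m.code, m.content
    FROM usermetadata m
    JOIN (
        SELECT userid, code, MAX(created) AS last_created
        FROM usermetadata
        GROUP BY userid, code
    ) AS latest
      ON  m.userid  = latest.userid
      AND m.code    = latest.code
      AND m.created = latest.last_created
    WHERE m.userid = ?
    ORDER BY m.code
    """,
    (15,),
).fetchall()
```

Dropping the WHERE clause gives the latest value of every code for every user in a single pass, which is closer to what the question asks for than one sub-select per code.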