Brief Description:
I have a table that stores articles. Articles are listed in a grid and sorted by dateCreated in descending order.
The dateCreated column holds the date and time the user posted/created the article. It is fixed and must not be changed.
Problem:
Over time, old articles end up on the later pages. However, the user has the option of bumping his article back to the top of the table on the first page. Since I'm ordering the articles by dateCreated, which mustn't be changed, how can I bump the article without changing dateCreated?
My Solution - I'm not sure if it's a good one or not (I need suggestions):
Create another column called bumpDate. When a user posts an article, the same date/time is inserted into both dateCreated and bumpDate. The articles in the gridview are sorted by bumpDate. When a user bumps his article, I only update bumpDate. Therefore the user's article, regardless of its dateCreated, will be on top. Gradually, the article will move down again as other users post new articles.
Do you see any glitches in this design/solution?
What you have outlined is how those things are typically done. While BumpDate might not be the best name (does this truly represent the action of "bumping", or did they do something else like modifying it?), that's what you should use.
I don't see any real issues with what you are proposing. The only possible thing would be to try to use an INT value instead of a DateTime (4 bytes vs 8 bytes) if you are going to have a LOT of data; otherwise I would do the same thing you are proposing.
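A minimal sketch of the bumpDate design discussed above, written with Python's sqlite3 module so it can be run as-is. The table layout, the function names, and the ISO-8601 text timestamps are assumptions made for illustration; on SQL Server you would use a proper date/time column type instead.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical articles table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE articles (
            id          INTEGER PRIMARY KEY,
            title       TEXT NOT NULL,
            dateCreated TEXT NOT NULL,  -- written once, never updated
            bumpDate    TEXT NOT NULL   -- starts equal to dateCreated
        )
        """
    )
    # The listing sorts on bumpDate, so that is the column worth indexing.
    conn.execute("CREATE INDEX ix_articles_bumpDate ON articles (bumpDate)")
    return conn


def post_article(conn, title, now):
    """Insert a new article; both dates start out identical."""
    cur = conn.execute(
        "INSERT INTO articles (title, dateCreated, bumpDate) VALUES (?, ?, ?)",
        (title, now, now),
    )
    return cur.lastrowid


def bump_article(conn, article_id, now):
    """Move an article to the top by touching bumpDate only."""
    conn.execute(
        "UPDATE articles SET bumpDate = ? WHERE id = ?", (now, article_id)
    )


def list_articles(conn):
    """Return titles in display order: most recently bumped first."""
    rows = conn.execute(
        "SELECT title FROM articles ORDER BY bumpDate DESC, id DESC"
    )
    return [row[0] for row in rows]


conn = make_db()
old = post_article(conn, "Old article", "2024-01-01 09:00:00")
post_article(conn, "New article", "2024-01-02 09:00:00")
bump_article(conn, old, "2024-01-03 09:00:00")
print(list_articles(conn))  # the bumped article is listed first
```

Because bump_article never touches dateCreated, the original posting time is still available for display next to each article.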
Potential Table structure
Final form look
I'm uncertain if this table structure is correct for what I want. For one thing, I don't really want a date next to each task, because all of the tasks should reflect one date -- this is a form for the day, where all the tasks completed go with the same day: the form is for that day, so if 7 are checked, that's one day. 12 may be checked the next.
I was also thinking about laying out every task as its own field, with one date field of course, solving the date issue, but that feels wrong.
Should I consider making a second table that links to the tasks, with the fields done, date, taskID (FK)?
Looking for suggestions on table structure, thanks!
Honestly, I think you'd be better off with something like this:
Yes, it will make inserts a bit more annoying. However, the table structure you outlined is clearly a one-to-many relationship. It would be more correct to design your tables in a more relational way, and then account for it in the way you write your code. This will also allow you to insert rows for each level of difficulty as well (assuming you're allowing end users to multi-select).
You could even stretch this out a bit further and make another lookup table for your notes, since they seem to be pretty standardized, and then have the noteID in your tblActivity. This would essentially make tblActivity your main table, with all the nonsense you don't necessarily need in lookup tables.
I'm making an api for movie/tv/actors etc. with web api 2 and sql server. The database now has >30 tables, most of them storing data users will be able to edit.
How should I store old version of entries?
Say someone edits the description, runtime and tagline for an entry (movie) in the movies table.
I'll have a table (movies_old) where I store the editable fields in 'movies' plus who edited them and when.
All in the same database. The '???_old' tables have no relationships.
I'm very new to database design. Is there something obviously wrong with this?
To my mind, there are two issues here: what table you store the data in, and what goes in the "historical value" field.
On the first question, there are two obvious options: Store old and new records in the same table, with some sort of indication of which is "current" and which is "history", or have a separate table for history.
The main advantage of one table is that you have a simpler schema. This is especially true if the table contains many fields. If there are two tables, then all the field definitions are duplicated. When you move data from the current table to the history table, you have to copy every field, and if the list of fields changes, or their formats change, you have to remember to update the copy. Any queries that show the history have to read two tables. Etc. But with one table, all that goes away. Converting a record from current to history just means changing the setting of the "is_current" flag or however you indicate it.
The main advantages of two tables are, (a) Access is probably somewhat faster, as you don't have so many irrelevant records to skip over. (b) When reading the current table you don't have to worry about excluding the history records.
Oh, an annoying thing about SQL: In principle you could put a date on each record, and then the record with the latest date is the current one. In practice this is a pain: you usually have to have an inner query to find the latest date, and then feed this back in to an outer query that re-reads the record with that date. (Some SQL engines have ways around this. Postgres, for example.) So in practice, you need an "is_current" flag, probably 1 for current and 0 for history or some such.
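To make the single-table option concrete, here is a rough sketch using Python's sqlite3. The movies columns, the movie_id/row_id split, and the function names are all hypothetical. It shows the "is_current" flag approach next to the inner-query-on-date alternative mentioned above.

```python
import sqlite3


def make_db():
    """One table holds both current and historical rows (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE movies (
            row_id     INTEGER PRIMARY KEY,
            movie_id   INTEGER NOT NULL,  -- stable identity across versions
            tagline    TEXT,
            edited_by  TEXT NOT NULL,
            edited_at  TEXT NOT NULL,
            is_current INTEGER NOT NULL CHECK (is_current IN (0, 1))
        )
        """
    )
    return conn


def save_version(conn, movie_id, tagline, edited_by, edited_at):
    """Demote the current row to history, then insert the new current row."""
    with conn:  # one transaction: both statements commit or neither does
        conn.execute(
            "UPDATE movies SET is_current = 0 "
            "WHERE movie_id = ? AND is_current = 1",
            (movie_id,),
        )
        conn.execute(
            "INSERT INTO movies "
            "(movie_id, tagline, edited_by, edited_at, is_current) "
            "VALUES (?, ?, ?, ?, 1)",
            (movie_id, tagline, edited_by, edited_at),
        )


def current_tagline(conn, movie_id):
    """Read the current version using the flag."""
    row = conn.execute(
        "SELECT tagline FROM movies WHERE movie_id = ? AND is_current = 1",
        (movie_id,),
    ).fetchone()
    return row[0] if row else None


def latest_by_date(conn, movie_id):
    """Flag-free alternative: an inner query finds the latest date first."""
    row = conn.execute(
        """
        SELECT tagline FROM movies
        WHERE movie_id = ?
          AND edited_at = (SELECT MAX(edited_at) FROM movies WHERE movie_id = ?)
        """,
        (movie_id, movie_id),
    ).fetchone()
    return row[0] if row else None
```

Both readers return the same row here; the flag version is simply the shorter query, which is the trade-off described above.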
The other issue is what to put in the contents. If you're dealing with short fields, customer number and amount billed and so forth, then the simple and easy thing to do is just store the complete old contents in one record and the complete new contents in the new record. But if you're dealing with a long text block, like a plot synopsis or a review, there could be many small editorial changes. If every time someone fixes a grammar or spelling error, we have a whole new record with the entire 1000 characters, of which 5 characters are different, this could really clutter up the database. If that's the case you might want to investigate ways to store changes more efficiently. May or may not be an issue to you.
As this question is sure to reveal, I'm relatively new to database design. Please pardon my lack of understanding.
I want to store values like the following (example from Google calendar) in a database:
What's the best way to do this? Would this be one database field or several?
If the former, does this disobey normalization rules?
Thanks!
I suggest you create a many-to-many relation. You can achieve that by separating the columns in a more logical way (normalization). For the example above:
You should have a table called "schedules" (or whatever makes sense to you), another called something like "repeat_on", and a third table called "days" (here you have Monday-Sunday with their IDs). In the middle table (repeat_on) you should create foreign keys to the other two tables (schedule_id and day_id) to do the magic.
This way you can combine whatever you want, for example:
schedule_id  day_id
1            1
1            3
1            7
Meaning that you have to do the same on Monday, Wednesday and Sunday.
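The three tables above could be sketched like this with Python's sqlite3. The table names schedules, repeat_on and days come from this answer; the title column and the helper functions are assumptions added for illustration.

```python
import sqlite3


def make_db():
    """schedules and days, joined through the repeat_on intersection table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(
        """
        CREATE TABLE schedules (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE days (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE repeat_on (
            schedule_id INTEGER NOT NULL REFERENCES schedules(id),
            day_id      INTEGER NOT NULL REFERENCES days(id),
            PRIMARY KEY (schedule_id, day_id)
        );
        INSERT INTO days (id, name) VALUES
            (1, 'Monday'), (2, 'Tuesday'), (3, 'Wednesday'), (4, 'Thursday'),
            (5, 'Friday'), (6, 'Saturday'), (7, 'Sunday');
        """
    )
    return conn


def set_repeat_days(conn, schedule_id, day_ids):
    """Replace the set of days a schedule repeats on."""
    with conn:  # rolls back the delete if any insert fails
        conn.execute(
            "DELETE FROM repeat_on WHERE schedule_id = ?", (schedule_id,)
        )
        conn.executemany(
            "INSERT INTO repeat_on (schedule_id, day_id) VALUES (?, ?)",
            [(schedule_id, day_id) for day_id in day_ids],
        )


def repeat_days(conn, schedule_id):
    """Return the day names for a schedule, Monday first."""
    rows = conn.execute(
        """
        SELECT d.name
        FROM repeat_on AS r
        JOIN days AS d ON d.id = r.day_id
        WHERE r.schedule_id = ?
        ORDER BY d.id
        """,
        (schedule_id,),
    )
    return [row[0] for row in rows]


conn = make_db()
conn.execute("INSERT INTO schedules (id, title) VALUES (1, 'Team sync')")
set_repeat_days(conn, 1, [1, 3, 7])
print(repeat_days(conn, 1))  # the day names for schedule 1
```

The composite primary key on repeat_on keeps the same day from being attached to a schedule twice.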
IMO, normalization is an art. However, I usually take fields like your example and keep them in one table as we will never have more than 7 days. However, if there is any chance of growth I would put it in a separate table.
If the options are mutually exclusive, then you can use one field to store the choice. Ideally set up constraints for the field such that only the allowed values can be stored.
If more than one option can be chosen, you should have one field per option, with values 'y' and 'n' (or 't'/'f' for true/false). Again, you should add a constraint to only allow these values. If your DBMS supports it, use a BIT datatype, which only allows 1 and 0.
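A small sketch of both cases, using Python's sqlite3. The events table, the "repeats" values and the on_mon..on_sun column names are invented for the example; SQLite has no BIT type, so INTEGER columns with CHECK constraints stand in for it here.

```python
import sqlite3

DAYS = ("mon", "tue", "wed", "thu", "fri", "sat", "sun")


def make_db():
    """One constrained field for an exclusive choice, one flag per option."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE events (
            id      INTEGER PRIMARY KEY,
            title   TEXT NOT NULL,
            -- mutually exclusive choice: one field, restricted to allowed values
            repeats TEXT NOT NULL
                    CHECK (repeats IN ('daily', 'weekly', 'monthly', 'yearly')),
            -- independent choices: one 0/1 field per option
            on_mon  INTEGER NOT NULL DEFAULT 0 CHECK (on_mon IN (0, 1)),
            on_tue  INTEGER NOT NULL DEFAULT 0 CHECK (on_tue IN (0, 1)),
            on_wed  INTEGER NOT NULL DEFAULT 0 CHECK (on_wed IN (0, 1)),
            on_thu  INTEGER NOT NULL DEFAULT 0 CHECK (on_thu IN (0, 1)),
            on_fri  INTEGER NOT NULL DEFAULT 0 CHECK (on_fri IN (0, 1)),
            on_sat  INTEGER NOT NULL DEFAULT 0 CHECK (on_sat IN (0, 1)),
            on_sun  INTEGER NOT NULL DEFAULT 0 CHECK (on_sun IN (0, 1))
        )
        """
    )
    return conn


def add_event(conn, title, repeats, days):
    """Insert an event; `days` is an iterable of names taken from DAYS."""
    chosen = set(days)
    unknown = chosen - set(DAYS)
    if unknown:
        raise ValueError(f"unknown day(s): {sorted(unknown)}")
    flags = [1 if day in chosen else 0 for day in DAYS]
    columns = ", ".join(f"on_{day}" for day in DAYS)
    marks = ", ".join("?" for _ in DAYS)
    cur = conn.execute(
        f"INSERT INTO events (title, repeats, {columns}) "
        f"VALUES (?, ?, {marks})",
        [title, repeats, *flags],
    )
    return cur.lastrowid
```

With the constraints in place, a value such as repeats = 'sometimes' or on_mon = 2 is rejected by the database itself rather than by application code.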
This may be overblown for your example, but you may want to look at the following article:
http://martinfowler.com/apsupp/recurring.pdf
I know this is an indirect answer but couldn't hurt to read.
Something like days of the week is tricky. PachinSV and Dustin Laine both make good points. If you have a list of things to choose from, having a code table to list the things and an intersection table to say which ones are chosen is a good basic design.
The reason days of the week are tricky is that the domain (i.e. list of days) is pretty small and there is no way the domain will ever expand. Also, one of the advantages of the intersection table approach is that you can run a query against everything that happens on a Wednesday (for example). This is great when your code table is something like category tags for blog articles, since asking to see everything with the tag "How-To" is a reasonable question. For the case of days of the week recurrence, does it make any actual business sense to say show me everything that recurs on Wednesdays? I don't think so. For sure you'll query on dates and date ranges, but days of the week, and only in the context of recurrence? I can't think of a practical reason to do that.
Therefore, there is an argument to be made that the days of the week are attributes and not independent entities so having seven bit flags on your table is still 3NF.
Part of my table design is to include an IsDeleted BIT column that is set to 1 whenever a user deletes a record. Therefore all SELECTs are inevitably accompanied by a WHERE IsDeleted = 0 condition.
I read in a previous question (I cannot for the love of God re-find that post and reference it) that this might not be the best design and an 'Audit Trail' table might be better.
How are you guys dealing with this problem?
Update
I'm on SQL Server. Solutions for other DB's are welcome albeit not as useful for me but maybe for other people.
Update2
Just to summarize what everyone has said so far: there seem to be basically 3 ways to deal with this.
Leave it as it is
Create an audit table to keep track of all the changes
Use of views with WHERE IsDeleted = 0
"Therefore all SELECTs are inevitably accompanied by a WHERE IsDeleted = 0 condition."
This is not a really good way to do it; as you probably noticed, it is quite error-prone.
You could create a VIEW which is simply
CREATE VIEW myview AS SELECT * FROM yourtable WHERE IsDeleted = 0;
Then you just use myview instead of yourtable and you don't have to think about this damn column in SELECTs.
Or, you could move deleted records to a separate "archive" table, which, depending on the proportion of deleted versus active records, might make your "active" table a lot smaller, better cached in RAM, ie faster.
If you have to have this kind of Deleted Bit column, then you really should consider setting up some VIEWs with the WHERE clause in it, and use those rather than the underlying tables. Much less error prone.
For example, if you have this view:
CREATE VIEW [Current Product List] AS
SELECT ProductID, ProductName
FROM Products
WHERE Discontinued = 0
Then someone who wants to see current products can simply write:
SELECT * FROM [Current Product List]
This is much less error prone than writing:
SELECT ProductID, ProductName
FROM Products
WHERE Discontinued = 0
As you say, people will forget that WHERE clause, and get confusing and incorrect results.
P.S. the example SQL comes from Microsoft's Northwind database. Normally I would recommend NOT using spaces in column and table names.
We're actively using the "Deleted" column in our enterprise software. It is however a source of constant errors when forgetting to add "WHERE Deleted = 0" to an SQL query.
Not sure what is meant by "Audit Trail". You may wish to have a table to track all deleted records. Or there may be an option of moving the deleted content to paired tables (like Customer_Deleted) to remove the passive content from tables to minimize their size and optimize performance.
A while ago there was some blog uproar on this issue, Ayende and Udi Dahan both posted on this.
Nai, this is totally up to you.
Do you need to be able to see who has deleted / modified / inserted what and when? If so, you should design the tables for this and adjust your procs to write these values when they are called.
If you don't need an audit trail, don't waste time with one. Just do as you are with IsDeleted.
Personally, I flag things right now, as an audit trail wasn't specified in my spec, but that said, I don't like to actually delete things. Hence, I chose to flag it. I'm not going to waste a client's time writing something they didn't request. I won't mess about with other tables because that's another thing for me to think about. I'd just make sure my indexes were up to the job.
Ask your manager or client. Plan out how long the audit trail would take so they can cost it and let them make the decision for you ;)
Udi Dahan said this:
Model the task, not the data
Looking back at the story our friend from marketing told us, his intent is to discontinue the product – not to delete it in any technical sense of the word. As such, we probably should provide a more explicit representation of this task in the user interface than just selecting a row in some grid and clicking the ‘delete’ button (and “Are you sure?” isn’t it).
As we broaden our perspective to more parts of the system, we see this same pattern repeating:
Orders aren’t deleted – they’re cancelled. There may also be fees incurred if the order is canceled too late.
Employees aren’t deleted – they’re fired (or possibly retired). A compensation package often needs to be handled.
Jobs aren’t deleted – they’re filled (or their requisition is revoked).
In all cases, the thing we should focus on is the task the user wishes to perform, rather than on the technical action to be performed on one entity or another. In almost all cases, more than one entity needs to be considered.
If you have an Oracle DB, then you can use an audit trail for auditing. Check the Audit Vault tool from OTN. It even supports SQL Server.
Views (or stored procs) to get at the underlying table data are the best way. However, if you have the problem with "too many cooks in the kitchen" like we do (too many people have rights to the data and may just use the table without knowing enough to use the view/proc) you should try using another table.
We have a complete mimic of the base table with a few extra columns for tracking. So Employee table has an EmployeeDeleted table with the same schema but extra columns for when it was deleted and who deleted it and sometimes even the reason for deletion. You can even get fancy and have triggers do the insertion directly instead of going through applications/procs.
Biggest Advantage: no flag to worry about during selects
Biggest Disadvantage: any schema changes to the base table also have to be made on the "deleted" table
Best for: situations where for whatever reason (usually political with us) many not-as-experienced people have rights to the data but still expect it to be accurate without having to understand flags or schemas, etc
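As a rough illustration of the "deleted" mirror table with a trigger doing the insertion, here is a sketch using Python's sqlite3. The column names are made up. SQLite triggers cannot see which user ran the DELETE, so deleted_by and reason are left for the application to fill in. SQL Server trigger syntax differs from what is shown.

```python
import sqlite3


def make_db():
    """Employee plus an EmployeeDeleted mirror filled by a delete trigger."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Employee (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );

        -- Same columns as Employee, plus tracking columns.
        CREATE TABLE EmployeeDeleted (
            id         INTEGER NOT NULL,
            name       TEXT NOT NULL,
            deleted_at TEXT NOT NULL,
            deleted_by TEXT,   -- not known to the trigger; set by the app
            reason     TEXT    -- likewise optional
        );

        -- Copy the row into the mirror table just before it disappears.
        CREATE TRIGGER employee_archive_on_delete
        BEFORE DELETE ON Employee
        BEGIN
            INSERT INTO EmployeeDeleted (id, name, deleted_at)
            VALUES (OLD.id, OLD.name, datetime('now'));
        END;
        """
    )
    return conn


def active_names(conn):
    """Plain SELECT on the base table: no flag to remember."""
    rows = conn.execute("SELECT name FROM Employee ORDER BY id")
    return [row[0] for row in rows]


def deleted_names(conn):
    """Rows that were removed from Employee, oldest id first."""
    rows = conn.execute("SELECT name FROM EmployeeDeleted ORDER BY id")
    return [row[0] for row in rows]
```

The disadvantage listed above shows up directly in the DDL: every column added to Employee must also be added to EmployeeDeleted and to the trigger.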
I've used soft deletes before on a number of applications I've worked on, and overall it's worked out quite well. Yes, there is the issue of always having to remember to add AND IsActive = 1 to all of your SELECT queries, but really that's not so bad. You can create views if you don't want to have to remember to always do that.
The reason we've done this is because we had very specific business needs to be able to report on records that have been deleted. The reporting needs varied widely - sometimes they'd need to see just the active records, or just the inactive records, or sometimes a mix of both - so pushing all the deleted records into an audit table wasn't a very good option.
So, depending on your particular business needs, I think this approach is certainly a viable option.
I am designing a database that must maintain a history of employee salaries and movements within the organization. Basically, my design has 3 tables (I mean, there are more tables, but for this question I'll mention 3, so bear with me): an Employee table (containing the most current salary, position data, etc.), a SalaryHistory table (salary, date, reason, etc.) and a MovementHistory table (Title, Dept., comments). I'll be using Linq to Sql, so what I was thinking is that every time employee data is updated, the old values will be copied to their respective history tables. Is this a good approach? Should I just do it using Linq to SQL or triggers? Thanks for any help, suggestion or idea.
Have a look at http://www.simple-talk.com/sql/database-administration/database-design-a-point-in-time-architecture .
Basically, the article suggests that you have the following columns in the tables you need to track history for -
* DateCreated – the actual date on which the given row was inserted.
* DateEffective – the date on which the given row became effective.
* DateEnd – the date on which the given row ceased to be effective.
* DateReplaced – the date on which the given row was replaced by another row.
* OperatorCode – the unique identifier of the person (or system) that created the row.
DateEffective and DateEnd together tell you the time for which the row was valid (or the time for which an employee was in a department, or the time for which he earned a particular salary).
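Here is a hedged sketch of how DateEffective and DateEnd can answer "what was the salary on a given day", using Python's sqlite3. The EmployeeSalary table, the use of NULL in DateEnd for a row that is still in effect, and the half-open interval are assumptions of this example, not something taken from the article. DateReplaced is left out to keep it short.

```python
import sqlite3


def make_db():
    """A salary table carrying point-in-time columns (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE EmployeeSalary (
            RowId         INTEGER PRIMARY KEY,
            EmployeeId    INTEGER NOT NULL,
            Salary        INTEGER NOT NULL,
            DateCreated   TEXT NOT NULL,  -- when the row was inserted
            DateEffective TEXT NOT NULL,  -- first day the row applies
            DateEnd       TEXT,           -- assumption: NULL while in effect
            OperatorCode  TEXT NOT NULL   -- who or what created the row
        )
        """
    )
    return conn


def salary_as_of(conn, employee_id, day):
    """Return the salary valid on `day` (ISO date string), or None."""
    row = conn.execute(
        """
        SELECT Salary
        FROM EmployeeSalary
        WHERE EmployeeId = ?
          AND DateEffective <= ?
          AND (DateEnd IS NULL OR DateEnd > ?)
        """,
        (employee_id, day, day),
    ).fetchone()
    return row[0] if row else None


conn = make_db()
conn.executemany(
    "INSERT INTO EmployeeSalary "
    "(EmployeeId, Salary, DateCreated, DateEffective, DateEnd, OperatorCode) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 50000, "2020-01-01", "2020-01-01", "2021-01-01", "hr_import"),
        (1, 55000, "2020-12-15", "2021-01-01", None, "hr_import"),
    ],
)
print(salary_as_of(conn, 1, "2020-06-15"))  # salary in effect mid-2020
```

Treating DateEnd as exclusive means one row's DateEnd can equal the next row's DateEffective without the two overlapping.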
It is a good idea to keep that logic internal to the database: that's basically why triggers exist. I say this carefully, however, as there are plenty of reasons to keep it external. Often times - especially with a technology as easy as LINQ-to-SQL - it is easier to write the code externally. In my experience, more people could write that logic in C#/LINQ than could do it correctly using a trigger.
Triggers are fast - they're compiled! However, they're very easy to misuse and make your logic overcomplex to a degree that performance can degrade rapidly. Considering how simple your use case is, I would opt to use triggers, but that's me personally.
Triggers will likely be faster, and don't require a "middle man" to get the job done, eliminating at least one chance for errors.
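For comparison, this is roughly what the trigger version could look like, sketched with Python's sqlite3. Employee and SalaryHistory are the table names from the question; the columns and the trigger name are guesses, and SQL Server's trigger syntax (inserted/deleted pseudo-tables) is different from SQLite's OLD/NEW shown here.

```python
import sqlite3


def make_db():
    """Employee holds the current salary; a trigger records the old one."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Employee (
            Id     INTEGER PRIMARY KEY,
            Name   TEXT NOT NULL,
            Salary INTEGER NOT NULL
        );

        CREATE TABLE SalaryHistory (
            EmployeeId INTEGER NOT NULL REFERENCES Employee(Id),
            Salary     INTEGER NOT NULL,  -- the value that was replaced
            ChangedAt  TEXT NOT NULL
        );

        -- Fires only when Salary is in the SET list and actually changes.
        CREATE TRIGGER employee_salary_history
        AFTER UPDATE OF Salary ON Employee
        WHEN OLD.Salary <> NEW.Salary
        BEGIN
            INSERT INTO SalaryHistory (EmployeeId, Salary, ChangedAt)
            VALUES (OLD.Id, OLD.Salary, datetime('now'));
        END;
        """
    )
    return conn


def salary_history(conn, employee_id):
    """Return the previous salaries for an employee, oldest first."""
    rows = conn.execute(
        "SELECT Salary FROM SalaryHistory WHERE EmployeeId = ? ORDER BY rowid",
        (employee_id,),
    )
    return [row[0] for row in rows]
```

Any client that updates Employee.Salary, whether it is LINQ to SQL or an ad hoc script, gets the history row written without knowing the history table exists.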
Depending on your database of choice, you can just use one table and enable OIDs on it, and add two more columns, "flag" and "previous". Never update this table, only insert. Add a trigger so that when a row is added for employee #id, all existing records for employee #id get a flag of "old" and the new row's "previous" value points to the previous row.
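A sketch of that insert-only idea with Python's sqlite3, where the integer primary key plays the role of the OID. The employee_versions table, the salary column and the 'current' flag value are assumptions of this example.

```python
import sqlite3


def make_db():
    """Insert-only version table; a trigger maintains flag and previous."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employee_versions (
            row_id      INTEGER PRIMARY KEY,  -- stands in for the OID
            employee_id INTEGER NOT NULL,
            salary      INTEGER NOT NULL,
            flag        TEXT NOT NULL DEFAULT 'current'
                        CHECK (flag IN ('current', 'old')),
            previous    INTEGER REFERENCES employee_versions(row_id)
        );

        CREATE TRIGGER employee_versions_chain
        AFTER INSERT ON employee_versions
        BEGIN
            -- Point the new row at the most recent earlier row, if any.
            UPDATE employee_versions
               SET previous = (SELECT MAX(row_id)
                                 FROM employee_versions
                                WHERE employee_id = NEW.employee_id
                                  AND row_id < NEW.row_id)
             WHERE row_id = NEW.row_id;

            -- Mark every earlier row for this employee as old.
            UPDATE employee_versions
               SET flag = 'old'
             WHERE employee_id = NEW.employee_id
               AND row_id <> NEW.row_id
               AND flag = 'current';
        END;
        """
    )
    return conn


def add_version(conn, employee_id, salary):
    """Applications only ever insert; the trigger does the bookkeeping."""
    cur = conn.execute(
        "INSERT INTO employee_versions (employee_id, salary) VALUES (?, ?)",
        (employee_id, salary),
    )
    return cur.lastrowid
```

Following the previous column from the current row walks an employee's history backwards one version at a time.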
I think this belongs in the database for two reasons.
First, middle tiers come and go, but databases are forever. This year Java EJBs, next year .NET, the year after that something else. The data remains, in my experience.
Second, if the database is shared at all it should not have to rely on every application that uses it to know how to maintain its data integrity. I would consider this an example of encapsulation of the database. Why force knowledge and maintenance of the history on every client?
Triggers make your front-end easier to migrate to something else and they will keep the database consistent no matter how data is inserted/updated/removed.
Besides, in your case I would write the salaries straight to the salary history table. From your description, I don't see a reason to go via an update trigger on the employee table.