I use stackoverflow a lot, but this is my first question here, so if i'm doing anything wrong just let me know. I'm not a programmer (I just do programming for my own needs) so I'm open to tutorial suggestions etc. I won't be offended if you just give me something to read and find the answers myself.
OK, to the point - I'm trying to write simple application to track my personal expenses and I have a problem with database design. I'm using VStudio to create the database (SQLite). I attached a diagram with my design and I have some questions.
My SQLite diagram
I don't know exactly how to design "Transactions" table. Fields like Date, Payment Type etc. seems to be easy enough but the idea was to store in this table information about transactions so I need to store multiple products there. I've read about it and created table "Transactions_Products" that will help with that. My problem is : where do I put quantity of products in the transaction? I can't think of a place to put it. I tried to find similar databases but couldn't find anything.
Second thing. I've read about indexing a lot, but I still can't grasp the idea. I don't know when to use it. Should I use it only on fields that I will be "querying" a lot?
Last one - is it better for such a small application just for myself to store my account balance in a separate table or should I just calculate it every time?
As I said, I don't need answers like: "do this, do that". If you just give me some good tutorials/articles I think I can find answers on my own, but I couldn't find it. Maybe I'm searching for it wrong.
Thank you in advance for any information.
where do I put quantity of products in the transaction?
Transactions is a bad table name as it's vague and has multiple meanings. Consider "payments", "purchase invoices", etc. See https://dba.stackexchange.com/questions/12991/ready-to-use-database-models-example/23831#23831 for some existing patterns.
Should I use [indexes] only on fields that I will be "querying" a lot?
There's no free lunch. Indexes take space, and can slow down inserts. Start with indexes on your primary keys (which is the default for SQLite), measure what is slow (looking at query plans) and add indexes if they help and if you have room.
is it better for such a small application just for myself to store my account balance in a separate table or should I just calculate it every time?
For an operational/transactional database like you describe, avoid storing calculated values. SQLite can count numbers quickly :)
Premature optimization is premature. Make it work first with full normalization. If you have performance problems, analyze what is really causing the slow-down and go from there.
Related
(https://i.stack.imgur.com/VYkV6.png) :
I'm asked to design a relational database to keep data to answer clinic operation queries such as:
● List the patient appointments for each doctor for a given date.
● When a patient rings to make an appointment, give the available time slots for a given date.
● Retrieve the address of patients to send notices via mail services.
I have one database schema of one relation as shown below, but I was wondering whether there were any mistakes I've made?
ABC(doc-name, doc-gender, registration_num, qualification, pat-name, pat-gender, DOB, address, phone-num, appoint-date, appoint-time, type)
Is the use of words such as date and the use of hyphens generally discouraged? Are there any other weaknesses in my design?
Thank you
So, that's not a schema or a design. Not for a relational database, which, based on the tags for the question, is what you're looking for. That's the storage definition for an ID/Value style of database. If you're looking for actual relational storage, you should be building out those relationships through the process of normalization.
For example, let's start at the beginning with doc-name (I am personally not crazy about using hyphens, but it's not a showstopper, so at least on that note, be sure whichever RDBMS you're working with supports them in the name and then you're good to go). If we think about this just from a data entry stand point, we don't want to have to type in the name of the doctor every time we use that doctor. Instead, we'd want to pull that from a list. So, clearly, we can break that apart from the rest of the information. There is the beginning of our normalization process. We can also easily note the fact that a patient is likely to have more than one appointment. Under the current structure, we'd have to re-enter every bit of patient information prior to the appointment. There's another place where we'd break this apart.
There is tons more to this simple example that could be split out and normalized.
I'd suggest you read up on data normalization. My favorite teacher on the subject is Louis Davidson. Here's his book on the topic. Read that and then try to readdress the situation you're facing.
I'm assuming this isn't just homework. If it is, currently, I'd give you an "F". If it isn't, you should track down someone to give you hand with this database design. You won't be able to quickly read Louis' book on the topic and turn around even a rough working design in any reasonable period of time.
I have to second what Grant said, this is not a relational design at all.
Stop and ask yourself for example what happens if Steven Arrow has to take an afternoon off and update his schedule. You need to be very careful updating the database lest you reassign all his patients.
Spending a total of 5 minutes on this, I see at the very least:
A Doctors table, a Patients Table, and probably a table of open appointment times (which btw, is a bit harder than you think, so you have to give some thought how to handle that and some reading up on tables for scheduling).
That's for starters. I might break out Patients phone numbers to its own table. Why? Well how many columns do you want have for phone numbers? 1? What if they have a work AND home number? Or a Work and Cell and Home? And more.
The concept you're looking for is normal forms. You don't need to go overboard, but generally 3NF is about right.
I'm not that experienced with SQL server but I need to come up with a solution to the following problem.
I'm creating a database that holds cars for sale. Cars are purchased via a handful of ways (contracts), here are 2 examples of the pricing fields needed:
I've left out unnecessary fields for the sake of clarity.
Type: Personal Contract Hire
Fields: InitalPayment, MonthlyPayment
Type: Personal Contract Purchase
Fields: InitialPayment, MonthlyPayment, GFMVPayment
The differences are subtle.
The question is, would it be better to create a table for each type along with some kind of header table or create a single table with a few extra unused fields? Or something else?
I know the purists will hate me for even raising the question of redundancy but the solution has to be practical too and I'm worried about overcomplicating something that needn't be.
I'm using Entity Framework as my ORM.
Any thoughts?
I've never designed a database, but I work with them everyday at my job. The databases I encounter were designed by professionals with years of experience in IT, and many of our tables face the same issue you are describing here. Every single time the answer is create a single table with a few extra unused fields. I realize this may just be the preference of the IT team and that this is not the only way to do it, but as someone who writes dozens of business-analytics queries a day, I can confidently say that this design is very natural and easy to use.
You're probably going to run into this problem again in the future. You may even create another type that requires a 4th field. Imagine if every time that happened, you just added another table. Your database would quickly become hard to manage, and anyone else using it would need to memorize which three or four tables give access to pretty much the same data, with only subtle differences. That's not very user-friendly.
Overall, I suggest creating a single table with some unused fields.
I have been tasked with creating a product inventory module. After reading all the posts I can find on Stack Overflow, I have decided the best way is to not keep a separate, running ‘balance’, but to create one on the fly. I have attached a representation of the tables involved.
Actually, it seems like I don't have enough reputation points to include a picture, so here is a link to a dropbox file:
So I have two questions, which are somewhat related, so it seem like I should include them in the same question posting, though I am not a frequent poster and a sql noob. So please excuse me if I am displaying my ignorance with posting or sql.
First, does this look correct (I named all the columns as non-opaque as possible)? I have to create reports that show the current inventory balance for all the products and for products individually as well as a ‘Transaction Register’ with running balance.
Second, provided the first answer is yes, is this a good candidate for creating a view?
Complex question. Difficult to answer without understanding the full scope of the project. One point - I see there is no Current On-hand table. I agree that the running balance at any point in time is best to use a calculated table. It is however common practice to keep a current on-hand table. This gives you the on hand inventory and values with-out having to sum up the transaction. This is the approach in Microsoft Axapta, and other products I have worked with.
I have a table that holds Test Questions. Each row of the table contains details of a question for a test for a particular user.
The user is presented with 3-5 possible answers and I would like to store details of the answers that have been checked in the row. I don't really want to add new rows for every answer as this would create a huge number of rows.
Is there a way that I can store something like an array of answers in a column in SQL Server? Presently I am storing the data as a JSON string but I remember that Oracle had some way to store array data and I am wondering if SQL Server has the same.
Generally denormallizing is not a good idea. It is rarely a good idea idea. However, it is sometimes necessary for performance reasons. So, if not too slow, don't even consider it.
If you make a secondary answers table in your case with the TestQuestionID (or whatever you call the answers for a single question) to be the clustered index, it won't be much of a performance difference at all compared to a denormalized table.
If I were denormalizing your descriibed table I would probably just create 5 columns in the table, You could also use an xml field, but all you are storing is 5 answers, so I would not use xml in this simple case.
Since you are asking this question, you are not really a seasoned professional (we all start as novices) and you should consult the local sql expert before you denormalize.
ADDED CAVEAT,
Since you accepted this answer, you really need to understand for certain that denormalizing is almost always the wrong thing to do. That is why everyone, including me, was trying to tell you. Don't do this without talking to your DBA -- if you don't have a local DBA (unfortunately all too common) take the collective advice, and don't denormalize. I can think of only 1 time n my career that I think denormalizing was the correct solution. And I have bitten by the bad design (forced on me) by innappropriate denormlazation on many occasions.
I have this table
tblStore
with these fields
storeID (autonumber)
storeName
locationOrBranch
and this table
tblPurchased
with these fields
purchasedID
storeID (foreign key)
itemDesc
In the case of stores that have more than one location, there is a problem when two people inadvertently key the same store location differently. For example, take Harrisburg Chevron. On some of its receipts it calls itself Harrisburg Chevron, some just say Chevron at the top, and under that, Harrisburg. One person may key it into tblStore as storeName Chevron, locationoOrBranch Harrisburg. Person2 may key it as storeName Harrisburg Chevron, locationOrBranch Harrisburg. What makes this bad is that the business's name is Harrisburg Chevron. It seems hard to make a rule (that would understandably cover all future opportunities for this error) to prevent people from doing this in the future.
Question 1) I'm thinking as the instances are found, an update query to change all records from one way to the other is the best way to fix it. Is this right?
Questions 2) What would be the best way to have originally set up the db to have avoided this?
Question 3) What can I do to make future after-the-fact corrections easier when this happens?
Thanks.
edit: I do understand that better business practices are the ideal prevention, but for question 2 I'm looking for any tips or tricks that people use that could help. And question 1 and 3 are important to me too.
This is not a database design issue.
This is an issue with the processes around using the database design.
The real question I have is why are users entering in stores ad-hoc? I can think of scenarios, but without knowing your situation it is hard to guess.
The normal solution is that the tblStore table is a lookup table only. Normally users only have access to stores that have already been entered.
Then there is a controlled process to maintain the tblStore table in a consistent manner. Only a few users would have access to this process.
Of course as I alluded to above this is not always possible, so you may need a different solution.
UPDATE:
Question #1: An update script is the best approach. The best way to do this is to have a copy of the database if possible, or a close copy if not, and test the script against this data. Once you have ensured that the script runs correctly, then you can run it against the real data.
If you have transactional integrity you should use that. Use "begin" before running the script and if the number of records is what you expect, and any other tests you devise (perhaps also scripted), then you can "commit"
Do not type in SQL against a live DB.
Question #3: I suggest your first line of attack is to create processes around the creation of new stores, but this may not be wiuthin your ambit.
The second is possibly to get proactive and identify and enter new stores (if this is the problem) before the users in the field need to do so. I don't know if this works inside your scenario.
Lastly if you had a script that merged "store1" into "store2" you can standardise on that as a way of reducing time and errors. You could even possibly build that into an admin only screen that automated merging stores.
That is all I can think of off the top of my head.