Normalizing too much vs too little, examples? [closed] - database

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
I don't usually design databases, so I'm having some doubts about how to normalize (or not) a table for registering users. The fields I'm having doubts about are:
locationTown: I plan to normalize countries into a separate table, but should I do the same for towns? I guess users would type this in when registering rather than choose it from a dropdown. Can one normalize when the input comes from users?
maritalStatus: I would have a choice of about 5 or so different statuses.
Also, does anyone know of a good place to find real world database schema/normalizing examples?
Thanks

locationTown - just store it directly inside the user table. Otherwise you will have to search for an existing town, taking typos and letter case into account. Also, some people use non-standard characters and languages (Kraków vs. Krakow vs. Cracow, see also: romanization). If you really want to have a table of towns, at least provide an auto-complete box so that users are more likely to choose an existing town. Otherwise prepare for lots of duplicates or near-duplicates.
maritalStatus - this, on the other hand, should be in a separate table. Or more accurately: use a single character or a number to represent marital status. An extra table mapping this to a human-readable form is just for convenience (remember i18n), and a foreign key constraint makes sure incorrect statuses aren't used.
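To illustrate why matching free-text towns is harder than it looks, here is a minimal Python sketch of a comparison key. The function name `town_key` is made up for this example, and the approach (trim, strip diacritics, fold case) is only one possible assumption about what "the same town" means. It collapses spelling variants such as Kraków and krakow, but it cannot know that Cracow is the same place.

```python
import unicodedata


def town_key(name: str) -> str:
    """Build a comparison key: trimmed, diacritics stripped, case-folded."""
    # NFKD splits an accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", name.strip())
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Collapse runs of whitespace, then fold case for comparison.
    return " ".join(without_marks.split()).casefold()


# Spelling variants collapse to the same key ...
same_spelling = town_key("Kraków") == town_key("  krakow ")
# ... but exonyms such as "Cracow" still look like a different town.
same_exonym = town_key("Kraków") == town_key("Cracow")
print(same_spelling, same_exonym)
```

A key like this could help an auto-complete box suggest existing entries, but it does not remove the need to store what the user actually typed.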
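A rough sketch of that layout, using Python's built-in `sqlite3` module only so that it is runnable anywhere. The table names, column names, and status codes are assumptions for illustration, not something from the question. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`; other engines behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE marital_status (
        code  TEXT PRIMARY KEY,  -- single-character code stored on each user
        label TEXT NOT NULL      -- human-readable form, could be translated
    );
    CREATE TABLE users (
        id             INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        location_town  TEXT,     -- free text, stored directly on the user
        marital_status TEXT REFERENCES marital_status(code)
    );
    INSERT INTO marital_status VALUES
        ('S', 'Single'), ('M', 'Married'), ('D', 'Divorced');
""")


def insert_user(name, town, status):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO users (name, location_town, marital_status) "
            "VALUES (?, ?, ?)",
            (name, town, status),
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = insert_user("Ann", "Kraków", "M")
rejected = insert_user("Bob", "Cracow", "X")  # 'X' is not a known status code
print(accepted, rejected)
```

The town stays a plain column, while the status code can only take values that exist in the lookup table.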

I wouldn't worry about it too much - database normalization (3NF, et al) has been over-emphasized in academia and isn't overly practical in industry. In addition, we would need to see your whole schema in order to judge where these implementations are appropriate. Focus on indexing commonly-used columns before you worry about normalization.
You might want to take a look at this SO question before you dive in any further.

What is the difference between a row, record and tuple? [closed]

Closed 10 years ago.
I am taking a database development course at the moment and I am having trouble getting my head around this!
My course notes describe a tuple as:
A tuple is a row of a relation
From what I have understood since working with MySQL you search for row(s). Or when browsing through a database you are looking through rows in a table.
And from what I understood a record is information within a row.
Are there any distinct differences between the three?
I know someone has posted something similar, but I couldn't really understand his answer.
Thanks for all help in advance!
Peter
In your context they are different words to mean exactly the same thing.
A tuple, in general, means an ordered list with possibly repeated elements (as contrasted with a set, which has all unique elements and is not ordered).
They are the same.
A row—also called a record or tuple—represents a single, implicitly structured data item in a table.
They mean exactly the same thing: tuples, rows, or records.
Your SELECT query will generate results that may contain 0 or more rows/records/tuples.
A SELECT query can span 1 or more tables.
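As a small illustration (not part of the original answers), Python's built-in `sqlite3` module happens to return each result row as a Python tuple by default, which makes the "row = record = tuple" point fairly concrete. The `person` table and its contents are made up for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO person (name) VALUES ('Ada'), ('Grace');
""")

# Each row of the result comes back as one tuple.
rows = conn.execute("SELECT id, name FROM person ORDER BY id").fetchall()

# A query that matches nothing returns zero rows, i.e. an empty list.
no_rows = conn.execute(
    "SELECT id, name FROM person WHERE name = ?", ("Nobody",)
).fetchall()

print(rows, no_rows)
```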

If all data were put into memory, what's the fastest way to do a "SELECT ...WHERE ..." thing? [closed]

Closed 10 years ago.
If all data were put into memory, which means the medium is much faster, what's the fastest way to do a "SELECT .. WHERE .." query (filter data)? So far the options I have in mind are:
1) B-tree-like algorithms, but they may still need an index and more space.
2) A fixed-length array, which is smaller but may be slower.
So are there any better ways, if both speed and size are concerns?
It is dependent on the specific case you have - what operations you need fast, what is the exact size, and more. Some examples:
For AND queries, a set of sorted lists is usually maintained (one list for each feature). This data structure is called an inverted index, and it is often used by search engines to get the relevant documents for a given query. (Apache Lucene uses this data structure, for example.)
If arrays can be used, and iteration over the data is needed, they are a very efficient approach, since arrays are basically the most cache-efficient data structure there is. Reading sequentially from an array is much faster in most cases than with any other data structure, since it gives you the fewest cache misses, which are often the bottleneck when iterating over your data.
If your data consists of strings, for example, and you are going to filter on some string attribute (a prefix, for example), a data structure designed for strings, such as a trie or a radix tree, might get you the best performance.
Bottom line: if you are going to build something custom in order to outperform the default libraries, you should consider the specific details of the problem before designing your data structure of choice.
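To make the inverted-index option a bit more concrete, here is a minimal Python sketch. It assumes records are dicts with an integer `id` and are added in ascending `id` order, so every posting list stays sorted; the names `build_index`, `intersect`, and `select_where` are made up for this example. An AND filter is then a merge-intersection of the sorted lists.

```python
from collections import defaultdict

# Hypothetical sample data, already in ascending id order.
records = [
    {"id": 0, "color": "red", "size": "L"},
    {"id": 1, "color": "blue", "size": "L"},
    {"id": 2, "color": "red", "size": "M"},
    {"id": 3, "color": "red", "size": "L"},
    {"id": 4, "color": "blue", "size": "S"},
]


def build_index(records, fields):
    """Map each (field, value) pair to the sorted list of matching ids."""
    index = defaultdict(list)
    for rec in records:  # ascending ids keep every list sorted
        for field in fields:
            index[(field, rec[field])].append(rec["id"])
    return index


def intersect(a, b):
    """Merge-intersect two sorted lists of ids."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def select_where(index, **conditions):
    """Return ids of records matching all field=value conditions (AND)."""
    lists = sorted(
        (index.get((f, v), []) for f, v in conditions.items()), key=len
    )
    if not lists:
        return []
    result = lists[0]  # start from the shortest list to keep work small
    for other in lists[1:]:
        result = intersect(result, other)
    return result


index = build_index(records, ["color", "size"])
print(select_where(index, color="red", size="L"))
```

The trade-off is the one from the question: the index costs extra memory on top of the data, in exchange for not scanning every record.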

Best Practices on naming Views [closed]

Closed 10 years ago.
I have for example a table name of Cars_Import
I need a view that grabs the data to be imported; the view is run and does the work of importing the data into the Cars_Import table.
My problem is that I can't give the view the same name. I have to differentiate it, because I guess SQL Server treats objects with the same name as a conflict no matter what type of object they are.
So for best practices in naming conventions that are generally accepted, when you have 2 objects that really relate to each other, and I know it's not good practice to append stuff like tbl, vw for view, etc. in the name, what would you suggest here as the view name related to Cars_Import?
I wouldn't want the view to have it for example switched around which would work but just seems messy to me such as Import_Cars
So what's the advice here on naming the table and its related view which will grab all data from that table that we need? There is no business logic, it's just grabbing the data and we're gonna import it into a data warehouse, all the data as is initially.
Views are actually the one place where I don't mind a prefix or suffix that describes what it is. Unlike when comparing a table or a stored procedure, which are quite obvious because they are used differently, tables and views are largely interchangeable. So I find that this differentiation can be helpful when reverse engineering or troubleshooting code (and I'm talking about when you come across the name in a piece of code, not browsing the objects through Object Explorer, which makes things much more obvious by definition).
Your naming scheme is up to you, and you're largely not going to get a "correct" answer here, other than that you should apply your convention consistently and unilaterally, and do what you can to make sure your entire team buys into it and follows it as well. But I will say that I wouldn't balk at something like this:
Table: dbo.Cars_Import
View: dbo.View_Cars_Import
But to me, this seems to imply that the view may just be something that sits over the table (say prettifying output, adding or hiding columns, etc.), not something that feeds the table. So I kind of agree with @HABO that maybe there is a better way to name this view that describes what it does.

Join slows down sql [closed]

Closed 12 years ago.
We are having a discussion about SQL Server 2008 and joins. One half says that the more joins you have, the slower your SQL runs. The other half says that it does not matter, because SQL Server takes care of business, so you will not notice any performance loss. Which is true?
Instead of asking the question the way you have, consider instead:
Can I get the data I want without the join?
No => You need the join, end of discussion.
It is also a matter of degree. It is impossible for a join not to add additional processing. Even if the Query Optimizer takes it out (e.g. left join with nothing used from the join) - it still costs CPU cycles to parse it.
Now if the question is about comparing joins to another technique, such as one special case of LEFT JOIN + IS NULL vs NOT EXISTS for a record in X not in Y scenario, then let's discuss specifics - table sizes (X vs Y), indexes etc.
It will slow it down: the more complicated a query, the more work the database server has to do to execute it.
But about that "performance loss": over what? Is there another way to get at the same data? If so, then you can profile the various options against each other to see which is fastest.
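As a hedged sketch of what "profile the options against each other" can look like, the snippet below compares the LEFT JOIN + IS NULL form with the NOT EXISTS form for "records in X not in Y". It uses Python's built-in `sqlite3` only so that it runs anywhere; the tables `x` and `y`, the index, and the helper `time_query` are made up for illustration, and timings from SQLite say nothing about SQL Server 2008. The real comparison has to be run on your own engine, data sizes, and indexes.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER PRIMARY KEY);
    CREATE TABLE y (id INTEGER PRIMARY KEY, x_id INTEGER);
    CREATE INDEX y_x_id ON y (x_id);
    INSERT INTO x (id) VALUES (1), (2), (3), (4);
    INSERT INTO y (x_id) VALUES (1), (3), (3);
""")

LEFT_JOIN = """
    SELECT x.id FROM x
    LEFT JOIN y ON y.x_id = x.id
    WHERE y.id IS NULL
    ORDER BY x.id
"""
NOT_EXISTS = """
    SELECT x.id FROM x
    WHERE NOT EXISTS (SELECT 1 FROM y WHERE y.x_id = x.id)
    ORDER BY x.id
"""


def time_query(sql, repeat=100):
    """Return total seconds spent running the query `repeat` times."""
    start = time.perf_counter()
    for _ in range(repeat):
        conn.execute(sql).fetchall()
    return time.perf_counter() - start


# First confirm that both forms return the same rows, then compare timings.
left_join_ids = [row[0] for row in conn.execute(LEFT_JOIN)]
not_exists_ids = [row[0] for row in conn.execute(NOT_EXISTS)]
print(left_join_ids == not_exists_ids)
print(time_query(LEFT_JOIN), time_query(NOT_EXISTS))
```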

What is the difference between a Rule and a Policy [closed]

Closed 11 years ago.
In the context of a database, we sometimes need to check values against statements like "the customer name is non-empty" or "the customer's number of purchases is positive"...
But do such statements constitute rules or policies?
In general, how would you define these concepts, their differences and relations?
Thanks in advance.
I think I know what you're talking about; I've run into such distinctions before (even though the English words are not all that different) and here is how I think it plays out in most business computing areas.
A rule in such a context is something that--whether it's a structural fact or a business-imposed statement--will not change, or at least stands only a very small chance of changing. Most statements of the form "X cannot be null" represent rules. "Null" typically doesn't make much sense to a business user; usually you arrive at these rules by examining the way that your model is constructed. A change to a rule has far-reaching consequences to the way that your database and any supporting applications are built.
A policy is more like a business instruction. "Preferred customers get 10% off" may be a policy, but as you know, things like this tend to change. A change to a policy may impact the way your application works, but not its fundamental architecture or underpinnings.
Pragmatically speaking--and it sounds like you may already know this--you want to make policies relatively easy to change. Rules, while they may change, are typically more involved: changing a rule often requires changing code, UIs, mental models, ways of thought, and so on.
I hope this helps.
In the context of a database, I would argue that it's a rule to have a username, while it's a policy (potentially overridden by administrative or other approval) to allow customers to have a lower assigned discount if they have less than a set number of purchases.
Rule: All users must have a username.
Rule: All users must have a password.
Rule: All users must have a valid email address.
Rule: All users must have a valid credit card on file.
Policy: All users begin with a 0% discount rate on purchases.
Policy: All users are required to pay for shipping.
Rules are outward-facing statements backed by validation. Policies are internal rules backed by consequence.
It could be a policy that later on down the road, a user can change a username (depending on how the software was written), or that the discount and shipping rates assigned on signup may be adjusted to create customer opportunities.
In my estimation then, a rule requires hard validation, while a policy by nature is subject to intervention and/or manipulation.
HTH
Jared
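One way to picture the distinction discussed in this thread, as a sketch under assumptions rather than a recommended design: rules become schema constraints, and policies become ordinary rows that can be updated. The example uses Python's built-in `sqlite3`; the `policy` table, the `signup_discount_rate` name, and the helper `violates_rule` are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Rules: enforced by the schema, so changing them means changing structure.
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,
        username TEXT NOT NULL CHECK (length(trim(username)) > 0)
    );
    -- Policies: plain data that can be adjusted without touching the schema.
    CREATE TABLE policy (
        name  TEXT PRIMARY KEY,
        value REAL NOT NULL
    );
    INSERT INTO policy VALUES ('signup_discount_rate', 0.0);
""")


def violates_rule(username):
    """Return True if the database rejects this username."""
    try:
        conn.execute("INSERT INTO users (username) VALUES (?)", (username,))
        return False
    except sqlite3.IntegrityError:
        return True


missing_name = violates_rule(None)   # rejected by NOT NULL
blank_name = violates_rule("   ")    # rejected by the CHECK constraint
valid_name = violates_rule("jared")  # accepted

# Changing a policy is just an UPDATE.
conn.execute(
    "UPDATE policy SET value = ? WHERE name = ?",
    (0.1, "signup_discount_rate"),
)
discount = conn.execute(
    "SELECT value FROM policy WHERE name = ?", ("signup_discount_rate",)
).fetchone()[0]
print(missing_name, blank_name, valid_name, discount)
```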
