As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 12 years ago.
We are having a discussion about joins in SQL Server 2008. One half says that the more joins a query has, the slower it runs. The other half says that it does not matter, because SQL Server takes care of business and you will not notice any performance loss. Which is true?
Instead of asking the question the way you have, consider instead:
Can I get the data I want without the join?
No => You need the join, end of discussion.
It is also a matter of degree. It is impossible for a join not to add additional processing. Even if the Query Optimizer takes it out (e.g. left join with nothing used from the join) - it still costs CPU cycles to parse it.
Now if the question is about comparing joins to another technique, such as one special case of LEFT JOIN + IS NULL vs NOT EXISTS for a record in X not in Y scenario, then let's discuss specifics - table sizes (X vs Y), indexes etc.
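To make the "record in X not in Y" case concrete, here is a minimal sketch of the two forms side by side. It runs against an in-memory SQLite database through Python's sqlite3 module rather than SQL Server 2008, and the tables x and y and their columns are made up for illustration. It only shows that both forms return the same rows; which one is faster still depends on table sizes and indexes.

```python
import sqlite3

# Hypothetical tables x and y; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER PRIMARY KEY);
    CREATE TABLE y (x_id INTEGER);
    INSERT INTO x (id) VALUES (1), (2), (3), (4);
    INSERT INTO y (x_id) VALUES (2), (4);
""")

# Form 1: LEFT JOIN, then keep only the rows that found no match in y.
left_join_rows = conn.execute("""
    SELECT x.id
    FROM x
    LEFT JOIN y ON y.x_id = x.id
    WHERE y.x_id IS NULL
    ORDER BY x.id
""").fetchall()

# Form 2: NOT EXISTS with a correlated subquery.
not_exists_rows = conn.execute("""
    SELECT x.id
    FROM x
    WHERE NOT EXISTS (SELECT 1 FROM y WHERE y.x_id = x.id)
    ORDER BY x.id
""").fetchall()

print(left_join_rows)
print(not_exists_rows)
```

On SQL Server you would compare the actual execution plans of the two statements instead of just the results.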
It will slow it down: the more complicated a query, the more work the database server has to do to execute it.
But about that "performance loss": over what? Is there another way to get at the same data? If so, then you can profile the various options against each other to see which is fastest.
Closed 10 years ago.
I am taking a database development course at the moment and I am having trouble getting my head around this!
My course notes describe a tuple as:
A tuple is a row of a relation
From what I have understood since working with MySQL you search for row(s). Or when browsing through a database you are looking through rows in a table.
And from what I understood a record is information within a row.
Are there any distinct differences between the three?
I know someone has posted something similar but I couldn't really understand his answer.
Thanks for all help in advance!
Peter
In your context they are different words to mean exactly the same thing.
A tuple, in general, means an ordered list with possibly repeated elements (as opposed to a set, whose elements are all unique and unordered).
They are the same.
A row—also called a record or tuple—represents a single, implicitly structured data item in a table.
They mean exactly the same thing: tuples, rows or records.
Your SELECT query will generate results that may contain 0 or more rows/records or tuples.
A SELECT query can span 1 or more tables
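As a small illustration of the terms lining up in practice, here is a sketch using Python's sqlite3 module (SQLite rather than MySQL, with a made-up student table). With the default settings, each row of a result set comes back as a Python tuple, and a query that matches nothing returns zero rows.

```python
import sqlite3

# Hypothetical table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO student (id, name) VALUES (1, 'Ann'), (2, 'Bob');
""")

# Each row (record) of the result is handed back as one tuple.
rows = conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

# A query that matches nothing returns zero rows.
no_rows = conn.execute("SELECT id, name FROM student WHERE id > 99").fetchall()

print(rows)
print(no_rows)
```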
Closed 10 years ago.
If all data were put into memory, which means the media speed is much faster, what's the fastest way to do a "SELECT .. WHERE .." query (filter data)? So far the options in my mind:
1) b tree like algorithms, but it may still need index and larger space
2) fixed length array, smaller size but may be slower.
So are there any other, better ways if both speed and size are concerns?
It is dependent on the specific case you have - what operations you need fast, what is the exact size, and more. Some examples:
For AND queries, a set of sorted lists is usually maintained (a list for each feature). This data structure is called an inverted index, and is used often by search engines to get the relevant documents from a given query. (Apache Lucene uses this data structure, for example).
If arrays can be used - and iteration over the data is needed - it is a very efficient approach, since arrays are basically the most cache-efficient data structure there is. Reading sequentially from an array is much faster in most cases than any other data structure, since it gets you the fewest cache misses, which are often the bottleneck when iterating over your data.
If your data is strings, for example, and you are going to filter according to some string attribute (a prefix, for example), using a data structure designed for strings, such as a trie or a radix tree, might get you the best performance.
Bottom line: If you are going to build something custom to beat the performance of the default libraries, you should consider the specific problem details before designing your data structure of choice.
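The inverted-index idea for AND queries can be sketched roughly as follows. This is a toy version in plain Python, not how Lucene implements it; the function names and the sample records are made up for illustration. Each feature maps to a sorted list of row ids, and an AND query intersects those lists, starting with the shortest.

```python
def build_inverted_index(records):
    """Map each feature to the sorted list of row ids that contain it."""
    index = {}
    for row_id, features in enumerate(records):
        for feature in set(features):
            # Row ids arrive in increasing order, so each list stays sorted.
            index.setdefault(feature, []).append(row_id)
    return index


def intersect_sorted(a, b):
    """Intersect two sorted lists in a single linear pass."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result


def and_query(index, features):
    """Return the row ids that contain every requested feature."""
    lists = sorted((index.get(f, []) for f in features), key=len)
    if not lists:
        return []
    result = lists[0]  # start from the shortest list to keep work small
    for other in lists[1:]:
        result = intersect_sorted(result, other)
    return result


# Hypothetical data: each record is a set of features.
records = [
    {"red", "small"},
    {"red", "large"},
    {"blue", "small"},
    {"red", "small", "round"},
]
index = build_inverted_index(records)
print(and_query(index, ["red", "small"]))
```

The trade-off matches the one described above: the index costs extra memory, while a plain array scan needs none but has to look at every record.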
Closed 10 years ago.
Is it fair to say that it takes no time (compared to the nested SELECT) to make the second (outer) 'SELECT' from a result-set like this?
SELECT some_column
FROM
(
SELECT some_column
FROM some_table
)
AS _alias
The SQL optimizer is likely to treat that SELECT statement as if it was written:
SELECT some_column FROM some_table
So there'll be no performance difference whatsoever. The optimizer does its best to minimize the cost of producing the answer and will rework the query you write to speed things up. Only the most naïve optimizer would evaluate the inner SELECT and save the results in a table and then run the outer SELECT on that result.
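One way to check this on your own engine is to compare the query plans of the two forms. The sketch below does that against an in-memory SQLite database from Python, which is an assumption for the sake of a runnable example; the question does not say which database is in use, and the plan text differs between engines and versions. If the optimizer flattens the subquery, both plans should show the same single scan of some_table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE some_table (some_column INTEGER);
    INSERT INTO some_table (some_column) VALUES (10), (20), (30);
""")

nested_sql = (
    "SELECT some_column FROM (SELECT some_column FROM some_table) AS _alias"
)
flat_sql = "SELECT some_column FROM some_table"

# Both forms return the same rows.
nested_rows = sorted(conn.execute(nested_sql).fetchall())
flat_rows = sorted(conn.execute(flat_sql).fetchall())

# Plan output is engine- and version-specific, so it is only printed here.
nested_plan = conn.execute("EXPLAIN QUERY PLAN " + nested_sql).fetchall()
flat_plan = conn.execute("EXPLAIN QUERY PLAN " + flat_sql).fetchall()

print(nested_plan)
print(flat_plan)
```

Other engines have their own plan viewers (EXPLAIN in MySQL and PostgreSQL, execution plans in SQL Server), and the same comparison applies there.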
Closed 10 years ago.
I have a few questions that I was asked in an interview:
Performance difference between delete and truncate?
How do you delete duplicate data from a table that has no id column, without using a CTE?
Why are we able to delete data using a CTE?
DELETE logs each individual row deletion, whereas TRUNCATE logs only the page deallocations, hence it is faster.
You could SELECT DISTINCT data into a temp table, TRUNCATE the first then reinsert.
Not a scooby...
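A rough sketch of the temp-table approach to the duplicates question, run against an in-memory SQLite database from Python so that it is self-contained. SQLite has no TRUNCATE TABLE, so DELETE FROM stands in for it here; on SQL Server you would use TRUNCATE TABLE at that step. The person table and its columns are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical table with no id column and some duplicate rows.
    CREATE TABLE person (name TEXT, city TEXT);
    INSERT INTO person (name, city) VALUES
        ('Ann', 'Oslo'), ('Ann', 'Oslo'), ('Bob', 'Rome'), ('Ann', 'Oslo');

    -- 1. Copy the distinct rows into a temp table.
    CREATE TEMP TABLE person_dedup AS
        SELECT DISTINCT name, city FROM person;

    -- 2. Empty the original table (TRUNCATE TABLE person on SQL Server).
    DELETE FROM person;

    -- 3. Put the distinct rows back and clean up.
    INSERT INTO person (name, city)
        SELECT name, city FROM person_dedup;
    DROP TABLE person_dedup;
""")

remaining = conn.execute(
    "SELECT name, city FROM person ORDER BY name"
).fetchall()
print(remaining)
```

On a real system you would wrap the three steps in a transaction so that a failure halfway through does not leave the table empty.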
Here are some pointers to solve your issues:
TRUNCATE doesn't delete rows one at a time; it deallocates the table's data pages and logs only those deallocations, so it is much faster than DELETE. When you use DELETE, every removed row is written to the transaction log, hence it's much slower.
http://www.codeproject.com/Tips/159881/How-to-remove-duplicate-rows-in-SQL-Server-2008-wh
http://blog.sqlauthority.com/2009/06/23/sql-server-2005-2008-delete-duplicate-rows/
Closed 10 years ago.
I don't usually design databases, so I'm having some doubts about how to normalize (or not) a table for registering users. The fields I'm having doubts about are:
locationTown: I plan to normalize countries into a separate table, but should I do the same for towns? I guess users would type this in when registering rather than choose from a dropdown. Can one normalize when the input may be coming from users?
maritalStatus: I would have a choice of about 5 or so different statuses.
Also, does anyone know of a good place to find real world database schema/normalizing examples?
Thanks
locationTown - just store it directly in the user table. Otherwise you will have to search for an existing town, taking typos and letter case into account. Also, some people use non-standard characters and languages (Kraków vs. Krakow vs. Cracow, see also: romanization). If you really want to have a table of towns, at least provide an auto-complete box so that users are more likely to choose an existing town. Otherwise prepare for lots of duplicates or near-duplicates.
maritalStatus - this, on the other hand, should be in a separate table. Or more accurately: use a single character or a number to represent marital status. An extra table mapping this to a human-readable form is just for convenience (remember i18n), and a foreign key constraint makes sure incorrect statuses aren't used.
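A minimal sketch of that layout, using SQLite through Python so that it can be run as-is. All table and column names here are assumptions for illustration, not taken from your schema. The town is stored as free text on the user row, while marital status is a one-character code backed by a lookup table and a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    -- Lookup table: short code plus a human-readable label.
    CREATE TABLE marital_status (
        code  TEXT PRIMARY KEY,
        label TEXT NOT NULL
    );
    INSERT INTO marital_status (code, label) VALUES
        ('S', 'Single'), ('M', 'Married'), ('D', 'Divorced');

    -- User table: town stored as typed, status constrained by the lookup.
    CREATE TABLE app_user (
        id             INTEGER PRIMARY KEY,
        name           TEXT NOT NULL,
        location_town  TEXT,
        marital_status TEXT NOT NULL REFERENCES marital_status(code)
    );
""")

insert_sql = (
    "INSERT INTO app_user (name, location_town, marital_status) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert_sql, ("Ola", "Kraków", "M"))  # valid status code

try:
    conn.execute(insert_sql, ("Kim", "Oslo", "X"))  # 'X' is not a known code
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

labels = conn.execute("""
    SELECT u.name, s.label
    FROM app_user AS u
    JOIN marital_status AS s ON s.code = u.marital_status
""").fetchall()
print(rejected, labels)
```

For i18n, the label column could be replaced by a translation table keyed on the code and a language, leaving the user table untouched.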
I wouldn't worry about it too much - database normalization (3NF, et al) has been over-emphasized in academia and isn't overly practical in industry. In addition, we would need to see your whole schema in order to judge where these implementations are appropriate. Focus on indexing commonly-used columns before you worry about normalization.
You might want to take a look at this SO question before you dive in any further.