SQL Server - WHERE <several columns> in (<list of columns values>) - sql-server

I did this long ago in another DBMS (Oracle or MySQL, I don't really remember) and I'm looking for a way to do it in SQL Server, if it is possible at all.
Suppose you have a table with several columns, say A, B, C, ... M. I want to write a SELECT against this table where columns A, B, and C match specific combinations of values, in other words, a list of value tuples.
For instance, I wish to retrieve all the records that match any of the following combinations:
A B C
1 'Apples' '2016-04-12'
56 'Cars' '2014-02-11'
....
Since the list of possible combinations may be quite long (including the option of an inner SELECT), it would not be practical to use something like:
WHERE ( A = 1 AND B = 'Apples' and C = '2016-04-12' ) OR
( A = 56 AND B = 'Cars' and C = '2014-02-11' ) OR
...
As stated, I did use this type of construct in the past and it was something like:
SELECT *
FROM MyTable
WHERE (A,B,C) IN (SELECT A,B,C FROM MYOtherTable) ;
[Most likely this syntax is wrong but it shows what I'm looking for]
Also, I would rather avoid Dynamic SQL usage.
So, the questions would be:
Is this doable in SQL Server?
If the answer is YES, how should the SELECT be phrased?
Thanks in advance.

You can use a JOIN:
SELECT m1.*
FROM MyTable m1
JOIN MYOtherTable m2
ON m1.A = m2.A
AND m1.B = m2.B
AND m1.C = m2.C
or EXISTS:
SELECT m1.*
FROM MyTable m1
WHERE EXISTS (SELECT 1
FROM MYOtherTable m2
WHERE m1.A = m2.A
AND m1.B = m2.B
AND m1.C = m2.C)
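One difference between the two forms is worth keeping in mind: the JOIN returns a MyTable row once per matching row in MYOtherTable, so duplicate combinations there produce duplicate output, while EXISTS returns each MyTable row at most once. When the combinations are literals rather than rows in a table, the same EXISTS pattern can be fed from a table value constructor. The following is a minimal sketch, assuming SQL Server 2008 or later and assuming that C is a date column (the question does not state the column types); the alias v is only illustrative.

```sql
-- Match MyTable rows against a literal list of (A, B, C) combinations.
-- Assumption: C is a DATE column, so the first literal is cast explicitly
-- and the remaining string literals are converted implicitly.
SELECT m.*
FROM MyTable AS m
WHERE EXISTS (SELECT 1
              FROM (VALUES (1,  'Apples', CAST('2016-04-12' AS date)),
                           (56, 'Cars',   '2014-02-11')
                   ) AS v (A, B, C)
              WHERE v.A = m.A
                AND v.B = m.B
                AND v.C = m.C);
```

No dynamic SQL is involved, and the VALUES list can be swapped for a subquery against another table without changing the outer query.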

Related

SQL Server : syntax error when running INSERT INTO where exists and join

I am trying to copy data from one table into another table if these rows are not already in the table. In addition I want to add the value from a column in a third table based on the value of the column in the original table being copied across.
I have inherited two tables one of which is to be made redundant (because the database is going to be made defunct) and hence the information needs to be moved to another table which is not identical in design (however the datatypes are).
INSERT INTO [AMData].[dbo].[AttendanceRecord] A ([TypeID], [Date], [EmpID], [EhrID], [SickReason], [AbsID])
SELECT
T.[ID], F.[EHrDate], F.[EHrEmpID], F.[EHrID], F.[EHRUserComment], F.[AbsID]
FROM
[Focus].[dbo].[AllAbsence] F
INNER JOIN
[AMData].[dbo].[AttendanceTypes] T ON F.[AbsID] = T.[FocusID]
WHERE
NOT EXISTS (SELECT *
FROM [AMData].[dbo].[AttendanceRecord] A, [Focus].[dbo].[AllAbsence] F
WHERE A.[EhrID] = F.[EHrID]
AND F.[EHrDate] BETWEEN '2018/01/01' AND '2018/12/01'
AND F.[AbsID] <> 0
AND F.[AbsID] <> 2);
I am getting a syntax error when I try to run the code. I am not used to running insert code so there is probably something obviously wrong.
A guess, but I suspect the problem is you are aliasing the table AttendanceRecord. You can't alias a table in the INSERT clause. The error is actually telling you this:
Ln: 1 Col: 47 - Incorrect syntax near 'A'
Remove the A and this will work (well, the below SQL doesn't have a syntax error, so I use the word "work" loosely):
INSERT INTO [AMData].[dbo].[AttendanceRecord] ([TypeID],
[Date],
[EmpID],
[EhrID],
[SickReason],
[AbsID])
SELECT T.[ID],
F.[EHrDate],
F.[EHrEmpID],
F.[EHrID],
F.[EHRUserComment],
F.[AbsID]
FROM [Focus].[dbo].[AllAbsence] AS F
INNER JOIN [AMData].[dbo].[AttendanceTypes] AS T ON F.[AbsID] = T.[FocusID]
WHERE NOT EXISTS (SELECT *
FROM [AMData].[dbo].[AttendanceRecord] AS A,
[Focus].[dbo].[AllAbsence] AS F
WHERE A.[EhrID] = F.[EHrID]
AND F.[EHrDate] BETWEEN '2018/01/01' AND '2018/12/01'
AND F.[AbsID] <> 0
AND F.[AbsID] <> 2);
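The statement above parses, but its NOT EXISTS subquery declares its own F, which hides the outer F. The subquery is therefore not correlated with the row being inserted: as soon as any pair of rows satisfies it, nothing is inserted at all. The following is a sketch of what the question appears to be after, under the assumption that the date and AbsID filters are meant to restrict the rows being copied and that EhrID alone identifies an already-copied row.

```sql
-- Assumption: filters apply to the source rows; EhrID identifies duplicates.
INSERT INTO [AMData].[dbo].[AttendanceRecord]
       ([TypeID], [Date], [EmpID], [EhrID], [SickReason], [AbsID])
SELECT T.[ID], F.[EHrDate], F.[EHrEmpID], F.[EHrID], F.[EHRUserComment], F.[AbsID]
FROM [Focus].[dbo].[AllAbsence] AS F
INNER JOIN [AMData].[dbo].[AttendanceTypes] AS T
        ON F.[AbsID] = T.[FocusID]
WHERE F.[EHrDate] BETWEEN '20180101' AND '20181201'  -- yyyymmdd is language-neutral
  AND F.[AbsID] NOT IN (0, 2)
  AND NOT EXISTS (SELECT 1
                  FROM [AMData].[dbo].[AttendanceRecord] AS A
                  WHERE A.[EhrID] = F.[EHrID]);       -- refers to the outer F
```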

Can "constant" lookups be done efficiently within a single query?

Pop quiz, SQL Server hotshots:
How many times will the following student subquery be executed? (assuming there are at least ten rows in something):
SELECT TOP 10 a, b
, (SELECT type_id
FROM type
WHERE type_code = 'student') student
FROM something
If you said 1, then like me, you assume SQL Server would recognize the value of student as an invariant scalar.
Unfortunately, the answer is 10:
I know, I'll use a CTE!
WITH codes (student) AS (
SELECT (SELECT type_id
FROM type
WHERE type_code = 'student')
)
SELECT TOP 10 a, b
, student
FROM something
CROSS JOIN codes
The result is exactly the same.
Of course, I can get the desired efficiency by first capturing the scalar to a variable:
DECLARE @Student tinyint
SELECT @Student = type_id
FROM type
WHERE type_code = 'student'
SELECT TOP 10 a, b
, @Student student
FROM something
This only does one seek, and adds nothing to the main query plan:
But besides being more verbose, if you're defining an inline table-valued function, it means you also have to write out an otherwise implicit return schema, which is a pain (and adds a vector for errors).
Is there any way to write a single query that only runs the subquery once?
For this query:
SELECT TOP 10 a, b,
(SELECT type_id FROM type WHERE type_code = 'student'
) as student
FROM something;
You want an index on type(type_code, type_id).
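Such an index could be created roughly as follows. This is only a sketch: the index name is made up, and it assumes no equivalent index already exists on the table.

```sql
-- Hypothetical index name. type_code is the seek key; carrying type_id in
-- the index lets the lookup be answered without touching the base table.
CREATE NONCLUSTERED INDEX IX_type_type_code
    ON type (type_code, type_id);
```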
You might find this more efficient if you move the subquery to the FROM clause:
SELECT TOP 10 a, b,
t.type_id
FROM something s CROSS JOIN
(SELECT type_id FROM type WHERE type_code = 'student'
) t
Or even:
SELECT TOP 10 s.a, s.b, t.type_id
FROM something s JOIN
type t
ON t.type_code = 'student';

Counts rows returned by query that uses distinct

I have a simple query in this form:
SELECT DISTINCT a, b, c
FROM MyTable
WHERE a = SomeConditionsEtc
etc...
But I need to know how many rows it's going to return. Initially I was doing this:
SELECT COUNT(DISTINCT a)
FROM MyTable
WHERE a = SomeConditionsEtc
But that's not reliable when a contains duplicates where the other columns don't. So now I'm using a nested query:
SELECT COUNT(*)
FROM (SELECT DISTINCT a, b, c
FROM MyTable
WHERE a = SomeConditionsEtc) AS Temp
Is that the correct approach or is there a better way?
Your query is straight to the point, does the job, and is simple enough. I'm sure you could bake some unnecessary rocket science into it, but that would be overblown, IMHO. As an alternative to what you have, you can use a GROUP BY like the one below, but you would basically be doing the same thing: getting the unique rows and counting them.
SELECT COUNT(1)
FROM (SELECT a
FROM MyTable
WHERE a = 'a'
GROUP BY a, b, c) Temp
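To see why COUNT(DISTINCT a) and the nested query can disagree, here is a small self-contained sketch. The three sample rows are invented for illustration and stand in for MyTable.

```sql
-- Sample data (hypothetical): two identical rows plus one that differs only in b.
WITH MyTable (a, b, c) AS (
    SELECT *
    FROM (VALUES (1, 'x', 'p'),
                 (1, 'x', 'p'),
                 (1, 'y', 'p')) AS v (a, b, c)
)
SELECT (SELECT COUNT(DISTINCT a) FROM MyTable)             AS distinct_a,
       (SELECT COUNT(*)
        FROM (SELECT DISTINCT a, b, c FROM MyTable) AS t)  AS distinct_rows;
-- distinct_a = 1, distinct_rows = 2
```

Only the second figure matches the number of rows that SELECT DISTINCT a, b, c returns.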

How to reuse calculated columns avoiding duplicating the sql statement

I have a lot of calculated columns and they keep repeating themselves, one inside another, including nested CASE statements.
Here is a really simplified version of what I've been looking for a way to do:
SELECT
(1+2) AS A,
A + 3 AS B,
B * 7 AS C
FROM MYTABLE
You could try something like this.
SELECT
A.Val AS A,
B.Val AS B,
C.Val AS C
FROM MYTABLE
cross apply(select 1 + 2) as A(Val)
cross apply(select A.Val + 3) as B(Val)
cross apply(select B.Val * 7) as C(Val)
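The same chaining works when the expressions depend on real columns. The sketch below uses an invented two-row source (Qty and UnitPrice are hypothetical column names) so that it can run on its own; each CROSS APPLY can see the aliases introduced by the ones before it.

```sql
-- Hypothetical data and column names, for illustration only.
SELECT o.Qty, o.UnitPrice, x.Gross, y.Tax, z.Total
FROM (VALUES (2, 10.00),
             (5, 3.50)) AS o (Qty, UnitPrice)
CROSS APPLY (SELECT o.Qty * o.UnitPrice AS Gross) AS x  -- defined once
CROSS APPLY (SELECT x.Gross * 0.20      AS Tax)   AS y  -- reuses x.Gross
CROSS APPLY (SELECT x.Gross + y.Tax     AS Total) AS z; -- reuses both
```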
You can't reference just-created expressions by later referencing their column aliases. Think of the entire select list as being materialized at the same time or in random order - A doesn't exist yet when you're trying to make an expression to create B. You need to repeat the expressions - I don't think you'll be able to make "simpler" computed columns without repeating them, and views the same - you'll have to nest things, like:
SELECT A, B, C = B * 7
FROM
(
SELECT A, B = A + 3
FROM
(
SELECT A = (1 + 2)
) AS x
) AS y;
Or repeat the expression (but I guess that is what you're trying to avoid).
Another option if someone is still interested:
with aa(a) as ( select 1+2 )
, bb(b) as ( select a+3 from aa )
,cc(c) as ( select b*7 from bb)
SELECT aa.a, bb.b, cc.c
from aa,bb,cc
The only way to "save" the results of your calculations would be using them in a subquery, that way you can use A, B and C. Unfortunately it cannot be done any other way.
You can create computed columns to represent the values you want. Also, you can use a view if your calculations are dependent on data in a separate table.
Do you want calculated results out of your table? In that case you can put the relevant calculations in scalar valued user defined function and use that inside your select statement.
Or do you want the calculated results to appear as columns in the table, then use a computed column:
CREATE TABLE Test(
ID INT NOT NULL IDENTITY(1,1),
TimesTen AS ID * 10
)

set difference in SQL query

I'm trying to select records with a statement
SELECT *
FROM A
WHERE
LEFT(B, 5) IN
(SELECT * FROM
(SELECT LEFT(A.B,5), COUNT(DISTINCT A.C) c_count
FROM A
GROUP BY LEFT(B,5)
) p1
WHERE p1.c_count = 1
)
AND C IN
(SELECT * FROM
(SELECT A.C , COUNT(DISTINCT LEFT(A.B,5)) b_count
FROM A
GROUP BY C
) p2
WHERE p2.b_count = 1)
which takes a long time to run ~15 sec.
Is there a better way of writing this SQL?
If you would like to represent set difference (A-B) in SQL, here is a solution for you.
Let's say you have two tables A and B, and you want to retrieve all records that exist only in A but not in B, where A and B have a relationship via an attribute named ID.
An efficient query for this is:
-- (A-B)
SELECT DISTINCT A.* FROM (A LEFT OUTER JOIN B on A.ID=B.ID) WHERE B.ID IS NULL
-from Jayaram Timsina's blog.
You don't need to return data from the nested subqueries. I'm not sure this will make a difference without indexing, but it's easier to read.
And EXISTS/JOIN is probably nicer, IMHO, than using IN:
SELECT *
FROM
A
JOIN
(SELECT LEFT(B,5) AS b1
FROM A
GROUP BY LEFT(B,5)
HAVING COUNT(DISTINCT C) = 1
) t1 On LEFT(A.B, 5) = t1.b1
JOIN
(SELECT C AS C1
FROM A
GROUP BY C
HAVING COUNT(DISTINCT LEFT(B,5)) = 1
) t2 ON A.C = t2.c1
But you'll need a computed column as marc_s said at least
And 2 indexes: one on (computed, C) and another on (C, computed)
Well, not sure what you're really trying to do here - but obviously, that LEFT(B, 5) expression keeps popping up. Since you're using a function, you're giving up any chance to use an index.
What you could do in your SQL Server table is to create a computed, persisted column for that expression, and then put an index on that:
ALTER TABLE A
ADD LeftB5 AS LEFT(B, 5) PERSISTED
CREATE NONCLUSTERED INDEX IX_LeftB5 ON dbo.A(LeftB5)
Now use the new computed column LeftB5 instead of LEFT(B, 5) anywhere in your query - that should help to speed up certain lookups and GROUP BY operations.
Also - you have a GROUP BY C in there - is that column C indexed?
If you are looking for just the set difference between table1 and table2, the query below is a simple one that gives the rows that are in table1 but not in table2, assuming both tables have the same schema with column names columnone, columntwo, ...
with
col1 as (
select columnone from table2
),
col2 as (
select columntwo from table2
)
...
select * from table1
where (
columnone not in (select columnone from col1)
and columntwo not in (select columntwo from col2)
...
);
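Note that a column-by-column NOT IN is stricter than a row-wise set difference: a table1 row is dropped as soon as any one of its column values appears anywhere in the corresponding column of table2, even if no single table2 row matches it entirely. For a row-wise difference, SQL Server provides EXCEPT. The following is a minimal sketch, assuming the tables share just the two columns named above.

```sql
-- Rows of table1 that have no identical row in table2.
-- EXCEPT returns distinct rows and treats NULLs as equal when comparing.
SELECT columnone, columntwo
FROM table1
EXCEPT
SELECT columnone, columntwo
FROM table2;
```

If duplicates in table1 must be preserved, a correlated NOT EXISTS on all columns is the usual alternative, at the cost of handling NULLs explicitly.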

Resources