More efficient alternative to subquery - sql-server

I've got many "actual" and "history companion tables" (so to speak) with structure of last like this:
values| date_deal | type_deal | num (autoinc)
value1| 01.01.2012 | i | 1
value1| 02.01.2012 | u | 2
value2| 02.01.2012 | i | 3
value2| 03.01.2012 | u | 4
value1| 04.01.2012 | d | 5
value2| 05.01.2012 | u | 6
value2| 08.01.2012 | u | 7
If I insert (or update or delete) record in "actual" table, trigger puts affected record into "history table" with date_deal = Geddate(), type_deal = i|u|d (for insert, update and delete triggers respectivly) and num as autoinc unique value
So the question is how to get last record for each distinct value valid on certain date and excluding from final result records which type_deal = 'd' (since that record was deleted from actual table by that time and we don't want to have anything assosiated with it)
The way I do it most of the time:
SELECT *
FROM t_table1 t1
WHERE t1.num = ( SELECT MAX(num)
FROM t_table1 t2
WHERE t2.[values] = t1.[values]
AND t2.[date_deal] < #dt)
AND t1.[type_deal] <> 'D'
But that works very slow sometimes. I'm looking for more efficient alternative. Please, help
So, an update.
Thanks for replies, friends.
I've made some testing on both actual and testing servers.
In order to put these different approaches into same league I've decided that we should take all fields from source table.
Testing server has bellow 200K records and I also had a luxury of using DBCC FreeProcCache
and DBCC DropCleanbuffers directives. Actual working server has over 2.3M records and also no option for droping buffs or cache since.. well.. it is in use by real users. So it was droped only once and i've got results right after that.
Here is actual queries and time it took on both servers:
Original:
DECLARE #dt datetime = CONVERT(datetime, '01.08.2013', 104)
SELECT *
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
WHERE c.num = ( SELECT MAX(num)
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c2
WHERE c2.[AccountSys] = c.[AccountSys]
AND date_deal <= #dt)
AND c.type_deal <> 'D'
61sec # 2'316'890rec on real one, 4sec # 191'533 on test
Rahul's:
SELECT *
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
GROUP BY [all_fields]
HAVING c.num = ( SELECT MAX(num)
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c2
WHERE c2.[AccountSys] = c.[AccountSys]
AND date_deal <= #dt)
AND c.type_deal <> 'D'
62sec # 2'316'890rec on real one, 4sec # 191'533 on test
Almost equal
George's (with some major changes):
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER(PARTITION BY accountsys ORDER BY num desc) AS aa
FROM [CLIENTS_HISTORY].[dbo].[Clients_all_h] c
WHERE c.date_deal < #dt) as a
WHERE aa=1
AND type_deal <> 'D'
76sec # 2'316'890rec on real one, 5sec # 191'533 on test
So far original and Rahul's are fastest and George's is not so fast.

Try using GROUP BY..HAVING CLAUSE
SELECT *
FROM t_table1 t1
GROUP BY [column_names]
HAVING t1.num = ( SELECT MAX(num)
FROM t_table1 t2
WHERE t2.[values] = t1.[values]
AND t2.[date_deal] < #dt)
AND t1.[type_deal] <> 'D'

I think row_num() could be usefull for you as follows:
select
*
from
(
select
*,
row_number() over( partition by date_deal order by num) as aa
from
t_table1 t1
where
t1.[type_deal] <> 'D'
) as a
where
aa=1

Related

WHERE inside CASE statement

I have seen lots of posts about putting CASE statements inside WHERE clauses. However, in my case, I want to put a WHERE statement inside a CASE statement, which I don't think can be done.
My situation is that I have a view that is going on many, many databases. In half of them, I need to have a WHERE clause. In the other half, I don't need the WHERE clause. Leaving it in isn't harmful, in that I don't get bad data, but it slows the query down considerably given that it has to read and sort a large table when it is unnecessary.
Status_Table
ID Status
1 Active
2 Inactive
3 Unknown --this row of data will not exist in some DBs
Item_table
Item_ID Status value1 value2
1001 3 ..... .....
1002 1 ..... .....
What I want to do is something like this.
-- big nasty ugly query with various CTEs, selects, and joins
CASE WHEN (SELECT MAX(ID) FROM Status_table) = 3
THEN WHERE Item_Table.Status != 3
ELSE WHERE 1=1
END
Ultimately, I can probably swing this by constructing the query using dynamic SQL, but I was hoping for a more elegant solution.
You're trying to build a conditional WHERE clause, yes? I'd think something like this would work pretty well for what you want to do:
WHERE (EXISTS (SELECT 1 FROM Status_table WHERE ID = 3)
AND Item_Table.Status != 3)
OR
NOT EXISTS (SELECT 1 FROM Status_table WHERE ID = 3)
SQL Fiddle
MS SQL Server 2014 Schema Setup:
CREATE TABLE Status_Table ( ID int, [Status] varchar(30) ) ;
INSERT INTO Status_Table ( ID, [Status] )
VALUES
( 1,'Active' ), (2,'Inactive'), (3,'Unknown')
;
CREATE TABLE Item_table ( Item_ID int, [Status] int, value1 varchar(10), value2 varchar(10) ) ;
/* NOTE: Item_table.[Status] should be a descriptive name, like [StatusID], or something not the same name as a different datatype column in the relation table. */
INSERT INTO Item_Table ( Item_ID, [Status], value1, value2 )
VALUES (1001,3,'Bill','Exclude')
,(1002,1,'Ted','Include')
,(1003,3,'Rufus','Exclude')
,(1004,2,'Jay','Include')
,(1005,5,'Bob','BadRecord')
;
Query 1:
SELECT s1.Item_ID, s1.Status, s1.Value1, s1.Value2
FROM (
SELECT i.Item_ID, i.Status, i.Value1, i.Value2
, RANK() OVER (ORDER BY s.ID DESC) AS rn
FROM Item_Table i
RIGHT OUTER JOIN Status_Table s ON i.[Status] = s.ID
) s1
WHERE s1.rn <> 1
Results:
| Item_ID | Status | Value1 | Value2 |
|---------|--------|--------|---------|
| 1004 | 2 | Jay | Include |
| 1002 | 1 | Ted | Include |
If its performance your after start using DECLARE and SET.
It only needs to execute ONCE!!!
DECLARE #UnkownExists AS INT
SET #UnkownExists = (select count(*) from Status_table where ID = 3)
-- big nasty ugly query with various CTEs, selects, and joins
WHERE (#UnkownExists=1 AND Item_Table.Status != 3) OR #NoUnknown=0
You can try this:
CASE WHEN (SELECT MAX(ID) FROM Status_table) = 3
THEN 3
ELSE (SELECT MAX(ID)+1 FROM Status_table) --Invalid value that will not exist
END <> Item_Table.Status

How to select random rows whose column sum meets the condition in Sql server

Is it possible to select random rows from a table whose particular column total (sum) should be less than my condition value ?
My table structure is like -
id | question | answerInSec
1 | Quest1 | 15
2 | Quest2 | 20
3 | Quest3 | 10
4 | Quest4 | 15
5 | Quest5 | 10
6 | Quest6 | 15
7 | Quest7 | 20
I want to get those random questions whose total sum of 'answerInSec' column is less than (nearest total) or equal to 60.
So random combination can be [1,2,3,4] OR [2,3,5,7] OR [4,5,6,7] etc.
I tried as follows but no luck
select id,question,answerinsec
from (select Question.*, sum(answerinsec) over (order by id) as CumTicketCount
from Question
) t
where cumTicketCount <= 60
ORDER BY NEWID();
I hope this one help
DECLARE #MaxAnswerInSec INT = 60
DECLARE #SumAnswerInSec INT = 0
DECLARE #RadomQuestionTable TABLE(Id INT, Question NVARCHAR(100), AnswerInSec INT)
DECLARE #tempId INT,
#tempQuestion NVARCHAR(100),
#tempAnswerInSec INT
WHILE #SumAnswerInSec <= #MaxAnswerInSec
BEGIN
SELECT TOP(1) #tempId = Id, #tempQuestion = Question, #tempAnswerInSec = AnswerInSec
FROM Question
WHERE Id NOT IN (SELECT Id FROM #RadomQuestionTable)
AND AnswerInSec + #SumAnswerInSec <= #MaxAnswerInSec
ORDER BY NEWID()
IF #tempId IS NOT NULL
BEGIN
INSERT INTO #RadomQuestionTable VALUES(#tempId, #tempQuestion, #tempAnswerInSec)
END
ELSE
BEGIN
BREAK
END
SELECT #tempId = NULL
SELECT #SumAnswerInSec = SUM(AnswerInSec) FROM #RadomQuestionTable
END
SELECT * FROM #RadomQuestionTable
OK. Try this. This might not be the fastest, but is easier to understand and implement. Moreover this is a SQL-only solution:
SELECT t1.id, t2.id, t3.id, t4.id FROM Question t1 CROSS JOIN Question t2
CROSS JOIN Question t3 CROSS JOIN Question t4
WHERE t2.id > t1.id AND t3.id > t2.id AND t4.id > t3.id
AND t1.answerInSec + t2.answerInSec + t3.answerInSec + t4.answerInSec = 60
What this basically does is to create a cross product of your Questions table with itself and then repeats this process two more times, thus creating N ^ 4 rows where N is the number of rows in your table. It then filters out duplicate rows by only those selecting the permutations where t1.id < t2.id < t3.id < t4.id. It then filters remaining rows by looking for the rows where the sum of all score fields is equal to your target value (60).
Note that this result set can become HUGE for even moderately sized tables. For example, a table with just 200 rows will generate a cross product of 200 ^ 4 = 1,600,000,000 rows (though a lot of them will be discarded by the WHERE clause). You should have your indexes in place if your table is large.
Also note that this query does not account for the permutations where less than 4 rows may add up to 60. You can easily modify it to do that by including a NULL row in your table (a row whose score field is zero).
SELECT *
FROM question
WHERE answerInSec<50
ORDER BY CHECKSUM(NEWID())

SQL Server 2008 Is it Possible to Have Select Top Return Nulls

(Select top 1 pvd.Code from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder) as "DX1",
(Select top 1 a.code from (Select top 2 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX2",
(Select top 1 a.code from (Select top 3 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX3",
(Select top 1 a.code from (Select top 4 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX4",
(Select top 1 a.code from (Select top 5 pvd.Code,pvd.ListOrder from PatientVisitDiags pvd
where pvd.PatientVisitId = pv.PatientVisitId
Order By pvd.Listorder)a order by a.ListOrder DESC ) as "DX5"
The above code is what I am using currently (It is not optimal but is only being used once for a one time Data Export).
In the database that we are currently exporting from, there is a table PatientVisitDiags that has columns "ListOrder" and "Code". There can be between 1 and 5 codes. The ListOrder holds the number of that code. For example:
ListOrder|Code |
1 |M51.27 |
2 |M54.17 |
3 |G83.4 |
I am trying to export the Code to its corresponding Column in the new table(DX1,DX2..etc). If I sort by ListOrder I can get them in the order I need (Row 1 to DX1 | Row 2 to DX2 etc.) However when I run the above SQL code, If the source table only has 3 Codes DX4 and DX5 will repeat DX3. For Example:
DX1 |DX2 |DX3 |DX4 |DX5
M51.27 |M54.17 |G83.4 |G83.4 |G83.4
Is there a way to have TOP return NULL values if you Select TOP more than what is given? SQL Sever 2008 does not allow for OFFSET/FETCH, this is what I normally would have done given the option to select individual rows.
TL:DR
ID | Name
1 | Joe
2 | Eric
3 | Steve
4 | John
If I have a table like above and run
SELECT TOP 5 Name FROM Table
Is there anyway to return?
Joe
Eric
Steve
John
NULL
What you're really doing is pivoting. So pivot! Try this little query:
WITH Top5 AS (
SELECT TOP 5
Dx = 'DX' + Convert(varchar(11), Row_Number() OVER (ORDER BY pvd.Listorder)),
pvd.Code
FROM dbo.PatientVisitDiags pvd
WHERE pvd.PatientVisitId = #patientVisitId
)
SELECT *
FROM
Top5 t
PIVOT (Max(Code) FOR Dx IN (DX1, DX2, DX3, DX4, DX5)) p
;
To answer your second question about getting an unpivoted rowset, basically do the same thing but provide the 5 rows somehow and left join to the desired data.
WITH Data AS (
SELECT TOP 5
Seq = Row_Number() OVER(ORDER BY ID),
Name
FROM dbo.Table
ORDER BY ID
)
SELECT
n.Seq,
t.Name
FROM
(VALUES
(1), (2), (3), (4), (5) -- or a numbers-generating CTE perhaps
) n (Seq)
LEFT JOIN Top 5 t
ON n.Seq = t.Seq
;
Side note
The fact that you're doing this:
where pvd.PatientVisitId = pv.PatientVisitId
tells me you're not using ANSI joins. Stop. Don't do that any more. Put this join condition in the ON clause of a JOIN. It's the year 2016... why are you using join syntax from the last century?
Oh, and prefix the schema on the table names. Look it up--you'll find actual performance reasons why you should do that. It's not just about the time taken to find the correct schema, but also about the execution plan cache...
one at a time - answering the last question
create a table with a bunch of null
select top (5) col
from
(
select col from table1
union
select nulCol from nullTable
) tt
order by tt.col

SQL Pulling the latest information and information from another table

I have a record table that is recording changes within a table. I can pull the data from the first table fine, however when i try to join in another table to add some of its column information it stops displaying the information.
PartNumber | PartDesc | value | date
1 | test | 1 | 3/4/2015
I wanted to include the Aisle tag's from the location table
PartNumber| AisleTag | AisleTagTwo
1 | A1 | N/A
here is what i have as my sql statement so far
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
where Date = (select max(Date) from InvRecord where t1.PartNumber = InvRecord.PartNumber)
order by t1.PartNumber
it is coming up blank, my original sql statement doesn't include anything from t2. I am not sure what approach to go with in terms of getting the data combined any help is much appreciated thank you !
this should be the end result
PartNumber | PartDesc | value | date | AisleTag | AisleTagTwo
1 | test | 1 | 3/4/2015 | A1 | N/A
Pull the most recent row (based on Date) for each PartNumber in Table A and append data from Table B (joined on PartNumber):
SELECT *
FROM (
SELECT A.PartNumber
, A.PartDesc
, A.NewValue
, A.Date
, B.AisleTag
, B.AisleTagTwo
, DateSeq = ROW_NUMBER() OVER(PARTITION BY A.PartNumber ORDER BY A.Date DESC)
FROM InvRecord A
LEFT JOIN PartAisleListTbl B
ON A.PartNumber = B.PartNumber
) A
WHERE A.DateSeq = 1
ORDER BY A.PartNumber
Are you returning no records at all, or only records with AisleTag and AisleTagTwo as null?
Your sentence "it is coming up blank, my original sql statement doesn't include anything from t2." makes it sound like you're getting records with nulls for the t2 fields.
If you are, then you probably have a record in t2 that has nulls for those fields.
For troubleshooting purposes, try running the query without the WHERE clause:
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2 ON t1.PartNumber = t2.PartNumber
order by t1.PartNumber
If you do get records, your problem is with the WHERE clause. If you don't, your problem is with the PartNumber fields in InvRecord and PartAisleListTbl not matching.
Not sure why your's isn't working... is date in both t1 and t2 by any chance?
Here's it re factored to use a inline view instead of a correlated query wonder if it makes a difference.
Select t1.PartNumber, t1.PartDesc , t1.NewValue , t1.Date,t2.AisleTag,t2.AisleTagTwo
from InvRecord t1
JOIN PartAisleListTbl t2
ON t1.PartNumber = t2.PartNumber
JOIN (select max(Date) mdate, PartNumber from InvRecord GROUP BY PartNumber) t3
on t3.partNumber= T1.PartNumber
and T3.mdate = T1.Date
order by t1.PartNumber

How do I get the "Next available number" from an SQL Server? (Not an Identity column)

Technologies: SQL Server 2008
So I've tried a few options that I've found on SO, but nothing really provided me with a definitive answer.
I have a table with two columns, (Transaction ID, GroupID) where neither has unique values. For example:
TransID | GroupID
-----------------
23 | 4001
99 | 4001
63 | 4001
123 | 4001
77 | 2113
2645 | 2113
123 | 2113
99 | 2113
Originally, the groupID was just chosen at random by the user, but now we're automating it. Thing is, we're keeping the existing DB without any changes to the existing data(too much work, for too little gain)
Is there a way to query "GroupID" on table "GroupTransactions" for the next available value of GroupID > 2000?
I think from the question you're after the next available, although that may not be the same as max+1 right? - In that case:
Start with a list of integers, and look for those that aren't there in the groupid column, for example:
;WITH CTE_Numbers AS (
SELECT n = 2001
UNION ALL
SELECT n + 1 FROM CTE_Numbers WHERE n < 4000
)
SELECT top 1 n
FROM CTE_Numbers num
WHERE NOT EXISTS (SELECT 1 FROM MyTable tab WHERE num.n = tab.groupid)
ORDER BY n
Note: you need to tweak the 2001/4000 values int the CTE to allow for the range you want. I assumed the name of your table to by MyTable
select max(groupid) + 1 from GroupTransactions
The following will find the next gap above 2000:
SELECT MIN(t.GroupID)+1 AS NextID
FROM GroupTransactions t (updlock)
WHERE NOT EXISTS
(SELECT NULL FROM GroupTransactions n WHERE n.GroupID=t.GroupID+1 AND n.GroupID>2000)
AND t.GroupID>2000
There are always many ways to do everything. I resolved this problem by doing like this:
declare #i int = null
declare #t table (i int)
insert into #t values (1)
insert into #t values (2)
--insert into #t values (3)
--insert into #t values (4)
insert into #t values (5)
--insert into #t values (6)
--get the first missing number
select #i = min(RowNumber)
from (
select ROW_NUMBER() OVER(ORDER BY i) AS RowNumber, i
from (
--select distinct in case a number is in there multiple times
select distinct i
from #t
--start after 0 in case there are negative or 0 number
where i > 0
) as a
) as b
where RowNumber <> i
--if there are no missing numbers or no records, get the max record
if #i is null
begin
select #i = isnull(max(i),0) + 1 from #t
end
select #i
In my situation I have a system to generate message numbers or a file/case/reservation number sequentially from 1 every year. But in some situations a number does not get use (user was testing/practicing or whatever reason) and the number was deleted.
You can use a where clause to filter by year if all entries are in the same table, and make it dynamic (my example is hardcoded). if you archive your yearly data then not needed. The sub-query part for mID and mID2 must be identical.
The "union 0 as seq " for mID is there in case your table is empty; this is the base seed number. It can be anything ex: 3000000 or {prefix}0000. The field is an integer. If you omit " Union 0 as seq " it will not work on an empty table or when you have a table missing ID 1 it will given you the next ID ( if the first number is 4 the value returned will be 5).
This query is very quick - hint: the field must be indexed; it was tested on a table of 100,000+ rows. I found that using a domain aggregate get slower as the table increases in size.
If you remove the "top 1" you will get a list of 'next numbers' but not all the missing numbers in a sequence; ie if you have 1 2 4 7 the result will be 3 5 8.
set #newID = select top 1 mID.seq + 1 as seq from
(select a.[msg_number] as seq from [tblMSG] a --where a.[msg_date] between '2023-01-01' and '2023-12-31'
union select 0 as seq ) as mID
left outer join
(Select b.[msg_number] as seq from [tblMSG] b --where b.[msg_date] between '2023-01-01' and '2023-12-31'
) as mID2 on mID.seq + 1 = mID2.seq where mID2.seq is null order by mID.seq
-- Next: a statement to insert a row with #newID immediately in tblMSG (in a transaction block).
-- Then the row can be updated by your app.

Resources