SQL Server: getting data from multiple rows into one row

I have the following table and I want to get one row per user (UserId) containing the Answer values for both Initial = True and Initial = False. So the following table
UserId QuestionId Answer Initial
----------------------------------------------------------------
027D76AC-DFBC-4BD2-9B88DD7B2456338E 1 5 False
027D76AC-DFBC-4BD2-9B88DD7B2456338E 1 4 True
06B1713D-2E47-4454-8949C950C58753DC 1 4 True
216F33EF-1ACD-4D1F-86D2932AF598326E 1 5 False
216F33EF-1ACD-4D1F-86D2932AF598326E 1 4 True
23A950EB-3C68-4FE7-B719B86DC299343D 1 4 True
23A950EB-3C68-4FE7-B719B86DC299343D 1 4 False
Would return the following results
UserId QuestionId trueAnswer FalseAnswer
-----------------------------------------------------------------------
027D76AC-DFBC-4BD2-9B88DD7B2456338E 1 5 4
216F33EF-1ACD-4D1F-86D2932AF598326E 1 5 4
23A950EB-3C68-4FE7-B719B86DC299343D 1 4 4
Is this something that can be done with sub selects?

I think the query below could be a solution (the top portion just creates a temp table to test it). I generally try to avoid GROUP BY, and I think OUTER/CROSS APPLY are really cool. Note, though, that the result is the opposite of your expected result. For example, your first row shows the FalseAnswer as 4, but according to the data it is 5, unless I am missing something.
-- creating sample set
IF object_id('tempdb..#YOUR_TABLE') is not null drop table #YOUR_TABLE
CREATE TABLE #YOUR_TABLE (UserID VARCHAR(200), QuestionID INT, Answer INT, Initial BIT)
INSERT INTO #YOUR_TABLE(UserID, QuestionID, Answer, Initial)
Values
('027D76AC-DFBC-4BD2-9B88DD7B2456338E', 1, 5, 'false'),
('027D76AC-DFBC-4BD2-9B88DD7B2456338E', 1, 4, 'true'),
('06B1713D-2E47-4454-8949C950C58753DC', 1, 4, 'true'),
('216F33EF-1ACD-4D1F-86D2932AF598326E', 1, 5, 'false'),
('216F33EF-1ACD-4D1F-86D2932AF598326E', 1, 4, 'true'),
('23A950EB-3C68-4FE7-B719B86DC299343D', 1, 4, 'true'),
('23A950EB-3C68-4FE7-B719B86DC299343D', 1, 4, 'false')
-- solution
SELECT a.UserID,
a.QuestionID,
a.Answer,
b.FalseAnswer
FROM #YOUR_TABLE AS a
OUTER APPLY
(
SELECT y.Answer AS FalseAnswer
FROM #YOUR_TABLE AS y
WHERE y.Initial='false' AND a.UserID=y.UserID
) AS b
WHERE a.Initial='true' AND b.FalseAnswer IS NOT null
output
UserID QuestionID Answer FalseAnswer
027D76AC-DFBC-4BD2-9B88DD7B2456338E 1 4 5
216F33EF-1ACD-4D1F-86D2932AF598326E 1 4 5
23A950EB-3C68-4FE7-B719B86DC299343D 1 4 4

Try this:
select UserID,
max(QuestionID),
max(case when Initial = 'True' then Answer end) [trueAnswer],
max(case when Initial = 'False' then Answer end) [falseAnswer]
from TABLE_NAME
group by UserID
having count(distinct Initial) = 2
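If a user can answer more than one question, max(QuestionID) in the query above would fold those questions into a single row. A minimal sketch that groups by both keys instead, assuming Initial is a BIT column as in the first answer's test table (@Answers is a hypothetical table variable filled with the question's rows):

```sql
-- Hypothetical table variable holding the sample rows from the question.
DECLARE @Answers TABLE (UserID varchar(200), QuestionID int, Answer int, Initial bit);

INSERT INTO @Answers (UserID, QuestionID, Answer, Initial) VALUES
('027D76AC-DFBC-4BD2-9B88DD7B2456338E', 1, 5, 0),
('027D76AC-DFBC-4BD2-9B88DD7B2456338E', 1, 4, 1),
('06B1713D-2E47-4454-8949C950C58753DC', 1, 4, 1),
('216F33EF-1ACD-4D1F-86D2932AF598326E', 1, 5, 0),
('216F33EF-1ACD-4D1F-86D2932AF598326E', 1, 4, 1),
('23A950EB-3C68-4FE7-B719B86DC299343D', 1, 4, 1),
('23A950EB-3C68-4FE7-B719B86DC299343D', 1, 4, 0);

-- One row per user and question, keeping only pairs that have
-- both an Initial = 1 row and an Initial = 0 row.
SELECT UserID,
       QuestionID,
       MAX(CASE WHEN Initial = 1 THEN Answer END) AS TrueAnswer,
       MAX(CASE WHEN Initial = 0 THEN Answer END) AS FalseAnswer
FROM @Answers
GROUP BY UserID, QuestionID
HAVING COUNT(CASE WHEN Initial = 1 THEN 1 END) > 0
   AND COUNT(CASE WHEN Initial = 0 THEN 1 END) > 0;
```

With the sample data this should drop user 06B1713D..., who has no Initial = False row, and return TrueAnswer 4 / FalseAnswer 5 for the first user, matching the first answer's output rather than the swapped columns in the question.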

Related

Rank or merge sequential rows

I have a log table that I need to either rank (treating sequential rows with equal values as ties) or merge (collapsing sequential rows with equal values in a specific column). My table looks like the one below. Start and Stop are sequential within the same ID window.
ID Start Stop Value
1 0 1 A
1 1 2 A
1 2 3 A
1 3 4 B
1 4 5 B
1 5 6 A
2 3 4 A
I have two approaches to get what I need.
Approach 1: Rank (treating sequential rows with equal values in "Value" as ties) and using ID as partition.
This should give the output below. But how do I do the special rank: Treating sequential rows with equal values in "Value" as ties.
Select *,
rank() OVER (partition by id order by start, stop) as Rank,
XXX as SpecialRank
from Table
ID Start Stop Value Rank SpecialRank
1 0 1 A 1 1
1 1 2 A 2 1
1 2 3 A 3 1
1 3 4 B 4 2
1 4 5 B 5 2
1 5 6 A 6 3
2 3 4 A 1 1
Approach 2: Merge sequential rows with equal values in "Value".
This should create a table like the one below.
ID Start Stop Value
1 0 3 A
1 3 5 B
1 5 6 A
2 3 4 A
I don't know if this helps, but I also have a NextValue column that might be useful here:
ID Start Stop Value NextValue
1 0 1 A A
1 1 2 A A
1 2 3 A B
1 3 4 B B
1 4 5 B A
1 5 6 A A
2 3 4 A ...
Example-table:
CREATE TABLE #Table ( id int, start int, stop int, Value char(1), NextValue char(1));
INSERT INTO #Table values (1,0, 1, 'A', 'A');
INSERT INTO #Table values (1,1, 2, 'A', 'A');
INSERT INTO #Table values (1,2, 3, 'A', 'B');
INSERT INTO #Table values (1,3, 4, 'B', 'B');
INSERT INTO #Table values (1,4, 5, 'B', 'A');
INSERT INTO #Table values (1,5, 6, 'A', 'A');
INSERT INTO #Table values (2,3, 4, 'A', null);
Use a self join to a subquery that ranks the distinct values from the full set, e.g.
with rankTable (id, value) as
( select 1, 'A' union all select 1, 'A' union all select 1, 'B' union all select 2, 'A')
select t2.* from rankTable t1 join (
select id, value, rank() over (partition by id order by value) as specialRank from
(
select distinct id, value
from rankTable
) t) t2 on t2.id =t1.id and t2.value = t1.value
id value specialRank
1 A 1
1 A 1
1 B 2
2 A 1
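Note that ranking the distinct values gives the second run of 'A' for ID 1 the same rank as the first run, while the question expects it to get 3. One way to number runs rather than values is the gaps-and-islands pattern sketched below. It assumes SQL Server 2012 or later (for LAG and the windowed running SUM); @Log, IsNewRun and the CTE names are made up for illustration.

```sql
-- Hypothetical table variable holding the question's sample rows.
DECLARE @Log TABLE (id int, [start] int, [stop] int, Value char(1));

INSERT INTO @Log (id, [start], [stop], Value) VALUES
(1, 0, 1, 'A'), (1, 1, 2, 'A'), (1, 2, 3, 'A'),
(1, 3, 4, 'B'), (1, 4, 5, 'B'), (1, 5, 6, 'A'),
(2, 3, 4, 'A');

WITH flagged AS (
    -- 1 when the row starts a new run (first row of an id, or Value changed).
    SELECT id, [start], [stop], Value,
           CASE WHEN LAG(Value) OVER (PARTITION BY id ORDER BY [start], [stop]) = Value
                THEN 0 ELSE 1 END AS IsNewRun
    FROM @Log
), ranked AS (
    -- Running total of run starts = run number within each id (Approach 1).
    SELECT id, [start], [stop], Value,
           SUM(IsNewRun) OVER (PARTITION BY id ORDER BY [start], [stop]
                               ROWS UNBOUNDED PRECEDING) AS SpecialRank
    FROM flagged
)
-- Approach 2: collapse each run into a single row.
SELECT id, MIN([start]) AS [start], MAX([stop]) AS [stop], Value
FROM ranked
GROUP BY id, SpecialRank, Value
ORDER BY id, MIN([start]);
```

Selecting from ranked directly, without the GROUP BY, gives the per-row SpecialRank column from Approach 1.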

SQL Server - Behaviour of ROW_NUMBER Partition by Null Value

I find this behaviour very strange and counterintuitive. (Even for SQL).
set ansi_nulls off
go
;with sampledata(Value, CanBeNull) as
(
select 1, 1
union
select 2, 2
union
select 3, null
union
select 4, null
union
select 5, null
union
select 6, null
)
select ROW_NUMBER() over(partition by CanBeNull order by value) 'RowNumber',* from sampledata
Which returns
1 3 NULL
2 4 NULL
3 5 NULL
4 6 NULL
1 1 1
1 2 2
Which means that all of the nulls are being treated as part of the same group for the purpose of calculating the row number. It doesn't matter whether SET ANSI_NULLS is on or off.
But since by definition the null is totally unknown then how can the nulls be grouped together like this? It is saying that for the purposes of placing things in a rank order that apples and oranges and the square root of minus 1 and quantum black holes or whatever can be meaningfully ordered. A little experimentation suggests that the first column is being used to generate the rank order as
select 1, '1'
union
select 2, '2'
union
select 5, null
union
select 6, null
union
select 3, null
union
select 4, null
generates the same values. This has significant implications which have caused problems in legacy code I am dealing with. Is this the expected behaviour and is there any way of mitigating it other than replacing the null in the select query with a unique value?
The results I would have expected would have been
1 3 NULL
1 4 NULL
1 5 NULL
1 6 NULL
1 1 1
1 2 2
Using Dense_Rank() makes no difference.
When T-SQL evaluates NULLs in predicates, it uses three-valued logic (TRUE, FALSE or UNKNOWN), which gives the behavior you say you expected from your query. For grouping and partitioning, however, T-SQL treats all NULLs as one group. So your query puts the NULLs into a single partition and numbers the rows within that window.
For the results that you say you would like to see, this query should work...
WITH sampledata (Value, CanBeNull)
AS
(
SELECT 1, 1
UNION
SELECT 2, 2
UNION
SELECT 3, NULL
UNION
SELECT 4, NULL
UNION
SELECT 5, NULL
UNION
SELECT 6, NULL
)
SELECT
DENSE_RANK() OVER (PARTITION BY CanBeNull ORDER BY CASE WHEN CanBeNull IS NOT NULL THEN value END ASC) as RowNumber
,Value
,CanBeNull
FROM sampledata
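The difference between comparing NULLs in a predicate and grouping them can be seen side by side in a small sketch. The table variable @s and its rows are made up for illustration, and the default ANSI_NULLS ON setting is assumed:

```sql
-- Hypothetical sample: two non-NULL rows and two NULL rows.
DECLARE @s TABLE (Value int, CanBeNull int);

INSERT INTO @s (Value, CanBeNull) VALUES
(1, 1), (2, 2), (3, NULL), (4, NULL);

-- Predicate: NULL = NULL evaluates to UNKNOWN, so the NULL rows
-- never join to each other; only the rows with 1 and 2 match themselves.
SELECT COUNT(*) AS MatchingPairs
FROM @s AS a
JOIN @s AS b ON a.CanBeNull = b.CanBeNull;

-- Grouping: all NULLs land in one group, just as they share one
-- partition in PARTITION BY.
SELECT CanBeNull, COUNT(*) AS RowsInGroup
FROM @s
GROUP BY CanBeNull;
```

The first query should count only the two non-NULL rows, while the second reports a single NULL group containing both NULL rows.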

How to auto increment update a column in a SQL Server table

How do I update a column in a SQL Server table with an auto-incrementing number, based on the sort order of other columns, restarting the increment whenever the value in another column changes?
My table structure and data look like this:
Name Class OldSrNo NewSrNo
----------------------------
aa 1 1 1
bb 1 2 2
aa 1 3 3
bb 2 4 1
cc 2 5 2
dd 2 6 3
aa 2 7 4
I want to update OldSrNo to look like NewSrNo.
Is this what you are asking for?
Declare @t table (Name varchar(50), Class int, OldSrNo int)
insert into @t values
('aa', 1, 1)
,('bb', 1, 2)
,('aa', 1, 3)
,('bb', 2, 4)
,('cc', 2, 5)
,('dd', 2, 6)
,('aa', 2, 7)
-- number the rows within each class, following the existing serial number
select *,
row_number() over (partition by Class order by OldSrNo) NewSrNo
from @t
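The query above only selects the new numbering. If the column really has to be updated in place, one option is to update through a CTE, sketched here under the assumption that a scratch temp table is acceptable (#SrDemo and the CTE name numbered are hypothetical):

```sql
-- Hypothetical temp table with the question's rows.
CREATE TABLE #SrDemo (Name varchar(50), Class int, OldSrNo int);

INSERT INTO #SrDemo (Name, Class, OldSrNo) VALUES
('aa', 1, 1), ('bb', 1, 2), ('aa', 1, 3),
('bb', 2, 4), ('cc', 2, 5), ('dd', 2, 6), ('aa', 2, 7);

-- The CTE exposes the base column next to the computed number;
-- updating the CTE writes through to #SrDemo.
WITH numbered AS (
    SELECT OldSrNo,
           ROW_NUMBER() OVER (PARTITION BY Class ORDER BY OldSrNo) AS NewSrNo
    FROM #SrDemo
)
UPDATE numbered
SET OldSrNo = NewSrNo;

-- Check the result: numbering restarts at 1 for each Class.
SELECT Name, Class, OldSrNo
FROM #SrDemo
ORDER BY Class, OldSrNo;

DROP TABLE #SrDemo;
```

Only the base column OldSrNo is assigned; the computed NewSrNo is read-only and serves as the source value.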

Search within ColA duplicates against specific unique vals in ColB to exclude all of ColA

I apologize in advance; I feel like I'm missing something really simple (and let's ignore the database structure, as I'm locked into it).
Take customer orders as an example: an order number can be shipped to more than one place. For simplicity I'm illustrating just three, but there could be more (home, office, gift, gift2, gift3, etc.).
So my table is:
Customer orders:
OrderID MailingID
--------------------
1 1
1 2
1 3
2 1
3 1
3 3
4 1
4 2
4 3
What I need to find is OrderIDs that have been shipped to MailingID 1 but not 2 (basically what I need to find is orderID 2 and 3 above).
If it matters, I'm using SQL Server Express 2012.
Thanks
Maybe this could help:
create table #temp(
orderID int,
mailingID int
)
insert into #temp
select 1, 1 union all
select 1, 2 union all
select 1, 3 union all
select 2, 1 union all
select 3, 1 union all
select 3, 3 union all
select 4, 1 union all
select 4, 2 union all
select 4, 3
-- find orderIDs that have been shipped to mailingID = 1
select
distinct orderID
from #temp
where mailingID = 1
except
-- find orderIDs that have been shipped to mailingID = 2
select
orderID
from #temp
where mailingID = 2
drop table #temp
A simple subquery with the NOT IN operator should work.
SELECT DISTINCT OrderID
FROM <tablename> a
WHERE a.mailingID = 1
AND a.orderid NOT IN (SELECT b.orderid
FROM <tablename> b
WHERE b.mailingID = 2)
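One caveat with NOT IN: if the subquery ever returns a NULL orderid, the NOT IN predicate is never true and the query returns no rows. A NOT EXISTS version avoids that. The sketch below assumes the question's data in a hypothetical table variable @Orders:

```sql
-- Hypothetical table variable holding the question's sample rows.
DECLARE @Orders TABLE (OrderID int, MailingID int);

INSERT INTO @Orders (OrderID, MailingID) VALUES
(1, 1), (1, 2), (1, 3),
(2, 1),
(3, 1), (3, 3),
(4, 1), (4, 2), (4, 3);

-- Orders shipped to MailingID 1 for which no MailingID 2 row exists.
SELECT DISTINCT o.OrderID
FROM @Orders AS o
WHERE o.MailingID = 1
  AND NOT EXISTS (SELECT 1
                  FROM @Orders AS x
                  WHERE x.OrderID = o.OrderID
                    AND x.MailingID = 2);
```

For the sample rows this should return OrderID 2 and 3, the result the question asks for.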

Tsql group by clause with exceptions

I have a problem with a query.
This is the data (order by Timestamp):
Data
ID Value Timestamp
1 0 2001-1-1
2 0 2002-1-1
3 1 2003-1-1
4 1 2004-1-1
5 0 2005-1-1
6 2 2006-1-1
7 2 2007-1-1
8 2 2008-1-1
I need to extract the distinct values and the date of the first occurrence of each. The exception is that rows should be grouped only if the run is not interrupted by a different value in that timeframe.
So the data I need is:
ID Value Timestamp
1 0 2001-1-1
3 1 2003-1-1
5 0 2005-1-1
6 2 2006-1-1
I've made this work with a complicated query, but I'm sure there is an easier way to do it; I just can't think of it. Could anyone help?
This is what I started with, and it could probably be built on. It is a query that should locate where a value changes:
SELECT * FROM Data d1 join Data d2 ON d1.Timestamp < d2.Timestamp and
d1.Value <> d2.Value
It could probably be done with good use of ROW_NUMBER, but I can't manage it.
Sample data:
declare @T table (ID int, Value int, Timestamp date)
insert into @T(ID, Value, Timestamp) values
(1, 0, '20010101'),
(2, 0, '20020101'),
(3, 1, '20030101'),
(4, 1, '20040101'),
(5, 0, '20050101'),
(6, 2, '20060101'),
(7, 2, '20070101'),
(8, 2, '20080101')
Query:
;With OrderedValues as (
select *,ROW_NUMBER() OVER (ORDER By TimeStamp) as rn --TODO - specific columns better than *
from @T
), Firsts as (
select
ov1.* --TODO - specific columns better than *
from
OrderedValues ov1
left join
OrderedValues ov2
on
ov1.Value = ov2.Value and
ov1.rn = ov2.rn + 1
where
ov2.ID is null
)
select * --TODO - specific columns better than *
from Firsts
I didn't rely on the ID values being sequential and without gaps. If they are, you can omit OrderedValues (using the table and ID in place of OrderedValues and rn). The Firsts CTE simply finds rows that have no immediately preceding row with the same Value.
Result:
ID Value Timestamp rn
----------- ----------- ---------- --------------------
1 0 2001-01-01 1
3 1 2003-01-01 3
5 0 2005-01-01 5
6 2 2006-01-01 6
You can order by rn if you need the results in this specific order.
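On SQL Server 2012 or later, the same "no immediately preceding row with the same Value" test can be written without the self join by using LAG. This is only a sketch of an alternative, assuming Value is never NULL; the alias PrevValue is made up for illustration:

```sql
-- Same sample data as above.
DECLARE @T TABLE (ID int, Value int, [Timestamp] date);

INSERT INTO @T (ID, Value, [Timestamp]) VALUES
(1, 0, '20010101'), (2, 0, '20020101'),
(3, 1, '20030101'), (4, 1, '20040101'),
(5, 0, '20050101'),
(6, 2, '20060101'), (7, 2, '20070101'), (8, 2, '20080101');

-- Keep a row when it is the first one overall (no previous value)
-- or when its Value differs from the previous row's Value.
SELECT ID, Value, [Timestamp]
FROM (
    SELECT ID, Value, [Timestamp],
           LAG(Value) OVER (ORDER BY [Timestamp]) AS PrevValue
    FROM @T
) AS s
WHERE PrevValue IS NULL
   OR PrevValue <> Value
ORDER BY [Timestamp];
```

For the sample rows this should return IDs 1, 3, 5 and 6, the same rows as the result above.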
