How to implement a ZIP JOIN in T-SQL? - sql-server

Let's say I have a table #Foo:
Id Color
-- ----
1 Red
2 Green
3 Blue
4 NULL
And table #Bar:
Value
-----
1
2.5
I would like to create a table Result using a simple statement to get:
Id Color Value
-- ---- -----
1 Red 1
2 Green 2.5
3 Blue NULL
4 NULL NULL
What I have come up with so far is:
WITH cte1
AS
(
SELECT [Id], [Color], ROW_NUMBER() OVER (ORDER BY [Id]) AS 'No'
FROM #Foo
),
cte2
AS
(
SELECT [Value], ROW_NUMBER() OVER (ORDER BY [Value]) AS 'No'
FROM #Bar
)
SELECT [Id], [Color], [Value]
FROM cte1 c1
FULL OUTER JOIN cte2 c2 ON c1.[No] = c2.[No]
Do you know a faster or more standard way to do a ZIP JOIN in T-SQL?

You can simply try this.
;WITH CTE AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 1)) AS Id, Value FROM #Bar
)
SELECT F.Id, F.Color, CTE.Value
FROM #Foo F
LEFT JOIN CTE ON CTE.Id = F.Id

You can get rid of the CTEs and make your query shorter by using subqueries like this:
select Id,Color,Value from
(
SELECT [Id], [Color], ROW_NUMBER() OVER (ORDER BY [Id]) AS 'No'
FROM #Foo
)x full outer join
(
SELECT [Value], ROW_NUMBER() OVER (ORDER BY [Value]) AS 'No'
FROM #Bar
)y
on x.No=y.No

Will this suffice? (Admittedly, I may be misinterpreting the question.)
SELECT
F.ID AS ID,
F.Color AS Color,
B.Value AS Value
FROM #Foo F
LEFT OUTER JOIN #Bar B ON F.ID = FLOOR(B.Value)
--this DOES seem to return the correct output, but I'm not sure that my logic
--is what you are after
SELECT
F.ID AS ID,
F.Color AS Color,
B.Value AS Value
FROM
(
VALUES
(1,'Red'),(2,'Green'),(3,'Blue'),(4, NULL)
) AS F(ID, Color)
LEFT OUTER JOIN
(
VALUES
(1), (2.5)
) AS B(Value)
ON F.ID = FLOOR(B.Value)
Or are you wanting to essentially:
Sort #Foo by ID
Sort #Bar by Value
Match:
"First" row from #Foo with "First" row from #Bar
"Second" row from #Foo with "Second" row from #Bar
etc...
(Sorry, but I am not familiar with what a "ZIP JOIN" is.
I will look at the link provided by @RszardDzegan, though.)
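If the second reading is the intended one, the three steps listed above can be sketched roughly as follows. This is only a sketch under assumptions: the table variables @Foo and @Bar, their column types, and the CTE names NumberedFoo and NumberedBar are illustrative stand-ins for the question's #Foo and #Bar, and a FULL OUTER JOIN is used so that rows survive whichever side is longer.

```sql
-- Assumed stand-ins for the question's #Foo and #Bar (names/types are illustrative).
DECLARE @Foo TABLE (Id INT, Color VARCHAR(10));
DECLARE @Bar TABLE (Value DECIMAL(2, 1));

INSERT INTO @Foo (Id, Color)
VALUES (1, 'Red'), (2, 'Green'), (3, 'Blue'), (4, NULL);
INSERT INTO @Bar (Value)
VALUES (1), (2.5);

WITH NumberedFoo AS
(
    -- Step 1: sort @Foo by Id and number the rows.
    SELECT Id, Color, ROW_NUMBER() OVER (ORDER BY Id) AS RowNo
    FROM @Foo
),
NumberedBar AS
(
    -- Step 2: sort @Bar by Value and number the rows.
    SELECT Value, ROW_NUMBER() OVER (ORDER BY Value) AS RowNo
    FROM @Bar
)
-- Step 3: pair rows that share a position; the shorter side is padded with NULLs.
SELECT f.Id, f.Color, b.Value
FROM NumberedFoo AS f
FULL OUTER JOIN NumberedBar AS b ON f.RowNo = b.RowNo
ORDER BY COALESCE(f.RowNo, b.RowNo);
```

With the sample data this should pair Red with 1.0 and Green with 2.5, and leave Value as NULL for Ids 3 and 4.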

you could try something like this:
DECLARE @Foo TABLE (Id INT, Color VARCHAR(10));
DECLARE @Bar TABLE (Value DECIMAL(2, 1))
INSERT INTO @Foo (Id, Color)
VALUES (1, 'Red'), (2, 'Green'), (3, 'Blue'), (4, NULL)
INSERT INTO @Bar (Value)
VALUES (1), (2.5);
WITH ECROSS
AS (
SELECT F.Id, F.Color, B.Value, DENSE_RANK() OVER (
ORDER BY F.Id
) AS No1, DENSE_RANK() OVER (
ORDER BY B.Value
) AS No2
FROM @Foo F, @Bar B
)
SELECT A.id, A.Color, B.Value
FROM ECROSS A
LEFT JOIN ECROSS B ON A.No1 = B.No2
AND A.No1 = B.No1
GROUP BY A.id, A.Color, B.Value

DECLARE @Foo TABLE (pk_id int identity(1,1), Id INT, Color VARCHAR(10));
DECLARE @Bar TABLE (pk_id int identity(1,1), Value DECIMAL(2, 1))
INSERT INTO @Foo (Id, Color)
VALUES (1, 'Red'), (2, 'Green'), (3, 'Blue'), (4, NULL)
INSERT INTO @Bar (Value)
VALUES (1), (2.5);
SELECT F.id, F.Color, B.Value
FROM @Foo F
LEFT JOIN @Bar B ON F.pk_id = B.pk_id

Try the following code. You just need to provide both data types in the same structure with a row number per group. With that you can use the PIVOT operator to produce the expected result.
WITH
CTE_FOO AS
(
SELECT
[Group]
,[Spread]
,[Aggregate]
FROM
(VALUES
(1, 1, N'Red' )
,(2, 1, N'Green')
,(3, 1, N'Blue' )
,(4, 1, NULL )
) AS FOO([Group], [Spread], [Aggregate])
),
CTE_BAR AS
(
SELECT
[Group]
,[Spread]
,CAST([Aggregate] AS nvarchar(max)) AS [Aggregate]
FROM
(VALUES
(1, 2, 1 )
,(2, 2, 2.5 )
) AS BAR([Group], [Spread], [Aggregate])
),
CTE_FOOBAR AS
(
SELECT [Group], [Spread], [Aggregate] FROM CTE_FOO
UNION ALL
SELECT [Group], [Spread], [Aggregate] FROM CTE_BAR
)
SELECT
[Group] AS [ID]
,[1] AS [Color]
,[2] AS [Value]
FROM
CTE_FOOBAR
PIVOT
(
MAX([Aggregate]) FOR [Spread] IN ([1], [2])
) AS PivotTable

You can skip creating new row numbers for #Foo, since its row numbers in this case are given.
Then the solution will become
SELECT F.Id,F.Color,newBar.Value from #Foo as F
LEFT JOIN
(
SELECT [Value], ROW_NUMBER() OVER (ORDER BY [Value]) AS 'No'
FROM #Bar
) newBar
on F.Id=newBar.No
This solution has been tested and proven. It gives you all values of #Foo and for each a sorted value of #Bar if there is one.

Related

SQL Server: How do I get the highest value not set of an int column?

Let's take an example. These are the rows of the table I want to get the data from:
The column I'm talking about is the reference one. The user can set this value on the web form, but the system I'm developing must suggest the lowest reference value still not used.
As you can see, the smallest value of this column is 35. I could just take the smallest reference and add 1, but in that case the value 36 is already used. So the value I want is 37.
Is there a way to do this without checking values in a loop? This table will grow very large.
This works on SQL Server 2012 and later (it uses LEAD):
DECLARE @Tbl TABLE (id int, reference int)
INSERT INTO @Tbl
( id, reference )
VALUES
(1, 49),
(2, 125),
(3, 35),
(4, 1345),
(5, 36),
(6, 37)
SELECT
MIN(A.reference) + 1 Result
FROM
(
SELECT
*,
LEAD(reference) OVER (ORDER BY reference) Tmp
FROM
@Tbl
) A
WHERE
A.reference - A.Tmp != -1
Result: 38 (with the sample rows above, 35, 36 and 37 are already used)
Here is yet another place where the tally table is going to prove invaluable. In fact it is so useful I keep a view on my system that looks like this.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a cross join E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a cross join E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
Next of course we need some sample data and table to hold it.
create table #Something
(
id int identity
, reference int
, description varchar(10)
)
insert #Something (reference, description)
values (49, 'data1')
, (125, 'data2')
, (35, 'data3')
, (1345, 'data4')
, (36, 'data5')
, (7784, 'data6')
Now comes the magic of the tally table.
select top 1 t.N
from cteTally t
left join #Something s on t.N = s.reference
where t.N >= (select MIN(reference) from #Something)
and s.id is null
order by t.N
This is ugly, but should get the job done:
select
top 1 reference+1
from
[table]
where
reference+1 not in (select reference from [table])
order by reference
I used a common table expression to get the next value. First I left outer joined the table to itself, shifting the key in the join by 1. I then kept only the rows that had no corresponding match (b.ID is null). The minimum a.Reference + 1 among those rows is the answer we are looking for.
create table MyTable
(
ID int identity,
Reference int,
Description varchar(20)
)
insert into MyTable values (10,'Data')
insert into MyTable values (11,'Data')
insert into MyTable values (12,'Data')
insert into MyTable values (15,'Data')
-- Find gap
;with Gaps as
(
select a.Reference+1 as 'GapID'
from MyTable a
left join MyTable b on a.Reference = b.Reference-1
where b.ID is null
)
select min(GapID) as 'NewReference'
from Gaps
NewReference
------------
13
I hope the code was clearer than my description.
CREATE TABLE #T(ID INT , REFERENCE INT, [DESCRIPTION] VARCHAR(50))
INSERT INTO #T
SELECT 1,49 , 'data1' UNION ALL
SELECT 2,125 , 'data2' UNION ALL
SELECT 3,35 , 'data3' UNION ALL
SELECT 4,1345, 'data4' UNION ALL
SELECT 5,36 , 'data5' UNION ALL
SELECT 6,7784, 'data6'
SELECT TOP 1 REFERENCE + 1
FROM #T T1
WHERE
NOT EXISTS
(
SELECT 1 FROM #T T2 WHERE T2.REFERENCE = T1.REFERENCE + 1
)
ORDER BY T1.REFERENCE
--- OR
SELECT MIN(REFERENCE) + 1
FROM #T T1
WHERE
NOT EXISTS
(
SELECT 1 FROM #T T2 WHERE T2.REFERENCE = T1.REFERENCE + 1
)
How about using a tally table? It would be better to use a persisted numbers table rather than the recursive CTE, but the code below illustrates the concept.
For further reading as to why you should use a persisted table, check out the following link: sql-auxiliary-table-of-numbers
DECLARE @START int = 1, @END int = 1000
CREATE TABLE #TEST(UsedValues INT)
INSERT INTO #TEST(UsedValues) VALUES
(1),(3),(5),(7),(9),(11),(13),(15),(17)
;With NumberSequence( Number ) as
(
Select @START as Number
union all
Select Number + 1
from NumberSequence
where Number < @END
)
SELECT MIN(Number)
FROM NumberSequence n
LEFT JOIN #TEST t
ON n.Number = t.UsedValues
WHERE UsedValues IS NULL
OPTION ( MAXRECURSION 1000 )
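The persisted numbers table recommended above could look something like the sketch below. The names dbo.Numbers, N and @Used, and the 10,000-row size, are assumptions chosen for illustration; @Used simply repeats the sample values so the batch runs on its own.

```sql
-- Hypothetical persisted numbers table; the name and size are assumptions.
CREATE TABLE dbo.Numbers (N INT NOT NULL PRIMARY KEY);

-- Fill it once with 1..10000 by cross joining a ten-row digit set four times.
WITH Digits(D) AS
(
    SELECT D FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(D)
)
INSERT INTO dbo.Numbers (N)
SELECT 1 + ones.D + 10 * tens.D + 100 * hundreds.D + 1000 * thousands.D
FROM Digits AS ones
CROSS JOIN Digits AS tens
CROSS JOIN Digits AS hundreds
CROSS JOIN Digits AS thousands;

-- Sample "used" values, repeated here so the sketch is self-contained.
DECLARE @Used TABLE (UsedValues INT);
INSERT INTO @Used (UsedValues)
VALUES (1),(3),(5),(7),(9),(11),(13),(15),(17);

-- Lowest number with no matching used value (2 for this sample).
SELECT MIN(n.N) AS FirstUnused
FROM dbo.Numbers AS n
LEFT JOIN @Used AS u ON n.N = u.UsedValues
WHERE u.UsedValues IS NULL;
```

Because the table is filled once and indexed by its primary key, the lookup needs neither recursion nor a MAXRECURSION hint.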
You could try sorting in ascending order:
SELECT DISTINCT reference
FROM [Resultsados]
ORDER BY [reference] ASC;
As far as I know, there is no way to do this without a loop. To prevent duplicate values from being returned, be sure to use DISTINCT.

update column of all rows of table

I have 1 table with 500 rows, and another table with 750 rows or so. What I'm doing is, I'm getting a random 500 rows of a certain column from the second table, and I want to update a newly added column on the first table with those 500 values.
I know how to do updates that look like this:
UPDATE schema.table1
SET column = cl.column FROM schema.table1 cl
INNER JOIN table2 cf ON cf.column = cl.column
but I don't have any columns that are matching in both tables. Is there a way to do this without having to match the columns on the inner join?
So basically, I want to update 500 rows of one column in one table with 500 values coming from another table.
You can do it by using ROW_NUMBER to generate a column on which to join the two tables. Take a look at the example and its output:
DECLARE @T1 TABLE ( column1 INT ,column2 VARCHAR(2) )
DECLARE @T2 TABLE ( column1 VARCHAR(2) )
INSERT INTO @T1 ( column1, column2 )
VALUES ( 0, 'A' ), ( 1, 'B' ), ( 2, 'C' )
INSERT INTO @T2 ( column1 )
VALUES ( 'D'),( 'F'),( 'G' )
SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT NULL) ) AS RN FROM @T1
SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT NULL) ) AS RN FROM @T2
;WITH CTE_1 AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT NULL) ) AS RN FROM @T1)
,cte_2 AS (SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT NULL) ) AS RN FROM @T2)
UPDATE t1
SET t1.column2 = t2.column1
FROM CTE_1 t1
JOIN cte_2 t2
ON t1.rn = t2.rn
SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT NULL) ) AS RN FROM @T1
SELECT *, ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY (SELECT NULL) ) AS RN FROM @T2
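The example above numbers both sides in an arbitrary order. Since the question asks for random values from the second table, one possible variation is to number the source rows by NEWID() so the pairing is shuffled on each run. This is a sketch under assumptions: @Target, @Source and their columns are made-up names, and the source deliberately has more rows than the target.

```sql
-- Assumed sample tables; names and columns are illustrative only.
DECLARE @Target TABLE (Id INT, NewCol VARCHAR(2));
DECLARE @Source TABLE (Val VARCHAR(2));

INSERT INTO @Target (Id) VALUES (1), (2), (3);
INSERT INTO @Source (Val) VALUES ('D'), ('F'), ('G'), ('H'), ('J');

WITH NumberedTarget AS
(
    -- Stable numbering of the rows to be updated.
    SELECT NewCol, ROW_NUMBER() OVER (ORDER BY Id) AS RN
    FROM @Target
),
ShuffledSource AS
(
    -- Random numbering: NEWID() yields a fresh ordering per execution.
    SELECT Val, ROW_NUMBER() OVER (ORDER BY NEWID()) AS RN
    FROM @Source
)
-- Each target row takes the source value that landed on the same position.
UPDATE t
SET t.NewCol = s.Val
FROM NumberedTarget AS t
JOIN ShuffledSource AS s ON t.RN = s.RN;

SELECT Id, NewCol FROM @Target ORDER BY Id;
```

Only as many source rows as there are target rows get used, because the inner join stops matching once the target's row numbers run out.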

Combine two tables in SQL Server

I have two tables with the same number of rows.
Example:
table a:
1,A
2,B
3,C
table b:
AA,BB
AAA,BBB
AAAA,BBBB
I want a new table built like this in SQL Server:
1,A,AA,BB
2,B,AAA,BBB
3,C,AAAA,BBBB
How do I do that?
In SQL Server 2005 (or newer), you can use something like this:
-- test data setup
DECLARE @tablea TABLE (ID INT, Val CHAR(1))
INSERT INTO @tablea VALUES(1, 'A'), (2, 'B'), (3, 'C')
DECLARE @tableb TABLE (Val1 VARCHAR(10), Val2 VARCHAR(10))
INSERT INTO @tableb VALUES('AA', 'BB'),('AAA', 'BBB'), ('AAAA', 'BBBB')
-- define CTE for table A - sort by "ID" (I just assumed this - adapt if needed)
;WITH DataFromTableA AS
(
SELECT ID, Val, ROW_NUMBER() OVER(ORDER BY ID) AS RN
FROM @tablea
),
-- define CTE for table B - sort by "Val1" (I just assumed this - adapt if needed)
DataFromTableB AS
(
SELECT Val1, Val2, ROW_NUMBER() OVER(ORDER BY Val1) AS RN
FROM @tableb
)
-- create an INNER JOIN between the two CTE which just basically selected the data
-- from both tables and added a new column "RN" which gets a consecutive number for each row
SELECT
a.ID, a.Val, b.Val1, b.Val2
FROM
DataFromTableA a
INNER JOIN
DataFromTableB b ON a.RN = b.RN
This gives you the requested output:
You could do a rank over the primary keys, then join on that rank:
SELECT T1.*, T2.*
FROM
(
-- RANK needs an ORDER BY inside OVER; ranking by the primary key gives 1..n
SELECT RANK() OVER (ORDER BY table1.primaryKey) [rank], table1.* FROM table1
) AS T1
JOIN
(
SELECT RANK() OVER (ORDER BY table2.primaryKey) [rank], table2.* FROM table2
) AS T2 ON T1.[rank] = T2.[rank]
Your query is strange, but in Oracle you can do this:
select a.*, tb.*
from a
, ( select rownum rn, b.* from b ) tb -- temporary b - added rn column
where a.c1 = tb.rn -- assuming first column in a is called c1
If there is no numeric column in a, you can use the same trick twice:
select ta.*, tb.*
from ( select rownum rn, a.* from a ) ta
, ( select rownum rn, b.* from b ) tb
where ta.rn = tb.rn
Note: be aware that this can produce an arbitrary pairing, for example
1 A AA BB
2 C A B
3 B AAA BBB
because there is no order by in ta and tb

Can this be done with something like a JOIN?

My question is: I have two tables: table A has two columns (KeyA and Match) and table B has two columns (KeyB and Match). I want to compare with the "Match" column.
If table A has 3 rows with a particular "Match", and table B has 2 rows, a JOIN will return me all the combinations (6 in this case). What I want it to do is match up as many as it can, and then NULL out the others.
So, it would match the first "KeyA" with the first "KeyB", the second "KeyA" with the second "KeyB", and then match up the third "KeyA" with NULL, since table B only has two rows for this "Match". The order is actually irrelevant, just as long as 2 rows match up, and then one value from table A returns with a NULL for the table B value. This is not like an INNER or an OUTER JOIN.
I hope this makes sense, it was difficult to express clearly, and was hard to find keywords to search on.
EDIT:
An INNER/OUTER join would match all the table A values with all of the table B values it could. Once a B value is "used up" I do not want it to match it with any other A values.
Example:
Table A (KeyA, Match)
(1, "a")
(2, "a")
(3, "a")
Table B (KeyB, Match)
(11, "a")
(12, "a")
Desired output (KeyA, Match, KeyB):
(1, "a", 11)
(2, "a", 12)
(3, "a", NULL)
You can use partition by to number the rows for each value of match. Then you can use full outer join to fill up rows per Match. For example:
declare @A table (KeyA int, match int)
insert @A values (1,1), (2,1), (3,1), (4,2), (5,2), (6,2)
declare @B table (KeyB int, match int)
insert @B values (1,1), (2,1), (3,2)
select *
from (
select row_number() over (partition by match order by KeyA) as rn
, *
from @A
) as A
full outer join
(
select row_number() over (partition by match order by KeyB) as rn
, *
from @B
) as B
on A.match = B.match
and A.rn = B.rn
Working code at SE Data.
declare @TableA table(ID int, Name varchar(10))
declare @TableB table(ID int, Name varchar(10))
insert into @TableA values(1, 'a'), (1, 'b'), (1, 'c')
insert into @TableB values (1, 'A'), (1, 'B')
insert into @TableA values(2, 'a'), (2, 'b')
insert into @TableB values (2, 'A'), (2, 'B'), (2, 'C')
;with A as
(
select *,
row_number() over(partition by ID order by Name) as rn
from @TableA
),
B as
(
select *,
row_number() over(partition by ID order by Name) as rn
from @TableB
)
select A.ID as AID,
A.Name as AName,
B.ID as BID,
B.Name as BName
from A
full outer join B
on A.ID = B.ID and
A.rn = B.rn
Result:
AID AName BID BName
----------- ---------- ----------- ----------
1 a 1 A
1 b 1 B
1 c NULL NULL
2 a 2 A
2 b 2 B
NULL NULL 2 C
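SELECT
ar.Match
, COALESCE(ar.RowN, br.RowN) AS RowNumber
, ar.KeyA
, br.KeyB
FROM
( SELECT KeyA
, Match
, ROW_NUMBER() OVER(PARTITION BY Match ORDER BY KeyA) AS RowN
FROM TableA --- assumed name for table A from the question
) AS ar
LEFT JOIN --- or FULL JOIN
( SELECT KeyB
, Match
, ROW_NUMBER() OVER(PARTITION BY Match ORDER BY KeyB) AS RowN
FROM TableB --- assumed name for table B from the question
) AS br
ON br.Match = ar.Match
AND br.RowN = ar.RowN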
I think what you are looking for is called a Cross Join, or Cartesian Product.
http://www.sqlguides.com/sql_cross_join.php
edit - Hm now actually I'm not so sure.
As far as I can understand, what you are looking for is a FULL JOIN, also called a FULL OUTER JOIN.
Check out this link. It has good explanation of all types of joins:
http://www.w3schools.com/sql/sql_join.asp

SQL Server: How to use UNION with two queries that BOTH have a WHERE clause?

Given:
Two queries that require filtering:
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate desc
And:
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
Separately, these return the IDs I'm looking for: (13, 11 and 12, 6)
Basically, I want the two most recent records for two specific types of data.
I want to union these two queries together like so:
select top 2 t1.ID, t2.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by ReceivedDate desc
union
select top 2 t2.ID
from Table t2
where t2.Type = 'TYPE_2'
order by ReceivedDate desc
Problem:
The problem is that this query is invalid because the first select cannot have an order by clause if it is being unioned. And it cannot have top 2 without having order by.
How can I fix this situation?
You should be able to alias them and use as subqueries (part of the reason your first effort was invalid was because the first select had two columns (ID and ReceivedDate) but your second only had one (ID) - also, Type is a reserved word in SQL Server, and can't be used as you had it as a column name):
declare @Tbl1 table(ID int, ReceivedDate datetime, ItemType Varchar(10))
declare @Tbl2 table(ID int, ReceivedDate datetime, ItemType Varchar(10))
insert into @Tbl1 values(1, '20010101', 'Type_1')
insert into @Tbl1 values(2, '20010102', 'Type_1')
insert into @Tbl1 values(3, '20010103', 'Type_3')
insert into @Tbl2 values(10, '20010101', 'Type_2')
insert into @Tbl2 values(20, '20010102', 'Type_3')
insert into @Tbl2 values(30, '20010103', 'Type_2')
SELECT a.ID, a.ReceivedDate FROM
(select top 2 t1.ID, t1.ReceivedDate
from @Tbl1 t1
where t1.ItemType = 'TYPE_1'
order by ReceivedDate desc
) a
union
SELECT b.ID, b.ReceivedDate FROM
(select top 2 t2.ID, t2.ReceivedDate
from @Tbl2 t2
where t2.ItemType = 'TYPE_2'
order by t2.ReceivedDate desc
) b
select * from
(
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate desc
) t1
union
select * from
(
select top 2 t2.ID, t2.ReceivedDate
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
) t2
or using CTE (SQL Server 2005+)
;with One as
(
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
order by t1.ReceivedDate desc
)
,Two as
(
select top 2 t2.ID, t2.ReceivedDate
from Table t2
where t2.Type = 'TYPE_2'
order by t2.ReceivedDate desc
)
select * from One
union
select * from Two
declare @T1 table(ID int, ReceivedDate datetime, [type] varchar(10))
declare @T2 table(ID int, ReceivedDate datetime, [type] varchar(10))
insert into @T1 values(1, '20010101', '1')
insert into @T1 values(2, '20010102', '1')
insert into @T1 values(3, '20010103', '1')
insert into @T2 values(10, '20010101', '2')
insert into @T2 values(20, '20010102', '2')
insert into @T2 values(30, '20010103', '2')
;with cte1 as
(
select *,
row_number() over(order by ReceivedDate desc) as rn
from @T1
where [type] = '1'
),
cte2 as
(
select *,
row_number() over(order by ReceivedDate desc) as rn
from @T2
where [type] = '2'
)
select *
from cte1
where rn <= 2
union all
select *
from cte2
where rn <= 2
The basic premise of the question and the answers is wrong. Every SELECT in a UNION can have a WHERE clause. It's the ORDER BY in the first query that's giving you the error.
The answer is misleading because it attempts to fix a problem that is not a problem. You actually CAN have a WHERE CLAUSE in each segment of a UNION. You cannot have an ORDER BY except in the last segment. Therefore, this should work...
select top 2 t1.ID, t1.ReceivedDate
from Table t1
where t1.Type = 'TYPE_1'
-----remove this-- order by ReceivedDate desc
union
select top 2 t2.ID, t2.ReceivedDate --- add second column
from Table t2
where t2.Type = 'TYPE_2'
order by ReceivedDate desc
Create views for the first two SELECTs and UNION them.
Notice that each SELECT statement within the UNION must have the same number of columns. The columns must also have similar data types. Also, the columns in each SELECT statement must be in the same order.
you are selecting
t1.ID, t2.ReceivedDate
from Table t1
union
t2.ID
from Table t2
which is incorrect.
so you have to write
t1.ID, t1.ReceivedDate from Table t1
union
t2.ID, t2.ReceivedDate from Table t2
you can use sub query here
SELECT tbl1.ID, tbl1.ReceivedDate FROM
(select top 2 t1.ID, t1.ReceivedDate
from tbl1 t1
where t1.ItemType = 'TYPE_1'
order by ReceivedDate desc
) tbl1
union
SELECT tbl2.ID, tbl2.ReceivedDate FROM
(select top 2 t2.ID, t2.ReceivedDate
from tbl2 t2
where t2.ItemType = 'TYPE_2'
order by t2.ReceivedDate desc
) tbl2
Because this uses UNION, it returns only distinct rows from the two tables by default.
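To make the point about distinct rows concrete, here is a small sketch comparing UNION with UNION ALL. The inline VALUES lists and the column name v are made up for illustration and have nothing to do with the question's tables.

```sql
-- UNION removes duplicate rows from the combined result.
SELECT v FROM (VALUES (1), (2)) AS a(v)
UNION
SELECT v FROM (VALUES (2), (3)) AS b(v);   -- three rows: 1, 2, 3

-- UNION ALL keeps every row from both inputs, duplicates included.
SELECT v FROM (VALUES (1), (2)) AS a(v)
UNION ALL
SELECT v FROM (VALUES (2), (3)) AS b(v);   -- four rows: 1, 2, 2, 3
```

Neither form guarantees row order without a final ORDER BY. If the two TOP 2 subqueries can never return the same row, UNION ALL gives the same rows and avoids the duplicate-removal step.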
select top 2 t1.ID, t1.ReceivedDate, 1 SortBy
from Table t1
where t1.Type = 'TYPE_1'
union
select top 2 t2.ID, t2.ReceivedDate, 2 SortBy
from Table t2
where t2.Type = 'TYPE_2'
order by 3,2

Resources