I am attempting to randomly join the rows of two tables (TableA and TableB) such that each row in TableA is joined to only one row in TableB and every row in TableB is joined to at least one row in TableA.
For example, a random join of TableA with 5 distinct rows and TableB with 3 distinct rows should result in something like this:
TableA TableB
1 3
2 1
3 1
4 2
5 1
However, sometimes not all the rows from TableB are included in the final result. In the example above, row 2 from TableB might be missing because row 1 or row 3 is joined to row 4 of TableA in its place. You can see this occur by executing the script a number of times and checking the result. For some reason it seems necessary to use an interim table (Q in the script below) to ensure that a correct result is returned, one that has all rows from both TableA and TableB.
Can someone please explain why this is happening?
Also, can someone please advise on what would be a better way to get the desired result?
I understand that sometimes no result is returned, due to a failure of some kind in the cross apply and ordering that I have yet to identify, which reinforces my belief that there is a better way to perform this operation. I hope that makes sense. Thanks in advance!
declare @TableA table (
    ID int
);
declare @TableB table (
    ID int
);
declare @Q table (
    RN int,
    TableAID int,
    TableBID int
);
with cte as (
    select
        1 as ID
    union all
    select
        ID + 1
    from cte
    where ID < 5
)
insert @TableA (ID)
select ID from cte;
with cte as (
    select
        1 as ID
    union all
    select
        ID + 1
    from cte
    where ID < 3
)
insert @TableB (ID)
select ID from cte;
select * from @TableA;
select * from @TableB;
with cte as (
    select
        row_number() over (partition by TableAID order by newid()) as RN,
        TableAID,
        TableBID
    from (
        select
            a.ID as TableAID,
            b.ID as TableBID
        from @TableA as a
        cross apply @TableB as b
    ) as M
)
select --All rows from TableB not always included
    TableAID,
    TableBID
from cte
where RN in (
    select
        top 1
        iCTE.RN
    from cte as iCTE
    group by iCTE.RN
    having count(distinct iCTE.TableBID) = (
        select count(1) from @TableB
    )
)
order by TableAID;
with cte as (
    select
        row_number() over (partition by TableAID order by newid()) as RN,
        TableAID,
        TableBID
    from (
        select
            a.ID as TableAID,
            b.ID as TableBID
        from @TableA as a
        cross apply @TableB as b
    ) as M
)
insert @Q
select
    RN,
    TableAID,
    TableBID
from cte;
select * from @Q;
select --All rows from both TableA and TableB included
    TableAID,
    TableBID
from @Q
where RN in (
    select
        top 1
        iQ.RN
    from @Q as iQ
    group by iQ.RN
    having count(distinct iQ.TableBID) = (
        select count(1) from @TableB
    )
)
order by TableAID;
See if this gives you what you're looking for...
DECLARE
    @CountA INT = (SELECT COUNT(*) FROM @TableA ta),
    @CountB INT = (SELECT COUNT(*) FROM @TableB tb),
    @MinCount INT;
SELECT @MinCount = CASE WHEN @CountA < @CountB THEN @CountA ELSE @CountB END;
WITH
    cte_A1 AS (
        -- number the rows of TableA in random order
        SELECT
            *,
            rn = ROW_NUMBER() OVER (ORDER BY NEWID())
        FROM
            @TableA ta
    ),
    cte_B1 AS (
        -- number the rows of TableB in random order
        SELECT
            *,
            rn = ROW_NUMBER() OVER (ORDER BY NEWID())
        FROM
            @TableB tb
    ),
    cte_A2 AS (
        -- wrap the row numbers into 1..@MinCount; the modulo also covers the case
        -- where one table has more than twice as many rows as the other
        SELECT
            a1.ID,
            rn = (a1.rn - 1) % NULLIF(@MinCount, 0) + 1
        FROM
            cte_A1 a1
    ),
    cte_B2 AS (
        SELECT
            b1.ID,
            rn = (b1.rn - 1) % NULLIF(@MinCount, 0) + 1
        FROM
            cte_B1 b1
    )
SELECT
    A = a.ID,
    B = b.ID
FROM
    cte_A2 a
    JOIN cte_B2 b
        ON a.rn = b.rn;
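The idea behind the query above (shuffle both tables, number the rows, and wrap the longer numbering around the shorter one) can be sketched outside SQL as well. The following Python is only an illustration under my own assumptions: `random_join` and its parameters are hypothetical names, and the wrap is done with a modulo so that it covers any size ratio between the two lists.

```python
import random


def random_join(a_ids, b_ids, rng=None):
    """Randomly pair two ID lists so that every ID of both lists is used.

    Each ID of the longer list appears exactly once; IDs of the shorter
    list are reused by wrapping around (position modulo its length).
    """
    rng = rng or random.Random()
    a = list(a_ids)
    b = list(b_ids)
    if not a or not b:
        return []  # nothing to pair, like an inner join on an empty table
    rng.shuffle(a)
    rng.shuffle(b)
    if len(a) >= len(b):
        pairs = [(a_id, b[i % len(b)]) for i, a_id in enumerate(a)]
    else:
        pairs = [(a[j % len(a)], b_id) for j, b_id in enumerate(b)]
    return sorted(pairs)


if __name__ == "__main__":
    for a_id, b_id in random_join(range(1, 6), range(1, 4)):
        print(a_id, b_id)
```

Because both lists are shuffled once and then only read, every run gives a complete pairing, which is the property the interim table provides in the original script.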
I want to select all records for customers whose first order is from 2015. I want any orders they placed after 2015 too, but I DON'T want the records for customers whose first order was in 2016. I am ultimately trying to find the percentage of people who order more than twice, but I want to exclude the customers who were new in 2016.
This doesn't work because 'mindate' is an invalid column name but I'm not sure why or how else to try it.
Select
od.CustomerID, OrderID, OrderDSC, OrderDTS
From
OrderDetail OD
Join
(Select
OrderID, Min(orderdts) as mindate
From
OrderDetail
Where
mindate Between '2015-1-1' and '2015-12-31'
Group By Orderid) b on od.OrderID = b.OrderID
This is because of the logical order in which the engine evaluates a query: the WHERE clause is processed before the SELECT list, so the alias mindate does not exist yet when WHERE runs.
You can replace mindate with orderdts:
select OrderID, min(orderdts) as mindate
from OrderDetail
where orderdts between '2015-1-1' and '2015-12-31'
group by Orderid
A second option is to use a HAVING clause, which is evaluated after GROUP BY.
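To make the "filter after grouping" idea concrete, here is a small Python sketch of the logic the question seems to be after. It is not the query above: I am assuming the grouping should be per CustomerID (the question's subquery groups by OrderID), and the sample rows and the function name `orders_for_2015_cohort` are made up for illustration.

```python
from datetime import date


def orders_for_2015_cohort(orders):
    """Return all orders of customers whose FIRST order was placed in 2015.

    orders: list of (customer_id, order_id, order_date) tuples.
    """
    # Step 1: the GROUP BY / MIN part, one first-order date per customer.
    first_order = {}
    for customer_id, _order_id, order_date in orders:
        if customer_id not in first_order or order_date < first_order[customer_id]:
            first_order[customer_id] = order_date

    # Step 2: the HAVING part, filtering on the aggregated value.
    cohort = {
        customer_id
        for customer_id, first in first_order.items()
        if date(2015, 1, 1) <= first < date(2016, 1, 1)
    }

    # Step 3: join back to get every order of those customers.
    return [row for row in orders if row[0] in cohort]


sample_orders = [
    (1, 101, date(2015, 3, 1)),   # customer 1: first order in 2015
    (1, 102, date(2016, 2, 1)),   # later order, should be kept
    (2, 103, date(2016, 5, 1)),   # customer 2: new in 2016, excluded
    (3, 104, date(2014, 7, 1)),   # customer 3: first order in 2014, excluded
    (3, 105, date(2015, 8, 1)),
]

if __name__ == "__main__":
    for row in orders_for_2015_cohort(sample_orders):
        print(row)
```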
What I did was select the distinct CustomerIDs that fall within your date range and left join them to the table, so anyone who doesn't fall within your date range is filtered out.
SELECT * FROM
(Select DISTINCT(CustomerID) as CustomerID
FROM OrderDetail WHERE OrderDTS between '2015-1-1' AND '2015-12-31') oIDs
LEFT JOIN
OrderDetail OD
ON oIDs.CustomerID = OD.CustomerID
Try using the EXISTS clause. It is basically a sub-query. Below is an example you should be able to adapt.
create table Test (Id int, aDate datetime)
insert Test values (1,'04/04/2014')
insert Test values (1,'05/05/2015')
insert Test values (1,'06/06/2016')
insert Test values (2,'04/30/2016')
insert Test values (3,'02/27/2014')
select t.* from Test t
where
aDate>='01/01/2015'
and exists(select * from Test x where x.Id=t.Id and x.aDate >='01/01/2015' and x.aDate<'01/01/2016')
I don't know the orderdts data type, but if it is datetime, orders on 2015-12-31 will not be included (unless the order date is exactly 2015-12-31 00:00:00.000). Note how this will skip the first record:
DECLARE @orders TABLE (CustomerID INT, orderDate DATETIME);
INSERT @orders VALUES (1, '2015-12-31 00:00:01.000'), (1, '2015-12-30'), (2, '2015-01-04');
SELECT * FROM @orders WHERE orderDate BETWEEN '2015-01-01' AND '2015-12-31';
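In this case you would want the WHERE clause filter to look like this (for a datetime column the upper bound must end in .997, because 23:59:59.999 is rounded up to 2016-01-01 00:00:00.000):
WHERE orderDate BETWEEN '2015-01-01 00:00:00.000' AND '2015-12-31 23:59:59.997';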
Or
WHERE CAST(orderDate AS date) BETWEEN '2015-01-01' AND '2015-12-31';
(the first example will almost certainly perform better).
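A common way to sidestep the end-of-day problem altogether is a half-open range: greater than or equal to the first day of the year and strictly less than the first day of the next year. The Python below is only a sketch of that comparison with made-up sample rows (the fourth row is my addition to show the upper bound being excluded); it is not taken from the answer above.

```python
from datetime import datetime

# (customer_id, order_date); the first three rows mirror the sample above.
orders = [
    (1, datetime(2015, 12, 31, 0, 0, 1)),
    (1, datetime(2015, 12, 30)),
    (2, datetime(2015, 1, 4)),
    (3, datetime(2016, 1, 1)),  # hypothetical extra row, must be excluded
]

start = datetime(2015, 1, 1)
end_inclusive = datetime(2015, 12, 31)   # midnight at the START of Dec 31
end_exclusive = datetime(2016, 1, 1)

# Equivalent of BETWEEN '2015-01-01' AND '2015-12-31': loses most of Dec 31.
closed_range = [o for o in orders if start <= o[1] <= end_inclusive]

# Half-open range: keeps all of Dec 31 and nothing from 2016.
half_open_range = [o for o in orders if start <= o[1] < end_exclusive]

if __name__ == "__main__":
    print(len(closed_range), len(half_open_range))
```

In T-SQL terms this corresponds to `orderDate >= '2015-01-01' AND orderDate < '2016-01-01'`, which does not depend on the precision of the column's data type.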
Now, using this sample data:
-- Sample data
CREATE TABLE #LIST (LISTName varchar(10) NOT NULL);
INSERT #LIST
SELECT TOP (100) LEFT(newid(), 8)
FROM sys.all_columns a, sys.all_columns b;
-- You will probably want LISTName to be indexed
CREATE NONCLUSTERED INDEX nc_LISTName ON #LIST(LISTName);
You can implement Paul's solution like this:
DECLARE @LIST_Param varchar(8) = 'No List';
SELECT LISTName
FROM
(
    SELECT distinct LISTName
    FROM #LIST
    UNION ALL
    SELECT 'No List'
    WHERE (SELECT COUNT(DISTINCT LISTName) FROM #LIST) < 1000000
) Distinct_LISTName
WHERE (@LIST_Param = 'No List' or @LIST_Param = LISTName);
Alternatively you can do this:
DECLARE @LIST_Param varchar(8) = 'No List';
WITH x AS
(
    SELECT LISTName, c = COUNT(*)
    FROM #LIST
    WHERE (@LIST_Param = 'No List' or @LIST_Param = LISTName)
    GROUP BY LISTName
),
c AS (SELECT s = SUM(c) FROM x)
SELECT LISTName
FROM x CROSS JOIN c
WHERE s < 1000000;
Let's take an example. These are the rows of the table I want to get the data from:
The column I'm talking about is the reference one. The user can set this value on the web form, but the system I'm developing must suggest the lowest reference value that is not yet used.
As you can see, the smallest value of this column is 35. I could just take the smallest reference and add 1, but in that case the value 36 is already used. So, the value I want is 37.
Is there a way to do this without checking in a loop? This table will grow a lot.
This is for 2012+
DECLARE @Tbl TABLE (id int, reference int)
INSERT INTO @Tbl
    ( id, reference )
VALUES
    (1, 49),
    (2, 125),
    (3, 35),
    (4, 1345),
    (5, 36),
    (6, 7784)
-- LEAD pairs each reference with the next higher one;
-- a difference other than -1 marks the start of a gap
SELECT
    MIN(A.reference) + 1 Result
FROM
(
    SELECT
        *,
        LEAD(reference) OVER (ORDER BY reference) Tmp
    FROM
        @Tbl
) A
WHERE
    A.reference - A.Tmp != -1
Result: 37
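For readers who want to check the expected answer independently of SQL, the same "first gap after the smallest value" rule can be written in a few lines of Python. This is a sketch of the logic only; `next_free_reference` is a hypothetical helper name, and unlike the LEAD query it also returns a value when the used references contain no gap at all.

```python
def next_free_reference(references):
    """Return the lowest unused value above the smallest used reference."""
    used = set(references)
    if not used:
        return None  # nothing to base a suggestion on
    # A value r + 1 is a candidate when r is used and r + 1 is not;
    # the smallest such candidate is the first gap. The largest used
    # value always yields a candidate, so the generator is never empty.
    return min(r + 1 for r in used if r + 1 not in used)


if __name__ == "__main__":
    print(next_free_reference([49, 125, 35, 1345, 36, 7784]))
```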
Here is yet another place where the tally table is going to prove invaluable. In fact it is so useful I keep a view on my system that looks like this.
create View [dbo].[cteTally] as
WITH
E1(N) AS (select 1 from (values (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))dt(n)),
E2(N) AS (SELECT 1 FROM E1 a cross join E1 b), --10E+2 or 100 rows
E4(N) AS (SELECT 1 FROM E2 a cross join E2 b), --10E+4 or 10,000 rows max
cteTally(N) AS
(
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E4
)
select N from cteTally
Next of course we need some sample data and table to hold it.
create table #Something
(
id int identity
, reference int
, description varchar(10)
)
insert #Something (reference, description)
values (49, 'data1')
, (125, 'data2')
, (35, 'data3')
, (1345, 'data4')
, (36, 'data5')
, (7784, 'data6')
Now comes the magic of the tally table.
select top 1 t.N
from cteTally t
left join #Something s on t.N = s.reference
where t.N >= (select MIN(reference) from #Something)
and s.id is null
order by t.N
This is ugly, but should get the job done:
select
top 1 reference+1
from
[table]
where
reference+1 not in (select reference from [table])
order by reference
I used a common table expression to get the next value. I first left outer joined the table to itself (shifting the key in the join by +1). I then looked only at rows that had no corresponding match (b.ID is null). The minimum a.Reference + 1 gives us the answer we are looking for.
create table MyTable
(
ID int identity,
Reference int,
Description varchar(20)
)
insert into MyTable values (10,'Data')
insert into MyTable values (11,'Data')
insert into MyTable values (12,'Data')
insert into MyTable values (15,'Data')
-- Find gap
;with Gaps as
(
select a.Reference+1 as 'GapID'
from MyTable a
left join MyTable b on a.Reference = b.Reference-1
where b.ID is null
)
select min(GapID) as 'NewReference'
from Gaps
NewReference
------------
13
I hope the code was clearer than my description.
CREATE TABLE #T(ID INT , REFERENCE INT, [DESCRIPTION] VARCHAR(50))
INSERT INTO #T
SELECT 1,49 , 'data1' UNION ALL
SELECT 2,125 , 'data2' UNION ALL
SELECT 3,35 , 'data3' UNION ALL
SELECT 4,1345, 'data4' UNION ALL
SELECT 5,36 , 'data5' UNION ALL
SELECT 6,7784, 'data6'
SELECT TOP 1 REFERENCE + 1
FROM #T T1
WHERE
NOT EXISTS
(
SELECT 1 FROM #T T2 WHERE T2.REFERENCE = T1.REFERENCE + 1
)
ORDER BY T1.REFERENCE
--- OR
SELECT MIN(REFERENCE) + 1
FROM #T T1
WHERE
NOT EXISTS
(
SELECT 1 FROM #T T2 WHERE T2.REFERENCE = T1.REFERENCE + 1
)
How about using a tally table? It would be better to use a persisted numbers table than the recursive CTE, but the code below illustrates the concept.
For further reading as to why you should use a persisted table, check out the following link: sql-auxiliary-table-of-numbers
DECLARE @START int = 1, @END int = 1000
CREATE TABLE #TEST(UsedValues INT)
INSERT INTO #TEST(UsedValues) VALUES
(1),(3),(5),(7),(9),(11),(13),(15),(17)
;With NumberSequence( Number ) as
(
    Select @START as Number
    union all
    Select Number + 1
    from NumberSequence
    where Number < @END
)
SELECT MIN(Number)
FROM NumberSequence n
LEFT JOIN #TEST t
    ON n.Number = t.UsedValues
WHERE UsedValues IS NULL
OPTION ( MAXRECURSION 1000 )
You could try sorting in ascending order:
SELECT DISTINCT reference
FROM `Resultsados`
ORDER BY `reference` ASC;
As far as I know, there is no way to do this without a loop. To prevent multiple values from returning be sure to use DISTINCT.
I have rows in a table that looks like this:
[date],[name],[duty],[holiday],[hdaypart],[sick],[sdaypart]
2015-04-27, person1, 1,0,NULL,0,NULL
2015-04-27, person1, 0,1,'fd',0,NULL
I would like to combine these rows to:
[date],[name],[duty],[holiday],[hdaypart],[sick],[sdaypart]
2015-04-27, person1, 1,1,'fd',0,NULL
The duty, holiday and sick columns are BIT columns.
Is there a way to do this?
The one solution I can come up with is using subqueries, but it consumes a lot of time. A faster solution would be nice.
This is what I have now:
SELECT DISTINCT [name],[date],[region],[cluster]
    ,CASE WHEN (SELECT SUM(CONVERT(INT,callduty)) FROM planning AS t2
                WHERE t1.[Date] = @datum AND t2.[Name] = t1.[name] AND t2.[Date] = t1.[date] ) > 0
        THEN 1 ELSE 0 END AS [CallDuty]
    ,CASE WHEN (SELECT SUM(CONVERT(INT,holiday)) FROM planning AS t2
                WHERE t1.[Date] = @datum AND t2.[Name] = t1.[name] AND t2.[Date] = t1.[date] ) > 0
        THEN 1 ELSE 0 END AS [Holiday]
FROM planning AS t1
where t1.[Date] = @datum AND t1.[Name] like @naam
group by t1.[date],t1.[name], t1.Region, t1.cluster
order by t1.[name]
You seem to want to group by date and name and select either the maximum or the not null values within each group. MAX aggregate function is suitable for both of these selections:
SELECT [date],[name], MAX([duty]), MAX([holiday]),
MAX([hdaypart]), MAX([sick]), MAX([sdaypart])
FROM mytable
GROUP BY [date],[name]
By looking at your example, I assume that you want to get the maximum values for a specific user.
You could do this using a group by and max
select max([date]),[name],max([duty]),max([holiday]),max([hdaypart]),max([sick]),max([sdaypart])
from yourtable
group by name
This is not really pretty but should perform better than using subqueries.
EDIT:
If you have columns with bit sql types, use
max(cast([bitColumn] as int))
Adding the date column in the group by, as suggested by Giorgos Betsos, the result is
select [date],
    [name],
    max(cast([duty] as int)),    -- duty, holiday and sick are the bit columns
    max(cast([holiday] as int)),
    max([hdaypart]),
    max(cast([sick] as int)),
    max([sdaypart])
from yourtable
group by [date],[name]
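What GROUP BY with MAX does to the two sample rows can be traced by hand with a short Python sketch. It assumes, as SQL aggregates do, that NULL (here `None`) is ignored unless every value in the group is NULL; the function name `combine_rows` and the dictionary layout are mine, not part of the answers above.

```python
VALUE_COLUMNS = ["duty", "holiday", "hdaypart", "sick", "sdaypart"]


def combine_rows(rows):
    """Collapse rows sharing (date, name) by taking the max of each column."""
    merged = {}
    for row in rows:
        key = (row["date"], row["name"])
        if key not in merged:
            merged[key] = {"date": row["date"], "name": row["name"]}
            merged[key].update({col: None for col in VALUE_COLUMNS})
        target = merged[key]
        for col in VALUE_COLUMNS:
            value = row[col]
            # None plays the role of NULL: it never replaces a real value.
            if value is not None and (target[col] is None or value > target[col]):
                target[col] = value
    return list(merged.values())


sample_rows = [
    {"date": "2015-04-27", "name": "person1", "duty": 1, "holiday": 0,
     "hdaypart": None, "sick": 0, "sdaypart": None},
    {"date": "2015-04-27", "name": "person1", "duty": 0, "holiday": 1,
     "hdaypart": "fd", "sick": 0, "sdaypart": None},
]

if __name__ == "__main__":
    for combined in combine_rows(sample_rows):
        print(combined)
```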
declare @t table ([date] date,[name] varchar(10),[duty] varchar(10),[holiday] int,[hdaypart] varchar(10),[sick] int,[sdaypart] int)
insert into @t([date],[name],[duty],[holiday],[hdaypart],[sick],[sdaypart])values ('2015-04-27','person1',1,0,NULL,0,NULL),
('2015-04-27','person1',1,0,'fd',0,NULL)
select MAX([date]),MAX([name]),MAX([duty]),MAX([holiday]),MAX([hdaypart]), [sick],[sdaypart] from @t
group by sick,[sdaypart]
OR
select [date],[name],[duty],[holiday],MAX([hdaypart])AS H,[sick],[sdaypart] from @t
group by [date],[name],[duty],[holiday],[sick],[sdaypart]
UNION
select [date],[name],[duty],[holiday],MAX([hdaypart])AS H,[sick],[sdaypart] from @t
group by [date],[name],[duty],[holiday],[sick],[sdaypart]
CREATE TABLE #Combine
(
[date] date,
[name] VARCHAR(10),
[duty] CHAR(1),
[holiday] CHAR(1),
[hdaypart] CHAR(5),
[sick] CHAR(1),
[sdaypart] VARCHAR(10)
)
INSERT INTO #Combine VALUES('2015-04-27', 'person1', '1','0',NULL,'0',NULL),
('2015-04-27', 'person1', '0','1','fd','0',NULL)
SELECT MAX(Date) [date],MAX(name) [name],MAX(Duty) [duty],MAX(holiday) holiday,
MAX(hdaypart) hdaypart,max(sick) sick,max(sdaypart)sdaypart FROM #Combine
I have two tables with the same number of rows.
Example:
table a:
1,A
2,B
3,C
table b:
AA,BB
AAA,BBB
AAAA,BBBB
I want a new table like this in SQL Server:
1,A,AA,BB
2,B,AAA,BBB
3,C,AAAA,BBBB
How do I do that?
In SQL Server 2005 (or newer), you can use something like this:
-- test data setup
DECLARE @tablea TABLE (ID INT, Val CHAR(1))
INSERT INTO @tablea VALUES(1, 'A'), (2, 'B'), (3, 'C')
DECLARE @tableb TABLE (Val1 VARCHAR(10), Val2 VARCHAR(10))
INSERT INTO @tableb VALUES('AA', 'BB'),('AAA', 'BBB'), ('AAAA', 'BBBB')
-- define CTE for table A - sort by "ID" (I just assumed this - adapt if needed)
;WITH DataFromTableA AS
(
    SELECT ID, Val, ROW_NUMBER() OVER(ORDER BY ID) AS RN
    FROM @tablea
),
-- define CTE for table B - sort by "Val1" (I just assumed this - adapt if needed)
DataFromTableB AS
(
    SELECT Val1, Val2, ROW_NUMBER() OVER(ORDER BY Val1) AS RN
    FROM @tableb
)
-- create an INNER JOIN between the two CTEs, which just select the data
-- from both tables and add a new column "RN" that gets a consecutive number for each row
SELECT
    a.ID, a.Val, b.Val1, b.Val2
FROM
    DataFromTableA a
INNER JOIN
    DataFromTableB b ON a.RN = b.RN
This gives you the requested output:
You could do a rank over the primary keys, then join on that rank:
SELECT T1.*, T2.*
FROM
(
    SELECT RANK() OVER (ORDER BY table1.primaryKey) AS [rank], table1.* FROM table1
) AS T1
JOIN
(
    SELECT RANK() OVER (ORDER BY table2.primaryKey) AS [rank], table2.* FROM table2
) AS T2 ON T1.[rank] = T2.[rank]
Your query is strange, but in Oracle you can do this:
select a.*, tb.*
from a
, ( select rownum rn, b.* from b ) tb -- temporary b - added rn column
where a.c1 = tb.rn -- assuming first column in a is called c1
If there is no column with numbers in a, you can do the same trick twice:
select ta.*, tb.*
from ( select rownum rn, a.* from a ) ta
, ( select rownum rn, b.* from b ) tb
where ta.rn = tb.rn
Note: be aware that this can generate an arbitrary combination, for example
1 A AA BB
2 C A B
3 B AAA BBB
because there is no order by in ta and tb
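The ordering caveat applies to any approach that pairs rows by position. As a rough illustration in Python (the function name `join_by_position` and the choice of sort keys are my assumptions, mirroring the "sort by ID" and "sort by Val1" choices in the SQL Server answer), sorting both inputs explicitly before pairing makes the result repeatable:

```python
def join_by_position(table_a, table_b, key_a, key_b):
    """Pair the n-th row of table_a with the n-th row of table_b.

    Both tables are sorted first, so the pairing does not depend on the
    order in which the rows happen to arrive.
    """
    ordered_a = sorted(table_a, key=key_a)
    ordered_b = sorted(table_b, key=key_b)
    # zip stops at the shorter input, like an inner join on the row number.
    return [row_a + row_b for row_a, row_b in zip(ordered_a, ordered_b)]


# Rows deliberately given out of order.
table_a = [(3, "C"), (1, "A"), (2, "B")]
table_b = [("AAAA", "BBBB"), ("AA", "BB"), ("AAA", "BBB")]

joined = join_by_position(
    table_a, table_b,
    key_a=lambda row: row[0],   # order table a by its ID column
    key_b=lambda row: row[0],   # order table b by its first value column
)

if __name__ == "__main__":
    for row in joined:
        print(row)
```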