SQL cross-joining to produce number sequence - sql-server

I've tried to figure out how this SQL query generates a sequence of numbers, and I still don't have a clue.
Digits Table
digit
--------
0
1
2
3
4
5
6
7
8
9
SELECT D3.digit * 100 + D2.digit * 10 + D1.digit + 1 AS n
FROM dbo.Digits as D1
CROSS JOIN dbo.Digits as D2
CROSS JOIN dbo.Digits AS D3
ORDER BY n;
The Query Result...
n
------
1
2
3
4
5
...
998
999
1000
How does it work?

If you are into CTE, this will give you 1 to 1000.
;
with
Num(Pos) as
(
select cast(1 as int)
union all
select cast(Pos + 1 as int) from Num where Pos < 1000
)
select * from Num option (maxrecursion 0)
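A variant of the recursive CTE above, as a sketch. The upper bound is held in a variable (@upper is a name introduced here, it is not part of the original answer), and MAXRECURSION is set to a finite cap instead of 0. By default SQL Server stops a recursive CTE after 100 recursion levels, which is why the OPTION clause is needed at all; 0 removes the limit entirely, while a finite value guards against a runaway query if the WHERE condition is ever wrong.

```sql
-- @upper is a hypothetical parameter, not part of the original answer.
DECLARE @upper int = 1000;

WITH Num(Pos) AS
(
    SELECT 1                                      -- anchor row
    UNION ALL
    SELECT Pos + 1 FROM Num WHERE Pos < @upper    -- one new row per recursion level
)
SELECT Pos
FROM Num
OPTION (MAXRECURSION 1000);  -- enough for @upper up to about 1000; raise it together with @upper
```

MAXRECURSION accepts values from 0 to 32767, so for longer sequences the cross join approach from the question scales more comfortably.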

A cross join is a Cartesian product: that is, every row of one table is paired with every row of the other.
So an 11-row table joined to a 7-row table gives 77 rows.
In your case, you have 10 rows * 10 rows * 10 rows = 1000.
Try this query to see the raw data before you generate the number
SELECT D3.digit AS d3, D2.digit AS d2, D1.digit AS d1
FROM dbo.Digits as D1
CROSS JOIN dbo.Digits as D2
CROSS JOIN dbo.Digits AS D3
ORDER BY d3, d2, d1;
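The expression 100*d3 + 10*d2 + d1 replicates how we count naturally: d1 is the units digit, d2 the tens digit and d3 the hundreds digit, so each of the 1000 digit combinations maps to a different number from 0 to 999.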
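To check the row count and the range without creating dbo.Digits, here is a minimal self-contained sketch. It assumes an inline VALUES list stands in for the Digits table; the CTE name Digits and the column aliases row_count, min_n, max_n and distinct_n are just names chosen for this example.

```sql
-- Inline stand-in for dbo.Digits (assumption: the real table holds exactly 0..9).
WITH Digits(digit) AS
(
    SELECT digit
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(digit)
)
SELECT COUNT(*)          AS row_count,   -- 10 * 10 * 10 combinations
       MIN(n)            AS min_n,
       MAX(n)            AS max_n,
       COUNT(DISTINCT n) AS distinct_n   -- every combination yields a different n
FROM (
    SELECT D3.digit * 100 + D2.digit * 10 + D1.digit + 1 AS n
    FROM Digits AS D1
    CROSS JOIN Digits AS D2
    CROSS JOIN Digits AS D3
) AS s;
-- row_count = 1000, min_n = 1, max_n = 1000, distinct_n = 1000
```

Since all 1000 values are distinct and lie between 1 and 1000, the query returns every number in that range exactly once.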

CROSS JOIN is much like an INNER JOIN MyTable ON 1 = 1, resulting in the Cartesian product of your input sets.
Basically, each record on the left is joined to every record on the right.
In the case of a 10-row source table, the first cross join results in 100 records.
With a second cross join to the same 10-row source table, you get all 100 previous records again for each record in the source table, resulting in 1000 records.
A few rows of your resulting table would look like this if your Select statement was "Select D3.digit, D2.digit, D1.digit ..." Order by ...
D3 D2 D1
1 2 3
1 2 4
1 2 5
If you take those values in the table above and concatenate them (then add one) you get consecutive numbers.
"1" + "2" + "3" = 123 (+1 = 124)
"1" + "2" + "4" = 124 (+1 = 125)
"1" + "2" + "5" = 125 (+1 = 126)
Obviously, the author is not concatenating. However, he's doing the mathematical equivalent.
1 * 100 + 2 * 10 + 3 * 1 + 1 = 124
1 * 100 + 2 * 10 + 4 * 1 + 1 = 125
1 * 100 + 2 * 10 + 5 * 1 + 1 = 126
Ultimately, the author devised a strange way to provide a listing of numbers from 1 to 1000.

The values of the digit from the D3 table will range from 0 - 900 (D3.digit * 100)
The values of the digit from the D2 table will range from 0 - 90 (D2.digit * 10)
The values of the digit from the D1 table will range from 0 - 9 (D1.digit * 1)
Add them up and you have a range from 0 - 999
Add 1 to the result and you have a range from 1 - 1000
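The same place-value reasoning extends to more digits. As a sketch (again using an inline VALUES list in place of dbo.Digits, and a fourth alias D4 that does not appear in the original query), one more cross join with a factor of 1000 gives 1 to 10000:

```sql
-- Inline stand-in for dbo.Digits; D4 is an extra alias added for this example.
WITH Digits(digit) AS
(
    SELECT digit
    FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS v(digit)
)
SELECT D4.digit * 1000      -- 0 - 9000
     + D3.digit * 100       -- 0 - 900
     + D2.digit * 10        -- 0 - 90
     + D1.digit             -- 0 - 9
     + 1 AS n               -- shifts 0 - 9999 to 1 - 10000
FROM Digits AS D1
CROSS JOIN Digits AS D2
CROSS JOIN Digits AS D3
CROSS JOIN Digits AS D4
ORDER BY n;
```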

Related

SQL Server script not working as expected

I have this little script that should return the first number in a column of type int that is not used yet.
SELECT t1.plu + 1 AS plu
FROM tovary t1
WHERE NOT EXISTS (SELECT 1 FROM tovary t2 WHERE t2.plu = t1.plu + 1)
AND t1.plu > 0;
This returns unused numbers like
3
11
22
27
...
The problem is that when I run a simple select like
SELECT plu
FROM tovary
WHERE plu > 0
ORDER BY plu ASC;
the results are
1
2
10
20
...
Why isn't the first script returning some of the free numbers like 4, 5, 6 and so on?
Compiling a formal answer from the comments.
Credit to Larnu:
It seems what the OP really needs here is an (inline) Numbers/Tally (table) which they can then use a NOT EXISTS against their table.
Sample data
create table tovary
(
plu int
);
insert into tovary (plu) values
(1),
(2),
(10),
(20);
Solution
The tally table is isolated in a common table expression, First1000, which produces the numbers 1 to 1000. The number of generated rows can be scaled up as needed by adding more cross joins.
with First1000(n) as
(
select row_number() over(order by (select null))
from ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) a(n) -- 10^1
cross join ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) b(n) -- 10^2
cross join ( values (0),(0),(0),(0),(0),(0),(0),(0),(0),(0) ) c(n) -- 10^3
)
select top 20 f.n as Missing
from First1000 f
where not exists ( select 'x'
from tovary
where plu = f.n)
order by f.n;
top 20 is used in the query above to limit the output, and the order by makes sure it is the 20 smallest missing numbers that come back. This gives:
Missing
-------
3
4
5
6
7
8
9
11
12
13
14
15
16
17
18
19
21
22
23
24
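As for why the script in the question skips 4, 5, 6 and so on: it only tests plu + 1 for each existing plu, so it can report at most the first number of each gap. If a compact list of gaps is enough, a different technique from the tally table above is to compare each row with its successor. The sketch below assumes SQL Server 2012 or later (for LEAD) and the tovary sample data from this answer; the names Ordered, next_plu, gap_start and gap_end are introduced here.

```sql
-- Assumes the tovary sample table above (plu values 1, 2, 10, 20).
WITH Ordered AS
(
    SELECT plu,
           LEAD(plu) OVER (ORDER BY plu) AS next_plu   -- NULL for the largest plu
    FROM tovary
    WHERE plu > 0
)
SELECT plu + 1      AS gap_start,
       next_plu - 1 AS gap_end
FROM Ordered
WHERE next_plu - plu > 1      -- a gap exists only if the successor is not plu + 1
ORDER BY gap_start;
-- For the sample data: (3, 9) and (11, 19)
```

Unlike the tally query, this does not list the free numbers above the largest plu (21 and up), because the last row has no successor.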

Increasing a Column value by a % range

I have a table in SQL Server 2012. The following query works great:
SELECT TOP 300 [ObjectID], [tbh_Objects].Title, [Discount], [tbh_Section].Title
FROM [ECom].[dbo].[tbh_Objects]
INNER JOIN [tbh_Section] ON tbh_Objects.SectionID = tbh_Section.SectionID
ORDER BY tbh_Objects.AddedDate DESC
I want to run a query that increases the discount by a random percentage in the range 5-10 for all 300 rows at once. For example, if the Discount of ObjectID=500 is 30 and the random value between 5 and 10 is 6, I want it to become 30 + 6% of 30 for ObjectID=500.
Similarly for ObjectID=230, let's say the discount is 20 and the random value is 8, I want it to be 20 + 8% of 20.
The end result of the Discount should always be a whole number and not a decimal, so it should be rounded automatically.
Is this possible in SQL Server? How?
You need random integers and the modulo (%) operator. One possible approach to generating random integers is a combination of the NEWID() and CHECKSUM() functions. The following simplified example is a possible solution to your problem:
SELECT
Discount,
RandomPercent,
CONVERT(int, (Discount * (100.0 + RandomPercent) / 100)) AS NewDiscount
FROM (
SELECT Discount, (ABS(CHECKSUM(NEWID()) % 6) + 5) AS RandomPercent
FROM (VALUES (30), (20), (50), (70), (11), (21), (13), (15), (1), (6)) v (Discount)
) t
Result:
Discount RandomPercent NewDiscount
----------------------------------
30 7 32
20 5 21
50 6 53
70 10 77
11 9 11
21 9 22
13 8 14
15 10 16
1 6 1
6 5 6
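The expression ABS(CHECKSUM(NEWID()) % 6) + 5 used above works in steps: CHECKSUM(NEWID()) is a pseudo-random int (positive or negative), % 6 reduces it to the range -5 to 5, ABS folds that to 0 to 5, and + 5 shifts it to 5 to 10. A small sketch to check the bounds empirically over 1000 draws (the inline VALUES lists and the names draws, r, min_r and max_r are assumptions made for this example):

```sql
-- 10 x 10 x 10 = 1000 rows, each with its own NEWID() and therefore its own draw.
SELECT MIN(r) AS min_r,
       MAX(r) AS max_r
FROM (
    SELECT ABS(CHECKSUM(NEWID()) % 6) + 5 AS r
    FROM       (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS a(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS b(n)
    CROSS JOIN (VALUES (0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS c(n)
) AS draws;
-- min_r is never below 5 and max_r is never above 10
```

Note that values near 0 are folded by ABS, so 5 comes up somewhat less often than 6 through 10; if a uniform spread matters, ABS(CHECKSUM(NEWID())) % 6 + 5 is a commonly used alternative.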
If you need an UPDATE statement:
;WITH UpdateCTE AS (
SELECT TOP 300 o.[Discount]
FROM [ECom].[dbo].[tbh_Objects] o
INNER JOIN [tbh_Section] s ON o.SectionID = s.SectionID
ORDER BY o.AddedDate DESC
)
UPDATE UpdateCTE
SET [Discount] = CONVERT(int, ([Discount] * (100.0 + (ABS(CHECKSUM(NEWID()) % 6) + 5)) / 100))
If you want to round the new discounts before the integer conversion, use ROUND():
SET [Discount] = CONVERT(
int,
ROUND([Discount] * (100.0 + (ABS(CHECKSUM(NEWID()) % 6) + 5)) / 100, 0)
)

TSQL random sample

I need to select a random sample using TSQL from a table based on ratios of 2 different variables in the table.
The random sample required is approximately 8000 records from a table with about 381,000 records. The random sample must have approximate ratios of 2 variables:
4:1 (Male/Female) - 2 category variable
4:3:2:1 (Heavy/Moderate/Light/Very Light) - 4 category variable
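Break it down into how many rows you need of each combination, and sample each combination separately. Each branch has to be wrapped in a derived table, because ORDER BY is not allowed directly inside a branch of a UNION:
select * from (
select top (640) *
from [table]
where sex = 'f'
and cat = 'heavy'
order by NewID()
) fh
union all
select * from (
select top (480) *
from [table]
where sex = 'f'
and cat = 'medium'
order by NewID()
) fm
...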
The sex ratio has 4 + 1 = 5 parts
The category ratio has 4 + 3 + 2 + 1 = 10 parts
Female and heavy: 640 = 8000 / 5 * 1 * 4 / 10
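The arithmetic above can be carried through for all eight combinations in one query. This is only a sketch under stated assumptions: dbo.Population is a hypothetical table name, the columns sex and cat and the labels 'f', 'heavy' and 'medium' are taken from the snippet above, and 'm', 'light' and 'very light' are guessed labels. Quota, Ranked, parts, n and rn are names introduced here.

```sql
-- Quota per combination: 8000 * (sex parts / 5) * (category parts / 10).
WITH Quota AS
(
    SELECT s.sex, c.cat,
           8000 * s.parts / 5 * c.parts / 10 AS n
    FROM       (VALUES ('m', 4), ('f', 1)) AS s(sex, parts)
    CROSS JOIN (VALUES ('heavy', 4), ('medium', 3),
                       ('light', 2), ('very light', 1)) AS c(cat, parts)
    -- m: 2560, 1920, 1280, 640   f: 640, 480, 320, 160   (total 8000)
),
Ranked AS
(
    -- Number the rows of each combination in random order.
    SELECT p.*,
           ROW_NUMBER() OVER (PARTITION BY p.sex, p.cat ORDER BY NEWID()) AS rn
    FROM dbo.Population AS p          -- hypothetical table name
)
SELECT r.*
FROM Ranked AS r
JOIN Quota AS q
  ON q.sex = r.sex AND q.cat = r.cat
WHERE r.rn <= q.n;                    -- keep the first n random rows per combination
```

If a combination has fewer rows than its quota, it simply contributes all of them, so the total can come out below 8000.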

What would be my SqlServer query?

Table1
P R E Value
X 1 10 1
Y 2 30 2
Z 3 CR 3
X 1 30 4
Table2
P R E Value
X 1 CR 4
Y 2 10 5
Y 3 CR 6
W 1 30 7
Query1 - Merge these two tables. I'm able to achieve this using a union clause.
Query2 - On the merged table, select all records except the following: when, for the same combination of P & R, there are records that differ only in E being 30 and 10, ignore the record with E as 30. If only 30 is present, keep it.
Conditions:
10 & 30 - consider only 10, ignore 30
10 - consider it
30 - consider it
CR - consider it
10 & CR - consider both
30 & CR - consider both
10 & 30 & CR - consider 10 & CR
Expected Output table
P R E Value
X 1 10 1
Z 3 CR 3
X 1 CR 4
Y 2 10 5
Y 3 CR 6
W 1 30 7
Ignored records
Y 2 30 2
X 1 30 4
I was able to achieve your desired result set with the following query.
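create table #Table1 (P varchar(10), R varchar(10), E varchar(10), Value varchar(10))
create table #Table2 (P varchar(10), R varchar(10), E varchar(10), Value varchar(10))
insert into #Table1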
values ('X','1','10','1'),
('Y','2','30','2'),
('Z','3','CR','3'),
('X','1','30','4')
insert into #Table2
values ('X','1','CR','4'),
('Y','2','10','5'),
('Y','3','CR','6'),
('W','1','30','7')
--Query2
select * from(
--Query1
select * from #Table1 union select * from #Table2) x
where E != '30' OR
(
E = '30' AND P+':'+R NOT IN
(
--Modified Query1
select P+':'+R from #Table1 where E = '10'
union
select P+':'+R from #Table2 y where E = '10'
)
)
order by Value
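The same rule can also be written with NOT EXISTS, which avoids building the P+':'+R key by string concatenation. This is a sketch that assumes #Table1 and #Table2 are populated as in the query above; the CTE name Merged and the aliases m and t are introduced here.

```sql
-- Assumes #Table1 and #Table2 exist with columns P, R, E, Value.
WITH Merged AS
(
    SELECT P, R, E, Value FROM #Table1
    UNION
    SELECT P, R, E, Value FROM #Table2
)
SELECT m.P, m.R, m.E, m.Value
FROM Merged AS m
WHERE m.E <> '30'                     -- 10 and CR are always kept
   OR NOT EXISTS (                    -- a 30 is kept only if no 10 exists for the same P, R
          SELECT 1
          FROM Merged AS t
          WHERE t.P = m.P
            AND t.R = m.R
            AND t.E = '10'
      )
ORDER BY m.Value;
```

For the sample data this drops (Y, 2, 30, 2) and (X, 1, 30, 4) and keeps the other six rows, matching the expected output table in the question.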

SQL Server NTILE - Same value in different quartile

I have a scenario where I'm splitting a number of results into quartiles using the SQL Server NTILE function below. The goal is to have as equal a number of rows as possible in each class
case NTILE(4) over (order by t2.TotalStd)
when 1 then 'A' when 2 then 'B' when 3 then 'C' else 'D' end as Class
The result table is shown below and there is a (9,9,8,8) split between the 4 class groups A,B,C and D.
There are two results which cause me an issue: both rows have the same TotalStd value of 30 but are assigned to different quartiles.
8 30 A
2 30 B
I'm wondering whether there is a way to ensure that rows with the same value are assigned to the same quartile. Can I group or partition by another column to get this behaviour?
Pos TotalStd class
1 16 A
2 23 A
3 21 A
4 29 A
5 25 A
6 26 A
7 28 A
8 30 A
9 29 A
1 31 B
2 30 B
3 32 B
4 32 B
5 34 B
6 32 B
7 34 B
8 32 B
9 33 B
1 36 C
2 35 C
3 35 C
4 35 C
5 40 C
6 38 C
7 41 C
8 43 C
1 43 D
2 48 D
3 45 D
4 47 D
5 44 D
6 48 D
7 46 D
8 57 D
You will need to recreate the Ntile function using the rank function.
The rank function gives the same rank to rows with the same value. After a tie, the rank 'jumps' to the value row_number would have given.
We can use this behavior to mimic the Ntile function, forcing it to give the same Ntile value to rows with the same value. However, this will cause the Ntile partitions to have different sizes.
See the example below for the new Ntile using 4 bins:
declare @data table ( x int )
insert @data values
(1),(2),
(2),(3),
(3),(4),
(4),(5)
select
x,
1+(rank() over (order by x)-1) * 4 / count(1) over (partition by (select 1)) as new_ntile
from @data
Results:
x new_ntile
---------------
1 1
2 1
2 1
3 2
3 2
4 3
4 3
5 4
Not sure what you're expecting to happen here, really. SQL Server has divided up the data into 4 groups of as-equal-size-as-possible, as you asked. What do you want to happen? Have a look at this example:
declare @data table ( x int )
insert @data values
(1),(2),
(2),(3),
(3),(4),
(4),(5)
select
x,
NTILE(4) over (order by x) as ntile
from @data
Results:
x ntile
----------- ----------
1 1
2 1
2 2
3 2
3 3
4 3
4 4
5 4
Now every ntile group shares a value with the one(s) next to it! But what else should it do?
Try this:
; with a as (
       select TotalStd,Class=case ntile(4)over( order by TotalStd )
                                when 1 then 'A'
                                when 2 then 'B'
                                when 3 then 'C'
                                when 4 then 'D'
                                end
                from t2
                group by TotalStd
)
select d.*, a.Class from t2 d
inner join a on a.TotalStd=d.TotalStd
order by Class,Pos;
Here we have a table of 34 rows.
DECLARE @x TABLE (TotalStd INT)
INSERT @x (TotalStd) VALUES (16), (21), (23), (25), (26), (28), (29), (29), (30), (30), (31), (32), (32), (32), (32), (33), (34),
(34), (35), (35), (35), (36), (38), (40), (41), (43), (43), (44), (45), (46), (47), (48), (48), (57)
SELECT '@x', TotalStd FROM @x ORDER BY TotalStd
We want to divide into quartiles. If we use NTILE, the buckets will be roughly the same size (8 to 9 rows each) but ties are broken arbitrarily:
SELECT '@x with NTILE', TotalStd, NTILE(4) OVER (ORDER BY TotalStd) quantile FROM @x
See how 30 appears twice: once in quantile 1 and once in quantile 2. Similarly, 43 appears both in quantiles 3 and 4.
With the cutoffs computed below, I find 8 items in quantile 1, 10 in quantile 2, 7 in quantile 3 and 9 in quantile 4 (i.e. not the 9-9-8-8 split that NTILE produces, but a near-even split is impossible if we are not allowed to break ties arbitrarily). I can do it using NTILE to determine cutoff points in a table variable:
DECLARE @cutoffs TABLE (quantile INT, min_value INT, max_value INT)
INSERT @cutoffs (quantile, min_value)
SELECT y.quantile, MIN(y.TotalStd)
FROM (SELECT TotalStd, NTILE(4) OVER (ORDER BY TotalStd) AS quantile FROM @x) y
GROUP BY y.quantile
-- The max values are the minimum values of the next quantile
UPDATE c1 SET c1.max_value = ISNULL(c2.min_value, (SELECT MAX(TotalStd) + 1 FROM @x))
FROM @cutoffs c1 LEFT OUTER JOIN @cutoffs c2 ON c2.quantile - 1 = c1.quantile
SELECT '@cutoffs', * FROM @cutoffs
We'll use the boundary values in the @cutoffs table to create the final table:
SELECT x.TotalStd, c.quantile FROM @x x
INNER JOIN @cutoffs c ON x.TotalStd >= c.min_value AND x.TotalStd < c.max_value
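The cutoff logic above sends every tied value to the highest quantile that NTILE gave to any of its copies. As a sketch, the same effect can be had in a single query by taking MAX of the NTILE result per distinct value. The example uses the small eight-row data set from the earlier answers so that it is self-contained; the names q, y and tie_safe_ntile are introduced here.

```sql
DECLARE @data TABLE (x int);
INSERT @data VALUES (1),(2),(2),(3),(3),(4),(4),(5);

SELECT x,
       MAX(q) OVER (PARTITION BY x) AS tie_safe_ntile   -- same quantile for equal x
FROM (
    SELECT x, NTILE(4) OVER (ORDER BY x) AS q           -- plain NTILE, ties split arbitrarily
    FROM @data
) AS y
ORDER BY x;
-- x: 1 2 2 3 3 4 4 5  ->  tie_safe_ntile: 1 2 2 3 3 4 4 4
```

Using MIN(q) instead would send ties to the lower quantile. Either way the buckets are no longer equal in size, which is the price of keeping equal values together.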

Resources