TSQL random sample - sql-server

I need to select a random sample using TSQL from a table based on ratios of 2 different variables in the table.
The random sample required is approximately 8000 records from a table with about 381,000 records. The random sample must have approximate ratios of 2 variables:
4:1 (Male/Female) - 2 category variable
4:3:2:1 (Heavy/Moderate/Light/Very Light) - 4 category variable

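Work out how many rows each combination needs, then sample each combination separately. In SQL Server each branch has to be wrapped in a derived table, because an ORDER BY written directly before UNION ALL is not allowed:
select * from (
select top (640) *
from table
where sex = 'f'
and cat = 'heavy'
order by NewID()
) as f_heavy
union all
select * from (
select top (480) *
from table
where sex = 'f'
and cat = 'medium'
order by NewID()
) as f_medium
...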
The counts come from the ratio totals:
4 + 1 = 5
4 + 3 + 2 + 1 = 10
640 = 8000 / 5 * 1 * 4 / 10 (female share is 1/5, heavy share is 4/10)
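To check the arithmetic for all eight combinations, here is a small Python sketch. It assumes the two ratios apply independently (each cell is the product of its sex share and its category share); the function name and the category labels are only illustrative and are not taken from the table.

```python
def stratum_sizes(total, sex_ratio, cat_ratio):
    """Split `total` over every (sex, category) pair in proportion to both ratios."""
    sex_sum = sum(sex_ratio.values())   # 4 + 1 = 5
    cat_sum = sum(cat_ratio.values())   # 4 + 3 + 2 + 1 = 10
    return {
        (sex, cat): total * s * c // (sex_sum * cat_sum)
        for sex, s in sex_ratio.items()
        for cat, c in cat_ratio.items()
    }


sizes = stratum_sizes(
    8000,
    {"m": 4, "f": 1},
    {"heavy": 4, "moderate": 3, "light": 2, "very light": 1},
)

for (sex, cat), n in sizes.items():
    print(sex, cat, n)
```

Each value is the number to put in the matching top (...) clause. With these ratios every cell divides evenly, so the eight counts add up to exactly 8000; with other totals the integer division may leave a few rows over.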

Related

How to update rows based on one field/column on two tables

I want to update one table from another table on the field "Id" in such a way that it won't create duplicates.
Let's say my first table is Table1 and my second table is Table2. I would like to update a row in Table1 from Table2 when the Id matches.
I am aware of the UNION operator, but it compares entire rows, whereas I only need to consider a single column. https://docs.snowflake.com/en/sql-reference/operators-query.html#union-all
Example of my Tables
Table1
Id name number value
1 a 8 100
2 b 8 100
3 c 8 100
4 d 8 100
Table2
Id name number value
3 c 8 99
4 d 6 100
5 e 7 100
Expected output
Id name number value
1 a 8 100
2 b 8 100
3 c 8 99
4 d 6 100
5 e 7 100
Please note that in the output table the rows with Id 3 and 4 have been updated and a new row with Id 5 has been inserted.
PS: It would be better if someone could provide me the select statement to get the output table.
The construct you are searching for is called MERGE:
CREATE OR REPLACE TABLE trg(Id INT, name VARCHAR, number INT, value INT)
AS SELECT 1 ,'a', 8, 100
UNION SELECT 2 ,'b', 8, 100
UNION SELECT 3 ,'c', 8, 100
UNION SELECT 4 ,'d', 8, 100;
CREATE OR REPLACE TABLE src(Id INT, name VARCHAR, number INT, value INT)
AS SELECT 3 ,'c', 8, 99
UNION SELECT 4 ,'d', 6, 100
UNION SELECT 5 ,'e', 7, 100;
Query:
MERGE INTO trg
USING src
ON trg.Id = src.Id
WHEN MATCHED THEN UPDATE SET name = src.name,
number = src.number,
value = src.value
WHEN NOT MATCHED THEN INSERT (ID, name, number, value)
VALUES (src.Id, src.name, src.number, src.value);
SELECT * FROM trg;
Output:
EDIT:
PS: It would be better if someone could provide me the select statement to get the output table.
UNION ALL combined with QUALIFY could be used:
WITH cte AS (
SELECT *, 1 AS priority FROM trg
UNION ALL
SELECT *, 0 AS priority FROM src
)
SELECT Id, Name, Number, Value
FROM cte
QUALIFY ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Priority) = 1
ORDER BY Id;
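The logic behind both the MERGE and the UNION ALL / QUALIFY query is "for each Id, the row from src wins if there is one". As a rough illustration only, the same rule in Python looks like this (the `upsert` helper is hypothetical, and rows are plain tuples of Id, name, number, value):

```python
def upsert(target_rows, source_rows):
    """Return target rows overridden or extended by source rows, keyed on Id."""
    merged = {row[0]: row for row in target_rows}   # start from the target
    for row in source_rows:
        merged[row[0]] = row                        # source replaces or adds
    return [merged[key] for key in sorted(merged)]


trg = [(1, "a", 8, 100), (2, "b", 8, 100), (3, "c", 8, 100), (4, "d", 8, 100)]
src = [(3, "c", 8, 99), (4, "d", 6, 100), (5, "e", 7, 100)]

result = upsert(trg, src)
for row in result:
    print(*row)
```

Ids 3 and 4 take their values from src, Id 5 is added, and Ids 1 and 2 are left untouched, which matches the expected output in the question.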

how to update large number of records as a batch of n number of records

Suppose I have 100000 records in table A and 1000 records in table B, with a primary/foreign key relationship between them. I want to update a column in the first 100 records of table A with the column value from the first record of table B. Similarly, I want to update all 100000 records in table A in 1000 batches of 100 records, each batch taking its value from the next record in table B.
The number of records updated per batch is 100, i.e. 100000/1000 = 100.
Let's assume you have table_a with 20 rows and a unique id column, and you want to update the value column:
CREATE TABLE table_a (id, value) AS
SELECT LEVEL, CAST(NULL AS NUMBER(8,0)) FROM DUAL CONNECT BY LEVEL <= 20;
And table_b with 5 rows containing the values you want to update from:
CREATE TABLE table_b (id, value) AS
SELECT LEVEL, LEVEL FROM DUAL CONNECT BY LEVEL <= 5;
Then, you can use a correlated UPDATE statement:
UPDATE table_a a
SET value = (SELECT value
FROM table_b b
WHERE CEIL(a.id*5/20) = b.id);
or a MERGE statement:
MERGE INTO table_a a
USING table_b b
ON (CEIL(a.id*5/20) = b.id)
WHEN MATCHED THEN
UPDATE
SET value = b.value;
Both statements result in:
ID VALUE
1 1
2 1
3 1
4 1
5 2
6 2
7 2
8 2
9 3
10 3
11 3
12 3
13 4
14 4
15 4
16 4
17 5
18 5
19 5
20 5
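The CEIL(a.id*5/20) expression is what maps each table_a row onto a table_b row. A quick Python sketch of the same mapping, assuming the ids in table_a run from 1 to n without gaps (the `batch_id` helper is made up for illustration):

```python
def batch_id(a_id, n_a, n_b):
    """Integer form of CEIL(a_id * n_b / n_a): which table_b row feeds this table_a row."""
    return (a_id * n_b + n_a - 1) // n_a


# The 20-row / 5-row example from the answer.
small = [batch_id(i, 20, 5) for i in range(1, 21)]
print(small)

# The sizes from the question: 100000 rows in A, 1000 rows in B.
print(batch_id(100, 100_000, 1_000), batch_id(101, 100_000, 1_000))
```

For the question's sizes, ids 1 to 100 map to the first table_b row, ids 101 to 200 to the second, and so on. If the ids have gaps, the batches will not be evenly sized, and a ROW_NUMBER() over the ids would be a safer input than the raw id.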
db<>fiddle here

How to insert seven kind of ids in a column to all values

I have a column of data and need to insert ids into another column. I have 7 ids in total. For the first 7 values I have to insert these ids, for the next 7 values I have to insert the same ids again, and so on. Can anyone please help?
Pay_headID Pay_amount
16414 8000
16415 300
16416 0
16417 200
16418 500
16419 0
16420 0
16414 9000
16415 300
so on ...
You can use a CTE and ROW_NUMBER. I have ordered by Pay_headID here:
WITH cte_myTable
AS (SELECT
*,
(ROW_NUMBER() OVER (ORDER BY Pay_headID)) - 1 AS num
FROM myTable)
UPDATE cte_myTable
SET [Pay_headID] =
CASE
WHEN num % 7 = 0 THEN 16414
WHEN num % 7 = 1 THEN 16415
WHEN num % 7 = 2 THEN 16416
WHEN num % 7 = 3 THEN 16417
WHEN num % 7 = 4 THEN 16418
WHEN num % 7 = 5 THEN 16419
WHEN num % 7 = 6 THEN 16420
END
GO
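The CASE expression is just "row number modulo 7 picks the id". A minimal Python sketch of that cycling rule, using the seven ids from the question (the `assign_cyclic` name is only for illustration):

```python
PAY_HEAD_IDS = [16414, 16415, 16416, 16417, 16418, 16419, 16420]


def assign_cyclic(n_rows, ids):
    """Give row i (counting from 0) the id at position i % len(ids)."""
    return [ids[i % len(ids)] for i in range(n_rows)]


assigned = assign_cyclic(9, PAY_HEAD_IDS)
print(assigned)
```

Rows 0 to 6 get the seven ids in order and row 7 starts again at 16414, which is the pattern shown in the question.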
If you want to use the order in which the rows were inserted, you can set the Pay_headIDs to null first:
update myTable set Pay_headID=null;
You should use ROW_NUMBER() to give you an artificial incrementing number, divide it by 7 and then round it down.
SELECT FLOOR((ROW_NUMBER() OVER(ORDER BY Pay_HeadID DESC))/7) AS MyID
to get your ids

modifying the output of a SP

In my SQL Server SP:
SELECT rating as [Rating],count(id) as [RatingCount]
FROM MMBPollResults
where mmb_id = @MMbid
GROUP BY rating
This SP returns the number of users who gave each rating, i.e.
rating ratingcount
1 2
2 1
5 4
This means that
2 users have rated the transaction with 1 star
1 user has rated the transaction with 2 stars
4 users have rated the transaction with 5 stars
This is how I need the output:
rating ratingcount
1 2
2 1
3 0
4 0
5 4
Sorry, if this is a silly question
Thanks
Sun
You need a table with 1 to 5. This could be a number table or some other rating table.
Here I use a simple UNION to make a table with 1 to 5
SELECT
List.Rating,
count(MMB.id) as [RatingCount]
FROM
(
SELECT 1 AS Rating
UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5
) List
LEFT JOIN
MMBPollResults MMB ON List.Rating = MMB.Rating AND MMB.mmb_id = @MMbid
GROUP BY
List.Rating
ORDER BY
List.Rating;
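The idea is that the list of ratings drives the result and the poll rows only supply counts, so ratings nobody chose still appear with 0. A small Python sketch of the same idea, with sample ratings made up to match the counts in the question:

```python
from collections import Counter


def rating_counts(ratings, scale=range(1, 6)):
    """Count each rating on the scale, reporting 0 for ratings that never occur."""
    counts = Counter(ratings)
    return [(rating, counts[rating]) for rating in scale]


# Hypothetical poll rows: two 1-star, one 2-star and four 5-star ratings.
sample = [1, 1, 2, 5, 5, 5, 5]
table = rating_counts(sample)
for rating, count in table:
    print(rating, count)
```

This mirrors the LEFT JOIN: iterating over the scale plays the role of the 1 to 5 list, and the lookup that falls back to 0 plays the role of counting a column that is NULL when there is no match.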

SQL cross-joining to produce number sequence

I've tried to figure out how this SQL query generates a sequence of numbers, and I still don't have a clue.
Digits Table
digit
--------
0
1
2
3
4
5
6
7
8
9
SELECT D3.digit * 100 + D2.digit * 10 + D1.digit + 1 AS n
FROM dbo.Digits as D1
CROSS JOIN dbo.Digits as D2
CROSS JOIN dbo.Digits AS D3
ORDER BY n;
The Query Result...
n
------
1
2
3
4
5
...
998
999
1000
How does it work?
If you are into CTE, this will give you 1 to 1000.
;
with
Num(Pos) as
(
select cast(1 as int)
union all
select cast(Pos + 1 as int) from Num where Pos < 1000
)
select * from Num option (maxrecursion 0)
A cross join is a Cartesian product: that is, every row joins with every other row.
So an 11-row table joined to a 7-row table gives 77 rows.
In your case, you have 10 rows * 10 rows * 10 rows = 1000.
Try this query to see the raw data before you generate the number:
SELECT D3.digit, D2.digit, D1.digit
FROM dbo.Digits as D1
CROSS JOIN dbo.Digits as D2
CROSS JOIN dbo.Digits AS D3
ORDER BY D3.digit, D2.digit, D1.digit;
The way you have 100*d3 + 10*d2 + d1 replicates how we count naturally and carry in addition.
CROSS JOIN is much like an INNER JOIN MYTable on 1 = 1, resulting in the Cartesian Product of your Input Sets
Basically, for each record on the left, it joins for each record on the right.
In the case of a 10-digit source table, the first cross join results in 100 records.
In the case of a second cross join to the same 10-digit source table, you get all 100 previous records again, for each record in the source table, resulting in 1000 records.
Your resulting table would look like this if your Select statement was "Select * ..." Order by ...
D3 D2 D1
1 2 3
1 2 4
1 2 5
If you take those values in the table above and concatenate them (then add one) you get consecutive numbers.
"1" + "2" + "3" = 123 (+1 = 124)
"1" + "2" + "4" = 124 (+1 = 125)
"1" + "2" + "5" = 125 (+1 = 126)
Obviously, the author is not concatenating. However, he's doing the mathematical equivalent.
1 * 100 + 2 * 10 + 3 * 1 + 1 = 124
1 * 100 + 2 * 10 + 4 * 1 + 1 = 125
1 * 100 + 2 * 10 + 5 * 1 + 1 = 126
Ultimately, the author devised a strange way to provide a listing of numbers from 1 to 1000.
The values of the digit from the D3 table will range from 0 - 900 (D3.digit * 100)
The values of the digit from the D2 table will range from 0 - 90 (D2.digit * 10)
The values of the digit from the D1 table will range from 0 - 9 (D1.digit * 1)
Add them up and you have a range from 0 - 999
Add 1 to the result and you have a range from 1 - 1000
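If it helps to see it outside SQL, the triple cross join is the same thing as a Cartesian product of three copies of the digits 0 to 9. A short Python sketch of that equivalence (variable names follow the D1/D2/D3 aliases in the query):

```python
from itertools import product

digits = range(10)  # stands in for the dbo.Digits table

# Every (d1, d2, d3) combination, i.e. the 10 * 10 * 10 = 1000 joined rows.
rows = list(product(digits, repeat=3))

# Same expression as the SELECT list, then the ORDER BY n.
numbers = sorted(d3 * 100 + d2 * 10 + d1 + 1 for d1, d2, d3 in rows)

print(len(rows), numbers[:5], numbers[-3:])
```

Because each combination of three digits occurs exactly once, every value from 0 to 999 is produced exactly once before the + 1, so there are no gaps and no duplicates.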

Resources