Snowflake Lead Function - snowflake-cloud-data-platform

I have a question about the use of the lead function within Snowflake.
Let's say I have a text field of '881885895.1636104601'. When I run the lead function (assume this number repeats and returns this from the lead result), I noticed that the output gets trimmed/truncated to '881885895.16361'.
Does anyone know why this is the case, or has seen this strange behavior before?

It has nothing to do with the LEAD() function; it happens when a string is implicitly cast to a number.
The example below shows the same loss of digits with a cast to DOUBLE:
select '881885895.1636104601'::double;
+--------------------------------+
| '881885895.1636104601'::DOUBLE |
|--------------------------------|
| 881885895.16361 |
+--------------------------------+
When a string is implicitly cast to a number, Snowflake defaults to NUMBER(18,5).
If you need more precision and scale, you need to cast explicitly:
select '881885895.1636104601'::number(38,10);
+---------------------------------------+
| '881885895.1636104601'::NUMBER(38,10) |
|---------------------------------------|
| 881885895.1636104601 |
+---------------------------------------+

Passing a default value (the third argument of LEAD) whose data type differs from the column's causes an implicit conversion. An example to reproduce the case:
WITH cte AS (
SELECT '881885895.1636104601' AS COL, 1 AS ID
UNION ALL SELECT '881885895.1636104601' AS COL, 2 AS ID
)
SELECT *, LEAD(col,1,0) OVER(ORDER BY ID)
FROM cte;
-- Output
-- 881885895.1636104601 vs 881885895.16361
DESCRIBE RESULT LAST_QUERY_ID();
-- name col
-- col VARCHAR(20)
-- lead NUMBER(18,5)
By matching the data type the output is as intended:
WITH cte AS (
SELECT '881885895.1636104601' AS COL, 1 AS ID
UNION ALL SELECT '881885895.1636104601' AS COL, 2 AS ID
)
SELECT *, LEAD(col,1,'0') OVER(ORDER BY ID)
FROM cte;
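A minimal sketch of another way to avoid the conversion, assuming a NULL is acceptable when there is no following row: leave out the default argument. LEAD then falls back to NULL, so no numeric literal is involved and the result keeps the VARCHAR type of col. The alias next_col is illustrative only.

```sql
-- Same sample data as above; the default argument of LEAD is omitted,
-- so the last row gets NULL instead of 0 and no implicit cast to NUMBER occurs.
WITH cte AS (
    SELECT '881885895.1636104601' AS col, 1 AS id
    UNION ALL SELECT '881885895.1636104601' AS col, 2 AS id
)
SELECT *, LEAD(col) OVER (ORDER BY id) AS next_col  -- next_col: illustrative alias
FROM cte;
```

Running DESCRIBE RESULT LAST_QUERY_ID() afterwards is a quick way to confirm the type of the new column.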

Related

Trying to use UNION on multiple queries but won't work because of my subquery uses AVG

I'm trying to run 6 SELECT queries (two are shown for the sake of readability, but they all follow the same pattern) and use UNION to create a single output. However, when I run the query, I get the following two errors.
Msg.141 A SELECT statement that assigns a value to a variable must not be combined with data-retrieval operation.
Msg.10734 Variable assignment is not allowed in a statement containing a top-level UNION, INTERSECT or EXCEPT operator.
I understand and have looked at similar questions but nothing seems to work.
I would like to have the output look like this
| Table  | Rent | Thd  |
------------------------
| table1 | 9999 | 8888 |
| table2 | 9999 | 8888 |
Any suggestions or direction would be greatly appreciated!

I don't see why you need variables at all here. You need an OVER clause to calculate the average over all the rows:
Select
    'table1' as TableName,
    count(*) as RecordCount,
    0.75 * AVG(count(*)) over ()
from table1
where @something = specificDate
group by specificDate
union all
Select
    'table2',
    count(*) as RecordCount,
    0.75 * AVG(count(*)) over ()
from table2
where @something = specificDate
group by specificDate;
I note that your query appears to be filtering on specificDate, so you can just group by the empty set: group by ()
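As a rough sketch of the empty-set grouping mentioned above, assuming SQL Server: the inline VALUES rows stand in for table1, and the variable @something, its value, and the alias Threshold are hypothetical names used only for illustration.

```sql
-- Hypothetical filter value; in the real query this would come from the caller.
DECLARE @something date = '2016-10-05';

SELECT
    'table1' AS TableName,
    COUNT(*) AS RecordCount,
    0.75 * AVG(COUNT(*)) OVER () AS Threshold   -- illustrative alias
FROM (VALUES (CAST('2016-10-05' AS date)),
             (CAST('2016-10-05' AS date)),
             (CAST('2016-10-06' AS date))) AS table1(specificDate)  -- made-up sample rows
WHERE specificDate = @something
GROUP BY ();   -- empty grouping set: one grand-total row
```

Because the WHERE clause already restricts the rows to a single date, grouping by the empty set yields the same single row as grouping by specificDate.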

SQL GROUP BY with columns which contain mirrored values

Sorry for the bad title. I couldn't think of a better way to describe my issue.
I have the following table:
Category | A | B
A | 1 | 2
A | 2 | 1
B | 3 | 4
B | 4 | 3
I would like to group the data by Category, return only 1 line per category, but provide both values of columns A and B.
So the result should look like this:
category | resultA | resultB
A | 1 | 2
B | 4 | 3
How can this be achieved?
I tried this statement:
SELECT category, a, b
FROM table
GROUP BY category
but obviously, I get the following errors:
Column 'a' is invalid in the select list because it is not contained
in either an aggregate function or the GROUP BY clause.
Column 'b' is invalid in the select list because it is not contained in either an
aggregate function or the GROUP BY clause.
How can I achieve the desired result?

Try this:
SELECT category, MIN(a) AS resultA, MAX(a) AS resultB
FROM table
GROUP BY category
If the values are mirrored then you can get both values using MIN, MAX applied on a single column like a.

It seems you don't really want to aggregate per category, but rather remove duplicate rows from your result (or rather rows that you consider duplicates).
You consider a pair (x,y) equal to the pair (y,x). To find duplicates, you can put the lower value in the first place and the greater in the second and then apply DISTINCT on the rows:
select distinct
category,
case when a < b then a else b end as attr1,
case when a < b then b else a end as attr2
from mytable;
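To try this against the sample rows from the question, here is a self-contained sketch in which an inline VALUES list stands in for mytable (an assumption made for illustration; the real table would be queried directly).

```sql
-- Normalise each pair so the smaller value comes first, then drop duplicates.
SELECT DISTINCT
    category,
    CASE WHEN a < b THEN a ELSE b END AS attr1,
    CASE WHEN a < b THEN b ELSE a END AS attr2
FROM (VALUES ('A', 1, 2),
             ('A', 2, 1),
             ('B', 3, 4),
             ('B', 4, 3)) AS mytable(category, a, b);
```

This returns one row per category, (A, 1, 2) and (B, 3, 4). Note that category B comes back as 3 | 4 rather than the 4 | 3 shown in the question, since the pair is always ordered smaller-first.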

This assumes you want an arbitrary record from the duplicates in each category.
Here is one trick using a table value constructor and the Row_Number window function:
;with cte as
(
SELECT *,
(SELECT Min(min_val) FROM (VALUES (a),(b))tc(min_val)) min_val,
(SELECT Max(max_val) FROM (VALUES (a),(b))tc(max_val)) max_val
FROM (VALUES ('A',1,2),
('A',2,1),
('B',3,4),
('B',4,3)) tc(Category, A, B)
)
select Category,A,B from
(
Select Row_Number()Over(Partition by category,min_val,max_val order by (select NULL)) as Rn,*
From cte
) A
Where Rn = 1

Select from union of nested subqueries

I am quite certain this is an issue with applying proper aliases, I'm just not sure where I'm going wrong. I am looking at the following UNION in sqlserver:
Select Z.DesiredResult1, etc...
from (
Select C.columns
from (
Select B.columns
from (
Select A.columns
from (Subquery) as A
) as B
) as C
Where C.condition = 1
UNION
Select F.columns
from (
Select E.columns
from (
Select D.columns
from (Subquery) as D
) as E
) as F
Where F.condition = 2
) as Z
The union by itself functions perfectly, but when trying to make SELECT statements from it (as shown above) it throws an error:
No column name was specified for column 1 of 'Z'
Any insights would be appreciated, thanks for helping an SQL newbie.
Edit: Solved--I misunderstood the error. The issue was an aggregate function that needed an alias, not an entire subquery. Leaving the aggregate column unnamed worked fine for the union alone, so I didn't even consider it. Thanks for bothering to read.

This error can be easily reproduced.
If you do not name the columns in a single UNION:
SELECT *
FROM (SELECT 'A','B') T1
UNION
SELECT *
FROM (SELECT 'C','D') T2
You will get the same error:
No column name was specified for column 1 of 'T1'.
No column name was specified for column 2 of 'T1'.
No column name was specified for column 1 of 'T2'.
No column name was specified for column 2 of 'T2'.
Simply name each common column with the same name.
SELECT T3.Result1, T3.Result2
FROM
(SELECT *
FROM (SELECT 'A' Result1, 'B' Result2) T1
UNION
SELECT *
FROM (SELECT 'C' Result1, 'D' Result2) T2) T3
+----+---------+---------+
| | Result1 | Result2 |
+----+---------+---------+
| 1 | A | B |
+----+---------+---------+
| 2 | C | D |
+----+---------+---------+
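The asker's edit points at an unnamed aggregate column rather than a missing subquery alias. A small sketch of that case, using made-up inline data (the names A, D, x and Total are illustrative): in a UNION the result column names come from the first SELECT, so aliasing the aggregate there is enough for the outer query to reference it.

```sql
SELECT Z.Total
FROM (
    SELECT COUNT(*) AS Total        -- the alias here names column 1 of Z
    FROM (VALUES (1), (2)) AS A(x)
    UNION
    SELECT COUNT(*)                 -- later branches inherit the name
    FROM (VALUES (3)) AS D(x)
) AS Z;
```

Removing AS Total from the first branch should bring back the "No column name was specified for column 1 of 'Z'" error, even though the UNION on its own runs fine.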

SQL MAX Date Does Not Decipher Seconds

I have a table which contains the following data:
ID | ObjectID | ActionDate
=======================================
12345 | 422107 | 2016-10-05 11:24:23.790
12346 | 422107 | 2016-10-05 11:24:28.797
I want to return the ID and max date, but the MAX function does not seem to be calculating down to seconds value (SS). Am I missing something, or is this a limitation with the MAX function? Here is the code I am using:
SELECT
TMOA.ObjectID AS [ObjID]
, TMOA.ID AS [ObjActionID]
, MAX(TMOA.ActionDate) AS [PrepDate]
FROM
TM_Procedure AS TMPRD
left join TM_ObjectAction AS TMOA ON TMPRD.ID = TMOA.ObjectID
GROUP BY
TMOA.ObjectID
, TMPRD.ID
, TMOA.ID

Looks like you're grouping by the ID of the table which is UNIQUE. More than likely that's why you're getting a record that you don't want. Just select the MAX(ActionDate) and see what you get.
If you get the records you want, then you have to figure out which column you are selecting/grouping by that is causing the records you don't want. My guess is that it's either TMOA.ObjectID or TMOA.ID
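A minimal sketch of that check, with the two rows from the question supplied inline in place of TM_ObjectAction (an assumption made so the snippet runs on its own): once the unique TMOA.ID is left out of the GROUP BY, MAX compares the full datetime values, seconds and milliseconds included.

```sql
SELECT
    TMOA.ObjectID AS [ObjID],
    MAX(TMOA.ActionDate) AS [PrepDate]
FROM (VALUES (12345, 422107, CAST('2016-10-05T11:24:23.790' AS datetime)),
             (12346, 422107, CAST('2016-10-05T11:24:28.797' AS datetime))
     ) AS TMOA(ID, ObjectID, ActionDate)   -- sample rows from the question
GROUP BY TMOA.ObjectID;                    -- no unique ID in the grouping
```

This yields a single row for ObjectID 422107 with the later timestamp, 11:24:28.797. Grouping by TMOA.ID as well puts each row in its own group, so each group's maximum is simply that row's own date.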

One option is to use the window function Row_Number()
Select *
From (
Select *
,RowNr=Row_Number() over (Partition By ObjectID Order by ActionDate Desc)
From YourTable
) A
Where RowNr=1

SQL Select Records ONLY When a Column Value Is In More Than Once

I have a stored procedure in SQL Server in which I am trying to select only the records where a column's value appears more than once. This may seem like an odd request, but I can't figure it out; I have tried using HAVING clauses with no luck.
I want to select only the records whose ACCOUNT appears more than once. For example:
ACCOUNT | PAYDATE
-------------------
B066 | 15
B066 | OUTSTAND
B027 | OUTSTAND <--- **SHOULD NOT BE IN THE SELECT**
B039 | 09
B039 | OUTSTAND
B052 | 09
B052 | 15
B052 | OUTSTAND
B027 should NOT show in my select, and the rest of the ACCOUNTS should.
here is my start and end of the Stored Procedure:
Select * from (
*** SELECTS ARE HERE ***
) X where O_STAND <> 0.0000
group by X.ACCOUNT, X.ACCT_NAME , X.DAYS_CR, X.PAYDATE, X.O_STAND
order by X.ACCOUNT
I have been struggling with this for a while, any help or advice would be appreciated. Thank you in advance.

You could replace the first line with
Select *, COUNT(*) OVER (PARTITION BY ACCOUNT) cnt FROM (
and then wrap your query as a subquery once more:
SELECT cols FROM ( query ) q WHERE cnt>1
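Put together, the two steps might look like the sketch below. It is simplified: a few of the sample rows from the question are supplied inline in place of the stored procedure's inner SELECTs, and the original WHERE and GROUP BY are left out.

```sql
SELECT q.ACCOUNT, q.PAYDATE
FROM (
    SELECT X.ACCOUNT,
           X.PAYDATE,
           COUNT(*) OVER (PARTITION BY X.ACCOUNT) AS cnt  -- rows per account
    FROM (VALUES ('B066', '15'),
                 ('B066', 'OUTSTAND'),
                 ('B027', 'OUTSTAND'),
                 ('B039', '09'),
                 ('B039', 'OUTSTAND')) AS X(ACCOUNT, PAYDATE)  -- stand-in for the inner selects
) AS q
WHERE q.cnt > 1          -- keep only accounts that occur more than once
ORDER BY q.ACCOUNT;
```

With these rows, B027 has cnt = 1 and is filtered out, while both rows for B039 and B066 remain.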

Yes, the HAVING clause is meant for exactly this kind of task. It works like WHERE, but it can filter on the results of aggregate functions as well as on column values:
declare @t table (
    Id int identity(1,1) primary key,
    AccountId varchar(20)
);
insert into @t (AccountId)
values
    ('B001'),
    ('B002'),
    ('B015'),
    ('B015'),
    ('B002');
-- Get all rows for which AccountId value is encountered more than once in the table
select *
from @t t
where exists (
    select 0
    from @t h
    where h.AccountId = t.AccountId
    group by h.AccountId
    having count(h.AccountId) > 1
);