How to increase the variable size limit in Snowflake? - snowflake-cloud-data-platform

I'm trying to set a variable by executing e.g.
SET Variable_1 = 'xxxx'
, but am getting this error:
"Assignment to 'Variable_1' not done because value exceeds size limit for variables. Its size is 309; the limit is 256 (internal storage size in bytes)."

According to the documentation, that's the limit:
https://docs.snowflake.com/en/sql-reference/session-variables.html#initializing-variables
That said, depending on what you are using them for, you could potentially use multiple variables (and concatenate them when you use them) or store the values in a temporary table (and look them up when you need them).
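A minimal sketch of the multiple-variables approach (the variable names and values here are hypothetical):

```sql
-- Each piece stays under the 256-byte session-variable limit
SET part_1 = 'first half of the long value ...';
SET part_2 = 'second half of the long value ...';

-- Concatenate the pieces at the point of use
SELECT $part_1 || $part_2 AS full_value;
```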

An alternative is a variable declared inside a Snowflake Scripting block, which is not subject to the 256-byte session-variable limit. For comparison, assigning a long literal to a session variable fails:
CREATE OR REPLACE TABLE t(s STRING);
SET long_variable = '
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
';
Assignment to 'LONG_VARIABLE' not done because value exceeds size limit for variables. Its size is 288; the limit is 256 (internal storage size in bytes)
The same string literal succeeds as a Snowflake Scripting variable:
DECLARE
LONG_VARIABLE STRING DEFAULT '
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
1234567890123456789012345678901234567890
';
SQL STRING DEFAULT 'INSERT INTO t(s) VALUES(?)';
BEGIN
EXECUTE IMMEDIATE :SQL USING (LONG_VARIABLE);
RETURN :SQL;
END;
The INSERT succeeds; verify the stored value with:
SELECT LENGTH(s), s FROM t;

How to define an array variable in snowflake worksheet?
set columns = (SELECT array_agg(COLUMN_NAME) FROM INFORMATION_SCHEMA.COLUMNS
where table_name='MEMBERS');
I get this error:
Unsupported feature 'assignment from non-constant source expression'.
2022 update:
Now it's possible with Snowflake Scripting:
declare
tmp_array ARRAY default ARRAY_CONSTRUCT();
rs_output RESULTSET;
begin
for i in 1 to 20 do
tmp_array := array_append(:tmp_array, OBJECT_CONSTRUCT('c1', 'a', 'c2', i));
end for;
rs_output := (select value:c1, value:c2 from table(flatten(:tmp_array)));
return table(rs_output);
end;
https://stackoverflow.com/a/71231621/132438
Previously:
Instead of storing an array, aggregate into a comma-separated string:
set x = (SELECT listagg(COLUMN_NAME, ',') FROM INFORMATION_SCHEMA.COLUMNS WHERE COLUMN_NAME LIKE 'TABLE_S%');
However: "The size of string or binary variables is limited to 256 bytes" according to https://docs.snowflake.com/en/sql-reference/session-variables.html.
This means that even if you could store an array in a variable, it would likely exceed that limit. Instead, store the result in a temporary table and query it when needed.
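A minimal sketch of the temporary-table approach (table and column names are illustrative):

```sql
-- Store the column names in a temporary table instead of a variable
CREATE TEMPORARY TABLE member_columns AS
SELECT COLUMN_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'MEMBERS';

-- Query it wherever the list is needed
SELECT COLUMN_NAME FROM member_columns;
```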
I found that if you wrap the result in to_json, the assignment works:
set x = (SELECT to_json(array_agg(COLUMN_NAME)) FROM INFORMATION_SCHEMA.COLUMNS
where table_name='CUSTOMER');
select $x;

Are there limitations on Snowflake's split_part function?

Does Snowflake's split_part function have a limit on how large the string, or the individual delimited parts of the string, can be? For example, in SQL Server, if any part of the string exceeds 256 bytes, the parsename function will return null for that part.
I looked here, but couldn't find any mention of such limitation
To show that there is no limit anywhere near 256 bytes, I generated a ~3 MB string containing three substrings. split_part() extracted a 1.1 MB substring without a problem:
create table LONG_STRING
as
select repeat('abcdefghijk', 100000)||','
||repeat('abcdefghijk', 100000)||','
||repeat('abcdefghijk', 100000) ls
;
select len(ls)
, len(split_part(ls, ',', 2))
from LONG_STRING
-- 3,300,002  1,100,000

SQL Server: Handle of string concatenation

Given an nvarchar(max) variable, the input is 'aaaaa...' with length 16,000.
Assigning that value directly to the variable works fine.
But if I break the input into three smaller strings (lengths 7964, 4594, and 3442), the concatenation assigned to the variable is truncated.
On the other hand, if at least one of the pieces is longer than 8,000 characters, the concatenation works without an issue.
Is there any documentation regarding the mentioned behavior?
Taken from the docs:
If the result of the concatenation of strings exceeds the limit of
8,000 bytes, the result is truncated. However, if at least one of the
strings concatenated is a large value type, truncation does not occur.
Operations between varchar and nvarchar are limited to 8,000 and 4,000 characters respectively, unless one of the involved operands is a MAX type. Be very careful with the order of operations; this example from the docs is instructive:
DECLARE @x varchar(8000) = replicate('x', 8000)
DECLARE @y varchar(max) = replicate('y', 8000)
DECLARE @z varchar(8000) = replicate('z', 8000)
SET @y = @x + @z + @y
-- The result of the following select is 16000
SELECT len(@y) AS y
The result is 16,000 and not 24,000 because the first operation, @x + @z, is truncated at 8,000 characters since neither operand is a MAX type. That truncated result is then concatenated to a MAX-typed value, which lifts the 8,000-character restriction and appends the 8,000 characters from @y. The characters from @z are lost in the first concatenation.
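Following the rule quoted above, starting the expression with the MAX-typed variable keeps every intermediate result MAX-typed, so nothing should be truncated; a sketch under the same setup (not tested against a live server):

```sql
DECLARE @x varchar(8000) = replicate('x', 8000)
DECLARE @y varchar(max)  = replicate('y', 8000)
DECLARE @z varchar(8000) = replicate('z', 8000)
-- @y + @x is already varchar(max), so the second concatenation is MAX-typed too
SET @y = @y + @x + @z
SELECT len(@y) AS y  -- 24000
```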
If you're using the CONCAT function, note this from the documentation:
If none of the input arguments has a supported large object (LOB)
type, then the return type truncates to 8000 characters in length,
regardless of the return type. This truncation preserves space and
supports plan generation efficiency.
Try:
CONCAT(CAST('' AS varchar(max)), @var1, @var2)
or:
CAST(@var1 AS varchar(max)) + @var2

SAS / Using an array to convert multiple character variables to numeric

I am a SAS novice. I am trying to convert character variables to numeric. The code below works for one variable, but I need to convert more than 50 variables, hopefully simultaneously. Would an array solve this problem? If so, how would I write the syntax?
DATA conversion_subset;
SET have;
new_var = input(oldvar, 4.);
drop oldvar;
rename new_var = oldvar;
RUN;
@Reeza
DATA conversion_subset;
SET have;
Array old_var(*) $ a_20040102--a_20040303 a_302000--a_302202;
* The first list contains 8 variables. The second list contains 7 variables;
Array new_var(15) var1-var15;
Do i=1 to dim(old_var);
new_var(i) = input(old_var(i),4.);
End;
*drop a_20040102--a_20040303 a_302000--a_302202;
*rename var1-var15 = a_20040102--a_20040303 a_302000--a_302202;
RUN;
NOTE: Invalid argument to function INPUT at line 64 column 19
(new_var(i) = input(old_var(i),4.)
@Reeza
I am still stuck on this array. Your help would be greatly appreciated. My code:
DATA conversion_subset (DROP= _20040101 _20040201 _20040301);
SET replace_nulls;
Array _char(*) $ _200100--_601600;
Array _num(*) var1-var90;
Do i=1 to dim(_char);
_num(i) = input(_char(i),4.);
End;
RUN;
I am receiving the following error: ERROR: Array subscript out of range at line 64 column 6. Line 64 contains the input statement.
Yes, an array solves this. You will want a simple way to list the variables, so look into SAS variable lists. For example, if you're converting all character variables between the first and the last, you can list them as first_var-character-last_var.
The rename/drop are illustrated in other questions across SO.
DATA conversion_subset;
SET have;
Array old_var(50) $ first_var-character-last_var;
Array new_var(50) var1-var50;
Do i=1 to 50;
new_var(i) = input(old_var(i),4.);
End;
RUN;
As @Parfait suggests, it would be best to adjust it when you are getting it, rather than after it is already in a SAS data set. However, if you're given the data set and have to convert that, that's what you have to do. You can add a WHERE clause to the PROC SQL to exclude variables that should not be converted. If you do so, they won't be in the final data set unless you add them in the CREATE TABLE's SELECT clause.
PROC CONTENTS DATA=have OUT=havelist NOPRINT ;
RUN ; %* get variable names ;
PROC SQL ;
SELECT 'INPUT(' || name || ',4.) AS ' || name
INTO :convert SEPARATED BY ','
FROM havelist
; %* create the select statement ;
CREATE TABLE conversion_subset AS
SELECT &convert
FROM have
;
QUIT ;
If excluding variables is an issue and/or you want to use a DATA step, then use the PROC CONTENTS above and follow with:
PROC SQL ;
SELECT COMPRESS(name || '_n=INPUT(' || name || ',4.)'),
COMPRESS(name || '_n=' || name),
COMPRESS(name)
INTO :convertlst SEPARATED BY ';',
:renamelst SEPARATED BY ' ',
:droplst SEPARATED BY ' '
FROM havelist
;
QUIT ;
DATA conversion_subset (RENAME=(&renamelst)) ;
SET have ;
&convertlst ;
DROP &droplst ;
RUN ;
Again, add a where clause to exclude variables that should not be converted. This will automatically preserve any variables that you exclude from conversion with a WHERE in the PROC SQL SELECT.
If you have too many variables, or their names are very long, or adding _n to the end causes a name collision, things can go badly (too much data for a macro variable, illegal field name, one field overwriting another, respectively).

What exactly is the meaning of nvarchar(n)

The documentation isn't super clear: https://msdn.microsoft.com/en-us/library/ms186939.aspx
What happens if I try to store a 20 character length string in a column defined as nvarchar(10)? Is 10 the max length the field could be or is it the expected length? If I can exceed n characters in the string, what are the performance implications of doing that?
The maximum number of characters you can store in a column or variable typed as nvarchar(n) is n. If you try to store more, the string is truncated on assignment to a variable; an insert into a table is rejected with a truncation error:
String or binary data would be truncated. The statement has been
terminated.
declare @n nvarchar(10)
set @n = N'more than ten chars'
select @n
Result:
----------
more than
(1 row(s) affected)
From my understanding, nvarchar only stores the characters provided, up to the defined maximum. nchar pads the unused positions with whitespace.
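A small sketch illustrating the difference with DATALENGTH (which returns bytes; nvarchar and nchar use 2 bytes per character):

```sql
DECLARE @v nvarchar(10) = N'abc'
DECLARE @c nchar(10)    = N'abc'

-- nvarchar stores only the 3 characters supplied; nchar pads out to 10
SELECT DATALENGTH(@v) AS v_bytes,  -- 6
       DATALENGTH(@c) AS c_bytes   -- 20
```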
