Convert multiple rows to one comma-separated value in Snowflake

I need to convert multiple records/rows in a column to a single comma-separated value in Snowflake.
I was using FOR XML in MS SQL Server for this, but I need to do the same in SnowSQL.
Example - Column-1 with three values A, B, C
Column-1
A
B
C
I need the values concatenated like A,B,C.

Please take a look at Snowflake's LISTAGG function:
https://docs.snowflake.com/en/sql-reference/functions/listagg.html
An example follows:
CREATE OR REPLACE TABLE xyz (str varchar(100));
INSERT INTO xyz (str) VALUES ('A'), ('B'), ('C');
SELECT listagg(str, ',') as my_strings FROM xyz;
--results
MY_STRINGS
A,B,C
I hope this helps...Rich
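Note that without an explicit ordering, the order of the concatenated values is not guaranteed; LISTAGG accepts a WITHIN GROUP clause for that. A minimal variation on the example above:
SELECT listagg(str, ',') WITHIN GROUP (ORDER BY str) AS my_strings FROM xyz;
--returns A,B,C in a deterministic order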

Related

T-SQL: Splitting character values between non-consistent instances of delimiters

There is a string accompanying a value that I need to extract from a column. I can extract the value from most of the rows, but there are a few cases where the value has different properties. This is a simplified example of the problem:
IF OBJECT_ID('TEMPDB..#TABLE') IS NOT NULL
DROP TABLE #TABLE
CREATE TABLE #TABLE(
colSTRING NVARCHAR(MAX) NULL
);
INSERT INTO #TABLE (colSTRING)
VALUES (',SHOULD NOT BE STORED THIS WAY:22.67')
,(',SHOULD NOT BE STORED THIS WAY:46.32')
,(',SHOULD NOT BE STORED THIS WAY:23.45')
,(',SHOULD NOT BE STORED THIS WAY:66.67')
,(',SHOULD NOT BE STORED THIS WAY:22.35,ANOTHER BAD THING:OK')
;
SELECT * FROM #TABLE
Output: the five rows exactly as inserted above.
Notice that there is a number at the end of each string, to the right of the ':'. This is the number I need to extract.
The bottom row, however, shows that there is a second string entry in the same cell. I need to extract 22.35 from this cell while omitting the rest of the string.
This is what I have so far:
SELECT
(RIGHT(colSTRING,CHARINDEX(':',REVERSE(colSTRING))-1)) [STRING NUMBER]
FROM #TABLE
Output:
This works for the other values in the table, but the bottom row does not extract the correct value. It takes the string to the right of the ':' of the second string entry.
Is there some way to use this logic on only the first occurrence of the ':'?
So this is how I solved the problem, thanks to @MartinSmith's idea. I adjusted the example a bit to show how this interacts with a number that has more than two digits before the decimal point (>= 100.00).
IF OBJECT_ID('TEMPDB..#TABLE') IS NOT NULL
DROP TABLE #TABLE
CREATE TABLE #TABLE(
colSTRING NVARCHAR(MAX) NULL
);
INSERT INTO #TABLE (colSTRING)
VALUES (',SHOULD NOT BE STORED THIS WAY:22.67')
,(',SHOULD NOT BE STORED THIS WAY:46.32')
,(',SHOULD NOT BE STORED THIS WAY:23.45')
,(',SHOULD NOT BE STORED THIS WAY:766.67')
,(',SHOULD NOT BE STORED THIS WAY:22.35,ANOTHER BAD THING:OK')
;
SELECT * FROM #TABLE
Solution: in this case, every string entry always starts with a comma, so I can use that fact in a CASE statement. I build one column per case: one for numbers < 100.00 and one for numbers >= 100.00.
SELECT ISNULL(CASE WHEN [2DIGITS] LIKE ',%' THEN NULL ELSE [2DIGITS] END, [3DIGITS]) AS [FIXED]
FROM (
    SELECT
         (RIGHT(colSTRING, CHARINDEX(':', REVERSE(colSTRING)) - 1)) AS [STRING NUMBER]
        ,SUBSTRING(colSTRING, 1 + PATINDEX('%:[0-9][0-9].[0-9][0-9]%', colSTRING), 5) AS [2DIGITS]
        ,SUBSTRING(colSTRING, 1 + PATINDEX('%:[0-9][0-9][0-9].[0-9][0-9]%', colSTRING), 6) AS [3DIGITS]
    FROM #TABLE
) A
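For what it's worth, here is a length-independent alternative (my own sketch, not part of the original answer): take everything after the first ':' and cut it at the next comma, if any, so it works for any number of digits.
SELECT CASE
           WHEN CHARINDEX(':', colSTRING) = 0 THEN NULL --no ':' present at all
           ELSE LEFT(x.rest, COALESCE(NULLIF(CHARINDEX(',', x.rest), 0) - 1, LEN(x.rest)))
       END AS [FIXED]
FROM #TABLE
CROSS APPLY (SELECT SUBSTRING(colSTRING, CHARINDEX(':', colSTRING) + 1, LEN(colSTRING)) AS rest) AS x;
For ',SHOULD NOT BE STORED THIS WAY:22.35,ANOTHER BAD THING:OK' this returns 22.35, and for ':766.67' it returns 766.67, with no per-width CASE branches.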

SSMS - Trying to Bulk insert into a table but decimal field sometimes contains NA

I have tried to create a new table where the column is just varchar(100), but I get the same conversion error. Most of the column consists of decimal numbers, but instead of putting NULL or leaving the field blank when no decimal was found, the company put NA for NULL values.
The file will not bulk insert because there is sometimes a significant number of NAs in the decimal column.
I'm not sure how to get around the problem. My BULK INSERT is below (again, I have tried both varchar(100) and decimal(18,2) for the field, but get the same data conversion error):
Bulk Insert MyExistingTable
From '\\myfile.TXT'
With (
FIELDTERMINATOR = '|',
ROWTERMINATOR = '0x0a',
BATCHSIZE = 10000
)
Once you've succeeded in loading, for example, a two-column CSV (one column is text; the other is a decimal number, but sometimes the literal text 'NA' instead of null/empty), you can move the data like this:
INSERT INTO main(maintextcol, maindeccol)
SELECT temptextcol, NULLIF(tempdeccol, 'NA') from tmp
The conversion from text to decimal is implicit. If you have more columns, add them to the query (I made it two to keep life simple).
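If you would rather make the conversion explicit, TRY_CONVERT (SQL Server 2012+) is a safe variant: anything that still fails to parse becomes NULL instead of raising an error. A minimal sketch of the same insert:
INSERT INTO main (maintextcol, maindeccol)
SELECT temptextcol, TRY_CONVERT(decimal(18,2), NULLIF(tempdeccol, 'NA'))
FROM tmp;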
If you want to avoid duplicates in your main table because some data in tmp is already in main:
INSERT INTO main(maintextcol, maindeccol)
SELECT t.temptextcol, NULLIF(t.tempdeccol, 'NA')
FROM tmp t
LEFT JOIN main m
    ON m.maintextcol = t.temptextcol -- the xxxtextcol columns define the ID in each table
WHERE m.maintextcol IS NULL
The left join produces NULLs in maintextcol where there is no match, so those are exactly the rows we want to insert; the WHERE clause finds them.
Demo of the simple scenario:
create table main(a varchar(100), b decimal(18,2))
GO
create table tmp (a varchar(100), b varchar(100))
GO
insert into tmp values('a', '10.1'),
('b', 'NA')
GO
2 rows affected
insert into main
select a, NULLIF(b, 'NA') from tmp
GO
2 rows affected
select * from main
GO
a | b
--|------
a | 10.10
b | null

Parse JSON Array in T-SQL

In our SQL Server table we have a JSON object stored with an array of strings. I want to programmatically split that string into several columns, but I cannot seem to get it to work, or even tell whether it's possible.
Is it possible to create multiple columns within the WITH clause, or is it smarter to do this within the SELECT statement?
I trimmed down some of the code to give a simple idea of what's given.
The example JSON is similar to { "arr": ["str1 - str2"] }
SELECT b.* FROM [table] a
OUTER APPLY
OPENJSON(a.value, '$.arr')
WITH
(
strSplit1 VARCHAR(100) SPLIT('$.arr', '-',1),
strSplit2 VARCHAR(100) SPLIT('$.arr', '-',2)
) b
Due to the [tsql] tag and the usage of OPENJSON, I assume this is SQL Server, but I might be wrong... Please always specify your RDBMS (with version).
Your JSON is rather weird... I think you've overdone it while trying to simplify this for brevity...
Try this:
DECLARE @tbl TABLE(ID INT IDENTITY, YourJSON NVARCHAR(MAX));
INSERT INTO @tbl VALUES(N'{ "arr": ["str1 - str2"] }') --weird example...
                      ,(N'{ "arr": ["a","b","c"] }'); --array with three elements

SELECT t.ID
      ,B.[value] AS arr
FROM @tbl t
CROSS APPLY OPENJSON(YourJSON)
            WITH(arr NVARCHAR(MAX) AS JSON) A
CROSS APPLY OPENJSON(A.arr) B;
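Given the two rows inserted above, the result should look like this:
ID | arr
1  | str1 - str2
2  | a
2  | b
2  | c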
A shorter approach (which fits this simple example only) is this:
SELECT t.ID
,A.*
FROM @tbl t
OUTER APPLY OPENJSON(JSON_QUERY(YourJSON,'$.arr')) A
Hint
JSON support was introduced with SQL Server 2016.
UPDATE: if the JSON's content is a weird CSV-string...
There's a trick to transform a CSV into a JSON array. Try this:
DECLARE @tbl TABLE(ID INT IDENTITY, YourJSON NVARCHAR(MAX));
INSERT INTO @tbl VALUES(N'{ "arr": ["str1 - str2"] }') --weird example...
                      ,(N'{ "arr": ["a","b","c"] }') --array with three elements
                      ,(N'{ "arr": ["x-y-z"] }'); --array with three elements in a weird CSV format

SELECT t.ID
      ,B.[value] AS arr
      ,C.[value]
FROM @tbl t
CROSS APPLY OPENJSON(YourJSON)
            WITH(arr NVARCHAR(MAX) AS JSON) A
CROSS APPLY OPENJSON(A.arr) B
CROSS APPLY OPENJSON('["' + REPLACE(B.[value],'-','","') + '"]') C;
The REPLACE in OPENJSON('["' + REPLACE(B.[value],'-','","') + '"]') creates a JSON array out of your CSV-string, which can then be opened with OPENJSON.
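To see the trick in isolation (a standalone sketch; the sample value is assumed):
SELECT [value]
FROM OPENJSON('["' + REPLACE('x-y-z', '-', '","') + '"]');
--REPLACE turns 'x-y-z' into '["x","y","z"]'; OPENJSON then returns one row per element: x, y, z.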
I'm not aware of any way to split a string within JSON. I wonder if the issue is down to your JSON containing a single string rather than multiple values?
The below example shows how to extract each string from the array; and if you wish to go further and split those strings on the hyphen, shows how to do that using SQL's normal SUBSTRING and CHARINDEX functions.
create table [table]
(
value nvarchar(max)
)
insert [table](value)
values ('{ "arr": ["str1 - str2"] }'), ('{ "arr": ["1234 - 5678","abc - def"] }')
SELECT b.value
, rtrim(substring(b.value,1,charindex('-',b.value)-1))
, ltrim(substring(b.value,charindex('-',b.value)+1,len(b.value)))
FROM [table] a
OUTER APPLY OPENJSON(a.value, '$.arr') b
If you want all values in a single column, you can use the string_split function: https://learn.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-2017
SELECT ltrim(rtrim(c.value))
FROM [table] a
OUTER APPLY OPENJSON(a.value, '$.arr') b
OUTER APPLY STRING_SPLIT(b.value, '-') c

Postgres: How to return an integer array from stored function

I want to create a query using something like the following:
select id, array(id_adj(id)) from existingtable
which would be two columns: 1 with the id, and the 2nd column with an array of integers.
The function id_adj returns a set of rows (a single column of integers) and is written as follows:
DROP FUNCTION IF EXISTS id_adj(hz_id int);
CREATE FUNCTION id_adj(id int) returns SETOF int AS $$
select b.id
from existingtable a, existingtable b
where a.id != b.id
and a.id=$1
and ST_Distance(a.wkb_geometry, b.wkb_geometry) <= 0.05
$$ LANGUAGE SQL;
The above function works for a single id. For example:
select id_adj(462);
returns a single column with integer values.
I know that the array() function returns an array of values given a query result from a SELECT statement. For example:
select array(select id from existingtable where id<10);
returns an array "{6,5,8,9,7,3,4,1,2}".
But combining the two does not seem to work. Note that although I'm using the PostGIS ST_Distance function above, it is not required in order to test a solution to my problem.
I'm also open to having the function return an array instead of a setof records, but that seemed more complicated at first.
You are missing a SELECT statement:
select
id,
array(select id_adj(id))
from existingtable
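An equivalent formulation (my sketch, assuming the same tables) uses array_agg over a LATERAL join. Note that CROSS JOIN LATERAL drops ids with no neighbors; use LEFT JOIN LATERAL ... ON true if you want those rows kept with a NULL array.
SELECT e.id, array_agg(a.adj_id) AS adj_ids
FROM existingtable e
CROSS JOIN LATERAL id_adj(e.id) AS a(adj_id)
GROUP BY e.id;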

Comparing a column to a list of values in T-SQL

I am displaying records on a page, and I need a way for the user to select a subset of those records to be displayed on another page. These records aren't stored anywhere; they are dynamically generated.
What is the best way, in SQL, to say WHERE a unique id is in a given list of ids that isn't in a table, etc.? I know I could dynamically construct the SQL with a bunch of ORs, but that seems like a hack. Does anyone have any suggestions?
This is the best source:
http://www.sommarskog.se/arrays-in-sql.html
Create a split function, and use it like:
SELECT *
FROM YourTable y
INNER JOIN dbo.splitFunction(@Parameter) s ON y.ID = s.Value
I prefer the numbers table approach.
For this method to work, you need to do this one-time table setup:
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO Numbers
FROM sys.objects s1
CROSS JOIN sys.objects s2
ALTER TABLE Numbers ADD CONSTRAINT PK_Numbers PRIMARY KEY CLUSTERED (Number)
Once the Numbers table is set up, create this function:
CREATE FUNCTION [dbo].[FN_ListToTable]
(
     @SplitOn char(1)    --REQUIRED, the character to split the @List string on
    ,@List varchar(8000) --REQUIRED, the list to split apart
)
RETURNS TABLE
AS
RETURN
(
    ----------------
    --SINGLE QUERY-- --this will not return empty rows
    ----------------
    SELECT
        ListValue
    FROM (SELECT
              LTRIM(RTRIM(SUBSTRING(List2, number + 1, CHARINDEX(@SplitOn, List2, number + 1) - number - 1))) AS ListValue
          FROM (
                SELECT @SplitOn + @List + @SplitOn AS List2
               ) AS dt
              INNER JOIN Numbers n ON n.Number < LEN(dt.List2)
          WHERE SUBSTRING(List2, number, 1) = @SplitOn
         ) dt2
    WHERE ListValue IS NOT NULL AND ListValue != ''
);
GO
You can now easily split a CSV string into a table and join on it:
select * from dbo.FN_ListToTable(',','1,2,3,,,4,5,6777,,,')
OUTPUT:
ListValue
-----------------------
1
2
3
4
5
6777
(6 row(s) affected)
You can pass a CSV string into a procedure and process only the rows for the given IDs:
SELECT y.*
FROM YourTable y
INNER JOIN dbo.FN_ListToTable(',', @GivenCSV) s ON y.ID = s.ListValue
You can use the solution Joel Spolsky recently gave for this problem.
SELECT * FROM MyTable
WHERE ',' + 'comma,separated,list,of,words' + ','
LIKE '%,' + MyTable.word + ',%';
That solution is clever but slow. The better solution is to split the comma-separated string, and construct a dynamic SQL query with the IN() predicate, adding a query parameter placeholder for each element in your list of values:
SELECT * FROM MyTable
WHERE word IN ( ?, ?, ?, ?, ?, ?, ?, ? );
The number of placeholders is what you have to determine when you split your comma-separated string. Then pass one value from that list per parameter.
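With SQL Server you can also do this server-side with sp_executesql, binding each list element to its own parameter (a sketch with three hypothetical values):
DECLARE @sql nvarchar(max) =
    N'SELECT * FROM MyTable WHERE word IN (@p0, @p1, @p2)';
EXEC sp_executesql @sql,
    N'@p0 varchar(20), @p1 varchar(20), @p2 varchar(20)',
    @p0 = 'alpha', @p1 = 'beta', @p2 = 'gamma';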
If you have too many values in the list and a long IN() predicate becomes unwieldy, then insert the values into a temporary table and JOIN against your main table:
CREATE TEMPORARY TABLE TempTableForSplitValues (word VARCHAR(20));
...split your comma-separated list and INSERT each value to a separate row...
SELECT * FROM MyTable JOIN TempTableForSplitValues USING (word);
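Note that CREATE TEMPORARY TABLE and JOIN ... USING are MySQL/PostgreSQL syntax; a rough T-SQL equivalent (a sketch, table name hypothetical) would be:
CREATE TABLE #SplitValues (word varchar(20));
-- ...split your comma-separated list and INSERT each value as a separate row...
SELECT m.*
FROM MyTable m
JOIN #SplitValues s ON s.word = m.word;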
Also see many other similar questions on SO, including:
Dynamic SQL Comma Delimited Value Query
Passing a varchar full of comma delimited values to a SQL Server IN function
Parameterized Queries with Like and In
Parameterizing a SQL IN clause?
