I have a text string containing JSON, something like this:
'{ "d" : [ "test0", "test1", "test2" ] }'
and I would like to retrieve the items of the array as rows:
+------------+
| data |
+------------+
| test0 |
| test1 |
| test2 |
+------------+
All the examples on the web show how it is done with an "Object Array", but I would like to do it with a simple "String Array", as in the MS example.
The default query
select * from OPENJSON('{"d":["test0","test1","test2"]}', '$.d')
just returns a table with the key, value, and type of each entry (type 1 = string):
+-----+-------+------+
| key | value | type |
+-----+-------+------+
| 0 | test0 | 1 |
| 1 | test1 | 1 |
| 2 | test2 | 1 |
+-----+-------+------+
The problem is that I don't know how to write the WITH clause so that the query returns the values as rows.
select * from OPENJSON('{"d":["test0","test1","test2"]}', '$.d')
with(data nvarchar(255) '$.d')
only returns:
+------+
| data |
+------+
| NULL |
| NULL |
| NULL |
+------+
The fix is to point the column path at the array element itself. Inside the WITH clause, each path is evaluated against the current array element, which is why '$.d' returned NULL (a string element has no property d). Using '$' maps the element itself:
select * from OPENJSON('{"d":["test0","test1","test2"]}', '$.d')
with(data nvarchar(255) '$')
This returns the desired result:
+------------+
| data       |
+------------+
| test0      |
| test1      |
| test2      |
+------------+
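If you don't need a typed column at all, you can also skip the WITH clause and just alias OPENJSON's default value column — a minimal equivalent:
-- "value" is one of OPENJSON's default output columns
select [value] as data
from OPENJSON('{"d":["test0","test1","test2"]}', '$.d')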
I am trying to change the type of one of the columns in my table from an array of strings to JSON.
The SQL I'm trying to execute looks like:
ALTER TABLE my_table
ALTER COLUMN my_column TYPE JSON USING my_column::json
But I get an error back saying "cannot cast type character varying[] to json".
The column I'm trying to change is empty; there are no rows, so there is no data that needs to be cast to JSON. Since it's empty, I've thought of dropping the column and remaking it, but I'd like to keep the column and just change its type if possible. I'm not a whizz with PostgreSQL, so any nudge in the right direction would be appreciated.
There is no direct cast from character varying[] to json, which is why the ALTER fails. Instead, build the JSON value with array_to_json() in the USING clause. Given a test table:
\d array_test
Table "public.array_test"
Column | Type | Collation | Nullable | Default
---------------+---------------------+-----------+----------+---------
id | integer | | |
array_fld | integer[] | | |
numeric_array | numeric[] | | |
jsonb_array | jsonb[] | | |
varchar_array | character varying[] | | |
text_array | text[] | | |
Now convert the column:
ALTER TABLE array_test
ALTER COLUMN varchar_array TYPE json
USING array_to_json(varchar_array);
\d array_test
Table "public.array_test"
Column | Type | Collation | Nullable | Default
---------------+-----------+-----------+----------+---------
id | integer | | |
array_fld | integer[] | | |
numeric_array | numeric[] | | |
jsonb_array | jsonb[] | | |
varchar_array | json | | |
text_array | text[] | | |
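If you would rather end up with jsonb (generally preferable for querying and indexing), the same approach works with to_jsonb() — a sketch of the alternative, assuming PostgreSQL 9.5+ where to_jsonb() is available:
-- to_jsonb() converts the varchar array directly to a jsonb array
ALTER TABLE array_test
    ALTER COLUMN varchar_array TYPE jsonb
    USING to_jsonb(varchar_array);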
Is there an out-of-the-box method for Snowflake to use values from a column as the filename when using COPY INTO @mystage? The goal is to copy X number of files into an S3 stage (essentially PARTITION BY column1), but straight into the stage, not creating subfolders; X would be the number of distinct values in the column.
This can obviously be done manually:
copy into @mystage/mycustomfilename
However, the better option would be something like this:
copy into @mystage/$column1
Is there a version of this that Snowflake supports?
As mentioned above, the PARTITION BY option splits the data into subfolders named after the values in the specified column, but Snowflake still uses a generic filename within each subfolder.
Created structure -
create temporary table temp_tab_split_members(seq_id number, member_id number, name varchar2(30));
+----------------------------------------------------+
| status |
|----------------------------------------------------|
| Table TEMP_TAB_SPLIT_MEMBERS successfully created. |
+----------------------------------------------------+
Fake data -
insert into temp_tab_split_members
with cte as
(select seq4(),(trim(mod(seq4(),4))+1)::integer,'my name-'||seq4() from table(generator(rowcount=>12)))
select * from cte;
+-------------------------+
| number of rows inserted |
|-------------------------|
| 12 |
+-------------------------+
Checking data format -
select * from TEMP_TAB_SPLIT_MEMBERS order by member_id;
+--------+-----------+------------+
| SEQ_ID | MEMBER_ID | NAME |
|--------+-----------+------------|
| 0 | 1 | my name-0 |
| 4 | 1 | my name-4 |
| 8 | 1 | my name-8 |
| 1 | 2 | my name-1 |
| 5 | 2 | my name-5 |
| 9 | 2 | my name-9 |
| 2 | 3 | my name-2 |
| 6 | 3 | my name-6 |
| 10 | 3 | my name-10 |
| 3 | 4 | my name-3 |
| 7 | 4 | my name-7 |
| 11 | 4 | my name-11 |
+--------+-----------+------------+
Checked stage is empty
list @test_row_stage;
+------+------+-----+---------------+
| name | size | md5 | last_modified |
|------+------+-----+---------------|
+------+------+-----+---------------+
Main procedure to generate files
EXECUTE IMMEDIATE $$
DECLARE
company varchar2(30);
BU varchar2(30);
eval_desc varchar2(30);
member_id varchar2(30);
file_name varchar2(30);
c1 CURSOR FOR SELECT distinct member_id FROM temp_tab_split_members;
BEGIN
for record in c1 do
member_id:=record.member_id;
file_name:='load'||'_'||member_id||'.csv';
execute immediate 'copy into @test_row_stage/'||:file_name||' from
(select * from temp_tab_split_members where member_id='||:member_id||') overwrite=false';
end for;
RETURN 0;
END;
$$
;
+-----------------+
| anonymous block |
|-----------------|
| 0 |
+-----------------+
Check stage contents after procedure execution
list @test_row_stage; -- output truncated columnwise
+----------------------------------------+------+
| name                                   | size |
|----------------------------------------+------|
| test_row_stage/load_1.csv_0_0_0.csv.gz | 48   |
| test_row_stage/load_2.csv_0_0_0.csv.gz | 48   |
| test_row_stage/load_3.csv_0_0_0.csv.gz | 48   |
| test_row_stage/load_4.csv_0_0_0.csv.gz | 48   |
+----------------------------------------+------+
File contents cross-check
select $1,$2,$3 from @test_row_stage/load_1.csv_0_0_0.csv.gz union
select $1,$2,$3 from @test_row_stage/load_2.csv_0_0_0.csv.gz union
select $1,$2,$3 from @test_row_stage/load_3.csv_0_0_0.csv.gz union
select $1,$2,$3 from @test_row_stage/load_4.csv_0_0_0.csv.gz;
+----+----+------------+
| $1 | $2 | $3 |
|----+----+------------|
| 0 | 1 | my name-0 |
| 4 | 1 | my name-4 |
| 8 | 1 | my name-8 |
| 1 | 2 | my name-1 |
| 5 | 2 | my name-5 |
| 9 | 2 | my name-9 |
| 2 | 3 | my name-2 |
| 6 | 3 | my name-6 |
| 10 | 3 | my name-10 |
| 3 | 4 | my name-3 |
| 7 | 4 | my name-7 |
| 11 | 4 | my name-11 |
+----+----+------------+
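Note that Snowflake appends a file-number suffix (_0_0_0) to each unloaded file. If you need the file to be named exactly load_1.csv, the SINGLE = TRUE copy option writes one file under the name given verbatim — a sketch, assuming each slice fits within MAX_FILE_SIZE (compression is disabled here so the contents match the .csv name):
-- single=true produces one file named exactly load_1.csv
copy into @test_row_stage/load_1.csv from
(select * from temp_tab_split_members where member_id=1)
file_format=(type=csv compression=none) single=true overwrite=false;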
There is no out-of-the-box way to do this as far as I understand, but you can write custom code that fetches the column values and uses them to name the files as they are copied to the stage/S3.
Please refer below for something similar -
EXECUTE IMMEDIATE $$
DECLARE
company varchar2(30);
BU varchar2(30);
eval_desc varchar2(30);
member_id varchar2(30);
file_name varchar2(30);
c1 CURSOR FOR SELECT * FROM test_pivot;
BEGIN
for record in c1 do
company:=record.company;
BU:=record.BU;
eval_desc:=record.eval_desc;
member_id:=record.member_id;
file_name:='load'||'_'||member_id||'.csv';
create or replace temporary table temp_test_pvt(company varchar2(30),BU varchar2(30),eval_desc varchar2(30),member_id varchar2(30));
insert into temp_test_pvt values (:company,:bu,:eval_desc,:member_id);
execute immediate 'copy into @test_row_stage/'||:file_name||' from (select * from temp_test_pvt) overwrite=false';
end for;
RETURN 0;
END;
$$
;
Also, refer to a similar post here -
Copy JSON data from Snowflake into S3
I have copied some JSON files into Snowflake from a stage, and one property name contains a hyphen.
When I try to query for this property name (as shown below), I get this error.
select my_variant:test-id from mytable;
SQL compilation error: error line 1 at position 44 invalid identifier 'ID'.
I assume it doesn't like the hyphen. Is there any way I can rename this hyphenated name in my variant column so I don't get the error?
You just need to quote the field name in the variant path:
select my_variant:"test-id" from mytable;
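Note that variant field names are matched case-sensitively, so the quoted name must match the JSON key exactly. The result can be cast as usual — a small sketch, assuming test-id holds a number:
-- quoting handles the hyphen; ::number casts the variant value
select my_variant:"test-id"::number as test_id from mytable;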
If you want to update the stored data, see below. This assumes you have a key per row, so the flattened key/value pairs can be aggregated back to rebuild the variant at the row level.
Setup test table:
create or replace table test (k int, a variant);
insert into test
select 1, parse_json('{"test-id": 1, "test-id2": "2"}')
union all
select 2, parse_json('{"test-1": 1, "test-2": "2"}');
select * from test;
+---+-------------------+
| K | A                 |
|---+-------------------|
| 1 | {                 |
|   |   "test-id": 1,   |
|   |   "test-id2": "2" |
|   | }                 |
| 2 | {                 |
|   |   "test-1": 1,    |
|   |   "test-2": "2"   |
|   | }                 |
+---+-------------------+
Update the table:
update test t
set t.a = b.value
from (
with t as (
select
k,
replace(f.key, '-', '_') as key,
f.value as value
from test,
lateral flatten(a) f
)
select
k, object_agg(key, value) as value
from t
group by k
) b
where t.k = b.k
;
select * from test;
+---+-------------------+
| K | A |
|---+-------------------|
| 1 | { |
| | "test_id": 1, |
| | "test_id2": "2" |
| | } |
| 2 | { |
| | "test_1": 1, |
| | "test_2": "2" |
| | } |
+---+-------------------+
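After the update, the hyphen-free keys can be referenced without quoting, e.g.:
-- returns 1, since "test-id" was rewritten to "test_id"
select a:test_id from test where k = 1;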
I've imported data from an XML file into SQL Server using SSIS.
The result I got in the database is similar to this:
+-------+---------+---------+-------+
| ID | Name | Brand | Price |
+-------+---------+---------+-------+
| 2 | NULL | NULL | 100 |
| NULL | SLX | NULL | NULL |
| NULL | NULL | Blah | NULL |
| NULL | NULL | NULL | 100 |
+-------+---------+---------+-------+
My desired result would be:
+-------+---------+---------+-------+
| ID | Name | Brand | Price |
+-------+---------+---------+-------+
| 2 | SLX | Blah | 100 |
+-------+---------+---------+-------+
Is there a clean way to solve this in T-SQL?
I've already tried SELECT MAX(ID) with GROUP BY ID, but I'm still stuck with the NULL values. I've also tried MERGE, but that failed too.
Could someone give me a direction where to search further?
You can select MAX on all the columns:
SELECT MAX(ID) AS ID, MAX(Name) AS Name, MAX(Brand) AS Brand, MAX(Price) AS Price
FROM [TABLE]
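This works here because the whole table collapses into a single record. If the table held several entities, you would group on a key column instead — a sketch with a hypothetical EntityID column:
-- EntityID is hypothetical; substitute your actual key column
SELECT EntityID, MAX(Name) AS Name, MAX(Brand) AS Brand, MAX(Price) AS Price
FROM [TABLE]
GROUP BY EntityID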
Say you get a recordset like the following:
| ID | Foo | Bar | Red |
|-----|------|------|------|
| 1 | 100 | NULL | NULL |
| 1 | NULL | 200 | NULL |
| 1 | NULL | NULL | 300 |
| 2 | 400 | NULL | NULL |
| ... | ... | ... | ... | -- etc.
And you want:
| ID | Foo | Bar | Red |
|-----|-----|-----|-----|
| 1 | 100 | 200 | 300 |
| 2 | 400 | ... | ... |
| ... | ... | ... | ... | -- etc.
You could use something like:
SELECT
ID,
MAX(Foo) AS Foo,
MAX(Bar) AS Bar,
MAX(Red) AS Red
FROM foobarred
GROUP BY ID
Now, how might you accomplish similar when Foo, Bar, and Red are VARCHAR?
| ID | Foo | Bar | Red |
|-----|----------|---------|---------|
| 1 | 'Text1' | NULL | NULL |
| 1 | NULL | 'Text2' | NULL |
| 1 | NULL | NULL | 'Text3' |
| 2   | 'Text4'  | NULL    | NULL    |
| ... | ... | ... | ... | -- etc.
To:
| ID | Foo | Bar | Red |
|-----|----------|---------|---------|
| 1 | 'Text1' | 'Text2' | 'Text3' |
| 2 | 'Text4' | ... | ... |
| ... | ... | ... | ... | -- etc.
I'm currently working primarily with SQL Server 2000, but have access to 2005 servers.
The query you had above works just as well for VARCHAR fields as it did for INT fields. The caveat is that if two rows share the same ID and both have a value in the Foo column, only the highest value (by numeric or string sort order, for INT and VARCHAR respectively) will be returned.
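A minimal sketch of that caveat, using a hypothetical #demo temp table — the lower-sorting value is silently discarded:
-- #demo is a hypothetical example table
CREATE TABLE #demo (ID int, Foo varchar(10))
INSERT INTO #demo (ID, Foo) VALUES (1, 'Apple')
INSERT INTO #demo (ID, Foo) VALUES (1, 'Zebra')
SELECT ID, MAX(Foo) AS Foo
FROM #demo
GROUP BY ID
-- returns a single row: 1, 'Zebra' ('Apple' is lost)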
I don't have access to a SQL Server 2000 box at the moment, but SELECT MAX(column) will work on nvarchars in 2005. The only problem is if you have multiple text values under a column for the same id in your original table...
CREATE TABLE Flatten (
id int not null,
foo Nvarchar(10) null,
bar Nvarchar(10) null,
red Nvarchar(10) null)
INSERT INTO Flatten (ID, foo, bar, red) VALUES (1, 'Text1', null, null)
INSERT INTO Flatten (ID, foo, bar, red) VALUES (1, null, 'Text2', null)
INSERT INTO Flatten (ID, foo, bar, red) VALUES (1, null, null, 'Text3')
INSERT INTO Flatten (ID, foo, bar, red) VALUES (2, 'Text4', null, null)
SELECT
    ID,
    max(foo) AS Foo,
    max(bar) AS Bar,
    max(red) AS Red
FROM
    Flatten
GROUP BY ID
returns
ID Foo Bar Red
----------- ---------- ---------- ----------
1 Text1 Text2 Text3
2 Text4 NULL NULL