How to update a table based on a JSON map?

If I have this table
CREATE TABLE tmp (
    a integer,
    b integer,
    c text
);
INSERT INTO tmp (a, b, c) VALUES (1, 2, 'foo');
And this json:
{
    "a": 4,
    "c": "bar"
}
Where the keys map to the column names, and the values are the new values.
How can I update the tmp table without touching columns that aren't in the map?
I thought about constructing a dynamic SQL UPDATE statement string to execute in PL/pgSQL, but it seems the number of arguments passed to USING must be predetermined. The actual number of arguments is determined by the number of keys in the map, which is dynamic, so this seems like a dead end.
I know I can update the table with multiple UPDATE statements as I loop over the keys, but the problem is that I have a trigger set up on the table that revisions it (by inserting changed columns into another table), so the columns must be updated in a single UPDATE statement.
I wonder if it's possible to dynamically update a table with a json map?

Use coalesce(). Example table:
drop table if exists my_table;
create table my_table(id int primary key, a int, b text, c date);
insert into my_table values (1, 1, 'old text', '2017-01-01');
and query:
with jsondata(jdata) as (
    values ('{"id": 1, "b": "new text"}'::jsonb)
)
update my_table set
    a = coalesce((jdata->>'a')::int, a),
    b = coalesce((jdata->>'b')::text, b),
    c = coalesce((jdata->>'c')::date, c)
from jsondata
where id = (jdata->>'id')::int;
select * from my_table;
id | a | b | c
----+---+----------+------------
1 | 1 | new text | 2017-01-01
(1 row)
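Note that coalesce() keeps the old value whenever the JSON yields NULL, so this approach cannot set a column to NULL, and every column must be listed explicitly. If you need a fully dynamic single UPDATE (so the revision trigger fires exactly once), here is a minimal PL/pgSQL sketch, assuming the tmp table above and trusted input; the function name and the missing WHERE logic are illustrative:

create or replace function update_tmp_from_json(p jsonb) returns void as $$
declare
    set_clause text;
begin
    -- build "a = '4', c = 'bar'" from the json keys; %I quotes identifiers,
    -- %L quotes literals, and the literals are coerced to the column types
    select string_agg(format('%I = %L', key, value), ', ')
      into set_clause
      from jsonb_each_text(p);
    if set_clause is not null then
        execute 'update tmp set ' || set_clause;  -- add a WHERE clause as needed
    end if;
end;
$$ language plpgsql;

-- usage:
-- select update_tmp_from_json('{"a": 4, "c": "bar"}'::jsonb);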

Related

Identify if a column is Virtual in Snowflake

Snowflake does not document its virtual column capability that uses the AS clause. I am doing a migration and need to filter out virtual columns programmatically.
Is there any way to identify that a column is virtual? The INFORMATION_SCHEMA.COLUMNS view shows nothing different between a virtual and non-virtual column definition.
There is a difference between a column defined with DEFAULT and a virtual column (aka computed or generated column):
Virtual column
CREATE OR REPLACE TABLE T1(i INT, calc INT AS (i*i));
INSERT INTO T1(i) VALUES (2),(3),(4);
SELECT * FROM T1;
When using the AS (expression) syntax, the expression is not visible in COLUMN_DEFAULT:
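The metadata query (mirroring the one used for T2 below) shows nothing for the computed column:

SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'T1';
-- COLUMN_DEFAULT is NULL for CALC; the AS (expression) is not exposed here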
DEFAULT Expression
In the case of the definition DEFAULT (expression):
CREATE OR REPLACE TABLE T2(i INT, calc INT DEFAULT (i*i));
INSERT INTO T2(i) VALUES (2),(3),(4);
SELECT * FROM T2;
It is visible in COLUMN_DEFAULT:
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'T2';
Comparing side-by-side with SHOW COLUMNS:
SHOW COLUMNS LIKE 'CALC';
-- kind: VIRTUAL_COLUMN
One notable difference between them is that a virtual column cannot be updated:
UPDATE T1
SET calc = 1;
-- Virtual column 'CALC' is invalid target.
UPDATE T2
SET calc = 1;
-- success
How about using SHOW COLUMNS? You can identify them when the expression field is not null.
create table foo (id bigint, derived bigint as (id * 10));
insert into foo (id) values (1), (2), (3);
SHOW COLUMNS IN TABLE foo;
SELECT "table_name", "column_name", "expression" FROM table(result_scan(last_query_id()));
| table_name | column_name | expression |
| ---------- | ----------- | -------------- |
| FOO | ID | null |
| FOO | DERIVED | ID*10 |
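To return only the virtual columns, the same result scan can be filtered on that field (a sketch building on the query above; if the field comes back as the string 'null' rather than SQL NULL on your account, adjust the predicate accordingly):

SHOW COLUMNS IN TABLE foo;
SELECT "table_name", "column_name", "expression"
FROM table(result_scan(last_query_id()))
WHERE "expression" IS NOT NULL;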
I normally use the desc table option.
First let's create the table with some example data:
create or replace temporary table ColumnTypesTest (
    id int identity(1,1) primary key,
    userName varchar(30),
    insert_DT datetime default CAST(CONVERT_TIMEZONE('UTC', CAST(CURRENT_TIMESTAMP() AS TIMESTAMP_TZ(9))) AS TIMESTAMP_NTZ(9)) not null,
    nextDayAfterInsert datetime as dateadd(dd, 1, insert_DT)
);
insert into ColumnTypesTest (userName) values
('John'),
('Cris'),
('Anne');
select * from ColumnTypesTest;
ID | USERNAME | INSERT_DT               | NEXTDAYAFTERINSERT
---+----------+-------------------------+------------------------
 1 | John     | 2021-10-04 19:11:21.069 | 2021-10-05 19:11:21.069
 2 | Cris     | 2021-10-04 19:11:21.069 | 2021-10-05 19:11:21.069
 3 | Anne     | 2021-10-04 19:11:21.069 | 2021-10-05 19:11:21.069
Now the answer to your question:
Using desc table <table_name>; you will get a column named kind, which tells you whether a column is virtual or not; separately, default is NULL if the column has no default value.
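Applied to the example table:

desc table ColumnTypesTest;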
name               | type             | kind    | null? | default                      | primary key | unique key | expression
-------------------+------------------+---------+-------+------------------------------+-------------+------------+--------------------------------------
ID                 | NUMBER(38,0)     | COLUMN  | N     | IDENTITY START 1 INCREMENT 1 | Y           | N          |
USERNAME           | VARCHAR(30)      | COLUMN  | Y     |                              | N           | N          |
INSERT_DT          | TIMESTAMP_NTZ(9) | COLUMN  | N     | CAST(CONVERT_TIMEZONE('UTC', CAST(CURRENT_TIMESTAMP() AS TIMESTAMP_TZ(9))) AS TIMESTAMP_NTZ(9)) | N | N |
NEXTDAYAFTERINSERT | TIMESTAMP_NTZ(9) | VIRTUAL | Y     |                              | N           | N          | DATE_ADDDAYSTOTIMESTAMP(1, INSERT_DT)
(the check, comment and policy name columns are omitted; all were empty)
With desc table <table_name> you get metadata of the table, including a column named kind, which says VIRTUAL or COLUMN. If it is VIRTUAL, the expression column shows how that column is calculated.
This can be used in stored procedures: the result set is saved into an array of arrays with JavaScript, and from there the next query in the stored procedure is built dynamically. A while loop walks the resultSet and pushes each row into the array of arrays; you can then use a JavaScript filter to keep just the virtual columns. This is part of the advantage of having a mix of JavaScript and SQL in Snowflake stored procedures.
Here is the documentation, which doesn't say much.

Using MERGE to delete data, or insert

From the UI I pass a datatable to a stored procedure. The type of that parameter is a user-defined table type with the following structure:
Personkey int
ComponentKey varchar
This data needs to go into a table, and data that exists in the table but is not in the datatable should be deleted.
Example table data
PersonKey ComponentKey
123 A1
456 B9
And my datatable has 2 rows, one matching row and one new row
Example datatable data
PersonKey ComponentKey
123 A1
786 Z6
The result is that the 456/B9 row should be deleted, nothing should happen to the 123/A1 row, and the 786/Z6 row should be inserted.
I believe I can use the MERGE statement but I am not sure how to form it.
I understand that WHEN NOT MATCHED I should do the insert but where does the delete part come into it?
MERGE Components
USING #passedInData
ON PersonKey = DatatblPersonKey AND ComponentKey = DatatblComponentKey
-- WHEN MATCHED: do nothing...
WHEN NOT MATCHED THEN
    INSERT (PersonKey, ComponentKey) VALUES (DatatblPersonKey, DatatblComponentKey);
Edit: Just to be clear, the datatable could contain many rows for the same person key, but the component key would be different.
Example datatable data
PersonKey ComponentKey
123 Z6
123 C5
Example table data
PersonKey ComponentKey
123 A1
456 B9
The result after inserting the above datatable should be
PersonKey ComponentKey
123 Z6
123 C5
456 B9
Notice that 123/A1 has been deleted and 456/B9 is still in the table.
The default "WHEN NOT MATCHED" assumes that what you really mean is "WHEN NOT MATCHED BY TARGET". You can do another statement for "WHEN NOT MATCHED BY SOURCE" with the simple command "DELETE".
Be careful when you do this because it will delete all the records from the target that don't match the source based on the comparison you have specified. If it's necessary to do a subset of the target for that action, you can use a cte with that filter and then do your merge against that cte as the target.
Edit: demonstrating how to hook up what I am saying:
DECLARE @databaseTable TABLE (PersonKey INT, ComponentKey VARCHAR(10));
INSERT INTO @databaseTable
VALUES
    (123, 'A1'),
    (456, 'B9');

DECLARE @appDataset TABLE (PersonKey INT, ComponentKey VARCHAR(10));
INSERT INTO @appDataset
VALUES
    (123, 'Z6'),
    (123, 'C5');

WITH cteTarget AS
(
    SELECT dt.PersonKey
         , dt.ComponentKey
    FROM @databaseTable AS dt
    JOIN (SELECT DISTINCT PersonKey FROM @appDataset) AS pk
        ON pk.PersonKey = dt.PersonKey
)
MERGE cteTarget AS tgt
USING @appDataset AS src
    ON src.PersonKey = tgt.PersonKey
    AND src.ComponentKey = tgt.ComponentKey
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
WHEN NOT MATCHED BY TARGET THEN
    INSERT
        (PersonKey
        ,ComponentKey)
    VALUES
        (src.PersonKey
        ,src.ComponentKey);

SELECT * FROM @databaseTable;
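For reference, the final SELECT should return something like the following (row order may vary); 456/B9 survives because the CTE limits the delete scope to person keys present in the source:

PersonKey | ComponentKey
----------+-------------
123       | Z6
123       | C5
456       | B9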

Inserting array values

How do I write and execute a query which inserts array values using libpqxx?
INSERT INTO exampleTable(exampleArray[3]) VALUES('{1, 2, 3}');
This example code gives me:
ERROR: syntax error at or near "'"
What is wrong? In PostgreSQL documentation I found that:
CREATE TABLE sal_emp (
name text,
pay_by_quarter integer[],
schedule text[][]
);
...
INSERT INTO sal_emp
VALUES ('Bill',
'{10000, 10000, 10000, 10000}',
'{{"meeting", "lunch"}, {"training", "presentation"}}');
You should use a column name without an index to insert an array:
create table example(arr smallint[]);
insert into example(arr) values('{1, 2, 3}');
-- alternative syntax
-- insert into example(arr) values(array[1, 2, 3]);
select * from example;
arr
---------
{1,2,3}
(1 row)
Use the column name with an index to access a single element of the array:
select arr[2] as "arr[2]"
from example;
arr[2]
--------
2
(1 row)
update example set arr[2] = 10;
select * from example;
arr
----------
{1,10,3}
(1 row)
You can use arr[n] in INSERT, but this has a special meaning. With this syntax you create an array whose single element is stored at the given index:
delete from example;
insert into example(arr[3]) values (1);
select * from example;
arr
-----------
[3:3]={1}
(1 row)
As a result you have an array whose lower bound is 3:
select arr[3] from example;
arr
-----
1
(1 row)
ref: https://ubiq.co/database-blog/how-to-insert-into-array-in-postgresql/
A. use ARRAY
insert into employees (id, name, phone_numbers)
values (1, 'John Doe', ARRAY['9998765432', '9991234567']);
-- fewer nested quotes
B. use '{}'
insert into employees (id, name, phone_numbers)
values (2, 'Jim Doe', '{"9996587432","9891334567"}');
OR
insert into employees (id, name, phone_numbers)
values (2, 'Jim Doe', '{9996587432,9891334567}');
-- the inner double quotes are not necessary here; whether the elements
-- are numbers or strings depends on the column type.

Inserting values of one column of a table into different columns of another table

I have a table A with 2 columns, Group and Age. I need to insert the only 3 entries in the Age column into a different table B, which has columns Age1, Age2, Age3. How can I do that?
INSERT INTO B (Age1, Age2, Age3)
VALUES / SELECT .....
TABLE A
GROUP | AGE
------+----
AGE1  | 23
AGE2  | 25
AGE3  | 29

TABLE B
ID | AGE1 | AGE2 | AGE3
---+------+------+-----
1  | --   | --   | --
Catch: I cannot hard-code in this case. Is there any way of using cursors in dynamic SQL to get this done?
Thanks. I figured out the solution. All I needed to do was get both columns into comma-separated varchar variables, then pass the variables into a dynamic SQL INSERT:
Insert into table B
('+@rowsincolumn1+')
values ('+@rowsincolumn2+')
Make sure you insert single quotes into your varchar variable after every value so it runs in the dynamic SQL.
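A hypothetical sketch of that approach, assuming SQL Server 2017+ for STRING_AGG (older versions can build the lists with FOR XML PATH); the variable names are illustrative:

DECLARE @cols nvarchar(max), @vals nvarchar(max), @sql nvarchar(max);

-- build "[AGE1], [AGE2], [AGE3]" and "23, 25, 29" in matching order
SELECT @cols = STRING_AGG(QUOTENAME([GROUP]), ', ') WITHIN GROUP (ORDER BY [GROUP]),
       @vals = STRING_AGG(CAST(AGE AS nvarchar(10)), ', ') WITHIN GROUP (ORDER BY [GROUP])
FROM A;

SET @sql = N'INSERT INTO B (' + @cols + N') VALUES (' + @vals + N');';
EXEC sp_executesql @sql;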

SQLite UPSERT / UPDATE OR INSERT

I need to perform UPSERT / INSERT OR UPDATE against a SQLite Database.
There is the command INSERT OR REPLACE, which in many cases can be useful. But if you want to keep your IDs with autoincrement in place because of foreign keys, it does not work, since it deletes the row, creates a new one, and consequently this new row has a new ID.
This would be the table:
players - (primary key on id, user_name unique)
| id   | user_name | age |
-----------------------------
| 1982 | johnny    | 23  |
| 1983 | steven    | 29  |
| 1984 | pepee     | 40  |
Q&A Style
Well, after researching and fighting with the problem for hours, I found out that there are two ways to accomplish this, depending on the structure of your table and whether you have foreign key restrictions activated to maintain integrity. I'd like to share this in a clean format to save some time for people who may be in my situation.
Option 1: You can afford deleting the row
In other words, you don't have foreign keys, or if you have them, your SQLite engine is configured so that there are no integrity exceptions. The way to go is INSERT OR REPLACE. If you are trying to insert/update a player whose ID already exists, the SQLite engine will delete that row and insert the data you are providing. Now the question comes: what to do to keep the old ID associated?
Let's say we want to UPSERT with the data user_name='steven' and age=32.
Look at this code:
INSERT OR REPLACE INTO players (id, user_name, age)
VALUES (
    coalesce((select id from players where user_name='steven'),
             (select max(id) from players) + 1),
    'steven',
    32
);
The trick is in coalesce. It returns the id of the user 'steven' if any, and otherwise, it returns a new fresh id.
Option 2: You cannot afford deleting the row
After monkeying around with the previous solution, I realized that in my case it could end up destroying data, since this ID works as a foreign key for another table. Besides, I created the table with the clause ON DELETE CASCADE, which would mean that it'd delete data silently. Dangerous.
So, I first thought of an IF clause, but SQLite only has CASE. And this CASE can't be used (or at least I did not manage it) to perform one UPDATE query if EXISTS(select id from players where user_name='steven') and an INSERT if it didn't. No go.
And then, finally, I used brute force, with success. The logic is: for each UPSERT that you want to perform, first execute an INSERT OR IGNORE to make sure there is a row with our user, and then execute an UPDATE query with exactly the same data you tried to insert.
Same data as before: user_name='steven' and age=32.
-- make sure it exists
INSERT OR IGNORE INTO players (user_name, age) VALUES ('steven', 32);
-- make sure it has the right data
UPDATE players SET user_name='steven', age=32 WHERE user_name='steven';
And that's all!
EDIT
As Andy has commented, trying to insert first and then update may lead to firing triggers more often than expected. This is not, in my opinion, a data safety issue, but it is true that firing unnecessary events makes little sense. Therefore, an improved solution would be:
-- Try to update any existing row
UPDATE players SET age=32 WHERE user_name='steven';
-- Make sure it exists
INSERT OR IGNORE INTO players (user_name, age) VALUES ('steven', 32);
This is a late answer. Starting from SQLite 3.24.0, released on June 4, 2018, there is finally support for the UPSERT clause following PostgreSQL syntax.
INSERT INTO players (user_name, age)
VALUES('steven', 32)
ON CONFLICT(user_name)
DO UPDATE SET age=excluded.age;
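Unlike INSERT OR REPLACE, this updates the existing row in place, so the id is preserved; against the sample table above:

SELECT id, user_name, age FROM players WHERE user_name='steven';
-- 1983 | steven | 32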
Note: For those having to use a version of SQLite earlier than 3.24.0, please reference the answer below (posted by me, @MarqueIV).
However, if you do have the option to upgrade, you are strongly encouraged to do so, as, unlike my solution, the one posted here achieves the desired behavior in a single statement. Plus you get all the other features, improvements and bug fixes that usually come with a more recent release.
Here's an approach that doesn't require the brute-force 'ignore', which would only work if there was a key violation. This way works based on any conditions you specify in the update.
Try this...
-- Try to update any existing row
UPDATE players
SET age=32
WHERE user_name='steven';
-- If no update happened (i.e. the row didn't exist) then insert one
INSERT INTO players (user_name, age)
SELECT 'steven', 32
WHERE (Select Changes() = 0);
How It Works
The 'magic sauce' here is using Changes() in the Where clause. Changes() represents the number of rows affected by the last operation, which in this case is the update.
In the above example, if there are no changes from the update (i.e. the record doesn't exist) then Changes() = 0 so the Where clause in the Insert statement evaluates to true and a new row is inserted with the specified data.
If the Update did update an existing row, then Changes() = 1 (or more accurately, non-zero if more than one row was updated), so the 'Where' clause in the Insert now evaluates to false and thus no insert will take place.
The beauty of this is there's no brute-force needed, nor unnecessarily deleting, then re-inserting data which may result in messing up downstream keys in foreign-key relationships.
Additionally, since it's just a standard Where clause, it can be based on anything you define, not just key violations. Likewise, you can use Changes() in combination with anything else you want/need anywhere expressions are allowed.
The problem with all the presented answers is the complete lack of consideration for triggers (and probably other side effects).
A solution like
INSERT OR IGNORE ...
UPDATE ...
leads to both triggers being executed (insert and then update) when the row does not exist.
The proper solution is
UPDATE OR IGNORE ...
INSERT OR IGNORE ...
in which case only one statement takes effect (whether the row exists or not).
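A concrete version of that pair under the same sample data (a sketch; note the UPDATE must run first):

-- at most one of these has an effect, so only one trigger path fires
UPDATE OR IGNORE players SET age = 32 WHERE user_name = 'steven';
INSERT OR IGNORE INTO players (user_name, age) VALUES ('steven', 32);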
To have a pure UPSERT with no holes (for programmers) that doesn't rely on unique and other keys:
UPDATE players SET user_name='gil', age=32 WHERE user_name='george';
SELECT changes();
SELECT changes() returns the number of rows updated by the last query.
Then check the return value of changes(); if it is 0, execute:
INSERT INTO players (user_name, age) VALUES ('gil', 32);
Option 1: Insert -> Update
If you'd like to avoid both changes()=0 and INSERT OR IGNORE, even if you cannot afford deleting the row, you can use this logic:
First, insert (if not exists) and then update by filtering with the unique key.
Example
-- Table structure
CREATE TABLE players (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_name VARCHAR(255) NOT NULL UNIQUE,
    age INTEGER NOT NULL
);
-- Insert if NOT exists
INSERT INTO players (user_name, age)
SELECT 'johnny', 20
WHERE NOT EXISTS (SELECT 1 FROM players WHERE user_name='johnny' AND age=20);
-- Update (will affect row, only if found)
-- no point to update user_name to 'johnny' since it's unique, and we filter by it as well
UPDATE players
SET age=20
WHERE user_name='johnny';
Regarding Triggers
Notice: I haven't tested this to see which triggers are called, but I assume the following:
if row does not exists
BEFORE INSERT
INSERT using INSTEAD OF
AFTER INSERT
BEFORE UPDATE
UPDATE using INSTEAD OF
AFTER UPDATE
if row does exists
BEFORE UPDATE
UPDATE using INSTEAD OF
AFTER UPDATE
Option 2: Insert or replace - keep your own ID
In this way you can have a single SQL command:
-- Table structure
CREATE TABLE players (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_name VARCHAR(255) NOT NULL UNIQUE,
    age INTEGER NOT NULL
);
-- Single command to insert or update
INSERT OR REPLACE INTO players
(id, user_name, age)
VALUES ((SELECT id from players WHERE user_name='johnny' AND age=20),
'johnny',
20);
Edit: added option 2.
You can also just add an ON CONFLICT REPLACE clause to your user_name unique constraint and then just INSERT away, leaving it to SQLite to figure out what to do in case of a conflict. See: https://sqlite.org/lang_conflict.html.
Also note the sentence regarding delete triggers: When the REPLACE conflict resolution strategy deletes rows in order to satisfy a constraint, delete triggers fire if and only if recursive triggers are enabled.
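For illustration, a sketch of that constraint form (note the replaced row gets a new autoincrement id, the same caveat as INSERT OR REPLACE):

CREATE TABLE players (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_name TEXT NOT NULL UNIQUE ON CONFLICT REPLACE,
    age INTEGER NOT NULL
);
-- a plain INSERT now replaces any existing row with the same user_name
INSERT INTO players (user_name, age) VALUES ('steven', 32);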
For those who don't have the latest version of SQLite available, you can still do it in a single statement using INSERT OR REPLACE, but beware you need to set all the values. This "clever" SQL works by using a left join of the source data onto the table you are inserting into / updating, plus ifnull:
import sqlite3
con = sqlite3.connect( ":memory:" )
cur = con.cursor()
cur.execute("create table test( id varchar(20) PRIMARY KEY, value int, value2 int )")
cur.executemany("insert into test (id, value, value2) values (:id, :value, :value2)",
[ {'id': 'A', 'value' : 1, 'value2' : 8 }, {'id': 'B', 'value' : 3, 'value2' : 10 } ] )
cur.execute('select * from test')
print( cur.fetchall())
con.commit()
cur = con.cursor()
# upsert using insert or replace.
# when id is found it should modify value but ignore value2
# when id is not found it will enter a record with value and value2
upsert = '''
insert or replace into test
select d.id, d.value, ifnull(t.value2, d.value2) from ( select :id as id, :value as value, :value2 as value2 ) d
left join test t on d.id = t.id
'''
upsert_data = [ { 'id' : 'B', 'value' : 4, 'value2' : 5 },
{ 'id' : 'C', 'value' : 3, 'value2' : 12 } ]
cur.executemany( upsert, upsert_data )
cur.execute('select * from test')
print( cur.fetchall())
The first few lines of that code set up the table, with a single ID primary key column and two values. It then enters data with IDs 'A' and 'B'.
The second section creates the 'upsert' text and calls it for 2 rows of data, one with the ID 'B', which is found, and one with 'C', which is not found.
When you run it, you'll find the data at the end produces
$python3 main.py
[('A', 1, 8), ('B', 3, 10)]
[('A', 1, 8), ('B', 4, 10), ('C', 3, 12)]
B's value was 'updated' to 4 but its value2 (5) was ignored; C was inserted.
Note: this does not work if your table has an auto-incremented primary key, as INSERT OR REPLACE will replace the number with a new one.
A slight modification adds such a column:
import sqlite3
con = sqlite3.connect( ":memory:" )
cur = con.cursor()
cur.execute("create table test( pkey integer primary key autoincrement not null, id varchar(20) UNIQUE not null, value int, value2 int )")
cur.executemany("insert into test (id, value, value2) values (:id, :value, :value2)",
[ {'id': 'A', 'value' : 1, 'value2' : 8 }, {'id': 'B', 'value' : 3, 'value2' : 10 } ] )
cur.execute('select * from test')
print( cur.fetchall())
con.commit()
cur = con.cursor()
# upsert using insert or replace.
# when id is found it should modify value but ignore value2
# when id is not found it will enter a record with value and value2
upsert = '''
insert or replace into test (id, value, value2)
select d.id, d.value, ifnull(t.value2, d.value2) from ( select :id as id, :value as value, :value2 as value2 ) d
left join test t on d.id = t.id
'''
upsert_data = [ { 'id' : 'B', 'value' : 4, 'value2' : 5 },
{ 'id' : 'C', 'value' : 3, 'value2' : 12 } ]
cur.executemany( upsert, upsert_data )
cur.execute('select * from test')
print( cur.fetchall())
output is now:
$python3 main.py
[(1, 'A', 1, 8), (2, 'B', 3, 10)]
[(1, 'A', 1, 8), (3, 'B', 4, 10), (4, 'C', 3, 12)]
Note pkey 2 is replaced with 3 for id 'B'.
This is therefore not ideal, but it is a good solution when:
You don't have an auto-generated primary key
You want to create an 'upsert' query with bound parameters
You want to use executemany() to merge in multiple rows of data in one go.
