Copy records with dynamic column names - arrays

I have two tables with different columns in PostgreSQL 9.3:
CREATE TABLE person1(
NAME TEXT NOT NULL,
AGE INT NOT NULL
);
CREATE TABLE person2(
NAME TEXT NOT NULL,
AGE INT NOT NULL,
ADDRESS CHAR(50),
SALARY REAL
);
INSERT INTO person2 (Name, Age, ADDRESS, SALARY)
VALUES ('Piotr', 20, 'London', 80);
I would like to copy records from person2 to person1, but column names can change in program, so I would like to select joint column names in program. So I create an array containing the intersection of column names. Next I use a function: insert into .... select, but I get an error, when I pass the array variable to the function by name. Like this:
select column_name into name1 from information_schema.columns where table_name = 'person1';
select column_name into name2 from information_schema.columns where table_name = 'person2';
select * into cols from ( select * from name1 intersect select * from name2) as tmp;
-- Create array with name of columns
select array (select column_name::text from cols) into cols2;
CREATE OR REPLACE FUNCTION f_insert_these_columns(VARIADIC _cols text[])
RETURNS void AS
$func$
BEGIN
EXECUTE (
SELECT 'INSERT INTO person1 SELECT '
|| string_agg(quote_ident(col), ', ')
|| ' FROM person2'
FROM unnest(_cols) col
);
END
$func$ LANGUAGE plpgsql;
select * from cols2;
array
------------
{name,age}
(1 row)
SELECT f_insert_these_columns(VARIADIC cols2);
ERROR: column "cols2" does not exist
What's wrong here?

You seem to assume that SELECT INTO in SQL would assign a variable. But that is not so.
It creates a new table and its use is discouraged in Postgres. Use the superior CREATE TABLE AS instead. Not least, because the meaning of SELECT INTO inside plpgsql is different:
Combine two tables into a new one so that select rows from the other one are ignored
Concerning SQL variables:
User defined variables in PostgreSQL
Hence you cannot call the function like this:
SELECT f_insert_these_columns(VARIADIC cols2);
This would work:
SELECT f_insert_these_columns(VARIADIC (TABLE cols2 LIMIT 1));
Or cleaner:
SELECT f_insert_these_columns(VARIADIC array) -- "array" being the unfortunate column name
FROM cols2
LIMIT 1;
About the short TABLE syntax:
Is there a shortcut for SELECT * FROM?
Better solution
To copy all rows with columns sharing the same name between two tables:
CREATE OR REPLACE FUNCTION f_copy_rows_with_shared_cols(
IN _tbl1 regclass
, IN _tbl2 regclass
, OUT rows int
, OUT columns text)
LANGUAGE plpgsql AS
$func$
BEGIN
SELECT INTO columns -- proper use of SELECT INTO!
string_agg(quote_ident(attname), ', ')
FROM (
SELECT attname
FROM pg_attribute
WHERE attrelid IN (_tbl1, _tbl2)
AND NOT attisdropped -- no dropped (dead) columns
AND attnum > 0 -- no system columns
GROUP BY 1
HAVING count(*) = 2
) sub;
EXECUTE format('INSERT INTO %1$s(%2$s) SELECT %2$s FROM %3$s'
, _tbl1, columns, _tbl2);
GET DIAGNOSTICS rows = ROW_COUNT; -- return number of rows copied
END
$func$;
Call:
SELECT * FROM f_copy_rows_with_shared_cols('public.person2', 'public.person1');
Result:
rows | columns
-----+---------
3 | name, age
Major points
Note the proper use of SELECT INTO for assignment inside plpgsql.
Note the use of the data type regclass. This allows to use schema-qualified table names (optionally) and defends against SQL injection attempts:
Table name as a PostgreSQL function parameter
About GET DIAGNOSTICS:
Count rows affected by DELETE
About OUT parameters:
Returning from a function with OUT parameter
The manual about format().
Information schema vs. system catalogs.

Related

Combine multiple tables into one in Snowflake

Let's say I have the following monthly tables with table names formatted such that the number after the underscore refers to the month. What I want to do is to combine these 12 tables into one without having to write 10-30 insert/union all statements
table_1
table_2
table_3
table_4
table_5
table_6
table_7
table_8
table_9
table_10
table_11
table_12 -- (only 12 in this instance but could be as many as 36)
My current approach is to first create the master table with data from table_1.
create temporary table master_table_1_12 as
select * -- * to keep it simple for this example
from table_1;
Then use variables such that I can simply keep hitting the run button until it errors out with "table_13 does not exist"
set month_id=(select max(month_id) from master_table_1_12) + 1;
set table_name=concat('table_',$month_id);
insert into master_table_1_12
select *
from identifier($table_name);
Note: All monthly tables have a month_id column
Sure it saves some space on the console(compared to multiple inserts), but I still have to run it 12 times. Are Snowflake Tasks something I could use for this? I couldn't find a fitting example from their documentation to code that up but, if anyone had success with that or with a Javascript based SP for a problem like this, please enlighten.
Here's a stored procedure that will insert into master_table_1_12 from selects on table_1 through table_12. Modify as required:
create or replace procedure FILL_MASTER_TABLE()
returns string
language javascript
as
$$
var rows = 0;
for (var i=1; i<=12; i++) {
rows += insertRows(i);
}
return rows + " rows inserted into master_table_1_12.";
// End of main function
function insertRows(i) {
sql =
`insert into master_table_1_12
select *
from table_${i};`;
return doInsert(sql);
}
function doInsert(queryString) {
var out;
cmd1 = {sqlText: queryString};
stmt = snowflake.createStatement(cmd1);
var rs = stmt.execute();;
rs.next();
return rs.getColumnValue(1);
}
$$;
call fill_master_table();
By the way, if you don't have any processing to do and just need to consolidate the tables, you can do something like this:
insert into master_table_1_12
select * from table_1
union all
select * from table_2
union all
select * from table_3
union all
select * from table_4
union all
select * from table_5
union all
select * from table_6
union all
select * from table_7
union all
select * from table_8
union all
select * from table_9
union all
select * from table_10
union all
select * from table_11
union all
select * from table_12
;
Can you not create a view on top of these 12 tables. The view will be an union of all these tables.
Based on the comments below, I further elaborated my answer. please try this approach. It will provide better performance when your table is large. Partitioning it will improve performance. This is based on real experience.
CREATE TABLE SALES_2000 (REGION VARCHAR, UNITS_SOLD NUMBER);
CREATE TABLE SALES_2001 (REGION VARCHAR, UNITS_SOLD NUMBER);
CREATE TABLE SALES_2002 (REGION VARCHAR, UNITS_SOLD NUMBER);
CREATE TABLE SALES_2003 (REGION VARCHAR, UNITS_SOLD NUMBER);
INSERT INTO SALES_2000 VALUES('ASIA', 25);
INSERT INTO SALES_2001 VALUES('ASIA', 50);
INSERT INTO SALES_2002 VALUES('ASIA', 55);
INSERT INTO SALES_2003 VALUES('ASIA', 65);
CREATE VIEW ALL_SALES AS
SELECT * FROM SALES_2000
UNION
SELECT * FROM SALES_2001
UNION
SELECT * FROM SALES_2002
UNION
SELECT * FROM SALES_2003;
SELECT * FROM ALL_SALES WHERE UNITS_SOLD = 25;
I ended up creating a UDF that spits out a create view statement and a stored procedure that executes it to create a temporary view. I work with tables following specific naming convention, so you might have to tweak this solution a little for your use case. The separation of UDF and stored proc actually helps with that as you'd mostly need to tweak the SQL UDF. I am sharing a simplified version of what I actually have in the interest of keeping it representative of the tables I listed in my question.
SQL UDF FOR GENERATING A CREATE VIEW STATETEMENT
create or replace function sandbox.public.define_view(table_pattern varchar, start_month varchar, end_month varchar)
returns table ("" varchar) as
$$
with cte1(month_id) as
(select start_month::int + row_number() over (order by 1) - 1
from table(generator(rowcount=> end_month::int - start_month::int + 1)))
,cte2(month_id,statement) as
(select 0,
concat('create or replace temporary view master_',
split_part(table_pattern,'.',-1),
start_month,
'_',
end_month,
' as ')
union all
select month_id,
concat('select * from ',
table_pattern,
month_id,
case when month_id=end_month::int then ';' else ' union all ' end)
from cte1)
select listagg(statement, '\n') within group (order by month_id) as create_view_statement
from cte2
$$;
PROCEDURE FOR EXECUTING THE OUTPUT OF THE UDF ABOVE
create or replace procedure sandbox.public.create_view(TABLE_PATTERN varchar, START_MONTH varchar,END_MONTH varchar)
returns varchar not null
language Javascript
execute as caller
as
$$
sql_command = 'select * from table(sandbox.public.define_view(:1, :2, :3))';
var stmt = snowflake.createStatement({sqlText: sql_command ,binds: [TABLE_PATTERN, START_MONTH, END_MONTH]}).execute();
stmt.next();
var ddl = stmt.getColumnValue(1);
var run=snowflake.createStatement({sqlText: ddl}).execute();
run.next();
var message=run.getColumnValue(1);
return "Temporary " + message;
$$;
USAGE DEMO
set table_pattern ='sandbox.public.table_';
set start_month ='1';
set end_month = '12';
set master_view='master_'||split_part($table_pattern,'.',-1)||$start_month||'_'||$end_month;
call create_view($table_pattern, $start_month, $end_month);
select top 100 *
from identifier($master_view);

T-SQL Check if list has values, select and Insert into Table

I'm quite new to T-SQL and currently struggling with an insert statement in my stored procedure: I use as a parameter in the stored procedure a list of ids of type INT.
If the list is NOT empty, I want to store the ids into the table Delivery.
To pass the list of ids, i use a table type:
CREATE TYPE tIdList AS TABLE
(
ID INT NULL
);
GO
Maybe you know a better way to pass a list of ids into a stored procedure?
However, my procedure looks as follows:
-- parameter
#DeliveryModelIds tIdList READONLY
...
DECLARE #StoreId INT = 1;
-- Delivery
IF EXISTS (SELECT * FROM #DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID FROM #DeliveryModelIds;
If the list has values, I want to store the values into the DB as well as the StoreId which is always 1.
If I insert the DeliveryIds 3,7,5 The result in table Delivery should look like this:
DeliveryId | StoreId | DeliveryModelId
1...............| 1...........| 3
2...............| 1...........| 7
3...............| 1...........| 5
Do you have an idea on how to solve this issue?
THANKS !
You can add #StoreId to your select for your insert.
...
IF EXISTS (SELECT * FROM #DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID, #StoreId FROM #DeliveryModelIds;
Additionally, if you only want to insert DeliveryModelId that do not currently exist in the target table, you can use not exists() in the where clause like so:
...
IF EXISTS (SELECT * FROM #DeliveryModelIds)
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT dmi.ID, #StoreId
FROM #DeliveryModelIds dmi
where not exists (
select 1
from MyDb.Delivery i
where i.StoreId = #StoreId
and i.DeliveryModeId = dmi.ID
);
You need to modify the INSERT statement to:
INSERT [MyDB].[Delivery] ([DeliveryModelId], [StoreId])
OUTPUT inserted.DeliveryId
SELECT ID, 1 FROM #DeliveryModelIds;
So you are also selecting a literal, 1, along with ID field.

How to select and assign multiple entries into a array like variable in sql?

I am new to sql. I want some thing like
DECLARE #VALID_ITEM_NUMBERS ITEM_NUMBER
SELECT #ITEM_NUMBERS = ITEM_NUMBER FROM [dbo].[ITEM] where IS_VALID = 1
(
here in the first line ITEM_NUMBER is a predefined type,
and on the second line ITEM_NUMBER refers to a column (of type ITEM_NUMBER) in the table named ITEM. IS_VALID is also a column in ITEM table
)
But SET or SELECT returns only one value. I want #VALID_ITEM_NUMBERS to contain all the valid item numbers, like an array.
Is there any way to do this without crating a separate stored procedure?
create one table variable to store all the values
declare #ITEM_NUMBERS table (ITEM_NUMBER int NOT NULL)
insert into #ITEM_NUMBERS(ITEM_NUMBER)
(select ITEM_NUMBER FROM [dbo].[ITEM] where IS_VALID = 1 )
select ITEM_NUMBER from #ITEM_NUMBERS

Most effective way to check sub-string exists in comma-separated string in SQL Server

I have a comma-separated list column available which has values like
Product1, Product2, Product3
I need to search whether the given product name exists in this column.
I used this SQL and it is working fine.
Select *
from ProductsList
where productname like '%Product1%'
This query is working very slowly. Is there a more efficient way I can search for a product name in the comma-separated list to improve the performance of the query?
Please note I have to search comma separated list before performing any other select statements.
user defined functions for comma separation of the string
Create FUNCTION [dbo].[BreakStringIntoRows] (#CommadelimitedString varchar(max))
RETURNS #Result TABLE (Column1 VARCHAR(max))
AS
BEGIN
DECLARE #IntLocation INT
WHILE (CHARINDEX(',', #CommadelimitedString, 0) > 0)
BEGIN
SET #IntLocation = CHARINDEX(',', #CommadelimitedString, 0)
INSERT INTO #Result (Column1)
--LTRIM and RTRIM to ensure blank spaces are removed
SELECT RTRIM(LTRIM(SUBSTRING(#CommadelimitedString, 0, #IntLocation)))
SET #CommadelimitedString = STUFF(#CommadelimitedString, 1, #IntLocation, '')
END
INSERT INTO #Result (Column1)
SELECT RTRIM(LTRIM(#CommadelimitedString))--LTRIM and RTRIM to ensure blank spaces are removed
RETURN
END
Declare #productname Nvarchar(max)
set #productname='Product1,Product2,Product3'
select * from product where [productname] in(select * from [dbo].[![enter image description here][1]][1][BreakStringIntoRows](#productname))
Felix is right and the 'right answer' is to normalize your table. Although, maybe you have 500k lines of code that expect this column to exist as it is. So your next best (non-destructive) answer is:
Create a table to hold normalize data:
CREATE TABLE ProductsList2 (ProductId INT, ProductName VARCHAR)
Create a TRIGGER that on UPDATE/INSERT/DELETE maintains ProductList2 by splitting the string 'Product1,Product2,Product3' into three records.
Index your new table.
Query against your new table:
SELECT *
FROM ProductsList
WHERE ProductId IN (SELECT x.ProductId
FROM ProductsList2 x
WHERE x.ProductName = 'Product1')

How to view ALL COLUMNS (same column names from all table) from all tables starting with Letter A?

Is this possible? I am using ORACLE 10g.
For example: I have 50 tables name A01, A02, A03, A04.........A50.
And all these tables have the "SAME COLUMN NAME"
For example: name, age, location
(Note: The Column Names are the same but not the value in the columns).
In the END... I want to view all data from column: name, age, location FROM ALL tables starting with letter A.
(Note 2: All tables starting with letter A are NOT STATIC, they are dynamic meaning different changes could occur. Example: A01 to A10 could be deleted and A99 Could be added).
Sorry for not clarifying.
DECLARE
TYPE CurTyp IS REF CURSOR;
v_cursor CurTyp;
v_record A01%ROWTYPE;
v_stmt_str VARCHAR2(4000);
BEGIN
for rec in (
select table_name
from user_tables
where table_name like 'A%'
) loop
if v_stmt_str is not null then
v_stmt_str := v_stmt_str || ' union all ';
end if;
v_stmt_str := v_stmt_str || 'SELECT * FROM ' || rec.table_name;
end loop;
OPEN v_cursor FOR v_stmt_str;
LOOP
FETCH v_cursor INTO v_record;
EXIT WHEN v_cursor%NOTFOUND;
-- Read values v_record.name, v_record.age, v_record.location
-- Do something with them
END LOOP;
CLOSE v_cursor;
END;
As per my understanding if you want to view all column names of tables starting with A then try below
select column_name,table_name from user_tab_cols where table_name like 'A%';
If your requirement is something else then specify it clearly.
If understand you correctly and the number of tables is constant then you can create a VIEW once
CREATE VIEW vw_all
AS
SELECT name, age, location FROM A01
UNION ALL
SELECT name, age, location FROM A01
UNION ALL
...
SELECT name, age, location FROM A50
UNION ALL
And then use it
SELECT *
FROM vw_all
WHERE age < 35
ORDER BY name
This returns you all tables you need:
select table_name
from user_tables
where table_name like 'A__';
From this, you can build a dynamic sql statement as:
select listagg('select * from '||table_name,' union all ') within group(order by table_name)
from user_tables
where table_name like 'A__'
This returns actually an SQL statement which contains all tables and the unions:
select * from A01 union all select * from A02 union all select * from A03
And finally execute this via native dynamic sql. You can do that in PL/SQL, so you need a function:
create function getA
query varchar2(32000);
begin
select listagg('select * from '||table_name,' union all ') within group(order by table_name)
into query
from user_tables
where table_name like 'A__';
open res for query;
return res;
end;
Note that what you're doing manually is basically called partitioning, and Oracle has a super-great support already available for that out of the box. I.e. you can have something which looks like a super-huge table, but technically it is stored as a set of smaller tables (and smaller indexes), splitted by a partitioning criteria. For example, if you have millions of payment records, you may partition it by year, this way one physical table contains only a reasonable set of data. Still, you can freely select in this, and if you're hitting data from other partitions, Oracle takes care of pulling those in.

Resources