Bulk update PostgreSQL sequences

I have existing data that I want to import into a new system.
I want to set each sequence according to the row count of its existing table. I tried this, but I get number == 1.
DO
$do$
DECLARE
   _tbl   text;
   number int;
BEGIN
   FOR _tbl IN
      SELECT c.relname FROM pg_class c
      WHERE c.relkind = 'S' AND c.relname ILIKE '%y_id_seq'
   LOOP
      -- EXECUTE
      SELECT count(*) FROM regexp_replace(_tbl, '(\w)y_.*', '\1ies') INTO number;
      RAISE NOTICE '%', number;
      EXECUTE format('SELECT setval(''"%s"'', ''%s'' )', _tbl, number);
   END LOOP;
END
$do$;
What should I do to get the right count?

COUNT(*) is not the best choice for a new sequence value. Just imagine that you have holes in your numbering, for example 1, 2, 15: the count is 3, but the next value should be 16 to avoid duplicates in the future.
Assuming you use the sequence for an id column, I would suggest:
SELECT max(id) FROM _table_name_ INTO number;
Or even simpler:
SELECT setval(_sequence_name_, max(id)) FROM _table_name_;
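For completeness, here is a minimal sketch of the corrected loop. It assumes, as the question's regexp implies, that each sequence belongs to a table named by pluralizing the prefix (category_id_seq → categories) and that the sequence feeds a column named id:
DO
$do$
DECLARE
   _seq text;
   _tbl text;
BEGIN
   FOR _seq IN
      SELECT c.relname FROM pg_class c
      WHERE c.relkind = 'S' AND c.relname ILIKE '%y_id_seq'
   LOOP
      -- derive the table name from the sequence name, e.g. category_id_seq -> categories
      _tbl := regexp_replace(_seq, '(\w)y_.*', '\1ies');
      -- align the sequence with the current maximum id (1 for an empty table)
      EXECUTE format('SELECT setval(%L, coalesce(max(id), 1)) FROM %I', _seq, _tbl);
   END LOOP;
END
$do$;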

Related

Extract words in between the separator

I have input like below:
Sales External?HR?Purchase Department
I want it split like below. I tried LISTAGG because I ultimately want the values in separate columns.
The query output should work like this: search for the first occurrence of the separator (in this case "?"; it can be anything, but not a common character like "-" or "/", since the separator must not appear inside the string values), extract the phrase before that first separator, and put the value in a new column. Then look for the second occurrence of the separator, extract the next phrase, and keep creating columns; there can be multiple separators.
I tried SPLIT_PART, but it does not maintain the sequence in a real data scenario, so the values do not come out in the correct order.
I also tried REGEXP_INSTR, but I was unable to use special characters as separators.
Any thoughts?
Regex Extract should work for you:
SELECT
REGEXP_SUBSTR_ALL('Sales External?HR?Purchase Department', '[^?]+')
You can use LATERAL FLATTEN to convert your array into rows:
WITH MY_CTE AS (
  SELECT
    REGEXP_SUBSTR_ALL('Sales External?HR?Purchase Department', '[^?]+') AS PARTS
)
SELECT
  F.VALUE
FROM
  MY_CTE,
  LATERAL FLATTEN(INPUT => MY_CTE.PARTS, MODE => 'ARRAY') F
Deeper dive into some more cases: https://dwgeek.com/snowflake-convert-array-to-rows-methods-and-examples.html/
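If rows are all you need, a simpler route worth considering (a sketch using the question's string and the plain '?' delimiter) is SPLIT_TO_TABLE, which the stored-procedure answer below also relies on:
SELECT SEQ, INDEX, VALUE
FROM TABLE(SPLIT_TO_TABLE('Sales External?HR?Purchase Department', '?'));
-- returns one row per segment, with INDEX preserving the original order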
Here's a simplified version of the data. It uses a CTE with array_agg to group the rows. It then changes from arrays to columns. To add more columns, you can use max(), min(), or any_value() functions to get them through the aggregation. (Note that use of any_value() will not allow use of cached results from the result set cache since it's flagged as nondeterministic.)
create or replace table T1 (EMPID int, ROLE string, ACCESS string, ACCESS_LVL string, ITERATION string);
insert into T1(EMPID, ROLE, ACCESS, ACCESS_LVL, ITERATION) values
(1234, 'Sales Rep', 'Specific', 'REGION', 'DEV'),
(1234, 'Purchase Rep', 'Specific', 'EVERY', 'PROD'),
(1234, 'HR', NULL, 'Dept', 'PROD'),
(4321, 'HR', 'Foo', 'Foo', 'Foo')
;
with X as
(
select EMPID
,array_agg(nvl(ROLE,'')) within group (order by ROLE) ARR_ROLE
,array_agg(nvl(ACCESS,'')) within group (order by ROLE) ARR_ACCESS
,array_agg(nvl(ACCESS_LVL,'')) within group (order by ROLE) ARR_ACCESS_LVL
,array_agg(nvl(ITERATION,'')) within group (order by ROLE) ARR_ITERATION
from T1
group by EMPID
)
select EMPID
,ARR_ROLE[0]::string as ROLE1
,ARR_ROLE[1]::string as ROLE2
,ARR_ROLE[2]::string as ROLE3
,ARR_ACCESS[0]::string as ACCESS1
,ARR_ACCESS[1]::string as ACCESS2
,ARR_ACCESS[2]::string as ACCESS3
,ARR_ACCESS_LVL[0]::string as ACCESS_LVL1
,ARR_ACCESS_LVL[1]::string as ACCESS_LVL2
,ARR_ACCESS_LVL[2]::string as ACCESS_LVL3
,ARR_ITERATION[0]::string as ITERATION1
,ARR_ITERATION[1]::string as ITERATION2
,ARR_ITERATION[2]::string as ITERATION3
from X
;
There's nothing in the data that stands out as a natural key for sorting the rows into the arrays so that ROLE1, ROLE2, ROLE3, etc. are deterministic. I simply sorted on the name of the role, but it could be any ORDER BY within that group.
Here's a stored proc that will produce a table result with a dynamic set of columns based on the input string and specified delimiter.
If you are looking for a way to generate dynamic column names based on values, I recommend visiting Felipe Hoffa's blog entry here:
https://medium.com/snowflake/dynamic-pivots-in-sql-with-snowflake-c763933987c
create or replace procedure pivot_dyn_results(input string, delimiter string)
returns table ()
language SQL
AS
declare
max_count integer default 0;
lcount integer default 0;
rs resultset;
stmt1 string;
stmt2 string;
begin
-- Count the delimiters (values = delimiters + 1; assumes no leading or trailing delimiter)
select regexp_count(:input, '\\'||:delimiter, 1) into :max_count from dual;
-- Generate the initial row-based result set of parsed values
stmt1 := 'SELECT * from lateral split_to_table(?,?)';
-- Build dynamic query to produce the pivoted column based results
stmt2 := 'select * from (select * from table(result_scan(last_query_id(-1)))) pivot(max(value) for index in (';
-- initialize loop counter for the resulting columns
lcount := 1;
stmt2 := stmt2 || '\'' || lcount || '\'';
-- append pivot statement for each column to be represented
FOR l in 1 to max_count do
lcount := lcount + 1;
stmt2 := stmt2 || ',\'' || lcount || '\'';
END FOR;
-- close out the pivot statement
stmt2 := stmt2 || '))';
-- execute the parse query, then the pivot over its result set
EXECUTE IMMEDIATE :stmt1 using (input, delimiter);
rs := (EXECUTE IMMEDIATE :stmt2);
return table(rs);
end;
Invocation:
call pivot_dyn_results([string],[delimiter]);
call pivot_dyn_results('Sales External?HR?Billing?Purchase Department','?');
Results:

Truncate trailing digit of Card Prefix if 0-9 digits are present

I have some records in my table like:
Prefix Column
...
54664300
54664301
54664302
54664303
546643040
546643041
546643042
546643043
546643044
546643045
546643046
546643047
546643048
546643049
54664305
54664306
54664307
54664308
54664309
...
54665100
54665101
54665102
54665103
54665105
54665106
54665109
...
If the 0-9 series is complete for a certain prefix, I simplify it. The records above will become:
Prefix Column
...
54664300
54664301
54664302
54664303
54664304
54664305
54664306
54664307
54664308
54664309
...
54665100
54665101
54665102
54665103
54665105
54665106
54665109
...
And I can simplify further when a 0-9 series becomes complete again, which will result in:
Prefix Column
...
5466430
...
54665100
54665101
54665102
54665103
54665105
54665106
54665109
...
But some records will not get simplified, because they are incomplete.
I achieved this process using a WHILE loop:
DECLARE @length INT = (SELECT MAX(LEN(CardPrefix)) FROM #Card) - 1;
IF OBJECT_ID('tempdb..#GroupedCard') IS NOT NULL DROP TABLE #GroupedCard;
CREATE TABLE #GroupedCard (CardPrefix NVARCHAR(20));
WHILE (@length > 6) -- minimum 6 digits only
BEGIN
    TRUNCATE TABLE #GroupedCard;
    INSERT INTO #GroupedCard
    SELECT LEFT(CardPrefix, @length) CardPrefix
    FROM #Card
    WHERE LEN(CardPrefix) > @length
    GROUP BY LEFT(CardPrefix, @length)
    HAVING SUM(CAST(ISNULL(RIGHT(CardPrefix, 1), 0) AS INT)) = 45; -- sum of 0 to 9 is 45
    IF NOT EXISTS (SELECT 1 FROM #GroupedCard)
    BEGIN
        SET @length = @length - 1; -- nothing left to collapse at this length; try shorter prefixes
        CONTINUE;
    END
    DELETE C
    FROM #Card C
    INNER JOIN #GroupedCard GC ON GC.CardPrefix = LEFT(C.CardPrefix, @length);
    INSERT INTO #Card
    SELECT CardPrefix FROM #GroupedCard;
END
I am just checking whether there is a more efficient way to do this, because the data in our live environment is getting huge, and this process will also need to run more frequently.
Reading between the lines, are you therefore not after...
SELECT Prefix
FROM dbo.YourTable
WHERE Prefix <= 99999999 --If this is alphanumeric, use LEN, but this will come at a cost
UNION ALL
SELECT DISTINCT LEFT(Prefix,8)
FROM dbo.YourTable
WHERE Prefix > 99999999; --If this is alphanumeric, use LEN, but this will come at a cost
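If the repeated collapse from the question is still needed, a set-based variant of each pass may be worth testing; here is a sketch, assuming the #Card temp table from the question and a minimum stem length of 7. Each pass groups on the stem (the prefix minus its last digit) across all lengths at once, and you rerun the block until it collapses nothing:
DECLARE @Stems TABLE (Stem NVARCHAR(20) PRIMARY KEY);

-- find every stem whose ten 0-9 children are all present
INSERT INTO @Stems (Stem)
SELECT LEFT(CardPrefix, LEN(CardPrefix) - 1)
FROM #Card
WHERE LEN(CardPrefix) > 7
GROUP BY LEFT(CardPrefix, LEN(CardPrefix) - 1)
HAVING COUNT(DISTINCT RIGHT(CardPrefix, 1)) = 10;

-- replace the ten children with their stem
DELETE C
FROM #Card C
INNER JOIN @Stems S ON S.Stem = LEFT(C.CardPrefix, LEN(C.CardPrefix) - 1);

INSERT INTO #Card (CardPrefix)
SELECT Stem FROM @Stems;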

How to extract every 7 characters of an nvarchar into another table?

I have an nvarchar(200) called ColumnA in Table1 that contains, for example, the value:
ABCDEFGHIJKLMNOPQRSTUVWXYZ
I want to extract every 7 characters into Table2, ColumnB and end up with all of these values below.
ABCDEFG
BCDEFGH
CDEFGHI
DEFGHIJ
EFGHIJK
FGHIJKL
GHIJKLM
HIJKLMN
IJKLMNO
JKLMNOP
KLMNOPQ
LMNOPQR
MNOPQRS
NOPQRST
OPQRSTU
PQRSTUV
QRSTUVW
RSTUVWX
STUVWXY
TUVWXYZ
[Not the real table and column names.]
The data is being loaded into Table1 and Table2 in an SSIS package, and I'm puzzling over whether it is better to do the string handling in T-SQL in a SQL Task or to parse out the string in a VB Script Component.
[Yes, I think we're the last four on the planet using VB in Script Components. I cannot persuade the other three that this C# thing is here to stay. Although, maybe it is a perfect time to go rogue.]
You can use a recursive CTE to calculate the offsets step by step, combined with substring().
WITH cte AS
(
    SELECT 1 n
    UNION ALL
    SELECT n + 1 n
    FROM cte
    WHERE n + 1 <= len('ABCDEFGHIJKLMNOPQRSTUVWXYZ') - 7 + 1
)
SELECT substring('ABCDEFGHIJKLMNOPQRSTUVWXYZ', n, 7)
FROM cte;
db<>fiddle
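One caveat with the recursive CTE: SQL Server defaults to a 100-level recursion cap, and ColumnA is an nvarchar(200), so longer values can exceed it. Adding a MAXRECURSION hint to the outer query lifts the cap:
SELECT substring('ABCDEFGHIJKLMNOPQRSTUVWXYZ', n, 7)
FROM cte
OPTION (MAXRECURSION 0); -- 0 removes the default 100-level limit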
If you have a physical numbers table, this is easy. If not, you can create a tally-on-the-fly:
DECLARE @string VARCHAR(100)='ABCDEFGHIJKLMNOPQRSTUVWXYZ';

--We create the tally using ROW_NUMBER against any table with enough rows.
WITH Tally(Nmbr) AS
(SELECT TOP(LEN(@string)-6) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values)
SELECT Nmbr
      ,SUBSTRING(@string,Nmbr,7) AS FragmentOf7
FROM Tally
ORDER BY Nmbr;
The idea in short:
The tally returns a list of numbers from 1 to n (n = LEN(@string) - 6). This number is used in SUBSTRING to define the starting position.
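To apply the tally to the actual tables instead of a variable, the same number list can be joined per row; here is a sketch, keeping the question's placeholder names Table1.ColumnA and Table2.ColumnB:
WITH Tally(Nmbr) AS
(SELECT TOP(200) ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) FROM master..spt_values)
INSERT INTO Table2 (ColumnB)
SELECT SUBSTRING(t.ColumnA, ty.Nmbr, 7) -- one 7-character window per tally number
FROM Table1 t
JOIN Tally ty ON ty.Nmbr <= LEN(t.ColumnA) - 6; -- TOP(200) covers the nvarchar(200) column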
You can do it with T-SQL like this:
DECLARE C CURSOR LOCAL FOR SELECT [ColumnA] FROM [Table1]
OPEN C
DECLARE @Val nvarchar(200);
FETCH NEXT FROM C INTO @Val
WHILE @@FETCH_STATUS = 0 BEGIN
    DECLARE @I INTEGER;
    SELECT @I = 1;
    WHILE @I <= LEN(@Val) - 6 BEGIN
        PRINT SUBSTRING(@Val, @I, 7)
        SELECT @I = @I + 1
    END
    FETCH NEXT FROM C INTO @Val
END
CLOSE C
DEALLOCATE C
Script Component solution
Assuming that the input Column name is Column1
Add a script component
Open the script component configuration form
Go to Inputs and Outputs Tab
Click on the Output icon and set the Synchronous Input property to None
Add an Output column (example outColumn1)
In the Script editor, use code similar to the following in the row-processing method:
Dim idx As Integer = 0
While Row.Column1.Length >= idx + 7
    Output0Buffer.AddRow()
    Output0Buffer.outColumn1 = Row.Column1.Substring(idx, 7)
    idx += 1
End While

Generating a unique key in SQL Server on a per company basis

I store positions in a SQL Server 2012 database, where each position is defined by a position number and a company number.
The position numbers are unique for each company only.
For instance, my database could have the following
POSITION_NO COMPANY_NO
1 1
2 1
3 1
1 2
2 2
3 2
1 3
I need a function which takes a company number as a parameter, and returns the next sequential position number, which in the example table above would be 2 for COMPANY_NO = 3
What I use at the moment is:
CREATE PROCEDURE [DB].[GenerateKey]
    @p_company_no float(53),
    @return_value_argument float(53) OUTPUT
AS
BEGIN
    DECLARE @v_position_no numeric(5, 0)

    SELECT @v_position_no = max(POSITION_NO) + 1
    FROM DB.POSITION_TABLE with (nolock)
    WHERE COMPANY_NO = @p_company_no

    SET @return_value_argument = @v_position_no
    RETURN
END
I am aware of the potential pitfalls of using with (nolock); it was added in an unsuccessful attempt to prevent locking problems in my database. In fact, besides the fact that well-written code is obviously preferable, the main reason I am asking this question is to cut down the number of places that could be causing the locking.
Is there any way my code could be improved?
Create an auxiliary table with sequences, with one row for every company (as you already did):
create table seq (company int, sequence int);
go
Seed the counters, one for every company (say there are two companies, 1 and 2):
insert seq values
(1, 1), (2, 1);
go
Then all you need is a way to both update and select the new value in a single statement to avoid race conditions. This is how to do it:
declare @next int;
declare @company int;

set @company = 2;

update seq
set @next = sequence = sequence + 1
where company = @company;

select @next
It would be nice to wrap this in a scalar function, but unfortunately updates are not allowed in functions. You already have a stored procedure in place, though, so just modify its code, as sketched below.
And please tell me that the datatypes used are not really floats? Why not ints?
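A minimal sketch of that rework, assuming the seq table above (and int instead of float, as just suggested):
CREATE PROCEDURE [DB].[GenerateKey]
    @p_company_no int,
    @return_value_argument int OUTPUT
AS
BEGIN
    -- increment and read this company's counter in one atomic statement
    UPDATE seq
    SET @return_value_argument = sequence = sequence + 1
    WHERE company = @p_company_no;
END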
WHILE (1=1)
BEGIN
    SELECT @v_position_no = max(POSITION_NO)
    FROM DB.POSITION_TABLE with (nolock)
    WHERE COMPANY_NO = @p_company_no

    INSERT INTO DB.POSITION_TABLE
        (COMPANY_NO, POSITION_NO)
    SELECT TOP 1 @p_company_no, @v_position_no + 1
    FROM DB.POSITION_TABLE with (nolock)
    WHERE NOT EXISTS (SELECT 1
                      FROM DB.POSITION_TABLE with (nolock)
                      WHERE COMPANY_NO = @p_company_no
                      AND POSITION_NO = @v_position_no + 1)

    IF (@@ROWCOUNT > 0)
        BREAK;
END
SET @return_value_argument = @v_position_no + 1
Note that the second statement only inserts if POSITION_NO + 1 hasn't been taken in the meantime; if it has, the loop simply tries again.

Exists vs select count

In SQL Server, performance-wise, it is better to use IF EXISTS (select * ...) than IF (select count(1) ...) > 0.
However, it looks like Oracle does not allow EXISTS inside an IF statement. What would be an alternative, given that IF (select count(1) into ...) is very inefficient performance-wise?
Example of code:
IF (select count(1) from _TABLE where FIELD IS NULL) > 0 THEN
    UPDATE _TABLE
    SET FIELD = VAR
    WHERE FIELD IS NULL;
END IF;
The best way to write your code snippet is:
UPDATE _TABLE
SET FIELD = VAR
WHERE FIELD IS NULL;
i.e. just do the update. It will either process rows or not. If you need to check whether it did process rows, then add afterwards:
if (sql%rowcount > 0)
then
...
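Put together, a minimal sketch with the question's placeholder names:
begin
  update _TABLE
  set FIELD = VAR
  where FIELD is null;

  if (sql%rowcount > 0) then
    null; -- rows were updated; any follow-up logic goes here
  end if;
end;
/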
Generally, in cases where you have logic like:
declare
v_cnt number;
begin
select count(*)
into v_cnt
from TABLE
where ...;
if (v_cnt > 0) then..
it's best to use ROWNUM = 1, because you DON'T CARE if there are 40 million rows... just have Oracle stop after finding 1 row:
declare
v_cnt number;
begin
select count(*)
into v_cnt
from TABLE
where rownum = 1
and ...;
if (v_cnt > 0) then..
or
select count(*)
into v_cnt
from dual
where exists (select null
from TABLE
where ...);
whichever syntax you prefer.
As per:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:3069487275935
You could try:
for x in ( select count(*) cnt
             from dual
            where exists ( select NULL from foo where bar ) )
loop
    if ( x.cnt = 1 )
    then
        null; -- found: do something
    else
        null; -- not found
    end if;
end loop;
This is one way (very fast: WHERE EXISTS only runs the subquery as long as it "needs" to, stopping after hitting the first row).
That loop always executes exactly once, since a count(*) on a table without a GROUP BY clause ALWAYS returns at least one row and at most one row (even if the table itself is empty!).