Prevent from duplicates on async insert on postgres

Prevent from duplicates on async insert on postgres - database

I have a trigger that basically get the last number inserted (folio) based on two columns (service_type and prefix) and then increment one value. Everything works fine but in some situations when there a lot of insert statements at the same time it causes that the function use the same last value inserted and it duplicates the folio column
CREATE OR REPLACE FUNCTION public.insert_folio()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE incremental INTEGER;
SELECT max(f.folio) INTO incremental FROM "folios" f
WHERE f.prefix = NEW."prefix" AND service_type = NEW."service_type";
NEW."folio" = incremental + 1;
IF NEW."folio" IS NULL THEN NEW."folio" = 1;
END IF;
RETURN NEW;
END;
$function$
CREATE TRIGGER insert_folio BEFORE INSERT ON "folios" FOR EACH ROW EXECUTE PROCEDURE insert_folio();
sample:
folio
service_type
prefix
1
DOCUMENT
DC
1
IMAGE
IMG
1
IMAGE
O
2
IMAGE
O
2 (This should be 3)
IMAGE
O
Any ideas?
Thanks!

It is because you have concurrent transactions that both see the same data. The second transaction does not see the record inserted by the first transaction until the first commits.
To prevent this behaviour you will have to lock the whole table when inserting to prevent concurrent writes or use advisory locks to prevent concurrent insertions of the same service_type and prefix.

Related

Table insertions using stored procedures?

(Submitting for a Snowflake User, hoping to receive additional assistance)
Is there another way to perform table insertion using a stored procedure faster?
I started building a usp with the purpose to insert million or so of rows of test data into a table for the purpose of load testing.
I got to this stage show below and set the iteration value to 10,000.
This took over 10 mins to iterate 10,000 times to insert a single integer into a table each iteration
Yes - I am using a XS data warehouse, but even if this is increased to MAX - this is way to slow to be of any use.
--build a test table
CREATE OR REPLACE TABLE myTable
(
myInt NUMERIC(18,0)
);
--testing a js usp using a while statement with the intention to insert multiple rows into a table (Millions) for load testing
CREATE OR REPLACE PROCEDURE usp_LoadTable_test()
RETURNS float
LANGUAGE javascript
EXECUTE AS OWNER
AS
$$
//set the number of iterations
var maxLoops = 10;
//set the row Pointer
var rowPointer = 1;
//set the Insert sql statement
var sql_insert = 'INSERT INTO myTable VALUES(:1);';
//Insert the fist Value
sf_startInt = rowPointer + 1000;
resultSet = snowflake.execute( {sqlText: sql_insert, binds: [sf_startInt] });
//Loop thorugh to insert all other values
while (rowPointer < maxLoops)
{
rowPointer += 1;
sf_startInt = rowPointer + 1000;
resultSet = snowflake.execute( {sqlText: sql_insert, binds: [sf_startInt] });
}
return rowPointer;
$$;
CALL usp_LoadTable_test();
So far, I've received the following recommendations:
Recommendation #1
One thing you can do is to use a "feeder table" containing 1000 or more rows instead of INSERT ... VALUES, eg:
INSERT INTO myTable SELECT <some transformation of columns> FROM "feeder table"
Recommendation #2
When you perform a million single row inserts, you consume one million micropartitions - each 16MB.
That 16 TB chunk of storage might be visible on your Snowflake bill ... Normal tables are retained for 7 days minimum after drop.
To optimize storage, you could define a clustering key and load the table in ascending order with each chunk filling up as much of a micropartition as possible.
Recommendation #3
Use data generation functions that work very fast if you need sequential integers: https://docs.snowflake.net/manuals/sql-reference/functions/seq1.html
Any other ideas?

This question was also asked at the Snowflake Lodge some weeks ago.
Given the answers you received, do you still feel unanswered, then maybe hint about why?
If you just want a table with a single column of sequence numbers, use GENERATOR() as in #3 above. Otherwise, if you want more advice, share your specific requirements.

How do I set the correct transaction level?

I am using Dapper on ADO.NET. So at present I am doing the following:
using (IDbConnection conn = new SqlConnection("MyConnectionString")))
{
conn.Open());
using (IDbTransaction transaction = conn.BeginTransaction())
{
// ...
However, there are various levels of transactions that can be set. I think this is the various settings.
My first question is how do I set the transaction level (where I am using Dapper)?
My second question is what is the correct level for each of the following cases? In each of these cases we have multiple instances of a web worker (Azure) service running that will be hitting the DB at the same time.
I need to run monthly charges on subscriptions. So in a transaction I need to read a record and if it's due for a charge create the invoice record and mark the record as processed. Any other read of that record for the same purpose needs to fail. But any other reads of that record that are just using it to verify that it is active need to succeed.
So what transaction do I use for the access that will be updating the processed column? And what transaction do I use for the other access that just needs to verify that the record is active?
In this case it's fine if a conflict causes the charge to not be run (we'll get it the next day). But it is critical that we not charge someone twice. And it is critical that the read to verify that the record is active succeed immediately while the other operation is in its transaction.
I need to update a record where I am setting just a couple of columns. One use case is I set a new password hash for a user record. It's fine if other access occurs during this except for deleting the record (I think that's the only problem use case). If another web service is also updating that's the user's problem for doing this in 2 places simultaneously.
But it's key that the record stay consistent. And this includes the use case of "set NumUses = NumUses + #ParamNum" so it needs to treat the read, calculation, write of the column value as an atomic action. And if I am setting 3 column values, they all get written together.

1) Assuming that Invoicing process is an SP with multiple statements your best bet is to create another "lock" table to store the fact that invoicing job is already running e.g.
CREATE TABLE InvoicingJob( JobStarted DATETIME, IsRunning BIT NOT NULL )
-- Table will only ever have one record
INSERT INTO InvoicingJob
SELECT NULL, 0
EXEC InvoicingProcess
ALTER PROCEDURE InvoicingProcess
AS
BEGIN
DECLARE #InvoicingJob TABLE( IsRunning BIT )
-- Try to aquire lock
UPDATE InvoicingJob WITH( TABLOCK )
SET JobStarted = GETDATE(), IsRunning = 1
OUTPUT INSERTED.IsRunning INTO #InvoicingJob( IsRunning )
WHERE IsRunning = 0
-- job has been running for more than a day i.e. likely crashed without releasing a lock
-- OR ( IsRunning = 1 AND JobStarted <= DATEADD( DAY, -1, GETDATE())
IF NOT EXISTS( SELECT * FROM #InvoicingJob )
BEGIN
PRINT 'Another Job is already running'
RETURN
END
ELSE
RAISERROR( 'Start Job', 0, 0 ) WITH NOWAIT
-- Do invoicing tasks
WAITFOR DELAY '00:01:00' -- to simulate execution time
-- Release lock
UPDATE InvoicingJob
SET IsRunning = 0
END
2) Read about how transactions work: https://learn.microsoft.com/en-us/sql/t-sql/language-elements/transactions-transact-sql?view=sql-server-2017
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql?view=sql-server-2017
You second question is quite broad.

Error update trigger after new row has inserted into same table

I want to update OrigOrderNbr and OrigOrderType (QT type) because when I create first both of column are Null value. But after S2 was created (QT converted to S2) the OrigOrderType and OrigOrderNbr (S2) take from QT reference. Instead of that, I want to update it to QT also.
http://i.stack.imgur.com/6ipFa.png
http://i.stack.imgur.com/E6qzT.png
CREATE TRIGGER tgg_SOOrder
ON dbo.SOOrder
FOR INSERT
AS
DECLARE #tOrigOrderType char(2),
#tOrigOrderNbr nvarchar(15)
SELECT #tOrigOrderType = i.OrderType,
#tOrigOrderNbr = i.OrderNbr
FROM inserted i
UPDATE dbo.SOOrder
SET OrigOrderType = #tOrigOrderType,
OrigOrderNbr = #tOrigOrderNbr
FROM inserted i
WHERE dbo.SOOrder.CompanyID='2'
and dbo.SOOrder.OrderType=i.OrigOrderType
and dbo.SOOrder.OrderNbr=i.OrigOrderNbr
GO
After I run that trigger, it showed the message 'Error #91: Another process has updated 'SOOrder' record. Your changes will be lost.'.

Per long string of comments, including some excellent suggestions in regards to proper trigger writing techniques by #marc_s and #Damien_The_Unbeliever, as well as my better understanding of your issue at this point, here's the re-worked trigger:
CREATE TRIGGER tgg_SOOrder
ON dbo.SOOrder
FOR INSERT
AS
--Update QT record with S2 record's order info
UPDATE SOOrder
SET OrigOrderType = 'S2'
, OrigOrderNbr = i.OrderNbr
FROM SOOrder dest
JOIN inserted i
ON dest.OrderNbr = i.OrigOrderNbr
WHERE dest.OrderType = 'QT'
AND i.OrderType = 'S2'
AND dest.CompanyID = 2 --Business logic constraint
AND dest.OrigOrderNbr IS NULL
AND dest.OrigOrderType IS NULL
Basically, the idea is to update any record of type "QT" once a matching record of type "S2" is created. Matching here means that OrigOrderNbr of S2 record is the same as OrderNbr of QT record. I kept your business logic constraint in regards to CompanyID being set to 2. Additionally, we only care to modify QT records that have OrigOrderNbr and OrigOrderType set to NULL.
This trigger does not rely on a single-row insert; it will work regardless of the number of rows inserted - which is far less likely to break down the line.

Add Multiple IDs in a Transaction to a LinkSet in Orientdb

I'm trying to add IDs from a inserted column to another inserted column in a Transaction. I tried like this:
begin
let doorOne = INSERT INTO doors SET color = green
let doorTwo = INSERT INTO doors SET color = blue
let car = INSERT INTO Cars SET doors = [$doorOne , $doorTwo]
commit retry 100
return $car
I get:
Unhandled rejection OrientDB.RequestError: The field 'Cars.doors' has been declared as LINKSET but the value is not a record or a record-id
I also tried to update it afterwards in the Transaction, but this won't work either (i think because the car is not created yet, so nothing to update) and i dont want to do it in two different calls, if there is a way to do it in one Transaction.

This is because INSERT returns, by default, the number of inserted entries. Try adding this clause at the end of each INSERT statement: RETURN #rid

How can I copy records between tables only if they are valid according to check constraints in Oracle?

I don't know if that is possible, but I want to copy a bunch of records from a temp table to a normal table. The problem is that some records may violate check constraints so I want to insert everything that is possible and generate error logs somewhere else for the invalid records.
If I execute:
INSERT INTO normal_table
SELECT ... FROM temp_table
nothing would be inserted if any record violates any constraint. I could make a loop and manually insert one by one, but I think the performance would be lower.
Ps: if possible, I'd like a solution that works with Oracle 9

From Oracle 10gR2, you can use the log errors clause:
EXECUTE DBMS_ERRLOG.CREATE_ERROR_LOG('NORMAL_TABLE');
INSERT INTO normal_table
SELECT ... FROM temp_table
LOG ERRORS REJECT LIMIT UNLIMITED;
In its simplest form. You can then see what errors you got:
SELECT ora_err_mesg$
FROM err$_normal_table;
More on the CREATE_ERROR_LOG step here.
I think this approach works from 9i, but don't have an instance available to test on, so this is actually run on 11gR2
Update: tested and tweaked (to avoid PLS-00436) in 9i:
declare
type t_temp_table is table of temp_table%rowtype;
l_temp_table t_temp_table;
l_err_code err_table.err_code%type;
l_err_msg err_table.err_msg%type;
l_id err_table.id%type;
cursor c is select * from temp_table;
error_array exception;
pragma exception_init(error_array, -24381);
begin
open c;
loop
fetch c bulk collect into l_temp_table limit 100;
exit when l_temp_table.count = 0;
begin
forall i in 1..l_temp_table.count save exceptions
insert into normal_table
values l_temp_table(i);
exception
when error_array then
for j in 1..sql%bulk_exceptions.count loop
l_id := l_temp_table(sql%bulk_exceptions(j).error_index).id;
l_err_code := sql%bulk_exceptions(j).error_code;
l_err_msg := sqlerrm(-1 * sql%bulk_exceptions(j).error_code);
insert into err_table(id, err_code, err_msg)
values (l_id, l_err_code, l_err_msg);
end loop;
end;
end loop;
end;
/
With all your real columns instead of just id, which I've done just for demo purposes:
create table normal_table(id number primary key);
create table temp_table(id number);
create table err_table(id number, err_code number, err_msg varchar2(2000));
insert into temp_table values(42);
insert into temp_table values(42);
Then run the anonymous block above...
select * from normal_table;
ID
----------
42
column err_msg format a50
select * from err_table;
ID ERR_CODE ERR_MSG
---------- ---------- --------------------------------------------------
42 1 ORA-00001: unique constraint (.) violated
This is less satisfactory on a few levels - more coding, slower if you have a lot of exceptions (because of the individual inserts for those), doesn't show which constraint was violated (or any other error details), and won't retain the errors if you rollback - though you could call an autonomous transaction to log it if that was an issue, which I doubt here.
If you have a small enough volume of data to not want to worry about the limit clause you can simplify it a bit:
declare
type t_temp_table is table of temp_table%rowtype;
l_temp_table t_temp_table;
l_err_code err_table.err_code%type;
l_err_msg err_table.err_msg%type;
l_id err_table.id%type;
error_array exception;
pragma exception_init(error_array, -24381);
begin
select * bulk collect into l_temp_table from temp_table;
forall i in 1..l_temp_table.count save exceptions
insert into normal_table
values l_temp_table(i);
exception
when error_array then
for j in 1..sql%bulk_exceptions.count loop
l_id := l_temp_table(sql%bulk_exceptions(j).error_index).id;
l_err_code := sql%bulk_exceptions(j).error_code;
l_err_msg := sqlerrm(-1 * sql%bulk_exceptions(j).error_code);
insert into err_table(id, err_code, err_msg)
values (l_id, l_err_code, l_err_msg);
end loop;
end;
/
The 9i documentation doesn't seem to be online any more, but this is in a new-features document, and lots of people have written about it - it's been asked about here before too.

If you're specifically interested only in check constraints then one method to think about is to read the definitions of the target check constraints from the data dictionary and apply them as predicates to the query that extracts data from the source table using dynamic sql.
Given:
create table t1 (
col1 number check (col1 between 3 and 10))
You can:
select constraint_name,
search_condition
from user_constraints
where constraint_type = 'C' and
table_name = 'T1'
The result being:
"SYS_C00226681", "col1 between 3 and 10"
From there it's "a simple matter of coding", as they say, and the method will work on just about any version of Oracle. The most efficient method would probably be to use a multitable insert to direct rows to either the intended target table or to an error logging table based on the result of a CASE statement that applies the check constraint predicates.