Postgres long-running transaction holding lock on parent partitioned table

TL;DR: we have long-running imports which seem to hold locks on the parent partitioned table even though nothing is directly referencing the parent table.
Background
In our system, we have inventories and inventory_items. Inventories tend to have 200k or so items, and it made sense for our access patterns to partition the inventory_items table by inventory_id using native partitioning (we're on Postgres 12). In other words, each inventory gets its own partitioned table of inventory_items. This is accomplished with the following DDL:
CREATE TABLE public.inventory_items (
    inventory_id integer NOT NULL,
    /* ... */
)
PARTITION BY LIST (inventory_id);
In our app code, when an inventory is created via the web dashboard, we automatically create the partitioned child inventory_items table via:
CREATE TABLE IF NOT EXISTS inventory_items_#{inventory_id}
PARTITION OF inventory_items
FOR VALUES IN (#{inventory_id});
Long import jobs block creating new inventories
It's typical for these inventories to be fully reloaded / reimported once per day, via CSV or otherwise, and these import tasks can sometimes take a while.
We noticed that while these long imports are running, it's not possible to create a new inventory. As mentioned above, creating an inventory means creating its partitioned child inventory_items table, and there is some lock contention between the long-running import and that CREATE TABLE in the web dashboard. This is bad: we can't block users from creating inventories just because there's a totally unrelated import happening.
Sequence of events / locks when trying to create Inventory while import running
I'm using the following query in psql to determine who holds what locks:
select pid, relname, mode
from pg_locks l
join pg_class t on l.relation = t.oid
where t.relkind = 'r';
This query returns successfully obtained/held locks; it will not display pids that are waiting to obtain a lock (because some other pid holds it). For those, you have to look at the postgres logs.
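(For what it's worth, pg_locks also has a granted column, and partitioned parents have relkind 'p' rather than 'r', so a variant of the query that also surfaces pending lock requests and the parent table would be:)
select pid, relname, mode, granted
from pg_locks l
join pg_class t on l.relation = t.oid
where t.relkind in ('r', 'p');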
Starting the slow import
Once the import starts, the worker process (pid 9029) grabs the following locks:
pid | relname | mode
------+--------------------+------------------
9029 | inventory_items_16 | AccessShareLock
9029 | inventory_items_16 | RowExclusiveLock
The inventory that we're importing into has an id of 16, so the locks being held are on the partitioned child inventory_items table that belongs to that inventory. Note that there don't appear to be any locks on the parent inventory_items table.
Attempt to create inventory in the web dashboard
When I try to create an inventory in the dashboard, the request stalls and times out due to our 30s SQL statement timeout. Before it times out, the locks look like this:
pid | relname | mode
------+--------------------+------------------
7089 | inventories | RowExclusiveLock
9029 | inventory_items_16 | AccessShareLock
9029 | inventory_items_16 | RowExclusiveLock
PID 7089 is the web server. It successfully grabs a RowExclusiveLock on inventories (from the INSERT INTO inventories), but looking at the postgres logs, it's attempting and failing to grab an AccessExclusiveLock on relation 119795, which is the parent inventory_items table:
postgres.7089 [RED] [29-1] sql_error_code = 00000 LOG: statement: CREATE TABLE IF NOT EXISTS inventory_items_16
postgres.7089 [RED] [29-2] PARTITION OF inventory_items
postgres.7089 [RED] [29-3] FOR VALUES IN (16);
postgres.7089 [RED] [29-4]
postgres.7089 [RED] [30-1] sql_error_code = 00000 LOG: process 7089 still waiting for AccessExclusiveLock on relation 119795 of database 16402 after 1000.176 ms
postgres.7089 [RED] [30-2] sql_error_code = 00000 DETAIL: Process holding the lock: 9029. Wait queue: 7089.
postgres.7089 [RED] [30-3] sql_error_code = 00000 STATEMENT: CREATE TABLE IF NOT EXISTS inventory_items_16
postgres.7089 [RED] [30-4] PARTITION OF inventory_items
postgres.7089 [RED] [30-5] FOR VALUES IN (16);
I figure the reason an AccessExclusiveLock is needed on the parent table when creating a child partition is that postgres needs to update some internal schema-y metadata on the parent so it can route rows with inventory_id=16 to this new table, which makes sense to me.
But, judging by my pg_locks query, I don't understand where the lock contention is coming from. The web server needs an AccessExclusiveLock on the parent table, but pg_locks shows that the only locks held are on the child inventory_items_16 table.
So, what could be happening here? Do locks on child tables "expand" into locks on the parent table, or otherwise contend with locks on the parent table?
And is there some other way we could approach this problem? We feel pretty confident in our decision to partition these tables, but this unexpected lock contention is causing real problems, so we're looking for a clean, minimal-maintenance way to keep this basic architecture.
Last little tidbit
In rare cases, the presence of an active import does NOT block the web worker. 90% of the time it does, but sometimes it doesn't. So, somewhere in this mix is a tiny bit of nondeterminism which confounds everything.

Creating a partition with CREATE TABLE ... PARTITION OF ... requires an ACCESS EXCLUSIVE lock on the partitioned table, which will conflict with all access to the partitioned table.
On the other hand, inserting into the partition requires an ACCESS SHARE lock on the partitioned table while the insert statement is being planned. That causes a lock conflict.
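A minimal sketch of the conflict, assuming the schema from the question (and that the other columns of inventory_items are nullable or defaulted):
-- session 1: simulate the long import
BEGIN;
INSERT INTO inventory_items_16 (inventory_id) VALUES (16);
-- planning this INSERT takes ACCESS SHARE on the parent inventory_items,
-- and that lock is held until the transaction ends

-- session 2: simulate creating a new inventory; this blocks until session 1 commits
CREATE TABLE IF NOT EXISTS inventory_items_17
    PARTITION OF inventory_items FOR VALUES IN (17);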
I see two ways out:
Create new partitions in two steps:
CREATE TABLE inventory_items_42 (
    LIKE inventory_items INCLUDING DEFAULTS INCLUDING CONSTRAINTS
);
ALTER TABLE inventory_items
    ATTACH PARTITION inventory_items_42 FOR VALUES IN (42);
That requires only a SHARE UPDATE EXCLUSIVE lock on the partitioned table (from PostgreSQL v12 on), which is compatible with concurrent inserts.
Use a server-side prepared statement for the INSERT into the partition, and make sure you prepare the statement before you start the long-running transaction that loads the data. You can use PostgreSQL's PREPARE and EXECUTE statements for that, or use your API's facilities.
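A sketch of that second option with PREPARE/EXECUTE (the column list here is made up, since the full table definition isn't shown in the question):
-- once per session, before the long-running import transaction starts
PREPARE insert_item_16 (integer, text) AS
    INSERT INTO inventory_items_16 (inventory_id, sku) VALUES ($1, $2);

-- then, inside the import transaction
BEGIN;
EXECUTE insert_item_16(16, 'ABC-0001');
-- ... one EXECUTE per row ...
COMMIT;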

Related

How can I change the order that data is inserted into multiple tables with a Script Component in SSIS?

I have a script component in SSIS that parses the data into 3 outputs, which in turn are inserted into three tables. The problem is that one of these tables is a foreign key table and depends on the first two tables having their rows inserted first. Put another way, it would look like this:
Person
--------
Id
Name
Age
Job
--------
Id
Job Title
Hourly Pay
PersonJob
--------
PersonId (FK to Person.Id)
JobId (FK to Job.Id)
How can I have my Script Component insert the parsed output into the Person and Job tables first, and then the PersonJob table after?
Since I have complete control over the data being parsed into these tables, I ended up creating two Execute SQL Tasks in my control flow.
Basically, before my package started on the data flow, I would remove the constraint. Then, once the data flow was complete, I would enable the constraint again. I'm not happy with the process, as I don't like having to mess with the constraints when it's simply a matter of inserting the rows into the tables in a specific order.
However, I do think it's a lot more efficient than having to parse the data twice or use a staging table. Either way, thanks for the advice!
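For reference, the two Execute SQL Tasks might contain something along these lines (a sketch only; the table name comes from the example above):
-- before the data flow: stop enforcing the foreign keys on PersonJob
ALTER TABLE PersonJob NOCHECK CONSTRAINT ALL;

-- after the data flow: re-enable them and re-validate the rows that were loaded
ALTER TABLE PersonJob WITH CHECK CHECK CONSTRAINT ALL;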

Reorganize ID of tables in postgreSQL so that the ID starts again at 1?

Hello, I am currently trying different data automation processes with Python and PostgreSQL. I automated the cleaning and upload of a dataset with 40,000 data entries into my database. Due to some flaws in my process I had to truncate some tables or data entries.
I am using: Python 3.9.7 / PostgreSQL 13.3 / pgAdmin 4 v5.7
Problem
Currently I have tables whose IDs start at 44700 instead of 1 (due to my editing).
For example, a table of train stations begins with the ID 41801 and ends with the ID 83599.
Question
How can I reorganize my IDs so that they start at 1 instead of 41801?
After looking online I found topics like "bloat" and "reindex". I tried VACUUM and REINDEX but nothing really showed a difference in my tables. As of now my tables have no relations to each other. What would be the approach to solve my problem in PostgreSQL? Some hidden function I overlooked? Maybe it's not a problem at all, but it definitely looks weird. At some point I will end up with an ID of 250,000 while only having 40,000 data entries in my table.
Do you use a sequence to generate the ID column of your table? You can check in pgAdmin whether you have a sequence object in your database: Schemas -> public -> Sequences.
You can change the current sequence number by right-clicking on the sequence and setting it to 1. But only do this if you have deleted all rows in the table, and before you start to import your data again.
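The same reset can be done in SQL (the sequence name here is a guess based on PostgreSQL's default naming convention):
-- make the next generated ID be 1
ALTER SEQUENCE trainstations_id_seq RESTART WITH 1;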
As long as you do not have any other table which references the ID column of your train station table, you can even update the IDs with an update statement like:
UPDATE trainStations SET ID = ID - 41800;  -- 41801 becomes 1, 41802 becomes 2, and so on

Is it possible to create a trigger to move data between databases in PostgreSQL?

I will try to simplify my problem:
let's say that I have 2 databases, let's call them DBA and DBB,
I have this table on DBA
shopping
id - name - amount
and on my DBB I have this other table:
shopping_hist
id - name - amount
At the end of every month, I generate a dump of the shopping table on DBA and copy its data into the shopping_hist table on DBB. Is it possible to create a trigger so that for every insert on shopping, it also makes an insert on shopping_hist, even though they are not in the same database?
I know that if they were in the same database, even if not in the same schema, it would be possible, but I'm not finding anything to automate this across distinct databases.
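One common approach (not something from the post itself) is to expose the remote table through postgres_fdw and have an ordinary trigger write to it. A rough sketch, with the server name, credentials and column types all assumed:
-- on DBA: make shopping_hist in DBB reachable as a foreign table
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER dbb_server FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'localhost', dbname 'dbb');

CREATE USER MAPPING FOR CURRENT_USER SERVER dbb_server
    OPTIONS (user 'app_user', password 'secret');

CREATE FOREIGN TABLE shopping_hist_remote (
    id     integer,
    name   text,
    amount numeric
) SERVER dbb_server OPTIONS (table_name 'shopping_hist');

-- ordinary trigger on shopping that mirrors every insert into the foreign table
CREATE FUNCTION copy_to_hist() RETURNS trigger AS $$
BEGIN
    INSERT INTO shopping_hist_remote (id, name, amount)
    VALUES (NEW.id, NEW.name, NEW.amount);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER shopping_to_hist
    AFTER INSERT ON shopping
    FOR EACH ROW EXECUTE FUNCTION copy_to_hist();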

How to find out what is acting on an Oracle database table?

There is this table in my Oracle database that is used to store audit information.
When I first did a SELECT * on that table, the audit timestamps were all on the same day, within the same hour (e.g. 18/10/2013 15:06:45, 18/10/2013 15:07:29); the next time I did it, the previous entries were gone, and the table then only contained entries with the 16:mm:ss timestamp.
I think something is acting on that table, such that at some interval the table contents are backed up somewhere (I don't know where) and the table is then cleared. However, as I'm not familiar with databases, I'm not sure what is doing this.
I'd like to know how I can find out what is acting on this table, so that I can in turn retrieve the previous data I need.
EDIT:
What I've tried thus far...
SELECT * FROM DBA_DEPENDENCIES WHERE REFERENCED_NAME='MY_AUDIT_TABLE';
I got back four results, all of which were (as far as I can tell with my programming skills) about putting data into the table, none about backing it up anywhere.
SELECT * FROM MY_AUDIT_TABLE AS OF TIMESTAMP ...
This only gives me a snapshot at a certain time, but since the table is being updated very frequently, it does not make sense for me to query every second.
The dba_dependencies view will give you an idea of what procedures, functions, etc. act on the table:
SELECT * FROM DBA_DEPENDENCIES WHERE REFERENCED_NAME='MY_AUDIT_TABLE';
where MY_AUDIT_TABLE is the audit table name
If a synonym for the table is used in the database, then:
SELECT * FROM DBA_DEPENDENCIES WHERE REFERENCED_NAME='MY_AUDIT_TABLE_SYNONYM';
where MY_AUDIT_TABLE_SYNONYM is the synonym for MY_AUDIT_TABLE
Or, to check whether any triggers are acting on the table:
Select * from dba_triggers where table_name='MY_AUDIT_TABLE';
If an external script is processing the table, you can ask your DBA to turn on fine-grained auditing (FGA) for the table.
Then query the view DBA_FGA_AUDIT_TRAIL with a timestamp between 15:00:00 and 16:00:00 to check the external call (the OS_PROCESS column gives the operating system process ID) or what SQL (SQL_TEXT) is executing against the table.
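A hedged sketch of what that could look like (the schema and policy names are assumptions, and it requires the appropriate privileges):
-- enable fine-grained auditing on the table
BEGIN
  DBMS_FGA.ADD_POLICY(
    object_schema   => 'APP_OWNER',
    object_name     => 'MY_AUDIT_TABLE',
    policy_name     => 'MY_AUDIT_TABLE_FGA',
    statement_types => 'SELECT,INSERT,UPDATE,DELETE');
END;
/

-- then look at what touched the table in the window in question
SELECT timestamp, os_process, sql_text
FROM   dba_fga_audit_trail
WHERE  object_name = 'MY_AUDIT_TABLE'
AND    timestamp BETWEEN TO_DATE('18/10/2013 15:00:00', 'DD/MM/YYYY HH24:MI:SS')
                     AND TO_DATE('18/10/2013 16:00:00', 'DD/MM/YYYY HH24:MI:SS');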

Maintaining audit log for entities split across multiple tables

We have an entity split across 5 different tables. Records in 3 of those tables are mandatory. Records in the other two tables are optional (based on sub-type of entity).
One of the tables is designated the entity master. Records in the other four tables are keyed by the unique id from master.
An AFTER UPDATE/DELETE trigger is present on each table, and a change to a record saves off history (from the deleted table inside the trigger) into a related history table. Each history table contains the related entity fields + a timestamp.
So, live records are always in the live tables and history/changes are in history tables. Historical records can be ordered based on the timestamp column. Obviously, timestamp columns are not related across history tables.
Now, for the more difficult part.
1. Records are initially inserted in a single transaction. Either 3 or 5 records will be written in a single transaction.
2. Individual updates can happen to any or all of the 5 tables.
3. All records are updated as part of a single transaction. Again, either 3 or 5 records will be updated in a single transaction.
4. Number 2 can be repeated multiple times.
5. Number 3 can be repeated multiple times.
The application is supposed to display a list of point-in-time history entries based on records written as single transactions only (points 1, 3 and 5 only).
I'm currently having problems with an algorithm that will retrieve historical records based on timestamp data alone.
Adding a HISTORYMASTER table to hold the extra information about transactions seems to partially address the problem. A new record is added into HISTORYMASTER before every transaction. New HISTORYMASTER.ID is saved into each entity table during a transaction.
Point in time history can be retrieved by selecting the first record for a particular HISTORYMASTER.ID (ordered by timestamp)
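A rough sketch of that retrieval (the table and column names here are illustrative only, not the real schema):
-- the earliest history row per HISTORYMASTER.ID is the point-in-time entry
SELECT hm.ID, h.*
FROM   HISTORYMASTER hm
JOIN   ENTITY_MASTER_HIST h ON h.HISTORYMASTER_ID = hm.ID
WHERE  h.HIST_TIMESTAMP = (SELECT MIN(h2.HIST_TIMESTAMP)
                           FROM   ENTITY_MASTER_HIST h2
                           WHERE  h2.HISTORYMASTER_ID = hm.ID);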
Is there any more optimal way to manage audit tables based on AFTER (UPDATE, DELETE) TRIGGERs for entities spanning multiple tables?
Your HistoryMaster seems similar to how we have addressed history of multiple related items in one of our systems. By having a single point to hang all the related changes from in the history table, it is easy to then create a view that uses the history master as the hub and attaches the related information. It also lets you skip creating history records where an audit is not desired.
In our case the primary tables were called EntityAudit (where Entity was the "primary" item being retained) and all data was stored in EntityHistory tables related back to the audit record. We were using a data layer for business rules, so it was easy to insert the audit rules into the data layer itself. I feel that the data layer is an optimal point for such tracking if and only if all modifications go through that data layer. If you have multiple applications using distinct data layers (or none at all), then I suspect that a trigger that creates the master record is pretty much the only way to go.
If you don't have additional information to track in the audit (we track the user who made the change, for example, something not on the main tables) then I would contemplate putting the extra audit ID on the "primary" record itself. Your description does not seem to indicate you are interested in the minor changes to individual tables, but only in changes that update the entire entity set (although I may be misreading that). I would only do so if you don't care about the minor edits, though. In our case, we needed to track all changes, even to the related records.
Note that the use of an Audit/Master table has an advantage in that you are making minimal changes to the history tables as compared to the source tables: a single AuditID (in our case, a GUID, although autonumbers would be fine in non-distributed databases).
Can you add a TimeStamp / RowVersion datatype column to the entity master table, and associate all the audit records with that?
But an Update to any of the "child" tables will need to update the Master entity table to force the TimeStamp / RowVersion to change :(
Or stick a GUID in there that you freshen whenever one of the associated records changes.
Thinking that through out loud, it may be better to have a table joined 1:1 to Master Entity that only contains the Master Entity ID and the "version number" of the record - either TimeStamp / RowVersion, GUID, incremented number, or something else.
I think it's a symptom of trying to capture "abstract" audit events at the lowest level of your application stack - the database.
If it's possible consider trapping the audit events in your business layer. This would allow you to capture the history per logical transaction rather than on a row-by-row basis. The date/time is unreliable for resolving things like this as it can be different for different rows, and the same for concurrent (or closely spaced) transactions.
I understand that you've asked how to do this in DB triggers, though. I don't know about SQL Server, but in Oracle you can overcome this by using the DBMS_TRANSACTION.LOCAL_TRANSACTION_ID function to return the ID for the current transaction. If you can retrieve an equivalent SQL Server value, then you can use this to tie the record updates for the current transaction together into a logical package.
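In Oracle the call itself looks like this (the surrounding trigger code that stores the value on the history rows is omitted):
-- returns an id of the form 'x.y.z' identifying the current local transaction
SELECT DBMS_TRANSACTION.LOCAL_TRANSACTION_ID FROM dual;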
