How to increase creation speed on AgensGraph?

I am trying to load vertices into AgensGraph.
But the insert speed is too slow.
agens=# \timing on
Timing is on.
agens=# create (:v1{id:1});
GRAPH WRITE (INSERT VERTEX 1, INSERT EDGE 0)
Time: 23.285 ms
How can I increase creation speed on AgensGraph?

To avoid losing data, the DBMS must sync to disk before each commit.
That sync is what makes each commit take so long.
If you do not mind losing the most recent transactions after a crash, the option "synchronous_commit" will help you.
agens=# set synchronous_commit to off;
SET
Time: 0.205 ms
agens=# create (:v1{id:1});
GRAPH WRITE (INSERT VERTEX 1, INSERT EDGE 0)
Time: 0.360 ms
agens=# set synchronous_commit to on;
SET
Time: 0.234 ms
agens=# create (:v1{id:1});
GRAPH WRITE (INSERT VERTEX 1, INSERT EDGE 0)
Time: 33.787 ms
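If you only want to relax durability for a specific bulk load rather than for the whole session, the same setting can be scoped to a single transaction with SET LOCAL (a minimal sketch; the v1 label and id values just continue the example above):
begin;
set local synchronous_commit to off;  -- applies only to this transaction
create (:v1{id:2});
create (:v1{id:3});
commit;
With SET LOCAL the relaxed setting disappears automatically at commit, so later transactions go back to fully synchronous commits.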

Related

Database migration without downtime

In our organization we need to run a database migration on live site data. We want to add a column with a default value to a table with around 1000 rows. Can you suggest a method so that we get zero or minimal downtime? We are using a PostgreSQL database and an Elixir Phoenix app.
Thanks.
PS: we want minimal downtime, not exactly zero. Also, we want to run the migration using Ecto in Elixir, not through a script.
Also, if you can, tell us the expected time to run the migration when we have a default constraint set.
In general ALTER TABLE requires an exclusive lock on the table, but adding a column with a default value can be very fast because only the system catalog needs to be updated (and this does not depend on the table size):
For example with PostgreSQL 12, I get:
# select count(*) from t;
count
---------
1000000
(1 row)
Time: 60.003 ms
# begin;
BEGIN
Time: 0.096 ms
# alter table t add newcol int default 19;
ALTER TABLE
Time: 0.457 ms
# commit;
COMMIT
Time: 9.211 ms
You should be able to get very small downtime with PostgreSQL 11 or 12. On lower versions PostgreSQL rewrites the table, but even in that case 1000 rows is very, very small and the rewrite should also be very fast.
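One extra precaution worth considering (not part of the original answer, just a hedge against a busy table): even a catalog-only ALTER TABLE still needs a brief ACCESS EXCLUSIVE lock, so bounding how long it will wait keeps a blocked migration from stalling live traffic. A sketch reusing the t/newcol names from the example:
begin;
set local lock_timeout = '2s';  -- fail fast instead of queueing behind long-running queries
alter table t add newcol int default 19;
commit;
If the lock cannot be obtained within the timeout, the migration fails immediately and can simply be retried at a quieter moment.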

How to stop statistics DB endless growth

We have a Microsoft SQL Server 2012 database used for storing statistics data from a queueing system. The queueing system sends statistics to the DB every 10 minutes, and the database size is growing by around 1 GB per week.
We do not need data older than one year.
We created a SQL script to delete the old data, but after the script is executed, the DB size is even bigger.
use statdb
-- Execute as stat user
-- The following three settings are used to control what will be used
-- Amount of days of stat data which should be kept
DECLARE @numberOfDaysToKeep int = 365
-- If set to 1, also the aggregated data will be removed, otherwise only the events will be removed
DECLARE @DeleteAggregateData int = 1
-- If set to 1, also the hardware monitoring data will be removed.
DECLARE @DeleteAgentDcData int = 1
-- Do not change anything below this line
DECLARE @DeleteBeforeDate int = (SELECT id FROM stat.dim_date WHERE full_date > DATEADD(day,-1-@numberOfDaysToKeep,GETDATE()) AND full_date < DATEADD(day,0-@numberOfDaysToKeep,GETDATE()))
-- Remove CFM events
DELETE FROM stat.fact_visit_events where date_key < @DeleteBeforeDate
DELETE FROM stat.fact_sp_events where date_key < @DeleteBeforeDate
DELETE FROM stat.fact_staff_events where date_key < @DeleteBeforeDate
-- ...continue to delete from other tables
We would like to keep the DB at a constant size. Does MS SQL Server reuse the free space (after the delete), or will the DB size keep growing at the same speed as before the delete?
Or do I need to run SHRINK after the script? (Based on other discussions, it is not recommended.)
You do not say whether your database is in the FULL or SIMPLE recovery model. You will most likely have to shrink your log file as well as the data file. In SSMS you can check how much free space is available in the data file and in the log file.
There is an option to shrink the data file and the log file using T-SQL.
See more on DBCC SHRINKFILE (Transact-SQL)
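If you do decide to reclaim the space, a minimal T-SQL sketch is below; the logical file names (statdb_data, statdb_log) and target sizes are assumptions, so look up your own in sys.database_files first:
use statdb
-- find the logical file names and current sizes (size is reported in 8 KB pages)
SELECT name, type_desc, size/128 AS size_mb FROM sys.database_files
-- shrink toward a target size in MB; replace the placeholder names and sizes
DBCC SHRINKFILE (statdb_data, 4096)
DBCC SHRINKFILE (statdb_log, 1024)
Note that regular shrinking is generally discouraged: if the weekly deletes keep up with the weekly inserts, the freed pages will simply be reused and the file should stop growing on its own.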

Insert from select or update from select with commit every 1M records

I've already seen a dozen questions like this, but most of them get answers that don't apply to my case.
First off, the database I am trying to get the data from is on a very slow network and is reached over a VPN.
I am accessing it through a database link.
I have full read/write access on my schema tables, but I don't have DBA rights, so I can't create dumps and I don't have grants for creating new tables, etc.
I've been trying to get the database locally and all is well except for one table.
It has 6.5 million records and 16 columns.
There was no problem getting 14 of them, but the remaining two are CLOBs containing huge XML documents.
The data transfer is so slow it is painful.
I tried:
insert based on select
insert all 14 columns, then update the other 2
create table as select
insert based on a conditional select so I only get so many records, then commit manually
The issue is mainly that the connection is lost before the transaction finishes (or there is a power loss, the VPN drops, a random error, etc.) and all the GBs that have been downloaded are discarded.
As I said, I tried adding conditions so I only get a few records at a time, but even this is a bit hit-and-miss and requires constant attention from me.
Something like :
Insert into TableA
Select * from TableA@DB_RemoteDB1
WHERE CREATION_DATE BETWEEN to_date('01-Jan-2016') AND to_date('31-DEC-2016')
Sometimes it works, sometimes it doesn't. After a few GBs Toad gets stuck running, and when I look at its throughput it is 0 KB/s or a few bytes/s.
What I am looking for is a loop or a cursor that can be used to get maybe 100,000 or 1,000,000 records at a time, commit them, then go for the rest until it is done.
This is a one time operation that I am doing as we need the data locally for testing - so I don't care if it is inefficient as long as the data is brought in in chunks and a commit saves me from retrieving it again.
I can count already about 15GBs of failed downloads I've done over the last 3 days and my local table still has 0 records as all my attempts have failed.
Server: Oracle 11g
Local: Oracle 11g
Attempted Clients: Toad/Sql Dev/dbForge Studio
Thanks.
You could do something like:
begin
  loop
    insert into tablea
    select * from tablea@DB_RemoteDB1 a_remote
    where not exists (select null from tablea where id = a_remote.id)
    and rownum <= 100000; -- or whatever number makes sense for you
    exit when sql%rowcount = 0;
    commit;
  end loop;
end;
/
This assumes that there is a primary/unique key you can use to check whether a row in the remote table already exists in the local one - in this example I've used a notional id column, but replace that with your actual key column(s).
For each iteration of the loop it will identify rows in the remote table which do not exist in the local table - which may be slow, but you've said performance isn't a priority here - and then, via rownum, limit the number of rows being inserted to a manageable subset.
The loop then terminates when no rows are inserted, which means there are no rows left in the remote table that don't exist locally.
This should be restartable, due to the commit and where not exists check. This isn't usually a good approach - as it kind of breaks normal transaction handling - but as a one off and with your network issues/constraints it may be necessary.
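As a side note, since the copy is restartable, it can help to compare row counts between attempts to see how far you have got; a small sketch (table and link names match the example above, and the remote count itself travels over the slow link, so it may take a moment):
select (select count(*) from tablea) as local_rows,
       (select count(*) from tablea@DB_RemoteDB1) as remote_rows
from dual;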
Toad is right, using bulk collect would be (probably significantly) faster in general as the query isn't repeated each time around the loop:
declare
  cursor l_cur is
    select * from tablea@dblink3 a_remote
    where not exists (select null from tablea where id = a_remote.id);
  type t_tab is table of l_cur%rowtype;
  l_tab t_tab;
begin
  open l_cur;
  loop
    fetch l_cur bulk collect into l_tab limit 100000;
    forall i in 1..l_tab.count
      insert into tablea values l_tab(i);
    commit;
    exit when l_cur%notfound;
  end loop;
  close l_cur;
end;
/
This time you would change the limit 100000 to whatever number you think sensible. There is a trade-off here though, as the PL/SQL table will consume memory, so you may need to experiment a bit to pick that value - you could get errors or affect other users if it's too high. Lower is less of a problem here, except the bulk inserts become slightly less efficient.
But because you have a CLOB column (holding your XML) this won't work for you, as @BobC pointed out; the insert ... select is supported over a DB link, but the collection version will get an error from the fetch:
ORA-22992: cannot use LOB locators selected from remote tables
ORA-06512: at line 10
22992. 00000 - "cannot use LOB locators selected from remote tables"
*Cause: A remote LOB column cannot be referenced.
*Action: Remove references to LOBs in remote tables.

Is it possible to rebuild an index without taking the instance offline?

I have one NONCLUSTERED INDEX that is at 85.71% total fragmentation and 55.35% page fullness.
Can this be rebuilt without taking my instance offline, and without Enterprise edition?
TITLE: Microsoft SQL Server Management Studio
------------------------------
Rebuild failed for Index 'idx_last_success_download'. (Microsoft.SqlServer.Smo)
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=10.50.2500.0+((KJ_PCU_Main).110617-0038+)&EvtSrc=Microsoft.SqlServer.Management.Smo.ExceptionTemplates.FailedOperationExceptionText&EvtID=Rebuild+Index&LinkId=20476
------------------------------
ADDITIONAL INFORMATION:
An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)
------------------------------
Lock request time out period exceeded. (Microsoft SQL Server, Error: 1222)
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server&ProdVer=10.50.2500&EvtSrc=MSSQLServer&EvtID=1222&LinkId=20476
------------------------------
BUTTONS:
OK
------------------------------
After Reorganized:
ALTER INDEX idx_last_success_download ON dbo.TERMINAL_SYNCH_STATS
REORGANIZE;
Why am I still getting 85.71% fragmentation?
For my stats I'm using DBCC SHOWCONTIG:
DBCC SHOWCONTIG scanning 'TERMINAL_SYNCH_STATS' table...
Table: 'TERMINAL_SYNCH_STATS' (331148225); index ID: 38, database ID: 7
LEAF level scan performed.
- Pages Scanned................................: 7
- Extents Scanned..............................: 5
- Extent Switches..............................: 6
- Avg. Pages per Extent........................: 1.4
- Scan Density [Best Count:Actual Count].......: 14.29% [1:7]
- Logical Scan Fragmentation ..................: 85.71%
- Extent Scan Fragmentation ...................: 40.00%
- Avg. Bytes Free per Page.....................: 3613.9
- Avg. Page Density (full).....................: 55.35%
The lock timeout is not a version issue.
Yes, it is possible to rebuild an index online.
You have a lock timeout. I suspect it is an active table and the rebuild simply cannot acquire the lock it needs.
Try a reorganize instead:
Reorganize and Rebuild Indexes
Please note that in any case you do not have to take the SQL Server database or the SQL Server instance offline to rebuild an index. However, if you have Standard edition, an ONLINE index rebuild is not possible, and you have to make sure no application or query is accessing the table, otherwise the index rebuild will fail.
What is the output of
select @@VERSION
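Checking the edition alongside the version is also useful here, since the availability of ONLINE rebuilds depends on it; a small sketch using the standard SERVERPROPERTY function:
SELECT SERVERPROPERTY('ProductVersion') AS product_version,
       SERVERPROPERTY('Edition') AS edition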
The error message
Lock request time out period exceeded. (Microsoft SQL Server, Error: 1222)
only says that when the index rebuild task was trying to get an exclusive lock on the table (during a rebuild the index is dropped and recreated), it was not able to get one, hence the error message. It is not a threatening message. You can get it in both Standard and Enterprise edition while rebuilding an index.
An index rebuild is a maintenance activity, so it should always be done when the load on the database is relatively low, or during a maintenance window.
As a solution, try rebuilding when nobody is accessing the database or the load is very low.
You can also try running the rebuild with the WAIT_AT_LOW_PRIORITY option, e.g. as below:
ALTER INDEX idx_last_success_download ON dbo.TERMINAL_SYNCH_STATS
REBUILD WITH
( FILLFACTOR = 80, SORT_IN_TEMPDB = ON, STATISTICS_NORECOMPUTE = ON,
ONLINE = ON (WAIT_AT_LOW_PRIORITY
(MAX_DURATION = 4 MINUTES, ABORT_AFTER_WAIT = BLOCKERS ) ),
DATA_COMPRESSION = ROW
);
For more info refer: https://msdn.microsoft.com/en-us/library/ms188388.aspx

Explain locking behavior in SQL Server

Why is it that, with default settings on SQL Server (so transaction isolation level = READ COMMITTED), this test:
CREATE TABLE test2 (
ID bigint,
name varchar(20)
)
then run this in one SSMS tab:
begin transaction SH
insert into test2(ID,name) values(1,'11')
waitfor delay '00:00:30'
commit transaction SH
and this one simultaneously in another tab:
select * from test2
requires the second select to wait for the first to complete before returning?
We also tried these for the 2nd query:
select * from test2 NOLOCK WHERE ID = 1
and tried inserting one ID in the first query and selecting a different ID in the second.
Is this the result of page locking? While running the two queries, I've also run this:
select object_name(P.object_id) as TableName, resource_type, resource_description
from
sys.dm_tran_locks L join sys.partitions P on L.resource_associated_entity_id = p.hobt_id
and gotten this result set:
test2 RID 1:12186:5
test2 RID 1:12186:5
test2 PAGE 1:12186
test2 PAGE 1:12186
requires the 2nd select to wait for the first to complete before returning??
READ COMMITTED prevents dirty reads, and by blocking you get a consistent result. Snapshot isolation gets around this, but you will pay slightly in performance, because SQL Server now has to hold the old row versions for the duration of the transaction (better have your tempdb on a good drive).
BTW, try changing the query from
select * from test2
to
select * from test2 where id <> 1
assuming you have more than one row in the table and the data spans more than one page (insert a couple of thousand rows to make sure).
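For reference, a quick way to pad test2 with a few thousand rows for that experiment (a sketch; the row count and the 'filler' value are arbitrary, and the cross join is just a cheap row generator):
INSERT INTO test2 (ID, name)
SELECT TOP (5000)
       ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) + 1,
       'filler'
FROM sys.all_objects a CROSS JOIN sys.all_objects b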
List traversal with node locking is done by 'crabbing':
you hold a lock on the current node
you grab a lock on the next node
you make the next node the current node
you release the lock on the previous node (the former current node)
This technique is common to all list traversal algorithms and is meant to preserve stability while traversing: you never make a 'leap' without having yourself anchored by a lock. It is often compared to the technique used by rock climbers.
A statement like SELECT ... FROM table; is a scan over the entire table. As such, it can be compared to a list traversal, and the thread doing the table scan will 'crab' over the rows just like a thread doing a list traversal crabs over the nodes. Such a list traversal is guaranteed to attempt to lock, eventually, every single node in the list, and a table scan will similarly attempt to lock, at one time or another, every single row in the table. So any conflicting lock held by another transaction on a row will block the scan, 100% guaranteed. Everything else you observe (page locks, intent locks, etc.) is an implementation detail, irrelevant to the fundamental issue.
The proper solution to this problem is to optimize the queries so that they don't scan tables end-to-end. Only after that is achieved can you turn your focus to eliminating whatever contention is left: deploy snapshot-isolation-based row-level versioning. In other words, enable READ_COMMITTED_SNAPSHOT on the database.
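For completeness, turning that on looks roughly like this (a sketch; the database name is a placeholder, and the statement needs the database to be free of other active connections, or add WITH ROLLBACK IMMEDIATE to force them off):
ALTER DATABASE YourDb SET READ_COMMITTED_SNAPSHOT ON
After that, readers under the default isolation level see the last committed version of each row instead of blocking behind writers.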
