Log file is growing with Simple Recovery mode - sql-server

I am trying to learn why the code below writes to the log file. I am a beginner and have read that the log is not written to when the database is in SIMPLE recovery mode, but the code below writes to the log in both FULL and SIMPLE recovery mode. In which cases is the log file written to in SIMPLE recovery mode?
Code:
Declare @val int = 1
set nocount on
BEGIN TRAN
while @val <= 100000
begin
    insert into LoadTable values (REPLICATE('P',1000))
    set @val = @val + 1
end
ROLLBACK TRAN

First of all, your understanding that nothing is written to the log file when the database is in simple recovery mode is wrong.
SQL Server writes to the log file in all recovery modes. The only difference is that in simple recovery mode it automatically reclaims log space when it can (after a checkpoint), while still logging everything it needs to maintain a transaction, just in case you have to roll it back.
In full recovery mode, by contrast, we have to take transaction log backups to make the space available for SQL Server to reuse for further logging.
Now going back to your example:
Declare @val int = 1
set nocount on
BEGIN TRAN --<-- Your Transaction starts here
while @val <= 100000
begin
    insert into LoadTable values (REPLICATE('P',1000))
    set @val = @val + 1
end
ROLLBACK TRAN --<-- Your Transaction ends here
In your example, a lot of activity happens after the transaction begins and before it ends (commit or rollback). SQL Server needs to log this activity in case you decide to roll back the transaction, just like you did, so more and more log records are written to the log file until the transaction completes.
In this specific example, SQL Server has to keep the log records of 100000 insert statements in case something goes wrong.
Another slightly different version of your query could be...
Declare @val int = 1
set nocount on
while @val <= 100000
begin
    BEGIN TRAN --<-- Your Transaction starts here
    insert into LoadTable values (REPLICATE('P',1000))
    ROLLBACK TRAN --<-- Your Transaction ends here
    CHECKPOINT;
    set @val = @val + 1
end
In this slightly different version of the same T-SQL there is much less activity between the start of each transaction and its completion, so SQL Server has to keep very little log data active and the transaction log file will grow very little, if at all.
In this example SQL Server has to keep the log records of only 1 insert statement at a time, because each insert is committed or rolled back right away.
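To make the difference between the two loops concrete, here is a toy Python model (not SQL Server code). It assumes one log record per insert, ignores the extra records a rollback itself writes, and assumes a checkpoint right after each transaction ends can free all of that transaction's records. The function name peak_retained_records and its parameters are made up for this illustration.

```python
def peak_retained_records(total_inserts: int, inserts_per_transaction: int) -> int:
    """Toy model: peak number of log records that cannot be reused yet.

    Illustrative assumptions: one log record per insert, and a checkpoint
    right after each transaction ends frees all of that transaction's records.
    """
    retained = 0  # records that belong to the currently open transaction
    peak = 0
    for i in range(1, total_inserts + 1):
        retained += 1  # the insert is logged
        peak = max(peak, retained)
        if i % inserts_per_transaction == 0:
            retained = 0  # transaction ended, its log space becomes reusable
    return peak


# One transaction around the whole loop versus one transaction per insert.
print(peak_retained_records(100_000, 100_000))  # 100000
print(peak_retained_records(100_000, 1))        # 1
```

Both loops generate the same total amount of log, but only the first one forces the log to hold all of it at once, which is what makes the file grow.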

I took all of the information here from Pro SQL Server Internals 2014
https://www.amazon.com/Pro-Server-Internals-Dmitri-Korotkevitch/dp/1430259620
TL;DR;
The SIMPLE and FULL recovery models differ in how SQL Server deactivates Virtual Log Files (VLFs).
In summary:
1 - "in the SIMPLE recovery model, the active part of transaction log starts with VLF, which contains the oldest of LSN of the oldest active transaction or the last CHECKPOINT";
2 - "in the FULL or BULK-LOGGED recovery models, the active part of transaction log starts with VLF, which contains the oldest of the following:
LSN of the last log backup
LSN of the oldest active transaction
LSN of the process that reads transaction log records"
LSN = Log Sequence Number = unique, auto-incrementing ID
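The two quoted rules can be written down as a small Python sketch. This is only a restatement of the quotes, not how SQL Server implements it; the function name active_log_start, its parameters, and the LSN values in the usage lines are hypothetical.

```python
from typing import Optional


def active_log_start(recovery_model: str,
                     oldest_active_tran_lsn: Optional[int] = None,
                     last_checkpoint_lsn: Optional[int] = None,
                     last_log_backup_lsn: Optional[int] = None,
                     log_reader_lsn: Optional[int] = None) -> int:
    """Return the LSN where the active part of the log starts (toy restatement)."""
    if recovery_model == "SIMPLE":
        candidates = [oldest_active_tran_lsn, last_checkpoint_lsn]
    elif recovery_model in ("FULL", "BULK_LOGGED"):
        candidates = [last_log_backup_lsn, oldest_active_tran_lsn, log_reader_lsn]
    else:
        raise ValueError(f"unknown recovery model: {recovery_model}")
    known = [lsn for lsn in candidates if lsn is not None]
    if not known:
        raise ValueError("at least one LSN is required")
    return min(known)  # the oldest LSN wins


# SIMPLE: an open transaction older than the last checkpoint pins the log.
print(active_log_start("SIMPLE", oldest_active_tran_lsn=7100, last_checkpoint_lsn=7216))  # 7100
# FULL: an old log backup pins the log even though nothing else needs it.
print(active_log_start("FULL", oldest_active_tran_lsn=7100,
                       last_log_backup_lsn=5000, log_reader_lsn=6500))  # 5000
```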
More detailed explanation
Suppose that this is the SQL Server memory model:
1 - Buffer Pool is where SQL Server stores indexes, rows, etc. in memory;
2 - Log Buffer is a small (64KB per database) in-memory buffer for the Transaction Log;
3 - Data File is where SQL Server persists indexes, rows, etc. on disk;
4 - Transaction Log is...well, the Transaction Log on disk.
Suppose that we have a database in the following state.
/--------------- IN MEMORY --------------\/------------ IN DISK -----------\
|--------------------------------------------------------------------------|
|Buffer Pool                | Log Buffer  | Data File     |Transaction Log |
|---------------------------|-------------|---------------|----------------|
|Page 1:24312               |             |Page: 1:24312  |LSN:7213        |
|IsDirty: False             |             |LSN: 4845      |                |
|LSN: 4845                  |             |Page: 1:24313  |                |
|...                        |             |LSN: 2078      |                |
|Page 1:26912               |             |...            |                |
|isDirty:False              |             |Page: 1:26911  |                |
|LSN:1053                   |             |LSN: 2078      |                |
|                           |             |Page: 1:26912  |                |
|                           |             |LSN: 2078      |                |
|---------------------------|-------------|---------------|----------------|
Now suppose a change is made, a simple update.
The first step is to insert the Log Record in the Log Buffer.
/--------------- IN MEMORY --------------\/------------ IN DISK -----------\
|--------------------------------------------------------------------------|
|Buffer Pool                | Log Buffer  | Data File     |Transaction Log |
|---------------------------|-------------|---------------|----------------|
|Page 1:24312               |LSN:7214     |Page: 1:24312  |LSN:7213        |
|IsDirty: False             |Op:Update    |LSN: 4845      |                |
|LSN: 4845                  |Page:1:24312 |Page: 1:24313  |                |
|...                        |OldLsn:4845  |LSN: 2078      |                |
|Page 1:26912               |Row:2        |...            |                |
|isDirty:False              |Tran:T1      |Page: 1:26911  |                |
|LSN:1053                   |PrevLSN:7141 |LSN: 2078      |                |
|                           |             |Page: 1:26912  |                |
|                           |             |LSN: 2078      |                |
|---------------------------|-------------|---------------|----------------|
Then the data page is changed in memory (to keep it simple, only the IsDirty flag changes here).
/--------------- IN MEMORY --------------\/------------ IN DISK -----------\
|--------------------------------------------------------------------------|
|Buffer Pool                | Log Buffer  | Data File     |Transaction Log |
|---------------------------|-------------|---------------|----------------|
|Page 1:24312               |LSN:7214     |Page: 1:24312  |LSN:7213        |
|IsDirty: TRUE              |Op:Update    |LSN: 4845      |                |
|LSN: 4845                  |Page:1:24312 |Page: 1:24313  |                |
|...                        |OldLsn:4845  |LSN: 2078      |                |
|Page 1:26912               |Row:2        |...            |                |
|isDirty:False              |Tran:T1      |Page: 1:26911  |                |
|LSN:1053                   |PrevLSN:7141 |LSN: 2078      |                |
|                           |             |Page: 1:26912  |                |
|                           |             |LSN: 2078      |                |
|---------------------------|-------------|---------------|----------------|
This goes on until the Log Buffer is full or the transaction is committed.
The Commit generates another entry in the Log Buffer where OP is Commit and flushes the whole buffer to the disk.
/--------------- IN MEMORY --------------\/------------ IN DISK -----------\
|--------------------------------------------------------------------------|
|Buffer Pool                | Log Buffer  | Data File     |Transaction Log |
|---------------------------|-------------|---------------|----------------|
|Page 1:24312               |             |Page: 1:24312  |LSN:7213        |
|IsDirty: TRUE              |             |LSN: 4845      |                |
|LSN: 4845                  |             |Page: 1:24313  |LSN:7214        |
|...                        |             |LSN: 2078      |<ALL PROPERTIES>|
|Page 1:26912               |             |...            |                |
|isDirty:False              |             |Page: 1:26911  |LSN:7215        |
|LSN:1053                   |             |LSN: 2078      |Op:Commit       |
|                           |             |Page: 1:26912  |                |
|                           |             |LSN: 2078      |LSN:7216        |
|                           |             |               |Op:Checkpoint   |
|---------------------------|-------------|---------------|----------------|
At this point SQL Server tells the client that the transaction succeeded.
It is worth pointing out that the dirty page in memory has not been written to disk yet.
If something went wrong at this point, SQL Server would still be able to recover all changes up to this exact point.
This technique is called Write Ahead Logging and for more information see:
Repeating History Beyond ARIES
http://www.vldb.org/conf/1999/P1.pdf
At some point the checkpoint process runs a CHECKPOINT operation that flushes all dirty pages from the Buffer Pool to disk. Checkpoint operations also appear in the Transaction Log, as the example above shows.
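The sequence walked through above (log record first, then the dirty page, commit flushes the log, checkpoint flushes the pages) can be sketched as a toy Python class. It is a simplified illustration of write-ahead logging, not SQL Server's implementation; ToyDatabase and all of its members are invented names.

```python
class ToyDatabase:
    """Toy write-ahead-logging model with an in-memory and an on-disk side."""

    def __init__(self):
        self.log_buffer = []    # log records still in memory
        self.log_on_disk = []   # log records hardened in the transaction log
        self.buffer_pool = {}   # page id -> (value, is_dirty)
        self.data_file = {}     # page id -> value persisted in the data file
        self.next_lsn = 1

    def _log(self, op, **details):
        self.log_buffer.append({"lsn": self.next_lsn, "op": op, **details})
        self.next_lsn += 1

    def _flush_log(self):
        self.log_on_disk.extend(self.log_buffer)
        self.log_buffer.clear()

    def update(self, tran, page, value):
        self._log("Update", tran=tran, page=page)  # 1. log record goes to the log buffer
        self.buffer_pool[page] = (value, True)     # 2. page is changed in memory and marked dirty

    def commit(self, tran):
        self._log("Commit", tran=tran)
        self._flush_log()                          # commit hardens the log, not the data pages

    def checkpoint(self):
        self._log("Checkpoint")
        self._flush_log()                          # the log reaches disk before the pages do
        for page, (value, dirty) in list(self.buffer_pool.items()):
            if dirty:
                self.data_file[page] = value
                self.buffer_pool[page] = (value, False)


db = ToyDatabase()
db.update("T1", "1:24312", "new row")
db.commit("T1")
print([r["op"] for r in db.log_on_disk])  # ['Update', 'Commit']
print(db.data_file)                       # {} (the dirty page is not on disk yet)
db.checkpoint()
print(db.data_file)                       # {'1:24312': 'new row'}
```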
With this in mind we can see how SQL Server treats the Transaction Log.
Virtual Log Files
The Transaction Log on disk is subdivided into Virtual Log Files (VLFs). You can see them by running:
DBCC LOGINFO
The important part is that Virtual Log Files (VLFs) can be categorized as active or inactive.
SQL Server only needs the active part of the Transaction Log for recovery, so the difference between SIMPLE and FULL is when a VLF becomes inactive. SQL Server deactivates VLFs because the Transaction Log is a wraparound file, which means that "when the end of the logical log file reaches the end of physical file, the log wraps around it". For example:
/------ACTIVE-----\/----------------INACTIVE----------------\/--------ACTIVE---\
|------------------------------------------------------------------------------|
| | | | | | | | |
| VLF1 | VLF2 | VLF3 | VLF4 | VLF5 | VLF6 | VLF7 | VLF8 |
| | | | | | | | |
|------------------------------------------------------------------------------|
So if for some reason no VLF becomes inactive, the Transaction Log has to keep growing.
IN SIMPLE RECOVERY
Going back to the example: after the checkpoint, when everything has been flushed to disk, SQL Server in SIMPLE recovery keeps active only the VLFs starting from the one that contains the older of:
1 - the LSN of the oldest active transaction; or
2 - the last checkpoint.
For example:
Before a Checkpoint
/------INACTIVE---\/----------------ACTIVE-------\/---------INACTIVE-----------\
|------------------------------------------------------------------------------|
| | | | | | | | |
| VLF1 | VLF2 | VLF3 | VLF4 | VLF5 | VLF6 | VLF7 | VLF8 |
| | | | | | | | |
|------------------------------------------------------------------------------|
^ ^ ^ ^ ^
| | | | |> End of logical LOG file
| | | |> Current LSN
| | |> Minimum LSN (Oldest Active Transaction)
| |> Last Checkpoint
|> Start of Logical LOG file
After the Checkpoint
/------INACTIVE---------------\/----ACTIVE-------\/---------INACTIVE-----------\
|------------------------------------------------------------------------------|
| | | | | | | | |
| VLF1 | VLF2 | VLF3 | VLF4 | VLF5 | VLF6 | VLF7 | VLF8 |
| | | | | | | | |
|------------------------------------------------------------------------------|
^ ^ ^ ^
| | | |> End of logical LOG file
| | |> Current LSN (Checkpoint Occurs)
| |> Minimum LSN (Oldest Active Transaction)
|> Start of Logical LOG file
SQL Server has deactivated VLF3, which contained the last checkpoint, because:
1 - The new checkpoint forced all dirty pages in memory to disk, so there is no need to redo any changes stored in VLF3, and the oldest active transaction is in VLF4;
2 - VLF4 is still needed, however, to support rollback of all active transactions.
IN FULL RECOVERY
The same process happens in FULL recovery, but now the first VLF that stays active is the one containing the oldest of:
1 - LSN of the LAST LOG BACKUP;
2 - LSN of the OLDEST ACTIVE TRANSACTION; or
3 - LSN of the process that reads transaction log records.
For example
/------INACTIVE---------------\/----ACTIVE-------\/---------INACTIVE-----------\
|------------------------------------------------------------------------------|
| | | | | | | | |
| VLF1 | VLF2 | VLF3 | VLF4 | VLF5 | VLF6 | VLF7 | VLF8 |
| | | | | | | | |
|------------------------------------------------------------------------------|
^ ^ ^ ^ ^
| | | | |> End of logical LOG file
| | | |> Current LSN (Checkpoint Occurs)
| | |> Minimum LSN (Oldest Active Transaction)
| |> Replication log Reader
|> Start of Logical LOG file
In this example the Replication Log Reader is forcing VLF4 to stay active.
or
/------INACTIVE---\/----------------ACTIVE-------\/---------INACTIVE-----------\
|------------------------------------------------------------------------------|
| | | | | | | | |
| VLF1 | VLF2 | VLF3 | VLF4 | VLF5 | VLF6 | VLF7 | VLF8 |
| | | | | | | | |
|------------------------------------------------------------------------------|
^ ^ ^ ^ ^ ^
| | | | | |> End of logical LOG file
| | | | |> Current LSN (Checkpoint Occurs)
| | | |> Minimum LSN (Oldest Active Transaction)
| | |> Replication log Reader
| |> Last Transaction Log Backup
|> Start of logical LOG file
And in this example the "last transaction log backup" is forcing VLF3 to stay active.
I hope this helps you understand a little better how SQL Server works.

Check out further detail on recovery modes here.
DML queries always write to the log so that they can be rolled back. In basic terms, simple recovery does not keep the log records once the transaction has been committed, but they are still written while the statement executes.

what should be minimum ratio of dead tuple for a table to be considered for VACUUM FULL in Postgres

I am a developer looking for advice on optimisation and maintenance of a Postgres database.
I am currently investigating commands that help clean up/defragment the DB and release some space back to the filesystem, as DB disk usage is growing quickly. I found that "VACUUM FULL" can help release the space used by dead tuples. However, I could not find information on how many dead tuples, or what percentage of them, there should be before we consider running this command.
Currently we have two tables in the Nextcloud Postgres database which have dead tuples. Also, the total relation size for these tables is higher than the disk usage reported by the \dt+ command. I am providing the stats below. Please advise whether they are eligible for "VACUUM FULL" based on the given stats.
###########################################
Disk space usage per table (\dt+ command)
###########################################
Schema | Name | Type | Owner | Size | Description
--------+-----------------------------+-------+----------+------------+-------------
public | oc_activity | table | XXXXXXXX | 4796 MB |
public | oc_filecache | table | XXXXXXXX | 127 MB |
#################################
oc_activity total relation size
#################################
SELECT pg_size_pretty( pg_total_relation_size('oc_activity') )
----------------
pg_size_pretty
----------------
9666 MB
########################################
Additional stats for oc_activity table
########################################
relid | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count
-------+------------+-------------+----------+--------------+----------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+-------------+-----------------+--------------+-------------------------------+--------------+------------------+---------------+-------------------
yyyyy | public | oc_activity | 272 | 1046966870 | 4737 | 57914604 | 1548217 | 0 | 325585 | 0 | 11440511 | 940192 | 268430 | | | | 2023-02-15 10:01:36.657028+00 | 0 | 0 | 0 | 3
###################################
oc_filecache total relation size
###################################
SELECT pg_size_pretty( pg_total_relation_size('oc_filecache') )
----------------
pg_size_pretty
----------------
541 MB
#########################################
Additional stats for oc_filecache table
#########################################
SELECT * FROM pg_stat_all_tables WHERE relname='oc_filecache'
relid | schemaname | relname | seq_scan | seq_tup_read | idx_scan | idx_tup_fetch | n_tup_ins | n_tup_upd | n_tup_del | n_tup_hot_upd | n_live_tup | n_dead_tup | n_mod_since_analyze | last_vacuum | last_autovacuum | last_analyze | last_autoanalyze | vacuum_count | autovacuum_count | analyze_count | autoanalyze_count
-------+------------+--------------+----------+--------------+------------+---------------+-----------+-----------+-----------+---------------+------------+------------+---------------------+-------------+-------------------------------+--------------+-------------------------------+--------------+------------------+---------------+-------------------
zzzzz | public | oc_filecache | 104541 | 28525391484 | 1974398333 | 2003365293 | 43575 | 695612 | 39541 | 348823 | 461510 | 19418 | 4069 | | 2023-02-16 10:46:15.165442+00 | | 2023-02-16 16:25:32.568168+00 | 0 | 8 | 0 | 33
There is no hard rule. I personally would consider a table uncomfortably bloated if the pgstattuple extension showed that less than a third or a quarter of the table are user data and the rest is dead tuples and empty space.
Rather than regularly running VACUUM (FULL) (which is downtime), you should strive to fix the problem that causes the table bloat in the first place.
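As a rough first look, the share of dead tuples can be computed from the n_live_tup and n_dead_tup counters posted in the question. The Python sketch below does that arithmetic; the function names are made up, and the 33 percent cutoff is just the "less than a third" rule of thumb from this answer, not a Postgres setting. Keep in mind that these counters are estimates and say nothing about empty space inside the table, which is what pgstattuple (its tuple_percent column) measures.

```python
def dead_tuple_fraction(n_live_tup: int, n_dead_tup: int) -> float:
    """Share of dead tuples among all tuples, from pg_stat_all_tables counters."""
    total = n_live_tup + n_dead_tup
    return n_dead_tup / total if total else 0.0


def looks_bloated(tuple_percent: float, threshold_percent: float = 33.0) -> bool:
    """Rule of thumb: bloated if live user data is below the threshold.

    tuple_percent is meant to be the value reported by pgstattuple.
    """
    return tuple_percent < threshold_percent


# (n_live_tup, n_dead_tup) copied from the stats in the question.
stats = {"oc_activity": (11_440_511, 940_192), "oc_filecache": (461_510, 19_418)}
for table, (live, dead) in stats.items():
    print(f"{table}: {dead_tuple_fraction(live, dead):.1%} dead tuples")
# oc_activity: 7.6% dead tuples
# oc_filecache: 4.0% dead tuples
```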

How to list a stage in snowflake?

Look at this procedure:
greendatasvc#COMPUTE_WH@POS_DATA.BLAZE>CREATE STAGE IF NOT EXISTS NDJSON_STAGE FILE_FORMAT = NDJSON;
+---------------------------------------------------+
| status |
|---------------------------------------------------|
| NDJSON_STAGE already exists, statement succeeded. |
+---------------------------------------------------+
1 Row(s) produced. Time Elapsed: 0.182s
greendatasvc#COMPUTE_WH@POS_DATA.BLAZE>SHOW FILE FORMATS;
greendatasvc#COMPUTE_WH@POS_DATA.BLAZE>LIST @NDJSON_STAGE;
+------+------+-----+---------------+
| name | size | md5 | last_modified |
|------+------+-----+---------------|
+------+------+-----+---------------+
0 Row(s) produced. Time Elapsed: 0.192s
greendatasvc#COMPUTE_WH@POS_DATA.BLAZE>SHOW STAGES;
+-------------------------------+--------------+---------------+-------------+-----+-----------------+--------------------+----------+---------+--------+----------+-------+----------------------+---------------------+
| created_on | name | database_name | schema_name | url | has_credentials | has_encryption_key | owner | comment | region | type | cloud | notification_channel | storage_integration |
|-------------------------------+--------------+---------------+-------------+-----+-----------------+--------------------+----------+---------+--------+----------+-------+----------------------+---------------------|
| 2021-10-19 12:31:31.043 -0700 | NDJSON_STAGE | POS_DATA | BLAZE | | N | N | SYSADMIN | | NULL | INTERNAL | NULL | NULL | NULL |
+-------------------------------+--------------+---------------+-------------+-----+-----------------+--------------------+----------+---------+--------+----------+-------+----------------------+---------------------+
1 Row(s) produced. Time Elapsed: 0.159s
I believe I already have a stage named NDJSON_STAGE based on its output when I try and create one. However, when I try and list it I get no results. Am I using the LIST function incorrectly?
Your stage exists. This is confirmed both by the 'already exists' response and by the fact that you didn't receive an error when trying to list files from your stage.
If you see nothing with the LIST @NDJSON_STAGE; command, that's probably because you don't have any files in this stage. Upload a file to the stage using a PUT command and you should then be able to list the files available in the stage.
Just to be clear, LIST @stagename returns a list of the files that have been staged on that stage.
In your case the stage is empty.
If you want to display the stages to which you have access, use SHOW STAGES, which lists all the stages for which you have access privileges.

Alert Before Running Query Consisting Of Large Size Data

Do we have any mechanism in Snowflake to alert users who run a query against large tables? That way the user would know that Snowflake will consume many warehouse credits if they run this query against a large dataset.
There is no alert mechanism for this, but users may run EXPLAIN command before running the actual query, to estimate the bytes/partitions read:
explain select c_name from "SAMPLE_DATA"."TPCH_SF10000"."CUSTOMER";
+-------------+----+--------+-----------+-----------------------------------+-------+-----------------+-----------------+--------------------+---------------+
| step        | id | parent | operation | objects                           | alias | expressions     | partitionsTotal | partitionsAssigned | bytesAssigned |
+-------------+----+--------+-----------+-----------------------------------+-------+-----------------+-----------------+--------------------+---------------+
| GlobalStats |    |        |           |                                   |       |                 | 6585            | 6585               | 109081790976  |
| 1           | 0  |        | Result    |                                   |       | CUSTOMER.C_NAME |                 |                    |               |
| 1           | 1  | 0      | TableScan | SAMPLE_DATA.TPCH_SF10000.CUSTOMER |       | C_NAME          | 6585            | 6585               | 109081790976  |
+-------------+----+--------+-----------+-----------------------------------+-------+-----------------+-----------------+--------------------+---------------+
https://docs.snowflake.com/en/sql-reference/sql/explain.html
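If you want to turn the EXPLAIN output into a warning on the client side, one possible approach is to add up bytesAssigned over the TableScan steps and compare it with a limit. The Python sketch below assumes the EXPLAIN rows have already been fetched into dictionaries keyed by the column names shown above; scan_warning, the row layout and the 10 GiB default limit are assumptions made for illustration, not Snowflake features.

```python
GIB = 1024 ** 3


def scan_warning(explain_rows, max_bytes=10 * GIB):
    """Return a warning string if the plan scans more than max_bytes, else None."""
    scans = [row for row in explain_rows if row.get("operation") == "TableScan"]
    total_bytes = sum(row.get("bytesAssigned") or 0 for row in scans)
    total_partitions = sum(row.get("partitionsAssigned") or 0 for row in scans)
    if total_bytes > max_bytes:
        return (f"Query would scan about {total_bytes / GIB:.1f} GiB "
                f"in {total_partitions} partitions")
    return None


# Rows typed in by hand from the EXPLAIN output above.
rows = [
    {"operation": "Result", "objects": None},
    {"operation": "TableScan", "objects": "SAMPLE_DATA.TPCH_SF10000.CUSTOMER",
     "partitionsAssigned": 6585, "bytesAssigned": 109081790976},
]
print(scan_warning(rows))  # Query would scan about 101.6 GiB in 6585 partitions
```

The byte count is what EXPLAIN estimates before pruning at run time, so treat the warning as an upper-bound hint rather than a credit forecast.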
You can also assign users to specific warehouses, and use resource monitors to limit credits on those warehouses.
https://docs.snowflake.com/en/user-guide/resource-monitors.html#assignment-of-resource-monitors
As the third alternative, you may set STATEMENT_TIMEOUT_IN_SECONDS to prevent long running queries.
https://docs.snowflake.com/en/sql-reference/parameters.html#statement-timeout-in-seconds

MySQL Import into Innodb table severely spikes at a certain point

I'm trying to migrate a 30GB database from one server to another.
The short story is that at a certain point through the process, the amount of time it takes to import records severely increases as a spike. The following is from using the SOURCE command to import a chunk of 500k records (out of about ~25-30 million throughout the database) that was exported as an sql file that was ssh tunnelled over to the new server:
...
Query OK, 2871 rows affected (0.73 sec)
Records: 2871 Duplicates: 0 Warnings: 0
Query OK, 2870 rows affected (0.98 sec)
Records: 2870 Duplicates: 0 Warnings: 0
Query OK, 2865 rows affected (0.80 sec)
Records: 2865 Duplicates: 0 Warnings: 0
Query OK, 2871 rows affected (0.87 sec)
Records: 2871 Duplicates: 0 Warnings: 0
Query OK, 2864 rows affected (2.60 sec)
Records: 2864 Duplicates: 0 Warnings: 0
Query OK, 2866 rows affected (7.53 sec)
Records: 2866 Duplicates: 0 Warnings: 0
Query OK, 2879 rows affected (8.70 sec)
Records: 2879 Duplicates: 0 Warnings: 0
Query OK, 2864 rows affected (7.53 sec)
Records: 2864 Duplicates: 0 Warnings: 0
Query OK, 2873 rows affected (10.06 sec)
Records: 2873 Duplicates: 0 Warnings: 0
...
The spikes eventually average 16-18 seconds per ~2800 rows affected. Granted, I don't usually use SOURCE for a large import, but I used it here to show real output and to understand when the spikes happen. Using the mysql command or mysqlimport yields the same results. Even piping the dump directly into the new database instead of going through an sql file produces these spikes.
As far as I can tell, this happens after a certain number of records have been inserted into a table. The first time I boot up a server and import a chunk of that size, it goes through just fine, give or take the estimated amount it handles before these spikes occur. I can't confirm that correlation because I haven't reproduced the issue consistently enough to draw a conclusion. There are ~20 tables with fewer than 500,000 records each that all imported just fine when they were imported through a single command. This only seems to happen to tables that have an excessive amount of data.
Granted, the solutions I've come across so far seem to address only the natural DR that occurs when you import over time (the expected outcome in my case was that, by the end of importing 500k records, it would take 2-3 seconds per ~2800 rows, whereas those questions were arguing that it shouldn't take that long at the end).
The data comes from a single SugarCRM table called 'campaign_log', which has ~9 million records. I was able to import it in chunks of 500k back onto the old server I'm migrating off of without these spikes occurring, so I assume this has to do with my new server configuration.
Another thing is that whenever these spikes occur, the table being imported into displays its record count in an odd way. I know InnoDB gives count estimates, but the number isn't preceded by the ~ that indicates an estimate. It usually is accurate, and each time you refresh the table, the displayed amount doesn't change (this is based on what phpMyAdmin reports).
Here's the following commands/InnoDB system variables I have on the new server:
INNODB System Vars:
+---------------------------------+------------------------+
| Variable_name | Value |
+---------------------------------+------------------------+
| have_innodb | YES |
| ignore_builtin_innodb | OFF |
| innodb_adaptive_flushing | ON |
| innodb_adaptive_hash_index | ON |
| innodb_additional_mem_pool_size | 8388608 |
| innodb_autoextend_increment | 8 |
| innodb_autoinc_lock_mode | 1 |
| innodb_buffer_pool_instances | 1 |
| innodb_buffer_pool_size | 8589934592 |
| innodb_change_buffering | all |
| innodb_checksums | ON |
| innodb_commit_concurrency | 0 |
| innodb_concurrency_tickets | 500 |
| innodb_data_file_path | ibdata1:10M:autoextend |
| innodb_data_home_dir | |
| innodb_doublewrite | ON |
| innodb_fast_shutdown | 1 |
| innodb_file_format | Antelope |
| innodb_file_format_check | ON |
| innodb_file_format_max | Antelope |
| innodb_file_per_table | OFF |
| innodb_flush_log_at_trx_commit | 1 |
| innodb_flush_method | fsync |
| innodb_force_load_corrupted | OFF |
| innodb_force_recovery | 0 |
| innodb_io_capacity | 200 |
| innodb_large_prefix | OFF |
| innodb_lock_wait_timeout | 50 |
| innodb_locks_unsafe_for_binlog | OFF |
| innodb_log_buffer_size | 8388608 |
| innodb_log_file_size | 5242880 |
| innodb_log_files_in_group | 2 |
| innodb_log_group_home_dir | ./ |
| innodb_max_dirty_pages_pct | 75 |
| innodb_max_purge_lag | 0 |
| innodb_mirrored_log_groups | 1 |
| innodb_old_blocks_pct | 37 |
| innodb_old_blocks_time | 0 |
| innodb_open_files | 300 |
| innodb_print_all_deadlocks | OFF |
| innodb_purge_batch_size | 20 |
| innodb_purge_threads | 1 |
| innodb_random_read_ahead | OFF |
| innodb_read_ahead_threshold | 56 |
| innodb_read_io_threads | 8 |
| innodb_replication_delay | 0 |
| innodb_rollback_on_timeout | OFF |
| innodb_rollback_segments | 128 |
| innodb_spin_wait_delay | 6 |
| innodb_stats_method | nulls_equal |
| innodb_stats_on_metadata | ON |
| innodb_stats_sample_pages | 8 |
| innodb_strict_mode | OFF |
| innodb_support_xa | ON |
| innodb_sync_spin_loops | 30 |
| innodb_table_locks | ON |
| innodb_thread_concurrency | 0 |
| innodb_thread_sleep_delay | 10000 |
| innodb_use_native_aio | ON |
| innodb_use_sys_malloc | ON |
| innodb_version | 5.5.39 |
| innodb_write_io_threads | 8 |
+---------------------------------+------------------------+
System Specs:
Intel Xeon E5-2680 v2 (Ivy Bridge) 8 Processors
15GB Ram
2x80 SSDs
CMD to Export:
mysqldump -u <olduser> <oldpw>, <olddb> <table> --verbose --disable-keys --opt | ssh -i <privatekey> <newserver> "cat > <nameoffile>"
Thank you for any assistance. Let me know if there's any other information I can provide.
I figured it out. I increased innodb_log_file_size from 5MB to 1024MB. Not only did it significantly increase the rate at which records were imported (it never went above 1 second per 3000 rows), it also fixed the spikes. There were only 2 spikes across all the records I imported, and right after they happened the import went back to taking less than 1 second per batch.
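A likely reason this helped: the total redo log space is innodb_log_file_size multiplied by innodb_log_files_in_group, and when that space runs low during a bulk import InnoDB has to flush dirty pages before it can reuse it, which stalls the inserts. The short Python calculation below compares the two settings using the values from the variable list above; the function name is made up for illustration.

```python
MIB = 1024 * 1024


def redo_log_capacity_bytes(innodb_log_file_size: int, innodb_log_files_in_group: int) -> int:
    """Total redo log space: size of one log file times the number of files."""
    return innodb_log_file_size * innodb_log_files_in_group


before = redo_log_capacity_bytes(5_242_880, 2)   # values from the variable list above
after = redo_log_capacity_bytes(1024 * MIB, 2)   # after raising the file size to 1024MB

print(before // MIB, "MiB")  # 10 MiB
print(after // MIB, "MiB")   # 2048 MiB
```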

Data restore procedure failed in OrientDB

Last night I received the following error after inserting ~500k records:
2014-07-03 22:10:50:056 SEVE Internal server error:
java.lang.IllegalArgumentException: Cannot get allocation information
for database 'pumpup' because it is not a disk-based database
[ONetworkProtocolHttpDb]
My OrientDB server.sh froze, so I rebooted my computer. Now when I try to access the database, I get the following output from server.sh:
2014-07-04 13:52:35:331 INFO OrientDB Server v1.7.3 is active. [OServer]
2014-07-04 13:52:38:784 WARN segment file 'database.ocf' was not closed correctly last time [OSingleFileSegment]
2014-07-04 13:52:38:879 WARN Storage pumpup was not closed properly. Will try to restore from write ahead log. [OLocalPaginatedStorage]
2014-07-04 13:52:38:879 INFO Looking for last checkpoint... [OLocalPaginatedStorage]
2014-07-04 13:52:38:879 INFO Checkpoints are absent, the restore will start from the beginning. [OLocalPaginatedStorage]
2014-07-04 13:52:38:880 INFO Data restore procedure is started. [OLocalPaginatedStorage]
2014-07-04 13:53:15:080 INFO Heap memory is low apply batch of operations are read from WAL. [OLocalPaginatedStorage]Exception during storage data restore.
null
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreWALBatch(OLocalPaginatedStorage.java:1842)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFrom(OLocalPaginatedStorage.java:1802)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFromBegging(OLocalPaginatedStorage.java:1772)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFromWAL(OLocalPaginatedStorage.java:1611)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreIfNeeded(OLocalPaginatedStorage.java:1578)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.open(OLocalPaginatedStorage.java:245)
-> com.orientechnologies.orient.core.db.raw.ODatabaseRaw.open(ODatabaseRaw.java:100)
-> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
-> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:268)
-> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
-> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
-> com.orientechnologies.orient.server.network.protocol.http.command.OServerCommandAuthenticatedDbAbstract.authenticate(OServerCommandAuthenticatedDbAbstract.java:126)
-> com.orientechnologies.orient.server.network.protocol.http.command.OServerCommandAuthenticatedDbAbstract.beforeExecute(OServerCommandAuthenticatedDbAbstract.java:87)
-> com.orientechnologies.orient.server.network.protocol.http.command.get.OServerCommandGetConnect.beforeExecute(OServerCommandGetConnect.java:46)
-> com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.service(ONetworkProtocolHttpAbstract.java:173)
-> com.orientechnologies.orient.server.network.protocol.http.ONetworkProtocolHttpAbstract.execute(ONetworkProtocolHttpAbstract.java:572)
-> com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:45)
2014-07-04 13:53:15:082 SEVE Internal server error:
com.orientechnologies.orient.core.exception.OStorageException: Cannot open local storage '/Users/gsquare567/Databases/orientdb-community-1.7.3/databases/pumpup' with mode=rw
--> java.lang.NullPointerException [ONetworkProtocolHttpDb]
When I try to connect every subsequent time, I get the following:
--> com.orientechnologies.common.concur.lock.OLockException: File '/Users/gsquare567/Databases/orientdb-community-1.7.3/databases/pumpup/database.ocf'
is locked by another process, maybe the database is in use by another
process. Use the remote mode with a OrientDB server to allow multiple
access to the same database. [ONetworkProtocolHttpDb]
I can't connect to the database. I'm going to update from 1.7.3 to 1.7.4, recreate the database, and try again. For now, here's some output from dserver.sh as it seems to be trying to perform a data restore procedure:
2014-07-04 14:01:09:168 INFO [192.168.1.8]:2434 [orientdb] [3.2.2] Address[192.168.1.8]:2434 is STARTED [LifecycleService]
2014-07-04 14:01:09:198 INFO [192.168.1.8]:2434 [orientdb] [3.2.2] Initializing cluster partition table first arrangement... [InternalPartitionService]
2014-07-04 14:01:09:212 INFO [node1404496844581] found no previous messages in queue orientdb.node.node1404496844581.response [OHazelcastDistributedMessageService]
2014-07-04 14:01:09:230 WARN [node1404496844581] opening database 'pumpup'... [OHazelcastPlugin]
2014-07-04 14:01:09:231 INFO [node1404496844581] loaded database configuration from disk: /Users/gsquare567/Databases/orientdb-community-1.7.3/config/default-distributed-db-config.json [OHazelcastPlugin]
2014-07-04 14:01:09:238 INFO updated distributed configuration for database: pumpup:
----------
{
"autoDeploy":true,
"hotAlignment":false,
"readQuorum":1,
"writeQuorum":2,
"failureAvailableNodesLessQuorum":false,
"readYourWrites":true,"clusters":{
"internal":{
},
"index":{
},
"*":{
"servers":["<NEW_NODE>"]
}
},
"version":0
}
---------- [OHazelcastPlugin]
2014-07-04 14:01:09:243 INFO updated distributed configuration for database: pumpup:
----------
{
"version":0,
"autoDeploy":true,
"hotAlignment":false,
"readQuorum":1,
"writeQuorum":2,
"failureAvailableNodesLessQuorum":false,
"readYourWrites":true,"clusters":{
"internal":null,
"index":null,
"*":{
"servers":["<NEW_NODE>"]
}
}
}
---------- [OHazelcastPlugin]
2014-07-04 14:01:09:243 INFO Saving distributed configuration file for database 'pumpup' to: /Users/gsquare567/Databases/orientdb-community-1.7.3/databases/pumpup/distributed-config.json [OHazelcastPlugin]
2014-07-04 14:01:09:246 INFO [node1404496844581] adding node 'node1404496844581' in partition: db=pumpup [*] [OHazelcastDistributedDatabase]
2014-07-04 14:01:09:246 INFO updated distributed configuration for database: pumpup:
----------
{
"version":1,
"autoDeploy":true,
"hotAlignment":false,
"readQuorum":1,
"writeQuorum":2,
"failureAvailableNodesLessQuorum":false,
"readYourWrites":true,"clusters":{
"internal":null,
"index":null,
"*":{
"servers":["<NEW_NODE>","node1404496844581"]
}
}
}
---------- [OHazelcastPlugin]
2014-07-04 14:01:09:247 INFO Saving distributed configuration file for database 'pumpup' to: /Users/gsquare567/Databases/orientdb-community-1.7.3/databases/pumpup/distributed-config.json [OHazelcastPlugin]
2014-07-04 14:01:09:247 INFO [node1404496844581] received added status node1404496844581.pumpup=OFFLINE [OHazelcastPlugin]
2014-07-04 14:01:09:249 INFO [node1404496844581] found no previous messages in queue orientdb.node.node1404496844581.pumpup.request [OHazelcastDistributedMessageService]
2014-07-04 14:01:09:288 WARN segment file 'database.ocf' was not closed correctly last time [OSingleFileSegment]
2014-07-04 14:01:09:378 WARN Storage pumpup was not closed properly. Will try to restore from write ahead log. [OLocalPaginatedStorage]
2014-07-04 14:01:09:378 INFO Looking for last checkpoint... [OLocalPaginatedStorage]
2014-07-04 14:01:09:378 INFO Checkpoints are absent, the restore will start from the beginning. [OLocalPaginatedStorage]
2014-07-04 14:01:09:379 INFO Data restore procedure is started. [OLocalPaginatedStorage]
2014-07-04 14:01:35:724 INFO Heap memory is low apply batch of operations are read from WAL. [OLocalPaginatedStorage]
Exception during storage data restore.
null
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreWALBatch(OLocalPaginatedStorage.java:1842)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFrom(OLocalPaginatedStorage.java:1802)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFromBegging(OLocalPaginatedStorage.java:1772)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFromWAL(OLocalPaginatedStorage.java:1611)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreIfNeeded(OLocalPaginatedStorage.java:1578)
-> com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.open(OLocalPaginatedStorage.java:245)
-> com.orientechnologies.orient.core.db.raw.ODatabaseRaw.open(ODatabaseRaw.java:100)
-> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
-> com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:268)
-> com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
-> com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:283)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.setOnline(OHazelcastDistributedDatabase.java:295)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.loadDistributedDatabases(OHazelcastPlugin.java:742)
-> com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:194)
-> com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:720)
-> com.orientechnologies.orient.server.OServer.activate(OServer.java:241)
-> com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:32)
Exception in thread "main" com.orientechnologies.orient.core.exception.OStorageException: Cannot open local storage '/Users/gsquare567/Databases/orientdb-community-1.7.3/databases/pumpup' with mode=rw
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.open(OLocalPaginatedStorage.java:251)
at com.orientechnologies.orient.core.db.raw.ODatabaseRaw.open(ODatabaseRaw.java:100)
at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
at com.orientechnologies.orient.core.db.record.ODatabaseRecordAbstract.open(ODatabaseRecordAbstract.java:268)
at com.orientechnologies.orient.core.db.ODatabaseWrapperAbstract.open(ODatabaseWrapperAbstract.java:49)
at com.orientechnologies.orient.server.OServer.openDatabase(OServer.java:557)
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.initDatabaseInstance(OHazelcastDistributedDatabase.java:283)
at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.setOnline(OHazelcastDistributedDatabase.java:295)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.loadDistributedDatabases(OHazelcastPlugin.java:742)
at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:194)
at com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:720)
at com.orientechnologies.orient.server.OServer.activate(OServer.java:241)
at com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:32)
Caused by: java.lang.NullPointerException
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreWALBatch(OLocalPaginatedStorage.java:1842)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFrom(OLocalPaginatedStorage.java:1802)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFromBegging(OLocalPaginatedStorage.java:1772)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreFromWAL(OLocalPaginatedStorage.java:1611)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.restoreIfNeeded(OLocalPaginatedStorage.java:1578)
at com.orientechnologies.orient.core.storage.impl.local.paginated.OLocalPaginatedStorage.open(OLocalPaginatedStorage.java:245)
... 12 more
2014-07-04 14:01:39:184 INFO [192.168.1.8]:2434 [orientdb] [3.2.2] memory.used=1.3G, memory.free=155.9M, memory.total=1.4G, memory.max=1.8G, memory.used/total=89.32%, memory.used/max=71.63%, load.process=37.00%, load.system=41.00%, load.systemAverage=170.00%, thread.count=59, thread.peakCount=59, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=5, clientEndpoint.count=0, connection.active.count=0, connection.count=0 [HealthMonitor]
2014-07-04 14:02:09:195 INFO [192.168.1.8]:2434 [orientdb] [3.2.2] memory.used=1.3G, memory.free=155.2M, memory.total=1.4G, memory.max=1.8G, memory.used/total=89.37%, memory.used/max=71.68%, load.process=0.00%, load.system=4.00%, load.systemAverage=162.00%, thread.count=59, thread.peakCount=59, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=5, clientEndpoint.count=0, connection.active.count=0, connection.count=0 [HealthMonitor]
2014-07-04 14:02:39:207 INFO [192.168.1.8]:2434 [orientdb] [3.2.2] memory.used=1.3G, memory.free=149.3M, memory.total=1.4G, memory.max=1.8G, memory.used/total=89.77%, memory.used/max=72.00%, load.process=0.00%, load.system=5.00%, load.systemAverage=124.00%, thread.count=61, thread.peakCount=61, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=5, clientEndpoint.count=0, connection.active.count=0, connection.count=0 [HealthMonitor]
2014-07-04 14:03:09:218 INFO [192.168.1.8]:2434 [orientdb] [3.2.2] memory.used=1.3G, memory.free=149.2M, memory.total=1.4G, memory.max=1.8G, memory.used/total=89.78%, memory.used/max=72.00%, load.process=0.00%, load.system=6.00%, load.systemAverage=151.00%, thread.count=61, thread.peakCount=61, event.q.size=0, executor.q.async.size=0, executor.q.client.size=0, executor.q.operation.size=0, executor.q.query.size=0, executor.q.scheduled.size=0, executor.q.io.size=0, executor.q.system.size=0, executor.q.operation.size=0, executor.q.priorityOperation.size=0, executor.q.response.size=0, operations.remote.size=0, operations.running.size=0, proxy.count=5, clientEndpoint.count=0, connection.active.count=0, connection.count=0 [HealthMonitor]
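The HealthMonitor lines above are long, and the heap figures are easy to miss. Here is a minimal sketch for pulling the key=value metrics out of one such line so they can be compared across timestamps. The function names are hypothetical, and the regular expression is only an assumption based on the layout of the lines shown here.

```python
import re


def parse_health_line(line):
    """Extract key=value metrics from a HealthMonitor log line as strings."""
    # Keys look like "memory.used/max"; values run up to the next comma or space.
    return dict(re.findall(r"([\w./]+)=([^,\s]+)", line))


def heap_used_percent(line):
    """Return the memory.used/max figure of a HealthMonitor line as a float."""
    return float(parse_health_line(line)["memory.used/max"].rstrip("%"))


# Shortened copy of the first HealthMonitor line from the log above.
sample = (
    "2014-07-04 14:01:39:184 INFO [192.168.1.8]:2434 [orientdb] [3.2.2] "
    "memory.used=1.3G, memory.free=155.9M, memory.max=1.8G, "
    "memory.used/max=71.63%, thread.count=59 [HealthMonitor]"
)
print(heap_used_percent(sample))
```

Applied to the four lines above, memory.used/max reads 71.63%, 71.68%, 72.00% and 72.00%, so the heap stays close to three quarters full after the failed restore.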
EDIT
Here is my OrientDB info:
CLUSTERS
-----------+----+----------+---------+---------+
 NAME      | ID | TYPE     | DATASEG | RECORDS |
-----------+----+----------+---------+---------+
 default   |  3 | PHYSICAL |      -1 |       0 |
 e         | 10 | PHYSICAL |      -1 |       0 |
 index     |  1 | PHYSICAL |      -1 |       4 |
 internal  |  0 | PHYSICAL |      -1 |       3 |
 manindex  |  2 | PHYSICAL |      -1 |       1 |
 ofunction |  7 | PHYSICAL |      -1 |       0 |
 orids     |  6 | PHYSICAL |      -1 |       0 |
 orole     |  4 | PHYSICAL |      -1 |       3 |
 oschedule |  8 | PHYSICAL |      -1 |       0 |
 ouser     |  5 | PHYSICAL |      -1 |       3 |
 post      | 12 | PHYSICAL |      -1 | 1312295 |
 user      | 11 | PHYSICAL |      -1 |  205795 |
 v         |  9 | PHYSICAL |      -1 |       0 |
-----------+----+----------+---------+---------+
 TOTAL = 13                |         | 1518104 |
---------------------------+---------+---------+
CLASSES
-------------+-------------+----------+---------+
 NAME        | SUPERCLASS  | CLUSTERS | RECORDS |
-------------+-------------+----------+---------+
 E           |             |       10 |       0 |
 OFunction   |             |        7 |       0 |
 OIdentity   |             |        - |       0 |
 ORestricted |             |        - |       0 |
 ORIDs       |             |        6 |       0 |
 ORole       | OIdentity   |        4 |       3 |
 OSchedule   |             |        8 |       0 |
 OTriggered  |             |        - |       0 |
 OUser       | OIdentity   |        5 |       3 |
 ParseObject |             |        - |       0 |
 Post        | ParseObject |       12 | 1312295 |
 User        | ParseObject |       11 |  205795 |
 V           |             |        9 |       0 |
-------------+-------------+----------+---------+
 TOTAL = 13                             1518096 |
-------------+-------------+----------+---------+
INDEXES
---------------+------------+-------+----------+---------+
 NAME          | TYPE       | CLASS | FIELDS   | RECORDS |
---------------+------------+-------+----------+---------+
 dictionary    | DICTIONARY |       |          |       0 |
 ORole.name    | UNIQUE     | ORole | name     |       3 |
 OUser.name    | UNIQUE     | OUser | name     |       3 |
 Post.objectId | UNIQUE_... | Post  | objectId | 1312295 |
 User.objectId | UNIQUE_... | User  | objectId |  205795 |
---------------+------------+-------+----------+---------+
 TOTAL = 5                                       1518096 |
---------------------------------------------------------+
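The cluster total (1518104) and the class total (1518096) above differ by 8. A quick arithmetic check, with the figures copied by hand from the listing, shows that the gap is exactly the records sitting in clusters no class points to. The variable names are hypothetical.

```python
# Cluster name -> (cluster id, record count), copied from the CLUSTERS listing.
clusters = {
    "default": (3, 0), "e": (10, 0), "index": (1, 4), "internal": (0, 3),
    "manindex": (2, 1), "ofunction": (7, 0), "orids": (6, 0), "orole": (4, 3),
    "oschedule": (8, 0), "ouser": (5, 3), "post": (12, 1312295),
    "user": (11, 205795), "v": (9, 0),
}

# Class name -> (cluster id or None, record count), copied from CLASSES.
classes = {
    "E": (10, 0), "OFunction": (7, 0), "OIdentity": (None, 0),
    "ORestricted": (None, 0), "ORIDs": (6, 0), "ORole": (4, 3),
    "OSchedule": (8, 0), "OTriggered": (None, 0), "OUser": (5, 3),
    "ParseObject": (None, 0), "Post": (12, 1312295), "User": (11, 205795),
    "V": (9, 0),
}

cluster_total = sum(records for _, records in clusters.values())
class_total = sum(records for _, records in classes.values())

# Clusters whose id is not referenced by any class.
owned_ids = {cid for cid, _ in classes.values() if cid is not None}
unowned = {name: rec for name, (cid, rec) in clusters.items()
           if cid not in owned_ids}

print(cluster_total, class_total, cluster_total - class_total)
print(unowned)
```

The unowned clusters are default, index, internal and manindex, holding 0 + 4 + 3 + 1 = 8 records, so the two totals agree once those are accounted for.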

Resources