I am using Hive (version 0.11.0) and trying to join two tables.
One has 26,286,629 records and the other one has 931 records.
This is the query I am trying to run:
SELECT *
FROM table_1 a, table_2 b
WHERE a.chrom = b.chrom
AND a.start_pos >= b.start_pos
AND a.end_pos <= b.end_pos
LIMIT 10
;
It looks fine for the first few minutes, but once both the map and reduce tasks reach 100%, the reduce phase drops back to 0% and starts again.
2014-02-20 20:38:35,531 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 399.23 sec
2014-02-20 20:38:36,533 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 399.23 sec
2014-02-20 20:38:37,536 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 399.23 sec
2014-02-20 20:38:38,539 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 399.23 sec
2014-02-20 20:38:39,541 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 192.49 sec
2014-02-20 20:38:40,544 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 192.49 sec
2014-02-20 20:38:41,547 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 192.49 sec
2014-02-20 20:38:42,550 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 192.49 sec
2014-02-20 20:38:43,554 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 192.49 sec
Here are the last 4 KB of the map and reduce task logs.
Map Task Log (Last 4KB)
.hadoop.mapred.MapTask: bufstart = 99180857; bufend = 16035989; bufvoid = 99614720
2014-02-20 19:57:03,008 INFO org.apache.hadoop.mapred.MapTask: kvstart = 196599; kvend = 131062; length = 327680
2014-02-20 19:57:03,180 INFO org.apache.hadoop.mapred.MapTask: Finished spill 12
2014-02-20 19:57:04,244 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
2014-02-20 19:57:04,244 INFO org.apache.hadoop.mapred.MapTask: bufstart = 16035989; bufend = 32544041; bufvoid = 99614720
2014-02-20 19:57:04,244 INFO org.apache.hadoop.mapred.MapTask: kvstart = 131062; kvend = 65525; length = 327680
2014-02-20 19:57:04,399 INFO org.apache.hadoop.mapred.MapTask: Finished spill 13
2014-02-20 19:57:05,440 INFO org.apache.hadoop.mapred.MapTask: Spilling map output: record full = true
2014-02-20 19:57:05,440 INFO org.apache.hadoop.mapred.MapTask: bufstart = 32544041; bufend = 48952648; bufvoid = 99614720
2014-02-20 19:57:05,440 INFO org.apache.hadoop.mapred.MapTask: kvstart = 65525; kvend = 327669; length = 327680
2014-02-20 19:57:05,598 INFO org.apache.hadoop.mapred.MapTask: Finished spill 14
2014-02-20 19:57:05,767 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 9 forwarding 4000000 rows
2014-02-20 19:57:05,767 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarding 4000000 rows
2014-02-20 19:57:05,767 INFO ExecMapper: ExecMapper: processing 4000000 rows: used memory = 123701072
2014-02-20 19:57:06,562 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 9 finished. closing...
2014-02-20 19:57:06,574 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 9 forwarded 4182243 rows
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 2 finished. closing...
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 2 forwarded 0 rows
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 3 finished. closing...
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 3 forwarded 0 rows
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 2 Close done
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.MapOperator: DESERIALIZE_ERRORS:0
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 finished. closing...
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 forwarded 4182243 rows
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 1 finished. closing...
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.ReduceSinkOperator: 1 forwarded 0 rows
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: 0 Close done
2014-02-20 19:57:06,575 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 9 Close done
2014-02-20 19:57:06,575 INFO ExecMapper: ExecMapper: processed 4182243 rows: used memory = 128772720
2014-02-20 19:57:06,577 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
2014-02-20 19:57:06,713 INFO org.apache.hadoop.mapred.MapTask: Finished spill 15
2014-02-20 19:57:06,720 INFO org.apache.hadoop.mapred.Merger: Merging 16 sorted segments
2014-02-20 19:57:06,730 INFO org.apache.hadoop.mapred.Merger: Merging 7 intermediate segments out of a total of 16
2014-02-20 19:57:08,308 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 272242546 bytes
2014-02-20 19:57:11,762 INFO org.apache.hadoop.mapred.Task: Task:attempt_201402201604_0005_m_000000_0 is done. And is in the process of commiting
2014-02-20 19:57:11,834 INFO org.apache.hadoop.mapred.Task: Task 'attempt_201402201604_0005_m_000000_0' done.
2014-02-20 19:57:11,837 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2014-02-20 19:57:11,868 INFO org.apache.hadoop.io.nativeio.NativeIO: Initialized cache for UID to User mapping with a cache timeout of 14400 seconds.
2014-02-20 19:57:11,869 INFO org.apache.hadoop.io.nativeio.NativeIO: Got UserName root for UID 0 from the native implementation
Reduce Task Log (Last 4KB)
x.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)
2014-02-20 20:11:32,743 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null
2014-02-20 20:11:32,743 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/tmp/hive-root/hive_2014-02-20_19-56-40_541_484791785779427461/_task_tmp.-ext-10001/_tmp.000000_0" - Aborting...
2014-02-20 20:11:32,744 ERROR org.apache.hadoop.hdfs.DFSClient: Failed to close file /tmp/hive-root/hive_2014-02-20_19-56-40_541_484791785779427461/_task_tmp.-ext-10001/_tmp.000000_0
org.apache.hadoop.ipc.RemoteException:org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /tmp/hive-root/hive_2014-02-20_19-56-40_541_484791785779427461/_task_tmp.-ext-10001/_tmp.000000_0 File does not exist. Holder DFSClient_attempt_201402201604_0005_r_000000_0_636140892_1 does not have any open files
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1999)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1990)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1899)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
at org.apache.hadoop.ipc.Client.call(Client.java:1113)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy2.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)
NameNode log around 2014-02-20 20:11:32:
2014-02-20 20:01:28,769 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 93 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 46 SyncTimes(ms): 29
2014-02-20 20:11:32,984 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 94 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 47 SyncTimes(ms): 30
2014-02-20 20:11:32,987 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:root cause:org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /tmp/hive-root/hive_2014-02-20_19-56-40_541_484791785779427461/_task_tmp.-ext-10001/_tmp.000000_0 File does not exist. Holder DFSClient_attempt_201402201604_0005_r_000000_0_636140892_1 does not have any open files
2014-02-20 20:11:32,987 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 9000, call addBlock(/tmp/hive-root/hive_2014-02-20_19-56-40_541_484791785779427461/_task_tmp.-ext-10001/_tmp.000000_0, DFSClient_attempt_201402201604_0005_r_000000_0_636140892_1, null) from 172.27.250.92:55640: error: org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /tmp/hive-root/hive_2014-02-20_19-56-40_541_484791785779427461/_task_tmp.-ext-10001/_tmp.000000_0 File does not exist. Holder DFSClient_attempt_201402201604_0005_r_000000_0_636140892_1 does not have any open files
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException: No lease on /tmp/hive-root/hive_2014-02-20_19-56-40_541_484791785779427461/_task_tmp.-ext-10001/_tmp.000000_0 File does not exist. Holder DFSClient_attempt_201402201604_0005_r_000000_0_636140892_1 does not have any open files
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1999)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1990)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1899)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Unknown Source)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)
2014-02-20 20:14:48,995 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 96 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 48 SyncTimes(ms): 31
2014-02-20 20:18:28,063 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Roll Edit Log from 172.27.114.218
2014-02-20 20:18:28,064 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: Number of transactions: 96 Total time for transactions(ms): 2 Number of transactions batched in Syncs: 0 Number of syncs: 49 SyncTimes(ms): 32
Can anyone help me?
Thanks in advance.
Hive 0.11 is not able to infer JOIN conditions from the WHERE clause. The query as you wrote it is executed as a cross-product join and then filtered by the WHERE conditions, which is extremely expensive. Use this instead:
SELECT *
FROM table_1 a JOIN table_2 b ON (a.chrom = b.chrom)
WHERE
a.start_pos >= b.start_pos
AND a.end_pos <= b.end_pos
LIMIT 10;
Based on what you've said, this should be executed as a mapjoin and will be much faster.
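If Hive does not pick the map join on its own, you can nudge it. A minimal sketch, using the setting and hint that exist in Hive, with the table names from the rewritten query above:
-- Let Hive convert the join to a map join automatically; the 931-row table
-- easily fits in memory.
SET hive.auto.convert.join=true;

-- Alternatively, the classic MAPJOIN hint loads the small table b into memory
-- on each mapper (note: some Hive versions ignore the hint when automatic
-- conversion is enabled).
SELECT /*+ MAPJOIN(b) */ *
FROM table_1 a
JOIN table_2 b ON (a.chrom = b.chrom)
WHERE a.start_pos >= b.start_pos
AND a.end_pos <= b.end_pos
LIMIT 10;
With a map join the small table is broadcast to the mappers and the join never has to shuffle the 26 million rows to a single reducer.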
Related
Error: This has disconnected the Server from Cluster and stopped the Distributed-Connector Services. Please help
[2021-12-08 14:07:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:17:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:24:23,198] INFO [GroupCoordinator 0]: Member connect-1-37b915a1-36f0-47df-81f0-f5b67985f278 in group connect-cluster has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:24:23,200] INFO [GroupCoordinator 0]: Preparing to rebalance group connect-cluster in state PreparingRebalance with old generation 18 (__consumer_offsets-13) (reason: removing member connect-1-37b915a1-36f0-47df-81f0-f5b67985f278 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:24:25,680] INFO [GroupCoordinator 0]: Stabilized group connect-cluster generation 19 (__consumer_offsets-13) (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:24:25,717] INFO [GroupCoordinator 0]: Assignment received from leader for group connect-cluster for generation 19 (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:27:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:37:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:47:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:57:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
I have a table with a partition key and a clustering column, and I enabled row caching with 2000 rows per partition. I gave the row cache 20 GB by changing the yaml file.
I can see that my hit rate is ~90% (which is good enough for me), but monitoring shows no latency reduction at all (it's even a bit higher after enabling the cache). Note that the query I'm running fetches all rows for a given partition key; the number of rows per partition varies from 10 to 10,000.
Any idea why Cassandra row caching is not effective in my case?
Host: EC2 r4.16x, which has 64 cores and 488GB memory.
C* version: 3.11.0
=========nodetool info=========:
ID : 1f05c846-ddd6-4409-8473-10f3c0490279
Gossip active : true
Thrift active : false
Native Transport active: true
Load : 145.95 GiB
Generation No : 1506984130
Uptime (seconds) : 69561
Heap Memory (MB) : 4707.58 / 7987.25
Off Heap Memory (MB) : 680.54
Data Center : us-east
Rack : 1e
Exceptions : 0
Key Cache : entries 671231, size 100 MiB, capacity 100 MiB, 558160978 hits, 566196579 requests, 0.986 recent hit rate, 14400 save period in seconds
Row Cache : entries 1225796, size 17.8 GiB, capacity 19.53 GiB, 80015143 hits, 86502918 requests, 0.925 recent hit rate, 0 save period in seconds
Counter Cache : entries 0, size 0 bytes, capacity 50 MiB, 0 hits, 0 requests, NaN recent hit rate, 7200 save period in seconds
Chunk Cache : entries 122880, size 480 MiB, capacity 480 MiB, 334907479 misses, 6940124103 requests, 0.952 recent hit rate, 13.407 microseconds miss latency
Percent Repaired : 85.28384276963543%
Token : (invoke with -T/--tokens to see all 256 tokens)
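For reference, this kind of setup is usually expressed roughly as follows (a sketch only - the keyspace and table names are placeholders, and 20480 MB is just 20 GB spelled out):
-- cassandra.yaml (per node): row_cache_size_in_mb: 20480
ALTER TABLE my_keyspace.my_table
WITH caching = {'keys': 'ALL', 'rows_per_partition': '2000'};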
2017-07-19 09:04:17.542 [0%] [944896 sec remaining] Web synchronization progress: 94% complete.
Article Upload Statistics:
FILE_REPLICA:
Relative Cost: 4.87%
PUBLISH_DOCUMENTS:
Updates: 827
Relative Cost: 76.73%
WF_ACTIVE_ROUTING_HISTORY:
Relative Cost: 4.29%
WF_RUN_ROUTING_HISTORY_REV:
Relative Cost: 1.87%
WF_RUN_STAGE_RES_LIST_PRES:
Relative Cost: 1.83%
WF_RUN_STAGE_STATUS_PRES:
Relative Cost: 1.83%
ORDER_RES_GROUP:
Relative Cost: 5.54%
WF_RUN_ROUTING_HISTORY:
Relative Cost: 3.04%
Article Download Statistics:
FILE_REPLICA:
Relative Cost: 7.61%
PUBLISH_DOCUMENTS:
Relative Cost: 4.18%
WF_ACTIVE_ROUTING_HISTORY:
Relative Cost: 29.20%
WF_RUN_ROUTING_HISTORY_REV:
Relative Cost: 13.25%
WF_RUN_STAGE_RES_LIST_PRES:
Relative Cost: 19.39%
WF_RUN_STAGE_STATUS_PRES:
Relative Cost: 6.54%
ORDER_RES_GROUP:
Relative Cost: 9.05%
WF_RUN_ROUTING_HISTORY:
Relative Cost: 10.78%
Session Statistics:
Upload Updates: 827
Deadlocks encountered: 18
Change Delivery Time: 753 sec
Schema Change and Bulk Insert Time: 5 sec
Delivery Rate: 1.10 rows/sec
Total Session Duration: 6556 sec
=============================================================
2017-07-19 09:04:17.596 Connecting to Subscriber 'VMSQL2014'
2017-07-19 09:04:17.609 The upload message to be sent to Publisher 'VMSQL2014' is being generated
2017-07-19 09:04:17.613 The merge process is using Exchange ID '86D0215F-E4E3-4FC1-99F4-BC9E05ACDA21' for this web synchronization session.
2017-07-19 09:04:20.168 Uploading data changes to the Publisher
2017-07-19 09:04:22.980 A query executing on Subscriber 'VMSQL2014' failed because the connection was chosen as the victim in a deadlock. Please rerun the merge process if you still see this error after internal retries by the merge process.
2017-07-19 09:04:25.513 [0%] [1227049 sec remaining] Request message generated, now making it ready for upload.
2017-07-19 09:04:25.561 [0%] [1227049 sec remaining] Upload request size is 260442 bytes.
2017-07-19 09:04:27.462 [0%] [1227049 sec remaining] Uploaded a total of 55 chunks.
2017-07-19 09:04:27.466 [0%] [1227049 sec remaining] The request message was sent to 'https://webserver/SQLReplication/replisapi.dll'
2017-07-19 09:09:28.676 The operation timed out
2017-07-19 09:09:28.679 Category:NULL
Source: Merge Process
Number: -2147209502
Message: The operation timed out
2017-07-19 09:09:28.680 Category:NULL
Source: Merge Process
Number: -2147209502
Message: The processing of the response message failed.
It says deadlocks were encountered. A deadlock happens when two transactions each hold a lock the other one needs, so neither can proceed, and SQL Server resolves it by killing one of them (the "victim"). Most likely another program or user is writing to the same rows your merge needs, and your session was chosen as the victim.
You can:
Implement a retry procedure so your merge tries again if it is deadlocked (sketched below).
Keep other programs/users out of the affected tables while the merge runs.
There are likely other options to get around this issue as well; search for "avoid SQL Server deadlock".
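The merge agent already retries internally (the log above even says so), so this is only a generic sketch of the retry idea for an ad-hoc T-SQL statement; the table and statement here are hypothetical stand-ins:
DECLARE @retries int = 3;
WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        -- hypothetical statement standing in for the work that keeps being
        -- chosen as the deadlock victim
        UPDATE dbo.PUBLISH_DOCUMENTS SET Status = 1 WHERE DocumentId = 42;
        COMMIT TRANSACTION;
        BREAK;                                     -- success: stop retrying
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        IF ERROR_NUMBER() <> 1205 OR @retries = 1  -- 1205 = chosen as deadlock victim
            THROW;                                 -- not a deadlock, or out of retries
        SET @retries -= 1;
        WAITFOR DELAY '00:00:05';                  -- back off, then try again
    END CATCH
END;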
We have an application that writes logs to Azure SQL tables. The structure of the table is the following:
CREATE TABLE [dbo].[xyz_event_history]
(
[event_history_id] [uniqueidentifier] NOT NULL,
[event_date_time] [datetime] NOT NULL,
[instance_id] [uniqueidentifier] NOT NULL,
[scheduled_task_id] [int] NOT NULL,
[scheduled_start_time] [datetime] NULL,
[actual_start_time] [datetime] NULL,
[actual_end_time] [datetime] NULL,
[status] [int] NOT NULL,
[log] [nvarchar](max) NULL,
CONSTRAINT [PK__crg_scheduler_event_history] PRIMARY KEY NONCLUSTERED
(
[event_history_id] ASC
)
)
The table is stored as a clustered index on the scheduled_task_id column (non-unique):
CREATE CLUSTERED INDEX [IDX__xyz_event_history__scheduled_task_id] ON [dbo].[xyz_event_history]
(
[scheduled_task_id] ASC
)
The event_history_id is generated by the application; it's a random (not sequential) GUID. The application creates, updates, and removes old entities from the table. The log column usually holds 2-10 KB of data, but it can grow up to 5-10 MB in some cases. Items are usually accessed by the PK (event_history_id), and the most frequent sort order is event_date_time DESC.
The problem we see after lowering the performance tier of the Azure SQL database to "S3" (100 DTUs) is that we cross the transaction log rate limits. It can clearly be seen in the sys.dm_exec_requests DMV - there are requests with wait type LOG_RATE_GOVERNOR (MSDN):
Occurs when DB is waiting for quota to write to the log.
The operations I've noticed that have a big impact on the log rate are deletions from xyz_event_history and updates to the log column. The updates are made in the following fashion:
UPDATE xyz_event_history
SET [log] = COALESCE([log], '') + @log_to_append
WHERE event_history_id = @id
The recovery model for Azure SQL databases is FULL and cannot be changed.
Here are the physical index statistics - many records are close to the 8 KB per-row limit:
TableName AllocUnitTp PgCt AvgPgSpcUsed RcdCt MinRcdSz MaxRcdSz
xyz_event_history IN_ROW_DATA 4145 47.6372868791698 43771 102 7864
xyz_event_history IN_ROW_DATA 59 18.1995058067705 4145 11 19
xyz_event_history IN_ROW_DATA 4 3.75277983691623 59 11 19
xyz_event_history IN_ROW_DATA 1 0.914257474672597 4 11 19
xyz_event_history LOB_DATA 168191 97.592290585619 169479 38 8068
xyz_event_history IN_ROW_DATA 7062 3.65090190264393 43771 38 46
xyz_event_history IN_ROW_DATA 99 22.0080800593032 7062 23 23
xyz_event_history IN_ROW_DATA 1 30.5534964170991 99 23 23
xyz_event_history IN_ROW_DATA 2339 9.15620212503089 43771 16 38
xyz_event_history IN_ROW_DATA 96 8.70488015814184 2339 27 27
xyz_event_history IN_ROW_DATA 1 34.3711391153941 96 27 27
xyz_event_history IN_ROW_DATA 1054 26.5034840622683 43771 28 50
xyz_event_history IN_ROW_DATA 139 3.81632073140598 1054 39 39
xyz_event_history IN_ROW_DATA 1 70.3854707190511 139 39 39
Is there a way to reduce transaction log usage?
How does SQL Server log UPDATE transactions such as the one in the example above? Is it just the "old" plus the "new" value? (That would presumably make frequently appending small pieces of data quite inefficient in terms of transaction log size.)
UPDATE (April 20):
I've experimented with the suggestions in the answers and was impressed by the difference that using INSERT instead of UPDATE makes.
As per the following MSDN article about SQL Server transaction log internals (https://technet.microsoft.com/en-us/library/jj835093(v=sql.110).aspx):
Log records for data modifications record either the logical operation
performed or they record the before and after images of the modified
data. The before image is a copy of the data before the operation is
performed; the after image is a copy of the data after the operation
has been performed.
This automatically makes the scenario with UPDATE ... SET X = X + 'more' highly inefficient in terms of transaction log usage - it requires "before image" capture.
I've created a simple test suite to compare the original way of appending data to the "log" column with an approach that simply inserts each new piece of data as a row in a new table. The results were quite astonishing (at least for me, as someone not too experienced with SQL Server).
The test is simple: append a 1,024-character chunk of log 5,000 times - about 5 MB of text in total (not as much as one might think).
FULL recovery mode, SQL Server 2014, Windows 10, SSD
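For reference, the test boils down to the following two loops (a simplified sketch with made-up table names, not the exact test code; each test table is assumed to already exist):
-- Assumed tables:
--   dbo.test_update_log(id int PRIMARY KEY, [log] nvarchar(max)) with one row, id = 1
--   dbo.test_insert_log(seq int PRIMARY KEY, [log] nvarchar(max))
DECLARE @chunk nvarchar(max) = REPLICATE(N'x', 1024);   -- 1,024-character piece
DECLARE @i int = 1;

-- Variant 1: append the chunk to one nvarchar(max) value 5,000 times (original pattern).
WHILE @i <= 5000
BEGIN
    UPDATE dbo.test_update_log
    SET [log] = COALESCE([log], N'') + @chunk
    WHERE id = 1;
    SET @i += 1;
END;

-- Variant 2: insert each chunk as its own row instead.
SET @i = 1;
WHILE @i <= 5000
BEGIN
    INSERT INTO dbo.test_insert_log (seq, [log]) VALUES (@i, @chunk);
    SET @i += 1;
END;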
                   UPDATE         INSERT
Duration           07:48 (!)      00:02
Data file growth   ~8 MB          ~8 MB
Tran. log growth   ~218 MB (!)    0 MB (why?!)
Just 5,000 updates that each append 1 KB of data can keep SQL Server busy for 8 minutes (wow!) - I didn't expect that!
I think the original question is resolved at this point, but it raised the following new ones:
Why does the transaction log growth look linear (not quadratic, as we could expect when "before" and "after" images are simply captured)? From the diagram we can see that "items per second" is proportional to the square root - which is what we'd expect if the overhead grows linearly with the number of items inserted.
Why, in the INSERT case, does the transaction log appear to be the same size as before any inserts at all?
I took a look at the transaction log (with Dell's Toad) for the INSERT case, and it looks like only the last 297 items are in there - presumably the transaction log got truncated, but why, if it's in FULL recovery mode?
UPDATE (April 21):
DBCC LOGINFO output for the INSERT case - before and after. The physical size of the log file matches the output - exactly 1,048,576 bytes on disk.
Why does it look like the transaction log stays the same size?
Before:
RecoveryUnitId FileId FileSize StartOffset FSeqNo Status Parity CreateLSN
0 2 253952 8192 131161 0 64 0
0 2 253952 262144 131162 2 64 0
0 2 253952 516096 131159 0 128 0
0 2 278528 770048 131160 0 128 0
After:
RecoveryUnitId FileId FileSize StartOffset FSeqNo Status Parity CreateLSN
0 2 253952 8192 131221 0 128 0
0 2 253952 262144 131222 0 128 0
0 2 253952 516096 131223 2 128 0
0 2 278528 770048 131224 2 128 0
For those who are interested, I recorded "sqlservr.exe" activity using Process Monitor - I can see the file being overwritten again and again - it looks like SQL Server treats the old log records as no longer needed for some reason: https://dl.dropboxusercontent.com/u/1323651/stackoverflow-sql-server-transaction-log.pml
UPDATE (April 24): It seems I've finally understood what is going on and I want to share it with you. The reasoning above is true in general, but it has a serious caveat that also produced the confusion about the strange transaction log reuse with INSERTs:
A database behaves as if it were in SIMPLE recovery mode until the first full backup is taken (even though it is in FULL recovery mode).
So the numbers and diagram above can be treated as valid for SIMPLE recovery mode, and I had to redo my measurements with a real FULL recovery model - they are even more astonishing.
                   UPDATE         INSERT
Duration           13:20 (!)      00:02
Data file growth   8 MB           11 MB
Tran. log growth   55.2 GB (!)    14 MB
You are violating one of the basic tenets of normal form with the log field. The log field seems to hold an ever-growing appended sequence of information related to the primary key. The fix is to stop doing that.
1. Create a table: xyz_event_history_LOG(event_history_id, log_sequence#, log).
2. Stop doing updates to the log field in [xyz_event_history]; instead, do inserts into xyz_event_history_LOG, as sketched below.
The amount of data in your transaction log will decrease GREATLY.
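A minimal sketch of that change (column types are guesses based on the original table, the sequence column is spelled log_sequence_no here, and @id / @log_to_append are the same parameters as in the question):
-- Append-only table: one row per appended log fragment.
CREATE TABLE dbo.xyz_event_history_LOG
(
    event_history_id uniqueidentifier NOT NULL,   -- references xyz_event_history
    log_sequence_no  int              NOT NULL,   -- order of fragments within one event
    [log]            nvarchar(max)    NULL,
    CONSTRAINT PK_xyz_event_history_LOG
        PRIMARY KEY CLUSTERED (event_history_id, log_sequence_no)
);

-- Instead of UPDATE ... SET [log] = [log] + @log_to_append, append a new row:
INSERT INTO dbo.xyz_event_history_LOG (event_history_id, log_sequence_no, [log])
SELECT @id,
       COALESCE(MAX(log_sequence_no), 0) + 1,
       @log_to_append
FROM dbo.xyz_event_history_LOG
WHERE event_history_id = @id;
This way each write logs only the new fragment, with no "before image" of the whole accumulated value.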
The transaction log contains all the changes to a database in the order they were made, so if you update a row multiple times you will get multiple entries for that row. It does store the entire value, old and new, so you are correct that many small updates to a large data type such as nvarchar(max) are inefficient; you would be better off storing the updates in separate columns if they are only small values.
I need to run only 30 tasks per hour and at most 2 tasks per minute. I am not sure how to configure both conditions at the same time. Currently I have the following setup:
- name: task-queue
rate: 2/m # 2 tasks per minute
bucket_size: 1
max_concurrent_requests: 1
retry_parameters:
task_retry_limit: 0
min_backoff_seconds: 10
But I don't understand how to add the first condition (30 tasks per hour) there.
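For what it's worth, rate in queue.yaml is the steady token-refill rate and bucket_size caps the burst, so one way to approximate both limits might look like the sketch below. This is only an assumption about how the two constraints interact, not a verified recipe: it caps the long-run rate at 30 per hour and bursts at 2, but does not strictly guarantee "at most 2 in every single minute".
- name: task-queue
  rate: 30/h                 # long-run cap: 30 tasks per hour
  bucket_size: 2             # never dispatch more than 2 tasks in a burst
  max_concurrent_requests: 1
  retry_parameters:
    task_retry_limit: 0
    min_backoff_seconds: 10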