I have a Camel route that reads records from a Kinesis stream. I accidentally paused the route and re-enabled it 10 minutes later. The Kinesis iterator expires after 5 minutes. Even though the route has been resumed, the Camel Kinesis component seems to be holding on to the expired iterator and is unable to get records from the Kinesis stream. Is there any way to fix the issue without a restart?
amazonKinesisClient=%23KinesisClient&iteratorType=AFTER_SEQUENCE_NUMBER&maxResultsPerRequest=100&sequenceNumber=4957961397643123123123123933769851155815690236067842&shardId=shardId-000000000000.
Will try again at next poll. Caused by:
[com.amazonaws.services.kinesis.model.ExpiredIteratorException -
Iterator expired. The iterator was created at time Thu Mar 22 02:32:32
UTC 2018 while right now it is Thu Mar 22 03:54:44 UTC 2018 which is
further in the future than the tolerated delay of 300000 milliseconds.
(Service: AmazonKinesis; Status Code: 400; Error Code:
ExpiredIteratorException; Request ID:
f5434586-4645-45d9-a66a-1232d2e45678)]
I have a simple Apache Flink job:
**DataSource (Apache Kafka) - Filter - KeyBy - CEP Pattern (with timer) - PatternProcessFunction - KeyedProcessFunction (*here I have a ValueState(Boolean) and register a 5-minute timer. If the ValueState is not null, I update it (nothing is sent to the collector) and reset the timer. If the ValueState is null, I store TRUE in the state, send the input event to the collector, and set the timer. When onTimer fires, I clear the ValueState*) - Sink (Apache Kafka)**.
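To make the KeyedProcessFunction logic described above concrete, here is a minimal plain-Python simulation of it. This is not Flink API: the class `DedupWithTimer`, its two dictionaries, and the method names are hypothetical stand-ins for the keyed `ValueState` and the timer service, and time is passed in explicitly as milliseconds.

```python
from typing import Dict, List, Optional

TIMER_MS = 5 * 60 * 1000  # the 5-minute timer from the job description


class DedupWithTimer:
    """Plain-Python stand-in for the described KeyedProcessFunction (not Flink API)."""

    def __init__(self) -> None:
        self.seen: Dict[str, bool] = {}   # plays the role of ValueState(Boolean), per key
        self.timers: Dict[str, int] = {}  # key -> time (ms) at which its timer fires

    def process_element(self, key: str, event: str, now_ms: int) -> Optional[str]:
        """Forward the first event of a key; suppress later ones until the timer fires."""
        first = key not in self.seen
        self.seen[key] = True                 # store or refresh the state
        self.timers[key] = now_ms + TIMER_MS  # set or reset the timer
        return event if first else None

    def advance_time(self, now_ms: int) -> List[str]:
        """Fire all due timers, clearing the state of their keys (the onTimer step)."""
        fired = [key for key, due in self.timers.items() if due <= now_ms]
        for key in fired:
            del self.timers[key]
            self.seen.pop(key, None)
        return fired


if __name__ == "__main__":
    f = DedupWithTimer()
    print(f.process_element("1", "e1", 0))     # first event for key "1" is forwarded
    print(f.process_element("1", "e2", 1000))  # second event only resets the timer
    print(f.advance_time(1000 + TIMER_MS))     # timer fires and the state is cleared
```

In this sketch every distinct key holds one state entry and one timer until its timer fires, which is the part of the job that the state-size questions below are about.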
Job settings:
**Checkpointing interval: 5000ms**
**Incremental checkpointing: true**
**Semantic: Exactly Once**
**State Backend: RocksDB**
**Parallelism: 4**
The job logic works correctly, but I have some problems.
I ran two tests on my cluster (2 JobManagers and 3 TaskManagers):
**First test:**
I started my job, connected it to an empty Apache Kafka topic, and then saw the following in the Flink Web UI **Checkpointing Statistics:**
1) Latest Acknowledgement - Trigger Time = 5000ms (like my checkpoint interval)
2) State size = 340 KB at each 5-second interval
3) All statuses were completed (blue).
**Second test:**
I started sending JSON messages with different keys (from "1" to Integer.MAX_VALUE) to the Apache Kafka topic at a rate of 1000 messages/sec, and then saw the following in the Flink Web UI **Checkpointing Statistics:**
1) Latest Acknowledgement - Trigger Time = 1 - 6 minutes
**My question #1: Why is this time growing? Is that bad or OK?**
2) State size was constantly growing. I sent messages to Kafka for about 10 minutes (1000 x 60 x 10 = 600000 messages). After sending, the state size was 100 MB - 150 MB.
3) After sending, I waited about one hour and saw that:
Latest Acknowledgement - Trigger Time = 5000ms (like my checkpoint interval)
State size was 100 MB - 150 MB at each 5-second interval.
**My question #2: Why doesn't it decrease? I checked my job logs and saw 600000 records of "ValueState for key was cleared" (the onTimer method ran successfully), and the job logic (see the description of my KeyedProcessFunction) was working as intended.**
What have I tried?
1) Setting a pause between checkpoints
2) Disabling incremental checkpoints
3) Enabling async checkpoints (in flink-conf.yaml)
None of this changed anything.
**My question #3: What should I do? On the production server the rate is *10 million messages/hour* and the checkpoint size grows immediately.**
Version: Redis 5.0.3
In redis.conf there is an option to set the snapshot period. I set it to every 5 seconds if at least one value has changed, to see how Redis performs while it dumps. I ran two applications: one is redis-server and the other is redis-benchmark.
While watching the log, I noticed something interesting, shown below.
7269:C 27 Feb 2019 14:48:39.463 * RDB: 4535 MB of memory used by copy-on-write
7257:M 27 Feb 2019 14:48:39.939 * Background saving terminated with success
7257:M 27 Feb 2019 14:48:45.085 * 10 changes in 5 seconds. Saving...
7257:M 27 Feb 2019 14:48:45.187 * Background saving started by pid 7270
7270:C 27 Feb 2019 14:49:00.313 * DB saved on disk
7270:C 27 Feb 2019 14:49:00.401 * RDB: 4535 MB of memory used by copy-on-write
7257:M 27 Feb 2019 14:49:00.882 * Background saving terminated with success
7257:M 27 Feb 2019 14:49:06.011 * 10 changes in 5 seconds. Saving...
7257:M 27 Feb 2019 14:49:06.114 * Background saving started by pid 7271
7271:C 27 Feb 2019 14:49:21.086 * DB saved on disk
7271:C 27 Feb 2019 14:49:21.173 * RDB: 4534 MB of memory used by copy-on-write
7257:M 27 Feb 2019 14:49:21.706 * Background saving terminated with success
7257:M 27 Feb 2019 14:49:27.048 * 10 changes in 5 seconds. Saving...
7257:M 27 Feb 2019 14:49:27.155 * Background saving started by pid 7273
7273:C 27 Feb 2019 14:49:42.295 * DB saved on disk
7273:C 27 Feb 2019 14:49:42.382 * RDB: 4529 MB of memory used by copy-on-write
7257:M 27 Feb 2019 14:49:42.846 * Background saving terminated with success
7257:M 27 Feb 2019 14:49:48.023 * 10 changes in 5 seconds. Saving...
7257:M 27 Feb 2019 14:49:48.126 * Background saving started by pid 7274
7274:C 27 Feb 2019 14:50:05.251 * DB saved on disk
7274:C 27 Feb 2019 14:50:05.367 * RDB: 15 MB of memory used by copy-on-write
7257:M 27 Feb 2019 14:50:05.583 * Background saving terminated with success
As you can see, the dumped data has almost the same size each time, and the last one is much smaller. What I don't understand is why the size stays the same and why the last one is so small. (While Redis is dumping, the client is issuing set operations; the last dump probably coincides with the end of the set operations and the start of the get operations.)
To find out the reason, I looked at the code, but I still don't know why the log shows these numbers.
If you look at rdb.c in the Redis source, you can find code like this:
int rdbSave(char *filename, rdbSaveInfo *rsi) {
...
snprintf(tmpfile, 256, "temp-%d.rdb", (int) getpid());
fp = fopen(tmpfile, "w");
...
rdbSaveRio(...);
}
From my understanding, every time Redis dumps the in-memory data, it should overwrite the previously saved data, and the new data should be larger than before. However, according to the log, the size does not increase linearly and even decreases at the last dump.
Am I missing some part of Redis features?
Edit
According to the comment, I definitely misinterpreted the logs. However, I still have a question about performance. When a snapshot happens and there is private dirty memory, the system saves it to disk. At this point, as I understand the Redis dump mechanism, although the system only records in the log the amount of private dirty memory caused by the set operations, it saves all the data in memory. That means the file on disk grows with every dump, and I would expect that to make performance worse. But in the benchmark results I see the same performance drop regardless of the growing size on disk. I wonder why the drop rate is the same and what is going on internally.
graph
In the graph above, the blue line shows the throughput. It drops whenever a snapshot happens, and even though the second drop phase saves more data to disk than the first, the drop rate is the same. So my question is: is performance only affected by saving the private memory?
I recently wrote an app that calls Google's API to get a list of threads from a user's inbox. For some of the emails I'm getting an incorrect sent time. For example, I just queried it for a user: it's currently 3:55pm EST here, yet the Date header has the value Mon, 11 Sep 2017 19:14:53 +0000, i.e. 7:14pm EST. Each time I've encountered a time in the future, it has +0000 at the end of the string. I'm assuming that's a timezone offset, but if it were, wouldn't it have a value other than 0?
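A `+0000` suffix in an email Date header is a valid offset and means the time is expressed in UTC, so it has to be converted before it is compared with a local wall-clock time. The sketch below parses the header value from the question with Python's standard library and shifts it to a fixed UTC-4 offset. The helper name `to_offset` is hypothetical, and UTC-4 is an assumption: it is the offset US Eastern time uses in September (daylight time), which may not be the offset your users need.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def to_offset(date_header: str, utc_offset_hours: int) -> datetime:
    """Parse an RFC 2822 Date header and express it at a fixed UTC offset."""
    sent = parsedate_to_datetime(date_header)  # timezone-aware for a "+0000" header
    return sent.astimezone(timezone(timedelta(hours=utc_offset_hours)))


if __name__ == "__main__":
    header = "Mon, 11 Sep 2017 19:14:53 +0000"
    # Assumption: UTC-4, i.e. US Eastern daylight time in September.
    print(to_offset(header, -4).isoformat())  # 2017-09-11T15:14:53-04:00
```

Read this way, the header corresponds to 3:14pm at UTC-4, which is before the 3:55pm in the question rather than in the future.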
I am using sqlite3 as the database manager for my application, developed on a Raspberry Pi 3.
My table has around 200 columns (not that many), mostly boolean and numeric fields.
I add a (complete) record every minute. The DB is accessed from a C program using transactions.
Each transaction consists of one insert and 6 updates (to keep the code readable), which avoids writing one very long insert query.
The db file is on the filesystem (hence on the SD card) inside the home folder.
For every transaction the db is opened, the pragmas
PRAGMA synchronous = NORMAL;
PRAGMA journal_mode = WAL;
are set and the query is performed.
Performance is good on average, but in the timing log I see a peak every once in a while.
An extract of the log follows:
Apr 28 07:06:13 db write took 45.200000 ms
Apr 28 07:07:13 db commit took 0.302000 ms
Apr 28 07:07:13 db write took 75.858000 ms
Apr 28 07:08:13 db commit took 0.354000 ms
Apr 28 07:08:13 db write took 75.395000 ms
Apr 28 07:09:13 db commit took 0.268000 ms
Apr 28 07:09:13 db write took 40.620000 ms
Apr 28 07:10:13 db commit took 0.437000 ms
Apr 28 07:10:13 db write took 81.910000 ms
Apr 28 07:11:13 db commit took 0.205000 ms
Apr 28 07:11:13 db write took 43.315000 ms
Apr 28 07:12:13 db commit took 0.301000 ms
Apr 28 07:12:13 db write took 75.456000 ms
Apr 28 07:13:15 db commit took 1872.488000 ms <-----
Apr 28 07:13:15 db write took 1951.572000 ms <-----
Apr 28 07:14:13 db commit took 7.934000 ms
Apr 28 07:14:13 db write took 62.853000 ms
Apr 28 07:15:13 db commit took 0.274000 ms
Apr 28 07:15:13 db write took 80.568000 ms
Apr 28 07:16:13 db commit took 0.277000 ms
The arrows mark one of the time peaks that recur (at variable intervals) during execution.
To give a better picture: looking at the timing log, I had two peaks in the last 12 hours, one of about 1 s (not shown) and this one.
Could the time peaks be caused by filesystem activity on the SD card?
Could putting the database on a separate partition of the SD card affect this?
Is there any other pragma that could protect my application from this behaviour?
Adding the pragmas has significantly improved the situation so far, but I think it is not acceptable yet.
Thanks for your time and patience.
Any hint is welcomed.
Regards,
mopyot
The database regularly moves the data from the write-ahead log into the actual database file; this is called checkpointing:
By default, SQLite will automatically checkpoint whenever a COMMIT occurs that causes the WAL file to be 1000 pages or more in size, or when the last database connection on a database file closes. […]
But programs that want more control can force a checkpoint using the wal_checkpoint pragma … The automatic checkpoint threshold can be changed or automatic checkpointing can be completely disabled using the wal_autocheckpoint pragma …
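As a rough illustration of the two pragmas mentioned in the quote, the following Python sketch (using the standard `sqlite3` module rather than the C API from the question) turns automatic checkpointing off and runs a checkpoint explicitly at a moment the program chooses. The function names, the table `samples`, and the temporary file path are made up for the example. Whether moving the checkpoint this way removes the peaks on an SD card is an assumption that would need measuring.

```python
import os
import sqlite3
import tempfile


def open_wal_db(path: str, autocheckpoint_pages: int = 0) -> sqlite3.Connection:
    """Open a database in WAL mode; 0 pages disables automatic checkpointing."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode = WAL")
    conn.execute("PRAGMA synchronous = NORMAL")
    conn.execute(f"PRAGMA wal_autocheckpoint = {int(autocheckpoint_pages)}")
    return conn


def checkpoint(conn: sqlite3.Connection) -> tuple:
    """Run a passive checkpoint; returns (busy, pages in WAL, pages moved to the db)."""
    return tuple(conn.execute("PRAGMA wal_checkpoint(PASSIVE)").fetchone())


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        conn = open_wal_db(os.path.join(tmp, "demo.db"))
        conn.execute("CREATE TABLE samples (ts INTEGER, value REAL)")
        conn.execute("INSERT INTO samples VALUES (1, 0.5)")
        conn.commit()              # finish the write transaction before checkpointing
        result = checkpoint(conn)  # the expensive step now runs when we decide
        conn.close()
        return result


if __name__ == "__main__":
    print(demo())
```

With automatic checkpointing disabled, the WAL file keeps growing until `checkpoint` is called, so the program has to call it regularly, for example during an idle period between the once-a-minute writes.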
Recently I ran into a problem with Oracle 11g. It stops working frequently, every few hours, and has to be started again each time. This never happened before; it only began during the last month.
Here is the content of the clsc.log file:
2016-10-06 07:00:14.344: [ default][2031720192]utgdv:2:ocr loc file /etc/oracle/olr.loc cannot be opened. errno 2
[ CLSE][2031720192]clse_get_crs_home: Error retrieving OLR configuration [0] [Error opening olr.loc file. No such file or directory]
Can anyone help me?
Edit: The alert log contains messages like the following each time the instance stops working (in this case, it stopped on Wednesday night and was started again on Thursday):
Thread 1 advanced to log sequence 35012 (LGWR switch)
Current log# 2 seq# 35012 mem# 0: /home/db/app/oracle/oradata/orcl/redo02.log
Wed Oct 05 18:53:08 2016
Thread 1 advanced to log sequence 35013 (LGWR switch)
Current log# 3 seq# 35013 mem# 0: /home/db/app/oracle/oradata/orcl/redo03.log
Wed Oct 05 19:05:08 2016
Thread 1 advanced to log sequence 35014 (LGWR switch)
Current log# 1 seq# 35014 mem# 0: /home/db/app/oracle/oradata/orcl/redo01.log
Wed Oct 05 19:15:03 2016
Thread 1 advanced to log sequence 35015 (LGWR switch)
Current log# 2 seq# 35015 mem# 0: /home/db/app/oracle/oradata/orcl/redo02.log
Thu Oct 06 06:41:11 2016
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 3
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
.
.
.