Running DSE 4.8.10 - I have 3 DSE Search nodes in my cluster, RF=3. I'm seeing messages in system.log like those below, and they always seem to come right after a compaction. Is there a problem with the Solr indexes, or is there at least an explanation for these messages?
INFO [CompactionExecutor:12] 2016-11-14 23:09:31,243 CompactionTask.java:274 - Compacted 4 sstables to [/data/lib/cassandra/data/system/local/system-local-ka-13314,]. 1,564 bytes to 1,378 (~88% of original) in 17ms = 0.077304MB/s. 4 total partitions merged to 1. Partition merge counts were {4:1, }
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,008 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,053 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,144 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,187 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,230 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,270 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,311 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,353 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,395 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,436 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,478 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,519 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,559 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,600 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,640 AbstractSolrSecondaryIndex.java:1689 - Found 200 rows with expired columns.
INFO [Solr TTL scheduler-0] 2016-11-14 23:09:36,681 AbstractSolrSecondaryIndex.java:1689 - Found 31 rows with expired columns.
I am assuming you have some TTL set on your data.
If you want to expire data in Cassandra, you don't have much choice: you need a periodic task that somehow finds expired data and removes it. With lots of data, keeping this efficient can be a challenge. Cassandra actually includes a great opportunity for that kind of job: compaction. Compaction already goes through your data periodically, throwing away old versions, so it is easy and cheap to use it for data expiration as well.
This might be the reason why you see those messages only after compaction.
You can read more here: http://www.datastax.com/dev/blog/whats-new-cassandra-07-expiring-columns
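For illustration, a minimal sketch with the DataStax Java driver showing how such a TTL is set at write time (the contact point, keyspace, table, and column names are made up):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class TtlExample {
    public static void main(String[] args) {
        // Contact point, keyspace, table and columns below are illustrative only.
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace");

        // Every cell written by this INSERT expires 24 hours (86400 s) after the write;
        // compaction (and, on DSE Search nodes, the Solr TTL scheduler) later purges it.
        session.execute(
            "INSERT INTO events (id, payload) VALUES (uuid(), 'hello') USING TTL 86400");

        cluster.close();
    }
}
```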
Related
Error: This has disconnected the Server from Cluster and stopped the Distributed-Connector Services. Please help
[2021-12-08 14:07:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:17:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:24:23,198] INFO [GroupCoordinator 0]: Member connect-1-37b915a1-36f0-47df-81f0-f5b67985f278 in group connect-cluster has failed, removing it from the group (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:24:23,200] INFO [GroupCoordinator 0]: Preparing to rebalance group connect-cluster in state PreparingRebalance with old generation 18 (__consumer_offsets-13) (reason: removing member connect-1-37b915a1-36f0-47df-81f0-f5b67985f278 on heartbeat expiration) (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:24:25,680] INFO [GroupCoordinator 0]: Stabilized group connect-cluster generation 19 (__consumer_offsets-13) (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:24:25,717] INFO [GroupCoordinator 0]: Assignment received from leader for group connect-cluster for generation 19 (kafka.coordinator.group.GroupCoordinator)
[2021-12-08 14:27:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:37:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:47:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
[2021-12-08 14:57:20,484] INFO [GroupMetadataManager brokerId=0] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
I have a simple Apache Flink job:
**DataSource (Apache Kafka) - Filter - KeyBy - CEP Pattern (with timer) - PatternProcessFunction - KeyedProcessFunction (*here I have a ValueState(Boolean) and register a 5-minute timer. If the ValueState is not null, I update the state (nothing is sent to the collector) and update the timer. If the ValueState is null, I store TRUE in the state, send the input event to the collector, and set the timer. When the onTimer method fires, I clear my ValueState. A simplified sketch of this function appears after the job settings below.*) - Sink (Apache Kafka)**
Job settings:
**Checkpointing interval: 5000ms**
**Incremental checkpointing: true**
**Semantic: Exactly Once**
**State Backend: RocksDB**
**Parallelism: 4**
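Roughly, that KeyedProcessFunction looks like the sketch below (a simplified illustration, not the production code: the class name, the String key/event types, and the extra state cell used to remember the registered timer are my own assumptions, and I interpret "update timer" as replacing the old timer with a new one):

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class DedupWithTimer extends KeyedProcessFunction<String, String, String> {
    private static final long TIMEOUT_MS = 5 * 60 * 1000L;      // 5-minute timer
    private transient ValueState<Boolean> seen;                  // "have I already forwarded this key?"
    private transient ValueState<Long> timerTs;                  // remembers the registered timer so it can be replaced

    @Override
    public void open(Configuration parameters) {
        seen = getRuntimeContext().getState(new ValueStateDescriptor<>("seen", Boolean.class));
        timerTs = getRuntimeContext().getState(new ValueStateDescriptor<>("timerTs", Long.class));
    }

    @Override
    public void processElement(String event, Context ctx, Collector<String> out) throws Exception {
        long newTimer = ctx.timerService().currentProcessingTime() + TIMEOUT_MS;
        Long oldTimer = timerTs.value();
        if (oldTimer != null) {
            ctx.timerService().deleteProcessingTimeTimer(oldTimer); // "update timer": drop the old one
        }
        if (seen.value() == null) {
            seen.update(true);
            out.collect(event);                                     // only the first event per key is forwarded
        }
        ctx.timerService().registerProcessingTimeTimer(newTimer);
        timerTs.update(newTimer);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
        seen.clear();                                               // state is released when the timer fires
        timerTs.clear();
    }
}
```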
Logically the job works perfectly, but I have some problems.
I ran two tests on my cluster (2 JobManagers and 3 TaskManagers):
**First test:**
I started the job against an empty Apache Kafka topic and then saw in the Flink Web UI **Checkpointing Statistics:**
1) Latest Acknowledgement - Trigger Time = 5000 ms (matching my checkpoint interval)
2) State size = 340 KB at each 5-second interval
3) All statuses were Completed (blue).
**Second test:**
I started sending JSON messages with distinct keys (from "1" up to Integer.MAX_VALUE) into the Apache Kafka topic at about 1000 messages/sec, and then saw in the Flink Web UI **Checkpointing Statistics:**
1) Latest Acknowledgement - Trigger Time = 1 to 6 minutes
**My Question #1: Why is this time growing? Is it bad or OK?**
2) State size was constantly growing. I sent messages to Kafka for about 10 minutes (1000 x 60 x 10 = 600,000 messages). After sending, the state size was 100 MB - 150 MB.
3) After sending, I waited about an hour and saw that:
Latest Acknowledgement - Trigger Time = 5000 ms (matching my checkpoint interval)
State size was still 100 MB - 150 MB at each 5-second interval.
**My question #2: Why doesn't the state size decrease? I checked my job logs and saw 600,000 records showing that the ValueState for each key was cleared (the onTimer method completed successfully), and the job logic (see the description of my KeyedProcessFunction above) was working correctly.**
What did I try?
1) Setting a pause between checkpoints
2) Disabling incremental checkpoints
3) Enabling async checkpoints (in flink-conf.yaml)
None of it made any difference.
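For reference, this is roughly what those attempted changes look like when set programmatically (a sketch; the checkpoint URI and the pause value are placeholders, and the same options can also go into the Flink configuration file):

```java
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSettings {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // 5000 ms checkpoint interval with exactly-once semantics
        env.enableCheckpointing(5000, CheckpointingMode.EXACTLY_ONCE);
        // what I tried: enforce a pause between two consecutive checkpoints
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(2000);
        // what I tried: RocksDB state backend with incremental checkpoints disabled
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", false));

        env.setParallelism(4);
        // ... build the job graph (source -> filter -> keyBy -> CEP -> process -> sink) and call env.execute() ...
    }
}
```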
**My question #3: What should I do? On the production server the rate is *10 million messages/hour*, and the checkpoint size grows immediately.**
I have a use case where I get balances by date and I want to show the correct balance for each day. If I get an update for an older date, all balances of that account from that date onwards change.
For example:
| Account | Date | Balance | Total balance |
| --- | --- | --- | --- |
| IBM | 1 Jun | 100 | 100 |
| IBM | 2 Jun | 50 | 150 |
| IBM | 10 Jun | 200 | 350 |
| IBM | 12 Jun | 200 | 550 |
Now I get a message dated 4 Jun (some transaction was done back-dated, or there was a correction; this is a frequent scenario):
| Account | Date | Balance | Total balance |
| --- | --- | --- | --- |
| IBM | 1 Jun | 100 | 100 |
| IBM | 2 Jun | 50 | 150 |
| IBM | 4 Jun | 300 | 450 |
| IBM | 10 Jun | 200 | 650 |
| IBM | 12 Jun | 200 | 850 |

(all data from the 4 Jun row onwards changes)
It is streaming data, and at any point I want the correct balance to be shown for each account.
I know Flink and Kafka are good for streaming use cases where an update for a particular date does not trigger updates on all data from that point onwards. But can this scenario also be handled efficiently, or is it simply not a use case for these streaming technologies?
Please help.
You can't modify a past message in the queue, so you should introduce a new message that invalidates the previous one. For instance, you can use an ID for each transaction (and reuse it if you need to modify that transaction). If you have two or more messages with the same ID, you keep only the last one.
Take a look at KTable from Kafka Streams. It can help you aggregate data using that ID (or any other aggregation key) and produce, as a result, a table with the valid summary up to now. When a new message arrives, table updates will be emitted.
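A minimal Kafka Streams sketch of that idea (the topic names, application id, and String serdes are assumptions; the message key is the transaction ID):

```java
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Produced;

public class LatestTransactionTable {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "balance-corrections");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // Key = transaction ID, value = the serialized transaction (date, amount, ...).
        // A KTable keeps only the latest value per key, so a back-dated correction
        // carrying the same transaction ID simply replaces the earlier message.
        KTable<String, String> latestByTxId = builder.table(
                "transactions",
                Consumed.with(Serdes.String(), Serdes.String()));

        // Downstream you would recompute the running balances from this changelog.
        latestByTxId.toStream().to("transactions-latest",
                Produced.with(Serdes.String(), Serdes.String()));

        new KafkaStreams(builder.build(), props).start();
    }
}
```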
I am diagnosing some APIs on the server using the IIS log. What it offers is fine, but I need two more variables: memory and processor usage. I have researched and found nothing that helps me add those fields to the IIS log. Currently it shows me the following, plus a field that I added:
date time s-sitename s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs-version cs(User-Agent) cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken x-perfomance
2020-02-13 19:54:14 W3SVC1 usuarioFake ::1 GET /api/fake/prueba api-version=1 0 - ::1 HTTP/1.1 PostmanRuntime/7.22.0 - localhost 200 0 0 1638 330 4422 x-perfomance
I am running Apache2 on Linux (Ubuntu 9.10).
I am trying to monitor the load on my server using mod_status.
There are 2 things that puzzle me (see cut-and-paste below):
1) The CPU load is reported as a ridiculously small number, whereas "uptime" reports a number between 0.05 and 0.15 at the same time.
2) The "requests/sec" is also ridiculously low (0.06), when I know there are at least 10 requests coming in per second right now.
(You can see there are close to a quarter million "accesses" - this sounds right.)
I am wondering whether this is a bug (if so, is there a fix/workaround),
or maybe a configuration error (but I can't imagine how).
Any insights would be appreciated.
-- David Jones
- - - - -
Current Time: Friday, 07-Jan-2011 13:48:09 PST
Restart Time: Thursday, 25-Nov-2010 14:50:59 PST
Parent Server Generation: 0
Server uptime: 42 days 22 hours 57 minutes 10 seconds
Total accesses: 238015 - Total Traffic: 91.5 MB
CPU Usage: u2.15 s1.54 cu0 cs0 - 9.94e-5% CPU load
.0641 requests/sec - 25 B/second - 402 B/request
11 requests currently being processed, 2 idle workers
- - - - -
After I restarted my Apache server, I realized what is going on. The "requests/sec" figure is calculated over the lifetime of the server. So if your Apache server has been running for 3 months, it tells you nothing at all about the current load. Instead, it reports the total number of requests divided by the total number of seconds of uptime.
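For the snapshot above: 42 days 22 hours 57 minutes 10 seconds of uptime is about 3,711,430 seconds, and 238,015 total accesses / 3,711,430 s ≈ 0.0641 requests/sec, which is exactly the figure mod_status reports.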
It would be nice if there was a way to see the current load on your server. Any ideas?
Anyway, ... answered my own question.
-- David Jones
The Apache status value "Total Accesses" is the total access count since the server started; the per-second delta of that counter is what we actually mean by "requests per second".
Here is one way to get it:
1) Use the Apache monitor script for Zabbix:
https://github.com/lorf/zapache/blob/master/zapache
2) Install & configure zabbix_agentd:
UserParameter=apache.status[*],/bin/bash /path/apache_status.sh $1 $2
3) In Zabbix, create an Apache template and a monitored item:
Key: apache.status[{$APACHE_STATUS_URL}, TotalAccesses]
Type: Numeric(float)
Update interval: 20
Store value: Delta (speed per second) - this is the key option
Zabbix will calculate the increment of the Apache request counter between polls and store the delta value, which is the "requests per second".
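For example (sample numbers are made up): if TotalAccesses is 238,015 on one poll and 238,075 on the next poll 20 seconds later, Zabbix stores (238,075 - 238,015) / 20 = 3 requests per second for that interval.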