Some of my transactions were aborted in MongoDB. From the log file, I dug up this info: "transaction parameters:... terminationCause:aborted timeActiveMicros:205 timeInactiveMicros:245600632 ...
I increased transactionLifetimeLimitSeconds on the server from 60 to 3000, which is 50 minutes and should be plenty; the transaction should take at most 10 minutes. Still not working.
The second thing I tweaked was on the client side (pymongo): I changed wtimeout on write_concern from 1000 to 500000000, but I am still getting the same error.
Any other parameters I should change?
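For reference, this is roughly how I am setting the two knobs mentioned above, in case something on the client is silently overriding them. This is only a sketch: the URI, database, and collection names are placeholders, not my actual code:

import pymongo
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")  # placeholder URI

# Server side: raise the transaction lifetime (needs admin privileges; on a
# hosted service this may have to be changed through the provider's configuration).
client.admin.command("setParameter", 1, transactionLifetimeLimitSeconds=3000)

# Client side: the transaction's write concern (wtimeout is in milliseconds)
# can be passed to with_transaction, overriding collection/client defaults.
wc = WriteConcern(w="majority", wtimeout=500_000_000)

def txn_body(session):
    # placeholder for the actual writes done inside the transaction
    client.mydb.mycoll.insert_one({"x": 1}, session=session)

with client.start_session() as session:
    session.with_transaction(txn_body, write_concern=wc)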
I have a table in an Azure SQL Database which contains approximately 10 columns and 1.7 million rows. The data in each cell is mostly null/varchar(30).
When running a dataflow to a new table in Dataverse, I have two issues:
It takes around 14 hours (roughly 100k rows per hour)
It fails after 14 hours with the great error message (**** are just entity names I have removed):
Dataflow name,Entity name,Start time,End time,Status,Upsert count,Error count,Status details
****** - *******,,1.5.2021 9:47:20 p.m.,2.5.2021 9:51:27 a.m.,Failed,,,There was a problem refreshing the dataflow. please try again later. (request id: 5edec6c7-3d3c-49df-b6de-e8032999d049).
****** - ,,1.5.2021 9:47:43 p.m.,2.5.2021 9:51:26 a.m.,Aborted,0,0,
Table name,Row id,Request url,Error details
*******,,,Job failed due to timeout : A task was canceled.
Is it really expected that this should take 14 hours :O ?
Is there any verbose logging I can enable to get a friendlier error message?
"Connection closed" occurs when executing a function for data pre-processing.
The data pre-processing is as follows.
1. Import data points for about 30 topics from the database (data for 9 days at 1-minute intervals: 60 * 24 * 9 * 30 = 388,800 values).
2. Convert the data to a pandas DataFrame for pre-processing such as missing-value handling or resampling (this step takes the longest; see the sketch below the list).
3. Process the data.
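To make step 2 concrete, here is a minimal, illustrative sketch of that kind of resampling and gap-filling, assuming the raw points come back as (timestamp, topic, value) records; the sample records and column names are made up:

import pandas as pd

# Illustrative only: assume `records` holds (timestamp, topic, value) tuples
# pulled from the database in step 1 (the real query/API is not shown here).
records = [
    ("2021-05-01 00:00:00", "zone_temp", 21.4),
    ("2021-05-01 00:02:00", "zone_temp", 21.6),  # the 00:01 sample is missing
]

df = pd.DataFrame(records, columns=["ts", "topic", "value"])
df["ts"] = pd.to_datetime(df["ts"])

# One column per topic, indexed by timestamp.
wide = df.pivot_table(index="ts", columns="topic", values="value")

# Snap to a strict 1-minute grid and interpolate the gaps.
clean = wide.resample("1min").mean().interpolate(limit_direction="both")
print(clean)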
In the above data pre-processing, the following error occurs.
volttron.platform.vip.rmq_connection ERROR: Connection closed unexpectedly, reopening in 30 seconds.
This error probably comes from how the VOLTTRON platform manages the agent.
Since it takes more than 30 seconds in step 2, an error occurs and the VOLTTRON platform automatically restarts the agent.
Because of this, the agent cannot perform data processing normally.
Does anyone know how to avoid this?
If this is happening during agent instantiation, I would suggest moving the pre-processing out of the init or configuration steps into a function with the @Core.receiver("onstart") decorator. This will stop the agent instantiation and configuration steps from timing out. The listener agent's onstart method can be used as an example.
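A minimal sketch of that pattern, assuming a standard agent skeleton (the class and method names below are placeholders, not from your actual agent):

from volttron.platform.vip.agent import Agent, Core

class PreprocessAgent(Agent):
    def __init__(self, config_path=None, **kwargs):
        super(PreprocessAgent, self).__init__(**kwargs)
        # Keep __init__/configuration lightweight so the platform's
        # startup handshake does not time out.

    @Core.receiver("onstart")
    def onstart(self, sender, **kwargs):
        # The heavy work runs here, after the agent has fully started.
        self.preprocess_data()

    def preprocess_data(self):
        # placeholder for the database pull and pandas pre-processing
        pass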
After encountering the following error several times (on an RDS Postgres instance):
ERROR: canceling statement due to conflict with recovery
Detail: User query might have needed to see row versions that must be removed
I ran (on the hot standby):
SELECT *
FROM pg_stat_database_conflicts;
And found that all the conflicts have to do with confl_snapshot, which is explained in the documentation as:
confl_snapshot: Number of queries in this database that have been canceled due to old snapshots
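(For reference, this is a quick way to tabulate those counters per database, sketched here with psycopg2; the connection string is a placeholder:)

import psycopg2

# Placeholder DSN: point this at the hot standby.
conn = psycopg2.connect("host=my-standby dbname=postgres user=postgres")

with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT datname,
               confl_tablespace, confl_lock, confl_snapshot,
               confl_bufferpin, confl_deadlock
        FROM pg_stat_database_conflicts
        ORDER BY confl_snapshot DESC
    """)
    for row in cur.fetchall():
        print(row)

conn.close()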
What might be causing this conflict (being an old snapshot)?
If it helps, here are some of the relevant settings (from running SHOW ALL; on the standby):
hot_standby: on
hot_standby_feedback: off
max_standby_archive_delay: 30s
max_standby_streaming_delay: 1h
old_snapshot_threshold: -1
vacuum_defer_cleanup_age: 0
vacuum_freeze_min_age: 50000000
vacuum_freeze_table_age: 150000000
vacuum_multixact_freeze_min_age: 5000000
vacuum_multixact_freeze_table_age: 150000000
wal_level: replica
wal_receiver_status_interval: 10s
wal_receiver_timeout: 30s
wal_retrieve_retry_interval: 5s
wal_segment_size: 16MB
wal_sender_timeout: 30s
wal_writer_delay: 200ms
I send emails using a cron job and a task queue. The job is executed every 15 minutes, and the queue has the following setup:
- name: send-emails
  rate: 1/m
  max_concurrent_requests: 1
  retry_parameters:
    task_retry_limit: 0
But quite often an apiproxy_errors.OverQuotaError exception happens. I am checking the Quota Details and see that I am still within the daily quota (Recipients Emailed, Attachment Data Sent, etc.), and I believe I couldn't be over the maximum per-minute limit, since the rate I use is just 1 task per minute (i.e. send no more than 1 mail per minute).
Where am I wrong and what should I check?
How many emails are you sending? You have not set a bucket_size, so it defaults to 5. Your rate sets how often the bucket is replenished. So, with your current configuration, you can send 5 emails every minute. That means if you are sending more than 75 emails to the queue every 15 minutes, the queue will fill up and eventually go over quota.
I have not tried this myself, but when you catch the apiproxy_errors.OverQuotaError exception, does the message contain any detail as to why it is over quota/which quota has been exceeded?
import logging
from google.appengine.runtime import apiproxy_errors

try:
    send_mail_here()  # placeholder for your actual mail.send_mail(...) call
except apiproxy_errors.OverQuotaError as message:
    logging.error(message)
We recently got some data back on a benchmarking test from a software vendor, and I think I'm missing something obvious.
If there were 17 transactions (I assume they mean successfully completed requests) per second, and 1500 of these requests could be served in 5 minutes, then how do I get the response time for a single user? Is this sort of thing even possible with benchmarking? I have a lot of other data from them, including apache config settings, but I'm not sure how to do all the math.
Given the server setup they sent, I want to know how I can deduce the user response time. I have looked at other similar benchmarking tests, but I'm having trouble measuring requests to response time. What other data do I need to provide here to get that?
If only 1500 of these can be served in 5 minutes, then:
1500 / 5 = 300 transactions per min can be served
300 / 60 = 5 transactions per second can be served
so how are they getting 17 completed transactions per second? Last time I checked 5 < 17 !
This doesn't seem to fit. Or am I looking at it wrongly?
I presume by user response time, you mean the time it takes to serve a single transaction:
If they can serve 5 per second, then it takes 200ms (1/5) per transaction
If they can serve 17 per second, then it takes ~59ms (1/17) per transaction
That is all we can tell from the given data. Perhaps clarify how many transactions are being done per second.