Why does CouchDB's _dbs.couch keep growing when purging/compacting DBs?

The setup:
A CouchDB 2.0 running in Docker on a Raspberry Pi 3
A Node.js application that uses PouchDB, also in Docker on the same Pi 3
The scenario:
At any given moment, the CouchDB has at most 4 databases with a total of about 60 documents
The Node application purges (using PouchDB's destroy()) and recreates these databases periodically (some of them every two seconds, others every 15 minutes)
The databases are always recreated with the newest entries
The reason for purging the databases instead of deleting their documents is that I'd otherwise end up with a huge number of deleted documents, and my web client can't handle syncing all these deleted documents
The problem:
The file var/lib/couchdb/_dbs.couch keeps growing and never shrinks. Last time I left it alone for three weeks, and it grew to 37 GB. Fauxton shows that the CouchDB only contains those (at most) 60 documents, but this file still keeps growing until it fills all the available space
What I tried:
running everything on an x86 machine (OS X)
running CouchDB without Docker (because of this info)
using CouchDB 2.1
running compaction manually (which didn't do anything)
googling for about three days now
Whatever I do, I always get the same result: _dbs.couch keeps growing. I also wasn't really able to find out what that file's purpose is; googling that specific filename only yields two pages of search results, none of which are specific.
The only thing I can currently do is manually delete this file from time to time and restart the Docker container. That deletes all my databases, but it's not a problem, as the Node application recreates them soon after.

The _dbs database is a meta-database. It records the locations of all the shards of your clustered databases, but since it's a CouchDB database too (though not a sharded one) it also needs compacting from time to time.
Try:
curl localhost:5986/_dbs/_compact -XPOST -Hcontent-type:application/json
You can enable the compaction daemon to do this for you, and we enable it by default in the recent 2.1.0 release.
Add this to the end of your local.ini file and restart CouchDB:
[compactions]
_default = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}]
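To check that compacting _dbs actually reclaims the space, you can compare the database info before and after; a minimal sketch, assuming the node-local port 5986 is reachable from the host (look at the sizes / disk_size fields in the responses):

# report the current file and active sizes of the _dbs database
curl -s localhost:5986/_dbs
# trigger compaction, give it a moment to finish, then check again
curl -s localhost:5986/_dbs/_compact -XPOST -Hcontent-type:application/json
curl -s localhost:5986/_dbs

The reported file size should drop back towards the active data size; once that works, the compaction daemon settings above should keep it under control without manual intervention.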

Related

Clickhouse DB slows down on a daily basis at 10am for seemingly no reason

I have been using Clickhouse at work for analytics purposes for a while now.
I am currently running Clickhouse v22.6.3 revision 54455 on-premise on a VM with:
fast storage
200 GB of RAM
no swap
a 40-core CPU.
I have a few TB of data, but no table bigger than 300 GB. I do not use distributed tables or replication yet, and I write frequently into Clickhouse (but I don't use deletes or updates and prefer using things like the ReplacingMergeTree engine). I also leverage the MaterializedView feature for a few tables. Let me know if you need any more context or parameters; I use a pretty standard configuration.
Now, for a few months I have been experiencing performance issues where the server significantly slows down every day at 10am, and I cannot figure out why.
Based on Clickhouse's built-in Graphite monitoring, the "symptoms" of the issue seem to be as follows:
At 10am:
On the server side:
Both load and RAM usage remain reasonable. Load goes up a little.
Disk write await time goes up (which I suspect is what leads to higher load)
Disk utilization % skyrockets to something between 90 and 100%
On Clickhouse side:
DiskSpaceReservedForMerge stays roughly the same (i.e. between 0 and 70 GB)
both OpenFileForRead and OpenFileForWrite go up by a factor of ~2
BackgroundCommonPoolTask goes slightly up, so does BackgroundSchedulePoolTask (which I found weird, because I thought this pool was dedicated to distributed operations - which I don't use) - both numbers remain seemingly reasonable
The number of active Merge tasks per minute drops significantly, but I'm unsure whether that is a consequence of slow writing or a cause of it
both insert and general query times are multiplied by ~10, which renders the database effectively unusable even for small tasks
Restarting Clickhouse usually fixes the problem, but I obviously do not want to restart my main database every day at 10am. Most of the heavy load I put on the DB (such as data extraction and transformation, etc.) happens earlier in the morning (and ends around 7-8am) and runs fine. I do not have any heavy tasks running at 10am. The Clickhouse VM takes most of its host's resources, and I have confirmed with the DevOps team that there doesn't seem to be a problem on the host or anything else scheduled on it at that time.
Is there any kind of background tasks or process that is run by Clickhouse on a daily basis and that could have a high impact on our disk capacity? What else can I monitor to figure out what is causing this problem?
Again, let me know if I can be more thorough on our settings and the state of the DB when the "bug" occurs.
Do you use https://github.com/innogames/graphite-ch-optimizer ?
Do you use TTL ?
select * from system.merges;
select * from system.part_log where toHour(event_time) between 9 and 11 order by event_time;
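To dig further in that direction, the system tables usually show what is merging and why; a rough sketch of what I would check around the 10am window, run via clickhouse-client (the 9-11 hour filter and the limit are only illustrative):

# which tables define a TTL at all
clickhouse-client --query "SELECT database, name, engine_full FROM system.tables WHERE engine_full LIKE '%TTL%'"
# what kind of part activity happens in that window
clickhouse-client --query "SELECT event_type, table, count() AS c FROM system.part_log WHERE toHour(event_time) BETWEEN 9 AND 11 GROUP BY event_type, table ORDER BY c DESC"
# the slowest statements in the same window
clickhouse-client --query "SELECT type, query_duration_ms, query FROM system.query_log WHERE toHour(event_time) BETWEEN 9 AND 11 ORDER BY query_duration_ms DESC LIMIT 20"

If graphite-ch-optimizer is installed it periodically schedules OPTIMIZE ... FINAL merges, and those would show up in part_log and query_log like any other heavy merge.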

Need to migrate the whole cluster from one DC to another DC

I have a SolrCloud cluster consisting of 5 hosts in one DC.
The collection configuration is 5 shards and 3 replicas and max 3 shards per host.
Solr version used is 5.3.1.
Because of some unforeseen maintenance activity, it needs to be moved to another DC temporarily. In order to minimize the impact, we need the indexed data to be available in the new setup. All the nodes have roughly 100 GB of indexed data.
I have already tried copying the whole setup to the new DC and restarting it after updating the host information in the config files. It always complains that some shard or other is not available from the hosts when querying data (error code 503).
Note: the backup was taken from a running setup.
I have also tried creating the whole cluster again with the same configuration and copying only the data directory from the backup. That also results in shards not being available from the hosts.
I wanted to understand if there is something wrong in the process I am following. One thing I suspect is that the backup should be taken after stopping a particular node.
Is there any simple and better way available? I am using Solr-5.3.1.
The right way to do it is to use the backup and restore feature. This feature was already available in the 5.3 version; check the appropriate doc and follow the steps. It should work just fine.
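In practice on 5.x that means going through the ReplicationHandler on each core, since the collection-level BACKUP/RESTORE API came later; a rough sketch, with host, core name, and paths as placeholders:

# on the old DC, for each shard leader
curl 'http://oldhost:8983/solr/collection_shard1_replica1/replication?command=backup&location=/backups&name=shard1'
# copy the resulting snapshot.shard1 directory to the new DC, then on the freshly created cluster
curl 'http://newhost:8983/solr/collection_shard1_replica1/replication?command=restore&location=/backups&name=shard1'
curl 'http://newhost:8983/solr/collection_shard1_replica1/replication?command=restorestatus'

Unlike copying the data directory of a running node, the snapshot is taken against a consistent commit point, which is one likely cause of the errors described above.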

Drupal 7 -> 8 migration of a large database takes forever

So I have a Drupal 7 database with 2 million users that need to move to Drupal 8 with a minimum of downtime (target is an hour). The Drupal Migrate module appears to solve this problem, but it writes new rows one at a time, and in my tests 4 thousand users plus related data took 20 minutes on frankly beastly AWS instances. Extrapolating to the full dataset, it would take me 7 days to run the migration, and that amount of downtime is not reasonable.
I've made a feature request against Drupal core but I also wanted to see if the community has any ideas that I missed. Also, I want to spawn some discussion about this issue.
If anyone still cares about this, I have resolved this issue. Further research showed that not only does the Drupal migration module write new rows one at a time, but it also reads rows from the source one at a time. Further, for each row Drupal will write to a mapping table for the source table so that it can support rollback and update.
Since a user's data is stored in one separate table per custom field, this results in something like 8 reads and 16 writes for each user.
I ended up extending Drupal's MigrateExecutable class for running the process. Then I overrode both the part that reads data and the part that writes it to do their work in batches, and to not write to the mapping tables. I believe that my projected time is now down to less than an hour (a speedup of 168 times!).
Still, trying to use the Drupal infrastructure was more trouble than it was worth. If you are doing this yourself, just write a command-line application and do the SQL queries manually.

Be very careful committing from the client! In fact, don’t do it

I created a SolrCloud cluster with 3 replicas and shards, configured with an external ZooKeeper. It was working fine. After a period of time I noticed that when I restart a node, it takes a long time to start up.
I observed that ZooKeeper on each node has a huge amount of data in the version-2 folder. After a few days I noticed that nodes go down and recovery fails in many cases.
In my application I need live updates in the cloud, so when a user updates their profile I rewrite the document to Solr with commit().
Is there something missing, or are live updates not the right option when thousands of people are updating their profiles at the same time?
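The usual answer here is to stop calling commit() from the client and let Solr control commit timing instead, either with autoCommit/autoSoftCommit in solrconfig.xml or by attaching commitWithin to the updates; a small sketch of the latter (host, collection name, and document are placeholders):

curl 'http://localhost:8983/solr/profiles/update?commitWithin=5000' -H 'Content-Type: application/json' -d '[{"id":"user-123","displayName":"..."}]'

This asks Solr to make the change searchable within 5 seconds without issuing an explicit commit per profile update; the right interval depends on how fresh the profile data really needs to be.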

Solr Incremental backup on real-time system with heavy index

I am implementing a search engine with Solr that imports a minimum of 2 million documents per day.
Users must be able to search the imported documents ASAP (near real-time).
I am using 2 dedicated Windows x64 servers with Tomcat 6 (Solr shard mode). Each server indexes about 120 million documents, about 220 GB (total 500 GB).
I want to take incremental backups of the Solr index files while updates or searches are running.
After searching, I found rsync for UNIX and DeltaCopy for Windows (a GUI rsync for Windows), but I get errors (files vanished) during updates.
How can I solve this problem?
Note 1: File copying is really slow when the files are very large, so I can't use this approach.
Note 2: Can I prevent corrupt index files during updates if Windows crashes, the hardware resets, or any other problem occurs?
You can take a hot backup (i.e. while writing to the index) using the ReplicationHandler to copy Solr's data directory elsewhere on the local system. Then do whatever you like with that directory. You can launch the backup whenever you want by going to a URL like this:
http://host:8080/solr/replication?command=backup&location=/home/jboss/backup
Obviously you could script that with wget+cron.
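For example, a crontab entry roughly like this would take a snapshot every night at 2am (the URL, path, and schedule are just placeholders matching the example above):

# m h dom mon dow  command
0 2 * * * wget -q -O /dev/null 'http://host:8080/solr/replication?command=backup&location=/home/jboss/backup'

The backup command returns immediately and Solr writes the snapshot in the background; the details command on the same handler (command=details) reports when the last backup completed.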
More details can be found here:
http://wiki.apache.org/solr/SolrReplication
The Lucene in Action book has a section on hot backups with Lucene, and it appears to me that the code in Solr's ReplicationHandler uses the same strategy as outlined there. One of that book's authors even elaborated on how it works in another StackOverflow answer.
Don't run a backup while updating the index. You will probably get a corrupt (therefore useless) backup.
Some ideas to work around it:
Batch up your updates, i.e. instead of adding/updating documents all the time, add/update every n minutes. This will let you run the backup in between those n minutes. Cons: document freshness is affected.
Use a second, passive Solr core: Set up two cores per shard, one active and one passive. All queries are issued against the active core. Use replication to keep the passive core up to date. Run the backup against the passive core. You'd have to disable replication while running the backup. Cons: complex, more moving parts, requires double the disk space to maintain the passive core.
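If you go the passive-core route, the same ReplicationHandler can pause and resume polling around the backup; a rough sketch, with host and core name as placeholders:

curl 'http://passive-host:8080/solr/core1/replication?command=disablepoll'
curl 'http://passive-host:8080/solr/core1/replication?command=backup&location=/backup/core1'
# once command=details shows the snapshot is complete
curl 'http://passive-host:8080/solr/core1/replication?command=enablepoll'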
