InfluxDB changes timestamps on writing data

I am trying to write some data to InfluxDB using Influx CLI. I am using the following command:
influx write -b test_bucket -f sample_data.csv
But the timestamps in the inserted data come out completely different.
The timestamps in the original CSV file start at 2020-01-01T00:00:00.005Z and increment by 5 ms, but in the database they start at 2020-02-15T03:24:09.716Z and don't change at all.
Any ideas why this weird behavior is happening?
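A likely explanation: if the CSV is not annotated so that the CLI knows which column holds the timestamp, the points simply get stamped with the time of the write, which would match the symptom above. A minimal annotated sketch, with made-up measurement, tag, and field names:
#datatype measurement,tag,double,dateTime:RFC3339
m,host,used_percent,time
mem,host1,21.5,2020-01-01T00:00:00.005Z
mem,host1,21.7,2020-01-01T00:00:00.010Z
With the #datatype line and the header row at the top of sample_data.csv, the same influx write command takes the timestamps from the dateTime column instead of using the write time.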

Related

Do I need to dump databases from a volume before backing them up?

There are plenty of resources on how to dump Postgres/MariaDB/MySQL/etc. databases from a volume/container; my question is whether I need to do so before backing them up. More explicitly, is it safe to stop my MariaDB container, copy the contents of the volume to another folder, and back that up directly? Are there consequences I should be aware of?
My current export code:
mkdir -p $HOME/backup/mariadb_backup
docker run --rm -v mariadb_volume:/data -v $HOME/backup:/backup ubuntu cp -aruT /data /backup/mariadb_backup
I then run borg on the backup folder.
It is safe to back up the files of a stopped database.
People usually don't want to shut down a database that's providing some service, so they come up with ways to avoid doing that.
One is to run a dump operation that exports the contents of the database while it keeps serving other requests (a sketch follows below).
Another is a filesystem snapshot: atomically take a snapshot of the files underlying the database, so that all files retain their content from a single point in time, and then back that up.
The only thing you should not do is back up the files of a running database one by one. You will get an inconsistent copy if you do that.
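For example, an online dump of the MariaDB container from the question might look like this (the container name and credentials are placeholders; --single-transaction gives a consistent snapshot of InnoDB tables without stopping the server):
docker exec mariadb_container mysqldump -u root -p'secret' --single-transaction --all-databases > $HOME/backup/mariadb_dump.sql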

AWS EC2 rsync between regions xtrabackup folder

Just to give you an idea: we have a DR DB server in another AWS region (Oregon), replicating from the master (Virginia). We had an issue where replication broke, and we have to do a dump and restore. We are talking about 3 TB of data, so making a backup, creating an AMI, moving it across, dumping it back to a volume, and then restoring is a lot of work. I am doing an rsync across SSH, and it is taking forever; I estimate 2 days for the task to complete. The data is an xtrabackup, so basically all DB tables and files.
Has anyone come across this issue, and what is the best way to transfer such massive amounts of data in the shortest amount of time? Believe me, I have thought of S3 and the like, but I don't have experience with transfer speeds to/from buckets across regions. Any ideas?
First, I made an XtraBackup using these commands:
xtrabackup -u root -H 127.0.0.1 -p 'supersecretpassword' --backup --datadir=/data/mysql/ --target-dir=/xtrabackup/
xtrabackup -u root -H 127.0.0.1 -p 'supersecretpassword' --prepare --datadir=/data/mysql/ --target-dir=/xtrabackup/
Then I uploaded it to an S3 bucket using this command:
aws s3 sync /dbbackup s3://tmp-restore-bucket/
From the DR server in the other region, I ran this command to download the xtrabackup straight to the DB data folder, after removing the existing DB data files. This is the fastest way.
aws s3 sync s3://tmp-restore-bucket /data/mysql/
Finally start mysql on the DR server, and start your slave sync again using the command given in one of the xtrabackup files you created.
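For reference, the binlog coordinates land in xtrabackup_binlog_info inside the restored data directory; a rough sketch of that last step, where the host, replication user, log file, and position are placeholders:
cat /data/mysql/xtrabackup_binlog_info
mysql -u root -p -e "CHANGE MASTER TO MASTER_HOST='master.example.com', MASTER_USER='repl', MASTER_PASSWORD='replpass', MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=45678; START SLAVE;"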
Super easy and the best and fastest way I've found.

Unable to DELETE or GET couchdb2 databases

I have a testing script that creates and deletes testing databases. At some point today it started failing. Digging further it looks like several of my testing databases are in an inconsistent state.
The databases appear in Fauxton with the message "This database failed to load." I am unable to view the database contents in this interface. Their names, which are usually links, are now plain text.
Issuing GET and DELETE commands with curl shows the following errors:
$ curl -s -X DELETE http://username:password@0.0.0.0:5984/dbname
{"error":"error","reason":"internal_server_error"}
$ curl -s -X GET http://username:password@0.0.0.0:5984/dbname
{"error":"internal_server_error","reason":"No DB shards could be opened.","ref":2413987899}
I have looked inside the couchdb2 data directory and I do see that shards exist for these databases.
What can I do to delete these databases? I am not sure if I can do this by manually deleting files in the couchdb2 data directory.
Have you solved your issue yet? I had this same problem, and ultimately ended up just installing a new CouchDB 2.1.0 instance and replicating to it before taking down the original. I suspect it might have had something to do with CouchDB not liking its default choice of "couchdb@localhost" as the name for a node, because it was constantly telling me that was an illegal hostname.
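For reference, a one-off replication to a fresh instance can be started through the standard _replicate endpoint, something like this (hosts and credentials are placeholders):
curl -X POST http://admin:password@newhost:5984/_replicate \
     -H "Content-Type: application/json" \
     -d '{"source":"http://username:password@oldhost:5984/dbname","target":"http://admin:password@newhost:5984/dbname","create_target":true}'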

How to operate a postgres database from 1 hard disk on multiple systems?

My issue is that I work from various systems yet require access to a single, large (50 GB) database that is always up to date. If it were a smaller database, dumping and restoring it onto the external disk would be fine, e.g. via $ pg_dump mydb > /path../db.sql to save it, and then on the other computer $ psql -d mydb -f /path../db.sql to recover the data, and so on...
In this case, however, that option will not work, as I don't have 50 GB of free space on both machines. So I'd like the files for this particular DB to live on a single external drive.
How would I do this?
I've been using pg_ctlcluster for this purpose, e.g.
$ cp -rp /var/lib/postgresql/9.1/main /media/newdest # move the data
$ pg_ctlcluster 9.1 main stop # stop the postgres server (on Ubuntu: add -- [ctl args])
$ /usr/lib/postgresql/9.1/bin/initdb -D /media/newdest # initialise instance in new place
(On Ubuntu, pg_ctlcluster is used instead of pg_ctl to allow multiple DB clusters, and this should allow pg_ctlcluster 9.1 main start -- -D /media/newdest to replace the last line of code, I think.)
I suspect this approach is not the best solution because a) it's not working at present after various tries and b) I'm not sure I'll be able to access the cluster from another computer.
Database software is designed to handle large datasets, and moving data around is a common task, so I am baffled that there is so little info on this on the internet:
This question, which basically says "don't use TABLESPACES to do it".
This one, which just solves the permissions problem and links to a (useful) IBM page on the matter that talks about moving the entire setup, not just the one database I want to move.
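To spell out the end state I'm after, here is a rough sketch (assuming the external drive is mounted at /media/newdest and ownership/permissions survive the copy):
$ pg_ctlcluster 9.1 main stop # stop the cluster first
$ sudo -u postgres cp -rp /var/lib/postgresql/9.1/main /media/newdest # copy the data directory
(then set data_directory = '/media/newdest/main' in /etc/postgresql/9.1/main/postgresql.conf)
$ pg_ctlcluster 9.1 main start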

How can I parse Serv-U FTP logs with SSIS?

A while back I needed to parse a bunch of Serv-U FTP log files and store them in a database so people could report on them. I ended up developing a small C# app to do the following:
Look for all files in a dir that have not been loaded into the db (there is a table of previously loaded files).
Open a file and load all the lines into a list.
Loop through that list and use RegEx to identify the kind of row (CONNECT, LOGIN, DISCONNECT, UPLOAD, DOWNLOAD, etc.), parse it into an object of the corresponding type, and add that object to another list.
Loop through each of the different object lists and write each one to the associated database table.
Record that the file was successfully imported.
Wash, rinse, repeat.
It's ugly but it got the job done for the deadline we had.
The problem is that I'm in a DBA role and I'm not happy with running a compiled app as the solution to this problem. I'd prefer something more open and more DBA-oriented.
I could rewrite this in PowerShell but I'd prefer to develop an SSIS package. I couldn't find a good way to split input based on RegEx within SSIS the first time around and I wasn't familiar enough with SSIS. I'm digging into SSIS more now but still not finding what I need.
Does anybody have any suggestions about how I might approach a rewrite in SSIS?
I have to do something similar with Exchange logs. I have yet to find an easier, all-SSIS solution. Having said that, here is what I do:
First, I use LogParser from Microsoft and the bulk copy functionality of SQL Server 2005.
I copy the log files to a directory that I can work with them in.
I created a sql file that will parse the logs. It looks similar to this:
SELECT TO_Timestamp(REPLACE_STR(STRCAT(STRCAT(date,' '), time),' GMT',''),'yyyy-M-d h:m:s') as DateTime,
       [client-ip], [Client-hostname], [Partner-name], [Server-hostname], [server-IP],
       [Recipient-Address], [Event-ID], [MSGID], [Priority], [Recipient-Report-Status],
       [total-bytes], [Number-Recipients],
       TO_Timestamp(REPLACE_STR([Origination-time], ' GMT',''),'yyyy-M-d h:m:s') as [Origination Time],
       Encryption, [service-Version], [Linked-MSGID], [Message-Subject], [Sender-Address]
INTO '%outfile%'
FROM '%infile%'
WHERE [Event-ID] IN (1027;1028)
I then run the previous sql with logparser:
logparser.exe file:c:\exchange\info\name_of_file_goes_here.sql?infile=c:\exchange\info\logs\*.log+outfile=c:\exchange\info\logs\name_of_file_goes_here.bcp -i:W3C -o:TSV
Which outputs a bcp file.
Then I bulk copy that bcp file into a premade database table in SQL server with this command:
bcp databasename.dbo.table in c:\exchange\info\logs\name_of_file_goes_here.bcp -c -t"\t" -T -F 2 -S server\instance -U userid -P password
Then I run queries against the table. If you can figure out how to automate this with SSIS, I'd be glad to hear what you did.
