How to configure and clean up the WAL files in /var/lib/taos/vnode/vnode2/wal/ - TDengine

I'm using TDengine 3.0. I have found that a large number of WAL files such as 0000000.log are generated under /var/lib/taos/vnode/vnode2/wal/, and they take up a lot of space.
How should these log files be configured, and how should they be cleaned up?

You could set the WAL_RETENTION_PERIOD value to 0; then each WAL file is deleted immediately after its contents are written to disk, which frees the space right away.
See https://docs.tdengine.com/taos-sql/database/
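As a quick sketch, assuming the affected database is named mydb (substitute your own), the setting can be applied from the taos CLI:
taos -s "ALTER DATABASE mydb WAL_RETENTION_PERIOD 0;"
taos -s "SHOW DATABASES;"
The second command lets you verify that the retention settings took effect.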

Related

Using camel-smb, SMB picks up (large) files while they are still being written to

While trying to set up cyclic moving of files, I encountered strange behavior with readLock. Create a large file (a few hundred MB) and transfer it using SMB from the out to the in folder.
FROM:
smb2://smbuser:****@localhost:4455/user/out?antInclude=FILENAME*&consumer.bridgeErrorHandler=true&delay=10000&inProgressRepository=%23inProgressRepository&readLock=changed&readLockMinLength=1&readLockCheckInterval=1000&readLockTimeout=5000&streamDownload=true&username=smbuser&delete=true
TO:
smb2://smbuser:****@localhost:4455/user/in?username=smbuser
Create another flow to move the file back from the in to the out folder. After some transfers, the file will be picked up while still being written to by another route, and the transfer will complete with a much smaller file, resulting in a partial file at the destination.
FROM:
smb2://smbuser:****@localhost:4455/user/in?antInclude=FILENAME*&delete=true&readLock=changed&readLockMinLength=1&readLockCheckInterval=1000&readLockTimeout=5000&streamDownload=false&delay=10000
TO:
smb2://smbuser:****@localhost:4455/user/out
The question is: why is my readLock not working properly? (N.B. streamDownload is required.)
UPDATE: it turns out this only happens on a Windows Samba share, and with streamDownload=true. So, something with stream chunking. Any advice welcome.
The solution is to prevent the polling strategy from automatically picking up a file, and to make it aware of the readLock (in progress) on the other side. So I lowered the delay to 5 seconds and, in the FROM part on both sides, added readLockMinAge=5s, which inspects the file modification time.
Since the stream updates the file at least once a second, the modification time stays fresh, and this is enough to keep the consumer from picking up a file that is still in progress.
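As a sketch, the adjusted FROM endpoint on the out side would look like this (the only changes from the question's URI are delay=5000 and the added readLockMinAge=5s):
smb2://smbuser:****@localhost:4455/user/out?antInclude=FILENAME*&consumer.bridgeErrorHandler=true&delay=5000&inProgressRepository=%23inProgressRepository&readLock=changed&readLockMinAge=5s&readLockMinLength=1&readLockCheckInterval=1000&readLockTimeout=5000&streamDownload=true&username=smbuser&delete=true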
An explanation of why the previously mentioned situation happens:
When a route prepares to pick up from the out folder, a large file (1 GB) is in progress, streaming chunk by chunk into the in folder. At the end of the streaming, the file is marked for removal by camel-smbj and receives the status STATUS_DELETE_PENDING.
Now another part of this process starts to send a newly arrived file to the out folder and finds that this file already exists. Because of the default fileExists=Override strategy, it tries to delete (and afterwards store) the existing file (which is still not deleted from the previous step) and receives an exception, which causes some InputStream chunks to be lost.

How to keep the last 20 MB of a log file in Unix (without logrotate and without recreating the file)

I want to truncate a file (hive.log) in Unix, keeping the last 20 MB, and since the file is being used by other applications, I don't want to take any risk by recreating it.
I have tried the Unix truncate command, but it cuts the file at an arbitrary point, and I could not find any option that keeps the last 20 MB.
Hive uses Log4j to maintain its logs, so what you want can be achieved by modifying the Log4j properties file.
File location: /etc/hive/conf/hive-log4j.properties
The content you should be interested in:
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hive.log.dir}/${hive.log.file}
# Rollover at midnight
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
# 30-day backup
#log4j.appender.DRFA.MaxBackupIndex= 30
log4j.appender.DRFA.MaxFileSize = 256MB
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
This says it is a daily rolling file appender.
log4j.appender.DRFA.MaxBackupIndex= 30
Says it will keep 30 backups of logs.
log4j.appender.DRFA.MaxFileSize = 256MB
Says the maximum file size would be 256MB.
Now you know which properties you need to change.
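If the real goal is to cap hive.log at roughly 20 MB, one sketch is to switch to a size-based appender (RollingFileAppender, MaxFileSize, and MaxBackupIndex are standard Log4j 1.x settings; the layout line is illustrative):
# Size-based rolling: cap the live log at ~20 MB, keep one rolled backup
log4j.appender.DRFA=org.apache.log4j.RollingFileAppender
log4j.appender.DRFA.File=${hive.log.dir}/${hive.log.file}
log4j.appender.DRFA.MaxFileSize=20MB
log4j.appender.DRFA.MaxBackupIndex=1
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2}: %m%n
Note that this rolls to a new file when the limit is hit rather than trimming the live file in place; keeping only the last N bytes of an open file without recreating it is not something Log4j 1.x offers.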

Best way to transfer a file from a local path to an NFS path

I am looking for the most optimized way to transfer large log files from a local path to an NFS path.
Here the log files keep changing dynamically with time.
What I am currently using is a Java utility which reads the file from the local path and transfers it to the NFS path. But this seems to consume a lot of time.
We can't use copy commands, as the log file keeps being appended with new logs, so this will not work.
What I am looking for is: is there any way, other than using a Java utility, to transfer the contents of the log file from the local path to the NFS path?
Thanks in advance!
If your network speed is higher than the speed at which the log grows, you can just cp src dst.
If log grows too fast and you can't push that much data, but you only want to take current snapshot, I see three options:
Read the whole file into memory, as you do now, and then copy it to the destination. With large log files this may result in a very large memory footprint. Requires a special utility or tmpfs.
Make a local copy of the file, then move this copy to the destination. Quite obvious. Requires you to have enough free space and increases storage device pressure. If the temporary file is in tmpfs, this is exactly the same as the first method, but doesn't require special tools (it still needs memory and a large enough tmpfs).
Take current file size and copy only that amount of data, ignoring anything that will be appended during copying.
E.g.:
dd if=src.file of=/remote/dst.file bs=1 count=`stat -c '%s' src.file`
stat takes the current file size, and then this dd is instructed to copy only that number of bytes.
Because of the low bs, for better performance you may want to combine it with another dd:
dd status=none if=src.file bs=1 count=`stat -c '%s' src.file` | dd bs=1M of=/remote/dst.file
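An equivalent sketch with head instead of two dd invocations (head -c and stat -c are GNU coreutils; same src.file and destination as above):
# Copy exactly the file's current size, ignoring bytes appended during the copy
head -c "$(stat -c '%s' src.file)" src.file > /remote/dst.file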

Safely writing to and reading from the same file with multiple processes on Linux and Mac OS X

I have three processes designed to run constantly in both Linux and Mac OS X environments. One process (the Downloader) downloads and stores a local copy of a large XML file every 30 seconds. Two other processes (the Workers) use the stored XML file for input. Each Worker starts and runs at random times. Since the XML file is big, it takes a long time to download. The Workers also take a long time to read and parse it.
What is the safest way to set up the processes so the Downloader doesn't clobber the stored file while the Workers are trying to read it?
For Linux and Mac OS X machines that use inode-based file systems, use temporary files to store the data while it's being downloaded (and is in an incomplete state). Once the download is complete, move the temporary file into its final location with an atomic action.
For a little more detail, there are two main things to watch out for when one process (e.g. Downloader) writes a file that's actively read by other processes (e.g. Workers):
Make sure the Workers don't try to read the file before the Downloader has finished writing it.
Make sure the Downloader doesn't alter the file while the Workers are reading it.
Using temporary files accommodates both of these points.
For a more specific example, when the Downloader is actively pulling the XML file, have it write to a temporary location (e.g. 'data-storage.tmp') on the same device/disk* where the final file will be stored. Once the file is completely downloaded and written, have the Downloader move it to its final location (e.g. 'data-storage.xml') via an atomic (a.k.a. linearizable) rename, such as the mv command, which uses the rename(2) system call.
* Note that the temporary file needs to be on the same device as the final file location so that the move is a true rename within one file system (the file keeps its inode), rather than a cross-device copy, which is the only way it can be done atomically.
This methodology ensures that while the file is being downloaded/written, the Workers won't see it, since it's at the .tmp location. Because of the way renaming works with inodes, it also makes sure that any Worker that already opened the file continues to see the old content even after a new version of the data-storage file is put in place.
The Downloader will point 'data-storage.xml' to a new inode number when it does the rename, but a Worker that already has the file open will continue to access the previous inode, thereby continuing to work with the file in that state. At the same time, any Worker that opens 'data-storage.xml' after the Downloader has done the rename will see the contents of the new inode, since that is now what the filename references in the file system. So, two Workers can be reading from the same filename (data-storage.xml), but each will see a different (and complete) version of the contents, based on which inode the filename pointed to when the file was first opened.
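In code, the Downloader's write-then-rename step might look like this (a sketch; curl and the URL are stand-ins, the filenames are the ones from the answer):
# Download to a temp file on the same filesystem as the final location
curl -fsS -o data-storage.tmp 'https://example.com/feed.xml'
# Atomic rename: Workers see either the old or the new complete file, never a partial one
mv data-storage.tmp data-storage.xml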
To see this in action, I created a simple set of example scripts that demonstrate this functionality on github. They can also be used to test/verify that using a temporary file solution works in your environment.
An important note is that it's the file system on the particular device that matters. If you are using a Linux or Mac machine but working with a FAT file system (for example, a USB thumb drive), this method won't work.

Get `df` to show updated information on FreeBSD

I recently ran out of disk space on a drive on a FreeBSD server. I truncated the file that was causing problems but I'm not seeing the change reflected when running df. When I run du -d0 on the partition it shows the correct value. Is there any way to force this information to be updated? What is causing the output here to be different?
In BSD, a directory entry is simply one of many references to the underlying file data (tracked by an inode). When a file is deleted with the rm(1) command, only the reference count is decreased. If the reference count is still positive (e.g. the file has other directory entries due to hard links), then the underlying file data is not removed.
Newer BSD users often don't realize that a program that has a file open is also holding a reference. This prevents the underlying file data from going away while the process is using it. When the process closes the file, if the reference count falls to zero, the file space is marked as available. This scheme is used to avoid the Microsoft Windows type of issue where it won't let you delete a file because some unspecified program still has it open.
An easy way to observe this is to do the following:
cp /bin/cat /tmp/cat-test    # make a private copy of cat
/tmp/cat-test &              # run it in the background; the running process keeps the executable referenced
rm /tmp/cat-test             # remove the directory entry; the open reference keeps the inode alive
Until the background process is terminated, the file space used by /tmp/cat-test will remain allocated and unavailable as reported by df(1), but the du(1) command will not be able to account for it, as it no longer has a filename.
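To track down which processes are holding deleted-but-open files, a couple of sketches (fstat(1) is in the FreeBSD base system; lsof is available from ports, and its +L1 flag lists open files whose link count has dropped below one):
fstat -f /tmp      # open files on the filesystem containing /tmp
lsof +L1           # unlinked files that are still held open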
Note that if the system should crash without the process closing the file, then the file data will still be present but unreferenced; an fsck(8) run will be needed to recover the filesystem space.
Processes holding files open is one reason why the newsyslog(8) command sends signals to syslogd or other logging programs to inform them they should close and re-open their log files after it has rotated them.
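For example, the conventional way to ask syslogd to re-open its files after rotation (the pid file path is the FreeBSD default):
kill -HUP "$(cat /var/run/syslog.pid)"   # syslogd closes and re-opens its log files on SIGHUP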
Softupdates can also affect filesystem free space, as the actual inode space recovery can be deferred; the sync(8) command can be used to encourage this to happen sooner.
This probably centres on how you truncated the file. du and df report different things as this post on unix.com explains. Just because space is not used does not necessarily mean that it's free...
Does df --sync work?
