How to limit the total size of log files managed by syslog?

How can I limit the total size of log files that are managed by syslog? The oldest archived log files should probably be removed when this size limit (quota) is exceeded.
Some of the log files are custom files logged via LOG_LOCALn facilities, but I guess this doesn't matter for the quota issue.
Thanks!

The Linux utility logrotate renames and rotates system log files on a periodic basis so that they don't occupy excessive disk space. The system-wide configuration is stored in the file /etc/logrotate.conf.
There are a number of directives that help manage log size. Please read the manual ("man logrotate") before changing anything. On my machine this file looks as follows:
# see "man logrotate" for details
# rotate log files weekly
weekly
# keep 4 weeks worth of backlogs
rotate 4
# create new (empty) log files after rotating old ones
create
# uncomment this if you want your log files compressed
#compress
# packages drop log rotation information into this directory
include /etc/logrotate.d
# no packages own wtmp, or btmp -- we'll rotate them here
/var/log/wtmp {
missingok
monthly
create 0664 root utmp
rotate 1
}
/var/log/btmp {
missingok
monthly
create 0660 root utmp
rotate 1
}
# system-specific logs may be configured here
As we can see, log files are rotated on a weekly basis; this can be changed to a daily basis. Compression is not enabled on my machine; it can be enabled if you want to make the rotated log files smaller.
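To cap the total size more directly, logrotate also supports size-based rotation. A minimal sketch of a per-log stanza (the path and the limits are placeholders, not taken from my machine):
/var/log/myapp/*.log {
    size 100M
    rotate 5
    compress
    missingok
    notifempty
}
With rotate 5, at most five old files are kept per matching log, so the total disk usage for that log stays roughly bounded (less if compression is enabled).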
There is an excellent article which you may want to refer to for a complete understanding of this topic.

Related

Date in NLog file name and limit the number of log files

I'd like to achieve the following behaviour with NLog for rolling files:
1. prevent renaming or moving the file when starting a new file, and
2. limit the total number or size of old log files to avoid capacity issues over time
The first requirement can be achieved e.g. by adding a timestamp like ${shortdate} to the file name. Example:
logs\trace2017-10-27.log <-- today's log file to write
logs\trace2017-10-26.log
logs\trace2017-10-25.log
logs\trace2017-10-24.log <-- keep only the last 2 files, so delete this one
According to other posts, however, it is not possible to use a date in the file name and archive parameters like maxArchiveFiles together. If I use maxArchiveFiles, I have to keep the log file name constant:
logs\trace.log <-- today's log file to write
logs\archive\trace2017-10-26.log
logs\archive\trace2017-10-25.log
logs\archive\trace2017-10-24.log <-- keep only the last 2 files, so delete this one
But in this case, on the first write of each day it moves yesterday's trace to the archive and starts a new file.
The reason I'd like to prevent moving the trace file is that we use a Splunk log monitor that watches the files in the log folder for updates, reads the new lines and feeds them to Splunk.
My concern is that if I have an event written at 23:59:59.567, the next event at 00:00:00.002 clears the previous content before the log monitor is able to read it in that fraction of a second.
To be honest I haven't tested this scenario, as it would be complicated to set up and my team doesn't own Splunk, etc., so please correct me if this cannot happen.
Note also that I know it is possible to feed Splunk directly in other ways, e.g. via a network connection, but the current Splunk setup at our company reads from log files, so it would be easier to stay with that.
Any idea how to solve this with NLog?
When using NLog 4.4 (or older), you have to go into Halloween mode and do some trickery.
This example makes hourly log files in the same folder, and ensures archive cleanup is performed after 840 hours (35 days):
fileName="${logDirectory}/Log.${date:format=yyyy-MM-dd-HH}.log"
archiveFileName="${logDirectory}/Log.{#}.log"
archiveDateFormat="yyyy-MM-dd-HH"
archiveNumbering="Date"
archiveEvery="Year"
maxArchiveFiles="840"
archiveFileName - Using {#} allows the archive cleanup to generate a proper file wildcard.
archiveDateFormat - Must match the ${date:format=} of the fileName (so remember to correct both date formats if a change is needed).
archiveNumbering=Date - Configures the archive cleanup to parse the file names as dates.
archiveEvery=Year - Activates the archive cleanup, but also the archive file operation. Because the configured fileName already takes care of starting a new file each hour, we don't want any additional archive operations (e.g. to avoid generating extra empty files at midnight).
maxArchiveFiles - How many archive files to keep around.
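Putting these attributes together, a minimal sketch of the File target in nlog.config might look like this (the target name, the logDirectory variable and the surrounding markup are assumptions for illustration, not part of the answer above):
<targets>
  <target xsi:type="File" name="trace"
          fileName="${logDirectory}/Log.${date:format=yyyy-MM-dd-HH}.log"
          archiveFileName="${logDirectory}/Log.{#}.log"
          archiveDateFormat="yyyy-MM-dd-HH"
          archiveNumbering="Date"
          archiveEvery="Year"
          maxArchiveFiles="840" />
</targets>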
With NLog 4.5 (still in beta), it will be a lot easier, as one just has to specify maxArchiveFiles. See also https://github.com/NLog/NLog/pull/1993

When a database has multiple log files, which one will SQL Server pick?

One of my databases has multiple log files:
log.ldf - 40 GB (on D: drive)
log2.ldf - 70 GB (on S: drive)
log3.ldf - 100 GB (on L: drive)
Which log file will SQL Server pick first? Does SQL Server follow any particular order when picking a log file? Can we control this?
I believe you can't control which file the log records will be written to.
You should concentrate not on the BIGGEST file, but on the FASTEST.
General advice would be to have ONLY two files:
- The first file, as big as possible, on the FASTEST drive (on SSD). Set MAXSIZE to the file size, so it won't grow anymore.
- The second file, as small as possible, on a big drive where it can grow in case the first file is full.
Your task would then be to monitor the size of the second file; if it starts to grow, take log backups more often and shrink that file back.
If you want to see how your log files are used, you can use the following DBCC command:
DBCC LOGINFO ();
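As a hedged sketch of the kind of T-SQL involved (the database name and logical file names below are hypothetical, not from the question):
-- Cap the primary log file so it won't grow anymore
ALTER DATABASE MyDb MODIFY FILE (NAME = N'MyDb_log', MAXSIZE = 40960MB);

-- Take log backups more often if the overflow file starts to grow
BACKUP LOG MyDb TO DISK = N'S:\Backup\MyDb_log.trn';

-- Then shrink the overflow file back (target size in MB)
DBCC SHRINKFILE (N'MyDb_log2', 1024);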

Why are compressed files modified at the end of compression?

Using 7-Zip I compressed ~15 GB worth of pictures, split across folders, into 15 volumes of 1024 MB each.
Compression method: LZMA2; level: Ultra; dictionary size: 64M.
At the end of compression some of the files had their "last modified" time changed to the time of completion, while some of the files didn't.
Why is this?
And if I have already uploaded most of the files will I be able to unarchive them successfully?
You would need to ask the author of the program for an explanation of why it modifies volumes at the end of the operation. If I had to make an educated guess, it might be because 7-zip doesn't know which is the last volume until it's finished (because this would depend on the compression ratio of the files being archived, which can't be predicted), and so it needs to go back and update parts of the volume file headers accordingly.
In general, though, quoting the relevant 7-zip help file entry:
NOTE: Please don't use volumes (and don't copy volumes) before
finishing archiving. 7-Zip can change any volume (including first
volume) at the end of archiving operation.
The only safe assumption is that you can't reliably use any of your individual 1GB volumes until 7-zip has finished processing the whole 15GB archive.
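As a hedged follow-up: once archiving has fully completed, you can verify the whole volume set before trusting the uploaded copies by pointing 7-Zip's test command at the first volume, which makes it read all of them (the archive name here is just an example):
7z t pictures.7z.001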

Replay a file-based data stream

I have a live stream of data based on files in different formats. Data comes over the network and is written to files in certain subdirectories of a directory hierarchy. From there it is picked up and processed further. I would like to replay e.g. one day of this data stream for testing and simulation purposes. I could duplicate the data stream for one day to a second machine and "record" it this way, by just letting the files pile up without processing or moving them.
I need something simple like a Perl script which takes a base directory, looks at all the files contained in its subdirectories and their creation times, and then copies each file at the same time of day to a different base directory.
Simple example: I have files a/file.1 2012-03-28 15:00, b/file.2 2012-03-28 09:00, c/file.3 2012-03-28 12:00. If I run the script/program on 2012-03-29 at 08:00 it should sleep until 09:00, copy b/file.2 to ../target_dir/b/file.2, then sleep until 12:00, copy c/file.3 to ../target_dir/c/file.3, then sleep until 15:00 and copy a/file.1 to ../target_dir/a/file.1.
Does a tool like this already exist? It seems I’m missing the right search keywords to find it.
The environment is Linux, command line preferred. For one day it would be thousands of files with a few GB in total. The timing does not have to be ultra-precise. Second resolution would be good, minute resolution would be sufficient.
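As a rough sketch of the idea (not a finished tool, and in Python rather than the Perl mentioned above), assuming source and target base directories are passed on the command line and using file modification time as the "recorded" time, since Linux does not readily expose creation time:
#!/usr/bin/env python3
"""Replay files from SRC to DST at the same time of day they were written."""
import os, shutil, sys, time
from datetime import datetime, timedelta

def main(src, dst):
    # Collect (time of day in seconds, relative path) for every file under src.
    entries = []
    for root, _, files in os.walk(src):
        for name in files:
            path = os.path.join(root, name)
            mtime = datetime.fromtimestamp(os.path.getmtime(path))
            seconds = mtime.hour * 3600 + mtime.minute * 60 + mtime.second
            entries.append((seconds, os.path.relpath(path, src)))
    entries.sort()

    midnight = datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
    for seconds, rel in entries:
        due = midnight + timedelta(seconds=seconds)
        wait = (due - datetime.now()).total_seconds()
        if wait > 0:
            time.sleep(wait)  # sleep until the recorded time of day
        target = os.path.join(dst, rel)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        shutil.copy2(os.path.join(src, rel), target)
        print(f"{datetime.now():%H:%M:%S} copied {rel}")

if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2])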

Storing Large Number Of Files in File-System

I have millions of audio files, named based on GUIDs (http://en.wikipedia.org/wiki/Globally_Unique_Identifier). How can I store these files in the file system so that I can efficiently add more files to the same file system and look up a particular file efficiently? It should also be scalable in the future.
Files are named based on GUIDs (unique file names).
Eg:
[1] 63f4c070-0ab2-102d-adcb-0015f22e2e5c
[2] ba7cd610-f268-102c-b5ac-0013d4a7a2d6
[3] d03cf036-0ab2-102d-adcb-0015f22e2e5c
[4] d3655a36-0ab3-102d-adcb-0015f22e2e5c
Please give your views.
PS: I have already gone through < Storing a large number of images >. I need a concrete data structure/algorithm/logic so that it is also scalable in the future.
EDIT1: There are around 1-2 million files and the file system is ext3 (CentOS).
Thanks,
Naveen
That's very easy - build a folder tree based on parts of the GUID values.
For example, make 256 folders, each named after the first byte, and store in each only the files whose GUID starts with that byte. If that's still too many files in one folder, do the same in each folder for the second byte of the GUID. Add more levels if needed. Searching for a file will be very fast.
By selecting the number of bytes you use for each level you can effectively choose the tree structure for your scenario.
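An illustrative sketch of this scheme in Python (the base directory, two characters per level, and two levels are arbitrary example choices, not part of the answer):
import os

def path_for(base_dir, guid, levels=2, chars_per_level=2):
    """Map e.g. 63f4c070-... to <base>/63/f4/63f4c070-... based on the GUID prefix."""
    hex_part = guid.replace("-", "")
    parts = [hex_part[i * chars_per_level:(i + 1) * chars_per_level] for i in range(levels)]
    return os.path.join(base_dir, *parts, guid)

# Prints: /data/audio/63/f4/63f4c070-0ab2-102d-adcb-0015f22e2e5c
print(path_for("/data/audio", "63f4c070-0ab2-102d-adcb-0015f22e2e5c"))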
I would try to keep the number of files in each directory manageable. The easiest way to do this is to name the subdirectory after the first 2-3 characters of the GUID.
Construct an n-level-deep folder hierarchy to store your files, where the names of the nested folders are the first n characters of the corresponding file name. For example, to store a file "63f4c070-0ab2-102d-adcb-0015f22e2e5c" in a four-level-deep hierarchy, construct 6/3/f/4 and place the file there. The depth of the hierarchy depends on the maximum number of files you can have in your system. For a few million files in my project, a 4-level-deep hierarchy works well.
I did the same thing in my project, which has nearly 1 million files. My requirement was also to process the files by traversing this huge list. I constructed a 4-level-deep folder hierarchy and the processing time was reduced from nearly 10 minutes to a few seconds.
An add-on to this optimization: if you want to process all the files present in these deep folder hierarchies, then instead of calling a function to fetch the directory listing for the first 4 levels, just precompute all the possible 4-level-deep folder hierarchy names. Since a GUID uses 16 possible characters, there are 16 folders at each of the first four levels, so we can precompute the 16*16*16*16 hierarchies, which takes just a few milliseconds. This saves a lot of time when these files are stored at a shared location and fetching the listing of a single directory takes nearly a second.
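A hedged sketch of that precomputation (four single-character levels, as described above):
import itertools, os

HEX = "0123456789abcdef"

# All 16**4 = 65536 possible 4-level paths like "6/3/f/4", computed in memory
# instead of listing directories on a (possibly slow) shared location.
all_hierarchies = [os.path.join(*chars) for chars in itertools.product(HEX, repeat=4)]
print(len(all_hierarchies), all_hierarchies[:3])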
Sorting the audio files into separate subdirectories may be slower if dir_index is used on the ext3 volume. (dir_index: "Use hashed b-trees to speed up lookups in large directories.")
This command will set the dir_index feature: tune2fs -O dir_index /dev/sda1
