I'm writing an inotify watcher in C for a Minecraft server. Basically, it watches server.log, gets the latest line, parses it, and if it matches a regex; performs some actions.
The program works fine normally through "echo string matching the regex >> server.log", it parses and does what it should. However, when the string is written to the file automatically via Minecraft server, it doesn't work until I shut down the server or (sometimes) log out.
I would post code, but I'm wondering if it doesn't have something to do with ext4 flushing data to disk or something along those lines; a filesystem problem. It would be odd if that were the case though, because "tail -f server.log" updates whenever the file does.
Solved my own problem. It turned out the server was writing to the log file faster than the watcher could read from it; so the watcher was getting out of sync.
I fixed it by adding a check after it processes the event saying "if the number of lines currently in the log file is more than the recorded length of the log, reprocess the file until the two are equal."
Thanks for your help!
Presumably that is because you are watching for IN_CLOSE events, which may not occur until the server shuts down (and closes the log file handle). See man inotify(7) for valid mask parameters for the inotify_add_watch() call. I expect you'll want to use IN_WRITE.
Your theory is more than likely correct, the log file is being buffered by the OS, and the log writer has no flushing of that buffer, so everything will remain in the buffer till the file is closed or the buffer is full. A fast way to test is to start up the log to the point where you know it would have written events to the log, then forcibly close it so it cannot close the handle, if the log is empty is definitly the buffer. If you can get hold of the file handle/descriptor, you can use setbuf to remove buffering, at the cost of performance.
Related
I am designing a logger plugin for my tool.I have a busybox syslog on a target board, and i want to get syslog data from it so i can forward to my host(not via remote port forwarding of syslog) via my own communication framework.Initially i had made use of syslog's ability to forward messages it receives to a named pipe but this only works via a patch addition which is not feasible in my case.So now my idea is to write a configuration file in syslog to forward all log messages it receives to a file and track the file to get my data.I can use tail function to monitor my file changes but my busybox tail does not support "--follow" option since syslog performs logrotate which causes "tail -f" to fail.And also i am not sure if this is a good method to do it.So what i wanted to ask is there another way in which i can get modified data from a file.I can use inotify, but that can only be used to track file changes.So is there a way to do this?
You could try the "diff" utility (or git-diff, which has more facilities).
You may write a script/program which can receive an inotify event. And the script reopens the file and starts to read till EOF, from the previously saved last read file position.
Take the following command:
mysql -u root -p < load_data.sql > output.tab
The -p flag tells the mysql client - a C program - to provide the user with an interactive prompt to enter the password.
AFAIK, input like this is typically handled by writing a prompt to stderr and then blocking on a call like gets, which reads a line from stdin.
But the shell has already opened the load_data.sql file and set the stdin of the mysql client to its file descriptor - so shouldn't calling gets just get the first line from the file?
My initial thought was that the program seeks to the end before reading a line - but you can't seek like that on pipes!
So how does this work? Is there some magic?
Applications that prompt for passwords generally don't actually read them from stdin, on the grounds that this would (a) cause the password to appear on the screen if it was being typed in interactively and (b) encourage plain-text passwords to be bandied around in publicly-visible places when things need to be automated (e.g. in command lines visible to others via ps). PostgreSQL's psql SQL shell opens the terminal device directly, and I suspect mysql will do the same.
Some quick searching found this related question. The top-rated answer mentions the GNU function getpass(), which does indeed open a direct connection to the terminal, bypassing stdin. I suspect that function is what most password-prompting programs use in *nix.
This isn't a pipe that's being opened up, but rather is a redirection of stdin to point to a file. Thus you have both a FILE* (i.e. a stream), as well as a normal file-descriptor you can work with. In the case of the lower-level file-descriptor, there are seeking operations you can do, like lseek(), etc. that can be used along with read() in order to move around the file.
If you are wanting to still read data from the controlling terminal while stdin has been re-directed to a file, you simply need to open the controlling terminal for reading on another file-descriptor. You can use ctermid() in order to determine what the controlling terminal for your process is, and reopen it on another file-descriptor.
After doing tons of research and nor being able to find a solution to my problem i decided to post here on stackoverflow.
Well my problem is kind of unusual so I guess that's why I wasn't able to find any answer:
I have a program that is recording stuff to a file. Then I have another one that is responsible for transferring that file. Finally I have a third one that gets the file and processes it.
My problem is:
The file transfer program needs to send the file while it's still being recorded. The problem is that when the file transfer program reaches end of file on the file doesn't mean that the file actually is complete as it is still being recorded.
It would be nice to have something to check if the recorder has that file still open or if it already closed it to be able to judge if the end of file actually is a real end of file or if there simply aren't further data to be read yet.
Hope you can help me out with this one. Maybe you have another idea on how to solve this problem.
Thank you in advance.
GeKod
Simply put - you can't without using filesystem notification mechanisms, windows, linux and osx all have flavors of this. I forget how Windows does it off the top of my head, but linux has 'inotify' and osx has 'knotify'.
The easy way to handle this is, record to a tmp file, when the recording is done then move the file into the 'ready-to-transfer-the-file' directory, if you do this so that the files are on the same filesystem when you do the move it will be atomic and instant ensuring that any time your transfer utility 'sees' a new file, it'll be wholly formed and ready to go.
Or, just have your tmp files have no extension, then when it's done rename the file to an extension that the transfer agent is polling for.
Have you considered using stream interface between the recorder program and the one that grabs the recorded data/file? If you have access to a stream interface (say an OS/stack service) which also provides a reliable end of stream signal/primitive you could consider that to replace the file interface.
There is no functions/libraries available in C to do this. But a simple alternative is to rename the file once an activity is over. For example, recorder can open the file with name - file.record and once done with recording, it can rename the file.record to file.transfer and the transfer program should look for file.transfer to transfer and once the transfer is done, it can rename the file to file.read and the reader can read that and finally rename it to file.done!
you can check if file is open or not as following
FILE_NAME="filename"
FILE_OPEN=`lsof | grep $FILE_NAME`
// if [ -z $FILE_NAME ] ;then
// "File NOT open"
// else
// "File Open"
refer http://linux.about.com/library/cmd/blcmdl8_lsof.htm
I think an advisory lock will help. Since if one using the file which another program is working on it, the one will get blocked or get an error. but if you access it in force,the action is Okey, but the result is unpredictable, In order to maintain the consistency, all of the processes who want to access the file should obey the advisory lock rule. I think that will work.
When the file is closed then the lock is freed too.Other processes can try to hold the file.
I'm intending to create a programme that can permanently follow a large dynamic set of log files to copy their entries over to a database for easier near-realtime statistics. The log files are written by diverse daemons and applications, but the format of them is known so they can be parsed. Some of the daemons write logs into one file per day, like Apache's cronolog that creates files like access.20100928. Those files appear with each new day and may disappear when they're gzipped away the next day.
The target platform is an Ubuntu Server, 64 bit.
What would be the best approach to efficiently reading those log files?
I could think of scripting languages like PHP that either open the files theirselves and read new data or use system tools like tail -f to follow the logs, or other runtimes like Mono. Bash shell scripts probably aren't so well suited for parsing the log lines and inserting them to a database server (MySQL), not to mention an easy configuration of my app.
If my programme will read the log files, I'd think it should stat() the file once in a second or so to get its size and open the file when it's grown. After reading the file (which should hopefully only return complete lines) it could call tell() to get the current position and next time directly seek() to the saved position to continue reading. (These are C function names, but actually I wouldn't want to do that in C. And Mono/.NET or PHP offer similar functions as well.)
Is that constant stat()ing of the files and subsequent opening and closing a problem? How would tail -f do that? Can I keep the files open and be notified about new data with something like select()? Or does it always return at the end of the file?
In case I'm blocked in some kind of select() or external tail, I'd need to interrupt that every 1, 2 minutes to scan for new or deleted files that shall (no longer) be followed. Resuming with tail -f then is probably not very reliable. That should work better with my own saved file positions.
Could I use some kind of inotify (file system notification) for that?
If you want to know how tail -f works, why not look at the source? In a nutshell, you don't need to periodically interrupt or constantly stat() to scan for changes to files or directories. That's what inotify does.
I recently ran out of disk space on a drive on a FreeBSD server. I truncated the file that was causing problems but I'm not seeing the change reflected when running df. When I run du -d0 on the partition it shows the correct value. Is there any way to force this information to be updated? What is causing the output here to be different?
In BSD a directory entry is simply one of many references to the underlying file data (called an inode). When a file is deleted with the rm(1) command only the reference count is decreased. If the reference count is still positive, (e.g. the file has other directory entries due to symlinks) then the underlying file data is not removed.
Newer BSD users often don't realize that a program that has a file open is also holding a reference. The prevents the underlying file data from going away while the process is using it. When the process closes the file if the reference count falls to zero the file space is marked as available. This scheme is used to avoid the Microsoft Windows type issues where it won't let you delete a file because some unspecified program still has it open.
An easy way to observe this is to do the following
cp /bin/cat /tmp/cat-test
/tmp/cat-test &
rm /tmp/cat-test
Until the background process is terminated the file space used by /tmp/cat-test will remain allocated and unavailable as reported by df(1) but the du(1) command will not be able to account for it as it no longer has a filename.
Note that if the system should crash without the process closing the file then the file data will still be present but unreferenced, an fsck(8) run will be needed to recover the filesystem space.
Processes holding files open is one reason why the newsyslog(8) command sends signals to syslogd or other logging programs to inform them they should close and re-open their log files after it has rotated them.
Softupdates can also effect filesystem freespace as the actual inode space recovery can be deferred; the sync(8) command can be used to encourage this to happen sooner.
This probably centres on how you truncated the file. du and df report different things as this post on unix.com explains. Just because space is not used does not necessarily mean that it's free...
Does df --sync work?