Monitor directory for new files only - c

I'd like to monitor a directory for new files from a C app. However, I'm not interested in modified files, only in new files. Currently I'm using readdir/stat for that purpose:
while ( (ent = readdir(dir)) != NULL ) {
strcpy(path, mon_dir);
strcat(path, "/");
strcat(path, ent->d_name);
if ( stat(path, &statbuf) == -1 ) {
printf( "Can't stat %s\n", ent->d_name );
continue;
}
if ( S_ISREG(statbuf.st_mode) ) {
if ( statbuf.st_mtime > *timestamp ) {
tcomp = localtime( &statbuf.st_mtime );
strftime( s_date, sizeof(s_date), "%Y%m%d %H:%M:%S", tcomp );
printf( "%s %s was added\n", s_date, ent->d_name );
*timestamp = statbuf.st_mtime;
}
}
}
Any idea how I can detect newly created files on Linux AND Solaris 10 without keeping a list of files?
Cheers,
Martin.

gamin provides an abstraction around system dependant file notification apis for many *nixes , and it's included in many linux distros by default.
For linux, you could use the linux specific inotify api.
Win32 has a similar API via FindFirstChangeNotification

The solution is to store the last access time in a global variable and pick the latest files with a filter to scandir():
int cmp_mtime( const struct dirent** lentry, const struct dirent** rentry ):
Stat (*lentry)->d_name (extended by path, but that's a detail only)
ltime = statbuf.st_mtime;
Stat (*rentry)->d_name (extended by path, but that's a detail only)
rtime = statbuf.st_mtime;
if ( ltime < rtime ) return -1;
else if ( ltime > rtime ) return 1;
return 0;
int selector( const struct dirent* entry ):
Stat entry->d_name (extended by path, but that's a detail only)
If not normal file then return 0
If stat.st_mtime > lastseen then return 1 else return 0
Main:
Init global time variable lastseen
scandir( directory, &entries, selector, cmp_mtime );
Process list of entries
lastseen := mtime of last entry in list

There is probably no better way with Solaris 10 outside interfacing with either the dtrace command or libdtrace (not recommended). On SunOS 5.11 based OSes (eg: OpenSolaris, Solaris 11 Express, ...), you can just use the File Event Notification framework.

On MacOS X there is a file monitoring API and the provided sample code shows how to find which files have changed.

You could use FAM - File Alteration Monitor for this.

one trick may be to set the archive bit for handled files.
Edith says:
if nothing else from the other answeres helpes, you may play with chmod() instead.

Was researching the same topic on Solaris and found the example of the watchdir app, to be used from within scripts, as in
watchdir /foo/bar
which will perform a blocking wait until something happens on the watched directory.
Link: https://solarisrants.wordpress.com/2013/07/24/solaris-file-event-notification/

I know you were asking for a solution from C but in fact Java (now) has a cross-platform class for this. You can read more about it here. Also see documentation for the WatchService class which is the central class of Java's file change notification ability.
Perhaps there's some documentation somewhere as to how Java accomplishes this on the various platforms ?
From the little I understand Java uses the OS's native file change notification API on Linux, Solaris and Windows and then in addition also implements a fallback method which is based on polling and which will work on any platform.

Related

Standard Library vs Windows API Speed

My question is about whether or not I should use the Windows API if I'm trying to get the most speed out of my program, where I could instead use a Standard Library function.
I assume the answer isn't consistent among every call; Specifically, I'm curious about stat() vs dwFileAttributes, if I wanted to figure out if a file was a directory or not for example (assuming file_name is a string containing the full path to the file):
WIN32_FIND_DATA fileData;
HANDLE hSearch;
hSearch = FindFirstFile(TEXT(file_name), &fileData);
int isDir = fileData.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY
vs.
struct stat info;
stat(file_name, &info);
int isDir = S_ISDIR(info.st_mode);
If anyone knows, or can elaborate on what the speed difference between these libraries generally is (if any) I'd appreciate it.
Not an answer regarding speed, but #The Corn Inspector, the MSVC CRT code is open sourced. If you look at an older version ( before .net and common UCRT), and look at the stat() function, it is INDEED a wrapper around the same OS call.
int __cdecl _tstat (
REG1 const _TSCHAR *name,
REG2 struct _stat *buf
)
{ // stuff omitted for clarity
_TSCHAR * path;
_TSCHAR pathbuf[ _MAX_PATH ];
int drive; /* A: = 1, B: = 2, etc. */
HANDLE findhandle;
WIN32_FIND_DATA findbuf;
/* Call Find Match File */
findhandle = FindFirstFile((_TSCHAR *)name, &findbuf);
Of course there is addtional code for mapping structures, etc. Looks like it also does some time conversion:
SYSTEMTIME SystemTime;
FILETIME LocalFTime;
if ( !FileTimeToLocalFileTime( &findbuf.ftLastWriteTime,
&LocalFTime ) ||
!FileTimeToSystemTime( &LocalFTime, &SystemTime ) )
{
so theoretically, it could be slower, but probably so insignificant, as to make no practical difference in the context of a complete, complex program. If you are calling stat() a million times, and worry about milliseconds, who knows. Profile it.

How to get normalized (canonical) file path on Linux "even if the filepath is not existing on the file system"? (In a C program))

I have researched a lot on this topic but could not get anything substantial.
By normalize/canonicalize I mean to remove all the "..", ".", multiple slashes etc from a file path and get a simple absolute path.
e.g.
"/rootdir/dir1/dir2/dir3/../././././dir4//////////" to
"/rootdir/dir1/dir2/dir4"
On windows I have GetFullPathName() and I can get the canonical filepath name, but for Linux I cannot find any such API which can do the same work for me,
realpath() is there, but even realpath() needs the filepath to be present on the file system to be able to output normalized path, e.g. if the path /rootdir/dir1/dir2/dir4 is not on file system - realpath() will throw error on the above specified complex filepath input.
Is there any way by which one could get the normalized file path even if it is not existing on the file system?
realpath(3) does not resolve missing filenames.
But GNU core utilities (https://www.gnu.org/software/coreutils/) have a program realpath(1) which is similar to realpath(3) function, but have option:
-m, --canonicalize-missing no components of the path need exist
And your task can be done by canonicalize_filename_mode() function from file lib/canonicalize.c of the coreutils source.
canonicalize_filename_mode() from Gnulib is a great option but cannot be used in commercial software (GPL License)
We use the following implementation that depends on cwalk library:
#define _GNU_SOURCE
#include <unistd.h>
#include <stdlib.h>
#include "cwalk.h"
/* extended version of canonicalize_file_name(3) that can handle non existing paths*/
static char *canonicalize_file_name_missing(const char *path) {
char *resolved_path = canonicalize_file_name(path);
if (resolved_path != NULL) {
return resolved_path;
}
/* handle missing files*/
char *cwd = get_current_dir_name();
if (cwd == NULL) {
/* cannot detect current working directory */
return NULL;
}
size_t resolved_path_len = cwk_path_get_absolute(cwd, path, NULL, 0);
if (resolved_path_len == 0) {
return NULL;
}
resolved_path = malloc(resolved_path_len + 1);
cwk_path_get_absolute(cwd, path, resolved_path, resolved_path_len + 1);
free(cwd);
return resolved_path;
}

How to get Module HANDLE from func ptr in Win32?

I'm working on native call bindings for a virtual machine, and one of the features is to be able to look up standard libc functions by name at runtime. On windows this becomes a bit of a hassle because I need to get a handle to the msvcrt module that's currently loaded in the process. Normally this is msvcrt.dll, but it could be other variants as well (msvcr100.dll, etc) and a call to GetModuleHandle("msvcrt") could fail if a variant with a different name is used.
What I would like to be able to do is a reverse lookup, take a function pointer from libc (which I have in abundance) and get a handle to the module that provides it. Basically, something like this:
HANDLE hlibc = ReverseGetModuleHandle(fprintf); // Any func from libc should do the trick
void *vfunc = GetProcAddress(hlibc);
Is there such a thing in the win32 API, without descending into a manual walk of process handles and symbol tables? Conversely, if I am over-thinking the problem, is there an easier way to look up a libc function by name on win32?
The documented way of obtaining the module handle is by using GetModuleHandleEx.
HMODULE hModule = NULL;
if(GetModuleHandleEx(GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT, // behave like GetModuleHandle
(LPCTSTR)address, &hModule))
{
// hModule should now refer to the module containing the target address.
}
MEMORY_BASIC_INFORMATION mbi;
HMODULE mod;
if (VirtualQuery( vfunc, &mbi, sizeof(mbi) ))
{
mod = (HMODULE)mbi.AllocationBase;
}
Unfortunately you will have to walk through modules as you feared. It's not too bad though. Here is the idea, some code written in notepad:
MODULEENTRY32 me = {0};
HANDLE hSnapshot = CreateToolhelp32Snapshot( TH32CS_SNAPMODULE, 0 );
me.dwSize = sizeof me;
Module32First( hSnapshot, &me );
if( me.modBaseAddr <= funcPtr &&
( me.modBaseAddr + me.modBaseSize ) > funcPtr ) {
...
break;
}
do {
} while( Module32Next( hSnapshot, &me ) );
CloseHandle( hSnapshot );

How to get mounted drive's volume name in linux using C?

I'm currently working on program, which must display information about mounted flash drive. I want to display full space, free space, file system type and volume name. But problem is that, i can't find any API through which i can get volume name(volume label). Is there any api to do this?
p.s. full space, free space and file system type i'm getting via statfs function
Assuming that you work on a recent desktop-like distribution (Fedora, Ubuntu, etc.), you have HAL daemon running and a D-Bus session.
Within org.freedesktop.UDisks namespace you can find the object that represents this drive (say org/freedekstop/UDisks/devices/sdb/. It implements org.freedesktop.UDisks.interface. This interface has all the properties that you can dream of, including UUID (IdUuid), FAT label (IdLabel), all the details about filesystem, SMART status (if the drive supports that) etc. etc.
How to use D-Bus API in C is a topic for another question. I assume that's been already discussed in detail -- just search [dbus] and [c] tags.
Flash drives are generally FAT32, which means the "name" that you're looking for is probably the FAT drive label. The most common linux command to retrieve that information is mlabel from the mtools package.
The command looks like this:
[root#localhost]$ mlabel -i /dev/sde1 -s ::
Volume label is USB-DISK
This program works by reading the raw FAT header of the filesystem and retrieving the label from that data. You can look at the source code for the applciation to see how you can replicate the parsing of FAT data in your own application... or you can simply execute run the mlabel binary and read the result into your program. The latter sounds simpler to me.
To call the methods:
kernResult = self->FindEjectableCDMedia(&mediaIterator);
if (KERN_SUCCESS != kernResult) {
printf("FindEjectableCDMedia returned 0x%08x\n", kernResult);
}
kernResult = self->GetPath(mediaIterator, bsdPath, sizeof(bsdPath));
if (KERN_SUCCESS != kernResult) {
printf("GetPath returned 0x%08x\n", kernResult);
}
and the methods:
// Returns an iterator across all DVD media (class IODVDMedia). Caller is responsible for releasing
// the iterator when iteration is complete.
kern_return_t ScanPstEs::FindEjectableCDMedia(io_iterator_t *mediaIterator)
{
kern_return_t kernResult;
CFMutableDictionaryRef classesToMatch;
// CD media are instances of class kIODVDMediaTypeROM
classesToMatch = IOServiceMatching(kIODVDMediaClass);
if (classesToMatch == NULL) {
printf("IOServiceMatching returned a NULL dictionary.\n");
} else {
CFDictionarySetValue(classesToMatch, CFSTR(kIODVDMediaClass), kCFBooleanTrue);
}
kernResult = IOServiceGetMatchingServices(kIOMasterPortDefault, classesToMatch, mediaIterator);
return kernResult;
}
// Given an iterator across a set of CD media, return the BSD path to the
// next one. If no CD media was found the path name is set to an empty string.
kern_return_t GetPath(io_iterator_t mediaIterator, char *Path, CFIndex maxPathSize)
{
io_object_t nextMedia;
kern_return_t kernResult = KERN_FAILURE;
DADiskRef disk = NULL;
DASessionRef session = NULL;
CFDictionaryRef props = NULL;
char * bsdPath = '\0';
*Path = '\0';
nextMedia = IOIteratorNext(mediaIterator);
if (nextMedia) {
CFTypeRef bsdPathAsCFString;
bsdPathAsCFString = IORegistryEntryCreateCFProperty(nextMedia,CFSTR(kIOBSDNameKey),kCFAllocatorDefault,0);
if (bsdPathAsCFString) {
//strlcpy(bsdPath, _PATH_DEV, maxPathSize);
// Add "r" before the BSD node name from the I/O Registry to specify the raw disk
// node. The raw disk nodes receive I/O requests directly and do not go through
// the buffer cache.
//strlcat(bsdPath, "r", maxPathSize);
size_t devPathLength = strlen(bsdPath);
if (CFStringGetCString( (CFStringRef)bsdPathAsCFString , bsdPath + devPathLength,maxPathSize - devPathLength, kCFStringEncodingUTF8)) {
qDebug("BSD path: %s\n", bsdPath);
kernResult = KERN_SUCCESS;
}
session = DASessionCreate(kCFAllocatorDefault);
if(session == NULL) {
qDebug("Can't connect to DiskArb\n");
return -1;
}
disk = DADiskCreateFromBSDName(kCFAllocatorDefault, session, bsdPath);
if(disk == NULL) {
CFRelease(session);
qDebug( "Can't create DADisk for %s\n", bsdPath);
return -1;
}
props = DADiskCopyDescription(disk);
if(props == NULL) {
CFRelease(session);
CFRelease(disk);
qDebug("Can't get properties for %s\n",bsdPath);
return -1;
}
CFStringRef daName = (CFStringRef )CFDictionaryGetValue(props, kDADiskDescriptionVolumeNameKey);
CFStringGetCString(daName,Path,sizeof(Path),kCFStringEncodingUTF8);
if(daName) {
qDebug("%s",Path);
CFRetain(daName);
}
CFRelease(daName);
CFRelease(props);
CFRelease(disk);
CFRelease(session);
CFRelease(bsdPathAsCFString);
}
IOObjectRelease(nextMedia);
}
return kernResult;
}

What is the Windows equivalent of "pidof" from Linux?

In a batchscript, I need to get a list of process IDs with given binary path C:\path\to\binary.exe.
In Linux, I can just do pidof /path/to/binary.
Is there a Win32 executable which does the same, supported from WinXP Home to Win7 (tasklist won't work)?
The package which includes this has to be portable, so a 10MB download is not what I'm looking for.
Is there a C function available which does this and is supported from WinXP to Win7? Note: I want to match a process path, not a filename which could be used by other applications too.
wmic.exe is available on XP, Vista and 7 and can do this. However, it does not come with Windows XP Home edition.
wmic process where ExecutablePath='C:\\windows\\system32\\notepad.exe' get ProcessId
If you want support for Windows XP Home too, you can use EnumProcess and GetModuleFileNameEx. The downside here is that you won't be able to query names of processes running by another user if you're not running as administrator. QueryFullProcessImageName will probably do the trick here, but it's Vista+.
If that's not enough, you'll need Process32First (swatkat's code). For each process you need to call Module32First and then get MODULEENTRY32->szExePath. Note that even this is not completely portable and will not work well on x64 where you'll need QueryFullProcessImageName.
You can use Toolhelp APIs to enumerate processes, get their full path and compare it with the required process name. You need to walk the modules list for each process. The first module in the list is the process executable itself. Here's a sample code:
int main( int argc, char* argv[] )
{
if( argc > 1 )
{
printf( "\nGetting PID of: %s\n", argv[1] );
HANDLE hProcSnapshot = ::CreateToolhelp32Snapshot( TH32CS_SNAPPROCESS, 0 );
if( INVALID_HANDLE_VALUE != hProcSnapshot )
{
PROCESSENTRY32 procEntry = {0};
procEntry.dwSize = sizeof(PROCESSENTRY32);
if( ::Process32First( hProcSnapshot, &procEntry ) )
{
do
{
HANDLE hModSnapshot = ::CreateToolhelp32Snapshot( TH32CS_SNAPMODULE, procEntry.th32ProcessID );
if( INVALID_HANDLE_VALUE != hModSnapshot )
{
MODULEENTRY32 modEntry = {0};
modEntry.dwSize = sizeof( MODULEENTRY32 );
if( Module32First( hModSnapshot, &modEntry ) )
{
if( 0 == stricmp( argv[1], modEntry.szExePath ) )
{
printf( "\nPID: %ld\n", procEntry.th32ProcessID );
::CloseHandle( hModSnapshot );
break;
}
}
::CloseHandle( hModSnapshot );
}
}
while( ::Process32Next( hProcSnapshot, &procEntry ) );
}
::CloseHandle( hProcSnapshot );
}
}
return 0;
}
You can write a small C# application which first calls Process.GetProcessesByName(String) , then go over the results and print the Id property of each one when the MainModule.FileName is equal to the path you are looking for.
PowerShell can solve your problems, if is buit in in Win 7, and downloadable on the other OSs.
param($fileName)
Get-Process | where -FilterScript {$_.MainModule.FileName -eq $fileName}
This script will receive one parameter, the filename you are looking for, and it will output the filename of its executable.
You can call this from a bat file by doing:
powershell -Command "& {Get-Process | where -FilterScript {$_.MainModule.FileName -eq %FILENAME%}"

Resources