I am looking for a way to find out all the ClearCase elements used while building my application. The application is Linux based and is built with a combination of Makefiles, Ant scripts and shell scripts. I am thinking of something similar to clearaudit, so I don't have to modify my existing build scripts.
Any tip/tool/help would be appreciated. Thanks.
Strictly speaking, the answer is no: there is no (free) readily available tool to replace clearaudit. Depending on what kind of data you're trying to collect, there are various ways to get the same information, but they are platform dependent.
As VonC suggests, strace can be used on UNIX platforms to track read() calls. On Linux in particular, inotify-based tools such as inotifywatch can be used.
On Windows, your options are much more complicated. The most straightforward programmatic solution is ReadDirectoryChangesW; the way it works is broken by design, but it may still be useful in some situations.
Commercially, the ectool that ships with ElectricCommander appears to offer the same kind of functionality as a minor part of a larger feature set.
There is a tool called Audited Objects. It audits all opened files (not only those in a VOB) and provides a query tool to get the required list of files (you can choose only read, only written, or both) from the audit file.
You can give it a try and check if it suits your needs.
I don't know of a native way in ClearCase of emulating clearaudit (as described in "About clearaudit").
The only other way to try to extract that information would be using strace:
In the simplest case, strace runs the specified command until it exits.
It intercepts and records the system calls which are called by a process and the signals which are received by a process.
strace -f -efile yourScript
That would list all files accessed during the execution of the script (the -f flag makes strace follow child processes, which matters when the build is driven by make, ant and shell scripts).
If you know where your ClearCase files are, you can extract from this list all the versioned files used.
But this is not a native ClearCase solution.
Related
I want to write software that will detect all used/created/modified/deleted files during the execution of a process (and its child processes). The process has not yet run - the user provides a command line which will later be subprocessed via bash, so we can do things before and after execution, and control the environment the command is run in.
I have thought of four methods so far that could be useful:
Parse the command line to identify files and directories mentioned. Assume all files explicitly mentioned are used. Check directories before/after for created/deleted files. MD5 existing files before/after to see if any are modified. This works on all operating systems and environments, but obviously has serious limitations (it doesn't work when the command is "./script.sh").
Run the process via another process like strace (dtruss for OSX, and there are equivalent Windows programs), which listens for system calls. Parse the output file to find files used/modified/deleted/created. Pros: it's more sensitive than the MD5 method and can deal with script.sh. Cons: it's very OS specific (dtruss requires root privileges even if the process being run does not, and the outputs of the tools are all different). It could also create huge log files if there are a lot of read/write operations, and it will certainly slow things down.
Integrate something similar to the above into the kernel. Obviously still OS specific, but at least now we are calling the shots and can create a common output format for all OSes. It wouldn't create huge log files, and could even stop hooking syscalls to, say, read() after the process has requested its first read() of a file. I think this is what inotify is doing, but I'm not familiar with it at all, nor with kernel programming!
Run the process using the LD_PRELOAD trick (called DYLD_INSERT_LIBRARIES on OSX; I'm not sure whether an equivalent exists on Windows), which basically overrides any call to open() by the process with our own version of open() that logs what we're opening - same for write, read, etc. (see the sketch after this list). It's very simple to do, and very performant since you're essentially teaching the process to log itself. The downside is that it only works for dynamically linked processes, and I have no idea of the prevalence of dynamically vs. statically linked programs. I don't even know whether it is possible, before execution, to tell if a process is dynamically or statically linked (with the intention of using this method by default, but falling back to a less performant method if it's not possible).
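A minimal sketch of that fourth option, assuming glibc on Linux (the library name is made up for the example, and a real logger would also have to cover open64, openat, fopen, creat and the read/write family):

/* liblogopen.c - hypothetical LD_PRELOAD interposer for open().
   Build: gcc -shared -fPIC -o liblogopen.so liblogopen.c -ldl
   Run:   LD_PRELOAD=./liblogopen.so yourcommand */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>

int open(const char *path, int flags, ...)
{
    static int (*real_open)(const char *, int, ...);
    if (!real_open)                                   /* resolve the libc open() once */
        real_open = (int (*)(const char *, int, ...))dlsym(RTLD_NEXT, "open");

    mode_t mode = 0;
    if (flags & O_CREAT) {                            /* the mode argument only exists with O_CREAT */
        va_list ap;
        va_start(ap, flags);
        mode = va_arg(ap, mode_t);
        va_end(ap);
    }

    fprintf(stderr, "open(\"%s\", 0x%x)\n", path, flags);
    return real_open(path, flags, mode);
}

Note that dlsym(RTLD_NEXT, "open") only finds the "next" definition because LD_PRELOAD put this library ahead of libc, which is exactly the mechanism being relied on.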
I need help choosing the optimal path to go down. I have already implemented the first method because it was simple and gave me a way to work on the logging backend (http://ac.gt/log), but really I need to upgrade to one of the other methods. Your advice would be invaluable :)
Take a look at the source code of strace (and its -f option to trace children). It does basically what you are trying to do: it captures all the system calls of the process (and its children), so you can grep for operations like open, etc.
The following link provides some examples of implementing your own strace by using the ptrace system call:
https://blog.nelhage.com/2010/08/write-yourself-an-strace-in-70-lines-of-code/
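To give a feel for where that ends up, here is a heavily simplified sketch of such a tracer, assuming x86-64 Linux. It follows a single child and naively alternates between syscall-entry and syscall-exit stops; a real tool would also set PTRACE_O_TRACEFORK and related options so that children are followed the way strace -f does:

#include <stdio.h>
#include <unistd.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/user.h>
#include <sys/wait.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s command [args...]\n", argv[0]);
        return 1;
    }

    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);      /* ask to be traced by the parent */
        execvp(argv[1], &argv[1]);
        perror("execvp");
        _exit(127);
    }

    int status, entering = 1;
    waitpid(child, &status, 0);                     /* child stops at the exec */
    for (;;) {
        ptrace(PTRACE_SYSCALL, child, NULL, NULL);  /* run to the next syscall boundary */
        waitpid(child, &status, 0);
        if (WIFEXITED(status))
            break;
        if (entering) {                             /* inspect registers on syscall entry */
            struct user_regs_struct regs;
            ptrace(PTRACE_GETREGS, child, NULL, &regs);
            if (regs.orig_rax == SYS_open || regs.orig_rax == SYS_openat)
                printf("open-style syscall %lld\n", (long long)regs.orig_rax);
        }
        entering = !entering;
    }
    return 0;
}

Pulling the actual path string out of the child's address space takes PTRACE_PEEKDATA (or process_vm_readv) on the register that holds the pointer argument.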
My goal is to determine when executing a command, precisely which files it reads and writes. On Linux I can do this using ptrace (with work, akin to what strace does) and on FreeBSD and MacOS I can do this with the ktrace system command. What would you use to obtain this information on Windows?
My research so far suggests that I either use the debugger interface (similar to ptrace in many ways) or perhaps ETW. A third alternative is to interpose a DLL to intercept system calls as they are made. Unfortunately, I don't have the experience to guess as to how challenging each of these approaches will be.
Any suggestions?
Unfortunately, it seems there is no easy way to intercept file-level operations on Windows.
Here are some hints:
you could try to use FileMon from Sysinternals if it is enough for your needs, or try to look at the source of the tool
you could make use of commercial software like Detours - beware, I never used that myself and I'm not sure it really meets your needs
If you want a better understanding and are not frightened at doing it by hand, the Windows way of intercepting file I/O is using a File System Filter Driver. In fact, there is a FilterManager embedded in Windows system that can forward all file system calls to minifilters.
To build it, the interface with the system is provided by the FilterManager, and you have just (...) to code and install the minifilter that does the actual filtering - beware again never tested that ...
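For reference, and repeating the "never tested" caveat, the standard minifilter boilerplate looks roughly like the skeleton below. Treat it as an untested sketch: building and loading it additionally requires the WDK, an INF file with a registered altitude, and a test-signing setup.

#include <fltKernel.h>

static PFLT_FILTER gFilterHandle;

static FLT_PREOP_CALLBACK_STATUS
PreCreate(PFLT_CALLBACK_DATA Data, PCFLT_RELATED_OBJECTS FltObjects, PVOID *CompletionContext)
{
    UNREFERENCED_PARAMETER(Data);
    UNREFERENCED_PARAMETER(FltObjects);
    UNREFERENCED_PARAMETER(CompletionContext);
    /* A real filter would call FltGetFileNameInformation here to log which file is being opened. */
    return FLT_PREOP_SUCCESS_NO_CALLBACK;
}

static NTSTATUS
FilterUnload(FLT_FILTER_UNLOAD_FLAGS Flags)
{
    UNREFERENCED_PARAMETER(Flags);
    FltUnregisterFilter(gFilterHandle);
    return STATUS_SUCCESS;
}

static const FLT_OPERATION_REGISTRATION Callbacks[] = {
    { IRP_MJ_CREATE, 0, PreCreate, NULL },          /* intercept file open/create */
    { IRP_MJ_OPERATION_END }
};

static const FLT_REGISTRATION FilterRegistration = {
    sizeof(FLT_REGISTRATION),                       /* Size */
    FLT_REGISTRATION_VERSION,                       /* Version */
    0,                                              /* Flags */
    NULL,                                           /* ContextRegistration */
    Callbacks,                                      /* OperationRegistration */
    FilterUnload,                                   /* FilterUnloadCallback */
    /* remaining callbacks default to NULL */
};

NTSTATUS
DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    UNREFERENCED_PARAMETER(RegistryPath);
    NTSTATUS status = FltRegisterFilter(DriverObject, &FilterRegistration, &gFilterHandle);
    if (NT_SUCCESS(status)) {
        status = FltStartFiltering(gFilterHandle);
        if (!NT_SUCCESS(status))
            FltUnregisterFilter(gFilterHandle);
    }
    return status;
}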
As you suggested, this is a fairly simple task to solve with API hooking via DLL injection.
This is a pretty good article about the approach: API hooking revealed
I believe you can find more recent articles about the issue.
However, you probably need to use C++ to implement such a utility. By the way, programs can disable DLL injection. For example, I wasn't able to use this approach on the trial version of Photoshop.
So, you may want to check if you can inject DLL files in the process you want with an existing solution before you start writing your own.
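To give a flavour of the hooking half on its own (injection is a separate problem), here is a hedged sketch in plain C that patches the current executable's import address table so its own calls to CreateFileW go through a logging wrapper; an injected DLL would apply the same patch to the target process's modules instead:

#include <windows.h>
#include <stdio.h>

typedef HANDLE (WINAPI *CreateFileW_t)(LPCWSTR, DWORD, DWORD,
    LPSECURITY_ATTRIBUTES, DWORD, DWORD, HANDLE);

static CreateFileW_t RealCreateFileW;

static HANDLE WINAPI LoggingCreateFileW(LPCWSTR name, DWORD access, DWORD share,
    LPSECURITY_ATTRIBUTES sa, DWORD disp, DWORD flags, HANDLE tmpl)
{
    wprintf(L"CreateFileW(%ls)\n", name);            /* log, then forward */
    return RealCreateFileW(name, access, share, sa, disp, flags, tmpl);
}

static void PatchIat(void *replacement)
{
    BYTE *base = (BYTE *)GetModuleHandleW(NULL);     /* this executable's image */
    IMAGE_DOS_HEADER *dos = (IMAGE_DOS_HEADER *)base;
    IMAGE_NT_HEADERS *nt = (IMAGE_NT_HEADERS *)(base + dos->e_lfanew);
    IMAGE_DATA_DIRECTORY dir =
        nt->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
    IMAGE_IMPORT_DESCRIPTOR *imp =
        (IMAGE_IMPORT_DESCRIPTOR *)(base + dir.VirtualAddress);
    FARPROC target = GetProcAddress(GetModuleHandleW(L"kernel32.dll"), "CreateFileW");

    for (; imp->Name != 0; imp++) {                  /* every imported DLL */
        IMAGE_THUNK_DATA *thunk = (IMAGE_THUNK_DATA *)(base + imp->FirstThunk);
        for (; thunk->u1.Function != 0; thunk++) {   /* every imported function */
            if ((FARPROC)thunk->u1.Function == target) {
                DWORD old;
                VirtualProtect(&thunk->u1.Function, sizeof(thunk->u1.Function),
                               PAGE_READWRITE, &old);
                RealCreateFileW = (CreateFileW_t)target;
                thunk->u1.Function = (ULONG_PTR)replacement;
                VirtualProtect(&thunk->u1.Function, sizeof(thunk->u1.Function),
                               old, &old);
            }
        }
    }
}

int main(void)
{
    PatchIat((void *)LoggingCreateFileW);
    HANDLE h = CreateFileW(L"C:\\Windows\\win.ini", GENERIC_READ, FILE_SHARE_READ,
                           NULL, OPEN_EXISTING, 0, NULL);   /* goes through the wrapper now */
    if (h != INVALID_HANDLE_VALUE)
        CloseHandle(h);
    return 0;
}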
Please take a look at the article CDirectoryChangeWatcher - ReadDirectoryChangesW all wrapped up.
It is a very old, but still working, way to watch directory changes.
Microsoft owns a bunch of tools called Sysinternals. There is a program called Process Monitor that will show you all the file accesses for a particular process. This is very likely what you want.
Check out this particular Stack Overflow question; it might help you:
Is there something like the Linux ptrace syscall in Windows?
Also, if you are running older versions like Windows XP, then you should check out Process Monitor.
Also, I would like you to check this out...
Monitoring certain system calls done by a process in Windows
If my process has loaded a .so library and a new version of the library becomes available, is it possible to switch to the new library without restarting the process? Or does the answer depend on things like whether there is a parameter change to one of the existing functions in the library?
I am working on a pretty big system which runs hundreds of processes, each loading tens of libraries. The libraries provide specific functionality and are provided by separate teams. So when one of the libraries changes (for a bug fix, let's say), the ideal thing would be to publish it under the hood without impacting the running processes. Is it possible?
EDIT: Thanks! In my case, when a new library is available, all the running processes have to start using it. It's not an option to let them run with the old version and pick up the new one later. So it looks like the safer option is to just restart the processes.
You cannot upgrade a linked library on the fly with a process running.
You could even try to, but if you succeed (and you don't get a "text file busy" error message), you'll still have to restart the process to make it map the new library into memory.
You can use the lsof command to check which libraries are linked in (at runtime or link time):
lsof -p <process_pid> | grep ' mem '
One interesting technique, although it is somewhat prone to failure in the checkpoint restore step, is to do an invisible restart.
Your server process, or whatever it is, saves all its necessary information into disk files, including the file descriptor numbers and their current states. Then the server process calls exec to re-execute itself, replacing the current version of itself. After that, it reads its state back from the disk files and resumes serving its file descriptors as if nothing happened.
If all goes well, the restart is invisible and the new process is using all of the updated libraries.
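A tiny sketch of that exec-yourself step, assuming Linux's /proc/self/exe (the state serialization is elided; open file descriptors survive the exec unless they were opened with FD_CLOEXEC):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: call only after the state has been persisted to disk. */
static void restart_self(char **argv)
{
    execv("/proc/self/exe", argv);   /* replaces the image; fresh copies of all .so files get mapped */
    perror("execv");                 /* only reached if the exec itself failed */
}

int main(int argc, char **argv)
{
    (void)argc;
    /* ... save fd numbers and other state to disk files here ... */
    if (getenv("ALREADY_RESTARTED") == NULL) {
        setenv("ALREADY_RESTARTED", "1", 1);   /* guard so this demo doesn't exec in a loop */
        restart_self(argv);
    }
    /* ... reload the saved state and resume serving ... */
    return 0;
}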
At the very least, you have to make sure that the interface of the library does not change between versions. If that is assured, then I would try looking into dynamically loading the libraries with dlopen/dlsym and see if dlclose allows you to re-load.
I've never done any of this myself, but that's the path I'd pursue first. If you go this way, could you publish the results?
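For what it's worth, the dlopen route looks roughly like the sketch below (libplugin.so and do_work are made-up names). The big caveats are that dlclose only unmaps the library once its reference count drops to zero, and that every function pointer obtained from the old handle is invalid after the reload:

/* build with: gcc reload.c -o reload -ldl */
#include <dlfcn.h>
#include <stdio.h>

typedef int (*do_work_fn)(int);

int main(void)
{
    void *h = dlopen("./libplugin.so", RTLD_NOW);
    if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    do_work_fn do_work = (do_work_fn)dlsym(h, "do_work");
    if (do_work)
        printf("old library says: %d\n", do_work(42));

    dlclose(h);                                  /* drop the old mapping... */
    h = dlopen("./libplugin.so", RTLD_NOW);      /* ...and map whatever is on disk now */
    if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    do_work = (do_work_fn)dlsym(h, "do_work");   /* re-resolve every symbol */
    if (do_work)
        printf("new library says: %d\n", do_work(42));

    dlclose(h);
    return 0;
}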
Linux provides several dynamic loader interfaces, and a process can load dynamic libraries while running. dlopen and dlsym, provided by Linux, may solve your problem.
If you expect libraries to change on a fairly regular basis, and you expect to maintain up-time, I think that your system should be re-engineered so that such libraries actually become loosely coupled components (e.g. services).

Having said that, my answer to the question is yes: under certain circumstances, it is possible to update shared libraries without restarting processes. In most cases I expect it is not possible, for instance when the API of your library changes, when the arrangement of your data segment changes, or when the library maintains internal threads. The list is quite long.

For very small bug fixes to the code, you can still make use of ptrace to write to the process memory space, and from there redo what /lib/ld-linux.so does in terms of dynamic linking. Honestly, it is an extremely complex activity.
Running ldd on the binary of your process is one way to find out which libraries it uses. Although it is theoretically possible, it is not advisable to tinker with the running process, though I am sure utilities exist, such as ksplice, that tinker with running Linux kernels.
You can simply upgrade, and the running process will continue with the old version and pick up the new version when it restarts, assuming that your package management system is good and knows what is compatible to install.
You might want to learn about shared library versioning and the ld -h option.
One way to use it is as follows:
You maintain a version counter in your build system. You build the shared library with:
ld ..... -h mylibrary.so.$VERSION
however, you put it in your dev tree's lib as just plain mylibrary.so. (There's also a hack involving putting the entire .so into a .a file).
Now, at runtime, processes using the library look for the fully-versioned name. To roll out a new version, you just add the new version to the picture. Running programs linked against the old version continue to use it. As programs are relinked and tested against the new one, you roll out new executables.
Sometimes you can upgrade an in-use .so, and sometimes you cannot. This depends mostly on how you try to do it, but also on the safety guarantees of the kernel you're running on.
Don't do this:
cat new.so > old.so
...because eventually, your process may try to demand page something, and find that it's not in the correct spot anymore. It's a problem because the addresses of things may change, and it's still the same inode; you're just overwriting the bytes in the file.
However, if you:
mv new.so old.so
You'll be OK on most systems, because your running processes can hold onto a now-unnamed inode for the old library, while new invocations of your processes get the new file.
BUT, some kernels don't like to let you mv an in-use .so, perhaps out of caution, perhaps for their own simplicity.
Is there a way to write C code that allows us to determine whether a previous instance of an application is already running? I need to check this in a portable way for Linux and Windows, both using the latest version of GCC available.
Any examples of portable code would be of enormous help. I see two options now:
Check the process list. Here Linux has good tools, but I don't think the same functions apply to Windows. Maybe some GNU libraries for both OSes? Which libraries, or functions?
Save and lock a file. Now, how to do that in a way that both systems understand? One problem is where to save the file: path trees are different on each system. Also, if a relative path is chosen, two instances could still run with different locked files in different directories.
Thanks!
Beco.
PS. The OSes have different requirements, so if you know one and not the other, please answer anyway. After all, if there is no single portable way, I may still be able to use #ifdef and the code proposed in the answers.
C language (not c++), console application, gcc, linux and windows
Unfortunately, if you limit yourself to C, you may have difficulty doing this portably. With C++, there's Boost.Interprocess's named_mutex, but in C you will have to either (see the combined sketch after this list):
UNIXes (including Mac OS): Open and flock a file somewhere. Traditionally you will also write your current PID into this file. NOTE: This may not be safe on NFS; but your options are extremely limited there anyway. On Linux you can use a /dev/shm path if you want to ensure it's local and safe to lock.
Windows: Open and lock a named mutex
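A hedged sketch that combines the two behind an #ifdef (the lock path and mutex name are made up for the example; the POSIX lock is only held for as long as the descriptor stays open, so it is leaked on purpose):

#include <stdio.h>

#ifdef _WIN32
#include <windows.h>

static int already_running(void)
{
    /* The handle is intentionally kept open for the life of the process. */
    CreateMutexA(NULL, FALSE, "Global\\myapp_single_instance");
    return GetLastError() == ERROR_ALREADY_EXISTS;
}
#else
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

static int already_running(void)
{
    int fd = open("/tmp/myapp.lock", O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return 0;                           /* cannot tell; assume we're the first */
    if (flock(fd, LOCK_EX | LOCK_NB) < 0)
        return 1;                           /* someone else already holds the lock */
    char buf[32];
    int n = snprintf(buf, sizeof(buf), "%ld\n", (long)getpid());
    if (n > 0 && write(fd, buf, (size_t)n) < 0) { /* recording the PID is traditional, not required */ }
    return 0;                               /* keep fd open (and locked) until exit */
}
#endif

int main(void)
{
    if (already_running()) {
        fprintf(stderr, "another instance is already running\n");
        return 1;
    }
    puts("first instance, carrying on");
    /* ... */
    return 0;
}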
For Windows, a mutex works well.
http://msdn.microsoft.com/en-us/library/ms682411(v=vs.85).aspx
The article also mentions an alternative to a mutex:
To limit your application to one instance per user, create a locked file in the user's profile directory.
The sort-of canonical method in Unixland is to have the process write its own PID to a file in a known location. If this file exists, the program reads the PID stored there; if it differs from its own PID (available by system call) and that process is still alive (which you can check with kill(pid, 0)), you know that another instance is running.
C does not provide built-in facilities to check whether an application is already running, so making this cross-platform is difficult/impossible. However, on Linux, one can use IPC. And on Windows (I'm not very experienced in this category), you may find this helpful.
I'm new to windows programming and I'm trying to get notified of all changes to the file system (similar to the information that FileMon from SysInternals displays, but via an API). Is a FindFirstChangeNotification for each (non-network, non-substed) drive my best bet or are there other more suitable C/C++ APIs?
FindFirstChangeNotification is fine, but for slightly more ultimate power you should be using ReadDirectoryChangesW. (In fact, it's even recommended in the documentation!)
It doesn't require a function pointer, but it does require you to manually decode a raw buffer, and it uses Unicode file names; overall it is generally better and more flexible.
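For illustration, a minimal synchronous sketch of that decoding loop (the watched path is made up; a real monitor would use the OVERLAPPED/completion-port form so it never blocks and misses fewer changes while processing):

#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE dir = CreateFileW(L"C:\\some\\folder", FILE_LIST_DIRECTORY,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        NULL, OPEN_EXISTING, FILE_FLAG_BACKUP_SEMANTICS, NULL);
    if (dir == INVALID_HANDLE_VALUE)
        return 1;

    DWORD buf[16384];                                /* DWORD-aligned 64 KB buffer */
    DWORD bytes;
    while (ReadDirectoryChangesW(dir, buf, sizeof(buf), TRUE /* recursive */,
            FILE_NOTIFY_CHANGE_FILE_NAME | FILE_NOTIFY_CHANGE_DIR_NAME |
            FILE_NOTIFY_CHANGE_LAST_WRITE,
            &bytes, NULL, NULL)) {                   /* blocks until something changes */
        FILE_NOTIFY_INFORMATION *fni = (FILE_NOTIFY_INFORMATION *)buf;
        for (;;) {                                   /* walk the packed records */
            wprintf(L"action %lu on %.*ls\n", (unsigned long)fni->Action,
                (int)(fni->FileNameLength / sizeof(WCHAR)), fni->FileName);
            if (fni->NextEntryOffset == 0)
                break;
            fni = (FILE_NOTIFY_INFORMATION *)((BYTE *)fni + fni->NextEntryOffset);
        }
    }
    CloseHandle(dir);
    return 0;
}

File names in the records are relative to the watched directory.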
On the other hand, if you want to do what FileMon does, you should probably do what FileMon does and use IFS to create and install a file system filter.
There are other ways to do it, but most of them involve effort on your part (or take performance from your app, or you have to block a thread to use them, etc). FindFirstChangeNotification is a bit complicated if you're not used to dealing with function pointers, etc, but it has the virtue of getting the OS to do the bulk of the work for you.
Actually FileSystemWatcher works perfectly with shared network drives. I am using it right now in an application which, among other things, monitors the file system for changes. (www.tabbles.net).
You can use the FileSystemWatcher class. It is very efficient but cannot work with network shared drives.