I have a process that seems to be hanging on Solaris. I tried to attach to it with GDB to see what it is doing, but had no luck.
There is no error that I can see; it is just sitting there...
Are there any other tools or techniques I can use to see what the process is stuck on?
Thanks for the help
Lynton
pstack <pid> will print what all the threads within the process are doing (full stack traces, including function names if your binary is not stripped).
truss is the Solaris equivalent of Linux's strace. It shows all the system calls the process makes, which may help you debug.
DTrace is a great debugging swiss-army-knife that can show you pretty much anything you can think of. The downside is that it needs to be run with root permissions on a global zone. It takes some time to learn, but it's time well worth spending.
Use the powerful DTrace facility.
Here is a short introduction to tracing user processes.
I was looking through stackoverflow for the best profiling technique.
I have a bunch of processes running 24/7, written in C and using Oracle 10g. I have discovered several tools I want to try: oprofile, strace, systemtap and dtrace.
I want to start with dtrace, so I am looking for a simple dtrace script that will attach to a running process and print out all function calls, the time spent in each, and maybe a call graph. Please suggest a good script to start with, or any links, tutorials or manuals.
Simple. No DTrace on Linux (last I heard).
If you crave DTrace and are willing to give a real operating system a try (uh-oh, flamebait :-), try FreeBSD, which comes with a functional and integrated DTrace.
I want to record the system calls (including parameters) that an application makes, from inside the kernel. Somebody told me I can hook all system calls or hook sysenter, but I don't know how to do it.
By the way, I have tried the strace utility, but it showed more system calls than I expected. For example, I built a program that contains only the "open, lseek, read, write and close" system calls for a simple file operation, but strace also reported others, such as "access" and "fstat64". Why?
strace is going to be a much easier way to go.
The extra system calls you're seeing are those performed by your process before your code takes control - for example, the dynamic loader loading the libc library.
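To see this for yourself, here is a minimal sketch of a helper that issues only those five calls. The name copy_range and its parameters are made up for illustration. If you call it from main and run the program under strace, anything outside open, lseek, read, write and close (or their modern variants such as openat) comes from startup code rather than from your own code.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copies up to `len` bytes, starting at `offset`,
 * from `src` to `dst` using only open, lseek, read, write and close.
 * Returns the number of bytes copied, or -1 on error. */
ssize_t copy_range(const char *src, const char *dst, off_t offset, size_t len)
{
    char buf[4096];
    ssize_t total = 0;

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }
    if (lseek(in, offset, SEEK_SET) == (off_t)-1)
        goto fail;

    while (len > 0) {
        size_t want = len < sizeof buf ? len : sizeof buf;
        ssize_t got = read(in, buf, want);
        if (got < 0)
            goto fail;
        if (got == 0)
            break; /* end of file */

        /* write() may be partial, so loop until the chunk is out. */
        ssize_t done = 0;
        while (done < got) {
            ssize_t w = write(out, buf + done, (size_t)(got - done));
            if (w < 0)
                goto fail;
            done += w;
        }
        total += got;
        len -= (size_t)got;
    }
    close(in);
    close(out);
    return total;

fail:
    close(in);
    close(out);
    return -1;
}
```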
You might want to try attaching strace to a running process
strace -p pid
It might be a good idea to run the program, have it wait for an event, attach to it and then trigger the event.
Cheers!
There are numerous ways to trace system calls, both from user space (strace) and from kernel space. I would recommend starting with strace and using it as long as it suits your needs. The other solutions have a steeper learning curve.
To address your need to filter the output of strace, use the -e option. See man strace for instructions on using it to limit what you are capturing.
I have a very nice idea for a kernel patch, and I want to conduct some research and see code examples before I shape my idea.
I'm looking for interesting code examples that would demonstrate advanced usage of procfs (the Linux /proc file system). By interesting, I mean more than just reading a documented value.
My idea is to provide every process with an easy broadcast mechanism. For example, consider a process that runs multiple instances of rsync and wants to check the transfer status (how many bytes have been transferred so far) for each child. Currently, I don't know of any way that can be done.
I intend to provide the process with a minimal interface to write data to the procfs. That data would be placed under the PID directory. For example:
/procfs/1343/data_transfered/incoming
I can think of numerous advantages to this, mainly in the concurrency field.
By the way, if such a mechanism already exists, do tell...
Yes, I've written stuff that pokes around in /proc. I suspect you are unlikely to get Linux kernel patches accepted that do anything with proc, unless they just fix something already there that was broken in some way.*
sysfs (/sys) seems to be where things are moving.
/proc was originally for process information, but a lot of misc. driver stuff ended up in there.
*well, maybe they'll take it if whatever you're doing has to do with processes, and isn't in a driver.
Go look at the source code of the procps package for code that uses /proc.
http://github.com/tialaramex/leakdice/tree/master
Uses proc to figure out the memory address layout of a process, and dump random pages from its heap (for reasons which are explained in its documentation).
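As a rough illustration of reading a process's memory layout from /proc (this is my own minimal sketch, not code taken from leakdice), the hypothetical helper find_mapping below scans /proc/self/maps for a named region such as "[heap]" or "[stack]" and returns its address range. It is Linux-specific and assumes the usual "start-end perms ..." line format.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: finds the first line of /proc/self/maps that
 * contains `name` (for example "[heap]" or "[stack]") and stores the
 * mapping's address range in *start and *end.
 * Returns 0 if found, -1 otherwise. */
int find_mapping(const char *name, unsigned long *start, unsigned long *end)
{
    char line[512];
    int found = -1;

    FILE *fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof line, fp) != NULL) {
        /* Each line starts with "start-end" in hexadecimal. */
        if (strstr(line, name) != NULL &&
            sscanf(line, "%lx-%lx", start, end) == 2) {
            found = 0;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

To inspect another process you would read /proc/<pid>/maps instead, subject to the usual permission checks.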
Hi, I have recently been working on a Linux project written in C.
The app has several processes that share a block of shared memory. After the app has been running for several hours, one process crashes without leaving any trace, so it is very difficult to know what the problem was or where to start reviewing the code.
It could be a memory overflow or a misused pointer, but I don't know exactly.
Do you know of any tools or methods to detect such problems?
I would really appreciate it if this gets resolved. Thanks for your advice.
Before you start the program, enable core dumps:
ulimit -c unlimited
(and make sure the working directory of the process is writeable by the process)
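If you cannot control the shell that starts the program, the limit can also be raised from inside the process. Below is a minimal sketch using getrlimit/setrlimit; the function name enable_core_dumps is my own. Note that it only raises the soft limit up to the hard limit, so it matches ulimit -c unlimited only when the hard limit is itself unlimited.

```c
#include <sys/resource.h>

/* Hypothetical helper: raises the soft core-file size limit to the
 * hard limit for the calling process (and its future children).
 * Returns 0 on success, -1 on error. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0)
        return -1;

    /* An unprivileged process may not exceed the hard limit. */
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim);
}
```

Calling it once early in main should be enough.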
After the process crashes, it should leave behind a core file, which you can then examine with gdb:
gdb /some/bin/executable core
Alternatively, you can run the process under gdb when you start it - gdb will wake up when the process crashes.
If you are running Emacs, you could also run gdb in gdb-many-windows mode, which gives you better debugging options and lets you examine things like the stack. It is much like the Visual Studio IDE.
Here is a useful link:
http://emacs-fu.blogspot.com/2009/02/fancy-debugging-with-gdb.html
Valgrind is where you need to go next. Chances are that you have a memory misuse problem which is benign -- until it isn't. Run the programs under valgrind and see what it says.
I agree with bmargulies -- Valgrind is absolutely the best tool out there to automatically detect incorrect memory usage. Almost all Linux distributions should have it, so just emerge valgrind or apt-get install valgrind or whatever your distro uses.
However, Valgrind is hardly the least cryptic thing in existence, and it usually only helps you tell where the program eventually ended up accessing memory incorrectly -- if you stored an incorrect array index in a variable and then accessed it later, then you will still have to figure that out. Especially when paired with a powerful debugger like GDB, however (the backtrace or bt command is your friend), Valgrind is an incredibly useful tool.
Just remember to compile with the -g flag (if you are using GCC, at least), or Valgrind and GDB will not be able to tell you where in the source the memory abuse occurred.
I want a C program to produce a core dump under certain circumstances. This is a program that runs in a production environment and isn't easily stopped and restarted to adjust other kinds of debugging code. Also, since it's in a production environment, I don't want to call abort(). The issues under investigation aren't easily replicated in a non-production environment. What I'd like is for the program, when it detects certain issues, to produce a core dump on its own, preferably with enough information to rename the file, and then continue.
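#include <stdlib.h>
#include <unistd.h>

void create_dump(void)
{
    if (fork() == 0) {
        /* Child: crash in your favorite way here. abort() raises
         * SIGABRT, which dumps core when core dumps are enabled. */
        abort();
    }
}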
Fork the process, then crash the child. It will give you a snapshot whenever you want, while the parent keeps running.
Another way might be to use the Google Coredumper library. This creates a similar result to the fork+abort technique but plays nicer with multithreaded apps (suspends all threads for a little while before forking so that they don't make a mess in the child).
Example:
#include <google/coredumper.h>
...
WriteCoreDump("core.myprogram");
/* Keep going, we generated a core file,
* but we didn't crash.
*/
Sun describes how to get a core file on Solaris, HP-UX, Redhat, and Windows here.
Solaris has the gcore program. HP-UX may have it.
Otherwise use gdb and its gcore command.
Windows has win-dbg-root\tlist.exe and win-dbg-root\adplus.vbs
Do you really want a core, or just a stack trace?
If all you want is a stacktrace you could take a look at the opensource here and try and integrate the code from there, or maybe just calling it from the command line is enough.
I believe some code in the gdb project might also be useful.
Another thing you might want to do is use gdb to attach to a running process.
$ gdb /path/to/exec 1234 # 1234 is the pid of the running process
The source code to produce a core dump is in 'gcore', which is part of the gdb package.
Solaris also has gcore.
Also, you have to have a separate process running the core dump, as the current process must be suspended. You'll find the details in the gcore source, or you can just run your platform's gcore with your process as the target.
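Here is a rough sketch of the "separate process" approach, assuming your platform's gcore accepts the target pid as its only argument. The helper name dump_self_with is hypothetical, and the tool name is a parameter so that you can substitute whatever your system provides. On Linux systems that restrict ptrace, a child may not be allowed to attach to its parent, in which case the tool will fail and you will see a non-zero exit status.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `tool <our pid>` in a child process and
 * waits for it. Returns the tool's exit status (127 if it could not
 * be executed), or -1 if fork/wait failed or the tool was killed. */
int dump_self_with(const char *tool)
{
    char target[32];
    int status;

    /* Format our own pid before forking; the child dumps the parent. */
    snprintf(target, sizeof target, "%ld", (long)getpid());

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execlp(tool, tool, target, (char *)NULL);
        _exit(127); /* exec failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call such as dump_self_with("gcore") would then leave the core wherever that tool writes it by default.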