How to find where a process is stuck using DDD - c

I have a TCP Svr process written in C and running on CentOS 5.5. It acts as a TCP Server for external clients and also does some IPC communication with other processes in the system using Unix Domain Sockets it has establised. It's not a multi threaded process. It does one task at a time. There's an epoll_wait() I use to listen for requests on either the TCP socket or any of the IPC sockets it has established with internal processes. When the epoll_wait() function breaks,I process the request for whoever it is and then go back into epoll_wait()
I have a TCP Client that connects to this Process from outside (not IPC). It connects sucessfully, sends a request msg, gets a response back. I've put this in an infinite loop
just to test out its robustness etc.
After a while, the TCP Server stops responding to requests coming from TCP Client. The TCP client connects successfully, sends a request message, but it doesnt get any response msg back from the TCP server.
So I reckon the TCP server is stuck somewhere else, trying to do something and has not returned to the epoll_wait() to process
other requests coming in. I've tried to figure it out using logs, but thats not helping me understand where exactly the process is stuck.
So I wanted to use any debugger that can give me some information (function name would be great), as to what the process is doing. Putting breakpoints, is overwhelming cause the TCP Server process has tons of files and functions....
I'm trying to use DDD on CentOS 5.5, to figureout whats going on. I attach to the process successfully. Then I click on "Step" or "Stepi" or "Next" button....
but nothing happens....
btw when I use Eclipse for debugging, and attach to this process (or any process), I always get "__kernel_vsyscall()"....Does this mean, the program breaks by default at
whatever its doing? If thats the case, how do I come out of the __kernel_vsyscall() call, to continue within my program? If I press f8, it comes out, but then I dont know where it was, since I loose the stack trace....Like I said earlier. Since I cant figure where it was, I dont know where to put breakpoint....
In summary, I want to figureout where my process is stuck or what its doing and try to debug from that point on....
How do I go about this?
Thanks
Amit

1) Attaching to a C process can often cause problems in itself, is there any way for you to start the process in the debugger?
2) Using the step functions of DDD need to be done after you've set a breakpoint and the program is stopped on a command. From reading your question, I'm not sure you've done that. You may not want to set many breakpoints, but is setting one or two in critical sections of code possible?

In summary, What I wanted to accomplish was to be able to find where my program is stuck, when it hangs. I figured it out - It was so simple. Create a configuration in Eclipse ...."Debug Configurations->C/C++ attach to application"...
Let the process run normally from shell (preferably with a terminal attached). When it hangs, open eclipse, click on the debug icon and run the configured process. It'll ask you to attach to a process. Look for your process name and attach to it.
Now, just look at the entire stack trace....you'll see some of your own function calls mixed with kernel function calls. That tells you where the program is stuck.

Related

mod_fcgid windows named pipes 2nd and subsequent requests

All, I'm building a fastCGI interface for a programming language which is run-time based, and runs in Windows, Linux and UNIX environments.
I've implemented the fastCGI protocol within program code that runs within that run-time but I'm having an issue with Windows code which talks to mod_fcgid. In this case I cannot use prebuilt dll's to expose fastCGI functions, but I can make calls to most C functions from within that runtime language. I cannot modify the runtime as it belongs to another company... Think of it as being a php or perl like language. What I'm trying to do is similar to creating a dll like set of code (its not a dll) to process a fastCGI request. While many would say I'm "re-inventing the wheel" here, I have no choice of using someone's prebuilt dll to provide the interface to fastCGI.
I've successfully implemented everything and I can get the initial request, and respond back with a web page thru my fastCGI interface. The problem I am having is dealing with the "next" request when running under Windows. My code when running under Linux works terrific, I accept() the socket, read() write() to do my processing, close() the socket, and then go back and accept() again, and I get the next request and everything processes perfectly.
Under Windows mod_fcgid uses named pipes. Within my code I use GetStdHandle() to get a handle to stdin, then use ReadFile() and WriteFile() and the data wrapped in the fastCGI protocol goes to mod_fcgid and then on to the browser when done with a request I use CloseHandle() and then am looping back to GetStdHandle() to wait for the next request.
Everything works perfectly for the first request, the browser gets my cgi output. The same code under Linux using sockets, gets the 2nd and subsequent requests and works like a charm.
My issues is: when running under Windows, after I process the first request, I cannot get mod_fcgid to send me a second request. It will end up killing my windows process and starts a new one in its place. Which of course is not what I want.
I must being doing something wrong between the time I send the fastCGI EndRequest and when I loop around to wait for the next request to come in.
To get the initial request from mod_fcgid, I use GetStdHandle() then I use ReadFile() and WriteFile() (all from kernel32.dll) and when I've finished the protocol with its EndRequest, I cannot get the code right to be able to receive a second request.
I've tried fflush() I've tried FileFlushBuffers(), I've tried not closing the handle I was given from GetStdHandle(), I simply cannot figure out what mod_fcgid needs from my windows app so that I can receive the second and subsequent requests.
After the first request, closing my handle, getting a handle to stdin from GetStdHandle, and then sitting on ReadFile, The ReadFile() comes back with 0 bytes, and GetLastError() always returns 6 (Invalid Handle).
I simply cannot figure out the C functions to use to cleanup after the first request is complete and to be able to wait for the next request to come in when running under Windows. As I've said before the code works perfect under Linux when using sockets instead of Windows which is using a handle from STDIN that is a named pipe.
Harry, your comment on using the NamedPipe functions did the trick, I needed to use FileFlushBuffers followed by a DisconnectNamedPipe at the end of the first request, and then a ConnectNamedPipe to wait for the next request to come in. Thank you again.

Running two socket applications within one program

I have two C/C++ socket programs, say server and client, and both communicate to each other through read and write. The entire flow works fine (i.e., communication, read, write) when I run the two programs on two separate terminals in localhost. To avoid manually starting the client program, I use system(exec_cmd_to_run_client_program) in my server program. However, doing so doesn't give me the correct result as that of two separate terminals. I do see server and client running in the job monitor, but the communication in between seems never happens. What could be the problem?
Also I tried using ssh library libssh in the server program to open a new ssh session and send execution command to run the client program. Again I see the same result as system call. Both programs showed up in the job monitor but communication never happens. Did I miss something?

In C on Linux, how would I go about using 2 programs, the latter sending text data to the first displayed using stdout?

I am writing a simple instant messenger program in C on Linux.
Right now I have a program that binds a socket to a port on the local machine, and listens for text data being sent by another program that connected to my local machine IP and port.
Well, I can have this client send text data to my program, and have it displayed using stdout on my local machine; however, I cannot program a way to send data back to the client machine, because my program is busy listening and displaying the text sent by the client machine.
How would I go about either creating a new process (that listens and displays the text sent to it by the client machine, then takes that text and sends it to the other program's stdout, while the other program takes care of stdin being sent to the client machine) or create 2 programs that do the separate jobs (sending, receiving, and displaying), and sends the appropriate data to one another?
Sorry if that is weirdly worded, and I will clarify if need be. I looked into exec, execve, fork, etc. but am confused as to whether this is the appropriate path to look in to, or if there is a simpler way that I am missing.
Any help would be greatly appreciated, Thank you.
EDIT: In retrospect, I figured that this would be much easier accomplished with 2 separate programs. One, the IM server, and the others, the IM clients.
The IM Clients would connect to the IM server program, and send whatever text they wanted to the IM server. Then, the IM server would just record the data sent to it in a buffer/file with the names/ip's of the clients appended to the text sent to it by each client, and send that text (in format of name:text) to each client that is connected.
This would remove the need for complicated inter-process/program communication for stdin and stdout, and instead, use a simple client/server way of communicating, with the client programs displaying text sent to it from server via stdout, and using stdin to send whatever text to the server.
With this said, I am still interested in someone answering my original question: for science. Thank you all for reading, and hopefully someone will benefit from my mental brainstorming, or whatever answers come from the community.
however, i cannot program a way to send data back to the client machine, because my program is busy listening and displaying the text sent by the client machine.
The same socket that was returned from a listening-socket by accept() can be used for both sending and receiving data. So your socket is never "busy" just because you're reading from it ... you can write back on the same socket.
If you need to both read and write concurrently, then share the socket returned from accept() across two different threads. Since two different buffers are being used by the networking stack for sending and receiving on the socket, a dedicated thread for reading and another dedicated thread for writing to the socket will be thread-safe without the use of mutexes.
I would go with fork() - create a child process and now you have two different processes that can do two different things on two different sockets- one can receive and the other can send. I have no personal experience with coding a client/server like this yet, but that would be my first stab at solving your issue...
As #bdonlan mentioned in a comment, you definitely need a multiplexing call like select or preferably poll (or related syscalls like pselect, ppoll ...). These multiplexing calls are the primitive to wait on several channels at once (with pselect and ppoll able to atomically wait for both I/O events and signals). Read also the select tutorial man page. Of course, you can wait for several file descriptors, and you can wait for both reading & writing abilities (even on the same socket, if needed), in the same select or poll syscall.
All event-based loops and frameworks are using these multiplexing calls (like poll or select). You could also use libevent, or even (particularly when coding a graphical user interface application) some GUI toolkit like Gtk or Qt, which are all based around a central event loop.
I don't think that having a multi-process or multi-threaded application is useful in your case. You just need some event loop.
You might also ask to get a SIGIO signal when data arrives on your socket using fcntl with F_SETOWN, but this is not very useful for you. Then you often want to have your socket non-blocking.

Arbitrary two-way UNIX socket communication

I've been working on a complex server-client system in C and I'm not sure how to implement the socket communication.
In a nutshell, the system is a server application which communicates with a database and uses a UNIX socket to communicate with one or more child processes created with fork(). The purpose of the children is to run game servers. The process of launching a game server is like this:
The server/"manager" identifies a game server in the database that is to be made. (Assume database communication is already sorted.)
The manager forks a child (the "game controller").
The game controller sets up two pipe pairs, then forks, replacing its child's stdin with a pipe, and it's stdout and stderr with another pipe.
The game controller's child then runs execlp() to begin running the actual game server executable.
My experience with sockets is fairly minimal. I have used select() on a server application before to 'multiplex' numerous clients, as demonstrated by the simple example in the GNU C documentation here.
I now have a new challenge, as the system must be able to do more: the manager needs to be able to arbitrarily send commands to the game controller children (that it will find by periodically checking the database) and get replies, but also expect incoming arbitrary commands/errors from them and send replies back.
So, I need a sort-of "context" system, where sockets are meaningful only between themselves. In other words, when a command is sent from the manager to the game controller, each party needs to be aware of who is asking and know what the reply is (and, therefore, which command it is a reply to).
Because select() is only useful for knowing when we have incoming data, and a thread should block on it, would I need another thread that sends data and gets the replies? Will this require each game controller, although technically a 'client', to use a listening socket and use select() as well?
I hope I've explained the system and the problem concisely; I will add more detail if required. Thanks!
Ok, I am still not really sure I understand exactly where your trouble is, so I will just spout off some things about writing a client/server app. If I am off track, just let me know.
The way that the server will know which clients corresponds to which socket is that the clients will tell the server. Essentially, you need to have a log-in protocol. When the game controller connects to the server, it will send a message that says "Hi, i am registering as controller foo1 on host xyz, port abc..." and whatever else the server needs to know about its clients. The server will keep a data structure that maps sockets to client metadata, state, etc. Whenever it gets a new message, it can easily map from the incoming host/port to its metadata. Or your protocol can require that on each incoming message, the will client send the name it registered with as a field.
Handling the request/response can be done several ways. First lets deal with the networking part of it on the server side. One way to manage this, as you mentioned, is by using select (or poll, or epoll) to multiplex the sockets. This is actually usually considered the more complicated way to do things. Another way is to spawn off a thread (or fork a process, which is less common these days) for each incoming client. Each spawned thread can read its own assigned socket, responding to messages one at a time without worrying about the fact that there are other clients besides the own it is dealing with. This simple one to one thread to socket model breaks down if there are many clients, but if that is not the case, then it is worth consideration.
Part 2 really covers only the client sending the server a message, and the server replying. What happens when the server wants to initiate communication? How does it do it and how does the client handle it? Also, how do you model the model the communication at the application level, meaning assuming we have the read/write part down, how do we know what to send? You will probably want to model things in terms of state machines. There is also a lot more to deal with like what happens when a client crashes? What about when the server crashes? Also, what if you really have your heart set of using select, perhaps because you expect many client? I will try to add more to this answer tomorrow.

Cleanest way to stop a process on Win32?

While implementing an applicative server and its client-side libraries in C++, I am having trouble finding a clean and reliable way to stop client processes on server shutdown on Windows.
Assuming the server and its clients run under the same user, the requirements are:
the solution should work in the following cases:
clients may each feature either a console or a gui.
user may be unprivileged.
clients may be or become unresponsive (infinite loop, deadlock).
clients may or may not be children of the server (direct or indirect).
unless prevented by a client-side defect, clients shall be allowed the opportunity to exit cleanly (free their ressources, sync some data to disk...) and some reasonable time to do so.
all client return codes shall be made available (if possible) to the server during the shutdown procedure.
server shall wait until all clients are gone.
As of this edit, the majority of the answers below advocate the use of a shared memory (or another IPC mechanism) between the server and its clients to convey shutdown orders and client status. These solutions would work, but require that clients successfully initialize the library.
What I did not say, is that the server is also used to start the clients and in some cases other programs/scripts which don't use the client library at all. A solution that did not rely on a graceful communication between server and clients would be nicer (if possible).
Some time ago, I stumbled upon a C snippet (in the MSDN I believe) that did the following:
start a thread via CreateRemoteThread in the process to shutdown.
had that thread directly call ExitProcess.
Unfortunately now that I'm looking for it, I'm unable to find it and the search results seem to imply that this trick does not work anymore on Vista. Any expert input on this ?
If you use thread, a simple solution is to use a named system event, the thread sleeps on the event waiting for it to be signaled, the control application can signal the event when it wants the client applications to quit.
For the UI application it (the thread) can post a message to the main window, WM_ CLOSE or QUIT I forget which, in the console application it can issue a CTRL-C or if the main console code loops it can check some exit condition set by the thread.
Either way rather than finding the client applications an telling them to quit, use the OS to signal they should quit. The sleeping thread will use virtually no CPU footprint provided it uses WaitForSingleObject to sleep on.
You want some sort of IPC between clients and servers. If all clients were children, I think pipes would have been easiest; since they're not, I guess a server-operated shared-memory segment can be used to register clients, issue the shutdown command, and collect return codes posted there by clients successfully shutting down.
In this shared-memory area, clients put their process IDs, so that the server can forcefully kill any unresponsive clients (modulo server privileges), using TerminateProcess().
If you are willing to go the IPC route, make the normal communication between client and server bi-directional to let the server ask the clients to shut down. Or, failing that, have the clients poll. Or as the last resort, the clients should be instructed to exit when the make a request to server. You can let the library user register an exit callback, but the best way I know of is to simply call "exit" in the client library when the client is told to shut down. If the client gets stuck in shutdown code, the server needs to be able to work around it by ignoring that client's data structures and connection.
Use PostMessage or a named event.
Re: PostMessage -- applications other than GUIs, as well as threads other than the GUI thread, can have message loops and it's very useful for stuff like this. (In fact COM uses message loops under the hood.) I've done it before with ATL but am a little rusty with that.
If you want to be robust to malicious attacks from "bad" processes, include a private key shared by client/server as one of the parameters in the message.
The named event approach is probably simpler; use CreateEvent with a name that is a secret shared by the client/server, and have the appropriate app check the status of the event (e.g. WaitForSingleObject with a timeout of 0) within its main loop to determine whether to shut down.
That's a very general question, and there are some inconsistencies.
While it is a not 100% rule, most console applications run to completion, whereas GUI applications run until the user terminates them (And services run until stopped via the SCM). Hence, it's easier to request a GUI to close. You send them the equivalent of Alt-F4. But for a console program, you have to send them the equivalent of Ctrl-C and hope they handle it. In both cases, you simply wait. If the process sticks around, you then shoot it down (TerminateProcess) and pray that the damage is limited. But your HDD can fill up with temporary files.
GUI application in general do not have exit codes - where would they go? And a console process that is forcefully terminated by definition does not exit, so it has no exit code. So, in a server shutdown scenario, don't expect exit codes.
If you've got a debugger attached, you generally can't shutdown the process from another application. That would make it impossible for debuggers to debug exit code!

Resources