At some point, my site, running on Apache2 with mod_wsgi, just stops processing requests. The connection to the server is maintained and the client waits for a response, but it is never returned by Apache. The server at this time is at 0% CPU, and nothing is being processed. I think Apache just sends the requests to a queue and never takes them out of it.
When I perform apache2ctl graceful, the problem is not resolved; only apache2ctl restart fixes it.
My site is a WSGI application with 4 instances of Pyramid and 2 instances of Zope 3. It normally runs fine and has no speed problems that I am aware of.
Versions:
Ubuntu 10.04
apache2 2.2.14-5ubuntu8.9
libapache2-mod-wsgi 2.8-2ubuntu1
It sounds like you are using embedded mode to run the multiple applications, and you are using third-party C extensions that have problems in sub-interpreters, resulting in potential deadlock. Alternatively, your code is internally deadlocking or blocking on external services and never returning, causing exhaustion of the available processes/threads.
For a start, you should look at using daemon mode, delegate each web application to a distinct daemon process group, and force each to run in the main interpreter.
See:
http://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide#Delegation_To_Daemon_Process
http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Python_Simplified_GIL_State_API
Otherwise, use the debugging tips described in:
http://code.google.com/p/modwsgi/wiki/DebuggingTechniques
to get stack traces showing what the application is doing.
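For illustration, the delegation might look something like this (the paths and group names here are hypothetical, and the process counts just mirror what you described):

WSGIDaemonProcess pyramid processes=4 threads=15
WSGIScriptAlias /pyramid /var/www/pyramid/app.wsgi

<Directory /var/www/pyramid>
    WSGIProcessGroup pyramid
    WSGIApplicationGroup %{GLOBAL}
</Directory>

# a second WSGIDaemonProcess group would be set up the same way for the Zope instances

WSGIApplicationGroup %{GLOBAL} is what forces the application to run in the main interpreter, avoiding the sub-interpreter problems with third-party C extensions.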
I understand that with the help of signals, we can send interrupts to running C programs and direct them to behave according to the handlers assigned. When Ctrl+C is pressed, SIGINT is delivered.
Currently I am running a setup in which I have two systems. Both have servers (discovery servers) capable of the multicast DNS extension (mDNS). I have made use of signals like SIGHUP and SIGKILL.
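For reference, we register the handlers roughly like this (a simplified sketch, not our actual code):

#include <signal.h>
#include <string.h>
#include <unistd.h>

static void on_sighup(int sig)
{
    (void)sig;
    /* deregister from the discovery server, clean up, etc. */
    const char msg[] = "SIGHUP received\n";
    write(STDERR_FILENO, msg, sizeof msg - 1);  /* async-signal-safe */
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sighup;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGHUP, &sa, NULL);

    /* Note: SIGKILL and SIGSTOP can never be caught or handled,
     * so no handler code runs when a process is killed with SIGKILL. */
    for (;;)
        pause();  /* wait for signals */
}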
I am trying to understand what SIGKILL and SIGHUP actually do here.
So on one of the systems, I have an extra server which registers with the local discovery server. This server is advertised throughout the network with the help of mDNS.
Now, through SIGHUP, I can see that closing the server's terminal is detected and the corresponding handler is executed.
But when I shut the system down, both the server and the discovery server of that system should go down. That is detected inconsistently (not always) through SIGHUP. I tried with SIGKILL, and the response is still the same. It is very unclear why this is happening.
Is it because mDNS uses UDP, and UDP is unreliable?
E.g.: Consider A and B as two systems. Both A and B have discovery servers running (capable of mDNS, meaning they can advertise server records throughout the network) which contain the list of servers running on their respective systems. An extra server (server E) running on A registers with the discovery server of A. Now the registered server records are also advertised. Because of mDNS, the discovery server on B also updates its cache with all the advertisements from system A. B, which can be seen as a client, runs a particular API to get the list of servers running on the network. All of this works fine.
If I close the terminal of the extra server on purpose, the SIGHUP handler works smoothly. I can see from the logs of both discovery servers that the extra server has been removed.
Now if I shut down system A unexpectedly, the terminals running the server applications should close. That can be handled by the SIGHUP handler. What I observe from the log of system B is that sometimes there is a clean removal of the servers, and sometimes only a partial one.
It is not clear why this is happening.
According to https://cloud.google.com/trace/docs/setup/php, the App Engine flexible environment for PHP can run a daemon that sends trace spans to Stackdriver in the background, rather than as part of request processing (which could increase response latency).
I am running Kubernetes Engine, but would still like to send trace requests in the background. Therefore:
Is it possible to run that batch daemon myself?
Out of curiosity, how does the Stackdriver PHP Exporter pass these spans to the daemon? I tried to search for that in the source code, but could not find out how it is done.
If #1 is not possible, is there another way to perform span sending in the background?
Stackdriver Trace with Google Cloud Run seems to cover a similar topic, but does not address how to run the daemon manually.
In case anyone else is looking for this, I was able to run the batch daemon as follows:
sudo -u www-data -E vendor/bin/google-cloud-batch daemon
Note that the daemon itself must be run as the same user as your “serving” PHP processes in order to access the SysV shared memory used by both, hence the sudo.
You will also need the PHP sysv and pcntl extensions enabled.
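In case it helps, on Kubernetes Engine one way to wire this up is a container entrypoint that starts the daemon in the background before the main PHP server process. A minimal sketch (apache2-foreground and all paths are assumptions about your image):

#!/bin/sh
# start the batch daemon in the background as the serving user
sudo -u www-data -E vendor/bin/google-cloud-batch daemon &
# hand off to the main PHP server process
exec apache2-foreground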
I know this should be obvious, but I have found far too many DIFFERENT answers, and the ones I've tried all fail (sometimes or all the time), so...
We are working on a service and some applications that run at startup on a Windows 10 computer that performs an automatic login. The service and applications require Windows sockets for TCP, UDP, and multicast. Most of the time, our programs fail because they get errors about the network not being ready and such. Currently, we work around this by adding a dumb, fixed-length delay before attempting to start, but we would prefer to start as soon as the network is ready to be used.
Our most recent attempt was to wait on the LanmanWorkstation (Workstation) service, but that generally reports it is running/ready before the socket functions will succeed. I have also seen suggestions to use LanmanServer (Server), Netman (Network Connections), or maybe even Tcpip (TCP/IP Protocol Driver), but I cannot find anything definitive. One would think this is a common requirement, so why does Microsoft make the information so difficult to find?
Ahem. Does anyone know a definitive method for a service or application to wait until Winsock functions will succeed before using them? Short of a spin wait on a failing Winsock function, of course!
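For concreteness, the kind of retry fallback we want to avoid looks roughly like the sketch below (the readiness probe here, resolving our own hostname to a non-loopback address, is just an illustrative heuristic; attempt counts and delays are arbitrary, and it links against ws2_32):

#include <winsock2.h>
#include <ws2tcpip.h>
#include <stdio.h>
#include <string.h>

/* Poll until resolving our own hostname yields a non-loopback address,
   i.e. until some interface has come up with a usable IP. */
static int wait_for_network(int max_attempts, DWORD delay_ms)
{
    char host[256];
    if (gethostname(host, sizeof host) != 0)
        return 0;

    for (int i = 0; i < max_attempts; i++) {
        struct addrinfo hints, *res = NULL;
        memset(&hints, 0, sizeof hints);
        hints.ai_family = AF_INET;
        if (getaddrinfo(host, NULL, &hints, &res) == 0) {
            for (struct addrinfo *p = res; p != NULL; p = p->ai_next) {
                struct sockaddr_in *sin = (struct sockaddr_in *)p->ai_addr;
                if (sin->sin_addr.s_addr != htonl(INADDR_LOOPBACK)) {
                    freeaddrinfo(res);
                    return 1;   /* a non-loopback address exists */
                }
            }
            freeaddrinfo(res);
        }
        Sleep(delay_ms);        /* not ready yet: wait and retry */
    }
    return 0;                   /* gave up */
}

int main(void)
{
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return 1;
    int ready = wait_for_network(60, 1000);  /* poll up to ~60 seconds */
    printf("network %s\n", ready ? "ready" : "not ready");
    WSACleanup();
    return ready ? 0 : 1;
}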
We use the rsyslogd daemon for logging debug messages in several applications. With full debug on, the server hangs in the openlog() call made by the application to log messages. It looks to be stuck on a lock inside the glibc library code.
Is this a known issue under load, or are we exposing some contention issue in the kernel? This is RHEL 6.4 running kernel 2.6.32-358.2.1 (i386).
The rsyslog version was 8.4.1, but we upgraded to the latest, 8.17, this afternoon and are still running tests to see if the problem continues.
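For context, the applications use the standard syslog(3) API; the pattern is roughly this (the ident and facility are placeholders):

#include <syslog.h>

int main(void)
{
    /* this is the call that hangs under heavy debug logging */
    openlog("myapp", LOG_PID | LOG_NDELAY, LOG_LOCAL0);
    syslog(LOG_DEBUG, "debug message %d", 42);
    closelog();
    return 0;
}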
I have a system running embedded Linux, and it is critical that it runs continuously. Basically, it is a process for communicating with sensors and relaying that data to a database and a web client.
If a crash occurs, how do I restart the application automatically?
Also, there are several threads doing polling (e.g., socket and UART communications). How do I ensure that none of the threads gets hung up or exits unexpectedly? Is there an easy-to-use watchdog that is threading-friendly?
You can seamlessly restart your process when it dies using fork and waitpid, as described in this answer. It does not cost any significant resources, since the OS will share the memory pages.
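A minimal sketch of that supervisor loop (run_service() is a placeholder for the real application entry point):

#include <sys/types.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>

static void run_service(void)
{
    /* placeholder: the real application's main loop goes here */
}

int main(void)
{
    for (;;) {
        pid_t pid = fork();
        if (pid == 0) {              /* child: become the worker */
            run_service();
            _exit(EXIT_SUCCESS);
        }
        if (pid < 0) {               /* fork failed: back off, retry */
            sleep(1);
            continue;
        }
        int status;
        waitpid(pid, &status, 0);    /* block until the child dies */
        /* child exited or crashed; loop around and restart it */
    }
}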
Which leaves only the problem of detecting a hung process. You can use any of the solutions pointed out by Michael Aaron Safyan for this, but an even easier solution would be to use the alarm syscall repeatedly, letting the signal terminate the process (set the handler up with sigaction accordingly). As long as you keep calling alarm (i.e. as long as your program is running normally), it will keep running; once you stop calling it, the signal will fire.
That way, no extra programs are needed, and only portable POSIX facilities are used.
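A sketch of that alarm-based watchdog (the timeout and do_work() are placeholders):

#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define WATCHDOG_TIMEOUT 10   /* seconds without progress before we die */

static void watchdog_expired(int sig)
{
    (void)sig;
    _exit(EXIT_FAILURE);      /* async-signal-safe; a supervisor can restart us */
}

static void do_work(void)
{
    /* placeholder for one iteration of real work; must finish in time */
    sleep(1);
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = watchdog_expired;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, NULL);

    for (;;) {
        alarm(WATCHDOG_TIMEOUT);  /* re-arm: cancels the previous alarm */
        do_work();                /* if this hangs, SIGALRM fires and we exit */
    }
}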
The gist of it is:
You need to detect if the program is still running and not hung.
You need to (re)start the program if the program is not running or is hung.
There are a number of different ways to do #1, but two that come to mind are:
Listening on a UNIX domain socket to handle status requests. An external application can then inquire as to whether the application is still OK. If it gets no response within some timeout period, the queried application can be assumed to have deadlocked or died.
Periodically touching a file with a preselected path. An external application can look at the timestamp of the file, and if it is stale, it can assume that the application is dead or deadlocked (see the sketch below).
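A sketch of the file-touch heartbeat (the path and staleness threshold are arbitrary):

#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <utime.h>

#define HEARTBEAT_FILE "/tmp/myapp.heartbeat"  /* hypothetical path */
#define STALE_AFTER 30                         /* seconds */

/* called periodically from the monitored application's main loop */
void touch_heartbeat(void)
{
    FILE *f = fopen(HEARTBEAT_FILE, "a");  /* create the file if missing */
    if (f)
        fclose(f);
    utime(HEARTBEAT_FILE, NULL);           /* bump mtime to "now" */
}

/* called by the external watchdog: nonzero means dead or hung */
int heartbeat_is_stale(void)
{
    struct stat st;
    if (stat(HEARTBEAT_FILE, &st) != 0)
        return 1;
    return (time(NULL) - st.st_mtime) > STALE_AFTER;
}

int main(void)
{
    touch_heartbeat();
    return heartbeat_is_stale();  /* 0 = fresh, since we just touched it */
}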
With respect to #2, killing the previous PID and using fork+exec to launch a new process is typical. You might also consider turning your application that runs "continuously" into an application that runs once, and then using cron or some other scheduler to continuously rerun that single-run application.
Unfortunately, watchdog timers and getting out of deadlock are non-trivial issues. I don't know of any generic way to do it, and the few that I've seen are pretty ugly and not 100% bug-free. However, tsan (ThreadSanitizer) can help detect potential deadlock scenarios and other threading issues at runtime.
You could create a cron job that checks from time to time whether the process is running, using start-stop-daemon.
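For example, a crontab entry along these lines (all paths hypothetical) would try to (re)start the process every minute and do nothing when it is already running:

* * * * * /sbin/start-stop-daemon --start --quiet --oknodo --pidfile /var/run/myapp.pid --make-pidfile --background --exec /usr/local/bin/myapp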
Use this script to run your application:
#!/bin/bash
while ! /path/to/program   # loop until the program exits successfully
do
    echo "restarting"      # nonzero exit status: run it again
done
You can also put this script in /etc/init.d/ so that it starts as a daemon.