I've got a conditional systemd service foo.service with condition ConditionDirectoryNotEmpty=/tmp/foo. It's a one-shot service to empty /tmp/foo.
This works correctly in isolation. If I start the service when /tmp/foo holds a file, the service runs and the file is removed. If not, the service is skipped. Either way, it runs in less than a second.
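For reference, the unit looks roughly like this (the ExecStart command here is only illustrative; the real one just empties the directory):
[Unit]
Description=Empty /tmp/foo
ConditionDirectoryNotEmpty=/tmp/foo
[Service]
Type=oneshot
ExecStart=/bin/sh -c 'rm -rf /tmp/foo/*'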
The problem arises when I try to start this service from an associated timer. The timer gets stuck in a broken state. systemctl list-timers shows that foo.timer has a NEXT entry in the past (!) and that it last fired one millisecond ago. Indeed, it appears to be continuously firing. That is, of course, not what the timer settings say:
[Timer]
OnUnitActiveSec=
OnUnitActiveSec=60s
Persistent=true
60 seconds is a lot more than one millisecond. The empty OnUnitActiveSec= line is intentional; it should clear any previously configured timer period.
Why is the timer going berserk? Why is it firing so often, and why is the next scheduled firing in the past? Most importantly, how do I run a service once a minute, but only if needed?
systemd version 215, on Debian 8 (Armbian 5.30)
There are apparently two parts to the fix, neither of which solves the problem on its own.
The timer should have OnBootSec=..., which defines when the service first runs after boot. OnUnitActiveSec=60s only defines the interval between timer events; it does not say when the timer fires for the first time.
Secondly, it appears that the service you're starting should be disabled. That doesn't make much sense to me either, but remember that a disabled service can still be started explicitly with systemctl start foo.service. It looks like an "enabled" service is merely one that is started automatically (typically at boot). That's not how this one-shot service should behave; it should only run in response to the timer.
The start of foo.service can still be controlled by conditions in the service itself; this is still the case when the service start request is initiated by a timer.
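Putting the two parts together, a timer along these lines should behave as intended (the OnBootSec value is an assumption; any sensible first-run delay will do):
[Timer]
OnBootSec=60s
OnUnitActiveSec=60s
foo.service itself stays disabled; only foo.timer gets enabled.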
I am working on an application project on Debian Linux that uses the software watchdog to monitor other services via the PID files those services create.
I am following the steps from http://linux.die.net/man/5/watchdog.conf
and installed it by
apt-get install watchdog
The underlying mechanism is that watchdog checks for the existence of the PID files configured in the /etc/watchdog.conf file.
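A sketch of the relevant part of that file, as I understand it from the man page (the monitored services and paths are just examples):
# /etc/watchdog.conf (excerpt)
watchdog-device = /dev/watchdog
interval        = 1
pidfile         = /var/run/crond.pid
pidfile         = /var/run/sshd.pid
max-load-1      = 24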
I have tested it by stopping a service with
service service-name stop
Watchdog then detects that the service is no longer running and reboots the system after a delay equal to the watchdog timeout period.
Now consider a headless product: if, say, a service's configuration files are corrupted, the system would keep rebooting endlessly without any notification to the end user.
What I practically expect is to learn the watchdog's status before it takes its reboot/halt/soft-restart action, so that the programmer can implement notification logic for the end user.
Alternatively, would it be possible to modify the watchdog init script in /etc/init.d/ to call a user program when the software watchdog is stopped, so that the programmer can maintain a counter in non-volatile memory and avoid endless reboots?
Beyond the above, I would like to know more about querying the status of this software watchdog (watchdog daemon). I have set it up to monitor services, CPU overload, temperature, etc., but I get no event before the watchdog acts, so I cannot tell whether the system restarted because a service went down, the CPU overheated, the CPU was overloaded, and so on.
A watchdog is designed as a last resort to rescue a system after it has failed beyond recovery. A hardware watchdog will physically reset the CPU, and is used to make sure that a system doesn't hang for long periods.
There is no way to receive a warning that this will happen in software because it's assumed that all software has failed.
If you need a solution that detects that a process is no longer responding, you should make that separate from the watchdog.
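For example, a very rough shell sketch of such a separate monitor (the service name, PID file and log path are placeholders), run from cron independently of the watchdog, so you get a record or a user notification before the watchdog ever times out:
#!/bin/sh
# Hypothetical pre-watchdog monitor: log and notify when the service dies.
PIDFILE=/var/run/myservice.pid
if [ ! -f "$PIDFILE" ] || ! kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
    echo "$(date): myservice is not running" >> /var/log/service-monitor.log
    # put user notification / reboot-counter logic here
fi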
See the answers to this question for something similar:
Designing a monitor process for monitoring and restarting processes
In my websocket server developed with Erlang, I would like to use a timer (start_timer/3), for each connection, to terminate the connection if the timeout elapses without receiving a "ping" from the client.
Do Erlang timers scale well, assuming I will have a large number of client connections?
What is a large number of connections? Erlang's VM uses a timer wheel internally to handle timers, so it scales pretty well up to some thousands of connections. Beyond that you might run into trouble.
Usually the trick is to group pids together on shared timers. This is also what kernels tend to do. If, for instance, you have a timer that has to wake up in 200 ms, you schedule yourself not on your own precise deadline but on the next shared 200 ms tick. This means you will wait at least 200 ms and perhaps up to 400 ms, with 300 ms being typical. By approximating timers like this you can run many more of them, since a single timer wakeup can wake a large number of processes in one go. But depending on the timer frequency and the number of timers, a standard send_after/3 may be enough.
In any case, I would start by assuming it can scale and then handle the problem if it can't by doing approximate timing like envisioned above.
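A rough sketch of the grouping idea envisioned above (module and message names are made up): one shared ticker wakes every registered process in a single timeout.
-module(tick_group).
-export([start/1, add_member/2]).

%% Start a ticker that wakes all registered pids every IntervalMs.
start(IntervalMs) ->
    spawn(fun() -> loop(IntervalMs, []) end).

add_member(Ticker, Pid) ->
    Ticker ! {add, Pid},
    ok.

loop(IntervalMs, Pids) ->
    receive
        {add, Pid} ->
            loop(IntervalMs, [Pid | Pids])
    after IntervalMs ->
        %% One timeout wakes the whole group in one go.
        [Pid ! tick || Pid <- Pids],
        loop(IntervalMs, Pids)
    end.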
A usual pattern for this kind of server is to take advantage of the light weight Erlang processes, and create a server per connection.
You can, for example, build your server on the gen_server behaviour, which provides you with both:
the different states needed to manage the connection (waiting for a connection, login, ...) via the State variable,
and an individual timeout for each connection and each state, managed by the VM and the OTP behaviour.
The nice thing is that each server takes care of a single client, which makes it much easier to write.
The init phase should launch one server waiting for a connection.
Then, upon connection, that server should launch a new one ready for the next client (ideally through a supervisor starting simple_one_for_one children) and move on to the login step, or whatever you want to do.
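A minimal sketch of such a per-connection server (module name and timeout value are illustrative), using the behaviour's built-in timeout so the VM terminates idle connections for you:
-module(conn_server).
-behaviour(gen_server).

-export([start_link/1, ping/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2,
         terminate/2, code_change/3]).

-define(PING_TIMEOUT, 30000).  %% milliseconds; illustrative value

start_link(Socket) ->
    gen_server:start_link(?MODULE, Socket, []).

%% Call this whenever a "ping" arrives from the client.
ping(Pid) ->
    gen_server:cast(Pid, ping).

init(Socket) ->
    %% The third tuple element is the gen_server timeout: if no message
    %% arrives within ?PING_TIMEOUT, handle_info(timeout, State) is called.
    {ok, Socket, ?PING_TIMEOUT}.

handle_call(_Request, _From, State) ->
    {reply, ok, State, ?PING_TIMEOUT}.

handle_cast(ping, State) ->
    %% Each ping simply re-arms the timeout.
    {noreply, State, ?PING_TIMEOUT}.

handle_info(timeout, State) ->
    %% No ping within the window: stop this connection's server.
    {stop, normal, State};
handle_info(_Other, State) ->
    {noreply, State, ?PING_TIMEOUT}.

terminate(_Reason, _State) ->
    ok.

code_change(_OldVsn, State, _Extra) ->
    {ok, State}.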
You will find very interesting information on the site LearnYouSomeErlang, particularly in the chapter http://learnyousomeerlang.com/supervisors and the following ones.
I have a process wherein a program running in an application server must access a table in an Oracle database server whenever at least one row exists in this table. Each row of data relates to a client requesting some number crunching performed by the program. The program can only perform this number crunching serially (that is, for one client at a time rather than multiple clients in parallel).
Thus, the program needs to be informed of when data is available in the database for it to process. I could either
have the program poll the database, or
have the database trigger the program.
QUESTION 1: Is there any conventional wisdom why one approach might be better than the other?
QUESTION 2: I wonder whether programs have any issues "running" for months at a time (would any processes on the server stop or disrupt the program from running? If so, I don't know how I'd learn there was a problem, short of angry customers). Does anyone have experience running programs on a server for a long time without issues? Or, if the server does crash, is there a way to auto-start a program (i.e. a C executable) after the server reboots, so that a human doesn't have to start it by hand?
Any advice appreciated.
UPDATE 1: Client is waiting for results, but a couple seconds additional delay (from polling) isn't a deal breaker.
I would like to give a more generic answer...
There is no right answer that applies every time. Sometimes you need a trigger, and sometimes it is better to poll.
But… 9 times out of 10, polling is much more efficient, safer and faster than triggering.
It's really simple. A trigger needs to instantiate a program, of whatever nature, for every single event. That is just not efficient most of the time. Some people will argue that this is required when response time is a factor, but even then, half the time polling is better because:
1) Resources: With triggers, say for 100 messages, you need resources for 100 threads; with 1 thread processing a packet of 100 messages you need resources for 1 program.
2) Monitoring: A thread processing packets can constantly report the time consumed per packet of a defined size, clearly indicating how it is performing and when and how performance is being affected. Try that with a billion triggers jumping around…
3) Speed: Instantiating threads and allocating their resources is very expensive. And don't get me started if you are opening a transaction for each trigger. A simple program processing, say, a 100-message packet will always be much faster than initiating 100 triggers…
4) Reaction time: With polling you cannot react to things immediately. So the only exception to polling is when a user is waiting for the message to be processed. But then you need to be very careful, because if you have lots of clients doing the same thing at the same time, triggering might respond LATER than fast polling would.
My 2 cents; this has been learned the hard way.
1) have the program poll the database, since you don't want your database to be able to start host programs (because you'd have to make sure that only "your" program can be started this way).
The classic (and most convenient IMO) way for doing this in Oracle would be through the DBMS_ALERT package.
The first program signals an alert with a certain name, passing an optional message. A second program which has registered for the alert waits and receives it immediately after the first program commits. A rollback by the first program cancels the alert.
Of course you can have many sessions signaling and waiting for alerts. However, an alert is a serialization device, so if one program has signaled an alert, other programs signaling the same alert name will be blocked until the first one commits or rolls back.
Table DBMS_ALERT_INFO contains all the sessions which have registered for an alert. You can use this to check if the alert-processing is alive.
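Roughly, the flow would look like this (the alert name 'new_work' and the work_queue table are made up for illustration):
-- Producer session: queue a row and signal; the alert is delivered
-- only when this transaction commits (a rollback cancels it).
BEGIN
  INSERT INTO work_queue (client_id) VALUES (42);  -- illustrative table
  DBMS_ALERT.SIGNAL('new_work', 'row added');
  COMMIT;
END;
/

-- Worker session: register once, then block until signalled
-- (or until the 60-second timeout expires; status 0 = alert received).
DECLARE
  l_message VARCHAR2(1800);
  l_status  INTEGER;
BEGIN
  DBMS_ALERT.REGISTER('new_work');
  DBMS_ALERT.WAITONE('new_work', l_message, l_status, 60);
  IF l_status = 0 THEN
    NULL;  -- process the pending rows here
  END IF;
END;
/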
2) autostarting or background execution depends on your host platform and OS. In Windows you can use SRVANY.EXE to run any executable as a service.
I recommend using a C program to poll the database and a utility such as monit to restart the C program if there are any problems. Your C program can touch a file once in a while to indicate that it is still functioning properly, and monit can monitor the file. Monit can also check the process directly and make sure it isn't using too much memory.
For more information you could see my answer of this other question:
When a new row in database is added, an external command line program must be invoked
Alternatively, if people aren't sitting around waiting for the computation to finish, you could use a cron job to run the C program on a regular basis (e.g. every minute). Then monit would be less needed because your C program will start and stop all the time.
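For instance, a crontab entry along these lines would do it (the program path and log file are placeholders):
# run the polling program once a minute; cron starts a fresh run on the next tick
* * * * * /usr/local/bin/crunch_poll >> /var/log/crunch_poll.log 2>&1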
You might want to look into Oracle's "Change Notification":
http://docs.oracle.com/cd/E11882_01/appdev.112/e25518/adfns_cqn.htm
I don't know how well this integrates with a "regular" C program though.
It's also available through .Net and Java/JDBC
http://docs.oracle.com/cd/E11882_01/win.112/e23174/featChange.htm
http://docs.oracle.com/cd/E11882_01/java.112/e16548/dbchgnf.htm
There are simple job managers, like Gearman, that you can use to send a job message from the database to a worker. Gearman has, among others, a MySQL user-defined function interface, so it is probably feasible to build one for Oracle as well.
I am making an application which runs on every one of our PCs at random times. It works fine; however, if the PC is currently shutting down, I can't read WMI and I get some errors. So I need to determine whether a PC is currently shutting down, so I can avoid these errors. Does anyone have an idea?
Thanks!
Call GetSystemMetrics with index SM_SHUTTINGDOWN (0x2000).
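A minimal sketch of that check (the WMI part is just a placeholder comment):
#include <windows.h>
#include <stdio.h>

int main(void)
{
    /* Non-zero while the current session is shutting down. */
    if (GetSystemMetrics(SM_SHUTTINGDOWN)) {
        printf("Shutdown in progress; skipping WMI query.\n");
        return 1;
    }
    /* ...do the WMI work here... */
    printf("Not shutting down.\n");
    return 0;
}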
Create a hidden top-level window and listen for WM_ENDSESSION messages. The lParam flags (in particular ENDSESSION_LOGOFF) tell you whether the user is merely logging off or the entire system is going down.
If your app is a console app then use SetConsoleCtrlHandler to register to receive shutdown notifications.
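A sketch for the console case (note that, as far as I know, CTRL_SHUTDOWN_EVENT is only delivered to services; an interactive console app typically sees CTRL_LOGOFF_EVENT or CTRL_CLOSE_EVENT instead):
#include <windows.h>
#include <stdio.h>

static volatile LONG g_ending = 0;

static BOOL WINAPI CtrlHandler(DWORD ctrlType)
{
    switch (ctrlType) {
    case CTRL_SHUTDOWN_EVENT:
    case CTRL_LOGOFF_EVENT:
    case CTRL_CLOSE_EVENT:
        InterlockedExchange(&g_ending, 1);  /* remember that the session is ending */
        return TRUE;
    default:
        return FALSE;   /* let the default handler deal with it */
    }
}

int main(void)
{
    SetConsoleCtrlHandler(CtrlHandler, TRUE);
    while (!InterlockedCompareExchange(&g_ending, 0, 0)) {
        /* ...periodic WMI work here, skipped once g_ending is set... */
        Sleep(1000);
    }
    printf("Session ending; stopping cleanly.\n");
    return 0;
}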
Any attempt to detect this situation has a race condition: the system shutdown might start immediately after you detect that it's not shutting down, but before you perform the operations that won't work during shutdown. So this approach to fixing the problem is the wrong one. Instead, handle the WMI read failures, determine whether they are caused by system shutdown, and in that case abort the operation or proceed in whatever alternate way makes sense.
It might be possible to use a sort of synchronous shutdown detection mechanism where you can actually lock/delay the shutdown for a brief interval before it proceeds, and do your processing in that interval. If so, that would also be a safe approach without race conditions.
I have a silverlight 4 application using the ClientHttp stack to make a WebRequest which serves a binary stream. I then read from this stream and do stuff. However, I have the following problem: the server buffers the data that it sends down, so that the send process is like send-pause-send-pause-send...
Sometimes the server takes a slightly longer pause (around 20 seconds), at which point the connection seems to somehow break. I don't get any exception in Silverlight; to the code it actually looks as if the read from the web response stream finished normally (i.e. no more data). However, the server did not actually send all of its data (which I can verify from a non-Silverlight application that receives more data after that pause). I'm thinking this might be some timeout issue (which, from what I've read, one can't set explicitly in Silverlight), but it's weird that I don't get an exception indicating the timeout. Also, the pause isn't that long; I would expect 20 seconds to be a reasonable time.
I've also looked at the TCP traffic and looks like after the pause, Silverlight sends a FIN message to the server. So it seems like it kind of times out and decides to break the connection, but it doesn't actually report the timeout as an Exception or give me any way to avoid it.
Any ideas what's actually going on and how could I prevent it?
Thanks!
UPDATE: Found the problem. There is a registry key that controls system-wide web request timeout behavior and some apps set it to 10 seconds (e.g. Install Anywhere) and "forget" to ever set it back. The key is this: HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings\ReceiveTimeout
I changed it back to a greater value and now it works fine! Hth.
Merely quoting the OP but providing an answer: the culprit is the system-wide HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings\ReceiveTimeout registry value described in the update above. Some installers (e.g. Install Anywhere) set it to 10 seconds and "forget" to ever set it back; restoring it to a greater value makes the connection behave again.
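For anyone hit by the same thing, a .reg sketch that puts the value back to five minutes (the 300000 ms figure is arbitrary; adjust or delete the value as you prefer):
Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings]
; 0x000493e0 = 300000 milliseconds = 5 minutes
"ReceiveTimeout"=dword:000493e0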