I stumbled upon the following code:
if (printk_ratelimit())
    printk(KERN_NOTICE "Something went wrong with...");
and wonder when/how I should use it.
Isn't there always a danger that the printk statement following the if-statement is not executed because the rate limit has just been reached by other debug messages? Initiating a retry would add overhead and several lines of code just to print an error message.
What are your best practices? How and when do you use it?
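To make the concern above concrete, here is a minimal userspace sketch of interval/burst rate limiting. It is not the kernel's implementation; the struct, its fields and the function name are hypothetical, and the caller passes the current time so the behaviour is easy to follow. It shows that a message arriving after the burst is used up is simply dropped (and counted), not deferred.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace model of an interval/burst rate limiter. */
struct ratelimit {
    time_t interval; /* window length in seconds */
    int burst;       /* messages allowed per window */
    time_t begin;    /* start of the current window */
    int printed;     /* messages allowed so far in this window */
    int missed;      /* messages dropped so far */
};

/* Returns true if a message may be printed at time `now`. */
bool ratelimit_allow(struct ratelimit *rl, time_t now)
{
    assert(rl != NULL && rl->burst > 0);
    if (now - rl->begin >= rl->interval) {
        /* The window has expired: start a new one. */
        rl->begin = now;
        rl->printed = 0;
    }
    if (rl->printed < rl->burst) {
        rl->printed++;
        return true;
    }
    rl->missed++; /* dropped: the caller's message is never printed */
    return false;
}
```

With this model, a message that must not be lost would have to bypass the limiter altogether; the limiter only suits messages that may repeat many times in a short period.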
I am currently encountering a problem with a custom STM32L151 board that I will try to explain here.
The program I am testing runs properly for some time and I get debug messages in PuTTY as intended, but at some point the program seems to be "blocked".
It is pretty weird behaviour because the function which prints over UART is called (I put a breakpoint there to check that I reach this point) but I get no output on the terminal.
So I was wondering what the issue could be, in case someone has an idea, because I have run out of ideas, to be honest. I will assume there is no hardware issue; it is still possible, but I do not think that is the reason.
Also, the program is meant to receive an FSK message and answer it, and I seem to get the same behaviour with the radio chip: I receive the message and send a response (the TxDone callback is called, which indicates that the FSK message has been sent normally, but the device waiting for this response does not receive it).
So to sum up a bit: the program runs properly for a moment then "blocks" & I do not get any output anymore (debug or radio communication) but still runs (functions effectively called) and after a moment again, the program "unblocks" itself and runs properly again (debug messages work).
The device I work on is STM32L151 based, I work with Keil, UART config is: 19200 baudrate, 8 data bits, 1 stop bit, no parity, XON/XOFF flow control & the radio chip I use is SX1272.
If someone has any idea or any lead I can investigate, any help is appreciated. I am not sure my description of the problem is accurate enough, so please ask if you need any further details.
I am trying to use SCIF inter-process communication on Xeon Phi. My program has two processes; one process writes data to the other using scif_writeto. Currently, I get the error "No device or address" from the scif_writeto API. I checked that the endpoint is set up correctly, and the offset is also returned correctly. I have no idea what is going wrong here. Is there any good way to debug this issue?
In user mode, scif_writeto() returns -1 on failure and sets errno to indicate the error. Possible errors are described in scif.h.
You could check the errno to debug your problem.
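A minimal sketch of that check, assuming only the convention stated above (return value -1, errno set). The helper name report_if_failed is hypothetical and not part of SCIF; on Linux the message in the question looks like the text for ENXIO, but that is an assumption worth confirming by printing the numeric value.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: call it right after an API that returns -1 and
 * sets errno. Returns 0 if rc is not -1, otherwise the errno value. */
int report_if_failed(const char *what, long rc)
{
    int saved = errno; /* save first: fprintf may overwrite errno */

    assert(what != NULL);
    if (rc != -1)
        return 0;
    fprintf(stderr, "%s failed: errno=%d (%s)\n",
            what, saved, strerror(saved));
    return saved;
}
```

Calling it immediately after the failing scif_writeto() call, before any other library call, keeps the errno value intact; the number can then be looked up among the errors listed in scif.h.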
What would be the correct way to prevent a soft lockup/unresponsiveness in a long-running while loop in a C program?
(dmesg is reporting a soft lockup)
Pseudo code is like this:
while( worktodo ) {
worktodo = doWork();
}
My code is of course way more complex, and also includes a printf statement which gets executed once a second to report progress, but the problem is, the program ceases to respond to ctrl+c at this point.
Things I've tried which do work (but I want an alternative):
doing printf every loop iteration (don't know why, but the program becomes responsive again that way (???)) - wastes a lot of performance due to unneeded printf calls (each doWork() call does not take very long)
using sleep/usleep/... - also seems like a waste of (processing-)time to me, as the whole program will already be running several hours at full speed
What I'm thinking about is some kind of process_waiting_events() function or the like, and normal signals seem to be working fine as I can use kill on a different shell to stop the program.
Additional background info: I'm using GWAN and my code is running inside the main.c "maintenance script", which seems to be running in the main thread as far as I can tell.
Thank you very much.
P.S.: Yes I did check all other threads I found regarding soft lockups, but they all seem to ask about why soft lockups occur, while I know the why and want to have a way of preventing them.
P.P.S.: Optimizing the program (making it run shorter) is not really a solution, as I'm processing a 29GB bz2 file which extracts to about 400GB xml, at the speed of about 10-40MB per second on a single thread, so even at max speed I would be bound by I/O and still have it running for several hours.
While the posed answer using threads might possibly be an option, it would in reality just shift the problem to a different thread. My solution after all was using
sleep(0)
Also tested sched_yield / pthread_yield, both of which didn't really help. Unfortunately I've been unable to find a good resource which documents sleep(0) on Linux, but for Windows the documentation states that using a value of 0 lets the thread yield the remaining part of its current CPU slice.
It turns out that sleep(0) is most probably relying on what is called timer slack in Linux; an article about this can be found here: http://lwn.net/Articles/463357/
Another possibility is using nanosleep(&(struct timespec){0}, NULL), which does not necessarily seem to rely on timer slack: the Linux man pages for nanosleep state that if the requested interval is below clock granularity, it will be rounded up to clock granularity, which on Linux depends on CLOCK_MONOTONIC according to the man pages. Thus, a value of 0 nanoseconds is perfectly valid and should always work, as clock granularity can never be 0.
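The zero-length nanosleep described above could be wrapped roughly like this. The function names and the "every N iterations" idea are assumptions added for illustration; whether it restores responsiveness in a given setup is something only a test on that system can show.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <time.h>

/* Request a zero-length sleep. Returns 0 on success, or -1 with errno
 * set (for example EINTR if a signal interrupted the call). */
int yield_briefly(void)
{
    struct timespec zero = {0, 0};
    return nanosleep(&zero, NULL);
}

/* Hypothetical convenience: only yield on every `every`-th iteration,
 * to keep the overhead low when each unit of work is short. */
int maybe_yield(unsigned long iteration, unsigned long every)
{
    assert(every > 0);
    if (iteration % every != 0)
        return 0;
    return yield_briefly();
}
```

Inside the work loop, this would be called once per iteration with a running counter, for example maybe_yield(i, 1000).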
Hope this helps someone else as well ;)
Your scenario is not really a soft lockup; it is a process that is busy doing something.
How about this pseudo code:
void workerThread()
{
    while(workToDo)
    {
        if(threadSignalled)
            break;
        workToDo = DoWork();
    }
}

void sighandler()
{
    signal worker thread to finish
    waitForWorkerThreadFinished;
}

void main()
{
    InstallSignalHandler;
    CreateSemaphore
    StartThread;
    waitForWorkerThreadFinished;
}
Clearly a timing issue. Using a signalling mechanism should remove the problem.
The use of printf solves the problem because printf accesses the console, which is an expensive and time-consuming operation that, in your case, gives the worker enough time to complete its work.
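The signalling idea in the pseudocode above can be sketched in plain C roughly as follows. This is a simplified, single-threaded variant rather than a translation of the pseudocode: there is no worker thread or semaphore, and the handler only sets a flag, because a signal handler should restrict itself to async-signal-safe operations. All function names are hypothetical, and count_down is merely a stand-in for the real doWork().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, polled by the work loop. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;
    stop_requested = 1; /* only set a flag inside the handler */
}

/* Install the SIGINT handler. Returns 0 on success, -1 on error. */
int install_sigint_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* Calls do_work until it returns 0 or a stop was requested.
 * Returns the number of completed iterations. */
long run_until_done_or_signalled(int (*do_work)(void *), void *ctx)
{
    long iterations = 0;
    int work_to_do = 1;

    assert(do_work != NULL);
    while (work_to_do && !stop_requested) {
        work_to_do = do_work(ctx);
        iterations++;
    }
    return iterations;
}

/* Stand-in for real work: counts an int down to zero. */
int count_down(void *ctx)
{
    int *remaining = ctx;
    (*remaining)--;
    return *remaining > 0;
}
```

The loop checks the flag once per iteration, so a Ctrl+C is honoured after the current unit of work finishes, which leaves room for cleanup before exiting.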
What's the best practice for exiting C code back to R on discovery of an error? Package guidance says don't use exit(), which makes sense (as you kill everything), but how do you exit to R and indicate that an error has occurred? Obviously you could have an error flag in the return vector, but is there a better way?
You're looking for error(). It's described in Section 6.2 of Writing R Extensions... and you should listen to your subconscious more often. ;-)
I am using the library function ConnectToTCPServer. This function times out when the host is not reachable. In that case the application crashes with the following error:
"NON-FATAL RUN-TIME ERROR: "MyClient.c", line 93, col 15, thread id 0x000017F0: Library function error (return value == -11 [0xfffffff5]). Timeout error"
The error code -11 is a timeout error, so this could happen quite often in my application. However, the application crashes; I would like to catch this error rather than have my application crash.
How can I catch this runtime error in ANSI C90?
EDIT:
Here is a code snippet of the current use:
ConnectToTCPServer(&srvHandle, srvPort, srvName, HPMClientCb, answer, timeout);
with
int HPMClientCb(UINT handle, int xType, int errCode, void *transData){
    printf("This was never printed\n");
    return errCode;
}
The callback function is never called. My server is not running, so ConnectToTCPServer will time out. I would expect the callback to be called, but it never is.
EDIT 2: The callback function is indeed not called; the return value of ConnectToTCPServer contains the same error information. I think it might be a bug that ConnectToTCPServer throws this error. I just need to catch it and bin it in C90. Any ideas?
EDIT 3: I tested the callback function. On the rare occasion that my server is online, the callback function is actually called. This does not help, though, because the callback is not called when an error occurs.
Looking in NI documentation, I see this:
"Library error breakpoints -- You can set an option to break program execution whenever a LabWindows/CVI library function returns an error during run time. "
I would speculate they have a debug option to cause the program to stop on run-time errors, which you need to disable in configuration, in compile time or in run-time.
My first guess would have been configuration value or compilation flag, but this is the only option I found, which is a run-time option:
// If debugging is enabled, this function directs LabWindows/CVI not
// to display a run-time error dialog box when a National Instruments
// library function reports an error.
DisableBreakOnLibraryErrors();
Let me know if it helped.
There's no such thing as a general case of "catching" an error (or an 'exception') in standard C. That's up to your library to decide what to do with it. Likely it's logging its state and then simply calling abort(). On Unix, that raises SIGABRT, which can be handled rather than simply terminating the process. Or their library may just be logging and then calling exit().
You could run your application under a utility like strace to see what system calls are being performed and what signals are being asserted.
I'd work with your vendor if you can't make any headway otherwise.
From the documentation, it seems you should get a call to your clientCallbackFunction when an error occurs. If you don't, you should edit your question to clarify that.
I'm not sure I understand you.
I looked at the documentation for the library function ConnectToTCPServer(). It returns an int; 0 means success, negative numbers are the error codes.
"EDIT: Here is a code snippet of the current use:"
ConnectToTCPServer(&srvHandle, srvPort, srvName, HPMClientCb, answer, timeout);
If that's really the current use, you don't seem to be trying to tell whether ConnectToTCPServer() succeeds. To do that, you'd need
int err_code;
...
err_code = ConnectToTCPServer(&srvHandle, srvPort, srvName, HPMClientCb, answer, timeout);
and then test err_code.
The documentation for ConnectToTCPServer() implies that your callback function won't be called unless there's a message from a TCP server. No server, no message. In that case, ConnectToTCPServer() should return a negative number.
You should check the return value of ConnectToTCPServer().
Finding a negative number there, you should do something sensible.
Did I understand the documentation correctly?
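As one illustration of "something sensible", here is a C90-style sketch that retries while the call reports a timeout. It deliberately does not call ConnectToTCPServer() itself: connect_fn, connect_with_retry and fake_connect are hypothetical names, the real call would be wrapped in a function of that shape, and the value -11 is taken only from the error message quoted in the question.

```c
#include <assert.h>
#include <stddef.h>

/* Timeout code as shown in the question's error message; the macro
 * name is made up for this sketch. */
#define TIMEOUT_ERR (-11)

/* Hypothetical wrapper type: returns 0 on success, negative on error. */
typedef int (*connect_fn)(void *ctx);

/* Try up to max_attempts times while the result is a timeout.
 * Returns the last return code (0 on success). */
int connect_with_retry(connect_fn try_connect, void *ctx, int max_attempts)
{
    int attempt;
    int rc = TIMEOUT_ERR;

    assert(try_connect != NULL);
    for (attempt = 0; attempt < max_attempts; attempt++) {
        rc = try_connect(ctx);
        if (rc != TIMEOUT_ERR)
            break; /* success or a different error: stop retrying */
    }
    return rc;
}

/* Stand-in for a real connect call: times out until *ctx reaches 0. */
int fake_connect(void *ctx)
{
    int *timeouts_left = (int *)ctx;

    if (*timeouts_left > 0) {
        (*timeouts_left)--;
        return TIMEOUT_ERR;
    }
    return 0;
}
```

The caller still has to decide what to do when the final return code is negative, for example report it to the user and carry on without a connection.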
Normally, you should be able to simply check the return value. The fact that your application exits implies that something is already catching the error and asserting (or something similar). Without seeing any context (i.e. code demonstrating how you're using this function), it's difficult to be any more precise.
The documentation states that ConnectToTCPServer will return the error code. The callback is only called if the connection is established, disconnected or when there is data ready to be read.
The message you get states that the error is NON-FATAL, hence it shouldn't abort. If you're sure the code doesn't abort later it seems indeed like a bug in the library.
I'm not familiar with CVI, but there might be a (compile-/runtime-) option to abort even on non-fatal errors (for debugging purposes). If you can reproduce this in a minimal example you should report it to NI.