Parallel processing - `forking` fails under Mac OS 10.6.8 - c

It appears that fork fails under Mac OS 10.6.8. The algorithm is coded in R and I have mainly be using the foreach package for parallel processing. The code works perfectly well when run sequentially (foreach and %do%) but not when run in parallel (foreach and %dopar%) even though the processes run in parallel do NOT communicate.
I used foreach on this same machine a month ago and it worked fine. An update of the OS has been performed in the meantime.
Error Messages
I received several kinds of error messages that seems to come almost stochastically. Also the errors differ depending on whether the code is run from the terminal (either by copy-pasting in the R environment or with R CMD BATCH) or from the R console.
When run on the Terminal, I get
Error in { : task 1 failed - "object 'dp' not found"
When run on the R console I get either
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
....
<repeated many times when run on the R console>
or
Error in { : task 1 failed - "object 'dp' not found"
with the exact same code! Note that although this second error message is the same than the one received on the Terminal, the number of things that are printed (through the print() function) on the screen vastly differ!
What I've tried
I updated the package foreach and I also restarting my computer but it did not quite help.
I tried to print pretty much anything I could but it ended up being quite hard to keep track of what this algorithm is doing. For example, it often through the error about the missing object dp without executing the print statement at the line that precedes the call of the object dp.
I tried to use %dopar% but registering only 1 CPU. The output did not change on the Terminal, but it changed on the Console. Now the console gives the exact same error, at the same time than the terminal.
I made sure that several CPUs were in used when I ask for several CPUs.
I tried to use mclapply instead of foreach and registerDoMC() instead of registerDoParallel() to register the number of cores.
Extra-Info
My version of R is 3.0.2 GUI 1.62 Snow Leopard build. My machine has 16 cores.

Related

How to abort a macports portfile on an error condition?

I working on a version bump on the cc65 and encountered a problem with the linuxdoc-tools. Since I can't fix the linuxdoc-tools and there is a simple workaround possible I decided to add an if statement to inform the user together with the workaround:
if {! [file exists ${prefix}/bin/perl] } {
ui_error "
«${prefix}/bin/perl» is missing but the linuxdoc-tools depends on it.
Please create an appropriate symbolic link for linuxdoc-tools to work.
"
exit 1
}
Crude but the best I can do since I'm neither the perl5 nor the linuxdoc-tools maintainer and I don't want to spend to much time on a version bump.
However, the MacPorts doesn't understand exit 1 and ui_error won't stop execution on its own.
How do I stop the execution so not to waste the users time on a build which will otherwise fail right at the end.
Use return -code error "error message", or the shorthand for the same thing, error "error message".
Note that you should use ui_error before that to print a human-readable message for the user – while the error message is also being printed, it can sometimes get lost in the output.
Additionally, note that $prefix/bin/perl is a build dependency of linuxdoc-tools. If it is also needed at runtime, you should submit a pull request that adds depends_run path:bin/perl:perl5 to the port rather than attempting to fix this bug in your port.

generating trap/segfault messages in dmesg

I had a program that was segfaulting.
When I went to investigate and ran dmesg I could see lines like this:
[955.915050] traps: foo_bar[123] general protection ip:7f5fcc2d4306 sp:7ffd9e5868b8 ...
Now the program has been fixed and I'm trying to write some analysis scripts across different systems to find similar messages and was hoping to induce a line in the dmesg log to get a baseline for what to look for and see if there's a difference between, say, a sigbus(10) and a sigill(4)
I tried to do it via kill -11 on the command line . No entry in dmesg
I tried to do it via signal(getpid(), 11) in the code. No entry in dmesg
I tried to do it via signal 11 after attaching in gdb . No entry in dmesg
I tried to do it via writing bad code and it worked for SEGV, but I can't figure out how to trigger a SIGBUS (for example)
I'm guessing that there is more than one path for handling the signal depending on how it occurs and my attempts above just aren't doing it the right way.
How can I trigger/send a signal to my program that'll get a line in dmesg? Is there some kernel or log configuration I can twiddle to get those lines?
Update:
" __builtin_trap: when to use it? " shows how to get a SIGILL but alas doesn't have a signal-agnostic solution)

More than 2 notebooks with R code in zeppelin failing with error sparkr intrepreter not responding

I have met with a strange issue in running R notebooks in zeppelin(0.7.2).
Spark intrepreter is in per note Scoped mode and spark version is 1.6.2 and SPARK_HOME is set.
Please find the steps below to reproduce the issue:
Create a notebook (Note1) and run any r code in a paragraph. I ran the following code.
%r
rdf <- data.frame(c(1,2,3,4))
colnames(rdf) <- c("myCol")
sdf <- createDataFrame(sqlContext, rdf)
withColumn(sdf, "newCol", sdf$myCol * 2.0)
Create another notebook (Note2) and run any r code in a paragraph. I ran the same code as above.
Till now everything works fine.
Create third notebook (Note3) and run any r code in a paragraph. I ran the same code. This notebook fails with the error
org.apache.zeppelin.interpreter.InterpreterException: sparkr is not
responding
What I understood from the analysis is that the process created for sparkr interpreter is not getting killed properly and this makes every third model to throw an error while executing. The process will be killed on restarting the sparkr interpreter and another 2 models could be executed successfully. ie, For every third model run using the sparkr interpreter, the error is thrown.
Help me to fix the problem.
you need to set spark.r.numRBackendThreads larger than 2. By default it is 2 which means you can only have 2 threads for RBackend. Since you are using scoped mode per note, each note will consume one thread for RBackend, so you can only run 2 note.

CommandSequence taking too long to download

With both DKPy-SITL and our APM2 board, the wait_ready method is causing our program to raise an API Exception due to the command list (waypoints) taking too long to download. In the past (with droneapi) this wasn't an issue for me. Some waypoints are being downloaded, but the process takes about 10 seconds for each one, which leads me to believe something weird is going on.
Are there any ways to speed up the download process? I've posted the relevant code below.
self.vehicle = connect(connection_string, baud=baud_rate,
status_printer=dronekit_printer, wait_ready=True)
and later in another asynchronous method
def commands(self):
commands = self.vehicle.commands
commands.download()
commands.wait_ready()
return commands
The error occurs on commands.wait_ready(). There has to be a faster way to download commands than sitting there for over 30 seconds on an i7 4790k processor, especially since I've run the same code off a slower computer in the past with droneapi. If need be, I can raise an issue on the dronekit github as well.
I had the same issue. First time download call always goes well (0 commands). Once you have uploaded some commands the second time you try to download it fails ('Timeout' exception).
What I did to solve this was calling clear without download after the first time.
Something like this:
cmds = vehicle.commands
if not cmds.count > 0:
# Download
cmds.download()
# Wait until download is finished
cmds.wait_ready()
cmds.clear()
# Add / Modify the commands here and then upload them

Debugging crash during app exit (WPF)

I'm trying to figure out why an WPF-app won't exit imediately on closing it. Using Process Explorer I hade found out that WerFault.exe is started while exiting which seem to indicate that something crashes during the teardown, perhaps some destructor or dispose that fails. This started happening when I recently switched to VS2015. I am running Windows 8.
My question is: How can I find out what the real problem is? Any way of finding a crash log for WerFault.exe? I have hundreds of destructors and dispose-methods so it's a bit hard to put breakpoints in all of them. Any other way of capturing these kinds of errors in VS?
The exit code is -1073740791 which "indicate a bug in the executed software that causes stack overflow, leading to abnormal termination of the software". But where?
Some more info from the event log:
Faulting module name: ucrtbase.DLL, version: 10.0.10240.16390, time stamp: 0x55a5b718
Exception code: 0xc0000409
Fault offset: 0x0000000000065a4e
You could try enabling user mode dumps:
Create the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps
Within LocalDumps, create a key that is the name of your executable
Within the key you just created, set the values of DumpFolder, DumpCount, DumpType, and CustomDumpFlags as needed (you should definitely set DumpType to 2 for full dumps, otherwise I don't think that enough information will be captured to debug a managed dump).
Once you have done this, whenever your executable crashes a dump file will be created in the folder specified by DumpFolder (or %LOCALAPPDATA%\CrashDumps by default).

Resources