system command not executing with mpiicc -O (C)

I have Intel Parallel Studio XE Cluster Edition 2015 on my 10-node server connected with InfiniBand. I wrote my code in C. It uses system() calls with commands built by sprintf(), like below:
printf("started \n");
system("cp metis_input.txt $HOME/metis-4.0/.");
sprintf(filename,"$HOME/metis-4.0/./partdmesh metis_input.txt %d",size-1);
system(filename);
sprintf(filename,"mv metis_input.txt.npart.%d nodes_proc.txt",size-1);
system(filename);
printf("completed \n");
When I compile my code and run it without any optimization flags it runs smoothly, but when I compile my code with "mpiicc -O" the above lines don't even seem to be executed. I think those lines are being skipped. Only the printf's are executed. Do I need to add anything extra to my code (like including any headers) to get these system() commands running with the Intel MPI compiler and -O?

msvc compiled programs output differently under cygwin tty

Under Cygwin: how can I stop output from msvc compiled programs from being transcoded in the tty?
Under Cygwin: gcc vs msvc compiled programs appear to run differently to each other under a tty. Specifically, I am seeing some strange character set translations from only msvc generated binaries output under a tty when the character's 8th bit is set. I'd really like to know how to turn off this annoying behaviour please. Consider:
screen-cap of terminal output (duplicated in a code quote below)
! pwd
/tmp/demo_dir
! echo $LC_ALL "," $LANG "," $LC_CTYPE
, ,
! ./compiled_with_gcc.exe | hexdump
0000000 cece cece
0000004
! ./compiled_with_msvc.exe | hexdump
0000000 cece cece
0000004
! ./compiled_with_gcc.exe
▒▒▒▒!
! ./compiled_with_msvc.exe
╬╬╬╬!
The problem is the last line. The output from the msvc compiled version is not as expected. The two programs are demonstrated above to be outputting the same data, so the last two outputs should be the same. But the tty version (without the pipe) gets changed only in the msvc case. gcc compiled program outputs are passed through the tty unharmed. The output presented here is from the cygwin terminal, but I see exactly the same output difference in xterm.
I am confident it is happening in the tty, not the terminal, because I wrote a standalone cygwin program in C that runs either the gcc or msvc compiled program, either under a pipe or under a tty that is not connected to a terminal. The program logs the actual bytes received from the tty.
When running the gcc compiled one, the tty gives the '0xce's bytes as expected.
But a sequence of '0x8ec3' patterns is instead received from the msvc compiled program when listening to it via an identical tty.
When using a pipe instead of a tty, they both output '0xce's.
This shows that the msvc compiled program's output via a tty has an increased width. Given cygwin's preference for UTF-8, it is easy to suspect something is going wrong here and cygwin is applying an extra transcoding that does not happen with gcc compiled programs. I wish to turn that off... How do I successfully disable UTF-8 translations in today's cygwin?
I note that LC_ALL does not appear to be respected to stop this happening for msvc compiled binaries accessed via a tty, even when the C program begins with setlocale(0,"");
The output-generating program (to be alternately compiled with the two compilers for the test) is exactly as you'd expect it to be. The same C source in both cases. It simply calls printf or write with some bytes. The msvc version is compiled with Visual Studio 2019 cl.exe (all running on Windows10).
reproduce with:
#ifndef __CYGWIN__
#include <windows.h>
#else
#include <unistd.h>
#endif
#include <io.h>
#include <fcntl.h>
#include <locale.h>
int main()
{
    if (!setlocale(LC_ALL, "")) {
        return 77; // historically: non-filesystem permission-denied exit-code
    }
#ifndef __CYGWIN__
    // Irrelevant: but avoids stackexchange users asking for it.
    _setmode(1, _O_BINARY);
    _set_fmode(_O_BINARY);
#endif
    char *dat = "\316\316\316\316";
    write(1, dat, 4); // printf/fflush here gives same results.
    return 0;
}
@echo off
:: ugly msvc build script. ms_cl.bat
:: full of M$ hardcoded paths. Likely includes some unused libraries.
:: Load compilation environment
call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Auxiliary\Build\vcvars64.bat"
:: Invoke compiler with any options passed to this batch file
"C:\Program Files (x86)\Microsoft Visual Studio\2019\Professional\VC\Tools\MSVC\14.29.30037\bin\Hostx64\x64\cl.exe" /std:c17 %* kernel32.lib user32.lib gdi32.lib winspool.lib comdlg32.lib advapi32.lib shell32.lib ole32.lib oleaut32.lib uuid.lib odbc32.lib odbccp32.lib
Build msvc version with:
! ms_cl.bat code.c
Run in terminal with:
! ./code.exe | hexdump
! ./code.exe
Build gcc version with:
! gcc code.c
Run in terminal with:
! ./a.exe | hexdump
! ./a.exe
Note: identical programs, with identical output in a hex dump, have their output transcoded differently, the msvc one being 'wrong' in my usage case.
I obviously suspected M$ was doing some translation, so I have tried every combination of _fmode, setmode() and more to set binary mode. I suspected some failed cygwin UTF-8 detection situation, so I tried setting LC_ALL etc. to plain "C" mode with export in the shell. I similarly tried setting the locale within the msvc source.
Cygwin does a lot of work to make a unix-like environment under windows. Given the hexdumps above I can only guess Cygwin (or some hidden msvc console layer) is doing something quite specialised here and getting in my way. It may be related to cygwin's migration to ConPTY. Either way, I'd like help turning it off.
OP: It's been a little while, and the problem presented in the question remains unsolved. However, I have discovered a hacky hack hack from the planet hack that allows for avoiding the problem without solving it. I am posting this non-answer problem-avoidance-hack as an answer and will (eventually) mark it as the solution, but only if no actual solutions to the problem can be found.
To stop the output of the msvc-compiled program from being transcoded in the tty: first pipe the output of the msvc-compiled program to a gcc compiled program that simply repeats it (such as 'cat' or 'tail -f'), and connect that gcc-compiled program to the tty instead.
This hides whatever is going on in the msvc case by separating it from the tty. The environment is then respected. The tty only knows it is connected to a gcc-compiled program, and works right.
! ./compiled_with_gcc.exe # gcc_compiled->tty = good
▒▒▒▒!
! ./compiled_with_msvc.exe # msvc_compiled->tty = bad
╬╬╬╬!
! ./compiled_with_msvc.exe|cat # msvc_compiled->gcc_compiled->tty = hacky but good
▒▒▒▒!

How to get debugging symbols when compiling with clang on Windows

I am having trouble getting the debugger to work properly when setting up clang on my Windows 10 machine. Compilation seems to work OK, at least for the simple "hello, world" program I tried. However, when I try to run the lldb or gdb debuggers on this test program (or any other program I tried), it does not recognize function names.
Here's my C program code:
#include <stdio.h>
int main(void) {
    puts("Hello, world!");
    return 0;
}
Nothing too spectacular here, I know. I'm compiling with the following command:
> clang -g -O0 hello.c -o hello.exe
I then try to run the debugger:
> lldb hello
(lldb) target create "hello"
Current executable set to 'hello' (x86_64).
(lldb) b main
Breakpoint 1: no locations (pending).
WARNING: Unable to resolve breakpoint to any actual locations.
(lldb) r
Process 12156 launched: 'C:\Users\********\Projects\clang-test\hello.exe' (x86_64)
Process 12156 exited with status = 0 (0x00000000)
(lldb)
Apparently the symbol "main" was not recognized, and the program did not halt at the start of the "main" function but ran to completion (in a different console window, hence no program output here).
How do I get debugging symbols to work? In a different stackoverflow answer I found that adding compiler options "-g -O0" should do the trick, but as you can see that does not solve the problem for me. I also found a different stackoverflow answer about how to set up debugging if the code is not in the same directory as the executable, but that is not relevant to my case: the current working directory is the same as the directory with the code and executable in them.
Some version information:
> clang --version
clang version 9.0.0 (tags/RELEASE_900/final)
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: C:\Program Files\LLVM\bin
> lldb --version
lldb version 9.0.0
The "-g -O0" options you provided should indeed let the debugger know all the symbols it needs from the executable.
Therefore, I suspect the problem is elsewhere, perhaps with your terminal, or your version/implementation of LLDB.
Are you using the Windows cmd.exe command line? Or something else, like PowerShell?
I've never managed to get debuggers working properly in those environments, but it was much easier with Cygwin, which provides a Unix-like environment and bash shell for Windows (it creates a "simulated" Linux environment within its install folder, so you have all the /usr, /bin, /etc folders a bash shell needs).
This way you can actually use gdb the way you would on a UNIX system.
If the above method sounds like more of a hassle than a time-gain, then yeah, I would recommend another debugger altogether, like the Visual Studio debugger.
In fact, maybe a memory-analysis tool like Dr. Memory can give you what you need.

Should a daemon on an embedded Linux device using Busybox be written in C or as a script?

Should a daemon on an embedded device using Busybox be written in C or as a script?
All the examples I have seen use #!/bin/ash at the top of the file, and that is for scripting. But the device I'm writing to has only compiled C files (I think) and symbolic links in /usr/bin.
Every way I try to compile a C file with #include </bin/ash> (e.g. gcc -Wall -o daemon_busybox daemon_busybox.c), I get error after error reported in /bin/ash:
/bin/ash:174:1: error: stray ‘\213’ in program
/bin/ash:174:1: error: stray ‘\10’ in program
/bin/ash:174:1: error: stray ‘\273’ in program
/bin/ash:174:1: error: stray ‘\204’ in program
/bin/ash:174:1: error: stray ‘\342’ in program
Note I have set this: /bin/ash -> busybox
Any ideas which way I should go?
Update:
I've been given the task of trying to see if a daemon can be run on a small device that runs Linux (2.6.35-at-alpha4) and Java (SE Embedded Runtime Environment) with very limited memory (i.e. a 10 second wait to get java -version to report back).
Two weeks ago I didn't know much about daemons — only knew the word. So, this is all new to me.
On my development machine I have built two different daemon files, one in C and one as a script. Both run very nicely on my Linux machine.
But because of the very small size of the target device there is only busybox (no /lib/lsb/init-functions). So I'm trying to build a 3rd daemon file. I believe it should be written in C for this device, but all examples for busybox point to scripting.
Once your question is edited so that the file name you're trying to #include is visible, the problem becomes self-evident:
#include </bin/ash>
This tries to make the C compiler include the binary of busybox (via the symlink /bin/ash) into the code to be compiled. The average binary is not a valid C source file; this is doomed to failure.
Perhaps you simply need to drop that line — the C compiler stands a better chance of working if it is given header files and source files to compile. Maybe there's more work needed; we don't have enough information to help there.
Many daemons are written as C programs, but a carefully written shell script could be used instead.
Personally, I would like to do this as a script (I've never liked C). But on the device everything in the /usr/sbin folder looks like a compiled C program, so the conservative coder in me says C is the way to go. I know I should ask the guys who developed the device, but they're long gone. Right now my daemon is just a test (i.e. printf("Hello World\n");). I'm trying to get printf passed to Busybox, but so far I cannot get this file to compile. I just need a simple daemon in C to start.
OK; your C code for that should be just:
#include <stdio.h>
int main(void)
{
    printf("Hello World\n");
    return 0;
}
Save it in hw_daemon.c. Compile it using:
gcc -o hw_daemon hw_daemon.c
If that won't compile, then you've not got a workable C development environment for the target machine. If that will compile, you should be able to run it with:
./hw_daemon
and you should see the infamous 'Hello World' message appear.
If that does not work, then you can go with the script version instead, in a file hw_script.sh:
#!/bin/ash
printf "Hello World\n"
You should be able to run that with:
(Predicted output, not output observed on a machine.)
$ ash hw_script.sh
Hello World
$ chmod +x hw_script.sh
$ ./hw_script.sh
Hello World
$
If neither of those works at all, then you've got major problems on the system (maybe Busybox doesn't provide a printf command workalike, for example, and you need to use echo "Hello World" instead of the printf).

Why are different nodes running different compiles of my executable? (MPI)

After I recompile my (C) program, some nodes are running old compiles (with the debug information still in it), and some nodes are running the new copy. The server is running Gentoo Linux and all nodes get the file from the same storage. I'm told the filesystem is NFS. The MPI I'm using is MPICH Version 1.2.7. Why are some nodes not using the newly compiled copy?
Some more details (in case you're having trouble sleeping):
I'm trying to create my first MPI program (and I'm new to C and Linux, too). I have the following in my code:
#if DEBUG
{
    int i = 9;
    pid_t PID;
    char hostname[256];
    gethostname(hostname, sizeof(hostname));
    printf("PID %d on %s ready for attach.\n", PID = getpid(), hostname);
    fflush(stdout);
    while (i > 0) {
        printf("PID %d on %s will wait for `gdb` to attach for %d more iterations.\n", PID, hostname, i);
        fflush(stdout);
        sleep(5);
        i--;
    }
}
#endif
Then I recompiled with (no -DDEBUG=1 option, so the above code is excluded)
$ mpicc -Wall -I<directories...> -c myprogram.c
$ mpicc -o myprogram myprogram.o -Wall <some other options...>
The program compiles with no problems. Then I execute it like this:
$ mpirun -np 3 myprogram
Sometimes (and more and more frequently), different copies of the executable run on different nodes of the cluster. On some nodes the debugging code executes (and prints) and on some nodes it doesn't.
Note that the cluster is currently experiencing some "clock skew" (or something like that), which may be the cause. Is that the problem?
Also note that I actually just change the compile options by commenting/uncommenting lines in a Makefile because I haven't had time to implement these suggestions yet.
Edit: When the problem occurs, md5sum myprogram returns a different value on the nodes where the issue presents itself.
Your nodes have retained a cached copy of the file and are using that instead of the latest version when you run the binary. This has little to nothing to do with Gentoo; it is an artifact of the Linux (kernel) caching and/or NFS implementations.
In other words, your binary is cached. Read this answer:
NFS cache-cleaning command?
Tweaking some settings may also help.
I happen to have a command here that syncs and flushes:
$ cat /home/jaroslav/bin/flush_cache
sudo sync
sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'

Debugging Segmentation Faults on a Mac?

I'm having some problems with a program causing a segmentation fault when run on a Mac. I'm putting together an entry for the IOCCC, which means the following things are true about my program:
It's a very small C program in a single file called prog.c
I won't post it here, because it won't help (and would probably render the contest entry invalid)
It compiles cleanly under gcc using "cc -o prog prog.c -Wall"
Despite (or, more accurately, because of) the fact it contains a bunch of really bizarre uses of C, it has been constructed extremely carefully. I don't know of any part of it which is careless with memory (which is not to say that there can't possibly be bugs, just that if there are they're not likely to be obvious ones)
I'm primarily a Windows user, but several years ago I successfully compiled and ran it on several windows machines, a couple of Macs and a Linux box, with no problems. The code hasn't changed since then, but I no longer have access to those machines.
I don't have a Linux machine to re-test on, but as one final test, I tried compiling and running it on a MacBook Pro - Mac OSX 10.6.7, Xcode 4.2 (i.e. GCC 4.2.1). Again, it compiles cleanly from the command line. It seems that on a Mac typing "prog" won't make the compiled program run, but "open prog" seems to. Nothing happens for about 10 seconds (my program takes about a minute to run when it's successful), but then it just says "Segmentation fault", and ends.
Here is what I've tried, to track down the problem, using answers mostly gleaned from this useful StackOverflow thread:
On Windows, peppered the code with _ASSERTE(_CrtCheckMemory()); - The code ran dog-slow, but ran successfully. None of the asserts fired (they do when I deliberately add horrible code to ensure that _CrtCheckMemory and _ASSERTE are working as expected, but not otherwise)
On the Mac, I tried Mudflap. I tried to build the code using variations of "g++ -fmudflap -fstack-protector-all -lmudflap -Wall -o prog prog.c", which just produces the error "cc1plus: error: mf-runtime.h: No such file or directory". Googling the matter didn't bring up anything conclusive, but there does seem to be a feeling that Mudflap just doesn't work on Macs.
Also on the Mac, I tried Valgrind. I installed and built it, and built my code using "cc -o prog -g -O0 prog.c". Running Valgrind with the command "valgrind --leak-check=yes prog" produces the error "valgrind: prog: command not found". Remembering you have to "open" an executable on a Mac, I tried "valgrind --leak-check=yes open prog", which appears to run the program, and also runs Valgrind, which finds no problems. However, Valgrind is failing to find problems for me even when I run it with programs which are designed specifically to make it trigger error messages. Is this also broken on Macs?
I tried running the program in Xcode, with all the Diagnostics checkboxes ticked in the Product->Edit Scheme... menu, and with a symbolic breakpoint set in malloc_error_break. The breakpoint doesn't get hit, the code stops with a callstack containing one thing ("dlopen"), and the only thing of note that shows up in the output window is the following:
Warning: Unable to restore previously selected frame.
No memory available to program now: unsafe to call malloc
I'm out of ideas. I'm trying to get Cygwin set up (it's taking hours though) to see if any of the tools will work that way, but if that fails then I'm at a loss. Surely there must be SOME tools which are capable of tracking down the causes of Segmentation faults on a Mac?
For the more modern lldb flavor:
$ lldb --file /path/to/program
...
(lldb) r
Process 89510 launched
...
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x726f00)
* frame #0: 0x00007fff73856e52 libsystem_platform.dylib`_platform_strlen + 18
...
Have you compiled with -g and run it inside gdb? Once the app crashes, you can get a backtrace with bt that should show you where the crash occurs.
In many cases, macOS stores the recent program crash logs under ~/Library/Logs/DiagnosticReports/ folder.
Usually I will try the following steps when doing troubleshooting on macOS:
Clean the existing crash logs under the ~/Library/Logs/DiagnosticReports/
Run the program again to reproduce the issue
Wait for a few seconds, the crash log will appear under the folder. The crash log is named like {your_program}_{crashing_date}_{id}_{your_host}.crash
Open the crash log with your text editor, search for the keyword Crashed to locate the thread causing the crash. It will show you the stack trace during crash, and in many cases, the exact line of source code causing the crash will be recorded as well.
Some links:
[1] https://mac-optimization.bestreviews.net/analyze-mac-crash-reports/