I'm trying to configure CLion so that I can use OpenMP. When using the default settings on my Mac, the compiler is clang. Default Apple clang does not support OpenMP.
When I change my compiler to GCC, the debugger will not stop at breakpoints. The program just runs as it would when executing the compiled file.
The CMakeLists.txt file below works perfectly with the CLion debugger. When I uncomment the compiler flags, the debugger ignores the breakpoints.
cmake_minimum_required(VERSION 3.8)
project(CLionTest)
set(CMAKE_C_STANDARD 99)
#set(CMAKE_C_COMPILER /usr/local/bin/gcc-7)
#set(CMAKE_C_FLAGS -fopenmp)
#set(CMAKE_C_FLAGS_DEBUG "-D_DEBUG")
set(MAIN main.c)
add_executable(CLionTest ${MAIN})
add_custom_target(CLionTestMake COMMAND make all WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR})
How do I fix this?
Toolchain settings:
CMake executable: Bundled CMake 3.8.2
Debugger: Bundled LLDB 3.9.0
main.c:
#include <stdio.h>
#include <unistd.h>
#ifdef _OPENMP
#include <omp.h>
#endif
int main() {
printf("Hello, World!\n");
#pragma omp parallel
{
#ifdef _OPENMP
int size = omp_get_num_threads();
int rank = omp_get_thread_num();
#else
int rank = 0;
int size = 1;
#endif
printf("%d/%d\n", rank, size);
};
return 0;
}
set(CMAKE_C_FLAGS -fopenmp)
set(CMAKE_C_FLAGS_DEBUG "-D_DEBUG")
You are replacing the C flags instead of appending them, so you are dropping the builtin -g option that generates debug symbols. Instead, do
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fopenmp")
set(CMAKE_C_FLAGS_DEBUG "${CMAKE_C_FLAGS_DEBUG} -D_DEBUG")
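To check that the appended flags actually reach the compiler, a small test like the one below can help. It is only a sketch: the helper names openmp_version, debug_macro_defined and report_build_flags are made up for illustration. It relies on _OPENMP being predefined by the compiler when -fopenmp is in effect, and on _DEBUG coming from the CMAKE_C_FLAGS_DEBUG line above. It cannot tell you whether -g was passed.

```c
#include <stdio.h>

/* Hypothetical helper: returns the value of _OPENMP (a yyyymm date)
   when compiled with -fopenmp, or 0 when OpenMP is disabled. */
long openmp_version(void) {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if -D_DEBUG reached the compiler, else 0. */
int debug_macro_defined(void) {
#ifdef _DEBUG
    return 1;
#else
    return 0;
#endif
}

/* Prints both results so they can be read in the run window. */
void report_build_flags(void) {
    printf("OpenMP version: %ld, _DEBUG defined: %d\n",
           openmp_version(), debug_macro_defined());
}
```

Calling report_build_flags() at the top of main() in a Debug build should show a non-zero OpenMP version and _DEBUG defined as 1 if both set() lines took effect.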
We have a translation unit we want to compile with AVX2 (only that one):
It's telling GCC upfront, first line in the file:
#pragma GCC target "arch=core-avx2,tune=core-avx2"
This used to work with GCC 4.8 and 4.9 but from 6 onward (tried 7 and 8 too) we get this warning (that we treat as an error):
error: SSE instruction set disabled, using 387 arithmetics
On the first function returning a float. I have tried to enable back SSE 4.2 (and avx and avx2) like so
#pragma GCC target "sse4.2,arch=core-avx2,tune=core-avx2"
But that is not enough, the error persists.
EDIT:
Relevant compiler flags, we target AVX for most stuff:
-mfpmath=sse,387 -march=corei7-avx -mtune=corei7-avx
EDIT2: minimal sample:
#pragma GCC target "arch=core-avx2,tune=core-avx2"
#include <immintrin.h>
#include <math.h>
static inline float
lg1pf( float x ) {
return log1pf(x)*1.44269504088896338700465f;
}
int main()
{
log1pf(2.0f);
}
Compiled that way:
gcc -o test test.c -O2 -Wall -Werror -pedantic -std=c99 -mfpmath=sse,387 -march=corei7-avx -mtune=corei7-avx
In file included from /home/xxx/gcc-7.1.0/lib/gcc/x86_64-pc-linux-gnu/7.1.0/include/immintrin.h:45:0,
from test.c:3:
/home/xxx/gcc-7.1.0/lib/gcc/x86_64-pc-linux-gnu/7.1.0/include/avx512fintrin.h: In function ‘_mm_add_round_sd’:
/home/xxx/gcc-7.1.0/lib/gcc/x86_64-pc-linux-gnu/7.1.0/include/avx512fintrin.h:1412:1: error: SSE register return with SSE disabled
{
^
GCC details (I don't have the flags that were used to compile it though)
gcc --version
gcc (GCC) 7.1.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Potential solution
#pragma GCC target "avx2"
Worked for me without other changes to the code.
Related problem: applying the attribute to individual functions did not work either:
__attribute__((__target__("arch=broadwell"))) // does not compile
__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }
__attribute__((__target__("avx2,arch=broadwell"))) // does not compile
__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }
__attribute__((__target__("avx2"))) // compiles
__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }
This looks like a bug. #pragma GCC target before #include <immintrin.h> breaks the header somehow, IDK why. Even if AVX2 was enabled on the command line with -march=haswell, a #pragma seems to break inlining of any intrinsics defined after that.
You can use #pragma after the header, but then using intrinsics that weren't enabled on the command line fails.
Even a more modern target name like #pragma GCC target "arch=haswell" causes the error, so it's not that the old nebulous target names like corei7-avx are broken in general. They still work on the command line. If you want to enable something for a whole file, the standard way is to use compiler options and not pragmas.
GCC does claim to support target options on a per-function basis with pragmas or __attribute__, though. https://gcc.gnu.org/onlinedocs/gcc/Function-Specific-Option-Pragmas.html.
This is as far as I've gotten playing around with this (Godbolt compiler explorer with gcc8.1). Clang is unaffected because it ignores #pragma GCC target. (So that means that #pragma is not very portable; you probably want your code to work with any GNU C compiler, not just gcc itself.)
// breaks gcc when before immintrin.h
// #pragma GCC target "arch=haswell"
#include <immintrin.h>
#include <math.h>
//#pragma GCC target "arch=core-avx2,tune=core-avx2"
#pragma GCC target "arch=haswell"
//static inline
float
lg1pf( float x ) {
return log1pf(x)*1.44269504088896338700465f;
}
// can accept / return wide vectors
__m128 nop(__m128 a) { return a; }
__m256 require_avx(__m256 a) { return a; }
// but error on using intrinsics if #include happened without target options
//__m256 use_avx(__m256 a) { return _mm256_add_ps(a,a); }
// this works, though, because AVX is enabled at this point
// presumably so would __builtin_ia32_whatever
// Without `arch=haswell`, this breaks, so we know the pragma "worked"
__m256 use_native_vec(__m256 a) { return a+a; }
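One way to keep AVX2 code inside a file that is otherwise built for an older target is to put the target attribute on a single function, after the #include, and choose the implementation at run time. The sketch below assumes GCC or Clang on x86. The names add_scalar, add_avx2 and add_floats are hypothetical, and __builtin_cpu_supports is a GCC/Clang builtin, not something used in the question. It follows the observation above that __attribute__((__target__("avx2"))) compiles, and it is not a fix for the pragma problem itself.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define HAVE_X86_TARGET_ATTR 1
#endif

/* Portable fallback, compiled with the command-line target options. */
static void add_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++) out[i] = a[i] + b[i];
}

#ifdef HAVE_X86_TARGET_ATTR
/* Only this function is compiled with AVX2 enabled. */
static __attribute__((target("avx2")))
void add_avx2(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {            /* 8 floats per 256-bit vector */
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; i++) out[i] = a[i] + b[i]; /* scalar tail */
}
#endif

/* Hypothetical dispatcher: uses the AVX2 version only if the CPU has it. */
void add_floats(const float *a, const float *b, float *out, size_t n) {
#ifdef HAVE_X86_TARGET_ATTR
    if (__builtin_cpu_supports("avx2")) {
        add_avx2(a, b, out, n);
        return;
    }
#endif
    add_scalar(a, b, out, n);
}
```

The run-time check matters because code compiled for AVX2 will fault on a CPU that lacks it, regardless of what the rest of the file was built for.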
Is there a good way to use OpenMP to parallelize a for-loop, only if an -omp argument is passed to the program?
This does not seem possible, since #pragma omp parallel for is a compiler directive that is processed at compile time, whereas whether the argument was passed to the program is only known at run time.
At the moment I am using a very ugly solution to achieve this, which leads to an enormous duplication of code.
if(ompDefined) {
#pragma omp parallel for
for(...)
...
}
else {
for(...)
...
}
I think what you are looking for can be solved using a CPU dispatcher technique.
For benchmarking OpenMP code vs. non-OpenMP code you can create different object files from the same source code like this
//foo.c
#ifdef _OPENMP
double foo_omp() {
#else
double foo() {
#endif
double sum = 0;
#pragma omp parallel for reduction(+:sum)
for(int i=0; i<1000000000; i++) sum += i%10;
return sum;
}
Compile like this
gcc -O3 -c foo.c
gcc -O3 -fopenmp -c foo.c -o foo_omp.o
This creates two object files foo.o and foo_omp.o. Then you can call one of these functions like this
//bar.c
#include <stdio.h>
double foo();
double foo_omp();
double (*fp)();
int main(int argc, char *argv[]) {
if(argc>1) {
fp = foo_omp;
}
else {
fp = foo;
}
double sum = fp();
printf("sum %e\n", sum);
}
Compile and link like this
gcc -O3 -fopenmp bar.c foo.o foo_omp.o
Then I time the code like this
time ./a.out -omp
time ./a.out
and the first case takes about 0.4 s and the second case about 1.2 s on my system with 4 cores/8 hardware threads.
Here is a solution which only needs a single source file
#include <stdio.h>
typedef double foo_type();
foo_type foo, foo_omp, *fp;
#ifdef _OPENMP
#define FUNCNAME foo_omp
#else
#define FUNCNAME foo
#endif
double FUNCNAME () {
double sum = 0;
#pragma omp parallel for reduction(+:sum)
for(int i=0; i<1000000000; i++) sum += i%10;
return sum;
}
#ifdef _OPENMP
int main(int argc, char *argv[]) {
if(argc>1) {
fp = foo_omp;
}
else {
fp = foo;
}
double sum = fp();
printf("sum %e\n", sum);
}
#endif
Compile like this
gcc -O3 -c foo.c
gcc -O3 -fopenmp foo.c foo.o
You can set the number of threads at run-time by calling omp_set_num_threads:
#include <omp.h>
int main()
{
int threads = 1;
#ifdef _OPENMP
omp_set_num_threads(threads);
#endif
#pragma omp parallel for
for(...)
{
...
}
}
This isn't quite the same as disabling OpenMP, but it will stop it running calculations in parallel. I've found it's always a good idea to set this using a command line switch (you can implement this using GNU getopt or Boost.ProgramOptions). This allows you to easily run single-threaded and multi-threaded tests on the same code.
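A minimal sketch of that command-line switch follows. It uses a plain strcmp scan in place of getopt, and the helper names wants_openmp, configure_threads and sum_mod10 are assumptions made for illustration. The "-omp" switch is the one from the question, and the loop body is borrowed from the benchmark earlier in this thread.

```c
#include <string.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: returns 1 if "-omp" appears among the arguments. */
int wants_openmp(int argc, char *argv[]) {
    for (int i = 1; i < argc; i++)
        if (strcmp(argv[i], "-omp") == 0) return 1;
    return 0;
}

/* Limits the OpenMP runtime to one thread unless parallel execution was
   requested. Does nothing when compiled without -fopenmp. */
void configure_threads(int use_openmp) {
#ifdef _OPENMP
    if (!use_openmp) omp_set_num_threads(1);
#else
    (void)use_openmp;
#endif
}

/* Same loop in both modes; only the number of threads differs. */
double sum_mod10(int n) {
    double sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) sum += i % 10;
    return sum;
}
```

In main(), calling configure_threads(wants_openmp(argc, argv)) before the first parallel region is enough. When the switch is absent the loop still goes through the OpenMP runtime, just with a single thread.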
As Vladimir F pointed out in the comments, you can also set the number of threads by setting the environment variable OMP_NUM_THREADS before executing your program:
gcc -Wall -Werror -pedantic -O3 -fopenmp -o test test.c
export OMP_NUM_THREADS=1
./test
unset OMP_NUM_THREADS
Finally, you can disable OpenMP at compile-time by not providing GCC with the -fopenmp option. However, you will need to put preprocessor guards around any lines in your code that require OpenMP to be enabled (see above). If you want to use some functions included in the OpenMP library without actually enabling the OpenMP pragmas you can simply link against the OpenMP library by replacing the -fopenmp option with -lgomp.
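The preprocessor guards mentioned here can be kept in one place by providing serial stand-ins for the few OpenMP functions the code calls. This is a sketch under that assumption. The stand-ins reuse the real OpenMP function names on purpose and exist only when _OPENMP is not defined, while team_size and hello_from_team are hypothetical names.

```c
#include <stdio.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins, used only when compiled without -fopenmp. */
static int omp_get_thread_num(void)  { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Returns the number of threads in a parallel region (1 without OpenMP). */
int team_size(void) {
    int size = 1;
    #pragma omp parallel
    {
        #pragma omp single
        size = omp_get_num_threads();
    }
    return size;
}

/* Prints "rank/size" once per thread, with no #ifdef inside the body. */
void hello_from_team(void) {
    #pragma omp parallel
    {
        printf("%d/%d\n", omp_get_thread_num(), omp_get_num_threads());
    }
}
```

Without -fopenmp the pragmas are ignored (GCC may warn about unknown pragmas) and the stand-ins keep the calls valid, so the rest of the code needs no further guards.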
One solution would be to use the preprocessor to ignore the pragma statement if you do not pass an additional flag to the compiler.
For example in your code you might have:
#ifdef MP_ENABLED
#pragma omp parallel for
#endif
for(...)
...
and then when you compile you can pass a flag to the compiler to define the MP_ENABLED macro. In the case of GCC (and Clang) you would pass -DMP_ENABLED.
You then might compile with gcc as
gcc SOME_SOURCE.c -I SOME_INCLUDE.h -fopenmp -DMP_ENABLED -o SOME_OUTPUT
then when you want to disable the parallelism you can make a minor tweak to the compile command by dropping -DMP_ENABLED.
gcc SOME_SOURCE.c -I SOME_INCLUDE.h -fopenmp -o SOME_OUTPUT
This causes the macro to be undefined which leads to the preprocessor ignoring the pragma.
You could also use a similar solution using ifndef instead depending on whether you consider the parallel behavior the default or not.
Edit: As noted in some comments, compiling with OpenMP enabled (e.g. with -fopenmp) predefines macros such as _OPENMP, which you could use in place of your own user-defined macro. That looks to be a superior solution, but the difference in effort is reasonably small.
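The macro approach decides at compile time. For a decision at run time, which is what the question asks for, OpenMP itself offers an if clause on the parallel directive. This is a different technique from the macros above, shown here only as a sketch. The function name sum_mod10_if and its use_openmp parameter are hypothetical, and the loop body is borrowed from the benchmark earlier in this thread.

```c
/* Sums i % 10 for i in [0, n). The loop runs in parallel only when
   use_openmp is nonzero and the file was compiled with -fopenmp.
   Otherwise the same loop runs on a single thread. */
double sum_mod10_if(int n, int use_openmp) {
    double sum = 0;
    (void)use_openmp; /* avoids an unused-parameter warning without -fopenmp */
    #pragma omp parallel for reduction(+:sum) if(use_openmp)
    for (int i = 0; i < n; i++) sum += i % 10;
    return sum;
}
```

The loop is written once, so the duplication in the question's if/else goes away. The value of use_openmp could come from checking argv for the -omp switch.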
CMake runs just fine with no errors (I guess! I just started using CMake), and so does mingw32-make. I suppose compilation and linking are done correctly. But when I open the program (graphics.exe) it immediately stops working. (BTW, I'm not missing SDL2.dll.) What's wrong?
edit:
It works just fine when using the 32-bit libraries, the 32-bit DLL and "-m32", but it crashes when using the 64-bit ones.
CMakeLists.txt:
cmake_minimum_required(VERSION 3.8)
project(graphics)
set(CMAKE_C_STANDARD 11)
set(SOURCE_FILES main.c)
set(DEPS ${CMAKE_CURRENT_SOURCE_DIR}/Deps)
include_directories("${DEPS}/include")
add_executable(graphics ${SOURCE_FILES})
#find sdl2 libraries
find_library(SDL2MAIN_LIB SDL2main ${DEPS}/lib)
find_library(SDL2_LIB SDL2 ${DEPS}/lib)
message("SDL2main found at: ${SDL2MAIN_LIB}")
message("SDL2 found at: ${SDL2_LIB}")
target_link_libraries(graphics ${SDL2MAIN_LIB})
target_link_libraries(graphics ${SDL2_LIB})
main.c:
#include <stdio.h>
#include <SDL2/SDL.h>
int main(int argc,char** argv) {
if(SDL_Init(SDL_INIT_EVERYTHING) == 0)
printf("SDL2 up and running\n");
else
printf("Could not initialize SDL2");
SDL_Quit();
return 0;
}
I am quite new to meson and C, please forgive me if the answer to this question is trivial ...
I want to use OpenMP in a C project, and I am using meson as a build tool.
I want to compile the parallel for example from this tutorial.
My main.c looks very similar:
#include <omp.h>
#define N 1000
#define CHUNKSIZE 100
int main(int argc, char *argv[]) {
int i, chunk;
float a[N], b[N], c[N];
/* Some initializations */
for (i=0; i < N; i++)
a[i] = b[i] = i * 1.0;
chunk = CHUNKSIZE;
#pragma omp parallel for \
shared(a,b,c,chunk) private(i) \
schedule(static,chunk)
for (i=0; i < N; i++)
c[i] = a[i] + b[i];
return 0;
}
My short meson.build file contains this:
project('openmp_with_meson', 'c')
# add_project_arguments('-fopenmp', language: 'c')
exe = executable('some_exe', 'src/main.c') #, c_args: '-fopenmp')
I commented out the c_args keyword in the call to executable here.
Now I end up with the following scenarios:
without '-fopenmp' option, I get the warning, that the pragma is unknown and will be ignored (as I would expect): ../src/main.c:15:0: warning: ignoring pragma omp parallel [-Wunknown-pragmas] #pragma omp parallel for
with the option c_args: '-fopenmp' inserted, I do not get the above warning anymore, instead I get errors for undefined references to GOMP_parallel, omp_get_num_threads and omp_get_thread_num, and nothing gets built
when I use gcc manually with gcc -Wall -o manually_with_gcc ../src/main.c -fopenmp the program compiles and executes without any errors.
Can anyone tell me how to get the executable to compile with meson?
Meson 0.46 or later
Meson 0.46 (released Apr 23, 2018) added OpenMP support. So, if you have meson 0.46 or later,
project('openmp_with_meson', 'c')
omp = dependency('openmp')
exe = executable('some_exe', 'src/main.c',
dependencies : omp)
Should work with both GCC and Clang.
Meson 0.45 or earlier
If you have an older version, as shipped with Debian Stretch, Ubuntu Bionic (18.04 LTS), or Fedora 27, you can do the following:
You need another keyword arg link_args : '-fopenmp' for executable().
exe = executable('some_exe', 'src/main.c',
c_args: '-fopenmp',
link_args : '-fopenmp')
Meson builds a C program in two phases: compiling and linking. You can pass extra arguments with c_args for compiling and link_args for linking.
The option -fopenmp enables OpenMP directives while compiling, and
the flag also arranges for automatic linking of the OpenMP runtime
library.
That is, -fopenmp is a dual-purpose option.
The above is simple and sufficient. Once you understand it, however, you can also compile your program with -fopenmp to activate the OpenMP directives and link the OpenMP runtime library yourself, instead of passing -fopenmp in link_args.
Here is a complete meson.build:
project('openmp_with_meson', 'c')
cc = meson.get_compiler('c')
libgomp = cc.find_library('gomp')
exe = executable('some_exe', 'src/main.c',
c_args: '-fopenmp',
dependencies : libgomp)
Meson >= 0.46 now has a builtin for this (docs):
openmp = dependency('openmp') # meson builtin
I've recently started to play around with OpenMP and like it very much.
I am a just-for-fun Classic-VB programmer and like coding functions for my VB programs in C. As such, I use Windows 7 x64 and GCC 4.7.2.
I usually set up all my C functions in one large C file and then compile a DLL out of it. Now I would like to use OpenMP in my DLL.
First of all, I set up a simple example and compiled an exe file from it:
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int n = 520000;
int i;
int a[n];
int NumThreads;
omp_set_num_threads(4);
#pragma omp parallel for
for (i = 0; i < n; i++)
{
a[i] = 2 * i;
NumThreads = omp_get_num_threads();
}
printf("Value = %d.\n", a[77]);
printf("Number of threads = %d.", NumThreads);
return(0);
}
I compile that using gcc -fopenmp !MyC.c -o !MyC.exe and it works like a charm.
However, when I try to use OpenMP in my DLL, it fails. For example, I set up this function:
__declspec(dllexport) int __stdcall TestAdd3i(struct SAFEARRAY **InArr1, struct SAFEARRAY **InArr2, struct SAFEARRAY **OutArr) //OpenMP Test
{
int LengthArr;
int i;
int *InArrElements1;
int *InArrElements2;
int *OutArrElements;
LengthArr = (*InArr1)->rgsabound[0].cElements;
InArrElements1 = (int*) (**InArr1).pvData;
InArrElements2 = (int*) (**InArr2).pvData;
OutArrElements = (int*) (**OutArr).pvData;
omp_set_num_threads(4);
#pragma omp parallel for private(i)
for (i = 0; i < LengthArr; i++)
{
OutArrElements[i] = InArrElements1[i] + InArrElements2[i];
}
return(omp_get_num_threads());
}
The structs are defined, of course. I compile that using
gcc -fopenmp -c -DBUILD_DLL dll.c -o dll.o
gcc -fopenmp -shared -o mydll.dll dll.o -lgomp -Wl,--add-stdcall-alias
The compiler and linker do not complain (not even warnings come up) and the DLL file is actually built. But when I try to call the function from within VB, the VB compiler claims that the DLL file could not be found (run-time error 53). The strange thing is that as soon as a single OpenMP "command" is present inside the .c file, the VB compiler claims a missing DLL even if I call a function that does not contain a single line of OpenMP code. When I comment all the OpenMP stuff out, the function works as expected, but doesn't use OpenMP for parallelization, of course.
What is wrong here? Any help appreciated, thanks in advance! :-)
The problem in this case is most probably that the library search path is not set. You must add the directory that contains the DLL to that path (LD_LIBRARY_PATH on Linux; on Windows, DLLs are looked up via PATH), or the system will not be able to find it and will report exactly this kind of error.