I started using Magick.NET and it only performs tasks using one thread.
Do I have to do something to make it perform filter operations using multiple threads?
Or is OpenMP not part of Magick.NET?
Using ImageMagick from the command line uses OpenMP and all cores flawlessly.
Just checked against GraphicsMagick's Magick.NET: OpenMP is working there...
I tweeted the Magick.NET dev and he confirmed that Magick.NET does not have OpenMP support.
The GraphicsMagick version of Magick.NET does.
However, even GM.NET will drop OpenMP in future releases, so check the changelog if you are reading this in the future.
We're developing a multithreaded project. My colleague says that gprof works perfectly with multithreaded programs, with no workaround needed. I read otherwise some time ago:
http://sam.zoy.org/writings/programming/gprof.html
http://lists.gnu.org/archive/html/bug-binutils/2010-05/msg00029.html
I also read this:
How to profile multi-threaded C++ application on Linux?
So I'm guessing the workaround is no longer needed? If so, since when has it not been needed?
Unless you change the processing model, gprof works fine.
By "changing the processing model" I mean using a co-processor or GPUs as computing units. In the worst case you have to call the setitimer function manually for every thread, but as of recent versions (2013-14) that is no longer needed.
In certain cases it still behaves erratically, so I advise using Intel's VTune, which gives more accurate and more detailed information.
I have just converted a program to make use of MPI calls for use on multiple nodes but I am having a problem getting IO to work well with MPI calls.
I am using standard MPI2 IO Methods like MPI_File_open and MPI_File_write to write my final results to a file. On my laptop, I experience a slight speedup (0.2s -> 0.1s) but on the University's super computer my file writing speed becomes abysmal - (0.2s -> 90s!).
I can't understand why performance would be so bad on the supercomputer yet improved on my laptop. Is there something I am overlooking that would heavily contribute to the slow speed?
Some Notes:
The file system on my laptop is ext4; the one used by the University is NFS.
I am using Open MPI 1.4.4 on the supercomputer and Open MPI 1.4.5 on my laptop.
I have to change the process's view multiple times using MPI_File_set_view due to a requirement set in the guidelines, which I don't think I can get around.
I have tried using the asynchronous version of write, MPI_File_iwrite, but this actually gives worse results.
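For reference, the write path described above can be sketched as below. This is only an illustration of the calls involved, with a made-up filename and made-up sizes. One thing worth trying on NFS is the collective `MPI_File_write_all` instead of independent `MPI_File_write`: NFS's consistency semantics commonly force MPI-IO implementations into expensive file locking for independent writes, which can explain drastic slowdowns like the one you are seeing.

```c
/* Minimal collective MPI-IO write sketch. Compile with mpicc and run
 * under mpirun; "out.dat" and the block layout are illustrative. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    enum { N = 1024 };
    double buf[N];
    for (int i = 0; i < N; i++) buf[i] = rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "out.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* One contiguous block per rank; set the view as few times as the
     * guidelines allow, and prefer the collective write variant. */
    MPI_Offset disp = (MPI_Offset)rank * N * sizeof(double);
    MPI_File_set_view(fh, disp, MPI_DOUBLE, MPI_DOUBLE,
                      "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, buf, N, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```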
As part of my research I am looking for ways to profile an OpenMP code with explicit tasks (as per OpenMP 3.0). My main objective is to study the amount of overhead incurred when tasks lie idle at a global barrier (such as a taskwait) before being scheduled and executed.
I looked into the latest version of TAU, which supports Opari, which in turn instruments the source code to produce profiling statistics. Unfortunately, since it instruments the source code, this leads to a large amount of overhead in program execution.
Tools like Gprof and PGprof do not provide the detail I am looking for. I have already tried and tested with them.
I am looking for a tool which can help me profile an OpenMP program with tasks while incurring minimal overhead. I am tempted to look into HPCToolkit and Scalasca, but I am not sure whether they support OpenMP tasks.
Looking forward to your directions and suggestions.
Thanks!
Try LIKWID = Like I Knew What I’m Doing.
It is very reliable and free.
Is there a task library for C? I'm talking about a parallel task library like the ones in C# or Java. In other words, I need a layer of abstraction over pthreads on Linux. Thanks.
Have a look at OpenMP.
In particular, you might be interested in the task feature of OpenMP 3.0.
I suggest, however, that you first check whether your problem can be solved using other, "basic" constructs such as parallel for, since they are simpler to use.
Probably the most widely-used parallel programming primitives aside from the Win32 ones are those provided by pthreads.
They are quite low-level but include everything you need to write an efficient blocking queue and so create a thread pool of workers that carry out a queue of asynchronous tasks.
There is also a Win32 implementation (pthreads-win32), so you can use the same codebase on Windows as well as POSIX systems.
Many concepts in the TPL (Task, work-stealing scheduler, ...) were inspired by a very successful MIT project named Cilk. Its commercial descendant, Cilk Plus, was acquired by Intel and integrated into Intel Parallel Building Blocks. You can still use Cilk as an open-source project, minus some of the advanced features. The good news is that Intel is releasing Cilk Plus as open source in GCC.
You should try Cilk, as it adds another layer of abstraction to C that makes it easy to express parallel algorithms while staying close enough to C to ensure good performance.
I've been meaning to check out libdispatch. Yes, it's built for OS X and blocks, but it has plain function interfaces as well. I haven't really had time to look at it yet, though, so I'm not sure whether it fills all your needs.
There is an academic project called Wool that implements a work-stealing scheduler in C (with significant help from the C preprocessor, AFAIK). It might be worth looking at, though it does not seem actively developed.
My question is whether it is a good idea to mix OpenMP with pthreads. Are there applications out there that combine the two? Is it good practice to mix them, or do typical applications just use one or the other?
Typically it's better to just use one or the other. But for myself at least, I do regularly mix the two and it's safe if it's done correctly.
The most common case I do this is where I have a lower-level library that is threaded using pthreads, but I'm calling it in a user application that uses OpenMP.
There are some cases where it isn't safe: for example, killing a pthread before you have exited all OpenMP regions in that thread.
I don't think it's a good idea.
See, OpenMP is basically made for portability. If you also use pthreads, you lose that very benefit:
pthreads is only supported on POSIX-compliant operating systems, while OpenMP can be used on virtually any OS provided it has compiler support.
In any case, OpenMP gives you a much higher level of abstraction than pthreads.
No problem.
The purposes of OpenMP and pthreads are different. OpenMP is perfect for writing loop-level parallelism. However, OpenMP is not adequate for expressing sophisticated thread communication and synchronization; for example, it does not support condition variables.
The caveat, as Mysticial pointed out, is handling and accessing native threads within OpenMP parallel constructs.
FYI, Intel's TBB and Cilk Plus are also often used in a mixed way.
On Windows and Linux it seems to work just fine. However, on a Mac OpenMP does not work when run in a newly created thread; it only works in the main thread.
It appears that the behavior of mixing the two threading models is not defined: some platforms/compilers support it, others do not.
Sure. I do it all the time. You have to be careful, though. Why do it? Because there are some instances in which you have to! In complicated tasking models, such as pipelined functions where you want to keep the pipe going, it may be the only way to take advantage of all the power available.
I find it very hard to believe that you would need pthreads if you already use OpenMP. You could use the sections pragma to run different functions in parallel; I have personally used it to implement pipeline parallelism.
Nowadays OpenMP does much more than pthreads, so if you use OpenMP you are covered. For instance, from GCC 5.0 onward, OpenMP offloading extensions can export code to the GPU. :D