Task library for C? - c

Is there a task library for C? I'm talking about the parallel task library as it exists in C#, or Java. In other words, I need a layer of abstraction over pthread for Linux. Thanks.

Give a look at OpenMP.
In particular, you might be interested in the Task feature of OpenMP 3.0.
I suggest you, however, to try to see if your problem can be solved using other, "basic" constructs, such as parallel for, since they are simpler to use.

Probably the most widely-used parallel programming primitives aside from the Win32 ones are those provided by pthreads.
They are quite low-level but include everything you need to write an efficient blocking queue and so create a thread pool of workers that carry out a queue of asynchronous tasks.
There is also a Win32 implementation so you can use the same codebase on Windows as well as POSIX systems.

Many concepts in TPL (Task, Work-Stealing Scheduler,...) are inspired by a very successful project named Cilk at MIT. Their advanced framework (Cilk Plus) was acquired by Intel and integrated to Intel Parallel Building Block. You still can use Cilk as an open source project without some advanced features. The good news is Intel is releasing Cilk Plus as open source in GCC.
You should try out Cilk as it adds another layer of abstraction to C, which makes it easy to express parallel algorithms but it is close enough to C to ensure good performance.

I've been meaning to checking out libdispatch. Yeah it's built for OS X and blocks, but they have function interfaces as well. Haven't really had time to look at it yet though so not sure if it fills all your needs.

There is an academia project called Wool that implements work stealing scheduler in C (with significant help of C preprocessor AFAIK). Might be worth looking at, though it does not seem actively developed.

Related

advanced-ish C programming in Windows (pthreads, signals and semaphores, oh my!)

I'm in my third year studying computer science, so I should probably actually know the answer to this question already, but nonetheless, I don't. Anyway, I'm taking the OS course for my degree currently and we've been been covering a lot of new programming concepts like signals, semaphores, and threads in C. Unfortunately, my prof is covering all of these in a Linux/OS X perspective. What this means for me, being on a 64-bit windows machine, is that things like installing an alarm signal, or using semaphores and pthreads won't compile or run on my machine (as far as I can tell).
Anyway, currently I have just been doing my assignments in a VM running Linux, which has worked well so far, but I much prefer the Windows environment for coding.
So, after that heavy winded introduction, my question is, as you might have already guessed, is there a way to code with all these features (alarm signals, semaphores, pthreads, etc.) and be able to compile and test them in Windows? I'm fully aware that the Windows OS does not support the alarm signal and has limited POSIX capability, but I've heard rumors tossed around about cygwin (which I did try to get to work, but not very hard :P) and micro Linux kernels that you can run in the background to use these features.
Anyway, if anyone can give me maybe a list of options they would recommend (preferably not stick with your VM, even though I'm thinking that might be my best option) and maybe some tips, pros, cons, maybe a setup guide, or really any non-empty subset of these options I would really appreciate it. Also, before you ask, we have to use C and the above mentioned programming features in our assignments, so there's no switch to Java or code in win32 option unfortunately :(
Thanks in advance to anyone who can lend some words of wisdom :)
The basic principles are all there in Windows but done differently. And I recommend that, if you're going to program for Windows, you do this in the Windows API rather than through an emulation layer like Cygwin. If anything at all you'll quickly learn that different operating systems take a different approach to signalling and process handling.
First thing to be aware of is that threads are much more lightweight in Windows while processes are significantly more heavyweight. With that in mind Windows programs operate most efficiently when using threads. There is the concept of the CriticalSection that you should become very familiar with. And the Semaphore Object. Keep reading the API and you'll find a wealth of information about these topics - the Microsoft documentation is actually rather good. A key thing to understand about the Windows API is that you almost always have to "create/get" a new object (and obtain a handle) before you can use it. And Windows doesn't like programs having too many handles.
Personally I like the POSIX API and have a love for Linux. But I do appreciate that if you want to do things properly in Windows you should be using the Windows OS API - they have thought about this carefully even though the results and methods may be somewhat different.
PS Windows doesn't have the "alarm". It is perhaps the single most prohibitive barrier to simply porting Unix/Linux utilities to Windows. (That and the fact that Windows processes have to "bootstrap" Internet/socket support before using it whereas Linux processes are good to go).
There's MinGW-w64 - a Windows port of the GNU toolchain - and Pthreads-win32, a POSIX wrapper of the Windows threading API.
I'm using these via the mingw64-x86_64 Cygwin cross-compiler packages (which currently provide the somewhat dated gcc 4.5.3) instead of directly for two reasons: First, I need other stuff from the GNU toolbox, and second because of the package manager.
I'm not sure to what degree Pthreads-win32 complies to POSIX, but I can confirm that LLVM and Clang both compile with this setup.

Mixing OpenMP with pthreads

My question is whether is it a good idea to mix OpenMP with pthreads. Are there applications out there which combine these two. Is it a good practice to mix these two? Or typical applications normally just use one of the two.
Typically it's better to just use one or the other. But for myself at least, I do regularly mix the two and it's safe if it's done correctly.
The most common case I do this is where I have a lower-level library that is threaded using pthreads, but I'm calling it in a user application that uses OpenMP.
There are some cases where it isn't safe. If for example, you kill a pthread before you exit all OpenMP regions in that thread.
I don't think so..
Its not a good idea. See the thing is, OpenMP is basically made for portability. Now if u are using pthread, then you are loosing the very essence of it!
pthread could only be supported by POSIX compliant OS's. While OpenMP could be used virtually on any OS provided they have a support for it.
Anyway, OpenMP gives you an abstraction much higher than what is provided by pthead.
No problem.
The purpose of OpenMP and pthreads are different. OpenMP is perfect to write a loop-level parallelism. However, OpenMP is not adequate to express sophisticated thread communications and synchronizations. OpenMP does not support all kinds of synchronizations, such as condition variables.
The caveat would be, as Mystrical pointed out, handling and accessing native threads within OpenMP parallel constructs.
FYI, Intel's TBB and Cilk Plus are also often used in a mixed way.
On Windows and Linux it seems to work just fine. However, OpenMP does not work on a Mac if it is run in a new thread. It only works in the main thread.
It appears that the behavior of how to mix the two threading modules is not defined. Some platform/compilers support it, others do not.
Sure. I do it all the time. You have to be careful. Why do it, though? Because there are some instances in which you have to! In complicated tasking models, such as pipelined functions where you want to keep the pipe going, it may be the only way to take advantage of all the power available.
I find very hard that you would need to use pthreads if you already use OpenMP. You could use a section pragma to run procedures with different functions. I personally have used it to implement pipeline parallelism.
Nowadays OpenMP does much more than pthreads, so if you use OpenMP you are covered. For instance, GCC 5.0 forward implements OpenMP extensions that exports code to GPU. :D

Coding in C in Linux vs Windows. Any adequate debugging oriented C-centric IDE?

I've run into an issue with writing some code in c. My basic problem is that I am pressed for time and the code I am dealing with has lots of bugs which I have to "erradicate" before tomorrow evening.
The bigger part of the problem is that there is no adequate IDE that can do real time debugging, especially when using threads or spawning processes with fork(). I tried Mono, Eclipse, and finally NetBeans, and have concluded that those are very good but not for coding in C. More over the learning curve to utilize the command line debugger properly is quite steep. (Like I mentioned earlier... I am pressed on time.)
So, since I am a C# developer by profession I was wondering whether I can pull this off in VS2003/VS2005/VS2008/VS2010. If I abstain from using system calls, can I do this?
Of particular interest are FILE* descriptor and fread(), fclose(), fseek() methods. I know they are part of the standard C library, however are they tied to the platform itself? Are the headers the same in Linux vs Windows? What about fork() or shared memory?
Maybe if I use VS2010 to build parts of the component at a time (by mocking inputs and stuff), debug those, and then migrate the working code in the overall Linux project would prove most useful?
Any input would be greatly appreciated.
The bigger part of the problem is that there is no adequate IDE that can do real time debugging, especially when using threads or spawning processes with fork().
The Eclipse CDT would probably have the best overall support for C/C++ development and integrated debugging.
Note that multithreaded and multiprocess debugging can be difficult at the best of times. Investing in a good logging framework would be advisable at this point, and probably more useful than relying on a debugger. There are many to choose from - have a look at Log4C++ and so on. Even printf in a pinch can be invaluable.
So, since I am a C# developer by profession I was wondering whether I can pull this off in VS2003/VS2005/VS2008/VS2010. If I abstain from using system calls, can I do this?
If you take care to only use portable calls and not Win32-specific APIs, you should be ok. Also, there are many libraries (for C++ libraries such as Boost++ that provide a rich set of functionality which work the same on Windows, Linux and others.
Of particular interest are FILE* descriptor and fread(), fclose(), fseek() methods. I know they are part of the standard C library, however are they tied to the platform itself? Are the headers the same in Linux vs Windows? What about fork() or shared memory?
Yes, the file I/O functions you mention are in <stdio.h> and part of the portable standard C library. They work essentially the same on both Windows and Linux, and are not tied to a particular platform.
However, fork() and the shared memory functions shmget() are POSIX functions, available on *nix platforms but not natively on Windows. The Cygwin project provides implementations of these functions in a library for ease of porting.
If you are using C++, Boost++ will give you portable versions of all these system-level calls.
Maybe if I use VS2010 to build parts of the component at a time (by mocking inputs and stuff), debug those, and then migrate the working code in the overall Linux project would prove most useful?
You could certainly do that. Just be mindful that Visual Studio has a tendency to lead you down the Win32 path, and you must be vigilant to not start using non-portable functions. Fortunately the library reference on MSDN gives you the compatibility information. In general, using standard C or POSIX calls will be portable. In my experience, it is actually easier to write on *nix and port to Windows, but YMMV.
Looks like I am the first to recommend Emacs here. Here is how Emacs works. When you install it, it is simply a text editor with a lot of extensions(debugger and C font-locking are included by default). As you start using it and install the extensions you miss, it becomes more than just an editor. It grows to become an IDE very soon and easily, then on to something that can eschew the OS under one frame.
Emacs might take long to learn, in the mean time, you could use Visual Slick Edit if you are not pressed on the cost part. I have used it on both platforms and seen it work good with version control, tags, etc.
Perhaps Code::Blocks? I love it and while it says it's for C++ it is, of course, very good for plain C as well.

Pthreads vs. OpenMP

I'm creating a multi-threaded application in C using Linux.
I'm unsure whether I should use the POSIX thread API or the OpenMP API.
What are the pros & cons of using either?
Edit:
Could someone clarify whether both APIs create kernel-level or user-level threads?
Pthreads and OpenMP represent two totally different multiprocessing paradigms.
Pthreads is a very low-level API for working with threads. Thus, you have extremely fine-grained control over thread management (create/join/etc), mutexes, and so on. It's fairly bare-bones.
On the other hand, OpenMP is much higher level, is more portable and doesn't limit you to using C. It's also much more easily scaled than pthreads. One specific example of this is OpenMP's work-sharing constructs, which let you divide work across multiple threads with relative ease. (See also Wikipedia's pros and cons list.)
That said, you've really provided no detail about the specific program you're implementing, or how you plan on using it, so it's fairly impossible to recommend one API over the other.
If you use OpenMP, it can be as simple as adding a single pragma, and you'll be 90% of the way to properly multithreaded code with linear speedup. To get the same performance boost with pthreads takes a lot more work.
But as usual, you get more flexibility with pthreads.
Basically, it depends on what your application is. Do you have a trivially-parallelisable algorithm? Or do you just have lots of arbitrary tasks that you'd like to simultaneously? How much do the tasks need to talk to each other? How much synchronisation is required?
OpenMP has the advantages of being cross platform, and simpler for some operations. It handles threading in a different manner, in that it gives you higher level threading options, such as parallelization of loops, such as:
#pragma omp parallel for
for (i = 0; i < 500; i++)
arr[i] = 2 * i;
If this interests you, and if C++ is an option, I'd also recommend Threading Building Blocks.
Pthreads is a lower level API for generating threads and synchronization explicitly. In that respect, it provides more control.
It depends on 2 things- your code base and your place within it. The key questions are- 1) "Does you code base have threads, threadpools, and the control primitives (locks, events, etc.)" and 2) "Are you developing reusable libraries or ordinary apps?"
If your library has thread tools (almost always built on some flavor of PThread), USE THOSE. If you are a library developer, spend the time (if possible) to build them. It is worth it- you can put together much more fine-grained, advanced threading than OpenMP will give you.
Conversely, if you are pressed for time or just developing apps or something off of 3rd party tools, use OpenMP. You can wrap it in a few macros and get the basic parallelism you need.
In general, OpenMP is good enough for basic multi-threading. Once you start getting to the point that you're managing system resourced directly on building highly async code, its ease-of-use advantage gets crowded out by performance and interface issues.
Think of it this way. On a linux system, it is very highly likely that the OpenMP API itself uses pthreads to implement its features such as parallelism, barriers and locks/mutex. Having said that, there are good reasons to work directly with the pthreads API.
It is my opinion that -
You use OpenMP when -
Your program contains easy to spot for loops where each iteration is independent from other.
You want to retain the readability of the program. (Basically keep the parallelized program looking like the sequential version)
You want to stay away from the nitty-gritty details of how threads are spawned and controlled at a micro level.
And pthreads when -
You don't have easy to parallelize loops.
You have different tasks which need to be performed concurrently, you might be wanting to give different responsibilities to each of those tasks.
You want to control the flow of thread execution at a micro level.
Feel free to correct me in the comments.

How does one write cross-platform parallel programs in C?

I am working on a programming language. Currently it compiles to C. I would like to be able to include parallel programming facilities natively in my language so as to take advantage of multiple cores. Is there a way to write parallel C programs which is cross-platform? I would prefer to stick to straight C so as to maximize the number of platforms on which the language will compile.
Depending on what you want to do, OpenMP might work for you. It is supported by GCC, VC++, ICC and more.
Use a cross-platform threads library, like pthreads.
C has no standard, built-in support for threads or parallel processing.
"Straight C" has no concept of threading, so I'm afraid you're out of luck. You'll need to find some sort of cross-platform supporting thread library or port one to the various platforms you want to use. pthreads are as good a place to start as any, I guess.
GLib library (from the GTK project) has many useful cross-platform facilities, including threading.
If you're looking to eventually target large-scale parallelism, have a look at Charm++ and its underlying portable machine layer Converse. We run efficiently on machines ranging from multicore desktops to clusters, to BlueGene and Cray supercomputers.

Resources