Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 3 years ago.
Improve this question
Say you wanted to write your own version of Opencl from scratch in C. How would you go about doing it? How does OpenCL accomplish parallel programming "under the hood"? Is it just pthreads?
OpenCL covers much functionality, including a runtime API library, a programming language based on C, a library environment for that language, and likely a loader library for supporting multiple implementations. If you want to look at an open source example of how it could be implemented, Pocl, Clover, Beignet and ROCm exist. At least Pocl's CPU target does indeed use pthreads, but OpenCL is designed to support offloading tasks to coprocessors such as GPUs, as well as using vector operations, so one thread does not necessarily run one work item.
The title does not refer to OpenCL, but does request to use "standard" libraries. The great thing about standards is that there are so many to choose from; for instance, the C standard provides no multithreading and no guarantee of multitasking. Multiprocessing frequently refers to running in multiple processes (in e.g. CPython, this is the only way to get concurrent execution of Python code because of the global interpreter lock). That can be done with the Unix standard function fork. Multithreading can be done using POSIX threads (POSIX.1c standard extension) or OpenMP. Recent versions of OpenMP also support accelerator offloading, which is what OpenCL was designed for. Since OpenMP and OpenCL provide restricted and abstracted environments, they could in principle be implemented on top of many of the others, for instance CUDA.
Implementing parallel execution itself requires hardware knowledge and access, and is typically the domain of the operating system; POSIX threads is often an abstraction layer on this, using e.g. clone on Linux.
OpenMP is frequently the easiest way to convert a C program to parallel execution, as it is supported by many compilers; you annotate branching points using pragmas and compile with e.g. -fopenmp for GCC. Such programs will still work as before if compiled without OpenMP.
First off: OpenCL != parallel processing. That is one of its strengths, but there's a lot more to it.
Focusing on one part of your question:
Say you wanted to write your own version of Opencl from scratch in C.
For one: get familiar with driver development. Our GPU CL runtime is pretty intimately involved with the drivers. If you want to start from scratch, you're going to need to get very familiar with the PCIe protocols and dig up some memories about toggling pins. This is doable, but it exemplifies "nontrivial."
Multithreading at the CPU level is an entirely different matter that's been documented out the yin-yang. The great thing about using an OS that you didn't have to write yourself is that this is already handled for you.
Is it just pthreads?
How do you think those are implemented? Their functionality is part of the spec, but their implementation is entirely platform-dependent, which you may call "non standard." The underlying implementation of a thread depends on the OS (if there is one, which is not a given), compiler, and a ton of other factors.
This is a great question.
Related
I am a beginner C/C++ programmer first of all, but I am curious about it.
My question is more theoretical.
I heard that C does not have explicit multithreading (MT) support, however there are libraries which implement this. I found "process.h" header which has to be included for building MT programs, but the thing I don't understand is how the MT itself works.
I know there are threads in CPU (assume it's single core for simplicity) running and there is only one thread per moment. The CPU is switching between threads really fast so that user sees it as a simultaneous work (correct me if not).
But - what really happens when I write the following
beginthread( Thread, 0, NULL ) //or whatever function/class method we use
keeping in mind that C does not have MT support. I mean, how does code tell the PC to run two functions multithreaded while it is not possible by the language explicit methods? I guess there is some "cheat" inside library related to "process.h", but what is that cheat, I can't just find on the web.
To be more specific - I am not asking about how to use MT, but how is it build?
Sorry if was answered earlier, or question is too complicated :)
UPD:
Imagine we have C language. It has functions, variables, pointers etc. I dont know any "special" function type that can run concurrently with other. Unless there are calls to some other functions from it. But then the caller function stops and waits?
Is it so that when I run MT applications, there is a special "global" function that calls my f1() and f2() repeatedly which looks like they were simultaneously working?
First of all, C11 does actually add multithreading support to the standard, so the premise that C does not support multithreading is no longer entirely correct.
However, I'm assuming your question is more to do with how can multithreading be implemented by a C library when standard C does(/did) not provide the necessary tools. The answer lies in the word “standard” – compilers and platforms can provide additional functionality beyond that required by the standard. Using such extra features makes the program/library less portable (i.e., more is required than is specified in the C standard), but the language and function call semantics can still be C.
Perhaps it is helpful to consider a standard library function such as fopen – somewhere inside that function code must eventually be called which could not be written in standard C, i.e., the implementation of the standard library itself must also rely on platform-specific code to access operating system functionality such as the file system. Every implementation of the standard library must thus implement the non-portable parts in a way specific to that platform (this is kind of the point of having a standard library instead of all code being platform-specific). But likewise a multithreading library can be implemented with non-standard features provided by that platform, but using such a library makes the code portable only to the platforms for which the same (or compatible) multithreading library is available.
As for how multithreading itself works, it is certainly outside the scope of what can be answered here, but as a simplified conceptual model on a single processor core, you can imagine the operating system managing “concurrent” processes by running one process for a short time, interrupting it, saving its state (current instruction, registers, etc), loading the saved state of another process, and repeating this. This gives the illusion of concurrent execution though in actual fact it is switching rapidly between different processes. On multi-core systems the execution on different cores can actually be concurrent, but there are typically more processes than there are cores, so this kind of switching will still happen on individual cores. Things are further complicated by processes waiting for something (I/O, another process, a timer, etc). Perhaps it suffice to say that the scheduler is a piece of software managing all of this inside the operating system and the multithreading library communicates with it.
(Note that there are many different ways to implement multithreading and multitasking, and statements in the above paragraph do not apply to all of them.)
It's platform specific. On Windows it eventually goes down to NtCreateThread which uses the assembly instruction syscall to call the operating system. So you can qualify it as a cheat.
On Linux it's the same, just the function with the syscall is called clone instead.
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
This may sound naive, but are there any data structures / algorithms that cannot be constructed in C, given enough code? I understand the argument of being Turing complete. I also know it's beneficial to have an elegant solution and that time complexity is important (i.e. more expressive or succinct when implemented in Ruby / Java / C# / Haskell / Lisp). All the languages I've researched or used all seem to have been created or subsequently refactored into C based compilers, interpreters, and/or virtual machines. Are some complex data structures only implementable with an interpreter and/or virtual machine? If that virtual machine or interpreter is C based, isn't that just another data structure abstraction of the underlying C code? i.e. C has a simple type system but serves as the foundation for a dynamic type system. I was surprised to learn metaprogramming seems possible in C using the preprocessor (ioccc.org Immanuel Herrmann). I've also seen some intriguing C algorithms that mimic the concurrency model of Erlang, but don't recall the source.
What inspired this question was the StackOverflow post (Lesser Known Useful Data Structures) and the Patrick Dussud interview on channel9 (Garbage Collection - Past, Present and Future) - explaining how they wrote the the first CLR garbage collector (written in Lisp targeting the JVM, compiled from Lisp to C++ for the CLR).
So, at the end of the day, after I finish punching my cards, I'm wondering if this question is probably more about C programming language design than convenience of programming and time complexity. For example, I could implement a highly complex algorithm in Prolog that is very elegant and quite difficult to understand expressed any other way, but I'm still limited by the assembly instructions and the computer architecture (on/off) at the other end of the stick, so I'd be here all night.
Shor's algorithm for factorizing integers in O((log n)^3) polynomial time cannot be implemented in C, because the computers that it can run on do not yet officially exist. Maybe someday there will be a quantum circuit complete version of C and I'll have to revise my answer.
Joking aside, I don't think anybody can give you a satisfying answer to this. I will try to cover some aspects:
Vanilla, standard C might not be able to make use of the whole feature set of your processor. For example, you are not able to use the TSX feature of recent Intel processors explicitly. You can of course resort to OS primitives, inline assembly, language extensions or third-party libraries to circumvent that.
C by itself is not very good at parallel/asynchronous/concurrent/distributed programming. Some examples of languages that probably make a lot of tasks infinitely easier in this area are Haskell (maybe Data Parallel Haskell soon?), Erlang, etc. that provide very fast and lightweight threads/processes and async I/O. Working with green threads and heavily asynchronous I/O in C is probably less pleasant, although I'm sure it can be done.
In the end, on the user level side of things, of course you can emulate every Turing complete language with any other, as you pointed out so correctly.
Any Turing-complete machine or language can implement any other Turing-complete language, which means it can implement any program in any other Turing-complete language by interpretation if no other way. So the question you're asking is ill-formed; the issue is not whether tasks can be accomplished but how hard you have to work to accomplish them.
C in particular functions almost as a "high-level assembler language", since it will let you get away with many things that more recent languages won't, and thus may allow solutions that would be harder to implement in a more strongly-checked language.
That doesn't mean C is the best language for all those purposes. It forces you to pay much more attention to detail in many areas ranging from memory management to bounds checking to object-orientation (you CAN write OO code in C, but you have to implement it from the ground up). You have to explicitly load and invoke libraries for things that may be built into other languages. C datatypes can be incredibly convoluted (though typedefs and macros can hide much of that complexity). And so on.
The best tool for any given task is the one that (a) you are, or can become, comfortable with; (b) that's a good fit for the task at hand, and (c) that you have available.
Take a look at Turing completeness: Turing Completeness
Basically, any language which is Turing complete can execute all Turing-computable functions. C is a Turing complete language, so in theory you can implement any known solvable algorithm in C (albeit it may be terribly inefficient).
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 9 years ago.
Improve this question
So I am not a C programmer so pardon this question.
I was reading this blog entry Google Zopfli Compression and I was a little dumbfounded by the following sentence : "Zopfli is written in C for portability".
How exactly is C a portable language? Or does he not mean portable in a compile-to-machine-code sense, but some other context? I guess C is more portable than writing assembly code. But is that really the comparison he is trying to make? I hope someone can enlighten me as to what he means and how exactly C is a portable language.
Thanks a lot!
Portable in this context means something like "Anybody can take this source code and compile it on their own computer and have this program." Very nearly all computers drawing power somewhere today have a C compiler available for them (it may not be installed on that machine, but it's either available to be installed or is available as a cross-compiler (eg embedded systems)), so that same source code is portable virtually everywhere. (EDIT: I'm assuming based on context that the source code doesn't have system-specific things in it, as system-specific things would limit your portability.)
"Portability" has multiple meanings, depending on the context:
The C language is "portable" in the sense that C compilers have been written for a wide variety of platforms, from mainframes to microcontrollers;
The language is also "portable" in the sense that there is an agreed-upon standard that implementations conform to (to greater or lesser degree), so you don't have subtly different versions of the language depending on the vendor - the behavior of a conforming program should be the same on any conforming implementation;
C programs that don't make any assumptions about the system they're running on (type sizes, alignment, endianess) or use system-specific libraries are often "trivially" portable; they only need to be recompiled for the target platform, without needing to edit the source code.
Compared to the majority of its contemporaries (Pascal, Fortran, etc.), C is highly portable, and I spent the bulk of the '90s writing C code that had to run on multiple platforms concurrently (one project required the same code to run on Windows NT, Solaris, and Classic MacOS).
C's portability can be summed up as "write once1, build and run everywhere", where Java and C#'s portability can be summed up as "write and build once, run everywhere."
1. Subject to the caveats in the third bullet
For a piece of software to be considered cross-platform, it must be able to function on more than one computer architecture or operating system.
Developing such program can be a time-consuming task because different operating systems have different application programming interfaces (API).
For example, Linux uses a different API for application software than Windows does.
C is a language you can use in most of the API.
C code can be directly called in C++, and easily used in C# and I believe Objective-C. That and the wide availability of c compilers, it does make sense.
Of course, the argument can also be made that Java is more portable as far as running it directly on other machines. But Java can't be moved from language to language as easily.
Is there a task library for C? I'm talking about the parallel task library as it exists in C#, or Java. In other words, I need a layer of abstraction over pthread for Linux. Thanks.
Give a look at OpenMP.
In particular, you might be interested in the Task feature of OpenMP 3.0.
I suggest you, however, to try to see if your problem can be solved using other, "basic" constructs, such as parallel for, since they are simpler to use.
Probably the most widely-used parallel programming primitives aside from the Win32 ones are those provided by pthreads.
They are quite low-level but include everything you need to write an efficient blocking queue and so create a thread pool of workers that carry out a queue of asynchronous tasks.
There is also a Win32 implementation so you can use the same codebase on Windows as well as POSIX systems.
Many concepts in TPL (Task, Work-Stealing Scheduler,...) are inspired by a very successful project named Cilk at MIT. Their advanced framework (Cilk Plus) was acquired by Intel and integrated to Intel Parallel Building Block. You still can use Cilk as an open source project without some advanced features. The good news is Intel is releasing Cilk Plus as open source in GCC.
You should try out Cilk as it adds another layer of abstraction to C, which makes it easy to express parallel algorithms but it is close enough to C to ensure good performance.
I've been meaning to checking out libdispatch. Yeah it's built for OS X and blocks, but they have function interfaces as well. Haven't really had time to look at it yet though so not sure if it fills all your needs.
There is an academia project called Wool that implements work stealing scheduler in C (with significant help of C preprocessor AFAIK). Might be worth looking at, though it does not seem actively developed.
I am working on a programming language. Currently it compiles to C. I would like to be able to include parallel programming facilities natively in my language so as to take advantage of multiple cores. Is there a way to write parallel C programs which is cross-platform? I would prefer to stick to straight C so as to maximize the number of platforms on which the language will compile.
Depending on what you want to do, OpenMP might work for you. It is supported by GCC, VC++, ICC and more.
Use a cross-platform threads library, like pthreads.
C has no standard, built-in support for threads or parallel processing.
"Straight C" has no concept of threading, so I'm afraid you're out of luck. You'll need to find some sort of cross-platform supporting thread library or port one to the various platforms you want to use. pthreads are as good a place to start as any, I guess.
GLib library (from the GTK project) has many useful cross-platform facilities, including threading.
If you're looking to eventually target large-scale parallelism, have a look at Charm++ and its underlying portable machine layer Converse. We run efficiently on machines ranging from multicore desktops to clusters, to BlueGene and Cray supercomputers.