I have a small C program that uses a lot of CPU. It is compiled to an exe, and I run it as a process from my C# GUI.
I want to run it in parallel across all of my CPU cores, and I have 2 options.
I have 4 CPU cores.
1. Run this C exe from my C# code as 4 processes, so that the OS assigns one process to each core.
2. Edit my C code so that it runs 4 threads, so that the OS assigns one thread to each core, and run it from C# as 1 process.
Which way will be faster?
Edit: these processes/threads will run for about 3-5 hours and don't need to communicate with one another.
All of this runs on Windows.
Running 4 threads in one C process will be faster than running 4 separate processes launched from C#.
There is a higher penalty for switching between processes than for switching between threads, and communication between processes is slower than between threads.
Creating a process takes more time than creating a thread, because threads share resources (code, data, ...).
If your CPU is constantly switching between processes (which is usually the case), processes are not the better solution; threads are more efficient, especially on single-CPU systems.
https://stackoverflow.com/a/200543/3476815
I've recently started running MPI on my computer for some practice, after having some experience using MPI on a cluster. I have a dual core processor but was curious about what would happen if I specified a large number of processes and to my surprise it worked.
mpirun -np 40 ./wha
How exactly is this happening? Even considering the number of threads a single processor could spawn, this doesn't seem possible.
In the case of Open MPI, if the number of running processes is larger than the number of processors (i.e. when oversubscription happens), Open MPI runs the MPI processes in degraded mode. Running in degraded mode means that a process yields its processor to other MPI processes so that they can make progress (i.e. time sharing happens). mpi_yield_when_idle can be set to 0 to force aggressive mode explicitly; in that case an MPI process won't voluntarily give up the processor to other processes.
I was recently doing some experiments with the fork function and I was concerned by a "simple" (short) question:
Does fork use concurrency or parallelism (if there is more than one core)?
Or, is it the OS which makes the best choice?
Thanks for your answer.
nb: Damn I fork bombed myself again!
Edit:
Concurrency: Each operation runs on one core. Interrupts are used to switch from one process to another (sequential computation).
Parallelism: Each operation runs on two cores (or more).
fork() duplicates the current process, creating another independent process. End of story.
How the kernel chooses to schedule these processes is a different, very broad question. In general the kernel will try to use all available resources (cores) to run as many tasks as possible. When there are more runnable tasks than cores, it has to start making decisions about who gets to run, and for how long.
The fork function creates a separate process. It is up to the operating system how it handles different processes.
Of course, if only one core is available, the OS has no other choice but to run all processes interleaved.
If more cores are available, every sane OS will distribute the processes to the different cores, so every core runs at least one process.
However, even then, more processes can be active than there are cores. So even then, it is up to the OS to decide which processes can be run parallel (by distributing to cores) and which have to be run interleaved (on a single core).
In fact, fork() is a system call (aka. system service) which creates a new process from the current process (read the return code to see who you are, the parent or the child).
In the UNIX world, processes share the CPU's computing time. It works like this:
1) a process is running
2) the clock generates an interrupt, which calls the kernel and pauses the process
3) the kernel takes the list of runnable processes and decides which one to resume (this is called scheduling)
4) go to point 1)
When there are multiple processor cores, the kernel is able to dispatch processes across them.
Well, you can try something. Write an expensive program, say O(n^3), so that it takes a good amount of time to compute. Start four child processes with fork() (if you have a quad-core). Now open any graphical CPU monitor. Anything cool?
I have a C program (a graphics benchmark) that runs on a MIPS processor simulator (I'm looking to graph some performance characteristics). The processor has 8 cores, but core 0 seems to execute more than its fair share of instructions. The benchmark is multithreaded, with the work distributed exactly evenly between the threads. Why would core 0 run between 1/4 and 1/2 of the instructions even though the program is multithreaded on an 8-core processor?
What are some possible reasons this could be happening?
Most application workloads involve some number of system calls, which could block (e.g. for I/O). It's likely that your threads spend some amount of time blocked, and the scheduler simply runs them on the first available core. In an extreme case, if you have N threads but each is able to do work only 1/N of the time, a single core is sufficient to service the entire workload.
You could use pthread_setaffinity_np to assign each thread to a specific core, then see what happens.
You did not mention which OS you are using.
However, most of the code in most OSs is still written for a single core CPU.
Therefore, the OS will not try to evenly distribute the processes over the array of cores.
When there are multiple cores available, most OSs start a process on the first core that is available (and a blocked process leaves the related core available.)
As an example, on my system (a 4-core amd-64) running Ubuntu Linux 14.04, the CPUs are usually less than 1 percent busy, so everything could run on a single core.
It takes many applications running at once, such as video playback and long-running background jobs with several windows open, before cores other than the first show much real activity.
I have just added threading to a large application I have been developing for years. It is written in C and runs on Mac and Linux. This question is about OS X, 10.8.2 or 10.6.8.
Problem: I see the program opening two threads as I expect. However, apparently both threads are running on the same CPU, or at least, I never get more than 100% of a CPU allocated to the program. This almost defeats the entire purpose of having threads.
I use a fair number of mutexes, if that matters.
How can I force the OS to run each thread at 100% of different CPUs? (There are 8 CPUs on this machine.)
The mutexes may matter a lot here. Open up Instruments and run the time profiler instrument on your program after setting it to "record all thread states". This will let you see where your threads are blocked waiting for something (likely a mutex) instead of running.
Multiple running threads will be concurrent as long as they execute on different cores, as each core has its own instance of the scheduler in every Unix-like OS. Being on separate CPU dies matters little: in fact, there's a benefit to sharing resources between threads running on separate cores of the same die.
I have two threads in my application. Is it possible to execute both the threads simultaneously without sleeping any thread?
You can run the threads in parallel in your application, especially if they are not waiting on each other for input or conditions. For example, one thread may be parsing a file while the other is playing a song.
Generally the OS takes care of thread time slicing. At the application level it looks as if the threads are running in parallel, but the OS is actually giving each thread a certain amount of execution time in turn.
With multi-core processors it is possible to run the threads truly in parallel; however, the OS decides which threads run where unless you specifically code at a lower level to control which threads run in parallel.
As others have mentioned, with multiple cores it is possible, but it depends on how the OS decides to distribute the threads. You don't have any control, that I have seen, over where each thread is run.
For a really good tutorial, with some nice explanation and pictures you can look at this page, with code as to how to do multi-threading using the POSIX library.
http://www.pathcom.com/~vadco/parallel.html
The time slice for sleep is hard to see, so your best bet is to test it: for example, have your two threads each increment a counter every millisecond and see whether the two counts are identical. If they are not, then at least one thread is being put to sleep by the OS.
Most likely both will go to sleep at some time, the test is to see how much of a difference there is between the two threads.
Once one thread blocks, either waiting to send data, or waiting to receive, it will be put to sleep so that other threads can run, so that the OS can continue to make certain everything is working properly.
Before C11, C itself had no means of writing multi-threaded code (C11 added an optional <threads.h>).
However, POSIX has libraries that allow you to work with threads in C.
One good article about this topic is How to write multi-threaded software in C and C++.
Yes, if you have multiple processors or a multi-core processor. Each thread can then run on its own core.