I'm running a program on a Linux development board. When the CPU load is low it works all right, but when the CPU load peaks, it takes much longer.
Here is what it looks like: there are 2 programs running on the board. The Consumer application has multiple threads, and they call func1 to request some info from the Producer process. The Producer is a daemon process that feeds info back to the Consumer process.
The sample code looks like this:
Consumer:
static int send_to_service(msg_t *cmd)
{
int cnt = send(fd, cmd, sizeof(msg_t), MSG_WAITALL);
if (cnt != sizeof(msg_t)) {
log(L_ERROR, "send failed");
return -1;
}
return 0;
}
static int func1(int aaa)
{
struct_t msg = {...};
msg.a = ...;
msg.b = ...;
gettimeofday(time1, NULL);
send_to_service(&msg);
...
}
Producer:
while(1) {
gettimeofday(time2, NULL);
int ret = select(maxfd+1, &fd_list, NULL, NULL, NULL);
gettimeofday(time3, NULL);
for (int fd = 0; fd <= maxfd; fd ++) {
if (!FD_ISSET(fd, &fd_list))
continue;
...
}
}
The time difference between time1 and time3 can be 30 ms or more when the CPU load is high. It does not happen all the time, only once in a while.
I tried a single-process design earlier, in which the Consumer calls the driver and gets the info directly. That worked well. Now I have to add another process to the system that needs the same info, so I have to use a daemon process to feed both processes. The performance is not as good as with the single-process method, even if there is only one consumer.
The system I use is Linux version 4.14.74. I'm not sure about the socket type and network; both consumer processes are on the same system, waiting to get image info. I just used the send, recv and select calls the system provides.
This might be a dumb question, and I'm very sorry if that's the case. I'm struggling to take advantage of the multiple cores in my quad-core MacBook to perform several computations at the same time. This is not for any particular project, just a general question, since I want to learn for when I eventually do need to do this kind of thing.
I am aware of threads, but they seem to run on the same core, so I don't seem to gain any performance using them for compute-bound operations (they are very useful for socket-based stuff, though!).
I'm also aware of processes that can be created with fork, but I'm not sure they are guaranteed to use more CPU, or whether, like threads, they just help with I/O-bound operations.
Finally, I'm aware of CUDA, which allows parallelism on the GPU (and I think OpenCL and compute shaders also allow my code to run on the CPU in parallel), but I'm currently looking for something that will let me take advantage of the multiple CPU cores my computer has.
In Python, I'm aware of the multiprocessing module, which seems to provide an API very similar to threads, and there I do seem to gain an edge by running multiple functions performing computations in parallel. I'm looking into how I could get the same advantage in C, but I don't seem to be able to.
Any help pointing me in the right direction would be very much appreciated.
Note: I'm trying to achieve true parallelism, not concurrency.
Note 2: I'm only aware of threads and of using multiple processes in C. With threads I don't seem to get the performance boost I want. And I'm not very familiar with processes, so I'm still not sure whether running multiple processes is guaranteed to give me the advantage I'm looking for.
A simple program to heat up your CPU (100% utilization of all available cores).
Hint: the thread start function never returns; exit the program with [CTRL + C].
#include <pthread.h>
void* func(void *arg)
{
while (1);
}
int main()
{
#define NUM_THREADS 4 //use the number of cores (if known)
pthread_t threads[NUM_THREADS];
for (int i=0; i < NUM_THREADS; ++i)
pthread_create(&threads[i], NULL, func, NULL);
for (int i=0; i < NUM_THREADS; ++i)
pthread_join(threads[i], NULL);
return 0;
}
Compilation:
gcc -pthread -o thread_test thread_test.c
If I start ./thread_test, all cores are at 100%.
A word on fork and pthread_create:
fork creates a new process (the current process image is copied and runs in parallel), while pthread_create creates a new thread, sometimes called a lightweight process.
Both processes and threads run in 'parallel' with the parent process.
When to use a child process over a thread depends on the situation: e.g., a child can replace its process image (via the exec family) and has its own address space, while threads share the address space of the current parent process.
There are of course a lot more differences; for those I recommend studying the following pages:
man fork
man pthreads
"I am aware of threads, but they seem to run on the same core, so I don't seem to gain any performance using them for compute-bound operations (they are very useful for socket-based stuff, though!)."
No, they don't. Unless your threads block, you'll see all of them running. Just try this (beware that it consumes all your CPU time): it starts 16 threads, each counting in a busy loop for 60 s. You will see all of them running and bringing your cores to their knees (it only runs for a minute, then everything ends):
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#define N 16 /* had 16 cores, so I used this. Put here as many
* threads as cores you have. */
struct thread_data {
pthread_t thread_id; /* the thread id */
struct timespec end_time; /* time to get out of the tunnel */
int id; /* the array position of the thread */
unsigned long result; /* number of times looped */
};
void *thread_body(void *data)
{
struct thread_data *p = data;
p->result = 0UL;
clock_gettime(CLOCK_REALTIME, &p->end_time);
p->end_time.tv_sec += 60; /* 60 s. */
struct timespec now;
do {
/* just get the time */
clock_gettime(CLOCK_REALTIME, &now);
p->result++;
/* if you call printf() you will see them slowing, as there's a
* common buffer that forces all thread to serialize their outputs
*/
/* check if we are over */
} while ( now.tv_sec < p->end_time.tv_sec
|| (now.tv_sec == p->end_time.tv_sec
&& now.tv_nsec < p->end_time.tv_nsec));
return p;
} /* thread_body */
int main()
{
struct thread_data thrd_info[N];
for (int i = 0; i < N; i++) {
struct thread_data *d = &thrd_info[i];
d->id = i;
d->result = 0;
printf("Starting thread %d\n", d->id);
int res = pthread_create(&d->thread_id,
NULL, thread_body, d);
if (res != 0) { /* pthread_create returns an error number, not -1 */
fprintf(stderr, "pthread_create: error %d\n", res);
exit(EXIT_FAILURE);
}
printf("Thread %d started\n", d->id);
}
printf("All threads created, waiting for all to finish\n");
for (int i = 0; i < N; i++) {
struct thread_data *joined;
int res = pthread_join(thrd_info[i].thread_id,
(void **)&joined);
if (res != 0) { /* pthread_join returns an error number, not -1 */
fprintf(stderr, "pthread_join: error %d\n", res);
exit(EXIT_FAILURE);
}
printf("PTHREAD %d ended, with value %lu\n",
joined->id, joined->result);
}
} /* main */
Linux and all multithreaded systems work the same way: they create a new execution unit, and the available processors are handed to each thread as needed. If two execution units do not share a virtual address space, they are separate processes (not exactly, but this captures the main difference between a process and a thread). Threads are normally encapsulated inside processes and share the process ID and virtual memory (on Linux each thread additionally has its own kernel-level thread ID). Processes each run in a separate virtual address space, so they can only share things through system resources (files, shared memory, sockets, pipes, etc.).
The problem with your test case (you don't show it, so I have to guess) is probably that all your threads run a loop in which they try to print something. If so, each thread probably spends most of its time blocked doing I/O (in printf()).
stdio FILEs share one buffer among all the threads that print to the same FILE, and the kernel serializes the write(2) system calls made on the same file descriptor. So if most of the time in the loop is spent blocked in a write, stdio and the kernel end up serializing all the print calls, and it appears that only one thread is working at a time (the others are blocked behind the one doing the I/O). The busy loop above makes all the threads run in parallel and shows you the CPU fully loaded.
Parallelism in C can also be achieved with the fork() function. fork() does not create a thread: it creates a new child process that is a copy of the calling process, and both continue executing from the point where fork() returned (it returns 0 in the child and the child's PID in the parent). The two processes can run simultaneously on different cores, but they do not share memory: each has its own copy of the data, so a write in one is not visible in the other.
To exchange data between the processes, use an explicit mechanism such as a pipe, a socket, or shared memory. The wait() function (or waitpid()) blocks the parent until a child terminates and returns its exit status; call it before relying on results the child was supposed to produce.
When I strace this code
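#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

void printMsg(int sig);

int main() {
    signal(SIGPROF, printMsg);
    struct itimerval tick;
    memset(&tick, 0, sizeof(tick));
    tick.it_value.tv_sec = 1; // sec
    tick.it_value.tv_usec = 0; // micro sec.
    tick.it_interval.tv_sec = 0;
    tick.it_interval.tv_usec = 0;
    setitimer(ITIMER_PROF, &tick, NULL);
    while(1) {
        ;
    }
    return 0;
}

/* SIGPROF handler. printf is not async-signal-safe; it is used here only for the demo. */
void printMsg(int sig) {
    (void)sig;
    printf("%s","Hello World!!\n");
}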
I got the SIGPROF signal after 1 second as expected
...
05:54:10 setitimer(ITIMER_PROF, {it_interval={0, 0}, it_value={1, 0}}, NULL) = 0
05:54:11 --- SIGPROF {si_signo=SIGPROF, si_code=SI_KERNEL} ---
...
But when I add a system call like write(2, "", 0) or read(2, "", 0) inside while(1) and strace again, it looks like SIGPROF is never fired. However, calling time(0) inside while(1) triggers SIGPROF properly.
Btw, I use this code to emulate the following PHP script, which ignores the time limit under the PHP-FPM SAPI:
<?php
set_time_limit(5); // PHP uses setitimer(ITIMER_PROF) to implement this function
while (true) {
flush(); // PHP uses write(<fd>, "", 0) to implement this function
}
The code triggers SIGPROF using ITIMER_PROF, which counts down only while the process is executing, or while the system is running on behalf of the process.
When the process goes into 'read' or 'write' calls, it is not 'running'. The process goes into I/O wait (and the system allocates the CPU to other processes). While the process is in I/O wait, the timer does not advance. When the loop does real processing, for example calling time() in a tight loop, the timer keeps advancing.
See more about process states in https://www.tecmint.com/linux-process-management/
If you want to measure wall-clock time, consider using timer_create with CLOCK_REALTIME or CLOCK_MONOTONIC.
I need some help with some C code.
Basically, I have n processes that execute some code. Once they're almost done, I'd like the "Manager Process" (the main function) to send each of the n processes an int variable, which may be different for every process.
My idea was to call signal(SIGALRM, handler_function) once all processes have started. When a process is almost done, it calls kill(getpid(), SIGSTOP) to wait for the Manager Process.
After SIM_TIME seconds have passed, handler_function sends the int variable on a message queue and then calls kill(process_pid, SIGCONT) to wake up the waiting processes. After being woken up, those processes should receive the int variable from the message queue, print it, and terminate, letting the Manager Process take control again.
Here's some code:
/**
* Child Process creation using fork() system call
* Parent Process allocates and initializes necessary variables in shared memory
* Child Process executes Student Process code defined in childProcess function
*/
pid_t runChild(int index, int (*func)(int index))
{
pid_t pid;
pid = fork();
if (pid == -1)
{
printf(RED "Fork ERROR!\n" RESET);
exit(EXIT_FAILURE);
}
else if (pid == 0)
{
int res = func(index);
return getpid();
}
else
{
/*INSIGNIFICANT CODE*/
currentStudent = createStudent(pid);
currentStudent->status = FREE;
students[index] = *currentStudent;
currentGroup = createGroup(index);
addMember(currentStudent, currentGroup);
currentGroup->closed = FALSE;
groups[index] = *currentGroup;
return pid;
}
}
Code executed by each Process
/**
* Student Process Code
* Each Student executes this code
*/
int childProcess(int index)
{
/*NOTICE: showing only relevant part of code*/
printf("Process Index %d has almost done, waiting for manager!\n", index);
/* PROGRAM GETS STUCK HERE!*/
kill(getpid(), SIGSTOP);
/* mex variable is already defines, it's a struct implementing Message Queue message struct*/
receiveMessage(mexId, mex, getpid());
printf(GREEN "Student %d has received variable %d\n" RESET, getpid(), mex->variable);
}
Handler Function:
/**
* Handler function
* Will be launched when SIM_TIME is reached
*/
void end_handler(int sig)
{
if (sig == SIGALRM)
{
usleep(150000);
printf(RED "Time's UP!\n" RESET);
printGroups();
for(int i = 0; i < POP_SIZE; i++){
mex->mtype = childPids[i];
mex->variable = generateInt(18, 30);
sendMessage(mexId, mex);
//childPids is an array containing PIDs of all previously launched processes
kill(childPids[i], SIGCONT);
}
}
}
I hope my code is understandable.
I have an issue, though: with the code above, the entire program gets stuck at the kill(getpid(), SIGSTOP) call.
I also tried running ps in a terminal, and no active processes are detected.
I think handler_function doesn't issue the kill(childPids[i], SIGCONT) call for some reason.
Any idea how to solve this problem?
Thank you
You might want to start by reading the manual page for mq_overview (man mq_overview). POSIX message queues provide a portable and flexible communication mechanism between processes that supports both synchronous and asynchronous communication.
In your approach, there is a general problem of “how does one process know if another is waiting”. If the process hasn’t stopped itself, the SIGCONT is ignored, and when it subsequently suspends itself, nobody will continue it.
In contrast, message-based communication between the two can be viewed as a little language. For simple exchanges (such as yours), the completeness of the grammar can be readily hand checked. For more complex ones, state machines or even nested state machines can be constructed to analyze their behaviour.
I need to implement a timer that checks for conditions every 35 seconds. My program is using IPC schemes to communicate information back and forth between client and server processes. The problem is that I am running msgrcv() function in a loop, which pauses the loop until it finds a message, which is no good because I need the timer to always be checking if a client has stopped sending messages. (if it only checks when it receives a message, this will be useless...)
The problem may seem unclear, but the basics of what I need is a way to implement a Watchdog timer that will check a condition every 35 seconds.
I currently have this code:
time_t start = time(NULL);
//Enter main processing loop
while(running)
{
size_t size = sizeof(StatusMessage) - sizeof(long);
if(msgrcv(messageID, &statusMessage, size, 0, 0) != -1)
{
printf("(SERVER) Message Data (ID #%ld) = %d : %s\n", statusMessage.machineID, statusMessage.status_code, statusMessage.status);
masterList->msgQueueID = messageID;
int numDCs = ++masterList->numberOfDCs;
masterList->dc[numDCs].dcProcessID = (pid_t) statusMessage.machineID;
masterList->dc[numDCs].lastTimeHeardFrom = 1000;
printf("%d\n", masterList->numberOfDCs);
}
printf("%.2f\n", (double) (time(NULL) - start));
}
The only problem is as I stated before, the code to check how much time has passed, won't be reached if there is no message to come in, as the msgrcv function will hold the process.
I hope I am making sense, and that someone will be able to assist me in my problem.
You may want to try msgctl(msqid, IPC_STAT, &msgqdsbuf), where msgqdsbuf is a struct msqid_ds. If the call succeeds, the current number of messages on the queue is in msgqdsbuf.msg_qnum. The caller needs read permission on the queue, which I think you have here.
I have a resource manager that handles multiple TCP connections. These connections are pthreads. How can I send data from the resource manager to all of these threads? Or even better: how can I figure out which thread I have to send a command to?
For example: I have 2 threads, one with pid 3333 and one with pid 4444. The user sends a task to program a board (it is a resource manager that manages FPGA boards). The resource manager picks a board from a list, where this pid is also saved. Then the program command should be sent to the thread with this pid or, as I was thinking at first, to all of the threads, and the threads decide whether to go on or not. The protocol looks like this: <pid>#<board-id>#<file>
I open 2 pipes (for writing to the threads and reading from the threads) in main.c and pass them as an argument to the listening thread (the forthread struct).
main.c
// open Pipes to SSL
int rmsslpipe[2];
int sslrmpipe[2];
if (pipe(rmsslpipe) == -1) {
writelog(LOGERROR, "main: could not create RM-SSL reading pipe");
exit(1);
}
if (pipe(sslrmpipe) == -1) {
writelog(LOGERROR, "main: could not create RM-SSL reading pipe");
exit(1);
}
int rmtosslserver = rmsslpipe[1];
int sslservertorm = sslrmpipe[0];
// start SSL-Server as a pthread
pthread_t thread;
forthread* ft = malloc(sizeof(forthread));
ft->rmtosslserver = rmsslpipe[0];
ft->sslservertorm = sslrmpipe[1];
pthread_mutex_t ftmutex;
pthread_mutex_init(&ftmutex, NULL);
ft->mutex = ftmutex;
pthread_create(&thread, NULL, startProgramserver, (void*) ft);
This thread now listens for new connections and if there is a new connection, it creates a new thread with the forthread-struct as argument. This thread is where the action happens :)
void* startProgramserver(void* ft) {
int sock, s;
forthread* f = (forthread*) ft;
// open TCP-Socket
sock = tcp_listen();
while(1){
if((s=accept(sock,0,0))<0) {
printf("Problem accepting");
// try again
sleep(60);
continue;
}
writelog(LOGNOTE, "New SSL-Connection accepted");
f->socket = s;
pthread_t thread;
pthread_create(&thread, NULL, serveClient, (void*) f);
}
exit(0);
}
This thread now initializes the connection, gets some information from the client and then waits for new commands from the resource manager.
n=read(f->rmtosslserver, bufw, BUFSIZZ);
But this fails if there is more than one thread. So how can I manage that?
If you have one thread per board, the "pid" shouldn't be needed in the command -- you just need a way to find the right thread (or queue, or whatever) for the specified board.
You could keep a list of your forthread structures, and include the board ID in the structure. Also include a way of passing commands; this could be a pipe, but you may as well use some sort of queue or list instead. That way you use one pipe (or other mechanism) per thread instead of a single shared one, and can find the right one for each board by searching the forthread list for the one with the right board ID. Just be sure to protect any parts of the structure that may be modified while the thread runs with a mutex.
The problem with using a single pipe as you've suggested is that only one thread will get each command -- if it's the wrong one, too bad; the command is gone.
The answer is yes: I would use a list of them. You can open more than one pipe, even on a slow PC: one pipe per connection, so 2 pipes for 2 connections.