Currently, I am having a hard time discovering what the problem with my multithreaded C program on the RPi is. I have written an application relying on two pthreads: one of them reads data from a GPS device and writes it to a text file, and the second one does exactly the same but with a temperature sensor. On my laptop (Intel® Core™ i3-380M, 2.53 GHz) the program works nicely, writing to my files at the frequencies at which both devices send information (10 Hz and 500 Hz respectively).
The real problem emerges when I compile and execute the program on the RPi: performance decreases considerably, with my GPS log file being written at a frequency of 3 Hz and the temperature log file at 17 Hz (17 measurements written per second).
I do not really know why I am getting these performance problems on the Pi. Is it because the RPi has only a 700 MHz ARM processor and cannot handle such a multithreaded application? Or is it because my two thread routines are disturbing the work normally carried out by the Pi? Thanks a lot in advance!
Here is my code. I am posting just one thread function because I tested the performance with a single thread and it still writes at a very low frequency (~4 Hz). First, the main function:
int main(int argc, char *argv[]) {
    int s1_hand = 0;
    pthread_t routines[2];

    printf("Creating Thread -> Main Thread Busy!\n");
    s1_hand = pthread_create(&routines[1], NULL, thread_2, (void *)&routines[1]);
    if (s1_hand != 0) {
        printf("Not possible to create threads: [%s]\n", strerror(s1_hand));
        exit(EXIT_FAILURE);
    }

    /* Join the thread exactly once; pthread_join() returns an error number
       (it does not set errno), so check its return value directly. */
    s1_hand = pthread_join(routines[1], NULL);
    if (s1_hand != 0) {
        printf("Cannot join thread 2: [%s]\n", strerror(s1_hand));
        exit(EXIT_FAILURE);
    }

    return 0;
}
Now, the thread 2 function:
void *thread_2(void *parameters) {
    printf("Thread 2 starting...\n");
    int fd, chars, parsing, c_1, parse, p_parse = 1;
    double array[3];

    fd = open("/dev/ttyUSB0", O_RDONLY | O_NOCTTY | O_SYNC);
    if (fd < 0) {
        perror("Unable to open the fd!");
        exit(EXIT_FAILURE);
    }

    FILE *stream_a, *stream_b;
    stream_a = fdopen(fd, "r");
    stream_b = fopen(FILE_I, "w+");
    if (stream_a == NULL || stream_b == NULL) {
        perror("IMPOSSIBLE TO CREATE STREAMS");
        exit(EXIT_FAILURE);
    }

    c_1 = fgetc(stream_a);
    parse = findit(p_parse, c_1, array);
    printf("First Parse Done -> (%i)\n", parse);

    while ((chars = fgetc(stream_a)) != EOF) {
        parsing = findit(0, (uint8_t)chars, array);
        if (parsing == 1) {
            printf("MESSAGE FOUND AND SAVED -> (%i)\n", parsing);
            fprintf(stream_b, "%.6f %.3f %.3f %.3f\n", time_stamp(), array[0], array[1], array[2]);
        }
    }

    fflush(stream_b);
    fclose(stream_b);
    fclose(stream_a);   /* also closes fd, since stream_a was created with fdopen() */
    pthread_exit(NULL);
}
Note that in my thread 2 function I am using findit(), a function that returns 0 or 1 depending on whether a message from the GPS was found and parsed, writing the parsed info into my array (0 = not found, 1 = found and parsed). The function time_stamp() just calls clock_gettime(CLOCK_MONOTONIC, ...) in order to have a time reference for each written event. I hope with this information you can help me. Thank you!
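Roughly, time_stamp() does something like the sketch below (a simplified reconstruction; the exact conversion to a double in seconds is an assumption, since the original helper is not posted):

#include <time.h>

/* Returns a monotonic timestamp in seconds as a double,
   suitable for the %.6f format used in the fprintf() above. */
static double time_stamp(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}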
Obviously the processor is capable of running 20 things a second. I'd first check your filesystem performance.
Write a small program that simulates the writes just the way you're doing them and see what the performance is like.
Beyond that, I'd suggest it's the task swapping that's causing delays. Try without one of the threads. What type of performance do you get?
I'd guess it's the filesystem, though. Try buffering your writes into memory and do large (4k+) writes every few seconds and I bet that will make your system a lot happier.
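For example, something along these lines (just a sketch; the 64 KiB buffer size is an arbitrary guess, and setvbuf() has to be called right after fopen(), before any output):

/* inside thread_2(), right after fopen(FILE_I, "w+") succeeds: */
static char log_buf[64 * 1024];                  /* buffer size is an arbitrary guess */

if (setvbuf(stream_b, log_buf, _IOFBF, sizeof(log_buf)) != 0) {
    perror("setvbuf");                           /* fall back to the default buffering */
}
/* ... then fprintf() as before, and fflush(stream_b) every few seconds if the
   data must reach the card promptly. */

With full buffering the C library batches many small fprintf() calls into one large write() to the SD card, which is usually much kinder to slow flash storage.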
Also, post your code. Otherwise all we can do is guess.
Related
My whole system (Ubuntu 18.04) always freezes after around one hour when my C program continuously writes logs to files. Each file created is around 100 to 200 MB, and the total size of these files before the system goes down is around 40-60 GB. At that point I usually still have more than 150 GB of SSD space available.
I checked the system condition with System Monitor but couldn't find any problem. When my program runs, only one of the eight cores is at 100% usage; the others are pretty low. Before the system goes down, only 2.5 GB of 15.5 GB of memory is used. Every time I reboot my machine, the latest 4-6 created files are empty, even though most of them were showing some size at the moment of freezing (it looks like they were never actually written to the SSD).
My C code can be simplified as below:
#define MEM_LEN 50000
#define FILE_LEN 10000*300

struct log_format {
    long cnt;
    long tv_sec;
    long tv_nsec;
    unsigned int user;
    char rw;
    char pathbuffer[256];
    size_t count;
    long long pos;
};

int main(int argc, const char *argv[])
{
    int fd = 0;
    struct log_format *addr = NULL;
    int i = 0;
    FILE *file;
    char filestr[20];
    int data_cnt = 0;
    int file_cnt = 0;

    // open shared memory device //
    fd = open("/dev/remap_pfn", O_RDWR);
    if (fd < 0) {
        perror("....open shared memory device1 failed\n");
        exit(-1);
    }

    // memory mapping to shared memory device //
    addr = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_LOCKED, fd, OFFSET);
    if (addr == MAP_FAILED) {   // mmap() returns MAP_FAILED, not NULL, on error
        perror("....mmap1 failed\n");
        exit(-1);
    }

    // open a file //
    sprintf(filestr, "%d.csv", file_cnt);
    file = fopen(filestr, "w");
    printf("%s created\n", filestr);

    // continuously check the memory replacement of last, and write to file //
    while (1) {
        fprintf(file, "%ld,%ld,%ld,%u,%c,%s,%zu,%lld\n", addr[i].cnt, addr[i].tv_sec,
                addr[i].tv_nsec, addr[i].user, addr[i].rw, addr[i].pathbuffer,
                addr[i].count, addr[i].pos);
        i++;
        data_cnt++;
        if (i >= MEM_LEN)
            i = 0;

        // when reaching a threshold, create another file to write //
        if (data_cnt >= FILE_LEN) {
            data_cnt = 0;
            fclose(file);
            file_cnt++;
            // open a file //
            sprintf(filestr, "%d.csv", file_cnt);
            file = fopen(filestr, "w");
            printf("%s created\n", filestr);
        }
    }

    fclose(file);
    return 0;
}
I didn't find any error message from syslog & kern.log. It just freezes.
Does anyone have an idea what the problem could be? Thanks.
I tried to add some delays into my while loop, to slow down the writes:
(Since even a 1 ns sleep is too long for the loop, I only sleep once every 10 iterations.)
while (1) {
    struct timespec ts = {0, 1L};
    if (data_cnt % 10 == 0)
        nanosleep(&ts, NULL);
    ......
}
The freeze problem seems to be gone now.
So... what might be the reason for this? For now, I only see the writes becoming slower and the CPU load on that core dropping to 50%. Is there a write buffer in between whose limit my program exceeded, crashing the system?
(I will also keep checking whether it is an overheating problem that brings the machine down.)
I am trying to implement a file copy program with the POSIX asynchronous I/O APIs in Linux.
I tried this:
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    char data[200];
    int fd = open("data.txt", O_RDONLY); // text file on the disk

    struct aiocb aio;
    memset(&aio, 0, sizeof(struct aiocb)); // clear first, then fill in the fields
    aio.aio_fildes = fd;
    aio.aio_buf = data;
    aio.aio_nbytes = sizeof(data);
    aio.aio_offset = 0;

    aio_read(&aio);

    int counter = 0;
    while (aio_error(&aio) == EINPROGRESS) {
        printf("counter: %d\n", counter++);
    }
    int ret = aio_return(&aio);
    printf("ret value %d\n", ret);
    return 0;
}
But the counter gives a different result every time I run it.
Is it possible to display the progress of the aio_read and aio_write functions?
You get different results because each execution has its own context, which may differ from the others. (Do you always follow exactly the same path from your house to the bank? Even if you do, is the elapsed time always exactly the same? The task ends the same way, you are at the bank, but the executions differ.) What your program measures is not I/O completion but the number of times it tests for completion of an asynchronous I/O operation.
And no, there is no concept of percentage of completion of a given asynchronous I/O operation.
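That said, if the intent of the loop is just to wait for completion without burning CPU, you can block on aio_suspend() instead of polling aio_error(); a minimal sketch (it still gives no percentage, it only sleeps until the request is done):

#include <aio.h>
#include <errno.h>

/* Block until the single request 'aio' completes, then collect its result.
   This replaces the busy-wait loop; it does not report progress. */
static ssize_t wait_for_aio(struct aiocb *aio)
{
    const struct aiocb *list[1] = { aio };
    while (aio_error(aio) == EINPROGRESS) {
        aio_suspend(list, 1, NULL);   /* NULL timeout: wait indefinitely */
    }
    return aio_return(aio);
}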
I'm working on a SOM board running embedded Linux for ARM, and I'm developing a C program to communicate with an external device through a serial (RS232) port. I am experiencing strange behavior, though. I'm also using another serial port of the board to communicate with the Linux system running on the board.
The software has a simple structure: it is a text-only console-like program, with this as its main menu:
Possible commands:
1 - 4: Select serial device (pump should be on 1)
m - pump op. mode configuration
r - reads from the serial device
w - writes to the serial device
>>>>>>>>>>>>>>Current device is /dev/ttymxc1
>>>>>>>>>>>>>>Enter input (q quits):
and a secondary menu (opened by the "m" option above):
SPEED:
r - rpm (sends 1M<CR>) //<CR> stands for carriage return
f - flow rate (sends 1N<CR>)
QUANTITY:
v - volume (sends 1H<CR>)
t - time (sends 1O<CR>)
DIRECTION:
c - clockwise (sends 1T<CR>)
a - c-clockwise (sends 1K<CR>)
>>>>>>>>>>>>>>Enter input (q quits):
Communication using the main menu options "r" and "w" works fine (thus removing any doubt I may have regarding serial settings like baud rate): "w" invokes a routine ("serial_write" below) that sends a single character input by the user, while "r" returns the data read as soon as it arrives (using "serial_read" below). The character I send arrives correctly, and the answer is shown correctly on the console, no matter how many times I repeat the "w" and "r" cycle.
The options in the secondary menu should behave in the same way: they simply invoke a routine ("sendSimpleCmd" below) that calls "serial_write" with a constant char as argument (different for each option), and then calls "serial_read".
The problem is that this only works for the first option selected: after that, the program keeps sending the data linked to the first option selected, no matter which option I choose. It keeps doing this until I go back to the main menu and choose "m" again: at this point the data sent is what I expect, but subsequent choices are ignored until I go back to the main menu (or close the software, if that matters).
The strangest thing is that I receive the expected data on the serial port I'm using to communicate with the board, while on the "right" serial port I keep getting the first message. This is the text pasted from the console when I choose "a" as the second option, after having chosen "f" as the first option (comments added by me):
SPEED:
r - rpm (sends 1M<CR>)
f - flow rate (sends 1N<CR>)
QUANTITY:
v - volume (sends 1H<CR>)
t - time (sends 1O<CR>)
DIRECTION:
c - clockwise (sends 1T<CR>)
a - c-clockwise (sends 1K<CR>)
>>>>>>>>>>>>>>Enter input (q quits):
a //second option
1Knding 1M //mixup of data
wrote 4 characters on fs 4
serial_read: *
The mixup is between the software's own output ("Sending 1M") and the data that should have been sent after choosing option "a" (1K). Since on the "right" port I get the same message over and over, while on the "wrong" port I get the right message, it seems that the software somehow changes port on its own.
The question is:
Could this behavior be caused by my code, or is it bound to something else, like some kernel configuration? If more information is needed, just ask.
Thank you in advance
Serial_write
void serial_write(char text[], int length){
if (selectedDevice == 0){
printf("Select device first!\r\n");
return;
}
int n;
length = length +1 + 2;
char toBeSent[length];
strcat(toBeSent, PUMP_CMD_MSG_START); //header, "1"
strcat(toBeSent, text);
strcat(toBeSent, PUMP_CMD_MSG_END); //footer, "<CR>"
printf("Sending %s\r\n", toBeSent);
n = write (fd, toBeSent, length);
if (n<0){
printf("writing failed on /dev/ttymxc%i\r\n", selectedDevice);
} else {
printf("wrote %i characters on fs %i\r\n", n, fd);
}
}
Serial_read
int serial_read(char *buffer, int size){
int bytes = 0;
int n;
int i = 0;
char tmp_buffer[size];
while(1){
ioctl(fd, FIONREAD, &bytes);
if (bytes > 0){
break;
}
i++;
if(i> 1000){
printf("FIONREAD tries exceeded 1000, aborting read\r\n");
return -1;   /* int function: return an error code so the caller can detect the timeout */
}
usleep(1000);
}
n=read(fd, tmp_buffer, sizeof(tmp_buffer));
for(i=0;i<n;i++) {
buffer[i]=tmp_buffer[i];
}
printf("serial_read: %s\r\n", buffer);
return 0;
}
sendSimpleCmd
void sendSimpleCmd(char text[]){
int bufSize= 20;
char answer[bufSize];
serial_write(text,1);
if (serial_read(answer, bufSize) == 0) {
printf("Ricevuto da pompa \"%s\":", answer);
//handling of possible answers, doesn't do anything relevant since it always receives "*" as answer
if (strcmp(answer, PUMP_ANS_OK) == 0){ //PUMP_ANS_OK is "*"
printf("ok!\r\n");
} else if (strcmp(answer, PUMP_ANS_NOK) == 0){
printf("errore!\r\n");
} else {
printf("sconosciuto!\r\n");
}
} else {
// printf("read failed\r\n");
}
}
You should be initializing toBeSent's contents before using strcat.
Your compiler might be saving you by initializing the array with 0s rather than garbage, but if not, it could be causing buffer overflows. There's no code protecting against that, so this could be the cause of the unexpected program behavior. Without seeing the rest of your code and knowing some other details, it will be difficult to know exactly what the issue is. If this serves as an example of the rest of the code, then the solution is likely to revise the code to fix these issues.
Consider using safe string functions to help prevent buffer overflows.
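As a sketch of what that could look like (selectedDevice, fd and the PUMP_CMD_* macros are the ones from your program; sizing the buffer from strlen() of the header/footer is an assumption on my part):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

void serial_write(char text[], int length)
{
    if (selectedDevice == 0) {
        printf("Select device first!\r\n");
        return;
    }

    /* Size the buffer from the real header/footer lengths and leave room
       for the terminating '\0' that the string functions need. */
    size_t total = strlen(PUMP_CMD_MSG_START) + (size_t)length + strlen(PUMP_CMD_MSG_END);
    char toBeSent[total + 1];

    snprintf(toBeSent, sizeof(toBeSent), "%s%.*s%s",
             PUMP_CMD_MSG_START, length, text, PUMP_CMD_MSG_END);

    printf("Sending %s\r\n", toBeSent);
    int n = write(fd, toBeSent, total);          /* send the message without the '\0' */
    if (n < 0) {
        printf("writing failed on /dev/ttymxc%i\r\n", selectedDevice);
    } else {
        printf("wrote %i characters on fs %i\r\n", n, fd);
    }
}

snprintf() never writes past the buffer and always null-terminates, so the accidental-overflow path is gone regardless of what was in the array beforehand.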
static const int MAX_BUFFER_LEN = 1024*12; // in bytes
char *bff = new char[MAX_BUFFER_LEN];
int fileflag = O_CREAT | O_WRONLY | O_NONBLOCK;
fl = open(filename, fileflag, 0666);
if (fl < 0)
{
    printf("can not open file! \n");
    return -1;
}
do
{
    // begin one loop
    struct timeval bef;
    struct timeval aft;
    gettimeofday(&bef, NULL);
    write(fl, bff, MAX_BUFFER_LEN);
    gettimeofday(&aft, NULL);
    if (aft.tv_usec - bef.tv_usec > 20000) // ignore second condition
    {
        printf(" cost too long:%ld \n", aft.tv_usec - bef.tv_usec);
    }
    // end one loop
    // sleep
    usleep(30*1000); // sleep 30 ms
} while (1);
When I run the program on Linux Ubuntu 2.6.32-24-generic, I find that the "cost too long" message is printed 1~2 times per minute. I tried writing both to a USB disk and to a hard disk, and I also tried this program on an ARM platform; the same thing happened. I think that 3.2 Mbps is too high for a low-speed I/O device, so I reduced it to 0.4 Mbps, which significantly reduces the printing frequency. Is there any way to control the time cost?
Does write() just copy the data to a kernel buffer and return immediately, or does it wait for the disk I/O to complete? Is it possible that the kernel I/O buffer is full and write() has to wait for it to be flushed? But then why does it only take this long a few times?
You can't accelerate the disk, but you can do other stuff while the disk is working. You needn't wait for it to be done.
This is, however, highly non-trivial to do in C. You would need non-blocking I/O, multithreading, or multiprocessing. Try googling these keywords and how to use the different techniques (you are already using a non-blocking fd up there).
Your disk I/O performance is being negatively impacted by the code around each write to measure the time (and measuring time at this granularity is going to have occasional spikes as the computer does other things).
Instead, measure the performance of writing the entire data set: take start/end times outside the loop (with the loop properly bounded, of course).
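For instance, something like this (a sketch reusing fl, bff and MAX_BUFFER_LEN from your snippet; the iteration count is arbitrary and the usleep() is left out so only the writes are measured):

struct timeval start, end;
const int iterations = 1000;                     /* arbitrary bound for the test */

gettimeofday(&start, NULL);
for (int i = 0; i < iterations; i++) {
    write(fl, bff, MAX_BUFFER_LEN);              /* timed as a whole, not per call */
}
gettimeofday(&end, NULL);

double elapsed = (end.tv_sec - start.tv_sec)
               + (end.tv_usec - start.tv_usec) / 1e6;
printf("wrote %d buffers in %.3f s (%.1f KiB/s)\n",
       iterations, elapsed, iterations * MAX_BUFFER_LEN / 1024.0 / elapsed);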
If you are calling a file write that you think is going to take a lot of time, then make your process run two threads: while one is doing the main task, let the other write to disk.
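A rough sketch of that idea (all names are illustrative; the single-slot hand-off keeps it short, and a real version would use a ring buffer so a slow write never drops data):

#include <pthread.h>
#include <string.h>
#include <unistd.h>

#define CHUNK (12 * 1024)                 /* same size as MAX_BUFFER_LEN above */

static char pending[CHUNK];
static int  have_data = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int  out_fd;                       /* file descriptor opened by the main thread */

/* Writer thread: waits until the main thread hands over a buffer, then writes it. */
static void *writer_thread(void *arg)
{
    char local[CHUNK];
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!have_data)
            pthread_cond_wait(&cond, &lock);
        memcpy(local, pending, CHUNK);
        have_data = 0;
        pthread_mutex_unlock(&lock);
        write(out_fd, local, CHUNK);      /* the slow disk write happens off the main thread */
    }
    return NULL;
}

/* Called from the main loop instead of write(); returns immediately. */
static void submit_chunk(const char *buf)
{
    pthread_mutex_lock(&lock);
    memcpy(pending, buf, CHUNK);
    have_data = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
}

The main loop then calls submit_chunk() instead of write() and carries on without waiting for the disk.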
ANSWER
https://stackoverflow.com/a/12507520/962890
It was so trivial... argh! But lots of good information received. Thanks to everyone.
EDIT
link to github: https://github.com/MarkusPfundstein/stream_lame_testing
ORIGINAL POST
I have some questions regarding IPC through pipes. My goal is to receive MP3 data over a TCP/IP stream, pipe it through LAME to decode it to WAV, do some math, and store it on disk (as a WAV). I am using non-blocking I/O for the whole thing.
What irritates me a bit is that the TCP/IP read is way faster than the pipe through LAME. When I send a ~3 MB MP3, the file gets read on the client side in a couple of seconds. In the beginning I can also write to the stdin of the LAME process, then it stops writing; it reads the rest of the MP3, and once that is finished I can write to LAME again. 4096 bytes take approximately 1 second (to write to and read from LAME). This is pretty slow, because I want to decode my WAV at a minimum of 128 kbps.
The OS is Debian with a 2.6 kernel, on this micro computer:
https://www.olimex.com/dev/imx233-olinuxino-maxi.html
65 MB RAM
400 MhZ
ulimit -a | grep pipe returns 512 x 8, meaning 4096, which is OK. It's a 32-bit system.
The weird thing is that
my_process | lame --decode --mp3input - output.wav
goes very fast.
Here is my fork_lame code (which essentially connects the stdout of my process to the stdin of LAME and vice versa):
static char * const k_lame_args[] = {
"--decode",
"--mp3input",
"-",
"-",
NULL
};
static int
fork_lame()
{
int outfd[2];
int infd[2];
int npid;
pipe(outfd); /* Where the parent is going to write to */
pipe(infd); /* From where parent is going to read */
npid = fork();
if (npid == 0) {
close(STDOUT_FILENO);
close(STDIN_FILENO);
dup2(outfd[0], STDIN_FILENO);
dup2(infd[1], STDOUT_FILENO);
close(outfd[0]); /* Not required for the child */
close(outfd[1]);
close(infd[0]);
close(infd[1]);
if (execv("/usr/local/bin/lame", k_lame_args) == -1) {
perror("execv");
return 1;
}
} else {
s_lame_pid = npid;
close(outfd[0]); /* These are being used by the child */
close(infd[1]);
s_lame_fds[WRITE] = outfd[1];
s_lame_fds[READ] = infd[0];
}
return 0;
}
These are the read and write functions. Please note that in write_lame_in, when I write to stderr instead of s_lame_fds[WRITE], the output appears almost immediately, so it is definitely the pipe through LAME. But why?
static int
read_lame_out()
{
char buffer[READ_SIZE];
memset(buffer, 0, sizeof(buffer));
int i;
int br = read(s_lame_fds[READ], buffer, sizeof(buffer) - 1);
fprintf(stderr, "read %d bytes from lame out\n", br);
return br;
}
static int
write_lame_in()
{
int bytes_written;
//bytes_written = write(2, s_data_buf, s_data_len);
bytes_written = write(s_lame_fds[WRITE], s_data_buf, s_data_len);
if (bytes_written > 0) {
//fprintf(stderr, "%d bytes written\n", bytes_written);
s_data_len -= bytes_written;
fprintf(stderr, "data_len write: %d\n", s_data_len);
memmove(s_data_buf, s_data_buf + bytes_written, s_data_len);
if (s_data_len == 0) {
fprintf(stderr, "finished\n");
}
}
return bytes_written;
}
static int
read_tcp_socket(struct connection_s *connection)
{
char buffer[READ_SIZE];
int bytes_read;
bytes_read = connection_read(connection, buffer, sizeof(buffer)-1);
if (bytes_read > 0) {
//fprintf(stderr, "read %d bytes\n", bytes_read);
if (s_data_len + bytes_read > sizeof(s_data_buf)) {
fprintf(stderr, "BUFFER OVERFLOW\n");
return -1;
} else {
memcpy(s_data_buf + s_data_len,
buffer,
bytes_read);
s_data_len += bytes_read;
}
fprintf(stderr, "data_len: %d\n", s_data_len);
}
return bytes_read;
}
The select stuff is pretty basic select() logic. All file descriptors are non-blocking, of course.
Does anyone have any idea? I'd really appreciate any help ;-)
Oops! Did you check your LAME output?
Looking at your code, in particular
static char * const k_lame_args[] = {
"--decode",
"--mp3input",
"-",
"-",
NULL
};
and
if (execv("/usr/local/bin/lame", k_lame_args) == -1) {
means you are accidentally omitting the --decode flag as it will be argv[0] for LAME, instead of the first argument (argv[1]). You should use
static char * const k_lame_args[] = {
/* argv[0] */ "lame",
/* argv[1] */ "--decode",
/* argv[2] */ "--mp3input",
/* argv[3] */ "-",
/* argv[4] */ "-",
NULL
};
instead.
I think you are seeing a slowdown because you're accidentally recompressing the MP3 audio. (I noticed this just a minute ago, so haven't checked if LAME does that if you omit the --decode flag, but I believe it does.)
It is possible there is some sort of a blocking issue wrt. nonblocking pipes (not really being nonblocking), causing your end to block until LAME consumes the data.
Could you try an alternative approach? Use normal, blocking pipes, and a separate thread (using pthreads), which has the singular purpose of writing data from a circular buffer to LAME. Your main thread then keeps filling the circular buffer from your TCP/IP connection, and can easily also track and report buffer levels -- very useful during development and debugging. I've had much better success with blocking pipes and threads than nonblocking pipes, in general.
In Linux, threads really do not have that much overhead, so you should be comfortable using them even on embedded architectures. The only trick you must master is specifying a sensible stack size for the worker thread -- in this case 16384 bytes is quite likely enough -- because only the initial stack given to the process grows automatically, and thread stacks are fixed and by default quite large.
Do you need example code?
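For instance, a rough sketch of such a worker thread (the ring-buffer type, the sizes and all names are assumptions, not from your post):

#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

#define RING_SIZE (64 * 1024)          /* assumed buffer size */

struct ring {
    char            data[RING_SIZE];
    size_t          head, tail, fill;  /* fill = bytes currently buffered */
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    int             lame_fd;           /* blocking write end of the pipe to LAME */
};

/* Worker thread: drain the ring into LAME's stdin with ordinary blocking writes. */
static void *lame_feeder(void *arg)
{
    struct ring *r = arg;
    char chunk[4096];

    for (;;) {
        pthread_mutex_lock(&r->lock);
        while (r->fill == 0)
            pthread_cond_wait(&r->not_empty, &r->lock);

        size_t n = r->fill < sizeof(chunk) ? r->fill : sizeof(chunk);
        for (size_t i = 0; i < n; i++) {          /* copy out, wrapping around */
            chunk[i] = r->data[r->tail];
            r->tail = (r->tail + 1) % RING_SIZE;
        }
        r->fill -= n;
        pthread_mutex_unlock(&r->lock);

        write(r->lame_fd, chunk, n);   /* may block here; that is fine in this thread */
    }
    return NULL;
}

The main thread appends incoming TCP data at r->head under the same mutex and signals not_empty; the thread itself would be created with pthread_attr_setstacksize() to get the small fixed stack mentioned above.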
Edited to add:
Your program probably receives data from the TCP/IP connection at a steady rate. However, LAME consumes the data in largeish chunks. In other words, the situation is like a car being towed, with the tow car jerking and stopping and the towed car bumping into it every time: both your process and LAME spend most of their time waiting for the other to send or receive more data.
First, those two close calls are not required (actually, you shouldn't do that), because the two dup2 calls that follow will close those descriptors automatically:
close(STDOUT_FILENO);
close(STDIN_FILENO);
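So the child-side setup can simply be (same descriptors as in fork_lame above):

/* dup2() atomically closes the target descriptor if it is already open,
   so no explicit close(STDIN_FILENO)/close(STDOUT_FILENO) is needed. */
dup2(outfd[0], STDIN_FILENO);   /* read data from the parent */
dup2(infd[1], STDOUT_FILENO);   /* send decoded output back to the parent */
close(outfd[0]);
close(outfd[1]);
close(infd[0]);
close(infd[1]);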