Gibberish in read() buffer? - c

I am new to C, and so I am completely confused by the following behavior. Using pipe() and fork() I am reading the output of the following trivial ruby program:
puts "success"
via a call to the read function in C:
n = read(fd[0], readbuffer, sizeof(readbuffer));
printf("received: %s", readbuffer);
However, printf is printing a bunch of those 'unrecognised character' symbols (like the question mark in a diamond) to the console. Furthermore, doing a comparison like:
if (strcmp(readbuffer, "success") == 0)
{
/* do something */
}
fails. What am I doing wrong?
Edit: Declarations as requested. I have no idea about memsetting, my first day in C.
int fd[2], in;
pid_t pid;
char readbuffer[6];
Edit:
The answer by 'mu is too short' also solves the problem. The consensus seems to be that using memset is overkill. I am a novice C programmer so I will have to believe the commenters' opinions. This is, however, argumentum ad populum and mu is too short may indeed be more in the right. In any case, I recommend a reading of both answers as any 'overkill' is probably still trivially so.

As others have pointed out, your buffer isn't big enough to hold the text you're reading, and you don't ensure it's null terminated.
But using memset() to zero the entire buffer before each read is unnecessary; you just need to ensure that there's a null at the end of data you've read (and make your buffer bigger of course).
If you make readbuffer at least 9 characters long, and replace:
n = read(fd[0], readbuffer, sizeof(readbuffer));
..with..
n = read(fd[0], readbuffer, sizeof(readbuffer) - 1);
readbuffer[n] = '\0';
..then that should do it (though you should ideally check that n is >= 0 to make sure the read() succeeded). Specifying one less than the size of the read buffer ensures that readbuffer[n] won't overrun (but if read() failed it could underrun).
Then you'll just have to deal with the linefeed at the end.
This also assumes that the entire string is read in one read call. That's likely in this case, but often when using read it's necessary to concatenate multiple reads until you've read enough of the data.
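The fix described above could be wrapped up roughly like this. It is only a sketch: read_line_once is a name made up for illustration, it assumes the whole message arrives in a single read(), and it strips at most one trailing linefeed.

```c
#include <unistd.h>

/* Hypothetical helper (not from the question): reads up to size - 1 bytes
 * from fd into buf, terminates the result, and strips one trailing '\n'.
 * Returns the resulting string length, or -1 if read() failed. */
ssize_t read_line_once(int fd, char *buf, size_t size)
{
    if (size == 0)
        return -1;

    ssize_t n = read(fd, buf, size - 1);   /* leave room for the '\0' */
    if (n < 0) {
        buf[0] = '\0';                     /* never index with a negative n */
        return -1;
    }
    buf[n] = '\0';

    if (n > 0 && buf[n - 1] == '\n')       /* drop the linefeed puts added */
        buf[--n] = '\0';
    return n;
}
```

With a readbuffer of at least 9 bytes, calling read_line_once(fd[0], readbuffer, sizeof(readbuffer)) should leave a string that strcmp() can compare against "success".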

As I understand it, your line:
puts "success"
will output the eight bytes
success\n
and a C string needs one more byte for the terminating '\0', which I count as 9 characters.
You declared readbuffer as only 6. The previous answer only upped it to 7.

As the comments note, you won't have a null terminator on readbuffer so it isn't really a C string. You could do this:
#include <string.h>
/* ... */
memset(readbuffer, 0, sizeof(readbuffer));
n = read(fd[0], readbuffer, sizeof(readbuffer) - 1);
That will give you a proper null terminated string. But, if you actually want a string of length 6, then change the declaration of readbuffer to:
char readbuffer[7];
If you only need your readbuffer once, you could say:
char readbuffer[7] = { 0 };
to initialize it to all zeros. However, if you're doing the read in a loop then you'll want to memset(readbuffer, 0, sizeof(readbuffer)) before each read to make sure you don't end up with any leftover data from the last step.
C won't automatically initialize a local variable; you have to do it yourself.

Related

Use of string functions with manually set NUL-terminator

This might sound like a silly question, but I learned that sometimes, especially in C, there are things that seem obvious but aren't really safe or correct.
I have a char buffer that gets filled with text (no binary data is expected) via HTTP.
Now I want to process the request body and I think strstr() is exactly what I want.
However, strstr() needs both strings to be nul terminated.
Since I have no control over what the user will actually send, I decided to just terminate the "string" (buffer) at the end like this:
char buffer[1024];
// receive request
readHTTPRequest(buffer, sizeof(buffer));
// buffer contents is undetermined
buffer[sizeof(buffer) - 1] = 0; // always terminate buffer
const char *request_body = strstr(buffer, "\r\n\r\n");
if (request_body) {
size_t request_body_size = strlen(request_body);
}
Is this approach safe? Am I missing something?
This will only work if the buffer was completely filled. If not, you'll have uninitialized bytes in between what was actually read and the last byte.
A simple way to handle this is to initialize the buffer with all zeros:
char buffer[1024] = {0};
Or, if readHTTPRequest returns the number of bytes read, use that value instead as the index to write the 0 byte to.
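A minimal sketch of that second option, assuming the receive call reports how many bytes it stored. The helper name find_body and its parameters are made up for illustration, and unlike the question's code it returns a pointer just past the blank line rather than at it.

```c
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: 'received' is the byte count the receive call
 * reported (negative on error). Terminates buf right after the received
 * data and returns a pointer to the text after "\r\n\r\n", or NULL. */
const char *find_body(char *buf, size_t size, ssize_t received)
{
    if (size == 0 || received < 0)
        return NULL;
    if ((size_t)received >= size)
        received = (ssize_t)(size - 1);    /* keep room for the terminator */
    buf[received] = '\0';                  /* nothing uninitialized is scanned */

    const char *sep = strstr(buf, "\r\n\r\n");
    return sep != NULL ? sep + 4 : NULL;
}
```

Note that a 0 byte inside the received data would still cut the string short, so this only suits text payloads, as the question assumes.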

C UNIX - read() reads non-existent letters

I've got a little problem while experimenting with some C code. I've tried to use read() to read text from a file and store the result in a char array. But when I print the result, it is always different from the file.
Here is the code:
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
void main() {
int fd = open("file", 2);
char buf[2];
printf("Read elements: %ld\n", read(fd, buf, 2));
printf("%s\n", buf);
close(fd);
}
The file "file" was created in the same directory using the following UNIX commands:
cat > file
Hi
So it contains just the word "Hi". When I run it, I expect it to read 2 bytes from the file (which are 'H' and 'i') and store them at buf[0] and buf[1]. But when I print the result, it appears that there is a problem: besides the word "Hi", several weird characters are printed (indicating a memory reading/writing problem, I guess, due to a bad buffer size). I've tried increasing the size of the buf array, and when I change the size, the weird characters change too. The problem goes away when the size reaches 32 bytes.
Can someone explain to me in detail why this is happening?
What I've understood so far is that read() does not store a '\0' after the data it reads, and that the third parameter of read() is the maximum number of bytes to read.
Another thing I've noticed while experimenting with the above code: say you change the third parameter of read() (maximum bytes to read) to 3 and the size of the buf array to 512 (overkill, I know, but I really wanted to see what would happen). Now read() will actually read a third character (in my case 'e') and store it in the buffer, even though this third character does not exist.
I've searched Stack Overflow for a while now and found many similar cases, but none of them helped me understand my problem. If there is another thread I missed, I'd be glad if you could link it.
Lastly, sorry for my bad English; it's not my native language.
Clearly you need to make buf 3 bytes long and use the last byte as the null byte (0 or '\0'). That way, when you print the string, the computer doesn't carry on until it finds another 0!
The way strings (char arrays really) are handled in C is quite straightforward. Most if not all string functions assume that their string parameters are null terminated (puts) or return null-terminated strings (strdup).
The point is that the computer can't tell where a string ends unless it is given the string's size every time it processes it. The simplest way around this was to append a 0 (the null byte) after each string. That way, the computer just needs to iterate over the string's characters and stop when it finds the terminating character (another name for the null byte).
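To make that scanning concrete, here is an illustrative loop that does roughly what strlen() does. The name count_until_nul is made up; it is a sketch of the idea, not how any particular C library implements it.

```c
#include <stddef.h>

/* Illustrative only: walks the characters until it meets the null byte.
 * If s has no terminator, the loop runs past the end of the array, which
 * is undefined behaviour and where the stray characters come from. */
size_t count_until_nul(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')
        len++;
    return len;
}
```

With char buf[3] = { 'H', 'i', '\0' } the loop stops after two characters. With the question's two-byte buf there is no terminator inside the array, so where it stops depends on whatever happens to follow it in memory.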

read (C function) behaves strangely

Consider this fragment of code:
char *buffer = (char*) malloc(MAX_LENGTH_OF_COMMAND);
while(1){
printf("gsh> ");
read(0, buffer, sizeof(buffer) );
}
The behaviour is quite strange: the output for the input "input,input,input" is "gsh> gsh> gsh> gsh>".
My guess was that the I/O gets interrupted (I mean while getting data from the user) because waiting for the user is a waste of time. OK, I understand that. But what if I have to use the buffer on the next line, for example:
char *buffer = (char*) malloc(MAX_LENGTH_OF_COMMAND);
while(1){
printf("gsh> ");
read(0, buffer, sizeof(buffer) );
// do something with buffer.
}
So I need the COMPLETE buffer (input). I don't understand what is happening, and I don't know how to ensure that the complete input is available.
Please explain (and correct my train of thought).
Thanks in advance.
You just discovered that read() doesn't guarantee how many bytes it will read. You normally have to call read() in a loop until you find input delimiting characters (such as a newline). In addition, we note that after said newline, you will need to keep whatever remains in the read buffer (if anything) as it is valid input to the next thing that needs to read().
Please note that read()'s return is the number of bytes it read, and your input will not be null-terminated (because it's not expecting a string).
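One way to sketch such a loop, keeping whatever follows the newline for the next call. The struct line_reader and next_line names and the LINE_CAP size are assumptions made for illustration, and retrying on EINTR is left out. Also worth noting for the code in the question: sizeof(buffer) on a char * is the size of the pointer, not of the malloc'ed block, so each read() there only asks for a few bytes, which would explain the repeated prompt.

```c
#include <string.h>
#include <unistd.h>

#define LINE_CAP 256   /* assumed maximum line length for this sketch */

/* Hypothetical reader state: bytes received but not yet handed out. */
struct line_reader {
    int fd;
    char pending[LINE_CAP];
    size_t used;
};

/* Copies the next line (without its '\n') into out and terminates it.
 * Returns the line length, or -1 on EOF, on error, or if the line does
 * not fit in out or in the pending buffer. */
ssize_t next_line(struct line_reader *r, char *out, size_t out_size)
{
    for (;;) {
        char *nl = memchr(r->pending, '\n', r->used);
        if (nl != NULL) {
            size_t len = (size_t)(nl - r->pending);
            if (len >= out_size)
                return -1;
            memcpy(out, r->pending, len);
            out[len] = '\0';
            /* keep what followed the newline as input for the next call */
            r->used -= len + 1;
            memmove(r->pending, nl + 1, r->used);
            return (ssize_t)len;
        }
        if (r->used == sizeof(r->pending))
            return -1;                     /* line longer than LINE_CAP */

        ssize_t n = read(r->fd, r->pending + r->used,
                         sizeof(r->pending) - r->used);
        if (n <= 0)
            return -1;                     /* end of input or error */
        r->used += (size_t)n;
    }
}
```

A shell loop would initialize the struct once, for example struct line_reader r = { .fd = 0, .used = 0 };, and then call next_line() after printing each prompt.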

How do I properly store characters in an array using read?

I have written the following code, and I don't understand why read is not storing the characters the way I expect:
char temp;
char buf[256];
while(something) {
read (in,&temp, 1);
buf[strlen(buf)] = temp;
}
If I print temp and the last place of the buf array as I am reading, sometimes they don't match up. For example maybe the character is 'd' but the array contains % or the character is 0 and the array contains .
I am reading less than 256 characters but it doesn't matter because I am printing as I am reading.
Am I missing something obvious?
Yes, you're not initializing buf -- strlen(buf) is undefined. You should initialize it like so:
buf[0] = 0;
Also, it's better to keep track of the length instead of calling strlen each iteration to avoid a Shlemiel the painter algorithm.
You should also be checking for errors in the call to read(2) -- if it returns -1 or 0, you should break out of your loop, since it means either an error occurred or you reached the end of the file/input stream.
Don't use strlen in this code. strlen relies on its argument being a NUL-terminated C string. So unless you initialize your entire buffer to 0, this code doesn't work.
At any rate strlen isn't a good choice to use when buffering data, even if you know that you're working with printable string data, if only because strlen will traverse the string every time just to get your length.
Keep a separate counter, named e.g. numRead, only append to buf at the numRead position, and increment numRead by the amount that you read.
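Put together, the advice in these answers might look like the sketch below. The function name fill_buffer is made up; num_read is the counter suggested above, and the loop stops on end of input, on error, or when only the terminator slot is left.

```c
#include <unistd.h>

/* Hypothetical helper: appends bytes from fd to buf one at a time,
 * tracking the length in num_read instead of calling strlen().
 * Always terminates buf; returns the number of bytes stored. */
size_t fill_buffer(int fd, char *buf, size_t size)
{
    size_t num_read = 0;
    char temp;

    if (size == 0)
        return 0;

    while (num_read < size - 1) {
        ssize_t n = read(fd, &temp, 1);
        if (n <= 0)                 /* 0 is end of input, -1 is an error */
            break;
        buf[num_read++] = temp;     /* append at the tracked position */
    }
    buf[num_read] = '\0';
    return num_read;
}
```

Reading one byte per call keeps the example close to the question; reading larger chunks and adding the returned count to num_read would cut down on system calls.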

Printf a buffer of char with length in C

I have a buffer which I receive through a serial port. When I receive a certain character, I know a full line has arrived, and I want to print it with printf. But each line has a different length, and when I just go with:
printf("%s", buffer);
I'm printing the line plus additional chars belonging to the former line (if it was longer than the current one).
I read here that it is possible, at least in C++, to tell how many chars you want to read for a given %s, but it has no examples and I don't know how to do it in C. Any help?
I think I have three solutions:
printing char by char with a for loop
using the termination character
or using .*
The question is: which one is faster? I'm working on a Microchip PIC and I want it to be as fast as possible.
You can either add a null character after your termination character, so that your printf works as is, or you can use '.*' in the printf format and pass the length (as an int):
printf("%.*s",len,buf);
In C++ you would probably use the std::string and the std::cout instead, like this:
std::cout << std::string(buf,len);
If all you want is the fastest speed and no formatting -- then use
fwrite(buf,1,len,stdout);
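For comparison, here are both length-based calls wrapped in small functions. The names print_n and write_n are made up for illustration, and which one is faster on a given target is something to measure rather than assume.

```c
#include <stdio.h>

/* Hypothetical wrapper: prints at most len bytes of buf via "%.*s".
 * The precision argument must be an int, hence the cast. Output stops
 * early if a null byte appears within the first len bytes. */
int print_n(FILE *out, const char *buf, size_t len)
{
    return fprintf(out, "%.*s", (int)len, buf);
}

/* Hypothetical wrapper: writes exactly len bytes with no format parsing.
 * Returns the number of bytes actually written. */
size_t write_n(FILE *out, const char *buf, size_t len)
{
    return fwrite(buf, 1, len, out);
}
```

Neither call needs the buffer to be null terminated, so leftover characters from a longer previous line are never printed.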
The string you have is not null-terminated, so, printf (and any other C string function) cannot determine its length, thus it will continue to write the characters it finds there until it stumbles upon a null character that happens to be there.
To solve your problem you can either:
use fwrite over stdout:
fwrite(buffer, buffer_length, 1, stdout);
This works because fwrite is not meant only for printing strings but for writing any kind of data, so it doesn't look for a terminating null character and instead takes the length of the data to be written as a parameter;
null-terminate your buffer manually before printing:
buffer[buffer_length]=0;
printf("%s", buffer); /* or, slightly more efficient: fputs(buffer, stdout); */
This could be a better idea if you have to do any other string processing over buffer, that will now be null-terminated and so manageable by normal C string processing functions.
Once you've identified the end of the line, you must append a '\0' character to the end of the buffer before sending it to printf.
You can put a NUL (0x0) in the buffer after receiving the last character.
buffer[i] = 0;

Resources