Use of string functions with manually set NUL-terminator - c

This might sound like a silly question, but I learned that sometimes, especially in C, there are things that seem obvious but aren't really safe or correct.
I have a char buffer that gets filled with text (no binary data is expected) via HTTP.
Now I want to process the request body and I think strstr() is exactly what I want.
However, strstr() needs both strings to be nul terminated.
Since I have no control over what the user will actually send, I decided to just terminate the "string" (buffer) at the end like this:
char buffer[1024];
// receive request
readHTTPRequest(buffer, sizeof(buffer));
// buffer contents is undetermined
buffer[sizeof(buffer) - 1] = 0; // always terminate buffer
const char *request_body = strstr(buffer, "\r\n\r\n");
if (request_body) {
size_t request_body_size = strlen(request_body);
}
Is this approach safe? Am I missing something?

This will only work if the buffer was completely filled. If not, you'll have uninitialized bytes in between what was actually read and the last byte.
A simple way to handle this is to initialize the buffer with all zeros:
char buffer[1024] = {0};
Or, if readHTTPRequest returns the number of bytes read, use that value instead as the index to write the 0 byte to.
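The second suggestion could look roughly like the sketch below. It assumes readHTTPRequest returns the number of bytes it stored, which the question does not state, and find_body is a hypothetical helper name. Unlike the code in the question, it returns a pointer just past the "\r\n\r\n" separator, so the separator is not counted as part of the body.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: terminate buf right after the nread bytes that were
 * actually received, then look for the blank line that ends the headers.
 * Returns a pointer to the first byte of the body, or NULL if no
 * "\r\n\r\n" separator was found. */
static const char *find_body(char *buf, size_t cap, size_t nread)
{
    assert(buf != NULL && cap > 0);
    if (nread >= cap)          /* clamp so the terminator stays in bounds */
        nread = cap - 1;       /* (this drops the last byte of a full buffer) */
    buf[nread] = '\0';         /* terminate at the end of the real data */

    const char *sep = strstr(buf, "\r\n\r\n");
    return sep ? sep + 4 : NULL;   /* skip past the separator itself */
}
```

A call might be find_body(buffer, sizeof(buffer), n), where n is the assumed return value of readHTTPRequest. If the client sends a zero byte inside the request, strstr and strlen will still stop there, so the byte count remains the reliable measure of how much was received.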

Related

Why does a dynamically allocated array not update as new data arrives?

I am trying to receive a message from the socket server which sends a large file of around 7MB. Thus in the following code, I try to concatenate all data into one array s from buffer. But as I try the following, I see that the length of s does not change at all, although the total bytes received continue to increase.
char buffer[300];
char* s = calloc(1, sizeof(char));
size_t n = 1;
while ((b_recv = recv(socket_fd,
buffer,
sizeof(buffer), 0)) > 0) {
char *temp = realloc(s, b_recv + n);
s = temp;
memcpy(s + n -1, buffer, b_recv);
n += b_recv;
s[n-1] = '\0';
printf("%s -- %zu",s, strlen(s));
}
free(s);
Is this not the correct way to update receive data of varying sizes? Also when I try to print s, it gives some random question mark characters. What is the mistake that I am making?
You have not presented any reason to believe that the behavior is as the question characterizes it. You are receiving binary data and storing it in memory, which is fine, but you cannot expect sensible results from treating such data as if it were a C string. Not even when you append a string terminator after the last byte.
Binary data can and generally does contain bytes with value 0. C strings use such bytes as terminators marking the end of the string data, so, for example, strlen will measure only the number of bytes before the first zero byte, regardless of how many additional bytes have been stored after it. The terminator your code writes after each chunk does not change that: it only marks the end of the stored data, and strlen never gets that far if a zero byte appears earlier.
You may attempt to print such data to the console as if it were text, but if in fact it does not consist of text encoded according to the runtime character encoding then there is no reason to expect the resulting display to convey useful information. Instead, examine it in memory via a debugger, or write the raw bytes to a file and examine the result with a hex editor, or write them (still raw) through a filter that converts to hexadecimal or some other text representation, or similar. And you have as many bytes to examine as you have copied to the allocated space. You're keeping track of that already, so you don't need strlen() to tell you how many that is.
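One way to keep track of the received bytes without relying on strlen is sketched below. The struct bytebuf type and the bytebuf_append function are hypothetical names introduced for illustration; the idea is only that the length is stored next to the data, so zero bytes are treated like any other byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer: len counts the bytes stored so far,
 * so zero bytes inside the data are handled like any other byte. */
struct bytebuf {
    unsigned char *data;
    size_t len;
};

/* Append n bytes from src. Returns 0 on success, -1 if realloc fails
 * (in which case the existing contents are left untouched). */
static int bytebuf_append(struct bytebuf *b, const void *src, size_t n)
{
    assert(b != NULL && (src != NULL || n == 0));
    if (n == 0)
        return 0;
    unsigned char *tmp = realloc(b->data, b->len + n);
    if (tmp == NULL)
        return -1;
    memcpy(tmp + b->len, src, n);   /* copy right after the existing bytes */
    b->data = tmp;
    b->len += n;
    return 0;
}
```

In the receive loop from the question this would be called as bytebuf_append(&buf, buffer, (size_t)b_recv) with buf initialized to {NULL, 0}. Afterwards buf.len is the total size, the raw bytes can be written out with fwrite(buf.data, 1, buf.len, fp), and free(buf.data) releases the memory.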

strcpy() always misses some characters

I'm working on a project that uses UDP to transfer a file, but when I use strcpy() to copy a buffer into another string, some characters are always missing.
The simple idea is that:
I defined a struct:
struct frame{
int kind;//transmission(0) or retransmission(1)
int seq;
int ack;
char info[256];
};
Then I use fread to get the content of a text file into the buffer:
char buffer[256] = {0};
fread(buffer, 256, 1, fp);//read file: 256 byte
struct frame currFrame;
currFrame.ack = 0;
bzero(currFrame.info, 256);
strcpy(currFrame.info, buffer); //store the content to transfer
printf("%s\n", buffer);
printf("%s\n", currFrame.info);
The code above is in a for loop because I read the file multiple times.
When I use printf(), half the time the result is right. But half the time the two outputs differ (characters are missing at the head). How can I fix this?
The output is attached (the upper one is buffer, which is right):
The strcpy function is only for strings. To copy arbitrary data, use memcpy. Also, the %s format specifier is only for strings. Functions like fread read arbitrary binary data and don't try to form strings.
Also, you called fread in such a way that it won't tell you how many bytes it actually read. Unless you're positive you're always going to read exactly 256 bytes, that isn't smart. Instead, set the second parameter of fread to 1 and use the third parameter to set the maximum number of bytes to read. And don't ignore the return value -- that's how you know how many bytes it was actually able to read.
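The advice above might be applied along these lines. This is a sketch, not the asker's actual program: the info_len field and the fill_frame function are additions made for illustration, so that the receiver can know how many bytes of info are valid.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct frame {
    int kind;            /* transmission (0) or retransmission (1) */
    int seq;
    int ack;
    size_t info_len;     /* assumption: extra field, not in the question */
    char info[256];
};

/* Read up to sizeof f->info bytes from fp into the frame.
 * Returns the number of bytes read; 0 means EOF or a read error. */
static size_t fill_frame(struct frame *f, FILE *fp)
{
    assert(f != NULL && fp != NULL);
    char buffer[sizeof f->info];
    /* element size 1, count 256: the return value is a byte count */
    size_t got = fread(buffer, 1, sizeof buffer, fp);
    memcpy(f->info, buffer, got);   /* copies zero bytes too, unlike strcpy */
    f->info_len = got;
    return got;
}
```

To print the payload without assuming a terminator, something like printf("%.*s\n", (int)f.info_len, f.info) limits the output to the bytes that were read, although it still stops early at an embedded zero byte.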

Heap Overflow Attack

I am learning about heap overflow attacks and my textbook provides the following vulnerable C code:
/* record type to allocate on heap */
typedef struct chunk {
char inp[64]; /* vulnerable input buffer */
void (*process)(char *); /* pointer to function to process inp */
} chunk_t;
void showlen(char *buf)
{
int len;
len = strlen(buf);
printf("buffer5 read %d chars\n", len);
}
int main(int argc, char *argv[])
{
chunk_t *next;
setbuf(stdin, NULL);
next = malloc(sizeof(chunk_t));
next->process = showlen;
printf("Enter value: ");
gets(next->inp);
next->process(next->inp);
printf("buffer5 done\n");
}
However, the textbook doesn't explain how one would fix this vulnerability. If anyone could please explain the vulnerability and a way(s) to fix it that would be great. (Part of the problem is that I am coming from Java, not C)
The problem is that gets() will keep reading into the buffer until it reads a newline or reaches EOF. It doesn't know the size of the buffer, so it doesn't know that it should stop when it hits its limit. If the line is 64 bytes or longer, this will go outside the buffer, and overwrite process. If the user entering the input knows about this, he can type just the right characters at position 64 to replace the function pointer with a pointer to some other function that he wants to make the program call instead.
The fix is to use a function other than gets(), so you can specify a limit on the amount of input that will be read. Instead of
gets(next->inp);
you can use:
fgets(next->inp, sizeof(next->inp), stdin);
The second argument to fgets() tells it to write at most 64 bytes into next->inp. So it will read at most 63 bytes from stdin (it needs to allow a byte for the null string terminator).
The code uses gets, which is infamous for its potential security problem: there's no way to specify the length of the buffer you pass to it, it'll just keep reading from stdin until it encounters \n or EOF. It may therefore overflow your buffer and write to memory outside of it, and then bad things will happen - it could crash, it could keep running, it could start playing porn.
To fix this, you should use fgets instead.
If you fill next->inp with more than 64 bytes, the excess overwrites the address stored in process. That lets an attacker insert whatever address they wish; the address could point to any function.
To fix it, simply ensure that at most 63 bytes (leaving one for the null terminator) are read into the array inp: use fgets.
The function gets does not limit the amount of text that comes from stdin. If more than 63 chars come from stdin, there will be an overflow.
The gets discards the LF char, that would be an [Enter] key, but it adds a null char at the end, thus the 63 chars limit.
If inp is filled with 64 non-null chars, which is possible because it can be accessed directly, showlen may trigger an access violation, as strlen will search for the null char beyond inp to determine the length.
Using fgets would be a good fix for the first problem, but it also stores the LF char along with the null, so the new limit of readable text would be 62.
For the second, just take care of what is written to inp.
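The fgets-based fix the answers agree on, including the LF handling mentioned in the last one, can be sketched like this. read_line is a hypothetical helper name; fgets and strcspn are standard C library functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Read one line into dst (capacity cap, including the terminator).
 * Returns 0 on success, -1 on EOF or error. The trailing newline, if
 * fgets stored one, is removed. */
static int read_line(char *dst, size_t cap, FILE *in)
{
    assert(dst != NULL && in != NULL && cap > 0 && cap <= INT_MAX);
    if (fgets(dst, (int)cap, in) == NULL)   /* writes at most cap bytes */
        return -1;
    dst[strcspn(dst, "\n")] = '\0';         /* drop the LF that fgets keeps */
    return 0;
}
```

In the textbook program, gets(next->inp) would become read_line(next->inp, sizeof next->inp, stdin). When a line is longer than the buffer, the remainder stays in the stream and is returned by the next call, so the function pointer after inp is never overwritten.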

How do I properly store characters in an array using read?

I have written the following code, and I don't understand why read is not storing the characters the way I expect:
char temp;
char buf[256];
while(something) {
read (in,&temp, 1);
buf[strlen(buf)] = temp;
}
If I print temp and the last place of the buf array as I am reading, sometimes they don't match up. For example maybe the character is 'd' but the array contains % or the character is 0 and the array contains .
I am reading less than 256 characters but it doesn't matter because I am printing as I am reading.
Am I missing something obvious?
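Yes, you're not initializing buf, so the behavior of strlen(buf) is undefined. Setting only the first element is not enough, because each append overwrites the terminator and the byte after it is still uninitialized. Zero the whole buffer instead:
char buf[256] = {0};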
Also, it's better to keep track of the length instead of calling strlen each iteration to avoid a Shlemiel the painter algorithm.
You should also be checking for errors in the call to read(2) -- if it returns -1 or 0, you should break out of your loop, since it means either an error occurred or you reached the end of the file/input stream.
Don't use strlen in this code. strlen relies on its argument being a NUL-terminated C string. So unless you initialize your entire buffer to 0, this code doesn't work.
At any rate strlen isn't a good choice to use when buffering data, even if you know that you're working with printable string data, if only because strlen will traverse the string every time just to get your length.
Keep a separate counter, named e.g. numRead, only append to buf at the numRead position, and increment numRead by the amount that you read.
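A counter-based loop of the kind described above might look like the following sketch. read_into is a hypothetical name, the loop condition replaces the unspecified "something" from the question, and for brevity it does not retry a read() that was interrupted by a signal.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Read from fd until EOF, an error, or the buffer is full, tracking the
 * count in numRead instead of calling strlen. Returns the number of bytes
 * stored; buf is NOT NUL-terminated by this function. */
static size_t read_into(int fd, char *buf, size_t cap)
{
    assert(buf != NULL);
    size_t numRead = 0;
    while (numRead < cap) {
        ssize_t r = read(fd, buf + numRead, cap - numRead);
        if (r <= 0)            /* 0 = end of input, -1 = error */
            break;
        numRead += (size_t)r;  /* append position moves by what was read */
    }
    return numRead;
}
```

If the result is needed as a string, one option is to pass sizeof buf - 1 as cap and then set buf[n] = '\0', where n is the returned count.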

Gibberish in read() buffer?

I am new to C, and so I am completely confused by the following behavior. Using pipe() and fork() I am reading the output of the following trivial ruby program:
puts "success"
via a call to the read function in C:
n = read(fd[0], readbuffer, sizeof(readbuffer));
printf("received: %s", readbuffer);
However, printf is printing a bunch of those 'unrecognised character' symbols (like the question mark in a diamond) to the console. Furthermore, doing a comparison like:
if (strcmp(readbuffer, "success") == 0)
{
/* do something */
}
fails. What am I doing wrong?
Edit: Declarations as requested. I have no idea about memsetting, my first day in C.
int fd[2], in;
pid_t pid;
char readbuffer[6];
Edit:
The answer by 'mu is too short' also solves the problem. The consensus seems to be that using memset is overkill. I am a novice C programmer so I will have to believe the commenters' opinions. This is, however, argumentum ad populum, and mu is too short may indeed be more in the right. In any case, I recommend reading both answers, as any 'overkill' is probably still trivially so.
As others have pointed out, your buffer isn't big enough to hold the text you're reading, and you don't ensure it's null terminated.
But using memset() to zero the entire buffer before each read is unnecessary; you just need to ensure that there's a null at the end of data you've read (and make your buffer bigger of course).
If you make readbuffer at least 9 characters long, and replace:
n = read(fd[0], readbuffer, sizeof(readbuffer));
..with..
n = read(fd[0], readbuffer, sizeof(readbuffer) - 1);
readbuffer[n] = '\0';
..then that should do it (though you should ideally check that n is >= 0 to make sure the read() succeeded). Specifying one less than the size of the read buffer ensures that readbuffer[n] won't overrun (but if read() failed it could underrun).
Then you'll just have to deal with the linefeed at the end.
This also assumes that the entire string is read in one read call. That is likely in this case, but often when using read it's necessary to concatenate multiple reads until you've read enough of the data.
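Putting the pieces of this answer together (checking n, terminating at n, and dealing with the linefeed) could look like the sketch below. read_text is a hypothetical helper name, and like the answer it assumes the whole message arrives in a single read() call.

```c
#include <assert.h>
#include <string.h>   /* strcmp, used by callers on the result */
#include <unistd.h>

/* Perform one read() into buf, terminate it, and strip a trailing
 * linefeed. Returns the resulting string length, or -1 if read() failed. */
static ssize_t read_text(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    ssize_t n = read(fd, buf, cap - 1);   /* leave room for the terminator */
    if (n < 0)
        return -1;                        /* indexing buf[n] would underrun */
    buf[n] = '\0';
    if (n > 0 && buf[n - 1] == '\n')      /* Ruby's puts appends a linefeed */
        buf[--n] = '\0';
    return n;
}
```

With readbuffer declared as at least 9 characters, read_text(fd[0], readbuffer, sizeof(readbuffer)) followed by strcmp(readbuffer, "success") == 0 should then compare as expected.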
As I understand it, your line:
puts "success"
writes the 8 bytes
success\n
to the pipe; puts does not send a null terminator, so once you add the one C needs, you have 9 characters.
You declared readbuffer as only 6. The previous answer only upped it to 7.
As the comments note, you won't have a null terminator on readbuffer so it isn't really a C string. You could do this:
#include <string.h>
/* ... */
memset(readbuffer, 0, sizeof(readbuffer));
n = read(fd[0], readbuffer, sizeof(readbuffer) - 1);
That will give you a proper null terminated string. But, if you actually want a string of length 6, then change the declaration of readbuffer to:
char readbuffer[7];
If you only need your readbuffer once, you could say:
char readbuffer[7] = { 0 };
to initialize it to all zeros. However, if you're doing the read in a loop then you'll want to memset(readbuffer, 0, sizeof(readbuffer)) before each read to make sure you don't end up with any leftover data from the last step.
C won't automatically initialize a local variable, you have to do it yourself.
