How do I properly store characters in an array using read? - c

I have written the following code, and I don't understand why read is not storing the characters the way I expect:
char temp;
char buf[256];
while(something) {
read (in,&temp, 1);
buf[strlen(buf)] = temp;
}
If I print temp and the last place of the buf array as I am reading, sometimes they don't match up. For example maybe the character is 'd' but the array contains % or the character is 0 and the array contains .
I am reading less than 256 characters but it doesn't matter because I am printing as I am reading.
Am I missing something obvious?

Yes, you're not initializing buf, so strlen(buf) reads indeterminate bytes and its result is undefined. You should initialize it like so:
buf[0] = 0;
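Note that each append overwrites the terminator, so the next strlen call is only well defined if the byte after it is also 0; zeroing the whole array, or writing a new 0 after each appended character, covers that. Also, it's better to keep track of the length instead of calling strlen each iteration to avoid a Shlemiel the painter algorithm.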
You should also be checking for errors in the call to read(2) -- if it returns -1 or 0, you should break out of your loop, since it means either an error occurred or you reached the end of the file/input stream.

Don't use strlen in this code. strlen relies on its argument being a NUL-terminated C string, so unless you initialize your entire buffer to 0, this code doesn't work.
At any rate strlen isn't a good choice to use when buffering data, even if you know that you're working with printable string data, if only because strlen will traverse the string every time just to get your length.
Keep a separate counter, named e.g. numRead, only append to buf at the numRead position, and increment numRead by the amount that you read.
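The counter-based approach from the answers above can be sketched roughly as follows. This is only an illustration: read_all is a made-up helper name, and it assumes a POSIX system where read(2) is available and that the caller wants a NUL-terminated result.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): reads from fd until end
 * of input, an error, or a full buffer, and returns the number of bytes
 * stored. One byte of buf is reserved for the terminator. */
size_t read_all(int fd, char *buf, size_t cap)
{
    assert(cap > 0);
    size_t numRead = 0;

    while (numRead < cap - 1) {
        /* Append at the numRead position instead of calling strlen. */
        ssize_t n = read(fd, buf + numRead, cap - 1 - numRead);
        if (n <= 0)   /* 0 means end of input, -1 means an error */
            break;
        numRead += (size_t)n;
    }
    buf[numRead] = '\0';
    return numRead;
}
```

A caller would use it as `size_t len = read_all(in, buf, sizeof buf);`. Reading more than one byte per call is a design choice here; the same counter works with one-byte reads.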

Related

Use of string functions with manually set NUL-terminator

This might sound like a silly question, but I learned that sometimes, especially in C, there are things that seem obvious but aren't really safe or correct.
I have a char buffer that gets filled with text (no binary data is expected) via HTTP.
Now I want to process the request body and I think strstr() is exactly what I want.
However, strstr() needs both strings to be nul terminated.
Since I have no control over what the user will actually send, I decided to just terminate the "string" (buffer) at the end like this:
char buffer[1024];
// receive request
readHTTPRequest(buffer, sizeof(buffer));
// buffer contents is undetermined
buffer[sizeof(buffer) - 1] = 0; // always terminate buffer
const char *request_body = strstr(buffer, "\r\n\r\n");
if (request_body) {
size_t request_body_size = strlen(request_body);
}
Is this approach safe? Am I missing something?
This will only work if the buffer was completely filled. If not, you'll have uninitialized bytes in between what was actually read and the last byte.
A simple way to handle this is to initialize the buffer with all zeros:
char buffer[1024] = {0};
Or, if readHTTPRequest returns the number of bytes read, use that value instead as the index to write the 0 byte to.
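As a rough sketch of the second suggestion, assuming the receive function reports how many bytes it stored (the question does not show its signature), the terminator can be written right after the received data. The helper name find_body and its parameters are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: buf holds len received bytes and has capacity cap.
 * Terminates the text right after the received data and returns a pointer
 * to the first byte after the blank line, or NULL if "\r\n\r\n" has not
 * arrived yet. */
const char *find_body(char *buf, size_t cap, size_t len)
{
    assert(cap > 0);
    if (len >= cap)
        len = cap - 1;       /* a full buffer loses its last byte here */
    buf[len] = '\0';

    char *sep = strstr(buf, "\r\n\r\n");
    return sep ? sep + 4 : NULL;
}
```

Unlike the code in the question, this returns a pointer past the separator, so strlen on the result counts only the body. Receiving at most cap - 1 bytes in the first place avoids the truncation case.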

Getting strange characters from strncpy() function

I am supposed to load a list of names from a file, and then find those names in the second file and load them in a structure with some other data (for simplicity, I will load them into another array called "test").
The first part is just fine, I am opening a file and loading all the names into a 2dimensional array called namesArr.
The second part is where unexpected characters occur, and I can't understand why. Here is the code of the function:
void loadStructure(void){
char line[MAX_PL_LENGTH], *found;
int i, j=0;
char test[20][20];
FILE *plotPtr=fopen(PLOT_FILE_PATH, "r");
if (plotPtr==NULL){perror("Error 05:\nError opening a file in loadStructure function. Check the file path"); exit(-5);}
while(fgets(line, MAX_PL_LENGTH, plotPtr)!=NULL){ // This will load each line from a file to an array "line" until it reaches the end of file.
for(i=0; i<numOfNames; i++){ // Looping through the "namesArr" array, which contains the list of 20 character names.
if((found=strstr(line, namesArr[i]))!=NULL){ // I use strstr() to find if any of those names appear in the particular line.
printf("** %s", found); // Used for debugging.
strncpy(test[j], found, strlen(namesArr[i])); j++; // Copying the newly found name to test[j] (copying only the name, by defining its length, which is calculated by strlen function).
}
}
}
fclose(plotPtr);
printf("%s\n", test[0]);
printf("%s\n", test[1]);
printf("%s\n", test[2]);
}
This is the output I get:
...20 names were loaded from the "../Les-Mis-Names-20.txt".
** Leblanc, casting
** Fabantou seems to me to be better," went on M. Leblanc, casting
** Jondrette woman, as she stood
Leblanct╕&q
Fabantou
Jondretteⁿ  └
Process returned 0 (0x0) execution time : 0.005 s
Press any key to continue.
The question is, why am I getting characters like "╕&q" and "ⁿ  └" in the newly created array? And also, is there any other more efficient way to achieve what I am trying to do?
The problem is that strncpy does not store a null in the target array if the length specified is less than the length of the source string (as is always the case here). So whatever garbage happened to be in the test array will remain there.
You can fix this specific problem by zeroing the test array, either when you declare it:
char test[20][20] = { { 0 } };
or as you use it:
memset(test[j], 0, 20);
strncpy(test[j], found, strlen(namesArr[i]));
but in general, it is best to avoid strncpy for this reason.
The length limitation for strncpy should be based on the target size, not the source length: that's the point of using it over strcpy, which uses only the source length. In your code
strncpy(test[j], found, strlen(namesArr[i]));
the length parameter is from the source array, which defeats the purpose of using strncpy. In addition, the nul terminator will not be present if the function copies the full limit of bytes, so the code should be
strncpy(test[j], found, 19); // limit to target size, leaving room for terminator
test[j][19] = '\0'; // add terminator (if copy did not complete)
Whether you loaded namesArr[] from file correctly is another potential issue, since you do not show the code.
Edited:
Slight modification to a previous answer:
1) Since you are working with C strings, make sure (since strncpy(...) does not do it for you) that you null terminate the buffer.
2) When using strncpy the length argument should represent the target string byte capacity - 1 (space for null terminator), not the source string length.
...
size_t len = strlen(found);
memset(test[j], 0, 20);
strncpy(test[j], found, 19);//maximum length (19) matches array size
//of target string -1 ( test[j] ).
if(len > 19) len = 19; //in case length of found is longer than the target string.
test[j][len] = 0; //terminate right after the last copied character
...
In addition to what Chris Dodd said, quoted from man strncpy:
The strncpy() function is similar [to the strcpy() function], except that at most n bytes of src are copied. Warning: If there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated.
Since the size parameter in your strncpy call is the length of the string, this will not include the null byte at the end of the string and thus your destination string will not be null-terminated from this call.
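One way to combine the points from these answers is a small helper that copies only the wanted prefix, respects the destination size, and always writes the terminator. This is a sketch under assumptions: copy_prefix is a made-up name, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies at most n characters of src into dst, which
 * has room for cap bytes, and always terminates dst. Returns the number of
 * characters copied. */
size_t copy_prefix(char *dst, size_t cap, const char *src, size_t n)
{
    assert(cap > 0);
    size_t len = 0;

    /* Stop at the requested length, the destination limit, or the end of src. */
    while (len < n && len < cap - 1 && src[len] != '\0')
        len++;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

In the question's loop this would be called as `copy_prefix(test[j], sizeof test[j], found, strlen(namesArr[i]));`, which keeps only the name and leaves no stale bytes visible to printf.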

Pthreads, fread(), and printf(): Getting random D4's in my string

The Scoop:
I am creating a method that runs through a lengthy file in chunks: using pthreads. I am calling fread() to read the file in this sort of fashion:
fread( thread_data[i].buffer, 1, 50, f )
/*
thread_data is a data structure for each thread (hence i)
buffer is in thread_data as an array of length 50
*/
I am then directly calling a print statement to see what each thread is doing, as a weird pattern was showing up in some of the parts that I was printing. Namely, my print statement would look something like this:
this is suppose to be 50 characters, but it is only a fewgD4
That D4 directly above is what I have my question on. Every thread that I make, at the end of the string, we are printing D4, and in this case, followed by a g. Other times, it is followed by a d, and most commonly a �. Now, I did read the wikipedia page on this character, which states:
replacement character used to replace an unknown or unrepresentable character
My question:
What kind of an error am I running into? Why is the end of each read statement containing unknown characters, especially the weird gD4 guy?
Aside:
I am trying to make a function in C that utilizes pthreads to find the frequency of each word in a file, in case anyone was wondering. These weird characters were showing up in my list, which is something that I find slightly unpleasant. Finally, don't bother linking me to the Obligatory Unicode article, I am already aware of it, and the characters are not outside of what I am working with.
The strings you are printing out are not null-terminated — fread() does not null-terminate its output, it simply reads in as many raw bytes as you asked for (or fewer). So when you print out your buffer, your print function is walking past the end of the data and printing out whatever garbage memory comes after the buffer, which in your case just happens to be gD4.
You need to either explicitly null-terminate your buffer; or, if your print function supports it, tell it exactly how many characters to print. Either way, you need to save the return value from fread to know how many characters you read. For example:
size_t n = fread(thread_data[i].buffer, 1, sizeof thread_data[i].buffer - 1, f);
if (n == 0 && ferror(f)) /* Handle error */ ;
// Explicitly add a null terminator -- reading one byte less than the buffer size leaves room for it
thread_data[i].buffer[n] = 0;
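Wrapped up as a reusable function, the same idea might look like the sketch below. The name read_chunk and its parameters are assumptions made for this example; the point is that fread returns a size_t count, never a negative value, so errors are detected with ferror rather than by comparing against 0.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads up to cap - 1 bytes from f into buf and
 * NUL-terminates them. Returns the byte count; a short count means end of
 * file or an error, which feof() and ferror() can tell apart. */
size_t read_chunk(FILE *f, char *buf, size_t cap)
{
    assert(cap > 0);
    size_t n = fread(buf, 1, cap - 1, f);
    buf[n] = '\0';
    return n;
}
```

With a 51-byte buffer per thread, each call would deliver up to 50 characters that are safe to pass to printf("%s", ...).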

Printf a buffer of char with length in C

I have a buffer which I receive through a serial port. When I receive a certain character, I know a full line has arrived, and I want to print it with printf method. But each line has a different length value, and when I just go with:
printf("%s", buffer);
I'm printing the line plus additional chars belonging to the former line (if it was longer than the current one).
I read here that it is possible, at least in C++, to tell how many chars you want to read given a %s, but it has no examples and I don't know how to do it in C. Any help?
I think I have three solutions:
printing char by char with a for loop
using the termination character
or using .*
QUESTION IS: Which one is faster? Because I'm working on a microchip PIC and I want it to happen as fast as possible
You can either add a null character after your termination character, and your printf will work, or you can add a '.*' in your printf statement and provide the length
printf("%.*s",len,buf);
In C++ you would probably use the std::string and the std::cout instead, like this:
std::cout << std::string(buf,len);
If all you want is the fastest speed and no formatting -- then use
fwrite(buf,1,len,stdout);
The string you have is not null-terminated, so, printf (and any other C string function) cannot determine its length, thus it will continue to write the characters it finds there until it stumbles upon a null character that happens to be there.
To solve your problem you can either:
use fwrite over stdout:
fwrite(buffer, buffer_length, 1, stdout);
This works because fwrite is not meant only for printing strings but for writing any kind of data, so it doesn't look for a terminating null character and instead accepts the length of the data to be written as a parameter;
null-terminate your buffer manually before printing:
buffer[buffer_length]=0;
printf("%s", buffer); /* or, slightly more efficient: fputs(buffer, stdout); */
This could be a better idea if you have to do any other string processing over buffer, that will now be null-terminated and so manageable by normal C string processing functions.
Once you've identified the end of the line, you must append a '\0' character to the end of the buffer before sending it to printf.
You can put a NUL (0x0) in the buffer after receiving the last character.
buffer[i] = 0;
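The two length-based options from the answers can be put side by side as in the sketch below. The function names write_line and print_line are made up for illustration, and the output stream is passed in so the same code works with stdout or any other FILE pointer.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: writes exactly len bytes of buf to out; no
 * terminator is required. Returns 0 on success, -1 on a short write. */
int write_line(FILE *out, const char *buf, size_t len)
{
    return fwrite(buf, 1, len, out) == len ? 0 : -1;
}

/* Hypothetical helper: prints at most len characters of buf through
 * printf-style formatting. The precision for "%.*s" must be an int. */
int print_line(FILE *out, const char *buf, size_t len)
{
    assert(len <= INT_MAX);
    return fprintf(out, "%.*s", (int)len, buf) < 0 ? -1 : 0;
}
```

One difference worth keeping in mind: "%.*s" still stops early if it meets a 0 byte before len characters, while fwrite writes all len bytes regardless of their values. Which one is faster on a given microcontroller toolchain is something to measure rather than assume.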

fread() size argument

I want to read some data from the file, the data will have different sizes at different times.
If I use the below code, then:
char dataStr[256];
fread(dataStr, strlen(dataStr), 1, dFd);
fread is returning 0 for the above call and not reading any thing from the file.
But, if I give size as 1 then it successfully reads one char from the file.
What should be the value of size argument to the fread() function when we do not know how much is the size of the data in the file?
strlen counts the number of characters until it hits \0.
In this case you probably hit \0 on the very first character hence strlen returns 0 as the length and nothing is read.
You should use sizeof instead of strlen.
You can't do that, obviously.
You can read until a known delimiter, often line feed, using fgets() to read a line. Or you can read a known-in-advance byte count, using that argument.
Of course, if there's an upper bound on the amount of data, you can read that always, and then somehow inspect the data to see what you got.
Also, in your example you're using strlen() on the argument that is going to be overwritten, that implies that it already contains a proper string of the exact same size as the data that is going to be read. This seems unlikely, you probably mean sizeof dataStr there.
You should use:
fread(dataStr, 1, sizeof dataStr, dFd);
to indicate that you want to read the number of bytes equal to the size of your array buffer.
The reason why your code doesn't work is that strlen() finds the length of a NUL-terminated string, not the size of the buffer. In your case, you run it on an uninitialized buffer and simply get lucky: the first byte in the buffer happens to be NUL, so strlen(dataStr) returns 0, but it is just as likely to crash or return some random number greater than your buffer size.
Also note that fread() returns the number of items read, not the number of characters (I swapped the second and the third arguments so that each character is equivalent to one item).
fread returns the number of blocks that were successfully read.
You can:
if( 1==fread(dataStr, 256, 1, dFd) )
puts("OK");
It always tries to read the full length of the buffer you defined; fread does not stop at '\0'.
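The difference between the two argument orders discussed in these answers can be seen in a short sketch. Both helper names are invented for this example; only fread itself is a standard call.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: asks for cap items of 1 byte each, so the return
 * value is the number of bytes that actually arrived. */
size_t read_bytes(FILE *f, void *buf, size_t cap)
{
    assert(buf != NULL);
    return fread(buf, 1, cap, f);
}

/* Hypothetical helper: asks for 1 item of cap bytes. Returns nonzero only
 * if a whole block was read; shorter input yields 0 even though some bytes
 * may have been stored in buf. */
int read_block(FILE *f, void *buf, size_t cap)
{
    assert(buf != NULL);
    return fread(buf, cap, 1, f) == 1;
}
```

When the amount of data varies, the first form is usually the more convenient one, because a short read still tells the caller how many bytes are valid.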
