memcpy vs strcat - c

This seems like a basic question, but I would rather ask and clear it up than spend many more days on it. I am copying data that I receive (via a recv call) into a buffer, which is later written to a file. I want to use memcpy to keep appending data to the buffer until it is too small to hold more, at which point I call realloc. The code is below.
int vl_packetSize = PAD_SIZE + (int)p_size - 1; // PAD_SIZE is the size of char array sent
//p_size is the size of data to be recv. Same data size is used by send
int p_currentSize = MAX_PROTO_BUFFER_SIZE;
int vl_newPacketSize = p_currentSize;
char *vl_data = (char *)malloc(vl_packetSize);
memset((char *)vl_data,'\0',vl_packetSize);
/* Allocate memory to the buffer */
vlBuffer = (char *)malloc(p_currentSize);
memset((char *)vlBuffer,'\0',p_currentSize);
char *vlBufferCopy = vlBuffer;
if(vlBuffer==NULL)
return ERR_NO_MEM;
/* The sender first sends a padding data of size PAD_SIZE followed by actual data. I want to ignore the pad hence do vl_data+PAD_SIZE on memcpy */
if((p_currentSize - vl_llLen) < (vl_packetSize-PAD_SIZE)){
vl_newPacketSize +=vl_newPacketSize;
char *vlTempBuffer = (char *)realloc(vlBufferCopy,(size_t)vl_newPacketSize);
if(vlTempBuffer == NULL){
if(debug > 1)
fprintf(stdout,"Realloc failed:%s...Control Thread\n\n",fn_strerror_r(errno,err_buff));
free((void *)vlBufferCopy);
free((void *)vl_data);
return ERR_NO_MEM;
}
vlBufferCopy = vlTempBuffer;
vl_bytesIns = vl_llLen;
vl_llLen = 0;
vlBuffer = vlBufferCopy+vl_bytesIns;
fprintf(stdout,"Buffer val after realloc:%s\n\n",vlBufferCopy);
}
memcpy(vlBuffer,vl_data+PAD_SIZE,vl_packetSize-PAD_SIZE);
/*
fprintf(stdout,"Buffer val before increment:%s\n\n",vlBuffer);
fprintf(stdout,"vl_data length:%d\n\n",strlen(vl_data+PAD_SIZE));
fprintf(stdout,"vlBuffer length:%d\n\n",strlen(vlBuffer));
*/
vlBuffer+=(vl_packetSize-PAD_SIZE);
vl_llLen += (vl_packetSize-PAD_SIZE);
vl_ifNotFlush = 1;
//fprintf(stdout,"Buffer val just before realloc:%s\n\n",vlBufferCopy);
}
Problem: whenever I fputs the data into the file later on, only the first chunk received and added to the buffer gets into the file.
Also, when I print the value of vlBufferCopy (which points to the start of the data returned by malloc or realloc), I get the same result.
If I decrease the size by 1, I see the entire data in the file, but the newline character is somehow lost, so the data is
not written to the file in the proper format.
I know it is because of the trailing '\0', but somehow reducing the size by 1
(vlBuffer+=(vl_packetSize-PAD_SIZE-1);)
drops the newline character. fputs removes the trailing null character when it writes the data.
Please let me know what I am missing, or what is wrong in the logic.
(Note: I tried using strcat:
strcat(vlBuffer,vl_data+PAD_SIZE);
but I wanted to use memcpy, as it is faster and can be used for any kind of buffer, not only character pointers.)
Thanks

strcat and memcpy are very different functions.
I suggest you read the documentation of each.
Mainly, there are two differences:
1. memcpy copies data where you tell it to. strcat finds the end of the string, and copies there.
2. memcpy copies the number of bytes you request. strcat copies until the terminating null.
If you're dealing with packets of arbitrary contents, you have no use for strcat, or other string functions.
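To make the two differences concrete, here is a minimal sketch of an append that tracks the length itself instead of relying on a terminator. The names struct bytebuf and bytebuf_append are made up for illustration and are not from the question; the starting capacity of 64 is arbitrary, and overflow of the doubled capacity is not handled.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer. data is NOT a C string:
 * its length is kept in used, not marked by a '\0'. */
struct bytebuf {
    char  *data;
    size_t used;   /* bytes currently stored */
    size_t cap;    /* bytes allocated */
};

/* Append exactly n bytes at offset used, doubling the capacity when needed.
 * Returns 0 on success, -1 if realloc fails (the old data stays valid). */
int bytebuf_append(struct bytebuf *b, const void *src, size_t n)
{
    if (n == 0)
        return 0;
    if (n > b->cap - b->used) {
        size_t new_cap = b->cap ? b->cap : 64;   /* arbitrary starting size */
        while (n > new_cap - b->used)
            new_cap *= 2;
        char *tmp = realloc(b->data, new_cap);
        if (tmp == NULL)
            return -1;
        b->data = tmp;
        b->cap = new_cap;
    }
    /* memcpy copies where we tell it to and as many bytes as we ask for. */
    memcpy(b->data + b->used, src, n);
    b->used += n;
    return 0;
}
```

Starting from a zero-initialized struct bytebuf, appending the 5 bytes "ab\0cd" and then "ef" leaves 7 bytes stored. strcat would have stopped copying at the zero byte, and strlen on the result would report 2.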

You need to write to the file in a binary-safe way. Check how to use fwrite instead of fputs. fwrite will write the whole buffer, even if there's a zero in the middle of it.
const char *mybuff = "Test1\0Test2";
const size_t mybuff_len = 11;
/* With an element size of 1, the return value is the number of bytes written. */
size_t written = fwrite(mybuff, 1, mybuff_len, output_file);
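As a rough sketch of how the write could be checked, the helper below (write_bytes is a made-up name) passes an element size of 1 so that the return value of fwrite is a byte count that can be compared against the requested length.

```c
#include <stdio.h>

/* Write exactly len bytes of buf to fp, zero bytes included.
 * Returns 0 on success, -1 on a short write. (Hypothetical helper.) */
int write_bytes(FILE *fp, const void *buf, size_t len)
{
    size_t written = fwrite(buf, 1, len, fp);
    return written == len ? 0 : -1;
}
```

Unlike fputs, this never looks for a terminator, so the length has to come from your own bookkeeping (for example the running total you keep while appending with memcpy).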

Related

Segmentation fault strcat

I have a problem reading an SSL response that causes a segmentation fault. I read the response into a buffer, append it to a malloced string, and then reset the buffer to 0, repeating until the response is fully read. When I do this in a multithreaded program, it segfaults after some operations. When I remove strcat, there is no segmentation fault even if I run it for hours.
Example:
char* response = malloc(10000);
char buffer[10000] = { 0 };
while(SSL_read(ssl,buffer,sizeof(buffer)) > 0){
strcat(response,buffer);
memset(buffer,0,sizeof(buffer));
}
Errors
free(): invalid next size (normal)
malloc_consolidate(): invalid chunk
I made sure to free both the SSL and CTX objects, close the socket, and free the malloced string.
There are a few problems with your code:
You are not dealing with C strings, you are dealing with arbitrary byte sequences. SSL_read() reads bytes, not C strings, and you cannot treat them as strings. What you read cannot be assumed to be NUL-terminated (\0), so you should not use strcat, strlen or other similar functions that operate on strings. Zeroing out the entire buffers just to make sure there is a terminator makes little to no sense, as the terminator could very well be found in the middle of the data.
You are reading data continuously in a loop into a fixed size buffer. Your code will overflow the destination buffer (response) very easily.
Not an error, but there isn't really any need for an intermediate buffer to begin with. You are needlessly copying stuff around two times (one with SSL_read() and one with strcat) when you can read directly into response instead. On top of that, the memset() to clear the contents of buffer also adds a third scan of the data, slowing things down even more.
Again, not an error, but SSL_read() returns the number of bytes read as an int. You are not using that value, but you should, as you need to keep track of how much space is left in the buffer. You would, however, be much better off using size_t to avoid unwanted problems with signed math and possible overflows. You can use SSL_read_ex() for this purpose.
Here's a snippet of code that does what you want in a more robust way:
#define CHUNK_SIZE 10000
unsigned char *response = NULL;
size_t size = 0;
size_t space_left = 0;
size_t total_read = 0;
size_t n;
while (1) {
// Allocate more memory if needed.
if (space_left < CHUNK_SIZE) {
unsigned char *tmp = realloc(response, size + CHUNK_SIZE);
if (!tmp) {
// Handle realloc error
break;
}
response = tmp;
size += CHUNK_SIZE;
space_left += CHUNK_SIZE;
}
if (SSL_read_ex(ssl, response + total_read, space_left, &n)) {
total_read += n;
space_left -= n;
} else {
// Handle error
break;
}
}
You never initialized the memory that response points to. Both arguments to strcat() have to be null-terminated strings.
You should also subtract 1 from the size of the buffer when calling SSL_read(), to ensure there will always be room for its null terminator.
char* response = malloc(10000);
response[0] = '\0';
char buffer[10000] = { 0 };
while(SSL_read(ssl,buffer,sizeof(buffer)-1) > 0){
strcat(response,buffer);
memset(buffer,0,sizeof(buffer));
}

How to read in the entire word, and not just the first character?

I am writing a method in C in which I have a list of words from a file that I am redirecting from stdin. However, when I attempt to read in the words into the array, my code will only output the first character. I understand that this is because of a casting issue with char and char *.
While I am challenging myself to not use any of the functions from string.h, I have tried iterating through and am thinking of writing my own strcpy function, but I am confused because my input is coming from a file that I am redirecting from standard input. The variable numwords is inputted by the user in the main method (not shown).
I am trying to debug this issue via dumpwptrs to show me what the output is. I am not sure what in the code is causing me to get the wrong output - whether it is how I read in words to the chunk array, or if I am pointing to it incorrectly with wptrs?
//A huge chunk of memory that stores the null-terminated words contiguously
char chunk[MEMSIZE];
//Points to words that reside inside of chunk
char *wptrs[MAX_WORDS];
/** Total number of words in the dictionary */
int numwords;
.
.
.
void readwords()
{
//Read in words and store them in chunk array
for (int i = 0; i < numwords; i++) {
//When you use scanf with '%s', it will read until it hits
//a whitespace
scanf("%s", &chunk[i]);
//Each entry in wptrs array should point to the next word
//stored in chunk
wptrs[i] = &chunk[i]; //Assign address of entry
}
}
Do not re-use char chunk[MEMSIZE]; used for prior words.
Instead use the next unused memory.
char chunk[MEMSIZE];
char *pool = chunk; // location of unassigned memory pool
// scanf("%s", &chunk[i]);
// wptrs[i] = &chunk[i];
scanf("%s", pool);
wptrs[i] = pool;
pool += strlen(pool) + 1; // Beginning of next unassigned memory
Robust code would check the return value of scanf() and ensure that i and the used portion of chunk do not exceed their limits.
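A minimal sketch of those checks, written without string.h since that was the stated goal. The values of MEMSIZE and MAX_WORDS, the 63-character word limit, and the name readwords_checked are assumptions for illustration; the question does not show them. It takes a FILE * so it can be tried on any stream, including stdin.

```c
#include <stdio.h>

#define MEMSIZE   100000   /* assumed; not shown in the question */
#define MAX_WORDS 10000    /* assumed; not shown in the question */

static char  chunk[MEMSIZE];
static char *wptrs[MAX_WORDS];

/* Read up to max_words whitespace-delimited words from fp into chunk.
 * Returns the number of words stored. */
int readwords_checked(FILE *fp, int max_words)
{
    char *pool = chunk;          /* next unused byte of chunk */
    char  word[64];              /* assumed limit: 63 characters per word */
    int   count = 0;

    while (count < max_words && count < MAX_WORDS &&
           fscanf(fp, "%63s", word) == 1) {
        size_t len = 0;
        while (word[len] != '\0')            /* hand-rolled strlen */
            len++;
        if (len + 1 > (size_t)(chunk + MEMSIZE - pool))
            break;                           /* out of pool memory */
        for (size_t k = 0; k <= len; k++)    /* copy, terminator included */
            pool[k] = word[k];
        wptrs[count++] = pool;
        pool += len + 1;
    }
    return count;
}
```

Note that with the "%63s" width a longer word is split into several entries rather than rejected; whether that is acceptable depends on the input.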
I'd go for a fgets() solution as long as words are entered a line at a time.
char chunk[MEMSIZE];
char *pool = chunk;
// return word count
int readwords2() {
int word_count;
// limit words to MAX_WORDS
for (word_count = 0; word_count < MAX_WORDS; word_count++) {
intptr_t remaining = &chunk[MEMSIZE] - pool;
if (remaining < 2) {
break; // out of useful pool memory
}
if (fgets(pool, remaining, stdin) == NULL) {
break; // end-of-file/error
}
pool[strcspn(pool, "\n")] = '\0'; // lop off potential \n
wptrs[word_count] = pool;
pool += strlen(pool) + 1;
}
return word_count;
}
While I am challenging myself to not use any of the functions from string.h, ...
The best way to challenge yourself to not use any of the functions from string.h is to write them yourself and then use them.
Your program reads the next word into the i-th position of the buffer chunk, so you end up with the first letter of each word (as long as i doesn't exceed the size of chunk): each read overwrites the second and following characters of the previous word with the word just read. You then make every pointer in wptrs point into these places, which makes it impossible to tell where one string ends and the next begins (you have overwritten all the null terminators except the last). The first string therefore contains the first letter of every word except the last, which is complete; the second is the same string starting at its second character; then the third, and so on.
Build your own version of strdup(3) and use chunk to store the string temporarily. Then make a dynamically allocated copy of the string with your version of strdup(3) and make the pointer point to it, and so on.
Finally, when you are finished, just free all the allocated strings and voilà!
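A home-made strdup along those lines could look like the sketch below. The name my_strdup is made up, and it deliberately uses nothing from string.h.

```c
#include <stdlib.h>

/* Hypothetical stand-in for strdup(3) that avoids string.h.
 * Returns a malloc'ed copy of s, or NULL if allocation fails. */
char *my_strdup(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')            /* measure the string */
        len++;

    char *copy = malloc(len + 1);     /* +1 for the terminator */
    if (copy == NULL)
        return NULL;

    for (size_t i = 0; i <= len; i++) /* copy, terminator included */
        copy[i] = s[i];
    return copy;
}
```

Each wptrs[i] would then hold the result of one my_strdup call, and each one needs a matching free at the end.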
Also, this is very important: read How to create a Minimal, Complete, and Verifiable example. Very often the posted code no longer contains the error, because you removed it while trimming the code for the post (you don't normally know where the error is, or you would have corrected it and had no question to ask, right?).

Invalid Argument Reported By getdelim

I'm trying to use the getdelim function to read an entire text file's contents into a string.
Here is the code I am using:
ssize_t bytesRead = getdelim(&buffer, 0, '\0', fp);
This is failing however, with strerror(errno) saying "Error: Invalid Argument"
I've looked at all the documentation I could and just can't get it working, I've tried getline which does work but I'd like to get this function working preferably.
buffer is NULL initialised as well so it doesn't seem to be that
fp is also not reporting any errors and the file opens perfectly
EDIT: My implementation is based on an answer from this stackoverflow question Easiest way to get file's contents in C
Kervate, please enable compiler warnings (-Wall for gcc), and heed them. They are helpful; why not accept all the help you can get?
As pointed out by WhozCraig and n.m. in comments to your original question, the getdelim() man page shows the correct usage.
If you wanted to read records delimited by the NUL character, you could use
FILE *input; /* Or, say, stdin */
char *buffer = NULL;
size_t size = 0;
ssize_t length;
while (1) {
length = getdelim(&buffer, &size, '\0', input);
if (length == (ssize_t)-1)
break;
/* buffer has length chars, including the trailing '\0' */
}
free(buffer);
buffer = NULL;
size = 0;
if (ferror(input) || !feof(input)) {
/* Error reading input, or some other reason
* that caused an early break out of the loop. */
}
If you want to read the contents of a file into a single character array, then getdelim() is the wrong function.
Instead, use realloc() to dynamically allocate and grow the buffer, appending to it using fread(). To get you started -- this is not complete! -- consider the following code:
FILE *input; /* Handle to the file to read, assumed already open */
char *buffer = NULL;
size_t size = 0;
size_t used = 0;
size_t more;
while (1) {
/* Grow buffer when less than 500 bytes of space. */
if (used + 500 >= size) {
size_t new_size = used + 30000; /* Allocate 30000 bytes more. */
char *new_buffer;
new_buffer = realloc(buffer, new_size);
if (!new_buffer) {
free(buffer); /* Old buffer still exists; release it. */
buffer = NULL;
size = 0;
used = 0;
fprintf(stderr, "Not enough memory to read file.\n");
exit(EXIT_FAILURE);
}
buffer = new_buffer;
size = new_size;
}
/* Try reading more data, as much as fits in buffer. */
more = fread(buffer + used, 1, size - used, input);
if (more == 0)
break; /* Could be end of file, could be error */
used += more;
}
Note that the buffer in this latter snippet is not a string. There is no terminating NUL character, so it's just an array of chars. In fact, if the file contains binary data, the array may contain lots of NULs (\0, zero bytes). Assuming there was no error and all of the file was read (you need to check for that, see the former example), buffer contains used chars read from the file, with enough space allocated for size. If used > 0, then size > used. If used == 0, then size may or may not be zero.
If you want to turn buffer into a string, you need to decide what to do with the possibly embedded \0 bytes -- I recommend either convert to e.g. spaces or tabs, or move the data to skip them altogether --, and add the string-terminating \0 at end to make it a valid string.
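One way that conversion could look, as a sketch: the function below replaces embedded zero bytes with spaces in place and appends the terminator. The name bytes_to_string is made up, and replacing with spaces is just one of the options mentioned above.

```c
#include <stddef.h>

/* Turn the first used bytes of buffer into a C string, in place.
 * Embedded zero bytes become spaces. Requires size > used so that the
 * terminator fits. Returns 0 on success, -1 otherwise. */
int bytes_to_string(char *buffer, size_t used, size_t size)
{
    if (buffer == NULL || used >= size)
        return -1;
    for (size_t i = 0; i < used; i++) {
        if (buffer[i] == '\0')
            buffer[i] = ' ';
    }
    buffer[used] = '\0';
    return 0;
}
```

With the reading loop above, size > used holds once the loop has finished without error, so the terminator always fits.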

Scanning string with length restriction

Using the standard C library, is there a way to scan a string (containing no whitespace) from standard input only if it fits in a buffer? In the following example I would like scanCount to be 0 if the input string is larger than 32:
char str[32];
int scanCount;
scanCount = scanf("%32s", str);
Edit: I also need file pointer rollback when the input string is too large.
You specified a requirement to only read if the whole data fits your buffer. This requirement makes no sense at all as it doesn't provide any functionality to your program. You can easily achieve the same sort of tasks without it. It also is not how operating systems present files to the user applications.
You can simply create a buffer of any size you see fit and keep the data in the buffer until you can handle it, or you can resize the buffer to accommodate more incoming data.
You can read any number of characters from a file using the ANSI fread() function:
size_t count;
char buffer[50];
count = fread(buffer, 1, sizeof buffer, stdin);
You can then see how many characters have actually been read by looking at the count variable. You can fill in the final NUL character if it's less than the buffer size, or decide what to do next if the whole buffer has been filled and more data may be available. You could of course read sizeof buffer - 1 instead, to always be able to terminate the string. When the count is smaller than your specified value, feof() and ferror() can be used to see what happened. You can also look at the actual data and check for LF characters to see how many lines you have read.
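Put together, the "read sizeof buffer - 1 and terminate" variant might look like this sketch. The name read_chunk and the status convention are assumptions made for illustration.

```c
#include <stdio.h>

/* Read at most cap - 1 bytes from fp into buf and NUL-terminate the result.
 * Sets *status to 0 (buffer filled, more data may follow), 1 (end of file)
 * or -1 (read error). Returns the number of bytes read.
 * cap must be at least 1. */
size_t read_chunk(FILE *fp, char *buf, size_t cap, int *status)
{
    size_t count = fread(buf, 1, cap - 1, fp);
    buf[count] = '\0';
    if (count < cap - 1)
        *status = ferror(fp) ? -1 : 1;   /* short read: error or end of file */
    else
        *status = 0;
    return count;
}
```

The terminator only gives you a meaningful string if the input itself contains no zero bytes; for arbitrary data, rely on the returned count instead.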
When using an enlarging buffer, you will need malloc() or just create a NULL pointer that will later be allocated using realloc():
/* Set initial size and offset. */
size_t offset = 0;
size_t size = 0;
char *buffer = NULL;
When you need to change the size of the buffer, you can use realloc():
/* Change the size. */
size = 100;
buffer = realloc(buffer, size);
(The first time it's equivalent to buffer = malloc(size).)
You can then read data into the buffer:
size_t count = fread(buffer + offset, 1, size - offset, stdin);
offset += count; /* offset now marks the end of the valid data */
(The first time it's equivalent to fread(buffer, 1, size, stdin).)
When finished, you should free the buffer:
free(buffer);
At any time, you still have all the already read data somewhere in a buffer, so you can get back to it at any time, you just decouple the reading and processing, where the above examples are all about reading.
The processing then depends on what you need. You generally need to identify the start and end of the data that you want to extract.
Example start and end, where end means one character past the last one you want, so the arithmetic works out better:
size_t start = 0;
size_t end = 10;
Extract the data (using bits of C99):
char data[end - start + 1];
memcpy(data, buffer + start, end - start);
data[end - start] = '\0';
Now you have a NUL-terminated string containing the data you wanted to extract. Sometimes you just assume start = 0 and then want to consume the data from the buffer to make place for new data:
char data[end + 1];
/* copy out the data and terminate it */
memcpy(data, buffer, end);
data[end] = '\0';
/* move the data between end and offset to the beginning */
memmove(buffer, buffer + end, offset - end);
/* adjust the offset accordingly */
offset -= end;
Now you have your data extracted, but the buffer still holds the rest of the data you haven't processed yet. This effectively achieves what you wanted: by keeping the data in an intermediate buffer, you are peeking into an arbitrary part of the input and taking the data out only if it fits your expectations, doing something else if it doesn't. Of course, you should carefully test all return values to check for exceptional conditions.
I personally would also turn all indexes in the examples into pointers directly into the memory and adjust the arithmetic accordingly, but not everyone enjoys pointer arithmetic as I do ;). I also tend to prefer the low-level POSIX API over the intermediate layer of the ANSI API. Ready to fix bugs or improve explanations, please comment.
Your comment that you need the file pointer reset on scan failure makes this impossible to do with scanf().
scanf() is basically specified as "fscanf( stdin, ... )", and fscanf() is defined to "[push] back at most one input character onto the input stream" (C99, footnote 242). (I assume this is for the same reason that ungetc() is only required to support one byte of push-back: So that it can be conveniently buffered in memory.)
*scanf() is a poor choice to read uncertain inputs, for the reason described above and several other shortcomings when it comes to recovery-from-error. Generally speaking, if there is any chance that the input might not conform to the expected format, read input into an internal memory buffer first and then parse it from there.
Just read and store one character too many, and test for that.
char str[34]; // 33 characters + NUL terminator
int scanCount = scanf("%33s", str);
if (scanCount > 0 && strlen(str) > 32)
{
scanCount = 0;
}
A stream such as stdin is only guaranteed to "put back" 1 char while scanning, so scanning 32 or 33 chars and then undoing the read is not possible.
If your input supports ftell() and fseek() (available when stdin is redirected from a file), code could do:
long pos = ftell(input);
char str[32+1];
int scanCount;
scanCount = fscanf(input, "%32s", str);
if (scanCount != 1 || strlen(str) >= 32) {
fseek(input, pos, SEEK_SET);
scanCount = fscanf(input, some_new_format, ....);
}
Otherwise use fgets() to read a maximal line and use sscanf()
char buf[1024];
if (fgets(buf, sizeof buf, stdin) == NULL) Handle_IOError_or_EOF();
char str[32+1];
int scanCount;
scanCount = sscanf(buf, "%32s", str);
if (scanCount != 1 || strlen(str) >= 32) {
scanCount = sscanf(buf, some_new_format, ....);
}

Estimate size of formatted snprintf() string?

I'm considering writing a function to estimate at least the full length of a formatted string coming from the sprintf(), snprintf() functions.
My approach was to parse the format string to find the various %s, %d, %f, %p args, creating a running sum of strlen()s, itoa()s, and strlen(format_string) to get something guaranteed to be big enough to allocate a proper buffer for snprintf().
I'm aware the following works, but it takes 10X as long: the printf() functions are very flexible, but very slow because of it.
char c;
int required_buffer_size = snprintf(&c, 1, "format string", args...);
Has this already been done ? - via the suggested approach, or some other reasonably efficient approach - IE: 5-50X faster than sprintf() variants?
Allocate a big enough buffer first and check if it was long enough. If it wasn't, reallocate and call a second time.
int len = 200; /* Any number well chosen for the application to cover most cases */
int need;
char *buff = NULL;
do {
need = len+1;
buff = realloc(buff, need); /* NULL return not handled here; real code must check it */
len = snprintf(buff, need, "...", ....);
/* Error check for len < 0 */
} while(len >= need); /* len >= need means the output was truncated */
/* buff = realloc(buff, len+1); shrink memory block */
By choosing your initial value well, you will need only one call to snprintf() in most cases, and the small over-allocation shouldn't be critical. If your environment is so tight that this over-allocation is critical, you already have other problems with the expensive allocation and formatting.
In any case, you could still call a realloc() afterwards to shrink the allocated buffer to the exact size.
If the first argument to snprintf is NULL and the size argument is 0, the return value is the number of characters that would have been written, not counting the terminating null.
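For completeness, a sketch of that measure-then-format approach wrapped in a helper. The name format_alloc is made up. It formats twice, so it does not avoid the cost the question is concerned about; it only guarantees an exactly sized buffer.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'ed string holding the formatted
 * output, or NULL on a formatting or allocation error. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    va_start(ap, fmt);
    va_copy(ap2, ap);                          /* the list is consumed once per pass */

    int len = vsnprintf(NULL, 0, fmt, ap);     /* pass 1: measure only */
    va_end(ap);
    if (len < 0) {
        va_end(ap2);
        return NULL;
    }

    char *buf = malloc((size_t)len + 1);       /* +1 for the terminator */
    if (buf != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);   /* pass 2: format */
    va_end(ap2);
    return buf;
}
```

The caller owns the returned pointer and must free it.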
