I'm trying to use the getdelim function to read an entire text file's contents into a string.
Here is the code I am using:
ssize_t bytesRead = getdelim(&buffer, 0, '\0', fp);
This is failing however, with strerror(errno) saying "Error: Invalid Argument"
I've read all the documentation I could find and still can't get it working. I've tried getline, which does work, but I'd prefer to get this function working.
buffer is initialised to NULL as well, so that doesn't seem to be the cause.
fp is also not reporting any errors, and the file opens perfectly.
EDIT: My implementation is based on an answer to the Stack Overflow question "Easiest way to get file's contents in C".
Kervate, please enable compiler warnings (-Wall for gcc), and heed them. They are helpful; why not accept all the help you can get?
As pointed out by WhozCraig and n.m. in comments to your original question, the getdelim() man page shows the correct usage.
If you wanted to read records delimited by the NUL character, you could use
FILE *input; /* Or, say, stdin */
char *buffer = NULL;
size_t size = 0;
ssize_t length;
while (1) {
length = getdelim(&buffer, &size, '\0', input);
if (length == (ssize_t)-1)
break;
/* buffer has length chars, including the trailing '\0' */
}
free(buffer);
buffer = NULL;
size = 0;
if (ferror(input) || !feof(input)) {
/* Error reading input, or some other reason
* that caused an early break out of the loop. */
}
If you want to read the contents of a file into a single character array, then getdelim() is the wrong function.
Instead, use realloc() to dynamically allocate and grow the buffer, appending to it using fread(). To get you started -- this is not complete! -- consider the following code:
FILE *input; /* Handle to the file to read, assumed already open */
char *buffer = NULL;
size_t size = 0;
size_t used = 0;
size_t more;
while (1) {
/* Grow buffer when less than 500 bytes of space. */
if (used + 500 >= size) {
size_t new_size = used + 30000; /* Allocate 30000 bytes more. */
char *new_buffer;
new_buffer = realloc(buffer, new_size);
if (!new_buffer) {
free(buffer); /* Old buffer still exists; release it. */
buffer = NULL;
size = 0;
used = 0;
fprintf(stderr, "Not enough memory to read file.\n");
exit(EXIT_FAILURE);
}
buffer = new_buffer;
size = new_size;
}
/* Try reading more data, as much as fits in buffer. */
more = fread(buffer + used, 1, size - used, input);
if (more == 0)
break; /* Could be end of file, could be error */
used += more;
}
Note that the buffer in this latter snippet is not a string. There is no terminating NUL character, so it's just an array of chars. In fact, if the file contains binary data, the array may contain lots of NULs (\0, zero bytes). Assuming there was no error and all of the file was read (you need to check for that, see the former example), buffer contains used chars read from the file, with enough space allocated for size. If used > 0, then size > used. If used == 0, then size may or may not be zero.
If you want to turn buffer into a string, you need to decide what to do with any embedded \0 bytes (I recommend either converting them to, say, spaces or tabs, or moving the data to skip them altogether), and then add the string-terminating \0 at the end to make it a valid string.
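As a minimal sketch of the first option, the function below replaces embedded zero bytes with spaces and then terminates the array. The name bytes_to_string is made up for illustration; it assumes the allocation has room for at least used + 1 bytes, which is what the reading loop above provides (size > used).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: turn the first `used` bytes of buffer into a
 * C string by replacing embedded NUL bytes with spaces and appending
 * a terminator. Assumes `size` bytes are allocated and size > used. */
char *bytes_to_string(char *buffer, size_t used, size_t size)
{
    assert(buffer != NULL && size > used);

    for (size_t i = 0; i < used; i++) {
        if (buffer[i] == '\0')
            buffer[i] = ' ';
    }
    buffer[used] = '\0';   /* now a valid string of length used */
    return buffer;
}
```

Skipping the zero bytes instead would need a second index for the write position, and the resulting string would be shorter than used.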
First of all, I know this question is very close to this topic, but that question was so poorly worded that I am not even sure this is a duplicate. No code was shown there either, so I thought it deserved to be asked properly.
I am trying to read a file line by line and I need to store a line in particular in a variable. I have managed to do so quite easily using fgets, nevertheless the size of the lines to be read and the number of lines in the file remain unknown.
I need a way to properly allocate memory to the variable whatever the size of the line might be, using C and not C++.
So far my code looks like that :
allowedMemory = malloc(sizeof(char[1501])); // Checks if enough memory
if (NULL == allowedMemory)
{
fprintf(stderr, "Not enough memory. \n");
exit(1);
}
else
char* res;
res = allowedMemory;
while(fgets(res, 1500, file)) // Iterate until end of file
{
if (res == theLineIWant) // Using strcmp instead of ==
return res;
}
The problem of this code is that it is not adaptable at all. I am looking for a way to allocate just enough memory to res so that I don't miss any data in line.
I was thinking about something like that :
while ( lineContainingKChar != LineContainingK+1Char) // meaning that the line has not been fully read
// And using strcmp instead of ==
realloc(lineContainingKChar, K + 100) // Adding memory
But I would need to iterate through two FILE object in order to fill these variables which would not be very efficient.
Any hints about how to implement this solution, or advice about how to do it in an easier way, would be appreciated.
EDIT : Seems like using getline() is the best way to do so because this function allocates the memory needed by itself and free it when needed. Nevertheless I don't think that it is 100% portable since I still can't use it though I have included <stdio.h>. To be verified though, since my issues are often situated between keyboard and computer. Until then I am still open to a solution which would not use POSIX-compliant C.
getline() appears to do exactly what you want:
DESCRIPTION
The getdelim() function shall read from stream until it encounters a
character matching the delimiter character.
...
The getline() function shall be equivalent to the getdelim()
function with the delimiter character equal to the <newline>
character.
...
EXAMPLES
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
FILE *fp;
char *line = NULL;
size_t len = 0;
ssize_t read;
fp = fopen("/etc/motd", "r");
if (fp == NULL)
exit(1);
while ((read = getline(&line, &len, fp)) != -1) {
printf("Retrieved line of length %zd :\n", read); /* read is a ssize_t, so %zd */
printf("%s", line);
}
if (ferror(fp)) {
/* handle error */
}
free(line);
fclose(fp);
return 0;
}
And per the Linux man page:
DESCRIPTION
getline() reads an entire line from stream, storing the address of
the buffer containing the text into *lineptr. The buffer is null-
terminated and includes the newline character, if one was found.
If *lineptr is set to NULL and *n is set 0 before the call, then
getline() will allocate a buffer for storing the line. This buffer
should be freed by the user program even if getline() failed.
Alternatively, before calling getline(), *lineptr can contain a
pointer to a malloc(3)-allocated buffer *n bytes in size. If the
buffer is not large enough to hold the line, getline() resizes it
with realloc(3), updating *lineptr and *n as necessary.
In either case, on a successful call, *lineptr and *n will be
updated to reflect the buffer address and allocated size respectively.
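If getline() is not available, roughly the same behaviour can be sketched in standard C with fgets() and realloc(). This is only a sketch under a few assumptions: read_line is a made-up name, the initial capacity of 128 is arbitrary, lines are assumed to be shorter than INT_MAX bytes, and the input is assumed to be text without embedded zero bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp.
 * Returns a malloc()ed string that includes the newline if one was
 * read (the caller frees it), or NULL at end of file, on a read
 * error, or when memory runs out. */
char *read_line(FILE *fp)
{
    size_t size = 128;   /* arbitrary initial capacity */
    size_t used = 0;
    char *line;

    assert(fp != NULL);
    line = malloc(size);
    if (!line)
        return NULL;

    while (fgets(line + used, (int)(size - used), fp)) {
        used += strlen(line + used);
        if (used > 0 && line[used - 1] == '\n')
            return line;                 /* complete line */

        if (used + 1 >= size) {          /* buffer full: double it */
            char *bigger = realloc(line, size * 2);
            if (!bigger) {
                free(line);
                return NULL;
            }
            line = bigger;
            size *= 2;
        }
    }

    if (used > 0 && !ferror(fp))
        return line;                     /* last line, no newline */
    free(line);
    return NULL;
}
```

Unlike getline(), this allocates a fresh buffer for every line, which is simpler to reason about but costs one malloc() per call.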
Using the standard C library, is there a way to scan a string (containing no whitespace) from standard input only if it fits in a buffer? In the following example I would like scanCount to be 0 if the input string is larger than 32:
char str[32];
int scanCount;
scanCount = scanf("%32s", str);
Edit: I also need file pointer rollback when the input string is too large.
You specified a requirement to only read if the whole data fits your buffer. This requirement makes no sense at all as it doesn't provide any functionality to your program. You can easily achieve the same sort of tasks without it. It also is not how operating systems present files to the user applications.
You can simply create a buffer of any size you see fit and keep the data in it until you can handle it, or you can resize the buffer to accommodate more incoming data.
You can read any number of characters from a file using the ANSI fread() function:
size_t count;
char buffer[50];
count = fread(buffer, 1, sizeof buffer, stdin);
You can then see how many characters were actually read by looking at the count variable. If it is less than the buffer size, you can fill in the final NUL character; if the whole buffer was filled and more data may be available, you can decide what to do next. You could of course read sizeof buffer - 1 bytes instead, so that you can always terminate the string. When the count is smaller than the value you requested, feof() and ferror() can be used to see what happened. You can also look at the actual data and check for LF characters to see how many lines you have read.
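A minimal sketch of the "read one byte less and terminate" variant, with a made-up helper name (read_chunk) and the stream passed in as a parameter rather than hard-coded to stdin:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to cap - 1 bytes from in into buf and
 * NUL-terminate the result. Returns the number of bytes read. */
size_t read_chunk(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL && cap > 0);

    size_t count = fread(buf, 1, cap - 1, in);
    buf[count] = '\0';   /* always room: count <= cap - 1 */
    return count;
}
```

When the return value is smaller than cap - 1, calling feof(in) and ferror(in) afterwards tells end of input apart from a read error.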
When using an enlarging buffer, you will need malloc() or just create a NULL pointer that will later be allocated using realloc():
/* Set initial size and offset. */
size_t offset = 0;
size_t size = 0;
char *buffer = NULL;
When you need to change the size of the buffer, you can use realloc():
/* Change the size. */
size = 100;
buffer = realloc(buffer, size);
(The first time it's equivalent to buffer = malloc(size).)
You can then read data into the buffer:
size_t count = fread(buffer + offset, 1, size - offset, stdin);
offset += count; /* offset now counts all the bytes stored in buffer */
(The first time it's equivalent to fread(buffer, 1, size, stdin).)
When finished, you should free the buffer:
free(buffer);
At any time, you still have all the already read data somewhere in a buffer, so you can get back to it at any time, you just decouple the reading and processing, where the above examples are all about reading.
The processing then depends on what you need. You generally need to identify the start and end of the data that you want to extract.
Example start and end, where end means one character after the last one you want, so the arithmetics work better:
size_t start = 0;
size_t end = 10;
Extract the data (using bits of C99):
char data[end - start + 1];
memcpy(data, buffer + start, end - start);
data[end - start] = '\0'; /* index relative to data, not to buffer */
Now you have a NUL-terminated string containing the data you wanted to extract. Sometimes you just assume start = 0 and then want to consume the data from the buffer to make place for new data:
char data[end + 1];
/* copy out the data */
memcpy(data, buffer, end);
data[end] = '\0';
/* move data between end and offset to the beginning */
memmove(buffer, buffer + end, offset - end);
/* adjust the offset accordingly */
offset -= end;
Now you have your data extracted, but you still have the buffer ready with the rest of the data you haven't processed yet. This effectively achieves what you wanted: by keeping the data in an intermediate buffer, you're peeking into an arbitrary part of the data received on input and taking out the data only if it fits your expectations, doing whatever else if it doesn't. Of course, you should carefully test all return values to check for exceptional conditions.
I personally would also turn all indexes in the examples into pointers directly to the memory and adjust the arithmetic accordingly, but not everyone enjoys pointer arithmetic as I do ;). I also tend to prefer the low-level POSIX API over the intermediate layer in the form of the ANSI API. Ready to fix bugs or improve explanations, please comment.
Your comment that you need the file pointer reset on scan failure makes this impossible to do with scanf().
scanf() is basically specified as "fscanf( stdin, ... )", and fscanf() is defined to "[push] back at most one input character onto the input stream" (C99, footnote 242). (I assume this is for the same reason that ungetc() is only required to support one byte of push-back: So that it can be conveniently buffered in memory.)
*scanf() is a poor choice to read uncertain inputs, for the reason described above and several other shortcomings when it comes to recovery-from-error. Generally speaking, if there is any chance that the input might not conform to the expected format, read input into an internal memory buffer first and then parse it from there.
Just read and store one character too many, and test for that.
char str[34]; // 33 characters + NUL terminator
int scanCount = scanf("%33s", str);
if (scanCount > 0 && strlen(str) > 32)
{
scanCount = 0;
}
Scanning a stream such as stdin is only allowed to "put back" up to 1 char, so scanning 32 or 33 chars and then undoing it is not possible.
If your input could use ftell() and fseek() (Available when stdin is redirected), code could
long pos = ftell(input);
char str[32+1];
int scanCount;
scanCount = fscanf(input, "%32s", str);
if (scanCount != 1 || strlen(str) >= 32) {
fseek(input, pos, SEEK_SET);
scanCount = fscanf(input, some_new_format, ....);
}
Otherwise use fgets() to read a maximal line and use sscanf()
char buf[1024];
if (fgets(buf, sizeof buf, stdin) == NULL) Handle_IOError_or_EOF();
char str[32+1];
int scanCount;
scanCount = sscanf(buf, "%32s", str);
if (scanCount != 1 || strlen(str) >= 32) {
scanCount = sscanf(buf, some_new_format, ....);
}
I'm looking for some advice on a problem I've been trying to solve for hours.
The program reads from a text file and does some formatting based on commands given within the file. It seems to work for every file I've tried except 2, which are both fairly large.
Here's the offending code:
/* initalize memory for output */
output.data = (char**)calloc(1,sizeof(char*));
/* initialize size of output */
output.size = 0;
/* iterate through the input, line by line */
int i;
for (i = 0; i < num_lines; i++)
{
/* if it is not a newline and if formatting is on */
if (fmt)
{
/* allocate memory for a buffer to hold the line to be formatted */
char *line_buffer = (char*)calloc(strlen(lines[i]) + 1, sizeof(char));
if (line_buffer == NULL)
{
fprintf(stderr, "ERROR: Memory Allocation Failed\n");
exit(1);
}
/* copy the unformatted line into the buffer and tokenize by whitespace */
strcpy(line_buffer, lines[i]);
char* word = strtok(line_buffer, " \n");
/* while there is a word */
while (word)
{
/* if the next word will go over allocated width */
if (current_pos + strlen(word) + 1 > width)
{
/* make ze newline, increase output size */
strcat(output.data[output.size], "\n");
output.size++;
------->>>>> output.data = (char**)realloc(output.data, sizeof(char*) * (output.size + 1));
Using gdb I've figured out the error is on the line with the arrow pointing to it, only thing is I can't figure out why it occurs. It only happens when the text file that is being formatted is large (716 lines), and it seems to happen on the final iteration (num_lines = 716). Any thoughts would be hugely appreciated. Thanks!
EDIT: Sorry folks, should have mentioned that I'm pretty new to this! Fixed some of the errors.
The most immediate problem is:
strncat(output.data[output.size], "\n", 2);
as pointed out by BLUEPIXY. Currently output.data[output.size] is a null pointer [1], so you cannot strncat to it.
To fix this you could allocate some space:
output.data[output.size] = malloc(2);
if ( NULL == output.data[output.size] )
// error handling...
strcpy(output.data[output.size], "\n");
However there might be another solution that fits in better with the rest of your function, which you haven't shown. (Presumably you allocate space somewhere to store word).
It would be helpful to update your post and show the rest of the function. Also make sure you are posting the exact code, as (output.size + ) does not compile. I guess this is a typo you introduced when trying to put those big arrows on your line.
[1] Actually it is all bits zero, which is a null pointer on common systems but not guaranteed to be so for all systems.
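Since the rest of the function isn't shown, here is only a rough sketch of the general pattern: grow the pointer array by one slot and give that slot its own storage before anything is written to it. The struct lines type and lines_append function are made up for illustration and merely stand in for whatever output really is in the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the question's output structure. */
struct lines {
    char **data;   /* array of pointers to malloc()ed strings */
    size_t size;   /* number of valid entries in data */
};

/* Append a copy of text as a new entry. Returns 0 on success and -1
 * on allocation failure, in which case existing entries are kept. */
int lines_append(struct lines *out, const char *text)
{
    assert(out != NULL && text != NULL);

    char *copy = malloc(strlen(text) + 1);
    if (!copy)
        return -1;
    strcpy(copy, text);

    char **grown = realloc(out->data, (out->size + 1) * sizeof *grown);
    if (!grown) {
        free(copy);
        return -1;
    }
    grown[out->size] = copy;   /* the new slot points at real storage */
    out->data = grown;
    out->size++;
    return 0;
}
```

Assigning the result of realloc() to a temporary first means the old array is not leaked if the call fails.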
I'm looking to copy the FIRST line from a LONG string P into a buffer
I have no idea how to make it.
while (*pros_id != '/n'){
*pros_id_line=*pros_id;
pros_id++;
pros_id_line++;
}
And tried
fgets(pros_id_line, sizeof(pros_id_line), pros_id);
Both are not working. Can I get some help please?
Note, as Adriano Repetti pointed out in a comment and an answer, that the newline character is '\n' and not '/n'.
Your initial code can be fixed up to work, provided that the destination buffer is big enough:
while (*pros_id != '\n' && *pros_id != '\0')
*pros_id_line++ = *pros_id++;
*pros_id_line = '\0';
This code does not include the newline in the copied buffer; it is easy enough to add it if you need it.
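If you cannot be sure the destination is big enough, a bounded variant of the same single-pass loop is one option. This is a sketch with a made-up function name (copy_first_line); it truncates silently when the line does not fit, which may or may not be what you want.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the first line of src (without the
 * newline) into dst, writing at most dstsize bytes including the
 * terminator. Returns the number of characters copied. */
size_t copy_first_line(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);

    size_t n = 0;
    while (src[n] != '\n' && src[n] != '\0' && n + 1 < dstsize) {
        dst[n] = src[n];
        n++;
    }
    dst[n] = '\0';
    return n;
}
```

The caller can detect truncation by checking whether src[n] is neither '\n' nor '\0' after the call, where n is the return value.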
One advantage of this code is that it makes a single pass through the data up to the newline (or end of string). An alternative makes two passes through the data, one to find the newline and another to copy to the newline:
if ((end = strchr(pros_id, '\n')) != 0)
{
memmove(pros_id_line, pros_id, end - pros_id);
pros_id_line[end - pros_id] = '\0';
}
This ensures that the string is null-terminated; again, it omits the newline, and assumes there is enough space in the pros_id_line buffer for the data. You have to decide what is the correct behaviour when there is no newline in the buffer. It might be sufficient to copy the buffer without the newline into the target area, or you might prefer to report a problem.
You can use strncpy() instead of memmove() but it has a more complex loop condition than memmove() — it has to check for a null byte as well as the count, whereas memmove() only has to check the count. You can use memcpy() instead of memmove() if you're sure there's no overlap between source and target, but memmove() always works and memcpy() sometimes doesn't (though only when the source and target areas overlap), and I prefer reliability over possible misbehaviour.
Note that setting a buffer to zero before copying a string to it is a waste of energy. The parts that you're about to overwrite with data didn't need to be zeroed. The parts that you aren't going to overwrite with data didn't need to be zeroed either. You should know exactly which byte needs to be zeroed, so why waste the time on zeroing anything except the one byte that needs to be zeroed?
(One exception to this is if you are dealing with sensitive data and are concerned that some function that your code will call may deliberately read beyond the end of the string and come across parts of a password or other sensitive data. Then it may be appropriate to wipe the memory before writing new data to it. On the whole, though, most people aren't writing such code.)
The newline character is \n, not /n. Anyway, I'd use strchr for this:
char* endOfFirstLine = strchr(inputString, '\n');
if (endOfFirstLine != NULL)
{
size_t length = endOfFirstLine - inputString;
strncpy(yourBuffer, inputString, length);
yourBuffer[length] = '\0'; /* strncpy does not add a terminator here */
}
else // Input is one single line
{
strcpy(yourBuffer, inputString);
}
With inputString as your char* multiline string and yourBuffer (assuming it's big enough to contain all data from inputString and it has been zeroed) as your required output (first line of inputString).
If you're going to be doing a lot of reading from long text buffers, you could try using a memory stream, if your system supports them: https://www.gnu.org/software/libc/manual/html_node/String-Streams.html
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
static char buffer[] = "foo\nbar";
int
main()
{
char arr[100];
FILE *stream;
stream = fmemopen(buffer, strlen(buffer), "r");
fgets(arr, sizeof arr, stream);
printf("First line: %s\n", arr);
fgets(arr, sizeof arr, stream);
printf("Second line: %s\n", arr);
fclose (stream);
return 0;
}
POSIX 2008 (e.g. most Linux systems) has getline(3) which heap-allocates a buffer for a line.
So you could code
FILE* fil = fopen("something.txt","r");
if (!fil) { perror("fopen"); exit(EXIT_FAILURE); };
char *linebuf=NULL;
size_t linesiz=0;
if (getline(&linebuf, &linesiz, fil) != -1) { /* getline returns -1 on error or EOF */
do_something_with(linebuf);
}
else { perror("getline"); exit(EXIT_FAILURE); }
If you want to read an editable line from stdin in a terminal consider GNU readline.
If you are restricted to pure C99 code you have to do the heap allocation yourself (malloc or calloc or perhaps -with care- realloc)
If you just want to copy the first line of some existing buffer char*bigbuf; which is non-NULL, valid, and zero-byte terminated:
char*line = NULL;
char *eol = strchr(bigbuf, '\n');
if (!eol) { // bigbuf is a single line so duplicate it
line = strdup(bigbuf);
if (!line) { perror("strdup"); exit(EXIT_FAILURE); }
} else {
size_t linesize = eol-bigbuf;
line = malloc(linesize+1);
if (!line) { perror("malloc"); exit(EXIT_FAILURE); }
memcpy (line, bigbuf, linesize);
line[linesize] = '\0';
}
This seems to be a basic question, but I would rather ask to clear it up than spend many more days on it. I am trying to copy data that I receive (via a recv call) into a buffer, which will then be pushed to a file. I want to use memcpy to keep appending data to the buffer until it is too small to hold more data, at which point I use realloc. The code is below.
int vl_packetSize = PAD_SIZE + (int)p_size - 1; // PAD_SIZE is the size of char array sent
//p_size is the size of data to be recv. Same data size is used by send
int p_currentSize = MAX_PROTO_BUFFER_SIZE;
int vl_newPacketSize = p_currentSize;
char *vl_data = (char *)malloc(vl_packetSize);
memset((char *)vl_data,'\0',vl_packetSize);
/* Allocate memory to the buffer */
vlBuffer = (char *)malloc(p_currentSize);
memset((char *)vlBuffer,'\0',p_currentSize);
char *vlBufferCopy = vlBuffer;
if(vlBuffer==NULL)
return ERR_NO_MEM;
/* The sender first sends a padding data of size PAD_SIZE followed by actual data. I want to ignore the pad hence do vl_data+PAD_SIZE on memcpy */
if((p_currentSize - vl_llLen) < (vl_packetSize-PAD_SIZE)){
vl_newPacketSize +=vl_newPacketSize;
char *vlTempBuffer = (char *)realloc(vlBufferCopy,(size_t)vl_newPacketSize);
if(vlTempBuffer == NULL){
if(debug > 1)
fprintf(stdout,"Realloc failed:%s...Control Thread\n\n",fn_strerror_r(errno,err_buff));
free((void *)vlBufferCopy);
free((void *)vl_data);
return ERR_NO_MEM;
}
vlBufferCopy = vlTempBuffer;
vl_bytesIns = vl_llLen;
vl_llLen = 0;
vlBuffer = vlBufferCopy+vl_bytesIns;
fprintf(stdout,"Buffer val after realloc:%s\n\n",vlBufferCopy);
}
memcpy(vlBuffer,vl_data+PAD_SIZE,vl_packetSize-PAD_SIZE);
/*
fprintf(stdout,"Buffer val before increment:%s\n\n",vlBuffer);
fprintf(stdout,"vl_data length:%d\n\n",strlen(vl_data+PAD_SIZE));
fprintf(stdout,"vlBuffer length:%d\n\n",strlen(vlBuffer));
*/
vlBuffer+=(vl_packetSize-PAD_SIZE);
vl_llLen += (vl_packetSize-PAD_SIZE);
vl_ifNotFlush = 1;
//fprintf(stdout,"Buffer val just before realloc:%s\n\n",vlBufferCopy);
}
Problem: Whenever I fputs the data into the file later on, only the first data received/added to the buffer gets into the file.
Also, when I print the value of vlBufferCopy (which points to the first location of the data returned by malloc or realloc), I get the same result.
If I decrease the size by 1, I see the entire data in the file, but it somehow misses the newline character, and hence the data is
not inserted into the file in the proper format.
I know it is because of the trailing '\0', but somehow reducing the size by 1
(vlBuffer+=(vl_packetSize-PAD_SIZE-1);)
misses the newline character. fputs removes the trailing null character while writing the data.
Please let me know what I am missing here, either something to check or something in the logic.
(Note: I tried using strcat:
strcat(vlBuffer,vl_data+PAD_SIZE);
but I wanted to use memcpy as it is faster and can be used for any kind of buffer, not only a character pointer.)
Thanks
strcat and memcpy are very different functions.
I suggest you read the documentation of each.
Mainly, there are two differences:
1. memcpy copies data where you tell it to. strcat finds the end of the string, and copies there.
2. memcpy copies the number of bytes you request. strcat copies until the terminating null.
If you're dealing with packets of arbitrary contents, you have no use for strcat, or other string functions.
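To illustrate the two differences, here is a rough sketch of appending with memcpy: the code itself tracks where the data ends (len) and how many bytes to copy (n), so zero bytes inside a packet are harmless. The struct bytebuf type, the bytebuf_append function and the initial capacity of 64 are all made up for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; len is the number of bytes
 * stored so far and cap is the allocated capacity. */
struct bytebuf {
    char  *data;
    size_t len;
    size_t cap;
};

/* Append n bytes from src, doubling the capacity as needed.
 * Returns 0 on success, -1 if memory runs out (buffer unchanged). */
int bytebuf_append(struct bytebuf *b, const void *src, size_t n)
{
    assert(b != NULL && (src != NULL || n == 0));

    if (n == 0)
        return 0;
    if (b->len + n > b->cap) {
        size_t new_cap = b->cap ? b->cap : 64;
        while (new_cap < b->len + n)
            new_cap *= 2;
        char *grown = realloc(b->data, new_cap);
        if (!grown)
            return -1;
        b->data = grown;
        b->cap = new_cap;
    }
    memcpy(b->data + b->len, src, n);   /* copy exactly n bytes */
    b->len += n;                        /* we track the end ourselves */
    return 0;
}
```

Start from a zero-initialised struct bytebuf and release data with free() when done. Note that strlen() on such a buffer stops at the first zero byte, which is why the stored length, not strlen(), has to be used when writing it out.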
You need to write to the file in a binary-safe way. Check how to use fwrite instead of fputs. fwrite will copy all the buffer, even if there's a zero in the middle of it.
const char *mybuff= "Test1\0Test2";
const size_t mybuff_len = 11;
size_t written = fwrite(mybuff, 1, mybuff_len, output_file); /* number of bytes written */