I'm trying to create a function that reads a single line from a text file using fgets() and stores it in a dynamically allocated char* using malloc(), but I am unsure how to use realloc(), since I do not know the length of the line and do not want to guess a magic number for the maximum size it could possibly be.
#include "stdio.h"
#include "stdlib.h"
#define INIT_SIZE 50
void get_line (char* filename)
{
char* text;
FILE* file = fopen(filename,"r");
text = malloc(sizeof(char) * INIT_SIZE);
fgets(text, INIT_SIZE, file);
//How do I realloc memory here if the text array is full but fgets
//has not reached an EOF or \n yet.
printf("The text was %s\n", text);
free(text);
}
int main(int argc, char *argv[]) {
get_line(argv[1]);
}
I am planning on doing other things with the line of text but for sake of keeping this simple, I have just printed it and then freed the memory.
Also: The main function is initiated by using the filename as the first command line argument.
The getline function is what you are looking for.
Use it like this:
char *line = NULL;
size_t n;
getline(&line, &n, stdin);
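A slightly fuller sketch of the same call, reading from an open file and checking the result. It assumes a POSIX system where getline() is available; the helper name read_first_line is made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L  /* request the POSIX getline() declaration */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the next line of fp (without the
 * trailing '\n') in a malloc'd buffer, or NULL on EOF/error.
 * The caller must free() the result. */
char *read_first_line(FILE *fp)
{
    char *line = NULL;  /* getline() allocates when *lineptr is NULL */
    size_t cap = 0;     /* buffer capacity, maintained by getline() */
    ssize_t len = getline(&line, &cap, fp);

    if (len == -1) {    /* EOF with nothing read, or a read error */
        free(line);     /* the buffer should be freed even on failure */
        return NULL;
    }
    if (len > 0 && line[len - 1] == '\n')
        line[len - 1] = '\0';   /* drop the newline */
    return line;
}
```

Because getline() reports the length it read, there is no need to call strlen() on the result.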
If you really want to implement this function yourself, you can write something like this:
#include <stdlib.h>
#include <stdio.h>
char *get_line()
{
int c;
/* what is the buffer current size? */
size_t size = 5;
/* How much is the buffer filled? */
size_t read_size = 0;
/* first allocation, its result should be tested... */
char *line = malloc(size);
if (!line)
{
perror("malloc");
return line;
}
line[0] = '\0';
c = fgetc(stdin);
while (c != EOF && c!= '\n')
{
line[read_size] = c;
++read_size;
if (read_size == size)
{
size += 5;
char *test = realloc(line, size);
if (!test)
{
perror("realloc");
/* keep what was read so far as a valid string */
line[read_size - 1] = '\0';
return line;
}
line = test;
}
c = fgetc(stdin);
}
line[read_size] = '\0';
return line;
}
One possible solution is to use two buffers: a temporary one that you pass to fgets, and one that you reallocate and append the temporary buffer to.
Perhaps something like this:
char temp[INIT_SIZE]; // Temporary string for fgets call
char *text = NULL; // The actual and full string
size_t length = 0; // Current length of the full string, needed for reallocation
while (fgets(temp, sizeof temp, file) != NULL)
{
// Reallocate
char *t = realloc(text, length + strlen(temp) + 1); // +1 for terminator
if (t == NULL)
{
// TODO: Handle error
break;
}
if (text == NULL)
{
// First allocation, make sure string is properly terminated for concatenation
t[0] = '\0';
}
text = t;
// Append the newly read string
strcat(text, temp);
// Get current length of the string
length = strlen(text);
// If the last character just read is a newline, we have the whole line
if (length > 0 && text[length - 1] == '\n')
{
break;
}
}
[Disclaimer: The code above is untested and may contain bugs]
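For reference, the same two-buffer idea can be wrapped into a self-contained function. This is only a sketch: the name read_line and the chunk size of 64 are arbitrary choices for illustration, and it uses memcpy with a tracked length instead of strcat so the string is not rescanned on every chunk.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 64  /* arbitrary chunk size chosen for this sketch */

/* Reads one line of any length from fp. Returns a malloc'd string
 * (newline removed), or NULL on EOF or allocation failure. */
char *read_line(FILE *fp)
{
    char temp[CHUNK];   /* fixed buffer handed to fgets */
    char *text = NULL;  /* growing result */
    size_t length = 0;  /* characters stored in text so far */

    while (fgets(temp, sizeof temp, fp) != NULL) {
        size_t n = strlen(temp);
        char *t = realloc(text, length + n + 1);  /* +1 for terminator */
        if (t == NULL) {
            free(text);
            return NULL;
        }
        text = t;
        memcpy(text + length, temp, n + 1);  /* copies the '\0' too */
        length += n;
        if (length > 0 && text[length - 1] == '\n') {
            text[length - 1] = '\0';  /* whole line read; strip newline */
            break;
        }
    }
    return text;  /* still NULL if nothing could be read */
}
```

The caller owns the returned buffer and should free() it when done.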
With the declaration of void get_line (char* filename), you can never make use of the line you read and store outside of the get_line function, because you do not return a pointer to line and do not pass the address of any pointer that could serve to make the allocation and read visible back in the calling function.
A good model (showing return type and useful parameters) for any function that reads an unknown number of characters into a single buffer is POSIX getline. You can implement your own using either fgetc or fgets and a fixed buffer. Efficiency favors fgets only to the extent that it minimizes the number of realloc calls needed. (Both functions share the same low-level input buffer, e.g. see the IO_BUFSIZ constant in the glibc source, which if I recall is now LIO_BUFSIZE after a recent name change, but basically boils down to an 8192-byte I/O buffer on Linux and 512 bytes on Windows.)
So long as you dynamically allocate the original buffer (either using malloc, calloc or realloc), you can read continually with a fixed buffer using fgets adding the characters read into the fixed buffer to your allocated line and checking whether the final character is '\n' or EOF to determine when you are done. Simply read a fixed buffer worth of chars with fgets each iteration and realloc your line as you go, appending the new characters to the end.
When reallocating, always realloc using a temporary pointer. That way, if you run out of memory and realloc returns NULL (or fails for any other reason), you won't overwrite the pointer to your currently allocated block with NULL creating a memory leak.
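A minimal sketch of that idiom, wrapped in a small helper (grow_buffer is a made-up name, not a library function):

```c
#include <stdlib.h>

/* Hypothetical helper: grows *buf to new_size bytes.
 * Returns 0 on success; on failure returns -1 and leaves *buf
 * untouched, so the caller can still use or free the old block. */
int grow_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);  /* never assign straight to *buf */
    if (tmp == NULL)
        return -1;   /* old block is still valid and still owned by caller */
    *buf = tmp;
    return 0;
}
```

Writing buf = realloc(buf, n) directly would lose the only pointer to the old block whenever realloc fails.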
A flexible implementation could look like the following. It sizes the fixed buffer as a VLA, using either a default size (SZINIT if defined, otherwise 120) when the caller passes 0 in *n, or the size the caller provides. It allocates storage for line (passed as a pointer to pointer to char), reallocates as required, and returns the number of characters read on success or -1 on failure (the same as POSIX getline does):
/** fgetline, a getline replacement with fgets, using fixed buffer.
* fgetline reads from 'fp' up to including a newline (or EOF)
* allocating for 'line' as required, initially allocating 'n' bytes.
* on success, the number of characters in 'line' is returned, -1
* otherwise
*/
ssize_t fgetline (char **line, size_t *n, FILE *fp)
{
if (!line || !n || !fp) return -1;
#ifdef SZINIT
size_t szinit = SZINIT > 0 ? SZINIT : 120;
#else
size_t szinit = 120;
#endif
size_t idx = 0, /* index for *line */
maxc = *n ? *n : szinit, /* fixed buffer size */
eol = 0, /* end-of-line flag */
nc = 0; /* number of characters read */
char buf[maxc]; /* VLA to use a fixed buffer (or allocate ) */
clearerr (fp); /* prepare fp for reading */
while (fgets (buf, maxc, fp)) { /* continually read maxc chunks */
nc = strlen (buf); /* number of characters read */
if (idx && *buf == '\n') /* if index & '\n' 1st char */
break;
if (nc && (buf[nc - 1] == '\n')) { /* test '\n' in buf */
buf[--nc] = 0; /* trim and set eol flag */
eol = 1;
}
/* always realloc with a temporary pointer */
void *tmp = realloc (*line, idx + nc + 1);
if (!tmp) /* on failure previous data remains in *line */
return idx ? (ssize_t)idx : -1;
*line = tmp; /* assign realloced block to *line */
memcpy (*line + idx, buf, nc + 1); /* append buf to line */
idx += nc; /* update index */
if (eol) /* if '\n' (eol flag set) done */
break;
}
/* if eol alone, or stream error, return -1, else length of buf */
return (feof (fp) && !nc) || ferror (fp) ? -1 : (ssize_t)idx;
}
(note: since nc already holds the current number of characters in buf, memcpy can be used to append the contents of buf to *line without scanning for the terminating nul-character again) Look it over and let me know if you have further questions.
Essentially you can use it as a drop-in replacement for POSIX getline (it will not be quite as efficient, but it isn't bad either).
Related
I am making a program in C to read a file and execute different pieces of code depending on what is in the file. However, when I try to read the file I get an exception:
Exception thrown at 0x7C3306DD (ucrtbased.dll) in MacroTool.exe: 0xC0000005: Access violation reading location 0x00000068.
Here is my code:
#include <stdio.h>
#include <string.h>
#include <limits.h>
int main(int argc, char* argv[]) {
if (argc < 2) {
printf("ERR: too few arguments.\n");
return 1;
}
if (argc > 2) {
printf("ERR: too many arguments.\n");
return 1;
}
FILE* fp = fopen(argv[1], "r");
if (fp == NULL) {
printf("ERR: cannot read file.\n");
return 1;
}
char buf[100];
char c;
while ((c = fgetc(fp)) != EOF)
strncat(buf, c, 1);
fclose(fp);
return 0;
}
Using multiple search engines to find an answer that is related to this issue led to nothing.
I am using Windows 10 and Visual Studio 2019.
You are attempting to use a string function on a buffer that is not nul-terminated, resulting in Undefined Behavior. Specifically:
char buf[100];
char c;
while ((c = fgetc(fp)) != EOF)
strncat(buf, c, 1);
On your first call to strncat, buf is uninitialized, resulting in Undefined Behavior: strncat() starts by looking for the existing nul-terminating character in buf (which doesn't exist) so that it can overwrite it with the new text to be appended. You can initialize buf to all zeros with:
char buf[100] = "";
That will fix the immediate Undefined Behavior but will not prevent you later reading beyond the bounds of buf when you read past the 99th character.
Instead, declare a counter initialized to zero and use it to limit the number of characters read into your buf, as in the comment (also declare c as int rather than char, since fgetc() returns an int and EOF may not be representable in a char):
size_t n = 0;
while (n + 1 < 100 && (c = fgetc(fp)) != EOF) {
buf[n++] = c;
}
buf[n] = 0; /* don't forget to nul-terminate buf */
You can put it all together and avoid using Magic-Numbers (100) as follows:
#include <stdio.h>
#define MAXC 100 /* if you need a constant, #define one (or more) */
int main (int argc, char* argv[]) {
if (argc < 2) { /* validate one argument given for filename */
fputs ("error: too few arguments.\n", stderr);
return 1;
}
char buf[MAXC] = "";
int c; /* int, not char, so EOF can be told apart from real data */
size_t n = 0;
FILE* fp = fopen (argv[1], "r");
if (fp == NULL) { /* validate file open for reading */
perror ("fopen-argv[1]"); /* on failure, perror() tells why */
return 1;
}
/* while buf not full (saving 1-char for \0), read char */
while (n + 1 < MAXC && (c = fgetc(fp)) != EOF)
buf[n++] = c; /* assign char to next element in buf */
buf[n] = 0; /* nul-terminate buf */
puts (buf); /* output result */
fclose(fp);
}
(note: the comparison with n + 1 ensures one-element in buf remains for the nul-terminating character)
Example Use/Output
$ ./bin/read100chars read100chars.c
#include <stdio.h>
#define MAXC 100 /* if you need a constant, #define one (or more) */
in
(note: the first 'n' in int main (... is the 99th character in the file)
Look things over and let me know if you need further help. (Note: with VS you will likely need to pass /wd4996 to disable the "CRT Secure...." warning.)
char *strncat(char *dest, const char *src, size_t n);
https://linux.die.net/man/3/strncat
The strcat() function appends the src string to the dest string, overwriting the terminating null byte ('\0') at the end of dest, and then adds a terminating null byte. The strings may not overlap, and the dest string must have enough space for the result. If dest is not large enough, program behavior is unpredictable; buffer overruns are a favorite avenue for attacking secure programs.
Note that strncat takes two pointers: the first is the destination and the second is the source. You are passing a character, so it should be strncat(buf, &c, 1);
strncat starts appending at the end of the destination string, i.e. at its null byte. Since your buffer isn't initialized, it contains garbage and may not contain a null byte '\0' at all. strncat searches for the '\0' starting from buf, may only find one somewhere past the end of the array, and starts appending characters there. That's why you get the error Access violation writing location 0x68FB49FC. To fix this, set the first byte of your buffer to '\0':
char buf[100];
buf[0] = '\0';
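Both points (an initialized buffer and passing &c) can be combined in a small helper. This is only a sketch; append_char is a made-up name, and it additionally refuses to write past the end of the buffer.

```c
#include <string.h>

/* Hypothetical helper: appends c to the string in buf, whose total
 * capacity is cap bytes (terminator included).
 * Returns 0 on success, -1 if there is no room left. */
int append_char(char *buf, size_t cap, char c)
{
    size_t len = strlen(buf);   /* buf must already be nul-terminated */
    if (len + 1 >= cap)
        return -1;              /* no room for c plus the '\0' */
    strncat(buf, &c, 1);        /* &c: strncat expects a pointer */
    return 0;
}
```

With char buf[100]; buf[0] = '\0'; the read loop can then call append_char(buf, sizeof buf, c) and stop when it returns -1.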
Also, you don't check whether your buffer has space for the new character, which leads to an access violation if your source file contains more than 99 characters. You have to handle that as well. Your options are:
limit your loop to 99 characters (and print an error if that is exceeded)
make your buffer large enough, like char buf[2048]; or similar, although it will still crash if that is not enough
create a buffer with the size of your source
size_t getFileSize(FILE* _file) {
fseek(_file, 0, SEEK_END);
size_t _size = ftell(_file);
fseek(_file, 0, SEEK_SET);
return _size;
}
char* buf = malloc(getFileSize(fp)); //< you have to manage the allocated memory
And one more thing
You're trying to read a file into a buffer with strncat. This might be inefficient, since every call has to find the end of the string, check the size n, and add both the character and the '\0'. Instead, I recommend:
char* readFile(FILE* _file) {
size_t file_size = getFileSize(_file);
char* buf = malloc(file_size + 1);
if (buf == NULL) //< allocation failed, nothing to read into
return NULL;
size_t read = fread(buf, sizeof(char), file_size, _file);
buf[read] = '\0';
return buf;
}
This code reads a text file line by line. I need to put those lines in an array, but I wasn't able to do it. Now I am getting an array of numbers somehow. So how do I read the file into a list? I tried using a 2-dimensional list but that doesn't work either.
I am new to C. I am mostly using Python but now I want to check if C is faster or not for a task.
#include <stdio.h>
#include <time.h>
#include <string.h>
void loadlist(char *ptext) {
char filename[] = "Z://list.txt";
char myline[200];
FILE * pfile;
pfile = fopen (filename, "r" );
char larray[100000];
int i = 0;
while (!feof(pfile)) {
fgets(myline,200,pfile);
larray[i]= myline;
//strcpy(larray[i],myline);
i++;
//printf(myline);
}
fclose(pfile);
printf("%s \n %d \n %d \n ","while doneqa",i,strlen(larray));
printf("First larray element is: %d \n",larray[0]);
/* for loop execution */
//for( i = 10; i < 20; i = i + 1 ){
// printf(larray[i]);
//}
}
int main ()
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
char *ptext = "String";
loadlist(ptext);
time(&etime);
printf("time to load: %f \n", difftime(etime, stime));
return(0);
}
This code reads a text file line by line. But I need to put those lines in an array but I wasn't able to do it. Now I am getting an array of numbers somehow.
There are many ways to do this correctly. To begin with, first sort out what it is you actually need/want to store, then figure out where that information will come from, and finally decide how you will provide storage for the information. In your case loadlist is apparently intended to load a list of lines (up to 10000) so that they are accessible through your statically declared array of pointers. (You can also allocate the pointers dynamically, but if you know you won't need more than X of them, statically declaring them is fine, up to the point you cause a stack overflow.)
Once you read the line in loadlist, you need to provide adequate storage to hold the line (plus the nul-terminating character). Otherwise, you are just counting the number of lines. In your case, since you declare an array of pointers, you cannot simply copy the line you read, because each of the pointers in your array does not yet point to any allocated block of memory. (You can't assign the address of the buffer you read the line into with fgets (buffer, size, FILE*) because (1) it is local to your loadlist function and it will go away when the function stack frame is destroyed on function return; and (2) it gets overwritten with each call to fgets anyway.)
So what to do? That's pretty simple too: take the strlen of each line as it is read, as @iharob says (+1 for the nul-byte), and use malloc to allocate a block of memory of that size. You can then simply copy the read buffer to the block of memory created and assign the pointer to your list (e.g. larray[x] in your code). POSIX provides a strdup function that both allocates and copies, but understand that it is not part of the C99 standard, so you can run into portability issues. (Also note you can use memmove if overlapping regions of memory are a concern, but we will ignore that for now since you are reading lines from a file.)
What are the rules for allocating memory? Well, you allocate with malloc, calloc or realloc and then you VALIDATE that your call to those functions succeeded before proceeding or you have just entered the realm of undefined behavior by writing to areas of memory that are NOT in fact allocated for your use. What does that look like? If you have your array of pointers p and you want to store a string from your read buffer buf of length len at index idx, you could simply do:
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* handle error condition */
Now you are free to allocate before you test as follows, but it is convenient to make the assignment as part of the test. The long form would be:
p[idx] = malloc (len + 1); /* allocate storage */
if (p[idx] == NULL) /* validate/handle error condition */
return NULL;
strcpy (p[idx], buf); /* copy buf to storage */
How you want to do it is up to you.
Now you also need to protect against reading beyond the end of your pointer array. (you only have a fixed number since you declared the array statically). You can make that check part of your read loop very easily. If you have declared a constant for the number of pointers you have (e.g. PTRMAX), you can do:
int idx = 0; /* index */
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
...
idx++;
}
By checking the index against the number of pointers available, you ensure you cannot attempt to assign addresses to more pointers than you have.
There is also the unaddressed issue of handling the '\n' that will be contained at the end of your read buffer. Recall that fgets reads up to and including the '\n'. You do not want newline characters dangling off the ends of the strings you store, so you simply overwrite the '\n' with a nul-terminating character (e.g. simply decimal 0 or the equivalent nul-character '\0', your choice). You can make that a simple test after your strlen call, e.g.
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
...
(note: that also brings up the issue of reading a line longer than the 200 characters you allocate for your read buffer. You check for whether a complete line has been read by checking whether fgets included the '\n' at the end, if it didn't, you know your next call to fgets will be reading again from the same line, unless EOF is encountered. In that case you would simply need to realloc your storage and append any additional characters to that same line -- that is left for future discussion)
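One simple way to deal with an over-long line, short of reallocating, is to detect the truncation and skip the rest of the line. A minimal sketch, assuming that discarding the tail is acceptable for your data; read_bounded_line is a made-up name:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (size bytes).
 * Returns 1 for a complete line, 0 if the line was too long and its
 * tail was discarded, -1 on EOF or read error. */
int read_bounded_line(char *buf, size_t size, FILE *fp)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return -1;
    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';      /* complete line: strip the newline */
        return 1;
    }
    /* No '\n' stored: either the buffer filled up or the file ended.
     * Skip to the end of the line and note whether anything was lost. */
    int c, truncated = 0;
    while ((c = fgetc(fp)) != EOF && c != '\n')
        truncated = 1;
    return truncated ? 0 : 1;
}
```

A return value of 0 tells the caller that the stored string is only the first size - 1 characters of the line.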
If you put all the pieces together and choose a return type for loadlist that can indicate success/failure, you could do something similar to the following:
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
note: you could just as easily change the return type to int and return the number of lines read, or pass a pointer to int (or better yet size_t) as a parameter to make the number of lines stored available back in the calling function.
However, in this case, we have used the initialization of all pointers in your array of pointers to NULL, so back in the calling function we need only iterate over the pointer array until the first NULL is encountered in order to traverse our list of lines. Putting together a short example program that read/stores all lines (up to PTRMAX lines) from the filename given as the first argument to the program (or from stdin if no filename is given), you could do something similar to:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
enum { LNMAX = 200, PTRMAX = 10000 };
char **loadlist (char **p, FILE *fp);
int main (int argc, char **argv) {
time_t stime, etime;
char *list[PTRMAX] = { NULL }; /* array of ptrs initialized NULL */
size_t n = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
printf ("Starting of the program...\n");
time (&stime);
if (loadlist (list, fp)) { /* read lines from fp into list */
time (&etime);
printf("time to load: %f\n\n", difftime (etime, stime));
}
else {
fprintf (stderr, "error: loadlist failed.\n");
return 1;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
while (list[n]) { /* output stored lines and free allocated mem */
printf ("line[%5zu]: %s\n", n, list[n]);
free (list[n++]);
}
return(0);
}
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
Finally, in any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address of the block of memory so that (2) it can be freed when it is no longer needed.
Use a memory error checking program to ensure you haven't written beyond/outside your allocated block of memory, attempted to read or base a jump on an uninitialized value, and finally to confirm that you have freed all the memory you have allocated.
For Linux, valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
Look things over, let me know if you have any further questions.
It's natural that you see numbers, because you are printing a single character using the "%d" specifier. In fact, strings in C are pretty much that: arrays of numbers, where the numbers are the ASCII values of the corresponding characters. If you instead use "%c" you will see the character that each of those numbers represents.
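A tiny illustration of the difference, assuming an ASCII system (show_char is a made-up helper name):

```c
#include <stdio.h>

/* Formats the same char twice into out: first as a number, then as
 * a character. out should have room for at least 16 bytes. */
void show_char(char ch, char *out, size_t size)
{
    /* %d prints the numeric code, %c prints the character itself */
    snprintf(out, size, "%d %c", ch, ch);
}
```

Called with 'A', this produces the text 65 A on an ASCII system.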
Your code also calls strlen() on something that is intended as an array of strings. strlen() is used to compute the length of a single string, a string being an array of char items with non-zero values, ended with a 0. Thus, strlen() is surely causing undefined behavior.
Also, if you want to store each string, you need to copy the data like you tried in the commented line with strcpy() because the array you are using for reading lines is overwritten over and over in each iteration.
Your compiler should be emitting all kinds of warnings for this code. If it isn't, enable them (for example -Wall -Wextra with gcc or clang) so that it can help you find common problems like assigning a pointer to a char.
You should fix multiple problems in your code, here is a code that fixes most of them
void
loadlist(const char *const filename) {
char line[100];
FILE *file;
// We can only read 100 lines, of
// max 99 characters each
char array[100][100];
int size;
size = 0;
file = fopen (filename, "r" );
if (file == NULL)
return;
while ((fgets(line, sizeof(line), file) != NULL) && (size < 100)) {
strcpy(array[size++], line);
}
fclose(file);
for (int i = 0 ; i < size ; ++i) {
printf("array[%d] = %s", i + 1, array[i]);
}
}
int
main(void)
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
loadlist("Z:\\list.txt");
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Just to prove how complicated it can be in C, check this out:
#include <stdio.h>
#include <time.h>
#include <string.h>
#include <stdlib.h>
struct string_list {
char **items;
size_t size;
size_t count;
};
void
string_list_print(struct string_list *list)
{
// Simply iterate through the list and
// print every item
for (size_t i = 0 ; i < list->count ; ++i) {
fprintf(stdout, "item[%zu] = %s\n", i + 1, list->items[i]);
}
}
struct string_list *
string_list_create(size_t size)
{
struct string_list *list;
// Allocate space for the list object
list = malloc(sizeof *list);
if (list == NULL) // ALWAYS check this
return NULL;
// Allocate space for the items
// (starting with `size' items)
list->items = malloc(size * sizeof *list->items);
if (list->items != NULL) {
// Update the list size because the allocation
// succeeded
list->size = size;
} else {
// Be optimistic, maybe realloc will work next time
list->size = 0;
}
// Initialize the count to 0, because
// the list is initially empty
list->count = 0;
return list;
}
int
string_list_append(struct string_list *list, const char *const string)
{
// Check if there is room for the new item
if (list->count + 1 >= list->size) {
char **items;
// Double the capacity; start from a small default in case the
// initial allocation in string_list_create() failed (size == 0)
size_t new_size = list->size ? 2 * list->size : 16;
items = realloc(list->items, new_size * sizeof *list->items);
if (items == NULL)
return -1;
// Now update the list
list->items = items;
list->size = new_size;
}
// Duplicate the string first and only increase `count'
// once the copy has succeeded
char *copy = strdup(string);
if (copy == NULL)
return -1;
list->items[list->count++] = copy;
return 0;
}
void
string_list_destroy(struct string_list *const list)
{
// `free()' does work with a `NULL' argument
// so perhaps as a principle we should too
if (list == NULL)
return;
// If the `list->items' was initialized, attempt
// to free every `strdup()'ed string
if (list->items != NULL) {
for (size_t i = 0 ; i < list->count ; ++i) {
free(list->items[i]);
}
free(list->items);
}
free(list);
}
struct string_list *
loadlist(const char *const filename) {
char line[100]; // A buffer for reading lines from the file
FILE *file;
struct string_list *list;
// Create a new list, initially it has
// room for 100 strings, but it grows
// automatically if needed
list = string_list_create(100);
if (list == NULL)
return NULL;
// Attempt to open the file
file = fopen (filename, "r");
// On failure, we now have the responsibility
// to cleanup the allocated space for the string
// list
if (file == NULL) {
string_list_destroy(list);
return NULL;
}
// Read lines from the file until there are no more
while (fgets(line, sizeof(line), file) != NULL) {
char *newline;
// Remove the trailing '\n'
newline = strchr(line, '\n');
if (newline != NULL)
*newline = '\0';
// Append the string to the list
string_list_append(list, line);
}
fclose(file);
return list;
}
int
main(void)
{
time_t stime, etime;
struct string_list *list;
printf("Starting of the program...\n");
time(&stime);
list = loadlist("Z:\\list.txt");
if (list != NULL) {
string_list_print(list);
string_list_destroy(list);
}
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Now, this will work almost like the Python code you say you wrote, and it will most likely be faster.
It is possible that an experienced Python programmer can write a Python program that runs faster than that of an inexperienced C programmer. Learning C is really worthwhile, however, because you then understand how things actually work and can infer how a Python feature is probably implemented, which can be very useful.
Although it's certainly way more complicated than doing the same in Python, note that I wrote this in about 10 minutes. So if you really know what you're doing and you really need it to be fast, C is certainly an option, but you need to learn many concepts that are not obvious to programmers of higher-level languages.
how to dynamically allocate memory for a string?
I want to take a text file as input and want to store the characters of the file to a string.
First I count the number of characters in the text file, then dynamically allocate the string with this size, and then want to copy the text to the string.
main()
{
int count = 0; /* number of characters seen */
FILE *in_file; /* input file */
/* character or EOF flag from input */
int ch;
in_file = fopen("TMCP.txt", "r");
if (in_file == NULL) {
printf("Cannot open %s\n", "FILE_NAME");
exit(8);
}
while (1)
{
ch = fgetc(in_file);
if (ch == EOF)
break;
++count;
}
printf("Number of characters is %d\n",
count);
char *buffer=(char*)malloc(count*(sizeof(char)));
}
That's a terrible solution. You can determine the size of the file using a load of methods (search for tell file size, and especially for fstat), and you can just mmap your file to memory directly, giving you exactly that buffer.
One option is to read the file a fixed-sized chunk at a time and extend the dynamic buffer as you read the file. Something like the following:
#define CHUNK_SIZE 512
...
char chunk[CHUNK_SIZE];
char *buffer = NULL;
size_t bufSize = 0;
...
while ( fgets( chunk, sizeof chunk, in_file ) )
{
char *tmp = realloc( buffer, bufSize + sizeof chunk );
if ( tmp )
{
buffer = tmp;
buffer[bufSize] = 0; // need to make sure that there is a 0 terminator
// in the buffer for strcat to work properly.
strcat( buffer, chunk );
bufSize += sizeof chunk;
}
else
{
// could not extend the dynamic buffer; handle as necessary
}
}
This snippet reads up to 511 characters from in_file at a time (fgets will zero-terminate the target array). It allocates and extends buffer for each chunk, then concatenates the input to buffer. For strcat to work properly, the destination buffer needs to be 0-terminated. This isn't guaranteed the first time around, when the buffer is initially allocated, although it should be on subsequent iterations.
Another strategy is to double the buffer size each time, which results in fewer realloc calls, but this is probably easier to grasp.
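For comparison, a sketch of the doubling strategy. Note that it swaps fgets/strcat for fread with a tracked length, which is a different technique from the snippet above; the name read_all and the starting size of 512 are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads everything from fp into a malloc'd, nul-terminated buffer,
 * doubling the capacity whenever it fills up.
 * Returns NULL on allocation failure. */
char *read_all(FILE *fp)
{
    size_t cap = 512, len = 0;   /* 512 is an arbitrary starting size */
    char *buffer = malloc(cap);
    if (buffer == NULL)
        return NULL;

    size_t got;
    /* always leave one byte free for the terminator */
    while ((got = fread(buffer + len, 1, cap - len - 1, fp)) > 0) {
        len += got;
        if (len + 1 == cap) {                 /* full: double the capacity */
            char *tmp = realloc(buffer, cap * 2);
            if (tmp == NULL) {
                free(buffer);
                return NULL;
            }
            buffer = tmp;
            cap *= 2;
        }
    }
    buffer[len] = '\0';
    return buffer;
}
```

Doubling means the number of realloc calls grows with the logarithm of the file size rather than linearly with it.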
I am trying to find a way to manipulate strings in C in a more efficient way (maybe like how Java does it).
One way I thought of it is to count the size of the string till the end of the line (maybe including spaces), allocate memory of this size using malloc() and then go back to the beginning of the line and scan the string.
Is there a way to do this? I don't know if there is a way to return the "cursor" to the beginning of the line to 're'scan something.
And if you know another/better way to deal with strings in C please tell me.
Thanks
There is no way to do what you're asking directly, but there is a (in my opinion far better) alternative: fgets().
What it does is read the text until the end of the line, including the final line-feed. If the line is longer than the buffer, then it omits that line feed --- you can use that fact to check if the line was completed.
Something like this (UNTESTED CODE):
// WARNING: Example does not include error checking
// (check the return value of `fgets()`, `malloc()` and `realloc()`!)
size_t buflen = 64;
size_t pos = 0;
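char* buf = malloc(buflen);
buf[0] = '\0'; // start with an empty string in case the first read fails
// `for(;;)` is an infinite loop
for(;;)
{
// read data into buf[pos..buflen] (total of `buflen-pos` bytes)
if(fgets(buf + pos, buflen - pos, file) == NULL)
break; // EOF or read error: stop, `buf` holds what was read so far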
pos = pos + strcspn(buf + pos, "\r\n");
if(buf[pos]) // reached end of line; end the loop
break;
buflen += 64;
// alternative (double the size):
// buflen <<= 1;
buf = realloc(buf, buflen); // resize the buffer
}
// `buf` contains our line; `pos` contains the end of it
// optional: remove the trailing newline
// buf[pos] = 0;
Relevant documentation:
fgets()
strcspn()
malloc()
realloc()
You could use scanf to read every character and then add that character into your buffer.
Your buffer initial size could be 16. And after you read every character you check if you have space for that new character. If you do not have space for your new character you double buffer size and realloc it.
Check out the code example:
#include <stdio.h>
#include <stdlib.h>
char *str;
int main(void) {
char c = '\0';
int size = 0;
int buffer_size = 16;
str = (char *) calloc(buffer_size, sizeof(char));
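while (c != '\n') {
if (scanf("%c", &c) != 1) // stop on EOF or read error
break;
if (size + 1 == buffer_size) {
buffer_size *= 2;
char *tmp = realloc(str, buffer_size); // keep str valid if realloc fails
if (tmp == NULL) {
fprintf(stderr, "insufficient memory\n");
free(str);
return EXIT_FAILURE;
}
str = tmp;
}
str[size] = c;
size++;
}
str[size] = '\0'; // realloc does not zero the new bytes, so terminate explicitly
printf("%s\n", str);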
return EXIT_SUCCESS;
}
What is the simplest way to read a full line in a C console program
The text entered might have a variable length and we can't make any assumption about its content.
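You need dynamic memory management. The fgets function can read the line, but it does not report how many characters it read, so a fgetc loop is simpler (this needs <stdio.h> and <stdlib.h>):
char * read_line(void) { /* not named getline, which would clash with the POSIX function declared in <stdio.h> */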
char * line = malloc(100), * linep = line;
size_t lenmax = 100, len = lenmax;
int c;
if(line == NULL)
return NULL;
for(;;) {
c = fgetc(stdin);
if(c == EOF)
break;
if(--len == 0) {
len = lenmax;
char * linen = realloc(linep, lenmax *= 2);
if(linen == NULL) {
free(linep);
return NULL;
}
line = linen + (line - linep);
linep = linen;
}
if((*line++ = c) == '\n')
break;
}
*line = '\0';
return linep;
}
Note: Never use gets()! It does no bounds checking and can overflow your buffer.
If you are using the GNU C library or another POSIX-compliant library, you can use getline() and pass stdin to it for the file stream.
A very simple but unsafe way to read a line into a statically allocated buffer:
char line[1024];
scanf("%[^\n]", line);
A safer implementation, without the possibility of buffer overflow, but with the possibility of not reading the whole line, is:
char line[1024];
scanf("%1023[^\n]", line);
Note the 'difference by one' between the length specified when declaring the variable and the length specified in the format string. It is a historical artefact.
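To keep the two lengths in sync and to avoid leaving the tail of an over-long line in the input, something along these lines might help. It is only a sketch: LINE_MAX_LEN, the STR/STR2 macros and read_bounded_line are names invented for this example, and it reads from a FILE * so it can be used with stdin or a file.

```c
#include <stdio.h>

#define LINE_MAX_LEN 15          /* hypothetical limit, excluding the '\0' */
#define STR2(x) #x
#define STR(x) STR2(x)           /* turns LINE_MAX_LEN into the text "15" */

/* Reads at most LINE_MAX_LEN characters of the next line into buf, which
 * must hold LINE_MAX_LEN + 1 bytes, and discards the rest of the line.
 * Returns 1 if the whole line fit, 0 if it was truncated, EOF at end of input. */
int read_bounded_line(FILE *fp, char *buf)
{
    int c;
    int complete = 1;

    buf[0] = '\0';
    /* "%[" fails to match on an empty line; that is not an error here,
     * buf simply stays empty. Only a real end of input returns EOF. */
    if (fscanf(fp, "%" STR(LINE_MAX_LEN) "[^\n]", buf) == EOF)
        return EOF;

    /* Drain whatever is left of the line, including the newline. */
    while ((c = getc(fp)) != EOF && c != '\n')
        complete = 0;
    return complete;
}
```

A caller would declare char buf[LINE_MAX_LEN + 1]; and call read_bounded_line(stdin, buf). Changing the limit in one place then updates both the buffer and the format string.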
So, if you were looking for command arguments, take a look at Tim's answer.
If you just want to read a line from console:
#include <stdio.h>
int main()
{
char string [256];
printf ("Insert your full address: ");
gets (string);
printf ("Your address is: %s\n",string);
return 0;
}
Yes, it is not secure, you can do buffer overrun, it does not check for end of file, it does not support encodings and a lot of other stuff.
Actually I didn't even think whether it did ANY of this stuff.
I agree I kinda screwed up :)
But...when I see a question like "How to read a line from the console in C?", I assume a person needs something simple, like gets() and not 100 lines of code like above.
Actually, I think that if you tried to write those 100 lines of code in reality, you would make many more mistakes than you would have made had you chosen gets ;)
getline runnable example
getline was mentioned on this answer but here is an example.
It is POSIX 7, allocates memory for us, and reuses the allocated buffer on a loop nicely.
Pointer newbs, read this: Why is the first argument of getline a pointer to pointer "char**" instead of "char*"?
main.c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
int main(void) {
char *line = NULL;
size_t len = 0;
ssize_t read = 0;
while (1) {
puts("enter a line");
read = getline(&line, &len, stdin);
if (read == -1)
break;
printf("line = %s", line);
printf("line length = %zd\n", read); /* read is a ssize_t, so %zd rather than %zu */
puts("");
}
free(line);
return 0;
}
Compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
./main.out
Outcome: this shows on the terminal:
enter a line
Then if you type:
asdf
and press enter, this shows up:
line = asdf
line length = 5
followed by another:
enter a line
Or from a pipe to stdin:
printf 'asdf\nqwer\n' | ./main.out
gives:
enter a line
line = asdf
line length = 5
enter a line
line = qwer
line length = 5
enter a line
Tested on Ubuntu 20.04.
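Note that the reported length of 5 includes the newline. If you would rather not keep it, the return value of getline makes stripping it cheap. A minimal sketch, assuming a POSIX system; the wrapper name getline_stripped is made up for this example:

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical wrapper: reads one line with getline() and removes the
 * trailing newline, if there is one. Returns the length of the stripped
 * line, or -1 at end of file or on error. *line and *cap are reused
 * across calls exactly as with getline() itself. */
ssize_t getline_stripped(char **line, size_t *cap, FILE *fp)
{
    ssize_t n = getline(line, cap, fp);
    if (n > 0 && (*line)[n - 1] == '\n')
        (*line)[--n] = '\0';
    return n;
}
```

It is called like getline, e.g. getline_stripped(&line, &len, stdin), and the buffer still has to be freed once at the end.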
glibc implementation
No POSIX? Maybe you want to look at the glibc 2.23 implementation.
It resolves to getdelim, which is a simple POSIX superset of getline with an arbitrary line terminator.
It doubles the allocated memory whenever increase is needed, and looks thread-safe.
It requires some macro expansion, but you're unlikely to do much better.
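As an illustration of the "arbitrary line terminator" part, here is a small sketch that uses getdelim directly, assuming a POSIX system. The helper name read_until is invented for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* for getdelim() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: reads from fp up to and including the next delim,
 * and returns the text without the delimiter as a malloc'd string.
 * Returns NULL at end of file or on error. */
char *read_until(FILE *fp, int delim)
{
    char *buf = NULL;
    size_t cap = 0;
    ssize_t n = getdelim(&buf, &cap, delim, fp);

    if (n == -1) {
        free(buf);               /* a buffer may exist even on failure */
        return NULL;
    }
    if (n > 0 && buf[n - 1] == (char)delim)
        buf[n - 1] = '\0';       /* drop the delimiter itself */
    return buf;
}
```

read_until(stdin, '\n') then behaves like a newline-stripping getline, while read_until(fp, ';') splits on semicolons instead.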
You might need to use a character by character (getc()) loop to ensure you have no buffer overflows and don't truncate the input.
As suggested, you can use getchar() to read from the console until an end-of-line or an EOF is returned, building your own buffer. The buffer can be grown dynamically if you are unable to set a reasonable maximum line size.
You can also use fgets as a safe way to obtain a line as a C null-terminated string:
#include <stdio.h>
char line[1024]; /* Generously large value for most situations */
char *eof;
line[0] = '\0'; /* Ensure empty line if no input delivered */
line[sizeof(line)-1] = ~'\0'; /* Ensure no false-null at end of buffer */
eof = fgets(line, sizeof(line), stdin);
If the console input is exhausted or the operation fails for some reason, fgets returns NULL (so eof == NULL) and the line buffer might be unchanged (which is why setting the first char to '\0' is handy).
fgets will not overfill line[] and it will ensure that there is a null after the last-accepted character on a successful return.
If end-of-line was reached, the character preceding the terminating '\0' will be a '\n'.
If there is no terminating '\n' before the ending '\0' it may be that there is more data or that the next request will report end-of-file. You'll have to do another fgets to determine which is which. (In this regard, looping with getchar() is easier.)
In the (updated) example code above, if line[sizeof(line)-1] == '\0' after successful fgets, you know that the buffer was filled completely. If that position is preceded by a '\n' you know you were lucky. Otherwise, there is either more data or an end-of-file up ahead in stdin. (When the buffer is not filled completely, you could still be at an end-of-file and there also might not be a '\n' at the end of the current line. Since you have to scan the string to find and/or eliminate any '\n' before the end of the string (the first '\0' in the buffer), I am inclined to prefer using getchar() in the first place.)
Do what you need to do to deal with there still being more line than the amount you read as the first chunk. The examples of dynamically-growing a buffer can be made to work with either getchar or fgets. There are some tricky edge cases to watch out for (like remembering to have the next input start storing at the position of the '\0' that ended the previous input before the buffer was extended).
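For what it is worth, the fgets variant of a growing buffer could look roughly like the sketch below. The helper name fgets_whole_line and the starting capacity of 16 are arbitrary choices for illustration, not anything prescribed by the standard library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one whole line from fp with fgets(), growing
 * the buffer as needed. Returns a malloc'd string (newline kept if the
 * line had one), or NULL if nothing could be read or memory ran out. */
char *fgets_whole_line(FILE *fp)
{
    size_t cap = 16;                 /* small on purpose, to show growth */
    size_t len = 0;
    char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;

    /* Each call stores at buf + len, i.e. on top of the previous '\0'. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n')
            return buf;              /* complete line */
        if (len + 1 < cap)
            return buf;              /* room left, no newline: input ended */

        /* Buffer is full and the line is not finished: double it. */
        char *tmp = realloc(buf, cap * 2);
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
        cap *= 2;
    }
    /* fgets() returned NULL: end of file or a read error. */
    if (len == 0 || ferror(fp)) {
        free(buf);
        return NULL;
    }
    return buf;                      /* last line exactly filled the buffer */
}
```

The caller frees the result. The second return inside the loop covers a last line without a trailing '\n', and the final return covers the case where that last line happened to fill the buffer exactly.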
How to read a line from the console in C?
Building your own function is one way to read a line from the console.
I'm using dynamic memory allocation to allocate the required amount of memory.
When we are about to exhaust the allocated memory, we try to double its size.
Here I'm using a loop to scan the characters of the string one by one with the getchar() function until it returns '\n' or EOF.
Finally, we release any excess allocated memory before returning the line.
//the function to read lines of variable length
char* scan_line(char *line)
{
int ch; // as getchar() returns `int`
long capacity = 0; // capacity of the buffer
long length = 0; // maintains the length of the string
char *temp = NULL; // use additional pointer to perform allocations in order to avoid memory leaks
while ( ((ch = getchar()) != '\n') && (ch != EOF) )
{
if((length + 1) >= capacity)
{
// resetting capacity
if (capacity == 0)
capacity = 2; // some initial fixed length
else
capacity *= 2; // double the size
// try reallocating the memory
if( (temp = realloc(line, capacity * sizeof(char))) == NULL ) //allocating memory
{
printf("ERROR: unsuccessful allocation");
// return line; or you can exit
exit(1);
}
line = temp;
}
line[length] = (char) ch; //type casting `int` to `char`
length++;
}
// shrink the buffer to exactly length + 1 bytes; for an empty line,
// where line is still NULL, this allocates the single byte needed
if( (temp = realloc(line, (length + 1) * sizeof(char))) == NULL )
{
    printf("ERROR: unsuccessful allocation");
    // return line; or you can exit
    exit(1);
}
line = temp;
line[length] = '\0'; //inserting null character right after the last character read
return line;
}
Now you could read a full line this way :
char *line = NULL;
line = scan_line(line);
Here's an example program using the scan_line() function :
#include <stdio.h>
#include <stdlib.h> //for dynamic allocation functions
char* scan_line(char *line)
{
..........
}
int main(void)
{
char *a = NULL;
a = scan_line(a); //function call to scan the line
printf("%s\n",a); //printing the scanned line
free(a); //don't forget to free the malloc'd pointer
}
sample input :
Twinkle Twinkle little star.... in the sky!
sample output :
Twinkle Twinkle little star.... in the sky!
I came across the same problem some time ago. This was my solution; I hope it helps.
#include <stdbool.h> /* bool, true, false */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
* Initial size of the read buffer
*/
#define DEFAULT_BUFFER 1024
/*
* Flags errors in pointer returning functions
*/
bool has_err = false;
/*
* Reads the next line of text from file and returns it.
* The line must be free()d afterwards.
*
* This function will segfault on binary data.
*/
char *readLine(FILE *file){
char *buffer = NULL;
char *tmp_buf = NULL;
bool line_read = false;
int iteration = 0;
int offset = 0;
if(file == NULL){
fprintf(stderr, "readLine: NULL file pointer passed!\n");
has_err = true;
return NULL;
}
while(!line_read){
if((tmp_buf = malloc(DEFAULT_BUFFER)) == NULL){
fprintf(stderr, "readLine: Unable to allocate temporary buffer!\n");
if(buffer != NULL)
free(buffer);
has_err = true;
return NULL;
}
if(fgets(tmp_buf, DEFAULT_BUFFER, file) == NULL){
free(tmp_buf);
break;
}
if(tmp_buf[strlen(tmp_buf) - 1] == '\n') /* we have an end of line */
line_read = true;
offset = DEFAULT_BUFFER * (iteration + 1);
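char *grown = realloc(buffer, offset); /* keep buffer valid if realloc fails */
if(grown == NULL){
    fprintf(stderr, "readLine: Unable to reallocate buffer!\n");
    free(tmp_buf);
    free(buffer); /* free(NULL) is a no-op */
    has_err = true;
    return NULL;
}
buffer = grown;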
offset = DEFAULT_BUFFER * iteration - iteration;
if(memcpy(buffer + offset, tmp_buf, DEFAULT_BUFFER) == NULL){
fprintf(stderr, "readLine: Cannot copy to buffer\n");
free(tmp_buf);
if(buffer != NULL)
free(buffer);
has_err = true;
return NULL;
}
free(tmp_buf);
iteration++;
}
return buffer;
}
There is a simple regex like syntax that can be used inside scanf to take whole line as input
scanf("%[^\n]%*c", str);
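[^\n] tells scanf to keep reading until a newline is encountered. Then %*c reads the newline character, and the * indicates that this character is discarded. If the line is empty, %[^\n] matches nothing and scanf returns before reaching %*c, so the newline stays in the input. Because no field width is given, str must be large enough for the longest possible line.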
Sample code
#include <stdio.h>
int main()
{
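char S[101] = ""; /* stays an empty string if scanf matches nothing */
scanf("%100[^\n]%*c", S); /* the width 100 keeps scanf within S[101] */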
printf("%s", S);
return 0;
}
On BSD systems and Android you can also use fgetln:
#include <stdio.h>
char *
fgetln(FILE *stream, size_t *len);
Like so:
size_t line_len;
const char *line = fgetln(stdin, &line_len);
The line is not null terminated and contains \n (or whatever your platform is using) at the end. It becomes invalid after the next I/O operation on stream.
Something like this:
unsigned int getConsoleInput(char **pStrBfr) //pass in pointer to char pointer, returns size of buffer
{
char * strbfr;
int c;
unsigned int i;
i = 0;
strbfr = (char*)malloc(sizeof(char));
if(strbfr==NULL) goto error;
while( (c = getchar()) != '\n' && c != EOF )
{
strbfr[i] = (char)c;
i++;
    char *tmp = realloc(strbfr,sizeof(char)*(i+1));
    //on realloc error, NULL is returned but the original buffer is unchanged,
    //so keep the old pointer until the call is known to have succeeded
    //NOTE: the buffer is not null terminated yet at this point
    if(tmp==NULL) goto error;
    strbfr = tmp;
}
strbfr[i] = '\0';
*pStrBfr = strbfr; //successfully returns pointer to null-terminated buffer
return i + 1;
error:
free(strbfr); //free(NULL) is a no-op, so this also covers a failed malloc
*pStrBfr = NULL;
return 0;
}
The best and simplest way to read a line from a console is to use the getchar() function and store one character at a time in an array.
#include <stdio.h>
#define N 256 /* size of the message array; you can always change it */
int main(void)
{
    char message[N]; /* character array for the message */
    int i = 0; /* number of characters stored so far */
    int c; /* int rather than char, so EOF can be detected */
    printf( "Enter a message: " );
    /* stop at the newline, at end of input, or when the array is full */
    while( i < N - 1 && (c = getchar()) != '\n' && c != EOF ){
        message[i++] = (char) c;
    }
    message[i] = '\0'; /* terminate the string */
    printf( "Entered message is: %s\n", message );
    return ( 0 );
}
Here is a minimal implementation. The nice thing is that it does not keep the '\n'; however, you have to give it a size to read, for security:
#include <stdio.h>
#include <errno.h>
int sc_gets(char *buf, int n)
{
    int count = 0;
    int c; /* int rather than char, so EOF can be detected */
    if (n <= 0)
        return -1;
    /* stop at the newline, at end of input, or after n - 1 characters */
    while (--n && (c = fgetc(stdin)) != '\n' && c != EOF)
        buf[count++] = (char) c;
    buf[count] = '\0';
    return (count != 0 || errno != EAGAIN) ? count : -1;
}
Test with:
#define BUFF_SIZE 10
int main (void) {
char buff[BUFF_SIZE];
sc_gets(buff, sizeof(buff));
printf ("%s\n", buff);
return 0;
}
NB: The line can be at most INT_MAX characters long before its line break is found, which is more than enough.