reading a file of strings to a multidimensional array to access later - c

I am really having a problem understanding dynamically allocated arrays.
I am attempting to read a text file of strings into a 2D array so I can sort them later. As my code stands right now, it throws segfaults every once in a while, which means I'm doing something wrong. I've been reading around to get a better understanding of what malloc actually does, but I want to test and check whether my array is being filled.
My program reads from a text file containing nothing but strings, and I am attempting to put that data into a 2D array.
for(index = 0; index < lines_allocated; index++){
//for loop to fill array 128 lines at a time(arbitrary number)
words[index] = malloc(sizeof(char));
if(words[index] == NULL){
perror("too many characters");
exit(2);
}
//check for end of file
while(!feof(txt_file)) {
words = fgets(words, 64, txt_file);
puts(words);
//realloc if nessesary
if (lines_allocated == (index - 1)){
realloc(words, lines_allocated + lines_allocated);
}
}
}
//get 3rd value placed
printf("%s", words[3]);
Since this is just a gist: below this point I've closed the file and freed the memory. The output is displayed by puts, but not by the printf at the bottom. An ELI5 version of reading files into an array would be amazing.
Thank you in advance

void *malloc(size_t n) will allocate a region of n bytes and return a pointer to the first byte of that region, or NULL if it could not allocate enough space. So when you do malloc(sizeof(char)), you're only allocating enough space for one byte (sizeof(char) is always 1 by definition).
Here's an annotated example that shows the correct use of malloc, realloc, and free. It reads in between 0 and 8 lines from standard input (redirect a file into it to read a file), each of which contains a string of unknown length. It then prints each line and frees all the memory.
#include <stdio.h>
#include <stdlib.h>
/* An issue with reading strings from a file is that we don't know how long
they're going to be. fgets lets us set a maximum length and discard the
rest if we choose, but since malloc is what you're interested in, I'm
going to do the more complicated version in which we grow the string as
needed to store the whole thing. */
char *read_line(void) {
size_t maxlen = 16, i = 0;
int c;
/* sizeof(char) is defined to be 1, so we don't need to include it.
the + 1 is for the null terminator */
char *s = malloc(maxlen + 1);
if (!s) {
fprintf(stderr, "ERROR: Failed to allocate %zu bytes\n", maxlen + 1);
exit(EXIT_FAILURE);
}
/* feof only returns nonzero after a read has already hit end of file. It's
generally easier to just use the return value of the read function directly.
Here we'll keep reading until we hit end of file or a newline. */
while ('\n' != (c = getchar())) {
if (EOF == c) {
/* We return NULL to indicate that we hit the end of file
before reading any characters (freeing the unused buffer first),
but if we've read anything, we still want to return the string */
if (0 == i) {
free(s);
return NULL;
}
break;
}
if (i == maxlen) {
/* Allocations are expensive, so we don't want to do one each
iteration. As such, we're always going to allocate more than
we need. Exactly how much extra we allocate depends on the
program's needs. Here, we just add a constant amount. */
maxlen += 16;
/* realloc will attempt to resize the memory pointed to by s,
or copy it to a newly allocated region of size maxlen. If it
makes a copy, it will free the old version. */
char *p = realloc(s, maxlen + 1);
if (!p) {
/* If the realloc fails, it does not free the old version, so we do it here. */
free(s);
fprintf(stderr, "ERROR: Failed to allocate %zu bytes\n", maxlen + 1);
exit(EXIT_FAILURE);
}
s = p;//set the pointer to the newly allocated memory
}
s[i++] = c;
}
s[i] = '\0';
return s;
}
int main(void) {
/* If we wanted to, we could grow the array of strings just like we do the strings
themselves, but for brevity's sake, we're just going to stop reading once we've
read 8 of them. */
size_t i, nstrings = 0, max_strings = 8;
/* Each string is an array of characters, so we allocate an array of char*;
each char* will point to the first element of a null-terminated character array */
char **strings = malloc(sizeof(char*) * max_strings);
if (!strings) {
fprintf(stderr, "ERROR: Failed to allocate %zu bytes\n", sizeof(char*) * max_strings);
return 1;
}
for (nstrings = 0; nstrings < max_strings; nstrings++) {
strings[nstrings] = read_line();
if (!strings[nstrings]) {//no more strings in file
break;
}
}
for (i = 0; i < nstrings; i++) {
printf("%s\n", strings[i]);
}
/* Free each individual string, then the array of strings */
for (i = 0; i < nstrings; i++) {
free(strings[i]);
}
free(strings);
return 0;
}

I haven't looked too closely so I could be offering an incomplete solution.
That being said, the error is probably here:
realloc(words, lines_allocated + lines_allocated);
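If successful, realloc returns the new pointer, which may differ from the old one. If you're lucky it can extend the block in place (which wouldn't cause a segfault), but you can't rely on that, so the return value must not be discarded.
words = realloc(words, lines_allocated + lines_allocated);
would solve it, although you should also check for errors: if realloc fails it returns NULL and the original block stays allocated, so assigning the result straight to words would leak it.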
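One more thing worth noting about that call: realloc takes a size in bytes, so an array of char * needs the element count multiplied by sizeof(char *). Here is a minimal sketch of a safer pattern. The helper name resize_words is hypothetical (it is not from the question), and it assumes new_count is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: resize an array of char* to hold new_count pointers.
   Returns 0 on success and updates *words; returns -1 on failure and
   leaves *words untouched, so the caller can still free the old block.
   Assumes new_count > 0. */
int resize_words(char ***words, size_t new_count)
{
    /* realloc takes a size in BYTES, so scale by the element size. */
    char **tmp = realloc(*words, new_count * sizeof **words);
    if (tmp == NULL)
        return -1;
    *words = tmp;
    return 0;
}
```

Starting from char **words = NULL, a call such as resize_words(&words, 2 * lines_allocated) would then replace the bare realloc in the question.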

Related

Read file line by line and store lines in array of strings in C

I have a problem reading a file in C and storing its lines in an array of strings.
char **aLineToMatch;
FILE *file2;
int bufferLength = 255;
char buffer[bufferLength];
int i;
char *testFiles[] = { "L_2005149PL.01002201.xml.html",
"L_2007319PL.01000101.xml.html",
NULL};
char *testStrings[] = { "First",
"Second",
"Third.",
NULL};
file = fopen(testFiles[0], "r"); // loop will come later, thats not the problem
while(fgets(buffer, bufferLength, file2) != NULL) {
printf("%s\n", buffer);
// here should be adding to array of strings (testStrings declared above)
}
fclose(file);
}
and then I do some checks, some prints etc.
for(aLineToMatch=testStrings; *aLineToMatch != NULL; aLineToMatch++) {
printf("String: %s\n", *aLineToMatch);
How to properly change the values of *testFiles[] to include valid values read from file and add NULL at the end?
I think the key issue here is that in C you must manage your own memory, and you need to know the difference between the different types of storage available in C.
Simply put, there's:
Stack
Heap
Static
Here are some relevant links with more detail about this:
https://www.geeksforgeeks.org/memory-layout-of-c-program/
https://craftofcoding.wordpress.com/2015/12/07/memory-in-c-the-stack-the-heap-and-static/
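In higher-level languages most objects live on the heap and are managed for you, so you can manipulate them pretty much however you please.
However, a bog-standard array or string literal in C has a fixed size: it lives in static storage (if it is a literal or declared at file scope) or on the stack (if it is declared inside a function), and it cannot be resized.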
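To make the three kinds of storage a bit more concrete, here is a small sketch. The names static_text, get_static_text and make_heap_text are made up for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Static storage: exists for the whole run of the program. */
static char static_text[] = "static";

/* Returning a pointer to static storage is fine; the caller must NOT free it. */
const char *get_static_text(void)
{
    return static_text;
}

/* Heap storage: lives until the caller passes it to free(). */
char *make_heap_text(void)
{
    /* Automatic (stack) storage: gone as soon as this function returns,
       so we copy it to the heap instead of returning its address. */
    char local[] = "heap copy";
    char *copy = malloc(sizeof local);
    if (copy != NULL)
        memcpy(copy, local, sizeof local);
    return copy;
}
```

Both functions return something that just looks like a pointer to char, which is exactly why mixing the two kinds in one array makes it hard to know what is safe to free().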
The rest of this answer is in the code comments below.
I've modified your code and tried to give explanations and context as to why it is needed.
// #Compile gcc read_line_by_line.c && ./a.out
// #Compile gcc read_line_by_line.c && valgrind ./a.out
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <stdbool.h>
// At file scope, the size of an array must be a compile-time constant
// i.e. it cannot be a variable like this: int n = 3; int numbers[n];
// (inside a function, C99 allows that as a variable-length array)
#define BUFFER_SIZE_BYTES 255
// Uses static program storage, size is fixed at time of compilation
char *files[] = {"file1.txt", "file2.txt"}; // The size of this symbol is (sizeof(char*) * 2)
// Hence this line of code is valid even outside the body of a function
// because it doesn't actually execute,
// it just declares some memory that the compiler is supposed to provision in the resulting binary executable
// Divide the total size, by the size of an element, to calculate the number of elements
const int num_files = sizeof(files) / sizeof(files[0]);
int main() {
printf("Program start\n\n");
printf("There are %d files to read.\n", num_files);
// These lines are in the body of a function and they execute at runtime
// This means we are now allocating memory 'on-the-fly' at runtime
int num_lines = 3;
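char **lines = malloc(sizeof(lines[0]) * num_lines);
if (lines == NULL) { // malloc() returns NULL if the allocation fails
fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
return EXIT_FAILURE;
}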
// lines[0] = "First"; // This would assign a pointer to some static storage containing the bytes { 'F', 'i', 'r', 's', 't', '\0' }
lines[0] = strdup("First"); // Use strdup() instead to allocate a copy of the string on the heap
lines[1] = strdup("Second"); // This is so that we don't end up with a mixture of strings
lines[2] = strdup("Third"); // with different kinds of storage in the same array
// because only the heap strings can be free()'d
// and trying to free() static strings is an error
// but you won't be able to tell them apart,
// they will all just look like pointers
// and you won't know which ones are safe to free()
printf("There are %d lines in the array.\n", num_lines);
// Reading the files this way only works for lines shorter than 255 characters
/*
printf("\nReading file...\n");
FILE *fp = fopen(files[0], "r");
char buffer[BUFFER_SIZE_BYTES];
while (fgets(buffer, BUFFER_SIZE_BYTES, fp) != NULL) {
printf("%s\n", buffer);
// Resize the array we allocated on the heap
void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
// Note that this can fail if there isn't enough free memory available
// This is also a comparatively expensive operation
// so you wouldn't typically do a resize for every single line
// Normally you would allocate extra space, wait for it to run out, then reallocate
// Either growing by a fixed size, or even doubling the size, each time it gets full
// Check if the allocation was successful
if (ptr == NULL) {
fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
assert(false);
}
// Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
lines = ptr;
// We cannot simply lines[num_lines] = buffer
// because we will end up with an array full of pointers
// that are all pointing to `buffer`
// and in the next iteration of the loop
// we will overwrite the contents of `buffer`
// so all appended strings will be the same: the last line of the file
// So we strdup() to allocate a copy on the heap
// we must remember to free() this later
lines[num_lines] = strdup(buffer);
// Keep track of the size of the array
num_lines++;
}
fclose(fp);
printf("Done.\n");
*/
// I would recommend reading the file this way instead
///*
printf("\nReading file...\n");
FILE *fp = fopen(files[0], "r");
if (fp == NULL) { // fopen() returns NULL if the file cannot be opened
perror(files[0]);
return EXIT_FAILURE;
}
char *new_line = NULL; // This buffer is allocated for us by getline() and could be any length, we must free() it though afterwards
size_t str_len = 0; // This will store the size of the buffer allocated by getline() (not the length of the line)
ssize_t bytes_read; // This will store the number of bytes read (including the newline, excluding the null-terminator), or -1 on error or end-of-file
while ((bytes_read = getline(&new_line, &str_len, fp)) != -1) {
printf("%s\n", new_line);
// Resize the array we allocated on the heap
void *ptr = realloc(lines, (num_lines + 1) * sizeof(lines[0]));
// Note that this can fail if there isn't enough free memory available
// This is also a comparatively expensive operation
// so you wouldn't typically do a resize for every single line
// Normally you would allocate extra space, wait for it to run out, then reallocate
// Either growing by a fixed size, or even doubling the size, each time it gets full
// Check if the allocation was successful
if (ptr == NULL) {
fprintf(stderr, "Failed to allocate memory at %s:%d\n", __FILE__, __LINE__);
assert(false);
}
// Overwrite `lines` with the pointer to the new memory region only if realloc() was successful
lines = ptr;
// Allocate a copy on the heap
// so that the array elements don't all point to the same buffer
// we must remember to free() this later
lines[num_lines] = strdup(new_line);
// Keep track of the size of the array
num_lines++;
}
free(new_line); // Free the buffer that was allocated by getline()
fclose(fp); // Close the file since we're done with it
printf("Done.\n");
//*/
printf("\nThere are %d lines in the array:\n", num_lines);
for (int i = 0; i < num_lines; i++) {
printf("%d: \"%s\"\n", i, lines[i]);
}
// Here you can do what you need to with the data...
// free() each string
// We know they're all allocated on the heap
// because we made copies of the statically allocated strings
for (int i = 0; i < num_lines; i++) {
free(lines[i]);
}
// free() the array itself
free(lines);
printf("\nProgram end.\n");
// At this point we should have free()'d everything that we allocated
// If you run the program with Valgrind, you should get the magic words:
// "All heap blocks were freed -- no leaks are possible"
return 0;
}
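One detail about the program above: getline() keeps the trailing newline, so the stored strings end in '\n'. If that is not wanted, it can be cut off before the strdup() call. A minimal sketch, where trim_newline is a hypothetical helper name:

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\n' (if any).
   strcspn returns the index of the first '\n', or the string length
   when there is none, so the write is always in bounds. */
char *trim_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
    return s;
}
```

For example, strdup(trim_newline(new_line)) would store the line without its newline.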
If you want to add elements to an array, you have 3 options:
1. Determine the maximum number of elements at compile-time and create a correctly sized array
2. Determine the maximum number of elements at run-time and create a variable-length array (works only in C99 and later)
3. Dynamically allocate the array and expand it as needed
Option 1 doesn't work here because it is impossible to know at compile-time how many lines your file will have.
Option 2 would imply that you first find the number of lines, which means to iterate the file twice. It also means that when you return from the function that reads the file, the array is automatically deallocated.
Option 3 is the best. Here is an example:
char **aLineToMatch;
FILE *file2;
int bufferLength = 255;
char buffer[bufferLength];
int i = 0;
char *testFiles[] = { "L_2005149PL.01002201.xml.html",
"L_2007319PL.01000101.xml.html",
NULL};
char (*testStrings)[bufferLength] = NULL; //pointer to an array of strings
//you probably meant file2 here (or the normal file in the while condition)
file2 = fopen(testFiles[0], "r"); // loop will come later, thats not the problem
while(fgets(buffer, bufferLength, file2) != NULL) {
printf("%s\n", buffer);
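// realloc through a temporary pointer so the old block isn't lost on failure
void *tmp = realloc(testStrings, (i + 1) * sizeof testStrings[0]);
if (tmp == NULL) {
break; // out of memory: keep the lines read so far
}
testStrings = tmp;
strcpy(testStrings[i], buffer);
i++;
}
fclose(file2);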
}
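For comparison, the first pass that Option 2 needs could look roughly like this. It is only a sketch; count_lines is a hypothetical helper, and it assumes the stream is a regular file that can be rewound (so not stdin from a terminal).

```c
#include <stdio.h>

/* Hypothetical helper for the two-pass approach: count the lines in fp,
   then rewind so the caller can read the file again from the start.
   A final line without a trailing '\n' still counts as a line. */
size_t count_lines(FILE *fp)
{
    size_t count = 0;
    int c, last = '\n';

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            count++;
        last = c;
    }
    if (last != '\n')   /* unterminated last line */
        count++;

    rewind(fp);         /* also clears the end-of-file indicator */
    return count;
}
```

With the count known, the array can be sized once before the second pass. The price is reading the file twice, which is why Option 3 is usually the better fit.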

How should I fix this interesting getdelim / getline (dynamic memory allocation) bug?

I have this C assignment I am a bit struggling at this specific point. I have some background in C, but pointers and dynamic memory management still elude me very much.
The assignment asks us to write a program which would simulate the behaviour of the "uniq" command / filter in UNIX.
But the problem I am having is with the C library functions getline or getdelim (we need to use those functions according to the implementation specifications).
According to the specification, the user input might contain arbitrary amount of lines and each line might be of arbitrary length (unknown at compile-time).
The problem is, the following line for the while-loop
while (cap = getdelim(stream.linesArray, size, '\n', stdin))
compiles and "works" somehow when I leave it like that. What I mean is that when I execute the program, I can enter an arbitrary number of lines of arbitrary length and the program does not crash, but it keeps looping until I stop the program (whether the lines are correctly stored in "char **linesArray;" is a different story that I am not sure about).
What I would like to be able to do is something like
while ((cap = getdelim(stream.linesArray, size, '\n', stdin)) && (cap != -1))
so that when getdelim does not read any characters on some line (besides EOF or \n), i.e. the very first time the user enters an empty line, the program stops taking more lines from stdin
(and then prints the lines that were stored in stream.linesArray by getdelim).
The problem is that when I make the change mentioned above and execute the program, it gives me "Segmentation Fault", and frankly I don't know why or how I should fix it (I have tried many times, to no avail).
For reference:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/getdelim.html
https://en.cppreference.com/w/c/experimental/dynamic/getline
http://man7.org/linux/man-pages/man3/getline.3.html
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define DEFAULT_SIZE 20
typedef unsigned long long int ull_int;
typedef struct uniqStream
{
char **linesArray;
ull_int lineIndex;
} uniq;
int main()
{
uniq stream = { malloc(DEFAULT_SIZE * sizeof(char)), 0 };
ull_int cap, i = 0;
size_t *size = 0;
while ((cap = getdelim(stream.linesArray, size, '\n', stdin))) //&& (cap != -1))
{
stream.lineIndex = i;
//if (cap == -1) { break; }
//print("%s", stream.linesArray[i]);
++i;
if (i == sizeof(stream.linesArray))
{
stream.linesArray = realloc(stream.linesArray, (2 * sizeof(stream.linesArray)));
}
}
ull_int j;
for (j = 0; j < i; ++j)
{
printf("%s\n", stream.linesArray[j]);
}
free(stream.linesArray);
return 0;
}
Ok, so the intent is clear - use getdelim to store the lines inside an array. getline itself uses dynamic allocation. The manual is quite clear about it:
getline() reads an entire line from stream, storing the address of the
buffer containing the text into *lineptr. The buffer is
null-terminated and includes the newline character, if one was found.
The getline() "stores the address of the buffer into *lineptr". So lineptr has to be a valid pointer to a char * variable (read that twice).
*lineptr and *n will be updated
to reflect the buffer address and allocated size respectively.
Also n needs to be a valid(!) pointer to a size_t variable, so the function can update it.
Also note that the lineptr buffer:
This buffer should be freed by the user program even if getline() failed.
So what do we do? We need an array of pointers, each pointing to one string. Because I don't like becoming a three-star programmer, I use structs. I modified your code a bit and added some checks. You'll have to excuse me, I don't like typedefs, so I don't use them. I renamed uniq to struct lines_s:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
struct line_s {
char *line;
size_t len;
};
struct lines_s {
struct line_s *lines;
size_t cnt;
};
int main() {
struct lines_s lines = { NULL, 0 };
// loop breaks on error or feof(stdin)
while (1) {
char *line = NULL;
size_t size = 0;
// we pass a pointer to a `char*` variable
// and a pointer to `size_t` variable
// `getdelim` will update the variables inside it
// the initial values are NULL and 0
ssize_t ret = getdelim(&line, &size, '\n', stdin);
if (ret < 0) {
// the buffer must be freed even if getdelim failed
free(line);
// check for EOF
if (feof(stdin)) {
// EOF found - break
break;
}
fprintf(stderr, "getdelim error %zd!\n", ret);
abort();
}
// new line was read - add it to our container "lines"
// always handle realloc separately
void *ptr = realloc(lines.lines, sizeof(*lines.lines) * (lines.cnt + 1));
if (ptr == NULL) {
// note that lines.lines is still a valid pointer here
fprintf(stderr, "Out of memory\n");
abort();
}
lines.lines = ptr;
lines.lines[lines.cnt].line = line;
lines.lines[lines.cnt].len = size;
lines.cnt += 1;
// break if the line is "stop"
if (strcmp("stop\n", lines.lines[lines.cnt - 1].line) == 0) {
break;
}
}
// iterate over lines
for (size_t i = 0; i < lines.cnt; ++i) {
// note that the line has a newline in it
// so no additional is needed in this printf
printf("line %zu is %s", i, lines.lines[i].line);
}
// getdelim returns dynamically allocated strings
// we need to free them
for (size_t i = 0; i < lines.cnt; ++i) {
free(lines.lines[i].line);
}
free(lines.lines);
}
For such input:
line1 line1
line2 line2
stop
will output:
line 0 is line1 line1
line 1 is line2 line2
line 2 is stop
Tested on onlinegdb.
Notes:
if (i == sizeof(stream.linesArray)) - sizeof does not magically store the size of an array. sizeof(stream.linesArray) is just sizeof(char**), which is the size of a pointer. It's usually 4 or 8 bytes, depending on whether the architecture is 32-bit or 64-bit.
uniq stream = { malloc(DEFAULT_SIZE * sizeof(char)), - stream.linesArray is a char** variable. So if you want to have an array of pointers to char, you should allocate memory for pointers: malloc(DEFAULT_SIZE * sizeof(char*)).
typedef unsigned long long int ull_int; - size_t is the type used to represent array sizes and the result of sizeof. ssize_t is used in the POSIX API to return either a size or an error status. Use those types; there is no need to type unsigned long long.
ull_int cap; cap = getdelim(...) - cap is unsigned, so it can never hold a negative value, and a test such as cap < 0 is always false. Store the return value of getdelim in an ssize_t instead, so that the -1 returned on error or end of file can be checked directly.
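Putting these notes together, the behaviour the question asked for (stop at end of input or at the first empty line) might be sketched as follows. This is an illustration under assumptions, not the assignment's required design: read_until_blank is a hypothetical helper, and it relies on POSIX getline().

```c
#define _POSIX_C_SOURCE 200809L  /* for getline() and ssize_t */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: read lines from `in` until EOF or the first empty
   line. Returns a malloc'd array of malloc'd strings and stores the number
   of lines in *count. Returns NULL if nothing was stored. */
char **read_until_blank(FILE *in, size_t *count)
{
    char **lines = NULL;
    *count = 0;

    for (;;) {
        char *line = NULL;   /* getline allocates when given NULL and 0 */
        size_t cap = 0;
        ssize_t len = getline(&line, &cap, in);   /* signed: can hold -1 */

        /* Stop on EOF/error (-1) or on a line that is only "\n". */
        if (len < 0 || (len == 1 && line[0] == '\n')) {
            free(line);      /* the buffer must be freed even on failure */
            break;
        }

        char **tmp = realloc(lines, (*count + 1) * sizeof *lines);
        if (tmp == NULL) {   /* keep what we have, drop the new line */
            free(line);
            break;
        }
        lines = tmp;
        lines[(*count)++] = line;   /* the array now owns the buffer */
    }
    return lines;
}
```

The caller frees each lines[i] and then lines itself, as in the program above. Each stored string still ends with its newline.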

C - cannot read and process a list of strings from a text file into an array

This code reads a text file line by line. I need to put those lines in an array, but I wasn't able to do it. Now I am somehow getting an array of numbers. So how do I read the file into a list? I tried using a two-dimensional list, but that doesn't work either.
I am new to C. I am mostly using Python but now I want to check if C is faster or not for a task.
#include <stdio.h>
#include <time.h>
#include <string.h>
void loadlist(char *ptext) {
char filename[] = "Z://list.txt";
char myline[200];
FILE * pfile;
pfile = fopen (filename, "r" );
char larray[100000];
int i = 0;
while (!feof(pfile)) {
fgets(myline,200,pfile);
larray[i]= myline;
//strcpy(larray[i],myline);
i++;
//printf(myline);
}
fclose(pfile);
printf("%s \n %d \n %d \n ","while doneqa",i,strlen(larray));
printf("First larray element is: %d \n",larray[0]);
/* for loop execution */
//for( i = 10; i < 20; i = i + 1 ){
// printf(larray[i]);
//}
}
int main ()
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
char *ptext = "String";
loadlist(ptext);
time(&etime);
printf("time to load: %f \n", difftime(etime, stime));
return(0);
}
This code reads a text file line by line. But I need to put those lines in an array but I wasn't able to do it. Now I am getting an array of numbers somehow.
There are many ways to do this correctly. To begin with, first sort out what it is you actually need/want to store, then figure out where that information will come from, and finally decide how you will provide storage for the information. In your case loadlist is apparently intended to load a list of lines (up to 10000) so that they are accessible through your statically declared array of pointers. (You can also allocate the pointers dynamically, but if you know you won't need more than X of them, statically declaring them is fine, up to the point where you cause a stack overflow.)
Once you read the line in loadlist, you need to provide adequate storage to hold the line (plus the nul-terminating character). Otherwise, you are just counting the number of lines. In your case, since you declare an array of pointers, you cannot simply copy the line you read, because each of the pointers in your array does not yet point to any allocated block of memory. (You can't assign the address of the buffer you read the line into with fgets (buffer, size, FILE*) because (1) it is local to your loadlist function and it will go away when the function stack frame is destroyed on function return; and (2) it gets overwritten with each call to fgets anyway.)
So what to do? That's pretty simple too: get the strlen of each line as it is read, as @iharob says (+1 for the nul-byte), and then use malloc to allocate a block of memory that size. You can then simply copy the read buffer to the block of memory created and assign the pointer to your list (e.g. larray[x] in your code). POSIX (and glibc) provide a strdup function that both allocates and copies, but understand that it is not part of the C99 standard, so you can run into portability issues. (Also note you can use memmove if overlapping regions of memory are a concern, but we will ignore that for now since you are reading lines from a file.)
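If strdup is not available on your platform, an equivalent is only a few lines of standard C. A minimal sketch; dup_string is a made-up name, chosen so that it does not clash with a library strdup:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup(): returns a malloc'd copy of s,
   or NULL if the allocation fails. The caller must free() the result. */
char *dup_string(const char *s)
{
    size_t size = strlen(s) + 1;     /* +1 for the nul-terminator */
    char *copy = malloc(size);
    if (copy != NULL)
        memcpy(copy, s, size);       /* source and destination never overlap */
    return copy;
}
```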
What are the rules for allocating memory? Well, you allocate with malloc, calloc or realloc and then you VALIDATE that your call to those functions succeeded before proceeding or you have just entered the realm of undefined behavior by writing to areas of memory that are NOT in fact allocated for your use. What does that look like? If you have your array of pointers p and you want to store a string from your read buffer buf of length len at index idx, you could simply do:
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* handle error condition */
Now you are free to allocate before you test as follows, but it is convenient to make the assignment as part of the test. The long form would be:
p[idx] = malloc (len + 1); /* allocate storage */
if (p[idx] == NULL) /* validate/handle error condition */
return NULL;
strcpy (p[idx], buf); /* copy buf to storage */
How you want to do it is up to you.
Now you also need to protect against reading beyond the end of your pointer array. (you only have a fixed number since you declared the array statically). You can make that check part of your read loop very easily. If you have declared a constant for the number of pointers you have (e.g. PTRMAX), you can do:
int idx = 0; /* index */
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) { /* test idx first so no line is read and then dropped */
...
idx++;
}
By checking the index against the number of pointers available, you ensure you cannot attempt to assign an address to more pointers than you have.
There is also the unaddressed issue of handling the '\n' that will be contained at the end of your read buffer. Recall, fgets read up to and including the '\n'. You do not want newline characters dangling off the ends of the strings you store, so you simply overwrite the '\n' with a nul-terminating character (e.g. simply decimal 0 or the equivalent nul-character '\0' -- your choice). You can make that a simple test after your strlen call, e.g.
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
...
(note: that also brings up the issue of reading a line longer than the 200 characters you allocate for your read buffer. You check for whether a complete line has been read by checking whether fgets included the '\n' at the end, if it didn't, you know your next call to fgets will be reading again from the same line, unless EOF is encountered. In that case you would simply need to realloc your storage and append any additional characters to that same line -- that is left for future discussion)
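For the curious, the "keep reading and append" idea from that note can be sketched as follows. This is one possible approach rather than the definitive one; read_whole_line is a hypothetical helper, and the 16-byte chunk size is deliberately tiny just to exercise the loop.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from fp using a small
   fgets buffer, growing the result until the '\n' (or EOF) is reached.
   Returns a malloc'd string without the newline, or NULL at EOF/on error. */
char *read_whole_line(FILE *fp)
{
    char chunk[16];              /* deliberately small to show the loop */
    char *line = NULL;
    size_t len = 0;

    while (fgets(chunk, sizeof chunk, fp)) {
        size_t n = strlen(chunk);
        char *tmp = realloc(line, len + n + 1);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        memcpy(line + len, chunk, n + 1);   /* copies the nul as well */
        len += n;

        if (len > 0 && line[len - 1] == '\n') {   /* complete line read */
            line[len - 1] = '\0';
            break;
        }
    }
    return line;                 /* NULL if nothing could be read */
}
```

Because each returned string is already heap-allocated, the pointer can be stored directly in p[idx] without a further malloc and strcpy.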
If you put all the pieces together and choose a return type for loadlist that can indicate success/failure, you could do something similar to the following:
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
note: you could just as easily change the return type to int and return the number of lines read, or pass a pointer to int (or better yet size_t) as a parameter to make the number of lines stored available back in the calling function.
However, in this case, we have used the initialization of all pointers in your array of pointers to NULL, so back in the calling function we need only iterate over the pointer array until the first NULL is encountered in order to traverse our list of lines. Putting together a short example program that read/stores all lines (up to PTRMAX lines) from the filename given as the first argument to the program (or from stdin if no filename is given), you could do something similar to:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
enum { LNMAX = 200, PTRMAX = 10000 };
char **loadlist (char **p, FILE *fp);
int main (int argc, char **argv) {
time_t stime, etime;
char *list[PTRMAX] = { NULL }; /* array of ptrs initialized NULL */
size_t n = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
printf ("Starting of the program...\n");
time (&stime);
if (loadlist (list, fp)) { /* read lines from fp into list */
time (&etime);
printf("time to load: %f\n\n", difftime (etime, stime));
}
else {
fprintf (stderr, "error: loadlist failed.\n");
return 1;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
while (n < PTRMAX && list[n]) { /* output stored lines and free allocated mem (no NULL sentinel remains if the array is full) */
printf ("line[%5zu]: %s\n", n, list[n]);
free (list[n++]);
}
return(0);
}
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (idx < PTRMAX && fgets (buf, LNMAX, fp)) {
size_t len = strlen (buf); /* get length */
if (len && buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
Finally, in any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.
Use a memory error checking program to ensure you haven't written beyond/outside your allocated block of memory, attempted to read or base a jump on an uninitialized value, and finally to confirm that you have freed all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
Look things over, let me know if you have any further questions.
It's natural that you see numbers because you are printing a single character using the "%d" specifier. In fact, strings in c are pretty much that, arrays of numbers, those numbers are the ascii values of the corresponding characters. If you instead use "%c" you will see the character that represents each of those numbers.
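A tiny illustration of that difference, written with snprintf so the two results can be compared as strings. The function show_char is hypothetical, and the "65" mentioned below assumes an ASCII character set.

```c
#include <stdio.h>

/* Format the same char twice: once as a number (%d), once as a character (%c).
   snprintf never writes more than the given buffer size. */
void show_char(char ch, char *as_number, size_t n1, char *as_char, size_t n2)
{
    snprintf(as_number, n1, "%d", ch);   /* char is promoted to int */
    snprintf(as_char, n2, "%c", ch);
}
```

Calling it with 'A' fills the first buffer with "65" on an ASCII system and the second with "A".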
Your code also calls strlen() on something that is intended as an array of strings. strlen() computes the length of a single string, a string being an array of char items with non-zero values, terminated by a 0. Since larray never holds such a terminated string, calling strlen() on it causes undefined behavior.
Also, if you want to store each string, you need to copy the data like you tried in the commented line with strcpy(), because the array you are using for reading lines is overwritten over and over in each iteration.
Your compiler must be throwing all kinds of warnings. If it isn't, that's on you: you should let the compiler know that you want it to do some diagnostics to help you find common problems like assigning a pointer to a char.
You should fix multiple problems in your code; here is a version that fixes most of them:
void
loadlist(const char *const filename) {
char line[100];
FILE *file;
// We can only read 100 lines, of
// max 99 characters each
char array[100][100];
int size;
size = 0;
file = fopen (filename, "r" );
if (file == NULL)
return;
while ((fgets(line, sizeof(line), file) != NULL) && (size < 100)) {
strcpy(array[size++], line);
}
fclose(file);
for (int i = 0 ; i < size ; ++i) {
printf("array[%d] = %s", i + 1, array[i]);
}
}
int
main(void)
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
loadlist("Z:\\list.txt");
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Just to show how much more involved it can get in C, check this out:
#include <stdio.h>
#include <time.h>
#include <string.h>
#include <stdlib.h>
struct string_list {
char **items;
size_t size;
size_t count;
};
void
string_list_print(struct string_list *list)
{
// Simply iterate through the list and
// print every item
for (size_t i = 0 ; i < list->count ; ++i) {
fprintf(stdout, "item[%zu] = %s\n", i + 1, list->items[i]);
}
}
struct string_list *
string_list_create(size_t size)
{
struct string_list *list;
// Allocate space for the list object
list = malloc(sizeof *list);
if (list == NULL) // ALWAYS check this
return NULL;
// Allocate space for the items
// (starting with `size' items)
list->items = malloc(size * sizeof *list->items);
if (list->items != NULL) {
// Update the list size because the allocation
// succeeded
list->size = size;
} else {
// Be optimistic, maybe realloc will work next time
list->size = 0;
}
// Initialize the count to 0, because
// the list is initially empty
list->count = 0;
return list;
}
int
string_list_append(struct string_list *list, const char *const string)
{
// Check if there is room for the new item
if (list->count >= list->size) {
char **items;
size_t size;
// Resize the array, there is no more room. If the initial
// allocation failed `size' is 0, so start with one slot
size = (list->size > 0) ? 2 * list->size : 1;
items = realloc(list->items, size * sizeof *list->items);
if (items == NULL)
return -1;
// Now update the list
list->items = items;
list->size = size;
}
// Copy the string into the array, and only increase
// the `count' if the copy succeeded
list->items[list->count] = strdup(string);
if (list->items[list->count] == NULL)
return -1;
list->count++;
return 0;
}
void
string_list_destroy(struct string_list *const list)
{
// `free()' does work with a `NULL' argument
// so perhaps as a principle we should too
if (list == NULL)
return;
// If the `list->items' was initialized, attempt
// to free every `strdup()'ed string
if (list->items != NULL) {
for (size_t i = 0 ; i < list->count ; ++i) {
free(list->items[i]);
}
free(list->items);
}
free(list);
}
struct string_list *
loadlist(const char *const filename) {
char line[100]; // A buffer for reading lines from the file
FILE *file;
struct string_list *list;
// Create a new list, initially it has
// room for 100 strings, but it grows
// automatically if needed
list = string_list_create(100);
if (list == NULL)
return NULL;
// Attempt to open the file
file = fopen (filename, "r");
// On failure, we now have the responsibility
// to cleanup the allocated space for the string
// list
if (file == NULL) {
string_list_destroy(list);
return NULL;
}
// Read lines from the file until there are no more
while (fgets(line, sizeof(line), file) != NULL) {
char *newline;
// Remove the trailing '\n'
newline = strchr(line, '\n');
if (newline != NULL)
*newline = '\0';
// Append the string to the list, stop on allocation failure
if (string_list_append(list, line) != 0)
break;
}
fclose(file);
return list;
}
int
main(void)
{
time_t stime, etime;
struct string_list *list;
printf("Starting of the program...\n");
time(&stime);
list = loadlist("Z:\\list.txt");
if (list != NULL) {
string_list_print(list);
string_list_destroy(list);
}
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Now, this will work almost like the Python code you say you wrote, but it will almost certainly be faster.
It is possible that an experienced Python programmer can write a Python program that runs faster than that of an inexperienced C programmer. Learning C, however, is really worthwhile because you then understand how things actually work, and you can infer how a Python feature is probably implemented, which can be very useful.
Although it's certainly far more complicated than doing the same in Python, note that I wrote this in about 10 minutes. So if you really know what you're doing and you really need it to be fast, C is certainly an option, but you need to learn many concepts that are not obvious to programmers coming from higher-level languages.

reading an unbounded line from the console with scanf

I need to read a finite yet unbounded-in-length string.
We learned only about scanf so I guess I cannot use fgets.
Anyway, I've run this code on an input with length larger than 5.
char arr[5];
scanf("%s", arr);
char *s = arr;
while (*s != '\0')
printf("%c", *s++);
scanf keeps scanning and writes the overflowed part, but it seems like a hack. Is that good practice? If not, how should I read it?
Note: We have learned about the alloc functions family.
Buffer overflows are a plague, among the most famous and yet most elusive bugs. So you should definitely not rely on them.
Since you've learned about malloc() and friends, I suppose you're expected to make use of them.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
// Array growing step size
#define CHUNK_SIZE 8
int main(void) {
size_t arrSize = CHUNK_SIZE;
char *arr = malloc(arrSize);
if(!arr) {
fprintf(stderr, "Initial allocation failed.\n");
goto failure;
}
// One past the end of the array
// (next insertion position)
size_t arrEnd = 0u;
for(char c = '\0'; c != '\n';) {
if(scanf("%c", &c) != 1) {
fprintf(stderr, "Reading character %zu failed.\n", arrEnd);
goto failure;
}
// No more room, grow the array
// (-1) takes into account the
// nul terminator.
if(arrEnd == arrSize - 1) {
arrSize += CHUNK_SIZE;
char *newArr = realloc(arr, arrSize);
if(!newArr) {
fprintf(stderr, "Reallocation failed.\n");
goto failure;
}
arr = newArr;
// Debug output
arr[arrEnd] = '\0';
printf("> %s\n", arr);
// Debug output
}
// Append the character and
// advance the end index
arr[arrEnd++] = c;
}
// Nul-terminate the array
arr[arrEnd++] = '\0';
// Done !
printf("%s", arr);
free(arr);
return 0;
failure:
free(arr);
return 1;
}
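%ms (POSIX.1-2008) or the older glibc-specific %as can be used for this purpose if you are using gcc with glibc. Neither is part of standard C, and in C99 mode %a is a floating-point conversion, so prefer %ms.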
#include <stdio.h>
#include <stdlib.h>
int main(void){
char *s;
if (scanf("%ms", &s) != 1) return 1; // s is only allocated on success
printf("%s\n", s);
free(s);
return 0;
}
scanf is the wrong tool for this job (as for most jobs). If you are required to use this function, read one char at a time with scanf("%c", &c).
Your code misuses scanf(): you are passing arr, the address of an array of pointers to char instead of an array of char.
You should allocate an array of char with malloc, read characters into it and use realloc to extend it when it is too small, until you get a '\n' or EOF.
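The approach described above could be sketched roughly as follows. The function name read_line_scanf, the initial capacity of 16, and the doubling growth policy are my own choices for illustration, not something the assignment prescribes. It takes a FILE * so it can be tried on any stream; pass stdin for console input.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line from `in` using only fscanf("%c"),
   growing the buffer with realloc as needed. The '\n' is not stored.
   Returns a malloc'ed string the caller must free, or NULL on allocation
   failure or when nothing could be read (EOF or read error). */
char *read_line_scanf(FILE *in)
{
    size_t cap = 16, len = 0;
    char *buf = malloc(cap);
    char c;

    if (buf == NULL)
        return NULL;

    /* Stop at '\n' or when fscanf can no longer deliver a character */
    while (fscanf(in, "%c", &c) == 1 && c != '\n') {
        if (len + 1 == cap) {            /* keep one byte for '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = c;
    }

    if (len == 0 && (feof(in) || ferror(in))) {
        free(buf);                       /* nothing left to read */
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Assigning realloc's result to a temporary first means the original buffer can still be freed if the reallocation fails.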
If you can rewind stdin, you can first compute the number of chars to read with scanf("%*s%n", &n);, then allocate the destination array to n+1 bytes, rewind(stdin); and re-read the string into the buffer with scanf("%s", buf);.
It is risky business as some streams such as console input cannot be rewound.
For example:
fpos_t pos;
int n = 0;
char *buf;
fgetpos(stdin, &pos);
scanf("%*[^\n]%n", &n);
fsetpos(stdin, &pos);
buf = calloc(n+1, 1);
scanf("%[^\n]", buf);
Since you are supposed to know just some basic C, I doubt this solution is what is expected from you, but I cannot think of any other way to read an unbounded string in one step using standard C.
If you are using the glibc and may use extensions, you can do this:
scanf("%a[^\n]", &buf);
PS: all error checking and handling is purposely ignored, but should be handled in your actual assignment.
Try limiting the amount of characters accepted:
scanf("%4s", arr);
The problem is simply that you're writing beyond arr[4], the last element of the array. If you are "lucky" you keep writing into memory that belongs to the process, but it is undefined behavior either way, and if you go far enough you'll end up with a segmentation fault.
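A small sketch of the width-limited read, assuming a 5-byte buffer like the one in the question. The helper name read_word4 is hypothetical, and it takes a FILE * only so it can be tried on something other than the console.

```c
#include <stdio.h>

/* Hypothetical helper: reads at most 4 non-whitespace characters into a
   5-byte buffer. Returns 1 on success, 0 on EOF or failure. */
int read_word4(FILE *in, char out[5])
{
    /* The width (4) must be one less than the buffer size,
       because scanf appends the terminating '\0' itself. */
    return fscanf(in, "%4s", out) == 1;
}
```

Whatever does not fit stays in the stream, so a second call continues where the first one stopped rather than discarding the rest of the word.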
Consider
1) malloc() on many systems only reserves memory without actually using it. It isn't until the memory is written to that the underlying physical memory usage occurs. See Why is malloc not "using up" the memory on my computer?
2) Unbounded user input is not realistic. Given that some upper bound should be employed to guard against hackers and nefarious users, simply use a large buffer.
If your system can work with these two ideas:
char *buf = malloc(1000000);
if (buf == NULL) return NULL; // Out_of_memory
if (scanf("%999999s", buf) != 1) { free(buf); return NULL; } //EOF
// Now right-size buffer
size_t size = strlen(buf) + 1;
char *tmp = realloc(buf, size);
if (tmp == NULL) { free(buf); return NULL; } // Out_of_memory
return tmp;
Fixed up per #chqrlie comments.

How do I use scanf when I don't know how many values it will assign in C?

These are the instructions:
"Read characters from standard input until EOF (the end-of-file mark) is read. Do not prompt the user to enter text - just read data as soon as the program starts."
So the user will be entering characters, but I don't know how many. I will later need to use them to build a table that displays the ASCII code of each value entered.
How should I go about this?
This is my idea
int main(void){
int inputlist[], i = -1;
do {++i;scanf("%f",&inputlist[i]);}
while(inputlist[i] != EOF)
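You said characters, so this might be used:
char arr[10000];
int ch; // int, not char, so EOF can be told apart from valid characters
size_t i = 0;
ch = getchar();
while (ch != EOF && i < sizeof arr - 1)
{
arr[i++] = (char)ch;
ch = getchar();
}
//arr[i]=0; To make it a string, if necessary.
And to print the ASCII codes:
for (size_t j = 0; j < i; j++)
printf("%d\n", arr[j]);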
If you specifically want an integer array, use
int nums[1000];
size_t n = 0;
while (n < 1000 && scanf("%d", &nums[n]) == 1)
n++;
PS: This works only if the input consists of whitespace-separated integers.
scanf returns EOF on EOF
You have a reasonable attempt at a start to the solution, with a few errors. You can't define an array without specifying a size, so int inputlist[] shouldn't even compile. Your scanf() specifier is %f for float, which is wrong twice: once because you declared inputlist with an integer type, and again because you said your input is characters, so you should be telling scanf() to use %c or %s. And really, if you're reading input unconditionally until EOF, you should use an unformatted input function such as fgets() or fread() (or read(), if you prefer).
You'll need two things: A place to store the current chunk of input, and a place to store the input that you've already read in. Since the input functions I mentioned above expect you to specify the input buffer, you can allocate that with a simple declaration.
char input[1024];
However, for the place to store all input, you'll want something dynamically allocated. The simplest solution is to simply malloc() a chunk of storage, keep track of how large it is, and realloc() it if and when necessary.
char *all_input;
size_t pool_size = 16384;
all_input = malloc(pool_size);
Then, just loop on your input function until the return value indicates that you've hit EOF, and on each iteration of the loop, append the input data to the end of your storage area, increment a counter by the size of the input data, and check whether you're getting too close to the size of your input storage area. (And if you are, then use realloc() to grow your storage.)
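Putting those pieces together, the loop might look something like this sketch. It uses fread() as the input function; the name read_all and the doubling growth policy are assumptions of mine, and the chunk and pool sizes are just the ones from the declarations above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads everything from `in` until EOF into one
   growing heap buffer. Stores the byte count in *out_len and returns a
   NUL-terminated buffer the caller must free, or NULL on failure. */
char *read_all(FILE *in, size_t *out_len)
{
    char input[1024];              /* the current chunk of input */
    size_t pool_size = 16384;      /* capacity of the storage area */
    size_t used = 0;               /* bytes stored so far */
    size_t n;
    char *all_input = malloc(pool_size);

    if (all_input == NULL)
        return NULL;

    while ((n = fread(input, 1, sizeof input, in)) > 0) {
        /* Grow when this chunk plus the terminator would not fit.
           One doubling is enough: a chunk is far smaller than the pool. */
        if (used + n + 1 > pool_size) {
            char *tmp = realloc(all_input, pool_size * 2);
            if (tmp == NULL) {
                free(all_input);
                return NULL;
            }
            all_input = tmp;
            pool_size *= 2;
        }
        memcpy(all_input + used, input, n);   /* append the chunk */
        used += n;
    }

    if (ferror(in)) {              /* the loop ended on an error, not EOF */
        free(all_input);
        return NULL;
    }
    all_input[used] = '\0';
    *out_len = used;
    return all_input;
}
```

Call it as read_all(stdin, &len) to collect standard input; the returned buffer can then be walked byte by byte to print each character's code.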
You could read the input with getchar() until you reach EOF. Since you don't know the size of the input, you should use a dynamically sized buffer on the heap.
char *buf = NULL;
long size = 1024;
long count = 0;
int r; // getchar() returns int, so EOF can be told apart from a valid char
buf = (char *)malloc(size);
if (buf == NULL) {
fprintf(stderr, "malloc failed\n");
exit(1);
}
while( (r = getchar()) != EOF) {
buf[count++] = r;
// leave one space for '\0' to terminate the string
if (count == size - 1) {
buf = realloc(buf,size*2);
if (buf == NULL) {
fprintf(stderr, "realloc failed\n");
exit(1);
}
size = size * 2;
}
}
buf[count] = '\0';
printf("%s \n", buf);
return 0;
Here is a full solution for your needs, with comments.
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
// Number of elements
#define CHARNUM 3
int main(int argc, char **argv) {
// Allocate memory for storing input data
// We calculate requested amount of bytes by the formula:
// NumElement * SizeOfOneElement
size_t size = CHARNUM * sizeof(int);
// Call function to allocate memory
int *buffer = (int *) calloc(1, size);
// Check that calloc() returned a valid pointer
// It can: 1. Return a pointer on success or NULL on failure
// 2. Return a pointer or NULL if size is 0
// (implementation dependent).
// In that case we can't use the pointer.
if (!buffer || !size)
{
exit(EXIT_FAILURE);
}
int curr_char;
int count = 0;
while ((curr_char = getchar()) != EOF)
{
if (count >= size/sizeof(int))
{
// If we put more characters than now our buffer
// can hold, we allocate more memory
fprintf(stderr, "Reallocate memory buffer\n");
size_t tmp_size = size + (CHARNUM * sizeof(int));
int *tmp_buffer = (int *) realloc(buffer, tmp_size);
if (!tmp_buffer)
{
fprintf(stderr, "Can't allocate enough memory\n");
exit(EXIT_FAILURE);
}
size = tmp_size;
buffer = tmp_buffer;
}
buffer[count] = curr_char;
++count;
}
// Here you get buffer with the characters from
// the standard input
fprintf(stderr, "\nNow buffer contains characters:\n");
for (int k = 0; k < count; ++k)
{
fprintf(stderr, "%c", buffer[k]);
}
fprintf(stderr, "\n");
// TODO: do something with the data
// Free all resources before exit
free(buffer);
exit(EXIT_SUCCESS);
}
Compile with -std=c99 option if you use gcc.
Also, you can use the getline() function (POSIX), which reads from standard input line by line. It allocates enough memory to store each line. Just call it until end-of-file.
errno = 0;
ssize_t read = 0; // getline() returns ssize_t
char *buffer = NULL;
size_t len = 0;
while ((read = getline(&buffer, &len, stdin)) != -1)
{
// Process line
}
if (errno) {
// Handle error
}
// Process later
free(buffer); // getline() allocated it
Note that if you are using getline() you still need dynamically allocated memory: not for storing the characters, but for storing the pointers to the strings.
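As a rough sketch of that last point: getline() owns the character storage, and the only thing that has to grow by hand is the array of pointers. The function name read_lines and the starting capacity of 8 are hypothetical choices, and getline() itself requires a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L   /* makes getline() visible */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads every line of `in` with getline() and returns
   a malloc'ed array of malloc'ed strings (each keeps its '\n', as getline
   leaves it). Stores the number of lines in *count. Returns NULL on
   allocation failure. The caller frees each string and then the array. */
char **read_lines(FILE *in, size_t *count)
{
    size_t used = 0, cap = 8;
    char **lines = malloc(cap * sizeof *lines);
    char *line = NULL;    /* NULL tells getline to allocate a new buffer */
    size_t len = 0;

    if (lines == NULL)
        return NULL;

    while (getline(&line, &len, in) != -1) {
        if (used == cap) {            /* pointer array is full: double it */
            char **tmp = realloc(lines, cap * 2 * sizeof *tmp);
            if (tmp == NULL) {
                free(line);
                while (used > 0)
                    free(lines[--used]);
                free(lines);
                return NULL;
            }
            lines = tmp;
            cap *= 2;
        }
        lines[used++] = line;   /* keep the buffer getline allocated */
        line = NULL;            /* force a fresh allocation next time */
        len = 0;
    }
    free(line);                 /* getline may allocate even when it fails */
    *count = used;
    return lines;
}
```

Resetting line to NULL after storing it is what keeps each stored pointer distinct; reusing the same buffer would leave every entry pointing at the last line read.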

Resources