How to read a command output with pipe and scanf()? - c

I need to send the output of the GET command to my program and store it in a variable. Currently I'm doing it like this:
GET google.com | ./myprogram
And receiving it in my program with the following code:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main(int argc, char *argv[]){
char *a = (char *) malloc (10000000);
scanf("%[^\n]", a);
printf("%s\n",a);
return 0;
}
The problem I have is that the scanf function stops at a new line, and I need to be able to store the whole paragraph output from GET.
Any help will be appreciated. Thanks.

One possibility: Does GET include the size information in the headers? Could you use that to determine how much space to allocate, and how much data to read? That's fiddly though, and requires reading the data in dribs and drabs.
More simply, consider using POSIX (and Linux) getdelim() (a close relative of getline()) and specify the delimiter as the null byte. That's unlikely to appear in the GET output, so the whole content will be a single 'line', and getdelim() will allocate an appropriate amount of space automatically. It also tells you how long the data was.
#define _POSIX_C_SOURCE 200809L /* expose getdelim() */
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
char *buffer = 0;
size_t buflen = 0;
int length = getdelim(&buffer, &buflen, '\0', stdin);
if (length > 0)
printf("%*.*s\n", length, length, buffer);
free(buffer);
return 0;
}

scanf documentation says
These functions return the number of input items successfully
matched and assigned, which can be fewer than provided for, or even
zero in the event of an early matching failure. The value EOF is
returned if the end of input is reached before either the first
successful conversion or a matching failure occurs. EOF is also
returned if a read error occurs, in which case the error indicator for
the stream (see ferror(3)) is set, and errno is set to indicate the
error.
https://www.freebsd.org/cgi/man.cgi?query=scanf&sektion=3
Have you considered writing a loop that calls scanf, monitors its return value, and breaks out when it returns EOF?
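For example, a loop along those lines might look like this (a minimal sketch only, not the only way to do it; the 4 KiB chunk size and the 10 MB cap mirror the question's buffer and are otherwise arbitrary):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t cap = 10000000;          /* same generous size as the question */
    char *text = malloc(cap);
    char chunk[4096];
    size_t used = 0;
    int rc;

    if (text == NULL)
        return 1;

    for (;;) {
        rc = scanf("%4095[^\n]", chunk); /* 1: data, 0: empty line, EOF: done */
        if (rc == EOF)
            break;
        if (rc == 1) {
            size_t len = strlen(chunk);
            if (used + len + 2 > cap)    /* keep room for '\n' and '\0' */
                break;
            memcpy(text + used, chunk, len);
            used += len;
            if (len == 4095)             /* a long line continues: read more of it */
                continue;
        }
        if (used + 2 > cap || getchar() == EOF)  /* consume the newline */
            break;
        text[used++] = '\n';
    }
    text[used] = '\0';

    printf("%s\n", text);
    free(text);
    return 0;
}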

Consider the following readall() function implemented in standard C:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
char *readall(FILE *source, size_t *length)
{
char *data = NULL;
size_t size = 0;
size_t used = 0;
size_t n;
/* If we have a place to store the length,
we initialize it to zero. */
if (length)
*length = 0;
/* Do not attempt to read the source, if it is
already in end-of-file or error state. */
if (feof(source) || ferror(source))
return NULL;
while (1) {
/* Ensure there are unused chars in data. */
if (used >= size) {
const size_t new_size = (used | 65535) + 65537 - 32;
char *new_data;
new_data = realloc(data, new_size);
if (!new_data) {
/* Although reallocation failed, data is still there. */
free(data);
/* We just fail. */
return NULL;
}
data = new_data;
size = new_size;
}
/* Read more of the source. */
n = fread(data + used, 1, size - used, source);
if (!n)
break;
used += n;
}
/* Read error or other wonkiness? */
if (!feof(source) || ferror(source)) {
free(data);
return NULL;
}
/* Optimize the allocation. For ease of use, we append
at least one nul byte ('\0') at end. */
{
const size_t new_size = (used | 7) + 9;
char *new_data;
new_data = realloc(data, new_size);
if (!new_data) {
if (used >= size) {
/* There is no room for the nul. We fail. */
free(data);
return NULL;
}
/* There is enough room for at least one nul,
so no reason to fail. */
} else {
data = new_data;
size = new_size;
}
}
/* Ensure the buffer is padded with nuls. */
memset(data + used, 0, size - used);
/* Save length, if requested. */
if (length)
*length = used;
return data;
}
It reads everything from the specified file handle (which can be a standard stream like stdin or a pipe opened via popen()) into a dynamically allocated buffer, appends a nul byte (\0), and returns a pointer to the buffer. If the second parameter is not NULL, the actual number of characters read (not including the appended nul byte) is stored in the size_t it points to.
You can use it to read binary data output by programs, say dot -Tpng diagram.dot or image converters, or even wget -O - output (getting data from specific URLs, text or binary).
You can use this for example thus:
int main(void)
{
char *src;
size_t len;
src = readall(stdin, &len);
if (!src) {
fprintf(stderr, "Error reading standard input.\n");
return EXIT_FAILURE;
}
fprintf(stderr, "Read %zu chars.\n", len);
/* As an example, print it to standard output. */
if (len > 0)
fwrite(src, len, 1, stdout);
free(src);
return EXIT_SUCCESS;
}
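The same function also works on a pipe: instead of relying on the shell to pipe the command's output in, the program could start the command itself with popen() (a POSIX facility, not standard C). A minimal sketch reusing the readall() above; the wget command line is only an example:
#define _POSIX_C_SOURCE 200809L /* expose popen()/pclose() */
#include <stdio.h>
#include <stdlib.h>

char *readall(FILE *source, size_t *length); /* defined above */

int main(void)
{
    FILE *cmd = popen("wget -q -O - http://example.com/", "r");
    if (!cmd) {
        fprintf(stderr, "Cannot start command.\n");
        return EXIT_FAILURE;
    }

    size_t len;
    char *data = readall(cmd, &len);
    pclose(cmd);

    if (!data) {
        fprintf(stderr, "Error reading command output.\n");
        return EXIT_FAILURE;
    }
    fprintf(stderr, "Read %zu bytes.\n", len);
    fwrite(data, 1, len, stdout);
    free(data);
    return EXIT_SUCCESS;
}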
The readall() function has two quirks: it allocates memory in roughly 131072-byte chunks (but could vary if fread() were to return short reads), and pads the buffer with 7 to 15 nul bytes. (There are reasons why I like doing it this way, but it is all speculative and specific to the C libraries I tend to use, so it is not important.)
Although the ones used above work fine, you can change the new_size calculations if you prefer otherwise. Just make sure that they both are at least used + 1.

Related

Are there ways to overcome the constraints of fgets()?

The fgets() function has two problems. The first is that, if the size of the line is longer than that of the passed buffer, the line is truncated. The second is that, if the line read from the file has embedded '\0' characters, then there is no way to know the actual length of the line. I would like to get a replacement for fgets() that dynamically allocates the space for the line read and also provides the size of the line read. I have written the code for dynamically allocating the space. I am unable to figure out how to get the size of the line read. I am a beginner. Thank you so much.
#include <stdio.h>
#include <stdlib.h>
#include <error.h>
#include <errno.h>
char *myfgets(FILE *fptr, int *size);
char *myfgets(FILE *fptr, int *size) {
char *buffer;
char *ret;
buffer = (char *)malloc((*size) * sizeof(char));
if (buffer == NULL)
error(1, 0, "No memory available\n");
ret = fgets(buffer, *size, fptr);
if (ret == NULL)
error(1, 0, "Error in reading the file\n");
return ret;
}
int main(int argc, char *argv[]) {
char *file;
FILE *fptr;
int size;
char *result;
if (argc != 3)
error(1, 0, "Too many or few arguments <File_name>, <Number of bytes to read>\n");
file = argv[1];
size = atoi(argv[2]);
fptr = fopen(file, "r");
if (fptr == NULL)
error(1, 0, "Error in opening the file\n");
result = myfgets(fptr, &size);
printf("The line read is :%s", result);
free(result);
return 0;
}
Use getline(3) to read a complete line of unknown length. It allocates memory as needed to hold it all.
The function can deal with 0 bytes in the line being read too. From the linked man page (emphasis added):
On success, getline() and getdelim() return the number of characters read, including the delimiter character, but not including the terminating null byte ('\0'). This value can be used to handle embedded null bytes in the line read.
So you just have to save its return value instead of using strlen().
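For example, the question's myfgets() could be reduced to a thin wrapper around getline(); a minimal sketch, assuming a POSIX system (note that the length is reported through an ssize_t parameter here, a deliberate change from the question's int *size):
#define _POSIX_C_SOURCE 200809L /* expose getline() */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>          /* ssize_t */

/* Read one line of arbitrary length from fptr.
   Returns a malloc'ed buffer the caller must free(), or NULL on EOF/error.
   *size receives the number of bytes read (embedded '\0' bytes included,
   terminating null excluded) -- use this instead of strlen(). */
char *myfgets(FILE *fptr, ssize_t *size) {
    char *buffer = NULL;
    size_t cap = 0;
    ssize_t n = getline(&buffer, &cap, fptr);
    if (n < 0) {            /* EOF or read error */
        free(buffer);       /* getline may have allocated anyway */
        return NULL;
    }
    *size = n;
    return buffer;
}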
You have correctly identified 2 issues in fgets(), but your proposed alternative does not address either of them as you still call fgets().
You should write a loop, calling getc() repeatedly until you get EOF or '\n' and you would store the bytes read into an allocated array, reallocating as needed.
Here is a simplistic version:
// Read a full line from `fptr`
// - return `NULL` at end of file or upon read error like `fgets()`.
// - otherwise return a pointer to an allocated array containing the
// characters read, up to and including the newline and a null terminator.
// - store the number of bytes read into *plength.
// - the buffer is null terminated, and it may contain embedded null bytes
// if such bytes were read from the file
char *myfgets(FILE *fptr, size_t *plength) {
size_t length = 0;
char *buffer = NULL, *newp;
int c;
for (;;) {
if ((c = getc(fptr)) == EOF) {
if (!feof(fptr)) {
/* read error: discard data read so far and return NULL */
free(buffer);
buffer = NULL;
length = 0;
}
break;
}
if ((newp = realloc(buffer, length + 2)) == NULL) {
free(buffer);
error(1, 0, "Out of memory for realloc\n");
return NULL;
}
buffer = newp;
buffer[length] = c;
length++;
if (c == '\n')
break;
}
if (length != 0) {
buffer[length] = '\0';
}
*plength = length;
return buffer;
}
Various approaches for a "fixed" fgets():
1) Use getline(), which is not part of the standard C library, as suggested by @Shawn. Commonly available on *nix and its source code is easy enough to find. It unfortunately obliges a new type: ssize_t.
2) Roll your own getc()-based code as @chqrlie did. Corner cases can be tricky.
3) Repeatedly call fgets() as needed. Pre-fill the buffer with '\n' and look for the first occurrence of '\n', its position, next character to help determine length. (There are only a few cases to consider)
4) Repeatedly call scanf("%99[^\n]%n", buf100, &n) and getc() for the '\n' as needed. Look at the return value and n to determine length.
5) Likely others
A good functional test of the design is how well it reports these cases:
Happy path: a line was read, memory allocated, no problems.
End-of-file: Nothing read due to end of file.
Out-of-memory.
Input error occurred.
Other considerations:
Do you really want to save a '\n'?
Performance.
As for "dynamically allocates the space" with no limit: such code gives a nefarious user the ability to overwhelm memory resources by entering a pathologically long line. Rather than give a user that ability, I recommend limiting input to a sane bound. Excessively long input is an attack that should be detected, not enabled.
So I would start with
char *myfgets(FILE *fptr, size_t limit, size_t *size) {
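    /* ...and a sketch of how it might continue (my assumption of one possible
       implementation, not the answer author's code): read with getc(),
       grow the buffer with realloc(), and fail once `limit` bytes have
       been read without finding '\n'. */
    size_t length = 0, cap = 0;
    char *buffer = NULL;
    int c;
    while ((c = getc(fptr)) != EOF) {
        if (length + 2 > cap) {              /* room for this byte + '\0' */
            size_t ncap = cap ? 2 * cap : 64;
            char *nbuf = realloc(buffer, ncap);
            if (nbuf == NULL) {
                free(buffer);
                return NULL;                 /* out of memory */
            }
            buffer = nbuf;
            cap = ncap;
        }
        buffer[length++] = (char)c;
        if (length > limit) {                /* excessively long input: reject */
            free(buffer);
            return NULL;
        }
        if (c == '\n')
            break;
    }
    if (length == 0) {                       /* nothing read: EOF or error */
        free(buffer);
        return NULL;
    }
    buffer[length] = '\0';
    *size = length;
    return buffer;
}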

How should I fix this interesting getdelim / getline (dynamic memory allocation) bug?

I have this C assignment I am a bit struggling at this specific point. I have some background in C, but pointers and dynamic memory management still elude me very much.
The assignment asks us to write a program which would simulate the behaviour of the "uniq" command / filter in UNIX.
But the problem I am having is with the C library functions getline or getdelim (we need to use those functions according to the implementation specifications).
According to the specification, the user input might contain arbitrary amount of lines and each line might be of arbitrary length (unknown at compile-time).
The problem is, the following line for the while-loop
while (cap = getdelim(stream.linesArray, size, '\n', stdin))
compiles and "works" somehow when I leave it like that. What I mean by this is that, when I execute the program, I enter arbitrary amount of lines of arbitrary length per each line and the program does not crash - but it keeps looping unless I stop the program execution (whether the lines are correctly stored in " char **linesArray; " are a different story I am not sure about.
I would like to be able to do is something like
while ((cap = getdelim(stream.linesArray, size, '\n', stdin)) && (cap != -1))
so that when getdelim does not read any characters on a line (besides EOF or \n), i.e. the very first time the user enters an empty line, the program would stop taking more lines from stdin
(and then print the lines that were stored in stream.linesArray by getdelim).
The problem is, when I execute the program if I make the change I mentioned above, the program gives me "Segmentation Fault" and frankly I don't know why and how should I fix this (I have tried to do something about it so many times to no avail).
For reference:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/getdelim.html
https://en.cppreference.com/w/c/experimental/dynamic/getline
http://man7.org/linux/man-pages/man3/getline.3.html
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define DEFAULT_SIZE 20
typedef unsigned long long int ull_int;
typedef struct uniqStream
{
char **linesArray;
ull_int lineIndex;
} uniq;
int main()
{
uniq stream = { malloc(DEFAULT_SIZE * sizeof(char)), 0 };
ull_int cap, i = 0;
size_t *size = 0;
while ((cap = getdelim(stream.linesArray, size, '\n', stdin))) //&& (cap != -1))
{
stream.lineIndex = i;
//if (cap == -1) { break; }
//print("%s", stream.linesArray[i]);
++i;
if (i == sizeof(stream.linesArray))
{
stream.linesArray = realloc(stream.linesArray, (2 * sizeof(stream.linesArray)));
}
}
ull_int j;
for (j = 0; j < i; ++j)
{
printf("%s\n", stream.linesArray[j]);
}
free(stream.linesArray);
return 0;
}
Ok, so the intent is clear - use getdelim to store the lines inside an array. getline itself uses dynamic allocation. The manual is quite clear about it:
getline() reads an entire line from stream, storing the address of the
buffer containing the text into *lineptr. The buffer is
null-terminated and includes the newline character, if one was found.
The getline() "stores the address of the buffer into *lineptr". So lineptr has to be a valid pointer to a char * variable (read that twice).
*lineptr and *n will be updated
to reflect the buffer address and allocated size respectively.
Also n needs to be a valid(!) pointer to a size_t variable, so the function can update it.
Also note that the lineptr buffer:
This buffer should be freed by the user program even if getline() failed.
So what do we do? We need an array of pointers, one for each string. Because I don't like becoming a three-star programmer, I use structs. I modified your code a bit and added some checks. You'll have to excuse me, I don't like typedefs, so I don't use them; I renamed uniq to struct lines_s:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
struct line_s {
char *line;
size_t len;
};
struct lines_s {
struct line_s *lines;
size_t cnt;
};
int main() {
struct lines_s lines = { NULL, 0 };
// loop breaks on error or feof(stdin)
while (1) {
char *line = NULL;
size_t size = 0;
// we pass a pointer to a `char*` variable
// and a pointer to `size_t` variable
// `getdelim` will update the variables inside it
// the initial values are NULL and 0
ssize_t ret = getdelim(&line, &size, '\n', stdin);
if (ret < 0) {
// check for EOF
if (feof(stdin)) {
// EOF found - break
break;
}
fprintf(stderr, "getdelim error %zd!\n", ret);
abort();
}
// new line was read - add it to out container "lines"
// always handle realloc separately
void *ptr = realloc(lines.lines, sizeof(*lines.lines) * (lines.cnt + 1));
if (ptr == NULL) {
// note that lines.lines is still a valid pointer here
fprintf(stderr, "Out of memory\n");
abort();
}
lines.lines = ptr;
lines.lines[lines.cnt].line = line;
lines.lines[lines.cnt].len = size;
lines.cnt += 1;
// break if the line is "stop"
if (strcmp("stop\n", lines.lines[lines.cnt - 1].line) == 0) {
break;
}
}
// iterate over lines
for (size_t i = 0; i < lines.cnt; ++i) {
// note that the line has a newline in it
// so no additional newline is needed in this printf
printf("line %zu is %s", i, lines.lines[i].line);
}
// getdelim returns dynamically allocated strings
// we need to free them
for (size_t i = 0; i < lines.cnt; ++i) {
free(lines.lines[i].line);
}
free(lines.lines);
}
For such input:
line1 line1
line2 line2
stop
will output:
line 0 is line1 line1
line 1 is line2 line2
line 2 is stop
Tested on onlinegdb.
Notes:
if (i == sizeof(stream.linesArray)): sizeof does not magically store the size of an array. sizeof(stream.linesArray) is just sizeof(char**), i.e. the size of a pointer, usually 4 or 8 bytes depending on whether the architecture is 32-bit or 64-bit.
uniq stream = { malloc(DEFAULT_SIZE * sizeof(char)), - stream.linesArray is a char** variable. So if you want to have an array of pointers to char, you should allocate the memory for pointers malloc(DEFAULT_SIZE * sizeof(char*)).
typedef unsigned long long int ull_int; The size_t type is the type for representing array sizes or the result of sizeof(variable). The ssize_t type is sometimes used in POSIX APIs to return either a size or an error status. Use those types; there is no need for unsigned long long.
ull_int cap with cap = getdelim(...): cap is unsigned, so it will never be -1; use ssize_t, which is what getdelim actually returns, so that the -1 error return can be tested reliably.

How to make string function with user input in C?

I know how to make a function that returns an int, double, or float read from user input (I'm currently using scanf).
int getData(){
int a;
scanf("%i",&a);
return a;
}
but how do I make a function that reads user input as a string and then returns that string?
A C string is an array of char terminated by a NUL (zero) byte. Arrays are normally passed around as pointers to the first element. The problem with returning that from the function is that the address pointed to must remain valid beyond the lifetime of the function, which means it needs to be either a static buffer (which is then overwritten by any subsequent calls to the same function, breaking earlier returned values) or allocated by the function, in which case the caller is responsible for freeing it.
The scanf you mention is also problematic for reading interactive user input, e.g., it may leave the input in an unexpected state: if you don't consume the newline at the end of a line, the next call to scanf (maybe in an unrelated function) may surprisingly fail to give the expected result when it encounters that newline.
It is often simpler to read input into a buffer line-by-line, e.g., with fgets, and then parse the line from there. (Some inputs you may be able to parse without a buffer simply by reading character by character, but such code often gets long and hard to follow quickly.)
An example of reading any string, which may contain whitespace other than the newline, would be something like:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/// Read a line from stdin and return a `malloc`ed copy of it without
/// the trailing newline. The caller is responsible for `free`ing it.
char *readNewString(void) {
char buffer[1024];
if (!fgets(buffer, sizeof buffer, stdin)) {
return NULL; // read failed, e.g., EOF
}
int len = strlen(buffer);
if (len > 0 && buffer[len - 1] == '\n') {
buffer[--len] = '\0'; // remove the newline
// You may also wish to remove trailing and/or leading whitespace
} else {
// Invalid input
//
// Depending on the context you may wish to e.g.,
// consume input until newline/EOF or abort.
}
char *str = malloc(len + 1);
if (!str) {
return NULL; // out of memory (unlikely)
}
return strcpy(str, buffer); // or use `memcpy` but then be careful with length
}
Another option is to have the caller supply the buffer and its size, then just return the same buffer on success and NULL on failure. This approach has the advantage that the caller may decide when a buffer is reused and whether the string needs to be copied or simply read once and forgotten.
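A minimal sketch of that caller-supplied-buffer variant (the function name readInto is made up for illustration):
#include <stdio.h>
#include <string.h>

/* Reads one line into the caller's buffer, strips the newline,
   and returns buf on success or NULL on EOF/read error. */
char *readInto(char *buf, size_t bufsize, FILE *in) {
    if (!fgets(buf, (int)bufsize, in))
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* remove trailing newline, if any */
    return buf;
}

/* Usage:
   char name[64];
   if (readInto(name, sizeof name, stdin))
       printf("Hello, %s\n", name);
*/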
Extending Arkku's approach to input of unlimited size (in fact it is limited to SIZE_MAX - 1 characters):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#define BUFFER_MAX (256)
int read_string(FILE * pf, char ** ps)
{
int result = 0;
if (!ps)
{
result = -1;
errno = EINVAL;
}
else
{
char buffer[BUFFER_MAX];
size_t len = 0;
*ps = NULL;
while (NULL != fgets(buffer, sizeof buffer, pf))
{
size_t chunk = strlen(buffer);
{
void * p = realloc(*ps, len + chunk + 1);
if (!p)
{
int es = errno;
result = -1;
free(*ps);
*ps = NULL;
errno = es;
break;
}
*ps = p;
}
strcpy(&(*ps)[len], buffer); /* append at the previous end */
len += chunk;
}
if (ferror(pf))
{
result = -1;
}
}
return result;
}
Call it like this:
#include <stdlib.h>
#include <stdio.h>
int read_string(FILE * pf, char ** ps);
int main(void)
{
char * p;
if (-1 == read_string(stdin, &p)) /* This reads from standard input,
but any FILE * would do here. */
{
perror("read_string() failed");
exit(EXIT_FAILURE);
}
printf("Read: '%s'\n", p);
free(p); /* Clean up. */
}

Program crashing at realloc

Problem
I am currently writing a small (and bad) grep-like program for Windows. In it I want to read files line by line and print out the ones which contain a key. For this to work I need a function which reads each line of a file. Since I am not on Linux I cannot use the getline function and have to implement it myself.
I have found an SO answer where such a function is implemented. I tried it out and it works fine for 'normal' text files. But the program crashes if I try to read a file with a line length of 13 000 characters.
MCVE
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
char * getline(FILE *f)
{
size_t size = 0;
size_t len = 0;
size_t last = 0;
char *buf = NULL;
do {
size += BUFSIZ; /* BUFSIZ is defined as "the optimal read size for this platform" */
buf = realloc(buf, size); /* realloc(NULL,n) is the same as malloc(n) */
/* Actually do the read. Note that fgets puts a terminal '\0' on the
end of the string, so we make sure we overwrite this */
if (buf == NULL) return NULL;
fgets(buf + last, size, f);
len = strlen(buf);
last = len - 1;
} while (!feof(f) && buf[last] != '\n');
return buf;
}
int main(int argc, char *argv[])
{
FILE *file = fopen(argv[1], "r");
if (file == NULL)
return 1;
while (!feof(file))
{
char *line = getline(file);
if (line != NULL)
{
printf("%s", line);
free(line);
}
}
return 0;
}
This is the file I am using. It contains three short lines which get read just fine and a long one from one of my Qt projects. When reading this line the getline function reallocates 2 times to a size of 1024 and crashes at the 3rd time. I've put printf around the realloc to make sure it crashes there and it definitely does.
Question
Could anyone explain to me why my program is crashing like that? I have just spent hours on this and don't know what to do anymore.
In this fragment
size += BUFSIZ;
buf = realloc(buf, size);
if (buf == NULL) return NULL;
fgets(buf + last, size, f);
you add BUFSIZ to size and allocate that much, but you then pass that same – increased! – size to fgets. In essence, you allow fgets to read more characters than you have allocated on each turn. The first time around, size = BUFSIZ and you read at most BUFSIZ characters into a BUFSIZ-byte buffer, which is fine. If the line is longer than that (the last character is not \n), you grow the buffer (size += BUFSIZ) but you also pass its (new) total size to fgets again, even though fgets now starts writing at buf + last, after the part you have already filled.
The allocated memory grows by BUFSIZ per loop, but the size passed to fgets also grows: after one loop it is BUFSIZ, after two loops 2*BUFSIZ, and so on, while the space actually remaining at buf + last stays at roughly BUFSIZ – until something important gets overwritten and the program is terminated.
If you only let fgets read chunks of at most BUFSIZ bytes (the amount of space you just added), this should work.
Note that your code expects the last line to end with an \n, which may not always be true. You can catch this with an additional test:
if (!fgets(buf + last, size, f))
break;
so your code won't be trying to read past the end of the input file.
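Putting both fixes together, a corrected version might look like this (a sketch that keeps the question's structure; note that on POSIX systems the name getline clashes with the library function of the same name, so it would need renaming there):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *getline(FILE *f)
{
    size_t size = 0;
    size_t last = 0;
    char *buf = NULL;

    do {
        size += BUFSIZ;
        char *tmp = realloc(buf, size);
        if (tmp == NULL) {                 /* keep buf so it is not leaked */
            free(buf);
            return NULL;
        }
        buf = tmp;
        /* Let fgets fill at most the BUFSIZ bytes just added,
           starting right after the part already read. */
        if (!fgets(buf + last, BUFSIZ, f)) {
            if (last == 0) {               /* nothing read at all */
                free(buf);
                return NULL;
            }
            return buf;                    /* last line had no '\n' */
        }
        last += strlen(buf + last);
    } while (buf[last - 1] != '\n');

    return buf;
}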

C - cannot read and process a list of strings from a text file into an array

This code reads a text file line by line. But I need to put those lines into an array, and I wasn't able to do it. Now I am somehow getting an array of numbers. So how do I read the file into a list? I tried using a 2-dimensional list but that doesn't work either.
I am new to C. I am mostly using Python but now I want to check if C is faster or not for a task.
#include <stdio.h>
#include <time.h>
#include <string.h>
void loadlist(char *ptext) {
char filename[] = "Z://list.txt";
char myline[200];
FILE * pfile;
pfile = fopen (filename, "r" );
char larray[100000];
int i = 0;
while (!feof(pfile)) {
fgets(myline,200,pfile);
larray[i]= myline;
//strcpy(larray[i],myline);
i++;
//printf(myline);
}
fclose(pfile);
printf("%s \n %d \n %d \n ","while doneqa",i,strlen(larray));
printf("First larray element is: %d \n",larray[0]);
/* for loop execution */
//for( i = 10; i < 20; i = i + 1 ){
// printf(larray[i]);
//}
}
int main ()
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
char *ptext = "String";
loadlist(ptext);
time(&etime);
printf("time to load: %f \n", difftime(etime, stime));
return(0);
}
This code reads a text file line by line. But I need to put those lines in an array but I wasn't able to do it. Now I am getting an array of numbers somehow.
There are many ways to do this correctly. To begin with, first sort out what it is you actually need/want to store, then figure out where that information will come from, and finally decide how you will provide storage for the information. In your case loadlist is apparently intended to load a list of lines (up to 10000) so that they are accessible through your statically declared array of pointers. (You can also allocate the pointers dynamically, but if you know you won't need more than X of them, statically declaring them is fine, up to the point you cause StackOverflow...)
Once you read the line in loadlist, you need to provide adequate storage to hold the line (plus the nul-terminating character). Otherwise, you are just counting the number of lines. In your case, since you declare an array of pointers, you cannot simply copy the line you read, because each of the pointers in your array does not yet point to any allocated block of memory. (You can't just assign the address of the buffer you read the line into with fgets (buffer, size, FILE*), because (1) it is local to your loadlist function and will go away when the function's stack frame is destroyed on return; and (2) it obviously gets overwritten with each call to fgets anyway.)
So what to do? That's pretty simple too: just allocate storage for each line as it is read, using the strlen of each line as @iharob says (+1 for the nul-byte), and call malloc to allocate a block of memory that size. You can then simply copy the read buffer to the newly allocated block and assign the pointer to your list (e.g. larray[x] in your code). The GNU extensions provide a strdup function that both allocates and copies, but understand that it is not part of the C99 standard, so you can run into portability issues. (Also note you would use memmove rather than memcpy if overlapping regions of memory were a concern, but we will ignore that here since you are reading lines from a file.)
What are the rules for allocating memory? Well, you allocate with malloc, calloc or realloc and then you VALIDATE that your call to those functions succeeded before proceeding or you have just entered the realm of undefined behavior by writing to areas of memory that are NOT in fact allocated for your use. What does that look like? If you have your array of pointers p and you want to store a string from your read buffer buf of length len at index idx, you could simply do:
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* handle error condition */
Now you are free to allocate before you test as follows, but it is convenient to make the assignment as part of the test. The long form would be:
p[idx] = malloc (len + 1); /* allocate storage */
if (p[idx] == NULL) /* validate/handle error condition */
return NULL;
strcpy (p[idx], buf); /* copy buf to storage */
How you want to do it is up to you.
Now you also need to protect against reading beyond the end of your pointer array. (you only have a fixed number since you declared the array statically). You can make that check part of your read loop very easily. If you have declared a constant for the number of pointers you have (e.g. PTRMAX), you can do:
int idx = 0; /* index */
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
...
idx++;
}
By checking the index against the number of pointers available, you insure you cannot attempt to assign address to more pointers than you have.
There is also the unaddressed issue of handling the '\n' that will be contained at the end of your read buffer. Recall, fgets reads up to and including the '\n'. You do not want newline characters dangling off the ends of the strings you store, so you simply overwrite the '\n' with a nul-terminating character (e.g. simply decimal 0 or the equivalent nul-character '\0' -- your choice). You can make that a simple test after your strlen call, e.g.
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
size_t len = strlen (buf); /* get length */
if (buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
...
(note: that also brings up the issue of reading a line longer than the 200 characters you allocate for your read buffer. You check whether a complete line has been read by checking whether fgets included the '\n' at the end; if it didn't, you know your next call to fgets will be reading again from the same line, unless EOF is encountered. In that case you would simply need to realloc your storage and append the additional characters to that same line -- see the sketch below.)
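One way that long-line handling might look, as a separate helper (a sketch of my own, not part of the original answer; it assumes the same LNMAX constant and includes as the program below):
/* Read one whole line from fp, however long, by accumulating
   LNMAX-sized chunks. Returns a malloc'ed string without the
   trailing '\n' (caller frees), or NULL on EOF/allocation failure. */
char *readline_any_length (FILE *fp)
{
    char buf[LNMAX];
    char *line = NULL;
    size_t linelen = 0;

    while (fgets (buf, LNMAX, fp)) {
        size_t len = strlen (buf);
        char *tmp = realloc (line, linelen + len + 1);
        if (!tmp) {
            free (line);
            return NULL;
        }
        line = tmp;
        memcpy (line + linelen, buf, len + 1);      /* chunk + nul byte */
        linelen += len;
        if (linelen && line[linelen - 1] == '\n') { /* complete line read */
            line[--linelen] = 0;                    /* strip the '\n' */
            break;
        }
    }
    return line;    /* NULL if nothing was read (EOF) */
}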
If you put all the pieces together and choose a return type for loadlist that can indicate success/failure, you could do something similar to the following:
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
size_t len = strlen (buf); /* get length */
if (buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
note: you could just as easily change the return type to int and return the number of lines read, or pass a pointer to int (or better yet size_t) as a parameter to make the number of lines stored available back in the calling function.
However, in this case, we have used the initialization of all pointers in your array of pointers to NULL, so back in the calling function we need only iterate over the pointer array until the first NULL is encountered in order to traverse our list of lines. Putting together a short example program that read/stores all lines (up to PTRMAX lines) from the filename given as the first argument to the program (or from stdin if no filename is given), you could do something similar to:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
enum { LNMAX = 200, PTRMAX = 10000 };
char **loadlist (char **p, FILE *fp);
int main (int argc, char **argv) {
time_t stime, etime;
char *list[PTRMAX] = { NULL }; /* array of ptrs initialized NULL */
size_t n = 0;
FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) { /* validate file open for reading */
fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
return 1;
}
printf ("Starting of the program...\n");
time (&stime);
if (loadlist (list, fp)) { /* read lines from fp into list */
time (&etime);
printf("time to load: %f\n\n", difftime (etime, stime));
}
else {
fprintf (stderr, "error: loadlist failed.\n");
return 1;
}
if (fp != stdin) fclose (fp); /* close file if not stdin */
while (list[n]) { /* output stored lines and free allocated mem */
printf ("line[%5zu]: %s\n", n, list[n]);
free (list[n++]);
}
return(0);
}
/** read up to PTRMAX lines from 'fp', allocate/save in 'p'.
* storage is allocated for each line read and pointer
* to allocated block is stored at 'p[x]'. (you should
* add handling of lines greater than LNMAX chars)
*/
char **loadlist (char **p, FILE *fp)
{
int idx = 0; /* index */
char buf[LNMAX] = ""; /* read buf */
while (fgets (buf, LNMAX, fp) && idx < PTRMAX) {
size_t len = strlen (buf); /* get length */
if (buf[len-1] == '\n') /* check for trailing '\n' */
buf[--len] = 0; /* overwrite '\n' with nul-byte */
/* else { handle read of line longer than 200 chars }
*/
if ((p[idx] = malloc (len + 1))) /* allocate storage */
strcpy (p[idx], buf); /* copy buf to storage */
else
return NULL; /* indicate error condition in return */
idx++;
}
return p; /* return pointer to list */
}
Finally, in any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address of the block of memory so that (2) it can be freed when it is no longer needed.
Use a memory error checking program to ensure you haven't written beyond/outside your allocated block of memory, attempted to read or base a jump on an uninitialized value, and finally to confirm that you have freed all the memory you have allocated.
For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.
Look things over, let me know if you have any further questions.
It's natural that you see numbers, because you are printing a single character using the "%d" specifier. In fact, strings in C are pretty much that: arrays of numbers, where the numbers are the ascii values of the corresponding characters. If you instead use "%c" you will see the character that each of those numbers represents.
Your code also calls strlen() on something that is intended as an array of strings; strlen() is used to compute the length of a single string, a string being an array of non-zero char values terminated by a 0. Thus, strlen() is surely causing undefined behavior.
Also, if you want to store each string, you need to copy the data like you tried in the commented line with strcpy(), because the array you are using for reading lines is overwritten over and over in each iteration.
Your compiler must be throwing all kinds of warnings; if it's not, then it's your fault: you should let the compiler know that you want it to do some diagnostics to help you find common problems like assigning a pointer to a char.
You should fix multiple problems in your code; here is a version that fixes most of them:
void
loadlist(const char *const filename) {
char line[100];
FILE *file;
// We can only read 100 lines, of
// max 99 characters each
char array[100][100];
int size;
size = 0;
file = fopen (filename, "r" );
if (file == NULL)
return;
while ((fgets(line, sizeof(line), file) != NULL) && (size < 100)) {
strcpy(array[size++], line);
}
fclose(file);
for (int i = 0 ; i < size ; ++i) {
printf("array[%d] = %s", i + 1, array[i]);
}
}
int
main(void)
{
time_t stime, etime;
printf("Starting of the program...\n");
time(&stime);
loadlist("Z:\\list.txt");
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Just to prove how complicated it can be in C, check this out:
#include <stdio.h>
#include <time.h>
#include <string.h>
#include <stdlib.h>
struct string_list {
char **items;
size_t size;
size_t count;
};
void
string_list_print(struct string_list *list)
{
// Simply iterate through the list and
// print every item
for (size_t i = 0 ; i < list->count ; ++i) {
fprintf(stdout, "item[%zu] = %s\n", i + 1, list->items[i]);
}
}
struct string_list *
string_list_create(size_t size)
{
struct string_list *list;
// Allocate space for the list object
list = malloc(sizeof *list);
if (list == NULL) // ALWAYS check this
return NULL;
// Allocate space for the items
// (starting with `size' items)
list->items = malloc(size * sizeof *list->items);
if (list->items != NULL) {
// Update the list size because the allocation
// succeeded
list->size = size;
} else {
// Be optimistic, maybe realloc will work next time
list->size = 0;
}
// Initialize the count to 0, because
// the list is initially empty
list->count = 0;
return list;
}
int
string_list_append(struct string_list *list, const char *const string)
{
// Check if there is room for the new item
if (list->count + 1 >= list->size) {
char **items;
// Resize the array, there is no more room
items = realloc(list->items, 2 * list->size * sizeof *list->items);
if (items == NULL)
return -1;
// Now update the list
list->items = items;
list->size += list->size;
}
// Copy the string into the array we simultaneously
// increase the `count' and copy the string
list->items[list->count++] = strdup(string);
return 0;
}
void
string_list_destroy(struct string_list *const list)
{
// `free()' does work with a `NULL' argument
// so perhaps as a principle we should too
if (list == NULL)
return;
// If the `list->items' was initialized, attempt
// to free every `strdup()'ed string
if (list->items != NULL) {
for (size_t i = 0 ; i < list->count ; ++i) {
free(list->items[i]);
}
free(list->items);
}
free(list);
}
struct string_list *
loadlist(const char *const filename) {
char line[100]; // A buffer for reading lines from the file
FILE *file;
struct string_list *list;
// Create a new list, initially it has
// room for 100 strings, but it grows
// automatically if needed
list = string_list_create(100);
if (list == NULL)
return NULL;
// Attempt to open the file
file = fopen (filename, "r");
// On failure, we now have the responsibility
// to cleanup the allocated space for the string
// list
if (file == NULL) {
string_list_destroy(list);
return NULL;
}
// Read lines from the file until there are no more
while (fgets(line, sizeof(line), file) != NULL) {
char *newline;
// Remove the trailing '\n'
newline = strchr(line, '\n');
if (newline != NULL)
*newline = '\0';
// Append the string to the list
string_list_append(list, line);
}
fclose(file);
return list;
}
int
main(void)
{
time_t stime, etime;
struct string_list *list;
printf("Starting of the program...\n");
time(&stime);
list = loadlist("Z:\\list.txt");
if (list != NULL) {
string_list_print(list);
string_list_destroy(list);
}
time(&etime);
printf("Time to load: %f\n", difftime(etime, stime));
return 0;
}
Now, this will work almost like the Python code you say you wrote, but it will certainly be faster; there is absolutely no doubt.
It is possible for an experienced Python programmer to write a Python program that runs faster than that of an inexperienced C programmer. Learning C is still really worthwhile, because you then understand how things really work, and you can infer how a Python feature is probably implemented, so that understanding can be very useful.
Although it's certainly way more complicated than doing the same in Python, note that I wrote this in nearly 10 minutes. So if you really know what you're doing and you really need it to be fast, C is certainly an option, but you need to learn many concepts that programmers of higher-level languages are not usually exposed to.
