I'm trying to print an array with 50k items into a file, but it only works if I use a small number of items, e.g. 5k.
void fputsArray(int *arr, int size, char *filename)
{
char *string = (char*)calloc( (8*size+1), sizeof(char) );
for(int i = 0; i < size; i++)
sprintf( &string[ strlen(string) ], "%d\n", arr[i] );
FILE *output;
char fullFilename[50] = "./";
output = fopen(strcat(fullFilename, filename), "w");
fputs(string, output);
fclose(output);
free(string);
}
size is 50000, set with #define.
This is working code. But if I remove the multiplication of size by 8, which I expected to work, it doesn't: I get Segmentation fault: 11.
Why should I allocate 8 times more memory than I need?
Assuming the input to your function is all correct:
void fputsArray(int *arr, int size, char *filename)
Sizes should be given as size_t.
{
char *string = (char*)calloc( (8*size+1), sizeof(char) );
The clearing of the memory (calloc) is unnecessary, malloc and setting string[0] = '\0' would suffice. sizeof( char ) is always 1 by definition. And you should not cast the result of an allocation.
Actually, the whole construct is unnecessary, but that's for later.
for(int i = 0; i < size; i++)
sprintf( &string[ strlen(string) ], "%d\n", arr[i] );
Not actually that bad, aside from string + strlen( string ) being simpler and that there should always be { } surrounding the statement. Still unnecessarily complex.
FILE *output;
char fullFilename[50] = "./";
output = fopen(strcat(fullFilename, filename), "w");
A filename is always relative to the current working directory, so the "./" is unnecessary. You should however have checked the filename length before strcating it into a static buffer like that.
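If the prefix is kept anyway, the length check mentioned above can be done by letting snprintf() build the path and report truncation. This is only a sketch; build_path() is a hypothetical helper name, not something from the original code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Builds "./<filename>" in dst.
   Returns 0 on success, -1 if the result does not fit in dst_size bytes
   (dst then holds a truncated path that should not be used). */
int build_path(char *dst, size_t dst_size, const char *filename)
{
    assert(dst != NULL && filename != NULL);
    int n = snprintf(dst, dst_size, "./%s", filename);
    if (n < 0 || (size_t)n >= dst_size) {
        return -1;  /* encoding error, or the path was cut short */
    }
    return 0;
}
```

The caller would only pass dst to fopen() when build_path() returned 0.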
fputs(string, output);
Ah, but you have not checked if the fopen actually succeeded!
fclose(output);
free(string);
}
All in all, I've seen worse. Whether your numbers actually fit your buffer is guesswork, though, and most importantly the whole memory shenanigans are unnecessary.
Consider:
void printArray( int const * arr, size_t size, char const * filename )
{
FILE * output = fopen( filename, "w" );
if ( output != NULL )
{
for ( size_t i = 0; i < size; ++i )
{
fprintf( output, "%d\n", arr[i] );
}
fclose( output );
}
else
{
perror( "File open failed" );
}
}
I think this is much better than trying to figure out where your memory guesswork went wrong.
Edit: On second thought, I would have that function take a FILE * argument instead of a filename, which would give you the flexibility of printing to an already-opened stream (like stdout) as well, and also let you do the error handling of the fopen in a place higher up that might have additional capabilities to give useful information.
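A minimal sketch of that variant, assuming the caller opens and closes the stream; the name print_array() and the 0 / -1 return convention are my own choices for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes one value per line to an already-open stream.
   Returns 0 on success, -1 as soon as a write fails. */
int print_array(FILE *out, const int *arr, size_t size)
{
    assert(out != NULL);
    for (size_t i = 0; i < size; ++i) {
        if (fprintf(out, "%d\n", arr[i]) < 0) {
            return -1;
        }
    }
    return 0;
}
```

With this signature, print_array(stdout, values, n) prints to the terminal, and the fopen() error handling stays with the caller, which usually knows best what to report.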
size is 50000, set with #define. This is working code. But if I remove the multiplication of size by 8, which I expected to work, it doesn't: I get Segmentation fault: 11. Why should I allocate 8 times more memory than I need?
You are asking about this size estimate:
char *string = (char*)calloc( (8*size+1), sizeof(char) );
But the array in use is int[], and you write one value per line, as in
sprintf( &string[ strlen(string) ], "%d\n", arr[i] );
This seems unnecessarily complicated. As for the size, assume every value is INT_MIN, which is defined (in limits.h) as
#define INT_MIN (-2147483647 - 1)
for a 4-byte integer. So you have 11 chars: 10 digits plus one symbol for the sign. This covers any int value. Add 1 for the '\n', which gives 12 bytes per line.
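The per-line figure can also be computed instead of counted by hand. A small sketch, assuming a C99 snprintf(); the helper names worst_case_line_len() and buffer_size_for() are made up for this illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* Number of characters "%d\n" produces for the longest possible int.
   snprintf() with a NULL buffer and size 0 only counts, it writes nothing. */
size_t worst_case_line_len(void)
{
    int n = snprintf(NULL, 0, "%d\n", INT_MIN);
    assert(n > 0);
    return (size_t)n;
}

/* Bytes needed to hold `count` such lines plus the terminating '\0'. */
size_t buffer_size_for(size_t count)
{
    return count * worst_case_line_len() + 1;
}
```

With a 4-byte int, worst_case_line_len() is 12, so 50000 values need at most 600001 bytes, which is why 8 * size + 1 is only safe when the values happen to be short.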
But...
Why use calloc() at all?
Why not just use a size * 12-byte array that would fit every possible value?
Why declare a new char* to hold the values in text form instead of just using fprintf() directly?
Why void instead of returning something like -1 for error, or the number of items written to disk in case of success?
Back to the program
If you really want to write the array to disk in a single call to fputs(), holding the whole giant string in memory, consider that sprintf() returns the number of characters written, so this is the value you need to advance your position in the output string...
If you want to use memory allocation you can do it in blocks. Consider that if all values are between 0 and 999, the 50,000 lines would have no more than 4 bytes each. But if all values are equal to INT_MIN you will have the maximum of 12 bytes per line.
So you can use the return value of sprintf() to update the pointer into the string, and use realloc() when needed, allocating, let's say, in blocks of a few kilobytes. (If you really want to do that, write back and I can post an example.)
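A rough sketch of that block-wise approach, not the author's promised example. The names format_array(), MAX_LINE_CHARS and GROW_BY are assumptions chosen for this illustration, and snprintf() is used in place of sprintf() so a miscounted buffer cannot be overrun:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_LINE_CHARS 32  /* assumed: generous room for one "%d\n" line */
#define GROW_BY 4096       /* assumed: bytes added by each realloc() */

/* Formats arr[0..size-1], one value per line, into a heap string.
   Returns the string (caller frees) or NULL on failure; *len_out, if not
   NULL, receives the length without the terminating '\0'. */
char *format_array(const int *arr, size_t size, size_t *len_out)
{
    assert(arr != NULL || size == 0);
    size_t cap = GROW_BY, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) return NULL;
    buf[0] = '\0';

    for (size_t i = 0; i < size; ++i) {
        if (cap - len < MAX_LINE_CHARS) {        /* not enough room left */
            char *tmp = realloc(buf, cap + GROW_BY);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap += GROW_BY;
        }
        int used = snprintf(buf + len, cap - len, "%d\n", arr[i]);
        if (used < 0) { free(buf); return NULL; }
        len += (size_t)used;                     /* advance by what was written */
    }
    if (len_out != NULL) *len_out = len;
    return buf;
}
```

Because the write position is tracked in len, there is no strlen() over the growing string on every iteration, and the result can be passed to a single fputs() call.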
C Example
The code below writes the file the way you tried to, and returns the total number of bytes written. That total depends on the values in the array. The maximum is what I said, 12 bytes per line...
int fputsArray( unsigned size, int* array , const char* filename)
{
static char string[12 * MY_SIZE_ + 1] = {0}; // 12 chars per value, plus the final '\0'
unsigned ix = 0; // pointer to the next char to use in string
FILE* output = fopen( filename, "w");
if ( output == NULL ) return -1;
// file is open
for(int i = 0; i < size; i+= 1)
{
unsigned used = sprintf( (string + ix), "%d\n", array[i] );
ix += used;
}
fputs(string, output);
fclose(output);
return ix;
}
Using fprintf()
This code writes the same file, using fprintf() and is way simpler...
int fputsArray_b( unsigned size, int* array , const char* filename)
{
unsigned ix = 0; // bytes written
FILE* output = fopen( filename, "w");
if ( output == NULL ) return -1;
// file is open
for(int i = 0; i < size; i+= 1)
ix += fprintf( output, "%d\n", array[i]);
fclose(output);
return ix;
}
A complete test with the 2 functions
#define MY_SIZE_ 50000
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
int fputsArray(const unsigned,int*,const char*);
int fputsArray_b(const unsigned,int*,const char*);
int main(void)
{
int value[MY_SIZE_];
srand(210726); // seed for today :)
value[0] = INT_MIN; // just to test: this is the longest value
for ( int i=1; i<MY_SIZE_; i+=1 ) value[i] = rand();
int used = fputsArray( MY_SIZE_, value, "test.txt");
printf("%d bytes written to disk\n", used );
used = fputsArray_b( MY_SIZE_, value, "test_b.txt");
printf("%d bytes written to disk using the alternate function\n", used );
return 0;
}
int fputsArray( unsigned size, int* array , const char* filename)
{
static char string[12 * MY_SIZE_ + 1] = {0}; // 12 chars per value, plus the final '\0'
unsigned ix = 0; // pointer to the next char to use in string
FILE* output = fopen( filename, "w");
if ( output == NULL ) return -1;
// file is open
for(int i = 0; i < size; i+= 1)
{
unsigned used = sprintf( (string + ix), "%d\n", array[i] );
ix += used;
}
fputs(string, output);
fclose(output);
return ix;
}
int fputsArray_b( unsigned size, int* array , const char* filename)
{
unsigned ix = 0; // bytes written
FILE* output = fopen( filename, "w");
if ( output == NULL ) return -1;
// file is open
for(int i = 0; i < size; i+= 1)
ix += fprintf( output, "%d\n", array[i]);
fclose(output);
return ix;
}
The program writes 2 identical files...
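One way to check that claim is to compare the two files byte by byte. A small sketch; same_contents() is a hypothetical helper, not part of the program above:

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if both streams yield the same bytes up to EOF, else 0. */
int same_contents(FILE *a, FILE *b)
{
    assert(a != NULL && b != NULL);
    int ca, cb;
    do {
        ca = getc(a);
        cb = getc(b);
    } while (ca == cb && ca != EOF);
    return ca == cb;  /* equal only if both hit EOF together */
}
```

It would be called with the two files opened for reading, e.g. fopen("test.txt", "rb") and fopen("test_b.txt", "rb").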
I'm trying to implement a little program that takes a text, breaks it into lines and sorts them in alphabetical order, but I encountered a little problem. I have a readlines function which updates an array of pointers called lines. The problem is that when I try to printf the first pointer in lines as a string using %s, nothing is printed and there are no errors.
I have also tried using strcpy to copy every single text line (a local char array) into a pointer variable and then storing that pointer in the lines array, but that gave me an error.
Here is the code:
#include <stdio.h>
#define MAXLINES 4
#define MAXLENGTH 1000
char *lines[MAXLINES];
void readlines() {
int i;
for (i = 0; i < MAXLINES; i++) {
char c, line[MAXLENGTH];
int j;
for (j = 0; (c = getchar()) != '\0' && c != '\n' && j < MAXLENGTH; j++) {
line[j] = c;
}
lines[i] = line;
}
}
int main(void) {
readlines();
printf("%s", lines[0]);
getchar();
return 0;
}
One problem is the following line:
lines[i] = line;
In this line, you make lines[i] point to line. However, line is a local char array whose lifetime ends as soon as the current loop iteration ends. Therefore, lines[i] will contain a dangling pointer (i.e. a pointer to an object that is no longer valid) as soon as the loop iteration ends.
For this reason, when you later call
printf("%s", lines[0]);
lines[0] is pointing to an object whose lifetime has ended. Dereferencing such a pointer invokes undefined behavior. Therefore, you cannot rely on getting any meaningful output, and your program may crash.
One way to fix this would be to not make lines an array of pointers, but rather a multidimensional array of char, i.e. an array of strings:
char lines[MAXLINES][MAXLENGTH+1];
Now you have a proper place for storing the strings, and you no longer need the local array line in the function readlines.
Another issue is that the line
printf("%s", lines[0]);
requires that lines[0] points to a string, i.e. to an array of characters terminated by a null character. However, you did not put a null character at the end of the string.
After fixing all of the issues mentioned above, your code should look like this:
#include <stdio.h>
#define MAXLINES 4
#define MAXLENGTH 1000
char lines[MAXLINES][MAXLENGTH+1];
void readlines() {
int i;
for (i = 0; i < MAXLINES; i++) {
char c;
int j;
for (j = 0; (c = getchar()) != '\0' && c != '\n' && j < MAXLENGTH; j++) {
lines[i][j] = c;
}
//add terminating null character
lines[i][j] = '\0';
}
}
int main(void) {
readlines();
printf("%s", lines[0]);
return 0;
}
However, this code still has a few issues, which are probably unrelated to your immediate problem, but could cause trouble later:
The function getchar will return EOF, not '\0', when there is no more data (or when an error occurred). Therefore, you should compare the return value of getchar with EOF instead of '\0'. However, a char is not guaranteed to be able to store the value of EOF. Therefore, you should store the return value of getchar in an int instead. Note that getchar returns a value of type int, not char.
When j reaches MAXLENGTH, you will call getchar one additional time before terminating the loop. This can cause undesired behavior, such as your program waiting for more user input or an important character being discarded from the input stream.
In order to also fix these issues, I recommend the following code:
#include <stdio.h>
#define MAXLINES 4
#define MAXLENGTH 1000
char lines[MAXLINES][MAXLENGTH+1];
void readlines() {
int i;
for (i = 0; i < MAXLINES; i++)
{
//changed type from "char" to "int"
int c;
int j;
for ( j = 0; j < MAXLENGTH; j++ )
{
if ( (c = getchar()) == EOF || c == '\n' )
break;
lines[i][j] = c;
}
//add terminating null character
lines[i][j] = '\0';
}
}
int main(void) {
readlines();
printf("%s", lines[0]);
return 0;
}
Problem 1
char *lines[MAXLINES];
For the compiler it makes no difference how you write this, but for you, as you are learning C, it may be worth considering different spacing and naming. The question is: what is lines[]? It is supposed to be an array of strings and hold some text, so lines[0] is a string, lines[1] is a string and so on. As pointed out in a comment, you could also use char lines[MAXLINES][MAXLENGTH] and have a 2D box of NxM char. That gives you a predetermined number and size of lines and keeps things simple, with no need to allocate memory, at the cost of wasting space in lines shorter than MAXLENGTH chars and of a fixed limit on the number of lines.
A more flexible way is to use an array of pointers. Since each pointer will represent a line, the singular name in
char* line[MAXLINES];
is a better picture of the use: line[0] is char*, line[1] is char* and so on. But you will need to allocate memory for each line, and your code does not.
Remember int main(int argc, char**argv)
This is the most flexible way, since in this way you can hold any number of lines. The cost? Additional allocations.
size_t n_lines;
char** line;
This may be the best representation, as known by every C programmer since K&R.
Problem 2
for (
j = 0;
(c = getchar()) != '\0' && c != '\n' && j < MAXLENGTH;
j++) {
line[j] = c;
}
lines[i] = line;
This loop does not store the final 0 that terminates each string, and it reuses the same line, a char[], to hold the data as it is read. The final statement does not copy a string either, even if one existed there: it only stores the address of line. And there is no string, since the terminating 0 was never written, and no lasting data either, since the same area is reused for every line.
A complete C example of loading a file into a container in memory
I will leave an example of a more controlled way of writing this: a container for a set of lines, and even a sorting function.
a data structure
The plan is to build an array of pointers as the system does for main. Since we do not know the number of lines ahead of time and do not want this limitation, we will allocate memory in groups of blk_size lines. At any time we have limit pointers available, of which size are in use. Each line[i] is a char* and points to a single line of text. The struct is
typedef struct
{
size_t blk_size; // block
size_t limit; // actual allocated size
size_t size; // size in use
char** line; // the lines
} Block;
the test function
Block* load_file(const char*);
The plan is to call load_file("x.txt"); the function returns a Block* pointing to the array representing the lines in the file, one by one. Then we call qsort() and sort the whole thing. If the program is called lines we will run
lines x.txt
and it will load the file x.txt, show its contents on screen, sort it, show the sorted lines and then erase everything at exit.
main() for the test
int main(int argc, char** argv)
{
char msg[80] = {0};
if (argc < 2) usage();
Block* test = load_file(argv[1]);
sprintf(msg, "==> Loading \"%s\" into memory", argv[1]);
status_blk(test, msg);
qsort(test->line, test->size, sizeof(void*), cmp_line);
sprintf(msg, "==> \"%s\" after sort", argv[1]);
status_blk(test, msg);
test = delete_blk(test);
return 0;
};
As planned
load_file() is the constructor and loads the file contents into a Block.
status_blk() shows the contents and accepts a convenient optional message.
qsort() sorts the lines using a one-line cmp_line() function.
status_blk() is called again and shows the now sorted contents.
As in C++, delete_blk() is the destructor and erases the whole thing.
output using main() as tlines.c for testing
PS M:\> .\lines tlines.c
loading "tlines.c" into memory
Block extended for a total of 16 pointers
==> Loading "tlines.c" into memory
Status: 13 of 16 lines. [block size is 8]:
1 int main(int argc, char** argv)
2 {
3 char msg[80] = {0};
4 if (argc < 2) usage();
5 Block* test = load_file(argv[1]);
6 sprintf(msg, "==> Loading \"%s\" into memory", argv[1]);
7 status_blk(test, msg);
8 qsort(test->line, test->size, sizeof(void*), cmp_line);
9 sprintf(msg, "==> \"%s\" after sort", argv[1]);
10 status_blk(test, msg);
11 test = delete_blk(test);
12 return 0;
13 };
==> "tlines.c" after sort
Status: 13 of 16 lines. [block size is 8]:
1 Block* test = load_file(argv[1]);
2 char msg[80] = {0};
3 if (argc < 2) usage();
4 qsort(test->line, test->size, sizeof(void*), cmp_line);
5 return 0;
6 sprintf(msg, "==> Loading \"%s\" into memory", argv[1]);
7 sprintf(msg, "==> \"%s\" after sort", argv[1]);
8 status_blk(test, msg);
9 status_blk(test, msg);
10 test = delete_blk(test);
11 int main(int argc, char** argv)
12 {
13 };
About the code
I am not sure it needs much explanation: a single function does the file loading, in around 20 lines of code. The other functions have fewer than 10. The whole file is represented in line, which is a char**, and Block holds the needed info about the actual size.
Since line[] is an array of pointers we can call
qsort(test->line, test->size, sizeof(void*), cmp_line);
and use
int cmp_line(const void* one, const void* other)
{
return strcmp(
*((const char**)one), *((const char**)other));
}
using strcmp() to compare the strings and have the lines sorted.
create_blk() accepts a block size, used in the calls to realloc() for efficiency.
Deleting a Block is a 3-step free() in the reverse order of allocation.
The complete code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct
{
size_t blk_size; // block
size_t limit; // actual allocated size
size_t size; // size in use
char** line; // the lines
} Block;
Block* create_blk(size_t);
Block* delete_blk(Block*);
int status_blk(Block*, const char*);
Block* load_file(const char*);
int cmp_line(const void*, const void*);
void usage();
int main(int argc, char** argv)
{
char msg[80] = {0};
if (argc < 2) usage();
Block* test = load_file(argv[1]);
if (test == NULL) { perror(argv[1]); return EXIT_FAILURE; } // qsort() below must not see NULL
sprintf(msg, "\n\n==> Loading \"%s\" into memory", argv[1]);
status_blk(test, msg);
qsort(test->line, test->size, sizeof(void*), cmp_line);
sprintf(msg, "\n\n==> \"%s\" after sort", argv[1]);
status_blk(test, msg);
test = delete_blk(test);
return 0;
};
int cmp_line(const void* one, const void* other)
{
return strcmp(
*((const char**)one), *((const char**)other));
}
Block* create_blk(size_t size)
{
Block* nb = (Block*)malloc(sizeof(Block));
if (nb == NULL) return NULL;
nb->blk_size = size;
nb->limit = size;
nb->size = 0;
nb->line = (char**)malloc(sizeof(char*) * size);
if (nb->line == NULL) { free(nb); return NULL; } // no pointer array, no Block
return nb;
}
Block* delete_blk(Block* blk)
{
if (blk == NULL) return NULL;
for (size_t i = 0; i < blk->size; i += 1)
free(blk->line[i]); // free lines
free(blk->line); // free block
free(blk); // free struct
return NULL;
}
int status_blk(Block* bl,const char* msg)
{
if (msg != NULL) printf("%s\n", msg);
if (bl == NULL)
{
printf("Status: not allocated\n");
return -1;
}
printf(
"Status: %zu of %zu lines. [block size is %zu]:\n",
bl->size, bl->limit, bl->blk_size);
for (size_t i = 0; i < bl->size; i += 1)
printf("%4zu\t%s", 1 + i, bl->line[i]);
return 0;
}
Block* load_file(const char* f_name)
{
if (f_name == NULL) return NULL;
fprintf(stderr, "loading \"%s\" into memory\n", f_name);
FILE* F = fopen(f_name, "r");
if (F == NULL) return NULL;
// file is open
Block* nb = create_blk(8); // block size is 8
if (nb == NULL) { fclose(F); return NULL; }
char line[200];
char* p = &line[0];
p = fgets(p, sizeof(line), F);
while (p != NULL)
{
// is block full?
if (nb->size >= nb->limit)
{
const size_t new_sz = nb->limit + nb->blk_size;
char** new_block =
realloc(nb->line, (new_sz * sizeof(char*)));
if (new_block == NULL)
{
fprintf(
stderr,
"\tCould not extend block to %zu "
"lines\n",
new_sz);
break;
}
printf(
"Block extended for a total of %zu "
"pointers\n",
new_sz);
nb->limit = new_sz;
nb->line = new_block;
}
// now copy the line
nb->line[nb->size] = (char*)malloc(1 + strlen(p));
if (nb->line[nb->size] == NULL) break; // out of memory: keep the lines read so far
strcpy(nb->line[nb->size], p);
nb->size += 1;
// read next line
p = fgets(p, sizeof(line), F);
}; // while()
fclose(F);
return nb;
}
void usage()
{
fprintf(stderr,"Use: program file_to_load\n");
exit(EXIT_FAILURE);
}
Try something like this:
#include <stdio.h>
#include <stdlib.h> // for malloc(), free(), exit()
#include <string.h> // for strcpy()
#define MAXLINES 4
#define MAXLENGTH 1000
char *lines[MAXLINES];
void readlines() {
for( int i = 0; i < MAXLINES; i++) {
char line[MAXLENGTH + 1]; // ALWAYS one extra to allow for '\0'
int c; // int, not char, so that EOF can be told apart from data
int j = 0;
// RE-USE(!) local array for input characters until NL, EOF or length
// NB: Length is tested first so no character is read and then dropped
while( j < MAXLENGTH && (c = getchar()) != EOF && c != '\n' )
line[ j++ ] = (char)c;
line[j] = '\0'; // terminate array (transforming it to 'string')
// Attempt to get a buffer to preserve this line
// (Old) compiler insists on casting return from malloc()
if( ( lines[i] = (char*)malloc( (j + 1) * sizeof lines[0][0] ) ) == NULL ) {
fprintf( stderr, "malloc failure\n" );
exit( -1 );
}
strcpy( lines[i], line ); // preserve this line
}
}
int my_main() {
readlines(); // only returns after successfully reading 4 lines of input
for( int i = 0; i < MAXLINES; i++)
printf( "Line %d: '%s'\n", i, lines[i] ); // enhanced
/* Maybe do stuff here */
for( int j = 0; j < MAXLINES; j++) // free up allocated memory.
free( lines[j] );
return 0;
}
If you would prefer to 'factor out' some code (and have a facility of your own for systems where strdup() is absent), here's a version:
char *my_strdup( char *str ) {
int len = strlen( str ) + 1; // ALWAYS +1
// Attempt to get a buffer to preserve this line
// (Old) compiler insists on casting return from malloc()
char *pRet = (char*)malloc( len * sizeof *pRet );
if( pRet == NULL ) {
fprintf( stderr, "malloc failure\n" );
exit( -1 );
}
return strcpy( pRet, str );
}
Then the terminate-and-preserve step condenses to:
line[j] = '\0'; // terminate array (transforming it to 'string')
lines[i] = my_strdup( line ); // preserve this line
I probably got an easy one for the C programmers out there!
I am trying to create a simple C function that will execute a system command in and write the process output to a string buffer out (which should be initialized as an array of strings of length n). The output needs to be formatted in the following way:
Each line written to stdout should be initialized as a string. Each of these strings has variable length. The output should be an array consisting of each string. There is no way to know how many strings will be written, so this array is also technically of variable length (but for my purposes, I just create a fixed-length array outside the function and pass its length as an argument, rather than going for an array that I would have to manually allocate memory for).
Here is what I have right now:
#define MAX_LINE_LENGTH 512
int exec(const char* in, const char** out, const size_t n)
{
char buffer[MAX_LINE_LENGTH];
FILE *file;
const char terminator = '\0';
if ((file = popen(in, "r")) == NULL) {
return 1;
}
for (char** head = out; (size_t)head < (size_t)out + n && fgets(buffer, MAX_LINE_LENGTH, file) != NULL; head += strlen(buffer)) {
*head = strcat(buffer, &terminator);
}
if (pclose(file)) {
return 2;
}
return 0;
}
and I call it with
#define N 128
int main(void)
{
const char* buffer[N];
const char cmd[] = "<some system command resulting in multi-line output>";
const int code = exec(cmd, buffer, N);
exit(code);
}
I believe the error the above code results in is a seg fault, but I'm not experienced enough to figure out why or how to fix.
I'm almost positive it is with my logic here:
for (char** head = out; (size_t)head < (size_t)out + n && fgets(buffer, MAX_LINE_LENGTH, file) != NULL; head += strlen(buffer)) {
*head = strcat(buffer, &terminator);
}
What I thought this does is:
1. Get a mutable reference to out (i.e. the head pointer)
2. Save the current stdout line to buffer (via fgets)
3. Append a null terminator to buffer (because I don't think fgets does this?)
4. Overwrite the data at head pointer with the value from step 3
5. Move head pointer strlen(buffer) bytes over (i.e. the number of chars in buffer)
6. Continue until fgets returns NULL or head pointer has been moved beyond the bounds of out array
Where am I wrong? Any help appreciated, thanks!
EDIT #1
According to Barmar's suggestions, I edited my code:
#include <stdio.h>
#include <stdlib.h>
#define MAX_LINE_LENGTH 512
int exec(const char* in, const char** out, const size_t n)
{
char buffer[MAX_LINE_LENGTH];
FILE *file;
if ((file = popen(in, "r")) == NULL) return 1;
for (size_t i = 0; i < n && fgets(buffer, MAX_LINE_LENGTH, file) != NULL; i += 1) out[i] = buffer;
if (pclose(file)) return 2;
return 0;
}
#define N 128
int main(void)
{
const char* buffer[N];
const char cmd[] = "<system command to run>";
const int code = exec(cmd, buffer, N);
for (int i = 0; i < N; i += 1) printf("%s", buffer[i]);
exit(code);
}
While there were plenty of redundancies with what I wrote that are now fixed, this still causes a segmentation fault at runtime.
Focusing on the edited code, this assignment
out[i] = buffer;
has problems.
In this expression, buffer is implicitly converted to a pointer-to-its-first-element (&buffer[0], see: decay). No additional memory is allocated, and no string copying is done.
buffer is rewritten every iteration. After the loop, each valid element of out will point to the same memory location, which will contain the last line read.
buffer is an array local to the exec function. Its lifetime ends when the function returns, so the array in main contains dangling pointers. Utilizing these values is Undefined Behaviour.
Additionally,
for (int i = 0; i < N; i += 1)
always loops to the maximum storable number of lines, when it is possible that fewer lines than this were read.
A rigid solution uses an array of arrays to store the lines read. Here is a cursory example (see: this answer for additional information on using multidimensional arrays as function arguments).
#include <stdio.h>
#include <stdlib.h>
#define MAX_LINES 128
#define MAX_LINE_LENGTH 512
int exec(const char *cmd, char lines[MAX_LINES][MAX_LINE_LENGTH], size_t *lc)
{
FILE *stream = popen(cmd, "r");
*lc = 0;
if (!stream)
return 1;
while (*lc < MAX_LINES) {
if (!fgets(lines[*lc], MAX_LINE_LENGTH, stream))
break;
(*lc)++;
}
return pclose(stream) ? 2 : 0;
}
int main(void)
{
char lines[MAX_LINES][MAX_LINE_LENGTH];
size_t n;
int code = exec("ls -al", lines, &n);
for (size_t i = 0; i < n; i++)
printf("%s", lines[i]);
return code;
}
Using dynamic memory is another option. Here is a basic example using strdup(3), lacking robust error handling.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **exec(const char *cmd, size_t *length)
{
FILE *stream = popen(cmd, "r");
if (!stream)
return NULL;
char **lines = NULL;
char buffer[4096];
*length = 0;
while (fgets(buffer, sizeof buffer, stream)) {
char **reline = realloc(lines, sizeof *lines * (*length + 1));
if (!reline)
break;
lines = reline;
if (!(lines[*length] = strdup(buffer)))
break;
(*length)++;
}
pclose(stream);
return lines;
}
int main(void)
{
size_t n = 0;
char **lines = exec("ls -al", &n);
for (size_t i = 0; i < n; i++) {
printf("%s", lines[i]);
free(lines[i]);
}
free(lines);
}
I have a pointer to pointer to store lines I read from a file:
char **lines;
And I'm assigning them like this:
line_no=0;
*(&lines[line_no++])=buffer;
But it crashes. Why?
According to my logic, the & should give the pointer to the zeroth element, and then *var = value is how to store a value through a pointer. Isn't it?
Here is my current complete code:
void read_file(char const *name,int len)
{
int line_no=0;
FILE* file;
int buffer_length = 1024;
char buffer[buffer_length];
file = fopen(name, "r");
while(fgets(buffer, buffer_length, file)) {
printf("---%s", buffer);
++line_no;
if(line_no==0)
{
lines = (char**)malloc(sizeof(*lines) * line_no);
}
else
{
lines = (char**)realloc(lines,sizeof(*lines) * line_no);
}
lines[line_no-1] = (char*)malloc(sizeof(buffer));
lines[line_no-1]=buffer;
printf("-------%s--------\n", *lines[line_no-1]);
}
fclose(file);
}
You have just a pointer, nothing more. You need to allocate memory using malloc().
Actually, you need first to allocate memory for pointers, then allocate memory for strings.
N lines, each M characters long:
char** lines = malloc(sizeof(*lines) * N);
for (int i = 0; i < N; ++i) {
lines[i] = malloc(sizeof(*(lines[i])) * M);
}
You are also taking an address and then immediately dereferencing it; something like *(&foo) makes little to no sense.
For updated code
Oh, there is so much wrong with that code...
You need to include stdlib.h to use malloc()
lines is undeclared. The declaration char** lines is missing before the loop
The if in the loop checks whether line_no is 0. If it is, it allocates lines. The problem is that line_no is 0 at that point, and sizeof(*lines) times 0 is still zero, so no memory is allocated.
But! There is ++line_no at the beginning of the loop, therefore line_no is never 0, so malloc() isn't called at all.
lines[line_no-1] = buffer; - it doesn't copy from buffer to lines[line_no-1], it just assigns pointers. To copy strings in C you need to use strcpy()
fgets() keeps the newline character at the end of buffer - you probably want to remove it: buffer[strcspn(buffer, "\n")] = '\0';
Argument len is never used.
char buffer[buffer_length]; - don't use a VLA
It would be better to increment line_no at the end of the loop instead of constantly calculating line_no-1
In C, casting the result of malloc() isn't necessary
There is no check whether opening the file failed
You aren't freeing the memory
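The newline point from the list above, wrapped in a function so it can be tried in isolation. A minimal sketch; strip_newline() is a hypothetical helper name:

```c
#include <assert.h>
#include <string.h>

/* Cuts the string at its first '\n', if there is one.
   Returns the length of the resulting string. */
size_t strip_newline(char *s)
{
    assert(s != NULL);
    size_t n = strcspn(s, "\n");  /* index of '\n', or of '\0' if none */
    s[n] = '\0';
    return n;
}
```

Since strcspn() returns the full length when no newline is found, the assignment is harmless for an empty string or for a last line without a trailing newline.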
Considering all of this, I quickly "corrected" it to such state:
void read_file(char const* name)
{
FILE* file = fopen(name, "r");
if (file == NULL) {
return;
}
int buffer_length = 1024;
char buffer[1024];
char** lines = malloc(0);
int line_no = 0;
while (fgets(buffer, buffer_length, file)) {
buffer[strcspn(buffer, "\n")] = '\0';
printf("---%s\n", buffer);
lines = realloc(lines, sizeof (*lines) * (line_no+1));
lines[line_no] = malloc(sizeof (*lines[line_no]) * buffer_length);
strcpy(lines[line_no], buffer);
printf("-------%s--------\n", lines[line_no]);
++line_no;
}
fclose(file);
for (int i = 0; i < line_no; ++i) {
free(lines[i]);
}
free(lines);
}
Ok, you have a couple of errors here:
lines array is not declared
Your allocation is wrong
I don't understand this line, it is pointless to allocate something multiplying it by zero
if( line_no == 0 )
{
lines = (char**)malloc(sizeof(*lines) * line_no);
}
You shouldn't allocate an array with just one element and constantly reallocate it. It is bad practice, time-consuming, and can lead to bigger problems later.
I recommend checking "Do I cast the result of malloc?" regarding the malloc cast.
You could write something like this:
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
void read_file(char const *name)
{
int line_no = 0, arr_size = 10;
int buffer_length = 1024;
char buffer[buffer_length];
char **lines;
FILE* file;
lines = malloc(sizeof(char*) * 10);
file = fopen(name, "r");
if (file == NULL) { free(lines); return; }
while(fgets(buffer, buffer_length, file)) {
buffer[strcspn(buffer, "\n")] = '\0';
printf("---%s", buffer);
++line_no;
if(line_no == arr_size)
{
arr_size += 10;
lines = realloc(lines, sizeof(char*) * arr_size);
}
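lines[line_no-1] = malloc(strlen(buffer) + 1);
if (lines[line_no-1] == NULL) break;
strcpy(lines[line_no-1], buffer); // copy the text; assigning buffer would only alias it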
printf("-------%s--------\n", lines[line_no-1]);
}
fclose(file);
}
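PS: fgets() also stores the '\n' at the end of the line. To strip it you can write the following line: buffer[strcspn(buffer, "\n")] = '\0'; which, unlike buffer[strlen(buffer)-1] = '\0';, is also safe when the last line of the file has no trailing newline.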
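The advice above about not reallocating for every single line is often implemented by doubling the capacity whenever the pointer array is full. A sketch under a few assumptions: read_lines() is a hypothetical name, the starting capacity of 16 is arbitrary, and lines longer than the 1024-byte buffer are simply split:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads all lines of `in` into a heap array of heap strings.
   The pointer array doubles in capacity when full, so realloc() runs
   only a handful of times even for many lines.
   Returns NULL if the first allocation fails. If a later allocation
   fails, reading stops and the lines read so far are returned.
   *count receives the number of lines; the caller frees each line,
   then the array. */
char **read_lines(FILE *in, size_t *count)
{
    assert(in != NULL && count != NULL);
    size_t cap = 16, n = 0;
    char buffer[1024];
    char **lines = malloc(cap * sizeof *lines);
    *count = 0;
    if (lines == NULL) return NULL;

    while (fgets(buffer, sizeof buffer, in) != NULL) {
        buffer[strcspn(buffer, "\n")] = '\0';    /* drop the newline */
        if (n == cap) {                          /* array full: double it */
            char **tmp = realloc(lines, 2 * cap * sizeof *lines);
            if (tmp == NULL) break;
            lines = tmp;
            cap *= 2;
        }
        char *copy = malloc(strlen(buffer) + 1);
        if (copy == NULL) break;
        strcpy(copy, buffer);                    /* own copy of the text */
        lines[n++] = copy;
    }
    *count = n;
    return lines;
}
```

Taking a FILE* keeps the fopen() check with the caller, and returning the array together with its count lets the caller decide when to free it.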
I have a file of binary data with various character strings sprinkled throughout. I am trying to write C code to find the first occurrence of user-specified strings in the file. (I know this can be done with bash, but I need C code for other reasons.) The code as it stands is:
#include <stdio.h>
#include <string.h>
#define CHUNK_SIZE 512
int main(int argc, char **argv) {
char *fname = argv[1];
char *tag = argv[2];
FILE *infile;
char *chunk;
char *taglcn = NULL;
long lcn_in_file = 0;
int back_step;
fpos_t pos;
// allocate chunk
chunk = (char*)malloc((CHUNK_SIZE + 1) * sizeof(char));
// find back_step
back_step = strlen(tag) - 1;
// open file
infile = fopen(fname, "r");
// loop
while (taglcn == NULL) {
// read chunk
memset(chunk, 0, (CHUNK_SIZE + 1) * sizeof(char));
fread(chunk, sizeof(char), CHUNK_SIZE, infile);
printf("Read %c\n", chunk[0]);
// look for tag
taglcn = strstr(chunk, tag);
if (taglcn != NULL) {
// if you find tag, add to location the offset in bytes from beginning of chunk
lcn_in_file += (long)(taglcn - chunk);
printf("HEY I FOUND IT!\n");
} else {
// if you don't find tag, add chunk size minus back_step to location and ...
lcn_in_file += ((CHUNK_SIZE - back_step) * sizeof(char));
// back file pointer up by back_step for next read
fseek(infile, -back_step, SEEK_CUR);
fgetpos(infile, &pos);
printf("%ld\n", pos);
printf("%s\n\n\n", chunk);
}
}
printf("%ld\n", lcn_in_file);
fclose(infile);
free(chunk);
}
If you're wondering, back_step is put in to take care of the unlikely eventuality that the string in question is split by a chunk boundary.
The file I am trying to examine is about 1Gb in size. The problem is that for some reason I can find any string within the first 9000 or so bytes, but beyond that, strstr is somehow not detecting any string. That is, if I look for a string located beyond 9000 or so bytes into the file, strstr does not detect it. The code reads through the entire file and never finds the search string.
I have tried varying CHUNK_SIZE from 128 to 50000, with no change in results. I have tried varying back_step as well. I have even put in diagnostic code to print out chunk character by character when strstr fails to find the string, and sure enough, the string is exactly where it is supposed to be. The diagnostic output of pos is always correct.
Can anyone tell me where I am going wrong? Is strstr the wrong tool to use here?
Since you say your file is binary, strstr() will stop scanning at the first null byte in the file.
If you wish to look for patterns in binary data, then the memmem() function is appropriate, if it is available. It is available on Linux and some other platforms (BSD, macOS, …) but it is not defined as part of standard C or POSIX. It bears roughly the same relation to strstr() that memcpy() bears to strcpy().
Note that your code should detect the number of bytes read by fread() and only search on that.
char *tag = …; // Identify the data to be searched for
size_t taglen = …; // Identify the data's length (maybe strlen(tag))
size_t nbytes; // fread() returns a size_t
while ((nbytes = fread(chunk, 1, (CHUNK_SIZE + 1), infile)) > 0)
{
…
taglcn = memmem(chunk, nbytes, tag, taglen);
if (taglcn != NULL)
…found it…
…
}
It isn't really clear why you have the +1 on the chunk size. The fread() function doesn't add null bytes at the end of the data or anything like that. I've left that aspect unchanged, but would probably not use it in my own code.
It is good that you take care of identifying a tag that spans the boundaries between two chunks.
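To make the difference concrete, here is a small sketch of a length-aware search. find_bytes is a hypothetical helper written only for illustration (a naive loop, not the library memmem()); it takes explicit lengths, so null bytes in the data do not end the search.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: naive length-aware search.
   Returns the offset of the first match, or -1 if there is none.
   Unlike strstr(), it does not stop at '\0' bytes in hay. */
long find_bytes(const char *hay, size_t hay_len,
                const char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return 0;
    if (needle_len > hay_len)
        return -1;
    for (size_t i = 0; i + needle_len <= hay_len; i++) {
        if (memcmp(hay + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}
```

For the 10-byte buffer "abc\0defTAG", strstr(buf, "TAG") returns NULL because it stops at the null byte at offset 3, while find_bytes(buf, 10, "TAG", 3) returns 7.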
The most likely reason for strstr to fail in your code is the presence of null bytes in the file. Furthermore, you should open the file in binary mode for the file offsets to be meaningful.
To scan for a sequence of bytes in a block, use the memmem() function. If it is not available on your system, here is a simple implementation:
#include <string.h>
void *memmem(const void *haystack, size_t n1, const void *needle, size_t n2) {
const unsigned char *p1 = haystack;
const unsigned char *p2 = needle;
if (n2 == 0)
return (void*)p1;
if (n2 > n1)
return NULL;
const unsigned char *p3 = p1 + n1 - n2 + 1;
for (const unsigned char *p = p1; (p = memchr(p, *p2, p3 - p)) != NULL; p++) {
if (!memcmp(p, p2, n2))
return (void*)p;
}
return NULL;
}
You would modify your program this way:
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void *memmem(const void *haystack, size_t n1, const void *needle, size_t n2);
#define CHUNK_SIZE 65536
int main(int argc, char **argv) {
if (argc < 3) {
fprintf(stderr, "missing parameters\n");
exit(1);
}
// open file
char *fname = argv[1];
FILE *infile = fopen(fname, "rb");
if (infile == NULL) {
fprintf(stderr, "cannot open file %s: %s\n", fname, strerror(errno));
exit(1);
}
char *tag = argv[2];
size_t tag_len = strlen(tag);
size_t overlap_len = 0;
long long pos = 0;
char *chunk = malloc(CHUNK_SIZE + tag_len - 1);
if (chunk == NULL) {
fprintf(stderr, "cannot allocate memory\n");
exit(1);
}
// loop
for (;;) {
// read chunk
size_t chunk_len = overlap_len + fread(chunk + overlap_len, 1,
CHUNK_SIZE, infile);
if (chunk_len < tag_len) {
// end of file or very short file
break;
}
// look for tag
char *tag_location = memmem(chunk, chunk_len, tag, tag_len);
if (tag_location != NULL) {
// if you find tag, add to location the offset in bytes from beginning of chunk
printf("string found at %lld\n", pos + (tag_location - chunk));
break;
} else {
// if you don't find tag, keep the last tag_len - 1 bytes as overlap for the next read
overlap_len = tag_len - 1;
memmove(chunk, chunk + chunk_len - overlap_len, overlap_len);
pos += chunk_len - overlap_len;
}
}
fclose(infile);
free(chunk);
return 0;
}
Note that the file is read in chunks of CHUNK_SIZE bytes, which is optimal if CHUNK_SIZE is a multiple of the file system block size.
For some really simple code, you can use mmap() and memcmp().
Error checking and proper header files are left as an exercise for the reader (there is at least one bug - another exercise for the reader to find):
int main( int argc, char **argv )
{
// put usable names on command-line args
char *fname = argv[ 1 ];
char *tag = argv[ 2 ];
// mmap the entire file
int fd = open( fname, O_RDONLY );
struct stat sb;
fstat( fd, &sb );
char *contents = mmap( NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0 );
close( fd );
size_t tag_len = strlen( tag );
size_t bytes_to_check = 1UL + sb.st_size - tag_len;
for ( size_t ii = 0; ii < bytes_to_check; ii++ )
{
if ( !memcmp( contents + ii, tag, tag_len ) )
{
// match found
// (probably want to check if contents[ ii + tag_len ]
// is a `\0' char to get actual string matches)
}
}
munmap( contents, sb.st_size );
return( 0 );
}
That likely won't be anywhere near the fastest way (in general, mmap() is not going to be anywhere near a performance winner, especially in this use case of simply streaming through a file from beginning to end), but it's simple.
(Note that mmap() also has problems if the file size changes while it's being read. If the file grows, you won't see the additional data. If the file is shortened, you'll get SIGBUS when you try to read the removed data.)
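For reference, one way the error checking could be filled in is sketched below. It assumes a POSIX system; find_tag_mmap and its return convention (-1 for "not found", -2 for an error) are hypothetical choices made for this sketch, not part of the answer above.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns the byte offset of the first occurrence
   of tag in the file, -1 if it is not present, or -2 on error. */
long long find_tag_mmap(const char *fname, const char *tag)
{
    int fd = open(fname, O_RDONLY);
    if (fd < 0)
        return -2;

    struct stat sb;
    if (fstat(fd, &sb) < 0) {
        close(fd);
        return -2;
    }

    size_t size = (size_t)sb.st_size;
    size_t tag_len = strlen(tag);
    /* Guard the size arithmetic; this also covers empty files,
       which mmap() rejects. */
    if (tag_len == 0 || size < tag_len) {
        close(fd);
        return -1;
    }

    char *contents = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (contents == MAP_FAILED)
        return -2;

    long long found = -1;
    for (size_t i = 0; i + tag_len <= size; i++) {
        if (memcmp(contents + i, tag, tag_len) == 0) {
            found = (long long)i;
            break;
        }
    }
    munmap(contents, size);
    return found;
}
```

The loop condition i + tag_len <= size avoids the unsigned wrap-around that a subtraction would cause when the tag is longer than the file.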
A binary data file is going to contain '\0' bytes acting as string ends. The more that are in there, the shorter the area strstr is going to search will be. Note strstr will consider its work done once it hits a 0 byte.
You can scan the memory in intervals like
char *p = chunk;
while (p < chunk + CHUNK_SIZE && (taglcn = strstr (p, tag)) == NULL)
p += strlen (p) + 1;
i.e. restart after a null byte in the chunk as long as you are still within the chunk.
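The restart-after-a-null-byte idea can be sketched as a bounded helper. strstr_segments is a hypothetical name, and the sketch assumes the buffer has one spare byte with chunk[len] == '\0' (as the question's memset of CHUNK_SIZE + 1 bytes provides), so that every strstr()/strlen() call stops inside the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Search every '\0'-separated segment of chunk[0..len) with strstr().
   Assumes chunk[len] == '\0', i.e. the buffer holds len + 1 bytes. */
char *strstr_segments(char *chunk, size_t len, const char *tag)
{
    char *p = chunk;
    char *end = chunk + len;
    while (p < end) {
        char *hit = strstr(p, tag);
        if (hit != NULL)
            return hit;
        p += strlen(p) + 1;   /* skip this segment and its terminating '\0' */
    }
    return NULL;
}
```

Note that this approach still cannot find a tag that itself contains a null byte; a length-aware search such as memmem() is needed for that.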
So I have a textfile which goes like this:
zero three two one five zero zero five seven .. etc
and there is a lot of it, 9054 words to be exact
My idea was to create a char array of 9054 spaces and store it in, this is what I have done so far:
#include <stdio.h>
int main(void)
{
char tmp;
int i = 0;
int j = 0;
char array[44000];
FILE *in_file;
in_file = fopen("in.txt", "r");
// Read file in to array
while (!feof(in_file))
{
fscanf(in_file,"%c",&tmp);
array[i] = tmp;
i++;
}
// Display array
while (j<i)
{
printf("%c",array[j]);
j++;
}
fclose(in_file);
while(1);
return 0;
}
The problem is I don't know how to store words, because from what I have done stores each character into the array so it becomes an array of around 44000. How can I make it so the array holds words instead?
Also I don't have an idea what the feof function does, especially the line
while (!feof(in_file))
what does this line exactly mean? Sorry I am still in the baby stages of learning C, I tried looking up what feof does but there is not much to find
Rather than checking feof(), which only tells you whether a previous input operation hit the end of the file, check the result of fscanf().
Read "words" with "%s" and limit the maximum number of chars to be read.
char buf[100];
fscanf(in_file,"%99s",buf);
Putting that together:
#define WORD_SIZE_MAX 20
#define WORD_COUNT_MAX 10000
char word_list[WORD_COUNT_MAX][WORD_SIZE_MAX];
unsigned word_i;
for (word_i = 0; word_i < WORD_COUNT_MAX; word_i++) {
if (fscanf(in_file, "%19s", word_list[word_i]) != 1) {
break;
}
}
Another approach is to use the OP's code nearly as is: read the whole file into one array, then skip white-space when printing.
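A minimal sketch of that second approach, assuming the file contents are already in a char array of known length. print_words is a hypothetical helper name; it prints one word per line and returns the number of words.

```c
#include <ctype.h>
#include <stdio.h>

/* Print each white-space-separated word in text[0..len) on its own line
   and return how many words were printed. */
size_t print_words(const char *text, size_t len)
{
    size_t count = 0;
    size_t i = 0;
    while (i < len) {
        while (i < len && isspace((unsigned char)text[i]))
            i++;                       /* skip white-space between words */
        if (i == len)
            break;
        size_t start = i;
        while (i < len && !isspace((unsigned char)text[i]))
            i++;                       /* advance to the end of the word */
        printf("%.*s\n", (int)(i - start), text + start);
        count++;
    }
    return count;
}
```

With the OP's variables this would be called as print_words(array, i) once the reading loop has finished.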
Usually you can use the following steps:
Dump the whole text file into a char buffer.
Use strtok to split the char buffer into multiple tokens or words.
Use an array of pointers to char to store the individual words.
Something along these lines would do. Note that I use your question title as the contents of the text file. You will need to replace 20 with a size appropriate for your input.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main (void)
{
FILE *in_file;
in_file = fopen("in.txt", "r");
fseek( in_file, 0, SEEK_END );
long fsize = ftell( in_file );
fseek( in_file, 0, SEEK_SET );
char *buf = malloc( fsize + 1 );
size_t nread = fread( buf, 1, fsize, in_file ); // Dump the whole file to a char buffer.
buf[nread] = '\0'; // strtok needs a null-terminated string
fclose( in_file );
char *items[20] = { NULL };
char *pch;
pch = strtok (buf," \t\n");
int i = 0;
while (pch != NULL && i < 20) // do not write past the end of items
{
items[i++] = pch;
pch = strtok (NULL, " \t\n");
}
for( i = 0; i < 20; i++ )
{
if( items[i] != NULL )
{
printf( "items[%d] = %s\n", i, items[i] );
}
}
return 0;
}
Output:
items[0] = Storing
items[1] = words
items[2] = from
items[3] = textfile
items[4] = into
items[5] = char
items[6] = array
items[7] = using
items[8] = feof?
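If the number of words is not known in advance (the question mentions 9054 of them), the fixed size 20 could be replaced by an array that grows as needed. The following is only a sketch under that assumption; split_words and its interface are hypothetical and not part of the answer above.

```c
#include <stdlib.h>
#include <string.h>

/* Split buf in place on white-space and return a malloc'd array of
   pointers into buf; *count receives the number of words.
   Returns NULL if memory allocation fails. The caller frees the array. */
char **split_words(char *buf, size_t *count)
{
    size_t cap = 16, n = 0;
    char **items = malloc(cap * sizeof *items);
    if (items == NULL)
        return NULL;

    for (char *tok = strtok(buf, " \t\r\n"); tok != NULL;
         tok = strtok(NULL, " \t\r\n")) {
        if (n == cap) {
            /* out of room: double the capacity */
            char **bigger = realloc(items, 2 * cap * sizeof *items);
            if (bigger == NULL) {
                free(items);
                return NULL;
            }
            items = bigger;
            cap *= 2;
        }
        items[n++] = tok;
    }
    *count = n;
    return items;
}
```

The returned pointers refer to memory inside buf, so buf must stay allocated for as long as the words are in use.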