i'm trying to implement little program that takes a text and breaks it into lines and sort them in alphabetical order but i encountered a little problem, so i have readlines function which updates an array of pointers called lines, the problem is when i try to printf the first pointer in lines as an array using %s nothing is printed and there is no errors.
I have used strcpy to copy an every single text line(local char array) into a pointer variable and then store that pointer in lines array but it gave me the error.
Here is the code:
#include <stdio.h>
#define MAXLINES 4
#define MAXLENGTH 1000
char *lines[MAXLINES];
void readlines() {
int i;
for (i = 0; i < MAXLINES; i++) {
char c, line[MAXLENGTH];
int j;
for (j = 0; (c = getchar()) != '\0' && c != '\n' && j < MAXLENGTH; j++) {
line[j] = c;
}
lines[i] = line;
}
}
int main(void) {
readlines();
printf("%s", lines[0]);
getchar();
return 0;
}
One problem is the following line:
lines[i] = line;
In this line, you make lines[i] point to line. However, line is a local char array whose lifetime ends as soon as the current loop iteration ends. Therefore, lines[i] will contain a dangling pointer (i.e. a pointer to an object that is no longer valid) as soon as the loop iteration ends.
For this reason, when you later call
printf("%s", lines[0]);
lines[0] is pointing to an object whose lifetime has ended. Dereferencing such a pointer invokes undefined behavior. Therefore, you cannot rely on getting any meaningful output, and your program may crash.
One way to fix this would be to not make lines an array of pointers, but rather an multidimensional array of char, i.e. an array of strings:
char lines[MAXLINES][MAXLENGTH+1];
Now you have a proper place for storing the strings, and you no longer need the local array line in the function readlines.
Another issue is that the line
printf("%s", lines[0]);
requires that lines[0] points to a string, i.e. to an array of characters terminated by a null character. However, you did not put a null character at the end of the string.
After fixing all of the issues mentioned above, your code should look like this:
#include <stdio.h>
#define MAXLINES 4
#define MAXLENGTH 1000
char lines[MAXLINES][MAXLENGTH+1];
void readlines() {
int i;
for (i = 0; i < MAXLINES; i++) {
char c;
int j;
for (j = 0; (c = getchar()) != '\0' && c != '\n' && j < MAXLENGTH; j++) {
lines[i][j] = c;
}
//add terminating null character
lines[i][j] = '\0';
}
}
int main(void) {
readlines();
printf("%s", lines[0]);
return 0;
}
However, this code still has a few issues, which are probably unrelated to your immediate problem, but could cause trouble later:
The function getchar will return EOF, not '\0', when there is no more data (or when an error occurred). Therefore, you should compare the return value of getchar with EOF instead of '\0'. However, a char is not guaranteed to be able to store the value of EOF. Therefore, you should store the return value of getchar in an int instead. Note that getchar returns a value of type int, not char.
When j reaches MAX_LENGTH, you will call getchar one additional time before terminating the loop. This can cause undesired behavior, such as your program waiting for more user input or an important character being discarded from the input stream.
In order to also fix these issues, I recommend the following code:
#include <stdio.h>
#define MAXLINES 4
#define MAXLENGTH 1000
char lines[MAXLINES][MAXLENGTH+1];
void readlines() {
int i;
for (i = 0; i < MAXLINES; i++)
{
//changed type from "char" to "int"
int c;
int j;
for ( j = 0; j < MAXLENGTH; j++ )
{
if ( (c = getchar()) == EOF || c == '\n' )
break;
lines[i][j] = c;
}
//add terminating null character
lines[i][j] = '\0';
}
}
int main(void) {
readlines();
printf("%s", lines[0]);
return 0;
}
Problem 1
char *lines[MAXLINES];
For the compiler it makes no difference how you write this, but for you, as you are learning C, maybe it is worth consider different spacing and naming. Question is: what is lines[]? lines[] is supposed to be an array of strings and hold some text inside. So lines[0] is a string, lines[1] is a string and so on. As pointed in a comment you could also use char lines[MAX_LINES][MAX_LENGTH] and have a 2D box of NxM char. This way you would have a pre-determined size in terms of number and size of lines and have simpler things at a cost of wasting space in lines of less than MAX_LENGTH chars and having a fixed number of lines you can use, but no need to allocate memory.
A more flexible way is to use an array of pointers. Since each pointer will represent a line, a single one
char* line[MAXLINES];
is a better picture of the use: line[0] is char*, line[1] is char* and so on. But you will need to allocate memory for each line (and you did not) in your code.
Remember int main(int argc, char**argv)
This is the most flexible way, since in this way you can hold any number of lines. The cost? Additional allocations.
size_t n_lines;
char** line;
This may be the best representation, as known by every C program since K&R.
Problem 2
for (
j = 0;
(c = getchar()) != '\0' && c != '\n' && j < MAXLENGTH;
j++) {
line[j] = c;
}
lines[i] = line;
This loop does not copy the final 0 that terminates each string. And reuses the same line, a char[] to hold the data as being read. And the final line does not copy a string, if one existed there. There is no one since the final 0 was stripped off by the loop. And there is no data too, since the area is being reused.
A complete C example of uploading a file to a container in memory
I will let an example of a more controlled way of writing this, a container for a set of lines and even a sorting function.
a data structure
The plan is to build an array of pointers as the system does for main. Since we do no know ahead the number of lines and do not want this limitation we will allocate memory in groups of blk_size lines. At any time we have limit pointers to use. From these size are in use. line[] is char* and points to a single line of text. The struct is
typedef struct
{
size_t blk_size; // block
size_t limit; // actual allocated size
size_t size; // size in use
char** line; // the lines
} Block;
the test function
Block* load_file(const char*);
Plan is to call load_file("x.txt") and the function returns a Block* pointing to the array representing the lines in file, one by one. Then we call qsort() and sort the whole thing. If the program is called lines we will run
lines x.txt
and it will load the file x.txt, show its contents on screen, sort it, show the sorted lines and then erase everything at exit.
main() for the test
int main(int argc, char** argv)
{
char msg[80] = {0};
if (argc < 2) usage();
Block* test = load_file(argv[1]);
sprintf(msg, "==> Loading \"%s\" into memory", argv[1]);
status_blk(test, msg);
qsort(test->line, test->size, sizeof(void*), cmp_line);
sprintf(msg, "==> \"%s\" after sort", argv[1]);
status_blk(test, msg);
test = delete_blk(test);
return 0;
};
As planned
load_file() is the constructor and load the file contents into a Block.
status_blk() shows the contents and accepts a convenient optional message
qsort() sorts the lines using a one-line cmp_line() function.
status_blk() is called again and shows the now sorted contents
as in C++ delete_blk() is the destructor and erases the whole thing._
output using main() as tlines.c for testing
PS M:\> .\lines tlines.c
loading "tlines.c" into memory
Block extended for a total of 16 pointers
==> Loading "tlines.c" into memory
Status: 13 of 16 lines. [block size is 8]:
1 int main(int argc, char** argv)
2 {
3 char msg[80] = {0};
4 if (argc < 2) usage();
5 Block* test = load_file(argv[1]);
6 sprintf(msg, "==> Loading \"%s\" into memory", argv[1]);
7 status_blk(test, msg);
8 qsort(test->line, test->size, sizeof(void*), cmp_line);
9 sprintf(msg, "==> \"%s\" after sort", argv[1]);
10 status_blk(test, msg);
11 test = delete_blk(test);
12 return 0;
13 };
==> "tlines.c" after sort
Status: 13 of 16 lines. [block size is 8]:
1 Block* test = load_file(argv[1]);
2 char msg[80] = {0};
3 if (argc < 2) usage();
4 qsort(test->line, test->size, sizeof(void*), cmp_line);
5 return 0;
6 sprintf(msg, "==> Loading \"%s\" into memory", argv[1]);
7 sprintf(msg, "==> \"%s\" after sort", argv[1]);
8 status_blk(test, msg);
9 status_blk(test, msg);
10 test = delete_blk(test);
11 int main(int argc, char** argv)
12 {
13 };
About the code
I am not sure if it needs much explanation, it is a single function that does the file loading and it has around 20 lines of code. The other functions has less than 10. The whole file is represented in line that is char** and Block has the needed info about actual size.
Since line[] is an array of pointers we can call
qsort(test->line, test->size, sizeof(void*), cmp_line);
and use
int cmp_line(const void* one, const void* other)
{
return strcmp(
*((const char**)one), *((const char**)other));
}
using strcmp() to compare the strings and have the lines sorted.
create_blk() accepts a block size for use in the calls to realloc() for eficiency.
Delete a Block is a 3-step free() in the reverse order of allocation.
The complete code
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct
{
size_t blk_size; // block
size_t limit; // actual allocated size
size_t size; // size in use
char** line; // the lines
} Block;
Block* create_blk(size_t);
Block* delete_blk(Block*);
int status_blk(Block*, const char*);
Block* load_file(const char*);
int cmp_line(const void*, const void*);
void usage();
int main(int argc, char** argv)
{
char msg[80] = {0};
if (argc < 2) usage();
Block* test = load_file(argv[1]);
sprintf(msg, "\n\n==> Loading \"%s\" into memory", argv[1]);
status_blk(test, msg);
qsort(test->line, test->size, sizeof(void*), cmp_line);
sprintf(msg, "\n\n==> \"%s\" after sort", argv[1]);
status_blk(test, msg);
test = delete_blk(test);
return 0;
};
int cmp_line(const void* one, const void* other)
{
return strcmp(
*((const char**)one), *((const char**)other));
}
Block* create_blk(size_t size)
{
Block* nb = (Block*)malloc(sizeof(Block));
if (nb == NULL) return NULL;
nb->blk_size = size;
nb->limit = size;
nb->size = 0;
nb->line = (char**)malloc(sizeof(char*) * size);
return nb;
}
Block* delete_blk(Block* blk)
{
if (blk == NULL) return NULL;
for (size_t i = 0; i < blk->size; i += 1)
free(blk->line[i]); // free lines
free(blk->line); // free block
free(blk); // free struct
return NULL;
}
int status_blk(Block* bl,const char* msg)
{
if (msg != NULL) printf("%s\n", msg);
if (bl == NULL)
{
printf("Status: not allocated\n");
return -1;
}
printf(
"Status: %zd of %zd lines. [block size is %zd]:\n",
bl->size, bl->limit, bl->blk_size);
for (int i = 0; i < bl->size; i += 1)
printf("%4d\t%s", 1 + i, bl->line[i]);
return 0;
}
Block* load_file(const char* f_name)
{
if (f_name == NULL) return NULL;
fprintf(stderr, "loading \"%s\" into memory\n", f_name);
FILE* F = fopen(f_name, "r");
if (F == NULL) return NULL;
// file is open
Block* nb = create_blk(8); // block size is 8
char line[200];
char* p = &line[0];
p = fgets(p, sizeof(line), F);
while (p != NULL)
{
// is block full?
if (nb->size >= nb->limit)
{
const size_t new_sz = nb->limit + nb->blk_size;
char* new_block =
realloc(nb->line, (new_sz * sizeof(char*)));
if (new_block == NULL)
{
fprintf(
stderr,
"\tCould not extend block to %zd "
"lines\n",
new_sz);
break;
}
printf(
"Block extended for a total of %zd "
"pointers\n",
new_sz);
nb->limit = new_sz;
nb->line = (char**)new_block;
}
// now copy the line
nb->line[nb->size] = (char*)malloc(1 + strlen(p));
strcpy(nb->line[nb->size], p);
nb->size += 1;
// read next line
p = fgets(p, sizeof(line), F);
}; // while()
fclose(F);
return nb;
}
void usage()
{
fprintf(stderr,"Use: program file_to_load\n");
exit(EXIT_FAILURE);
}
Try something like this:
#include <stdio.h>
#include <stdlib.h> // for malloc(), free(), exit()
#include <string.h> // for strcpy()
#define MAXLINES 4
#define MAXLENGTH 1000
char *lines[MAXLINES];
void readlines() {
for( int i = 0; i < MAXLINES; i++) {
char c, line[MAXLENGTH + 1]; // ALWAYS one extra to allow for '\0'
int j = 0;
// RE-USE(!) local array for input characters until NL or length
// NB: Casting return value to character (suppress warning)
while( (c = (char)getchar()) != '\0' && c != '\n' && j < MAXLENGTH )
line[ j++ ] = c;
line[j] = '\0'; // terminate array (transforming it to 'string')
// Attempt to get a buffer to preserve this line
// (Old) compiler insists on casting return from malloc()
if( ( lines[i] = (char*)malloc( (j + 1) * sizeof lines[0][0] ) ) == NULL ) {
fprintf( stderr, "malloc failure\n" );
exit( -1 );
}
strcpy( lines[i], line ); // preserve this line
}
}
int my_main() {
readlines(); // only returns after successfully reading 4 lines of input
for( int i = 0; i < MAXLINES; i++)
printf( "Line %d: '%s'\n", i, lines[i] ); // enhanced
/* Maybe do stuff here */
for( int j = 0; j < MAXLINES; j++) // free up allocated memory.
free( lines[j] );
return 0;
}
If you would prefer to 'factor out` some code (and have a facility that you've written is absent, here's a version:
char *my_strdup( char *str ) {
int len = strlen( str ) + 1; // ALWAYS +1
// Attempt to get a buffer to preserve this line
// (Old) compiler insists on casting return from malloc()
char *pRet = (char*)malloc( len * sizeof *pRet );
if( pRet == NULL ) {
fprintf( stderr, "malloc failure\n" );
exit( -1 );
}
return strcpy( pRet, str );
}
The the terminating and preserve is condensed to:
line[j] = '\0'; // terminate array (transforming it to 'string')
lines[i] = my_strdup( line ); // preserve this line
Related
I probably got an easy one for the C programmers out there!
I am trying to create a simple C function that will execute a system command in and write the process output to a string buffer out (which should be initialized as an array of strings of length n). The output needs to be formatted in the following way:
Each line written to stdout should be initialized as a string. Each of these strings has variable length. The output should be an array consisting of each string. There is no way to know how many strings will be written, so this array is also technically of variable length (but for my purposes, I just create a fixed-length array outside the function and pass its length as an argument, rather than going for an array that I would have to manually allocate memory for).
Here is what I have right now:
#define MAX_LINE_LENGTH 512
int exec(const char* in, const char** out, const size_t n)
{
char buffer[MAX_LINE_LENGTH];
FILE *file;
const char terminator = '\0';
if ((file = popen(in, "r")) == NULL) {
return 1;
}
for (char** head = out; (size_t)head < (size_t)out + n && fgets(buffer, MAX_LINE_LENGTH, file) != NULL; head += strlen(buffer)) {
*head = strcat(buffer, &terminator);
}
if (pclose(file)) {
return 2;
}
return 0;
}
and I call it with
#define N 128
int main(void)
{
const char* buffer[N];
const char cmd[] = "<some system command resulting in multi-line output>";
const int code = exec(cmd, buffer, N);
exit(code);
}
I believe the error the above code results in is a seg fault, but I'm not experienced enough to figure out why or how to fix.
I'm almost positive it is with my logic here:
for (char** head = out; (size_t)head < (size_t)out + n && fgets(buffer, MAX_LINE_LENGTH, file) != NULL; head += strlen(buffer)) {
*head = strcat(buffer, &terminator);
}
What I thought this does is:
Get a mutable reference to out (i.e. the head pointer)
Save the current stdout line to buffer (via fgets)
Append a null terminator to buffer (because I don't think fgets does this?)
Overwrite the data at head pointer with the value from step 3
Move head pointer strlen(buffer) bytes over (i.e. the number of chars in buffer)
Continue until fgets returns NULL or head pointer has been moved beyond the bounds of out array
Where am I wrong? Any help appreciated, thanks!
EDIT #1
According to Barmar's suggestions, I edited my code:
#include <stdio.h>
#include <stdlib.h>
#define MAX_LINE_LENGTH 512
int exec(const char* in, const char** out, const size_t n)
{
char buffer[MAX_LINE_LENGTH];
FILE *file;
if ((file = popen(in, "r")) == NULL) return 1;
for (size_t i = 0; i < n && fgets(buffer, MAX_LINE_LENGTH, file) != NULL; i += 1) out[i] = buffer;
if (pclose(file)) return 2;
return 0;
}
#define N 128
int main(void)
{
const char* buffer[N];
const char cmd[] = "<system command to run>";
const int code = exec(cmd, buffer, N);
for (int i = 0; i < N; i += 1) printf("%s", buffer[i]);
exit(code);
}
While there were plenty of redundancies with what I wrote that are now fixed, this still causes a segmentation fault at runtime.
Focusing on the edited code, this assignment
out[i] = buffer;
has problems.
In this expression, buffer is implicitly converted to a pointer-to-its-first-element (&buffer[0], see: decay). No additional memory is allocated, and no string copying is done.
buffer is rewritten every iteration. After the loop, each valid element of out will point to the same memory location, which will contain the last line read.
buffer is an array local to the exec function. Its lifetime ends when the function returns, so the array in main contains dangling pointers. Utilizing these values is Undefined Behaviour.
Additionally,
for (int i = 0; i < N; i += 1)
always loops to the maximum storable number of lines, when it is possible that fewer lines than this were read.
A rigid solution uses an array of arrays to store the lines read. Here is a cursory example (see: this answer for additional information on using multidimensional arrays as function arguments).
#include <stdio.h>
#include <stdlib.h>
#define MAX_LINES 128
#define MAX_LINE_LENGTH 512
int exec(const char *cmd, char lines[MAX_LINES][MAX_LINE_LENGTH], size_t *lc)
{
FILE *stream = popen(cmd, "r");
*lc = 0;
if (!stream)
return 1;
while (*lc < MAX_LINES) {
if (!fgets(lines[*lc], MAX_LINE_LENGTH, stream))
break;
(*lc)++;
}
return pclose(stream) ? 2 : 0;
}
int main(void)
{
char lines[MAX_LINES][MAX_LINE_LENGTH];
size_t n;
int code = exec("ls -al", lines, &n);
for (size_t i = 0; i < n; i++)
printf("%s", lines[i]);
return code;
}
Using dynamic memory is another option. Here is a basic example using strdup(3), lacking robust error handling.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char **exec(const char *cmd, size_t *length)
{
FILE *stream = popen(cmd, "r");
if (!stream)
return NULL;
char **lines = NULL;
char buffer[4096];
*length = 0;
while (fgets(buffer, sizeof buffer, stream)) {
char **reline = realloc(lines, sizeof *lines * (*length + 1));
if (!reline)
break;
lines = reline;
if (!(lines[*length] = strdup(buffer)))
break;
(*length)++;
}
pclose(stream);
return lines;
}
int main(void)
{
size_t n = 0;
char **lines = exec("ls -al", &n);
for (size_t i = 0; i < n; i++) {
printf("%s", lines[i]);
free(lines[i]);
}
free(lines);
}
I'm trying to open a file, read the content line by line (excluding the empty lines) and store all these lines in an array, but seems I cannot come to the solution.
#include <stdio.h>
#include <stdlib.h>
int main()
{
char buffer[500];
FILE *fp;
int lineno = 0;
int n;
char topics[lineno];
if ((fp = fopen("abc.txt","r")) == NULL){
printf("Could not open abc.txt\n");
return(1);
}
while (!feof(fp))
{
// read in the line and make sure it was successful
if (fgets(buffer,500,fp) != NULL){
if(buffer[0] == '\n'){
}
else{
strncpy(topics[lineno],buffer, 50);
printf("%d: %s",lineno, topics[lineno]);
lineno++;
printf("%d: %s",lineno, buffer);
}
}
}
return(0);
}
Considering "abc.txt" contains four lines (the third one is empty) like the following:
ab
2
4
I have been trying several ways but all I'm getting now is segmentation fault.
It is mostly because you are trying to store the read line in a 0 length array
int lineno = 0;
int n;
char topics[lineno]; //lineno is 0 here
There are more mistakes in your program after you correct the above mentioned one.
strncpy() needs a char* as its first parameter, and you are passing it a char.
If you want to store all the lines, in a manner such that array[0] is the first line, array[1] is the next one, then you would need an `array of char pointers.
Something like this
char* topics[100];
.
.
.
if (fgets(buffer,500,fp) != NULL){
if(buffer[0] == '\n'){
}
else{
topics[lineno] = malloc(128);
strncpy(topics[lineno],buffer, 50);
printf("%d: %s",lineno, topics[lineno]);
lineno++;
printf("%d: %s",lineno, buffer);
}
NOTE:
Use the standard definition of main()
int main(void) //if no command line arguments.
Bonus
Since you have accidentally stepped onto 0 length array, do read about it here.
This declaration of a variable length array
int lineno = 0;
char topics[lineno];
is invalid because the size of the array may not be equal to 0 and does not make sense in the context of the program/
You could dynamically allocate an array of pojnters to char that is of type char * and reallocate it each time when a new record is added.
For example
int lineno = 0;
int n;
char **topics = NULL;
//...
char **tmp = realloc( topics, ( lineno + 1 ) * sizeof( char * ) );
if ( tmp != NULL )
{
topics = tmp;
topics[lineno] = malloc( 50 * sizeof( char ) );
//... copy the string and so on
++lineno;
}
The program runs fine except for the last free, which results in the program freezing.
When I comment out the last 'free' it runs fine.
The program gets all substrings from a string and returns it.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char** getPrefixes(char* invoer);
int main()
{
char buffer[100];
char *input;
char **prefixes;
int counter = 0;
puts("Give string.");
fgets(buffer, 99, stdin);
fflush(stdin);
if (buffer[strlen(buffer) - 1] == '\n')
buffer[strlen(buffer) - 1] = '\0';
input= (char*)malloc(strlen(buffer) + 1);
if (input == NULL)
{
puts("Error allocating memory.");
return;
}
strcpy(input, buffer);
prefixes = (char**) getPrefixes(input);
for (counter = strlen(input); counter > 0; counter--)
{
puts(prefixes[counter]);
free(prefixes[counter]);
}
free(input);
free(prefixes);
}
char** getPrefixes(char* input)
{
char** prefixes;
int counter;
prefixes = malloc(strlen(input) * sizeof(char*));
if (prefixes == NULL)
{
puts("ELM.");
return NULL;
}
for (counter= strlen(input); counter> 0; counter--)
{
prefixes[counter] = (char*)malloc(counter + 1);
strcpy(prefixes[counter], input);
input++;
}
return prefixes;
}
Thanks in advance!
The reason for your program freezing is simple: undefined behaviour + invalid return values: _Your main function returns void, not an int: add return 0 ASAP! If you type in echo $? in your console after executing your compiled binary, you should see a number other than 0. This is the program's exit code. anything other than 0 means trouble. if the main did not return an int, it's bad news.
Next:
The undefined behaviour occurs in a couple of places, for example right here:
prefixes = malloc(strlen(input) * sizeof(char*));
//allocate strlen(input) pointers, if input is 10 long => valid indexes == 0-9
for (counter= strlen(input); counter> 0; teller--)
{//teller doesn't exist, so I assume you meant "counter--"
prefixes[teller] = (char*)malloc(counter + 1);//first call prefixes[10] ==> out of bounds
strcpy(prefixes[counter], input);//risky, no zero-termination... use calloc + strncpy
input++;
}
Then, when free-ing the memory, you're not freeing the pointer # offset 0, so the free(prefixes) call is invalid:
for (counter = strlen(input); counter > 0; counter--)
{//again 10 --> valid offsets are 9 -> 0
puts(prefixes[counter]);
free(prefixes[counter]);
}
free(prefixes);//wrong
Again, valid indexes are 0 and up, your condition in the loop (counter > 0) means that the loop breaks whenever counter is 0. You, at no point, are freeing the first pointer in the array, the one at index/offstet 0.
Write your loops like everyone would:
for (int i=0, size_t len = strlen(input); i<len; ++i)
{
printf("%d\n", i);//prints 0-9... 10 lines, all valid indexes
}
Change your loops, and make sure you're only using the valid offsets and you _should be good to go. using strncpy, you can still get the same result as before:
for (int i=0;i<len;++i)
{
//or malloc(i+2), char is guaranteed to be 1
//I tend to use `calloc` to set all chars to 0 already, and ensure zero-termination
prefixes[i] = malloc((i+2)*sizeof(*prefixes[i]));
strncpy(prefixes[i], input, i+1);//max 1 - 10 chars are copied
}
If we apply this to your code, and re-write it like so:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char** getPrefixes(char* input);
int main( void )
{
char *input;
char **prefixes;
int counter, i;
input= calloc(50,1);
if (input == NULL)
{
puts("Error allocating memory.");
return;
}
strcpy(input, "teststring");
prefixes = getPrefixes(input);
counter = strlen(input);
for (i=0; i<counter;++i)
{
puts(prefixes[i]);
free(prefixes[i]);
}
free(input);
free(prefixes);
return 0;
}
char** getPrefixes(char* input)
{
int i, counter = strlen(input);
char** prefixes = malloc(counter * sizeof *prefixes);
if (prefixes == NULL)
{
puts("ELM.");
return NULL;
}
for (i=0; i<counter; ++i)
{
prefixes[i] = calloc(i + 2,sizeof *prefixes[i]);
strncpy(prefixes[i], input, i+1);
}
return prefixes;
}
The output we get is:
t
te
tes
test
tests
testst
teststr
teststri
teststrin
teststring
As you can see for yourself
on this codepad
allocating memory for pointer to pointer:
char** cArray = (char**)malloc(N*sizeof(char*));
for(i=0;i<N;i++)
cArray[i] = (char*)malloc(M*sizeof(char));
De-allocating memory - in reverse order:
for(i=0;i<N;i++)
free(cArray[i]);
free(cArray)
I hope this gives you a little insight on what's wrong.
you are calling strcpy with prefixes[counter] as destination. However, you've only allocated 4/8 bytes per prefixes[counter] depending on the size of (char*)
When you call strcpy you're copying all of input all the way to the end requiring strlen(input)! space
Doing this will corrupt the heap which might explain why the program is freezing.
I am trying to read lines from a file and store them in a multidimensional char pointer. When I run my code, it runs without errors, however, the lines will not be printed correctly in my main() function, but prints correctly in the getline() function. Can anybody explain what is happening and how I correctly store them in my multidimensional pointer?
My code:
int main (int argc, const char * argv[]) {
char **s = (char**) malloc(sizeof(char*) * 1000);
int i = 0;
while (getline(s[i]) != -1){
printf("In array: %s \n", s[i]);
i++;
}
return 0;
}
int getline(char *s){
s = malloc(sizeof(char*) * 1000);
int c, i = 0;
while ((c = getchar()) != '\n' && c != EOF) {
s[i++] = c;
}
s[i] = '\0';
printf("String: %s \n", s);
if (c == EOF) {
return -1;
}
return 0;
}
and my output:
String: First line
<br>In array: (null)
<br>String: Second line
<br>In array: (null)
<br>String: Third line
<br>In array: (null)
You are passing by value. Even though you change the value of *s in getline(), main does not see it. You have to pass the address of s[i] so that getline() can change it:
int getline(char **s) {
* s= malloc( sizeof(char) * 1000) ;
...
}
Also, if you want to be a bit more efficient with memory, read the line into a local buffer (of size 1000) if you want. Then when you are done reading the line, allocate only the memory you need to store the actual string.
int getline( char ** s )
{
char tmpstr[1000] ;
...
tmpstr[i++]= c ;
}
tmpstr[i]= '\0' ;
* s= strdup( tmpstr) ;
return 0 ;
}
If you want to improve things even further, take a step back and thing about a few things. 1) allocating the two parts of the multi-dimensional array in two different functions is going to make it harder for others to understand. 2) passing in a temporary string from outside to getline() would allow it to be significantly simpler:
int main()
{
char ** s= (char **) malloc( 1000 * sizeof(char *)) ;
char tmpstr[1000] ;
int i ;
while ( -1 != getline( tmpstr))
{
s[i ++]= strdup( tmpstr) ;
}
return 0 ;
}
int getline( char * s)
{
int c, i = 0 ;
while (( '\n' != ( c= getchar())) && ( EOF != c )) { s[i ++]= c ; }
s[i]= '\0' ;
return ( EOF == c ) ? -1 : 0 ;
}
Now, getline is just about IO, and all the allocation of s is handled in one place, and thus easier to reason about.
The problem is that this line inside getline function
s = malloc(sizeof(char) * 1000); // Should be sizeof(char), not sizeof(char*)
has no effect on the s[i] pointer passed in. This is because pointers are passed by value.
You have two choices here:
Move your memory allocation into main, and keep passing the pointer, or
Keep your allocation in getline, but pass it a pointer to pointer from main.
Here is how you change main for the first option:
int main (int argc, const char * argv[]) {
char **s = malloc(sizeof(char*) * 1000);
int i = 0;
for ( ; ; ) {
s[i] = malloc(sizeof(char) * 1000);
if (getline(s[i]) == -1) break;
printf("In array: %s \n", s[i]);
i++;
}
return 0;
}
My advise: completely avoid writing your own getline() function, and avoid all fixed size buffers. In the POSIX.1-2008 standart, there is already a getline() function. So you can do this:
//Tell stdio.h that we want POSIX.1-2008 functions
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
int main() {
char** s = NULL;
int count = 0;
do {
count++;
s = realloc(s, count*sizeof(char*));
s[count-1] = NULL; //so that getline() will allocate the memory itself
} while(getline(&s[count-1], NULL, stdin));
for(int i = 0; i < count; i++) printf(s[i]);
}
Why do you use malloc in the first place. malloc is very very dangerous!!
Just use s[1000][1000] and pass &s[0][0] for the first line, &s[1][0] for the second line etc.
For printing printf("%s \n", s[i]); where i is the line you want.
I am writing some code that needs to read fasta files, so part of my code (included below) is a fasta parser. As a single sequence can span multiple lines in the fasta format, I need to concatenate multiple successive lines read from the file into a single string. I do this, by realloc'ing the string buffer after reading every line, to be the current length of the sequence plus the length of the line read in. I do some other stuff, like stripping white space etc. All goes well for the first sequence, but fasta files can contain multiple sequences. So similarly, I have a dynamic array of structs with a two strings (title, and actual sequence), being "char *". Again, as I encounter a new title (introduced by a line beginning with '>') I increment the number of sequences, and realloc the sequence list buffer. The realloc segfaults on allocating space for the second sequence with
*** glibc detected *** ./stackoverflow: malloc(): memory corruption: 0x09fd9210 ***
Aborted
For the life of me I can't see why. I've run it through gdb and everything seems to be working (i.e. everything is initialised, the values seems sane)... Here's the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#include <math.h>
#include <errno.h>
//a struture to keep a record of sequences read in from file, and their titles
typedef struct {
char *title;
char *sequence;
} sequence_rec;
//string convenience functions
//checks whether a string consists entirely of white space
int empty(const char *s) {
int i;
i = 0;
while (s[i] != 0) {
if (!isspace(s[i])) return 0;
i++;
}
return 1;
}
//substr allocates and returns a new string which is a substring of s from i to
//j exclusive, where i < j; If i or j are negative they refer to distance from
//the end of the s
char *substr(const char *s, int i, int j) {
char *ret;
if (i < 0) i = strlen(s)-i;
if (j < 0) j = strlen(s)-j;
ret = malloc(j-i+1);
strncpy(ret,s,j-i);
return ret;
}
//strips white space from either end of the string
void strip(char **s) {
int i, j, len;
char *tmp = *s;
len = strlen(*s);
i = 0;
while ((isspace(*(*s+i)))&&(i < len)) {
i++;
}
j = strlen(*s)-1;
while ((isspace(*(*s+j)))&&(j > 0)) {
j--;
}
*s = strndup(*s+i, j-i);
free(tmp);
}
int main(int argc, char**argv) {
sequence_rec *sequences = NULL;
FILE *f = NULL;
char *line = NULL;
size_t linelen;
int rcount;
int numsequences = 0;
f = fopen(argv[1], "r");
if (f == NULL) {
fprintf(stderr, "Error opening %s: %s\n", argv[1], strerror(errno));
return EXIT_FAILURE;
}
rcount = getline(&line, &linelen, f);
while (rcount != -1) {
while (empty(line)) rcount = getline(&line, &linelen, f);
if (line[0] != '>') {
fprintf(stderr,"Sequence input not in valid fasta format\n");
return EXIT_FAILURE;
}
numsequences++;
sequences = realloc(sequences,sizeof(sequence_rec)*numsequences);
sequences[numsequences-1].title = strdup(line+1); strip(&sequences[numsequences-1].title);
rcount = getline(&line, &linelen, f);
sequences[numsequences-1].sequence = malloc(1); sequences[numsequences-1].sequence[0] = 0;
while ((!empty(line))&&(line[0] != '>')) {
strip(&line);
sequences[numsequences-1].sequence = realloc(sequences[numsequences-1].sequence, strlen(sequences[numsequences-1].sequence)+strlen(line)+1);
strcat(sequences[numsequences-1].sequence,line);
rcount = getline(&line, &linelen, f);
}
}
return EXIT_SUCCESS;
}
You should use strings that look something like this:
struct string {
int len;
char *ptr;
};
This prevents strncpy bugs like what it seems you saw, and allows you to do strcat and friends faster.
You should also use a doubling array for each string. This prevents too many allocations and memcpys. Something like this:
int sstrcat(struct string *a, struct string *b)
{
int len = a->len + b->len;
int alen = a->len;
if (a->len < len) {
while (a->len < len) {
a->len *= 2;
}
a->ptr = realloc(a->ptr, a->len);
if (a->ptr == NULL) {
return ENOMEM;
}
}
memcpy(&a->ptr[alen], b->ptr, b->len);
return 0;
}
I now see you are doing bioinformatics, which means you probably need more performance than I thought. You should use strings like this instead:
struct string {
int len;
char ptr[0];
};
This way, when you allocate a string object, you call malloc(sizeof(struct string) + len) and avoid a second call to malloc. It's a little more work but it should help measurably, in terms of speed and also memory fragmentation.
Finally, if this isn't actually the source of error, it looks like you have some corruption. Valgrind should help you detect it if gdb fails.
One potential issue is here:
strncpy(ret,s,j-i);
return ret;
ret might not get a null terminator. See man strncpy:
char *strncpy(char *dest, const char *src, size_t n);
...
The strncpy() function is similar, except that at most n bytes of src
are copied. Warning: If there is no null byte among the first n bytes
of src, the string placed in dest will not be null terminated.
There's also a bug here:
j = strlen(*s)-1;
while ((isspace(*(*s+j)))&&(j > 0)) {
What if strlen(*s) is 0? You'll end up reading (*s)[-1].
You also don't check in strip() that the string doesn't consist entirely of spaces. If it does, you'll end up with j < i.
edit: Just noticed that your substr() function doesn't actually get called.
I think the memory corruption problem might be the result of how you're handling the data used in your getline() calls. Basically, line is reallocated via strndup() in the calls to strip(), so the buffer size being tracked in linelen by getline() will no longer be accurate. getline() may overrun the buffer.
while ((!empty(line))&&(line[0] != '>')) {
strip(&line); // <-- assigns a `strndup()` allocation to `line`
sequences[numsequences-1].sequence = realloc(sequences[numsequences-1].sequence, strlen(sequences[numsequences-1].sequence)+strlen(line)+1);
strcat(sequences[numsequences-1].sequence,line);
rcount = getline(&line, &linelen, f); // <-- the buffer `line` points to might be
// smaller than `linelen` bytes
}