I am trying to write a function that reads a text file and copies each line of the text file into a line of an array that is passed into the function.
void read_lines(FILE* fp, char*** lines, int* num_lines) {
int i = 0, line_count = 0;
char line[256], c;
fscanf(fp, "%c", &c);
while(!feof(fp)){
if(c == '\n') {
++line_count;
}
printf("%c", c);
fscanf(fp, "%c", &c);
}
rewind(fp);
*num_lines = line_count;
lines = (char***)malloc(line_count * sizeof(char**));
while (fgets(line, sizeof(line), fp) != NULL) {
lines[i] = (char**)malloc(strlen(line) * sizeof(char*));
strcpy(*lines[i], line);
}
++i;
}
}
The initial part scans for newlines so that I know how much to allocate to lines initially. I am not sure where I am going wrong.
Additionally, if anybody has any resources that could help me to better understand how to dynamically allocate space, that would be greatly appreciated.
You should first understand how pointers work. After that, dynamic memory allocation becomes fairly straightforward. Right now your code has several errors:
//here you assign to the argument, which is a local copy. While this is technically
//allowed, it is almost certainly not what you intended: the caller never sees it
lines = (char***)malloc(line_count * sizeof(char**));
while (fgets(line, sizeof(line), fp) != NULL) {
//sizeof(char*) != sizeof(char). Also, you need space for the trailing \0
lines[i] = (char**)malloc(strlen(line) * sizeof(char*));
//[] binds tighter than *, so this copies the string somewhere you did not intend
strcpy(*lines[i], line);
}
++i;
}
Correct version should be:
*lines = malloc(line_count * sizeof(char*));
while (fgets(line, sizeof(line), fp) != NULL) {
(*lines)[i] = malloc((strlen(line) + 1) * sizeof(char));
strcpy((*lines)[i], line);
++i; //must be inside the loop so each line gets its own slot
}
}
Note that you need the (*lines)[i] construct because the [] operator has higher precedence than the unary * (dereference) operator.
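To see the precedence rule in isolation, here is a minimal sketch. The function names index_then_deref and deref_then_index are invented for illustration, and it assumes a table with at least two rows so that both spellings are valid and simply pick different strings.

```c
#include <assert.h>
#include <stddef.h>

/* *p[1] parses as *(p[1]): index p first, then dereference the result. */
const char *index_then_deref(const char ***p) {
    assert(p != NULL);
    return *p[1];
}

/* (*p)[1]: dereference p first, then index the array it points to. */
const char *deref_then_index(const char ***p) {
    assert(p != NULL);
    return (*p)[1];
}
```

With a hypothetical table const char **tbl[] = {a, b}, where a = {"ab", "cd"} and b = {"ef", "gh"}, the first function returns "ef" and the second returns "cd". In the question, lines points to a single char **, so lines[1] does not exist at all and *lines[i] reads out of bounds for any i > 0.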
The code makes various mistakes, including the key ones detailed by @Ari0nhh.
Another is that counting '\n' can fail to give the same number of lines as fgets() in 3 ways:
A line is longer than the 256-byte buffer can hold, so fgets() returns it in pieces.
There are more than INT_MAX lines.
The last line does not end with a '\n'.
I suggest using the same fgets() loop to count "lines" instead:
unsigned long long line_count = 0;
while (fgets(line, sizeof(line), fp) != NULL) {
line_count++;
}
rewind(fp);
....
assert(line_count <= SIZE_MAX/sizeof *(*lines));
*lines = malloc(sizeof *(*lines) * line_count);
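Putting both answers together, a complete version might look like the sketch below. It is one possible shape, not the only one: the int return value, the size_t count, and the cleanup on failure are my own assumptions and differ from the signature in the question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies every fgets() chunk of fp into a newly allocated array of strings.
 * Returns 0 on success, -1 on allocation failure. The caller frees each
 * (*lines)[i] and then *lines. Lines longer than the buffer are split into
 * several entries, exactly as fgets() splits them. */
int read_lines(FILE *fp, char ***lines, size_t *num_lines) {
    char line[256];
    size_t count = 0, i = 0;

    assert(fp != NULL && lines != NULL && num_lines != NULL);
    *lines = NULL;
    *num_lines = 0;

    /* First pass: count with the same loop that later reads the data. */
    while (fgets(line, sizeof line, fp) != NULL) {
        count++;
    }
    rewind(fp);
    if (count == 0) {
        return 0;                       /* empty file: nothing to allocate */
    }
    if (count > SIZE_MAX / sizeof **lines) {
        return -1;                      /* size computation would overflow */
    }
    *lines = malloc(count * sizeof **lines);
    if (*lines == NULL) {
        return -1;
    }

    /* Second pass: copy each chunk, including room for the '\0'. */
    while (i < count && fgets(line, sizeof line, fp) != NULL) {
        (*lines)[i] = malloc(strlen(line) + 1);
        if ((*lines)[i] == NULL) {
            while (i > 0) {
                free((*lines)[--i]);    /* undo the copies made so far */
            }
            free(*lines);
            *lines = NULL;
            return -1;
        }
        strcpy((*lines)[i], line);
        i++;
    }
    *num_lines = i;
    return 0;
}
```

A caller would declare char **lines and size_t n, call read_lines(fp, &lines, &n), and later free every lines[i] followed by lines itself.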
I got this piece of code:
void scanLinesforArray(FILE* file, char search[], int* lineNr){
char line[1024];
int line_count = 0;
while(fgets(line, sizeof(line),file) !=NULL){
++line_count;
printf("%d",line_count);
printf(line);
char *temp = malloc(strlen(line));
// strncpy(temp,line,sizeof(line));
// printf("%s\n",temp);
free(temp);
continue;
}
}
This will print all lines of the file, but as soon as I uncomment the strncpy(), the program just stops without error.
Same happens as soon as I use strstr() to compare the line to my search variable.
I tried the continue statement and other redundant things, but nothing helps.
Many problems:
Do not print a general string as a format
Code risks undefined behavior should the string contain a %.
// printf(line); // BAD
printf("%s", line);
// or
fputs(line, stdout);
Bad size
strncpy(temp,line,sizeof(line)); is like strncpy(temp,line, 1024);, yet temp points to less than 1024 allocated bytes. Code attempts to write outside allocated memory. Undefined behavior (UB).
Rarely should code use strncpy().
Bad specifier
%s expects a matching string. temp does not point to a string because it lacks a null character. Instead, allocate room for the '\0' as well.
// printf("%s\n", temp);
char *temp = malloc(strlen(line) + 1); // + 1
strcpy(temp,line);
printf("<%s>", temp);
free(temp);
No compare
"Can't compare Lines of a file in C" is curious as there is no compare code.
Recall fgets() typically retains a '\n' in line[].
Perhaps
long scanLinesforArray(FILE* file, const char search[]){
char line[1024*4]; // Suggest wider buffer - should be at least as wide as the search string.
long line_count = 0; // Suggest wider type
while(fgets(line, sizeof line, file)) {
line_count++;
line[strcspn(line, "\n")] = 0; // Lop off potential \n
if (strcmp(line, search) == 0) {
return line_count;
}
}
return 0; // No match
}
Advanced: Sample better performance code.
long scanLinesforArray(FILE *file, const char search[]) {
size_t len = strlen(search);
size_t sz = len + 1;
if (sz < BUFSIZ) sz = BUFSIZ;
if (sz > INT_MAX) {
return -2; // Too big for fgets()
}
char *line = malloc(sz);
if (line == NULL) {
return -1;
}
long line_count = 0;
while (fgets(line, (int) sz, file)) {
line_count++;
if (memcmp(line, search, len) == 0) {
if (line[len] == '\n' || line[len] == 0) {
free(line);
return line_count;
}
}
}
free(line);
return 0; // No match
}
I seem to be losing the reference to my pointers here. I don't know why, but I suspect it's the pointer returned by fgets that messes this up.
I was told a good way to read words from a file was to get the line and then separate the words with strtok, but how can I do this if my pointers inside words[i] keep disappearing?
text
Natural Reader is
john make tame
Result I'm getting:
array[0] = john
array[1] = e
array[2] =
array[3] = john
array[4] = make
array[5] = tame
int main(int argc, char *argv[]) {
FILE *file = fopen(argv[1], "r");
int ch;
int count = 0;
while ((ch = fgetc(file)) != EOF){
if (ch == '\n' || ch == ' ')
count++;
}
fseek(file, 0, SEEK_END);
size_t size = ftell(file);
fseek(file, 0, SEEK_SET);
char** words = calloc(count, size * sizeof(char*) +1 );
int i = 0;
int x = 0;
char ligne [250];
while (fgets(ligne, 80, file)) {
char* word;
word = strtok(ligne, " ,.-\n");
while (word != NULL) {
for (i = 0; i < 3; i++) {
words[x] = word;
word = strtok(NULL, " ,.-\n");
x++;
}
}
}
for (i = 0; i < count; ++i)
if (words[i] != 0){
printf("array[%d] = %s\n", i, words[i]);
}
free(words);
fclose(file);
return 0;
}
strtok does not allocate any memory; it returns a pointer to a delimited string inside the buffer you pass in.
Therefore you need to allocate memory for the result if you want to keep the word between loop iterations,
e.g.
word = strtok(ligne, " ,.-\n");
if (word != NULL) word = strdup(word); // strdup is POSIX; passing it NULL is undefined
You could also handle this by using a separate ligne for each line read, so make it an array of strings like so:
char ligne[20][80]; // no need to make the string 250 since fgets limits it to 80
Then your while loop changes to:
int lno = 0;
while (fgets(ligne[lno], 80, file)) {
char *word;
word = strtok(ligne[lno], " ,.-\n");
while (word != NULL) {
words[x++] = word;
word = strtok(NULL, " ,.-\n");
}
lno++;
}
Adjust the first subscript as needed for the maximum size of the file, or dynamically allocate the line buffer during each iteration if you don't want such a low limit. You could also use getline instead of fgets, if your implementation supports it; it can handle the allocation for you, though you then need to free the blocks when you are done.
If you are processing real-world prose, you might want to include other delimiters in your list, like colon, semicolon, exclamation point, and question mark.
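As a rough sketch of the "copy each word" approach, the function below duplicates every token so it survives the next fgets call, and grows the array as it goes instead of pre-counting. The names read_words and copy_word are hypothetical, and the delimiter list is the one from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s, or NULL on failure. */
static char *copy_word(const char *s) {
    size_t n = strlen(s) + 1;
    char *p = malloc(n);
    if (p != NULL) {
        memcpy(p, s, n);
    }
    return p;
}

/* Splits every line of fp into words and stores a private copy of each.
 * Returns the array and writes the word count to *count. The caller frees
 * each word and then the array. Returns NULL with *count == 0 if there are
 * no words or if an allocation fails. */
char **read_words(FILE *fp, size_t *count) {
    char line[256];
    char **words = NULL;
    size_t used = 0, cap = 0;

    assert(fp != NULL && count != NULL);
    *count = 0;
    while (fgets(line, sizeof line, fp) != NULL) {
        for (char *tok = strtok(line, " ,.-\n"); tok != NULL;
             tok = strtok(NULL, " ,.-\n")) {
            if (used == cap) {          /* array full: double its capacity */
                size_t new_cap = cap ? cap * 2 : 8;
                char **tmp = realloc(words, new_cap * sizeof *tmp);
                if (tmp == NULL) {
                    goto fail;
                }
                words = tmp;
                cap = new_cap;
            }
            words[used] = copy_word(tok);   /* tok points into line[] */
            if (words[used] == NULL) {
                goto fail;
            }
            used++;
        }
    }
    *count = used;
    return words;

fail:
    while (used > 0) {
        free(words[--used]);
    }
    free(words);
    return NULL;
}
```

Because each entry owns its own memory, the contents of line can be overwritten freely on the next iteration without the stored words changing.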
I am trying to solve a problem on dynamic memory allocation by reading the input from a file, using malloc(), free() and realloc(). I just need help pushing the strings from the file into an array, without the commas. My test.txt file is as follows:
a,5,0
a,25,1
a,1,2
r,10,1,3
f,2
int i;
int count;
char line[256];
char *str[20];//to store the strings without commas
char ch[20];
int main (void)
{
FILE *stream;
if ( (stream = fopen ( "test.txt", "r" )) == NULL )
{ printf ("Cannot read the new file\n");
exit (1);
}
while(fgets(line, sizeof line, stream))
{
printf ("%s", line);
int length = strlen(line);
strcpy(ch,line);
for (i=0;i<length;i++)
{
if (ch[i] != ',')
{
printf ("%c", ch[i]);
}
}
}
//i++;
//FREE(x);
//FREE(y);
//FREE(z);
fclose (stream);
the str[] array should only store values like a520. (excluding the commas)
First of all, DO NOT use global variables unless it is absolutely required.
I am assuming you want str to be an array of pointers, where str[0] stores the first line, str[1] stores the second line, and so on.
For this:
int line_pos = 0; //stores line_number
int char_pos = 0; //stores position in str[line_pos]
while(fgets(line, sizeof(line), stream))
{
printf ("%s", line);
int length = strlen(line);
strcpy(ch,line);
str[line_pos] = calloc(length + 1, sizeof(char)); //allocating memory, +1 for the terminating \0
for (i=0;i<length;i++)
{
if (ch[i] != ',')
{
*(str[line_pos]+char_pos) = ch[i]; //setting value of str[line][pos]
char_pos++;
}
}
char_pos = 0;
line_pos++;
}
printf("%s", str[0]); //print first line without comma
Note that it only works for 20 lines (because you declared *str[20]); a 21st or later line overflows the array and can cause a variety of disasters. You can include:
if (line_pos >= 20)
break;
as a safety measure.
Note that slightly more memory is allocated for str than is needed (memory allocated = memory required + number of commas). To prevent this, you can first set ch to the text without commas:
int j = 0; //stores position in ch; declared outside the loop so it is not reset
for (i=0;i<length;i++)
{
if (line[i] != ',')
{
ch[j++] = line[i];
}
}
ch[j] = '\0'; //terminate the string
Then allocate memory for str[line_pos] like:
str[line_pos] = calloc(strlen(ch) + 1, sizeof(char));
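The same idea can be packaged as a small function that does not need the intermediate ch buffer. This is only a sketch: the name strip_commas is made up, and dropping the trailing newline is my assumption about what the stored strings should look like.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of line with every ',' removed and the
 * trailing '\n' (if any) dropped, or NULL on allocation failure.
 * The caller frees the result. */
char *strip_commas(const char *line) {
    assert(line != NULL);
    size_t len = strcspn(line, "\n");   /* length up to the newline, if any */
    char *out = malloc(len + 1);        /* worst case: no commas, plus '\0' */
    if (out == NULL) {
        return NULL;
    }

    size_t j = 0;                       /* write position in out */
    for (size_t i = 0; i < len; i++) {
        if (line[i] != ',') {
            out[j++] = line[i];
        }
    }
    out[j] = '\0';
    return out;
}
```

Inside the fgets loop this would be used as str[line_pos] = strip_commas(line), still guarded by the line_pos >= 20 check.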
I've been searching on how to allocate a dynamic buffer using fgets, but I can't seem to get it on this example. The file has two numbers of unknown length separated by a white-space. For every line it reads each character until ' ' and \n and prints it.
char *ptr;
char line[MAX];
while(fgets(line, sizeof line , fp) != NULL){
ptr = line;
for(i=0; i<2; i++){
while(*ptr && (*ptr) != ' '){
if(*ptr == ' ')
break;
k = (*ptr) - '0';
if(k != -38) // wont print '\n'
printf("%d", k);
ptr++;
}
while(*ptr && (*ptr) != '\n') {
if(*ptr == ' '){
ptr++;
continue;
}
k = (*ptr) - '0';
printf("%d", k);
ptr++;
}
}
}
Can someone give me an idea on how to make line dynamic while still using ptr that way?
I think what you want is something like this:
size_t linelen = 80;
char *line = malloc(linelen);
while(magic_reallocating_fgets(&line, &linelen, fp) != NULL) {
/* ... do whatever you want with line ... */
}
But then, of course, the $64,000 question is, what does magic_reallocating_fgets look like? It's something like this:
char *magic_reallocating_fgets(char **bufp, size_t *sizep, FILE *fp) {
size_t len;
if(fgets(*bufp, *sizep, fp) == NULL) return NULL;
len = strlen(*bufp);
while(strchr(*bufp, '\n') == NULL) {
*sizep += 100;
*bufp = realloc(*bufp, *sizep);
if(fgets(*bufp + len, *sizep - len, fp) == NULL) return *bufp;
len += strlen(*bufp + len);
}
return *bufp;
}
That's not really complete code, it's almost pseudocode. I've left two things for you as exercises:
It has no error-checking on the malloc and realloc calls.
It's kinda grossly inefficient, in that it makes not one but two extra passes over each line it reads: to count the characters, and again to look for a '\n'. (It turns out fgets's interface isn't ideal for this kind of work.)
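For what it's worth, here is one way those two exercises might be filled in. Treat it as a sketch under my own assumptions: the name read_whole_line, the 80-byte starting size, and the doubling growth policy are not from the answer above. It checks every realloc, and it calls strlen only on the newly read chunk and then looks at the last character instead of calling strchr.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads one whole line into *bufp, growing the buffer as needed.
 * *bufp must be NULL or a malloc'd block of *sizep bytes.
 * Returns *bufp on success, or NULL at end of file, on a read error, or if
 * memory runs out. *bufp stays valid either way and the caller frees it. */
char *read_whole_line(char **bufp, size_t *sizep, FILE *fp) {
    size_t len = 0;                     /* characters stored so far */

    assert(bufp != NULL && sizep != NULL && fp != NULL);
    if (*bufp == NULL || *sizep < 2) {
        char *tmp = realloc(*bufp, 80);
        if (tmp == NULL) {
            return NULL;
        }
        *bufp = tmp;
        *sizep = 80;
    }
    for (;;) {
        size_t room = *sizep - len;
        if (room > INT_MAX) {
            room = INT_MAX;             /* fgets takes an int */
        }
        if (fgets(*bufp + len, (int)room, fp) == NULL) {
            return len > 0 ? *bufp : NULL;
        }
        size_t got = strlen(*bufp + len);   /* scan only the new chunk */
        len += got;
        if (len > 0 && (*bufp)[len - 1] == '\n') {
            return *bufp;               /* complete line */
        }
        if (got + 1 < room) {
            return *bufp;               /* short read: last line, no '\n' */
        }
        if (*sizep > SIZE_MAX / 2) {
            return NULL;                /* doubling would overflow */
        }
        char *tmp = realloc(*bufp, *sizep * 2);
        if (tmp == NULL) {
            return NULL;                /* old block is still valid */
        }
        *bufp = tmp;
        *sizep *= 2;
    }
}
```

The calling loop from the answer works unchanged if line starts out as NULL and linelen as 0; the buffer is reused across lines and freed once at the end.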
On systems with glibc >= 2.7, or POSIX.1-2008 support, what I think you want can be accomplished using:
char *line;
while (fscanf(f, "%m[^\n]\n", &line) == 1) {
/* do stuff with line */
free(line);
}
This works great on my Linux systems, but over in the Windows universe, Microsoft Visual C++ supports neither %m, nor any equivalent that I can find.
You cannot change the size of an array at run time in C; once an array is defined, its size is fixed. A local array such as char line[MAX] typically lives on the stack. To make the size dynamic, you have to declare a pointer and allocate the memory for it dynamically, typically from the heap.
You can change the size of allocated memory by using realloc.
int lineLen = 80;
char *line;
line = (char *)malloc(sizeof(char) * 80);
if (line == NULL) {
// Something went horribly wrong
exit(1);
}
while (fgets(line, lineLen, fp)) {
// Do something to find the size
line = (char *)realloc(line, sizeof(char) * newLen);
if (line == NULL) {
// Something went horribly wrong
exit(1);
}
}
However, allocating and reallocating memory is a rather expensive operation. As a result, you may be more effective by just choosing a big buffer size, if you can do that safely. If you have a short loop then it may not be significant enough to worry about, but it is probably not advisable to be constantly changing your buffer's size.
I need to remove punctuation from a given string or word. Here's my code:
void remove_punc(char* *str)
{
char* ps = *str;
char* nstr;
// should be nstr = malloc(sizeof(char) * (1 + strlen(*str)))
nstr = (char *)malloc(sizeof(char) * strlen(*str));
if (nstr == NULL) {
perror("Memory Error in remove_punc function");
exit(1);
}
// should be memset(nstr, 0, sizeof(char) * (1 + strlen(*str)))
memset(nstr, 0, sizeof(char) * strlen(*str));
while(*ps) {
if(! ispunct(*ps)) {
strncat(nstr, ps, 1);
}
++ps;
}
*str = strdup(nstr);
free(nstr);
}
If my main function is the simple one:
int main(void) {
char* str = "Hello, World!:)";
remove_punc(&str);
printf("%s\n", str);
return 0;
}
It works! The output is Hello World.
Now I want to read in a big file and remove punctuation from the file, then output to another file.
Here's another main function:
int main(void) {
FILE* fp = fopen("book.txt", "r");
FILE* fout = fopen("newbook.txt", "w");
char* str = (char *)malloc(sizeof(char) * 1024);
if (str == NULL) {
perror("Error -- allocating memory");
exit(1);
}
memset(str, 0, sizeof(char) * 1024);
while(1) {
if (fscanf(fp, "%s", str) != 1)
break;
remove_punc(&str);
fprintf(fout, "%s ", str);
}
return 0;
}
When I rerun the program in Visual C++, it reports a
Debug Error! DAMAGE: after Normal Block(#54)0x00550B08,
and the program is aborted.
So I debugged the code. Everything works until the statement free(nstr) is executed.
I am confused. Can anyone help me?
You forgot to malloc space for the null terminator. Change
nstr = (char *)malloc(sizeof(char) * strlen(*str));
to
nstr = malloc( strlen(*str) + 1 );
Note that casting malloc is a bad idea, and if you are going to malloc and then memset to zero, you could use calloc instead which does just that.
There is another bug later in your program. The remove_punc function changes str to point to a freshly allocated buffer that is just big enough for the string with no punctuation. However, you then loop back to fscanf(fp, "%s", str). This is no longer reading into a 1024-byte buffer; it is reading into a buffer only as large as the previous punctuation-free string.
So unless the words in your file all appear in descending order of length (after punctuation removal), you will cause a buffer overflow here. You'll need to rethink your design of this loop. For example, perhaps you could have remove_punc leave the input unchanged and return a pointer to the freshly allocated string, which you would free after printing.
If you go with this solution, then use %1023s to avoid a buffer overflow with fscanf (unfortunately there's no simple way to take a variable here instead of hardcoding the length). Using a scanf function with a bare "%s" is just as dangerous as gets.
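A sketch of that redesign could look like this. The names remove_punc_copy and strip_file are hypothetical; the key points are that the 1024-byte read buffer is never reassigned and that each cleaned copy is freed right after it is printed.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of str without punctuation. The input is
 * left unchanged. The caller frees the result; NULL means out of memory. */
char *remove_punc_copy(const char *str) {
    assert(str != NULL);
    char *out = malloc(strlen(str) + 1);    /* +1 for the null terminator */
    if (out == NULL) {
        return NULL;
    }
    size_t j = 0;
    for (const char *ps = str; *ps != '\0'; ps++) {
        /* ispunct needs a value representable as unsigned char */
        if (!ispunct((unsigned char)*ps)) {
            out[j++] = *ps;
        }
    }
    out[j] = '\0';
    return out;
}

/* Copies in to out word by word with punctuation removed.
 * Returns 0 on success, -1 on allocation failure. */
int strip_file(FILE *in, FILE *out) {
    char word[1024];                    /* fixed buffer, never reassigned */

    assert(in != NULL && out != NULL);
    while (fscanf(in, "%1023s", word) == 1) {   /* width leaves room for '\0' */
        char *clean = remove_punc_copy(word);
        if (clean == NULL) {
            return -1;
        }
        fprintf(out, "%s ", clean);
        free(clean);                    /* done with this copy */
    }
    return 0;
}
```

Like the original loop, this collapses all whitespace to single spaces, so line breaks in the input are not preserved.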
The answer by @MatMcNabb explains the causes of your problems. I'm going to suggest a couple of ways you can simplify your code and make it less susceptible to memory problems.
If performance is not an issue, read the file character by character and discard the punctuation characters.
int main(void)
{
FILE* fp = fopen("book.txt", "r");
FILE* fout = fopen("newbook.txt", "w");
int c; // int, not char, so that EOF can be told apart from a valid character
while ( (c = fgetc(fp)) != EOF )
{
if ( !ispunct(c) )
{
fputc(c, fout);
}
}
fclose(fout);
fclose(fp);
return 0;
}
Minimize the number of calls to malloc and free by passing in the input string as well as the output string to remove_punc.
void remove_punc(char* inStr, char* outStr)
{
char* ps = inStr;
int index = 0;
while(*ps)
{
if(! ispunct((unsigned char)*ps))
{
outStr[index++] = *ps;
}
++ps;
}
outStr[index] = '\0';
}
and change the way you use remove_punc in main.
int main(void)
{
FILE* fp = fopen("book.txt", "r");
FILE* fout = fopen("newbook.txt", "w");
char inStr[1024];
char outStr[1024];
while (fgets(inStr, 1024, fp) != NULL )
{
remove_punc(inStr, outStr);
fprintf(fout, "%s", outStr);
}
fclose(fout);
fclose(fp);
return 0;
}
In your main you have the following
char* str = (char *)malloc(sizeof(char) * 1024);
...
remove_punc(&str);
...
Your remove_punc() function takes the address of str, but when you do this in your remove_punc function
...
*str = strdup(nstr);
...
you are not copying the new string into the previously allocated buffer; you are reassigning str to point to a new buffer sized for the current string! This means that when the next string read from the file is longer than the previous one, you will run into trouble.
You should leave the original buffer alone and instead return the newly allocated buffer containing the new string (e.g. return nstr), then free it when you are done with it. Better yet, just copy the original file byte by byte to the new file and exclude any punctuation. That would be far more efficient.