C - get file into array of chars

Hi, I have the code below, where I try to read all the lines of a file into an array. For example, if data.txt contains the following:
first line
second line
then I want the data array in the code below to contain the following:
data[0] = "first line";
data[1] = "second line"
My first question: currently I am getting "Segmentation fault". Why?
This is the exact output I get:
Number of lines is 7475613
Segmentation fault
My second question: Is there any better way to do what I am trying to do?
Thanks!!!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char* argv[])
{
FILE *f = fopen("data.txt", "rb");
fseek(f, 0, SEEK_END);
long pos = ftell(f);
fseek(f, 0, SEEK_SET);
char *bytes = malloc(pos);
fread(bytes, pos, 1, f);
int i =0;
int counter = 0;
for(; i<pos; i++)
{
if(*(bytes+i)=='\n') counter++;
}
printf("\nNumber of lines is %d\n", counter);
char* data[counter];
int start=0, end=0;
counter = 0;
int length;
for(i=0; i<pos; i++)
{
if(*(bytes+i)=='\n')
{
end = i;
length =end-start;
data[counter]=(char*)malloc(sizeof(char)*(length));
strncpy(data[counter],
bytes+start,
length);
counter = counter+1;
start = end+1;
}
}
free(bytes);
return 0;
}
First line of the data.txt in this case is not '\n' it is: "23454555 6346346 3463463".
Thanks!

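You need to malloc one more char for data[counter] to hold the terminating NUL.
After strncpy, you also need to terminate the destination string yourself, because strncpy does not add a NUL when the source has none within the first length bytes.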
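A minimal sketch of the two fixes above, assuming a hypothetical helper named copy_line (not part of the original code). It uses memcpy rather than strncpy, since the length is already known, and writes the terminator explicitly:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a freshly allocated, NUL-terminated copy of
   the `length` bytes starting at `src`, or NULL if malloc fails. */
char *copy_line(const char *src, size_t length)
{
    assert(src != NULL);
    char *line = malloc(length + 1);   /* +1 for the terminating NUL */
    if (line == NULL)
        return NULL;
    memcpy(line, src, length);         /* copy exactly `length` bytes */
    line[length] = '\0';               /* terminate the string explicitly */
    return line;
}
```

In the question's loop this would replace the malloc/strncpy pair, e.g. data[counter] = copy_line(bytes + start, length); with a NULL check afterwards.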
Edit after edit of original question
Number of lines is 7475613
Whooooooaaaaaa, that's a bit too much for your computer!
If the size of a char * is 4, you are trying to reserve 29902452 bytes (about 30 MB) of automatic (stack) memory for data, which is typically far more than the default stack limit allows.
You can allocate that memory dynamically instead:
/* char *data[counter]; */
char **data = malloc(counter * sizeof *data);
/* don't forget to free the memory when you no longer need it */
Edit: second question
My second question: Is there any
better way to do what i am trying do?
Not really; you're doing it right. But maybe you can code without the need to have all that data in memory at the same time.
Read and deal with a single line at a time.
You also need to free(data[counter]); in a loop ... and free(data); before the "you're doing it right" above is correct :)
And you need to check if each of the several malloc() calls succeeded LOL
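As a rough illustration of "read and deal with a single line at a time", here is a sketch that never holds more than one fixed-size buffer in memory. The function name longest_line and the 256-byte buffer are assumptions made for the example; the "work" done per line is just measuring its length:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads `f` piece by piece with fgets and returns the
   length of the longest line, not counting the newline. A line longer than
   the buffer arrives in several pieces, which are added together. */
size_t longest_line(FILE *f)
{
    assert(f != NULL);
    char buf[256];
    size_t current = 0, longest = 0;

    while (fgets(buf, sizeof buf, f) != NULL) {
        size_t n = strlen(buf);
        if (n > 0 && buf[n - 1] == '\n') {
            current += n - 1;              /* line complete */
            if (current > longest)
                longest = current;
            current = 0;
        } else {
            current += n;                  /* partial line, keep reading */
        }
    }
    if (current > longest)                 /* last line without a newline */
        longest = current;
    return longest;
}
```

The same loop shape works for any per-line processing; only the body of the "line complete" branch changes.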

First of all you need to check if the file got opened correctly or not:
FILE *f = fopen("data.txt", "rb");
if(!f)
{
fprintf(stderr,"Error opening file");
exit (1);
}
If there is error opening the file and you don't check it, you'll get a seg fault when you try to fseek on an invalid file pointer.
Apart from that I see no errors. Tried running the program, by printing the value of the data array at the end, it ran as expected.

One thing to note is that you're opening your file in binary mode, so no line-ending translation is done and the line terminators may not be what you expect on your platform (UNIX uses LF, Windows uses CR-LF, and classic Mac OS used CR).
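One way to cope with mixed line endings when reading in binary mode is to strip any trailing CR or LF from each line after extracting it. A small sketch, with strip_eol being a hypothetical name introduced here:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: removes trailing '\n' and '\r' characters from a
   NUL-terminated line in place and returns the new length. */
size_t strip_eol(char *line)
{
    assert(line != NULL);
    size_t len = strlen(line);
    while (len > 0 && (line[len - 1] == '\n' || line[len - 1] == '\r'))
        line[--len] = '\0';
    return len;
}
```

This handles LF, CR-LF and CR endings alike, at the cost of also removing a CR that was genuinely part of the data at the end of a line.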

Related

Trying to read an unknown string length from a file using fgetc()

So yeah, saw many similar questions to this one, but thought to try solving it my way. Getting huge amount of text blocks after running it (it compiles fine).
I'm trying to read a string of unknown size from a file. I thought about allocating pts with a size of 2 (1 char and the null terminator) and then using realloc to increase the size of the char array for every char that exceeds the size of the array.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main()
{
char *pts = NULL;
int temp = 0;
pts = malloc(2 * sizeof(char));
FILE *fp = fopen("txtfile", "r");
while (fgetc(fp) != EOF) {
if (strlen(pts) == temp) {
pts = realloc(pts, sizeof(char));
}
pts[temp] = fgetc(fp);
temp++;
}
printf("the full string is a s follows : %s\n", pts);
free(pts);
fclose(fp);
return 0;
}
You probably want something like this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define CHUNK_SIZE 1000 // initial buffer size
int main()
{
int ch; // you need int, not char for EOF
int size = CHUNK_SIZE;
char *pts = malloc(CHUNK_SIZE);
FILE* fp = fopen("txtfile", "r");
int i = 0;
while ((ch = fgetc(fp)) != EOF) // read one char until EOF
{
pts[i++] = ch; // add char into buffer
if (i == size) // if buffer full ...
{
size += CHUNK_SIZE; // increase buffer size
pts = realloc(pts, size); // reallocate new size
}
}
pts[i] = 0; // add NUL terminator
printf("the full string is a s follows : %s\n", pts);
free(pts);
fclose(fp);
return 0;
}
Disclaimers:
this is untested code, it may not work, but it shows the idea
there is absolutely no error checking for brevity, you should add this.
there is room for other improvements, it can probably be done even more elegantly
Leaving aside for now the question of if you should do this at all:
You're pretty close on this solution but there are a few mistakes
while (fgetc(fp) != EOF) {
This line is going to read one char from the file and then discard it after comparing it against EOF. You'll need to save that byte to add to your buffer. A type of syntax like while ((tmp=fgetc(fp)) != EOF) should work.
pts = realloc(pts, sizeof(char));
Check the documentation for realloc, you'll need to pass in the new size in the second parameter.
pts = malloc(2 * sizeof(char));
You'll need to zero this memory after acquiring it. You probably also want to zero any memory given to you by realloc, or you may lose the null off the end of your string and strlen will be incorrect.
But as I alluded to earlier, using realloc in a loop like this when you've got a fair idea of the size of the buffer already is generally going to be non-idiomatic C design. Get the size of the file ahead of time and allocate enough space for all the data in your buffer. You can still realloc if you go over the size of the buffer, but do so using chunks of memory instead of one byte at a time.
Probably the most efficient way (as mentioned in the comment by Fiddling Bits) is to read the whole file in one go, after first getting the file's size:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <sys/stat.h>
int main()
{
size_t nchars = 0; // Declare here and set to zero...
// ... so we can optionally try using the "stat" function, if the O/S supports it...
struct stat st;
if (stat("txtfile", &st) == 0) nchars = st.st_size;
FILE* fp = fopen("txtfile", "rb"); // Make sure we open in BINARY mode!
if (nchars == 0) // This code will be used if the "stat" function is unavailable or failed ...
{
fseek(fp, 0, SEEK_END); // Go to end of file (NOTE: SEEK_END may not be implemented - but PROBABLY is!)
// while (fgetc(fp) != EOF) {} // If your system doesn't implement SEEK_END, you can do this instead:
nchars = (size_t)(ftell(fp)); // Position at end of file = size in bytes (the +1 for the NUL terminator is added in the calloc call below)
}
char* pts = calloc(nchars + 1, sizeof(char));
if (pts != NULL)
{
fseek(fp, 0, SEEK_SET); // Return to start of file...
fread(pts, sizeof(char), nchars, fp); // ... and read one great big chunk!
printf("the full string is a s follows : %s\n", pts);
free(pts);
}
else
{
printf("the file is too big for me to handle (%zu bytes)!", nchars);
}
fclose(fp);
return 0;
}
On the issue of the use of SEEK_END, see this cppreference page, where it states:
Library implementations are allowed to not meaningfully support SEEK_END (therefore, code using it has no real standard portability).
On whether or not you will be able to use the stat function, see this Wikipedia page. (But it is now available in MSVC on Windows!)

Find end of text in a text file padded with NULL characters in C [duplicate]

file looks like this:
abcd
efgh
ijkl
I want to read the file using C so that it read the last line first:
ijkl
efgh
abcd
I cannot seem to find a solution that does not use an array for storage. Please help.
edit0:
Thanks for all the answers. Just to let you know, I am the one creating this file. So, can I create in a way its in the reverse order? Is that possible?
It goes like this:
1. Seek to one byte before the end of the file using fseek. There's no guarantee that the last line will have an EOL so the last byte doesn't really matter.
2. Read one byte using fgetc.
3. If that byte is an EOL then the last line is a single empty line and you have it.
4. Use fseek again to go backwards two bytes and check that byte with fgetc.
5. Repeat the above until you find an EOL. When you have an EOL, the file pointer will be at the beginning of the next (from the end) line.
6. ...
7. Profit.
Basically you have to keep doing (4) and (5) while keeping track of where you were when you found the beginning of a line so that you can seek back there before starting your scan for the beginning of the next line.
As long as you open your file in text mode you shouldn't have to worry about multibyte EOLs on Windows (thanks for the reminder Mr. Lutz).
If you happen to be given a non-seekable input (such as a pipe), then you're out of luck unless you want to dump your input to a temporary file first.
So you can do it but it is rather ugly.
You could do pretty much the same thing using mmap and a pointer if you have mmap available and the "file" you're working with is mappable. The technique would be pretty much the same: start at the end and go backwards to find the end of the previous line.
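The fseek/fgetc backward scan described above can be sketched roughly as follows. This is an illustration, not a tuned implementation: the function names are hypothetical, it assumes a seekable stream on which byte offsets are meaningful (for example one opened in binary mode with '\n' line endings), and it seeks once per byte, which is slow on large files.

```c
#include <assert.h>
#include <stdio.h>

/* Copy bytes [start, end) of `in` to `out`, followed by a newline. */
static void emit_range(FILE *in, FILE *out, long start, long end)
{
    fseek(in, start, SEEK_SET);
    for (long i = start; i < end; i++) {
        int ch = fgetc(in);
        if (ch == EOF)
            break;
        fputc(ch, out);
    }
    fputc('\n', out);
}

/* Hypothetical helper: writes the lines of `in` to `out`, last line first.
   Returns 0 on success, -1 if the stream cannot be measured. */
int print_lines_reversed(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    if (fseek(in, 0, SEEK_END) != 0)
        return -1;
    long end = ftell(in);
    if (end < 0)
        return -1;
    if (end == 0)
        return 0;                          /* empty file: nothing to print */

    fseek(in, end - 1, SEEK_SET);          /* ignore one trailing newline */
    if (fgetc(in) == '\n')
        end--;

    for (long pos = end - 1; pos >= 0; pos--) {
        fseek(in, pos, SEEK_SET);
        if (fgetc(in) == '\n') {           /* found the EOL before a line */
            emit_range(in, out, pos + 1, end);
            end = pos;                     /* next line ends at this EOL */
        }
    }
    emit_range(in, out, 0, end);           /* first line has no EOL before it */
    return 0;
}
```

No line is ever stored in an array; only the current end offset is remembered between lines.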
Re: "I am the one creating this file. So, can I create in a way its in the reverse order? Is that possible?"
You'll run into the same sorts of problems but they'll be worse. Files in C are inherently sequential lists of bytes that start at the beginning and go to the end; you're trying to work against this fundamental property and going against the fundamentals is never fun.
Do you really need your data in a plain text file? Maybe you need text/plain as the final output but all the way through? You could store the data in an indexed binary file (possibly even an SQLite database) and then you'd only have to worry about keeping (or windowing) the index in memory and that's unlikely to be a problem (and if it is, use a "real" database); then, when you have all your lines, just reverse the index and away you go.
In pseudocode:
open input file
while (fgets () != NULL)
{
push line to stack
}
open output file
while (stack not empty)
{
pop stack
write popped line to file
}
The above is efficient: there is no seek (a slow operation) and the file is read sequentially. There are, however, two pitfalls.
The first is the fgets call. The buffer supplied to fgets may not be big enough to hold a whole line from the input in which case you can do one of the following: read again and concatenate; push a partial line and add logic to the second half to fix up partial lines or wrap the line into a linked list and only push the linked list when a newline/eof is encountered.
The second pitfall will happen when the file is bigger than the available ram to hold the stack, in which case you'll need to write the stack structure to a temporary file whenever it reaches some threshold memory usage.
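A compact C version of the pseudocode might look like the sketch below. It sidesteps neither pitfall: it assumes every line fits in MAX_LINE bytes (longer lines would be pushed in pieces) and that the whole file fits in memory. The names reverse_lines and MAX_LINE are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 1024   /* assumption: no input line is longer than this */

/* Hypothetical helper: pushes every line of `in` onto a growable stack,
   then pops them to `out`. Returns 0 on success, -1 on allocation failure. */
int reverse_lines(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);
    char buf[MAX_LINE];
    char **stack = NULL;
    size_t count = 0, capacity = 0;
    int status = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (count == capacity) {                     /* grow the stack */
            size_t new_cap = capacity ? capacity * 2 : 16;
            char **tmp = realloc(stack, new_cap * sizeof *tmp);
            if (tmp == NULL) { status = -1; break; }
            stack = tmp;
            capacity = new_cap;
        }
        size_t len = strlen(buf);
        if (len > 0 && buf[len - 1] == '\n')
            buf[--len] = '\0';                       /* strip the newline */
        stack[count] = malloc(len + 1);
        if (stack[count] == NULL) { status = -1; break; }
        memcpy(stack[count], buf, len + 1);          /* push a copy */
        count++;
    }

    while (count > 0) {                              /* pop in reverse order */
        count--;
        if (status == 0)
            fprintf(out, "%s\n", stack[count]);
        free(stack[count]);
    }
    free(stack);
    return status;
}
```

Doubling the capacity keeps the number of realloc calls logarithmic in the number of lines.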
The following code should do the necessary inversion:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
FILE *fd;
char len[400];
int i;
char *filename = argv[1];
int ch;
int count;
fd = fopen(filename, "r");
fseek(fd, 0, SEEK_END);
while (ftell(fd) > 1 ){
fseek(fd, -2, SEEK_CUR);
if(ftell(fd) <= 2)
break;
ch =fgetc(fd);
count = 0;
while(ch != '\n'){
len[count++] = ch;
if(ftell(fd) < 2)
break;
fseek(fd, -2, SEEK_CUR);
ch =fgetc(fd);
}
for (i =count -1 ; i >= 0 && count > 0 ; i--)
printf("%c", len[i]);
printf("\n");
}
fclose(fd);
}
The following works for me on Linux, where the text file line separator is "\n".
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void readfileinreverse(FILE *fp)
{
int i, size, start, loop, counter;
char *buffer;
char line[256];
start = 0;
fseek(fp, 0, SEEK_END);
size = ftell(fp);
buffer = malloc((size+1) * sizeof(char));
for (i=0; i< size; i++)
{
fseek(fp, size-1-i, SEEK_SET);
buffer[i] = fgetc(fp);
if(buffer[i] == 10)
{
if(i != 0)
{
counter = 0;
for(loop = i; loop > start; loop--)
{
if((counter == 0) && (buffer[loop] == 10))
{
continue;
}
line[counter] = buffer[loop];
counter++;
}
line[counter] = 0;
start = i;
printf("%s\n",line);
}
}
}
if(i > start)
{
counter = 0;
for(loop = i; loop > start; loop--)
{
if((counter == 0) && ((buffer[loop] == 10) || (buffer[loop] == 0)))
{
continue;
}
line[counter] = buffer[loop];
counter++;
}
line[counter] = 0;
printf("%s\n",line);
return;
}
}
int main()
{
FILE *fp = fopen("./1.txt","r");
readfileinreverse(fp);
return 0;
}
Maybe this does the trick. It reverses the content of the file as a whole, just like a string:
Define a char array large enough to hold your file.
Read the contents of the file into that array.
Use strrev() to reverse the string (note that strrev() is a non-standard extension, so it is not available in every C library).
You can later display the output or even write it to a file. The code goes like this:
#include <stdio.h>
#include <string.h>
int main(){
FILE *file;
char all[1000];
// give any name to read in reverse order
file = fopen("anyFile.txt","r");
if (file == NULL) return 1; // could not open the file
// read up to 999 bytes of the file and NUL-terminate them
size_t n = fread(all, 1, sizeof all - 1, file);
all[n] = '\0';
// Content of the file
printf("Content Of the file %s",all);
// reverse the string (strrev is a non-standard extension offered by some C libraries)
printf("%s",strrev(all));
fclose(file);
return 0;
}
I know this question has been answered, but the accepted answer does not contain a code snippet and the other snippets feel too complex.
This is my implementation:
#include <stdio.h>
long file_size(FILE* f) {
fseek(f, 0, SEEK_END); // seek to end of file
long size = ftell(f); // get current file pointer
fseek(f, 0, SEEK_SET); // seek back to beginning of file
return size;
}
int main(int argc, char* argv[]) {
FILE *in_file = fopen(argv[1], "r");
long in_file_size = file_size(in_file);
printf("Got file size: %ld\n", in_file_size);
// Start from end of file
fseek(in_file, -1, SEEK_END); // seek to the last byte of the file
for (long i = in_file_size; i > 0; i--) {
int current_char = fgetc(in_file); // fgetc returns an int; this also advances the file position by one
printf("Got char: |%c| with hex: |%x|\n", current_char, current_char);
fseek(in_file, -2, SEEK_CUR); // Go back 2 bytes (1 to compensate)
}
printf("Done\n");
fclose(in_file);
}

Trouble finding frequency of words from a file in C

I need to write a code that will print the frequency of each word from a given file. Words like "the" and "The" will count as two different words. I've written some code so far but the command prompt stops working when I try to run the program. I just need some guidance and to be pointed in the best direction for this code, or I would like to be told that this code needs to be abandoned. I'm not very good at this so any help would be very appreciated.
#include <stdio.h>
#include <string.h>
#define FILE_NAME "input.txt"
struct word {
char wordy[2000];
int frequency;
} words;
int word_freq(const char *text, struct word words[]);
int main (void)
{
char *text;
FILE *fp = fopen(FILE_NAME, "r");
fread(text, sizeof(text[0]), sizeof(text) / sizeof(text[0]), fp);
struct word words[2000];
int nword;
int i;
nword = word_freq(text, words);
puts("\nWord frequency:");
for(i = 0; i < nword; i++)
printf(" %s: %d\n", words[i].wordy, words[i].frequency);
return 0;
}
int word_freq(const char *text, struct word words[])
{
char punctuation[] =" .,;:!?'\"";
char *tempstr;
char *pword;
int nword;
int i;
nword = 0;
strcpy(tempstr, text);
while (pword != NULL) {
for(i = 0; i < nword; i++) {
if (strcmp(pword, words[i].wordy) == 0)
break;
}
if (i < nword)
words[i].frequency++;
else {
strcpy(words[nword].wordy, pword);
words[nword].frequency= 1;
nword++;
}
pword = strtok(NULL, punctuation);
}
return nword;
}
First of all:
char *text;
FILE *fp = fopen(FILE_NAME, "r");
fread(text, sizeof(text[0]), sizeof(text) / sizeof(text[0]), fp);
This probably reads only 4 or 8 bytes of your file, because sizeof(text[0]) is 1 and sizeof(text) is the size of a pointer (4 or 8, depending on the platform). You need to use ftell() or some other means to get the actual size of your data file in order to read it all into memory.
Next, you are storing this information through a pointer that has no memory allocated to it. text needs to be malloc'd or made to point at memory in some way. This is probably what is causing your program to crash, just to start.
There are so many further issues that it will take time to explain them all:
How you use strcpy to copy into tempstr, which is also an uninitialized pointer, so the copy overwrites arbitrary memory.
How, even if that weren't the case, it would probably copy the whole file at once, unless the file contains NUL bytes, which it may, so perhaps this is OK.
How you compare words[i].wordy, even though it is not initialized and therefore garbage.
How, even if your file were read into memory correctly, you test pword in your loop condition while it is still uninitialized: strtok is never called on tempstr to produce the first token.
Please, get some help or ask your teacher about this because this code is seriously broken.
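For comparison, here is one way the counting step could be written once the text is in a properly allocated, NUL-terminated buffer. It is a sketch under assumptions: the struct and function names are invented for the example, words longer than MAX_WORD_LEN are truncated when stored, and the delimiter set is extended with whitespace characters.

```c
#include <assert.h>
#include <string.h>

#define MAX_WORD_LEN 63   /* assumption: longer words are truncated */

struct word_count {
    char word[MAX_WORD_LEN + 1];
    int frequency;
};

/* Hypothetical helper: splits `text` in place (strtok modifies it) and
   tallies each distinct word, case-sensitively. Returns the number of
   distinct words stored; words beyond `max_words` are ignored. */
size_t count_words(char *text, struct word_count *words, size_t max_words)
{
    assert(text != NULL && words != NULL);
    const char *delims = " \t\r\n.,;:!?'\"";
    size_t nword = 0;

    /* The first strtok call takes the buffer; later calls take NULL. */
    for (char *tok = strtok(text, delims); tok != NULL;
         tok = strtok(NULL, delims)) {
        size_t i;
        for (i = 0; i < nword; i++)
            if (strcmp(tok, words[i].word) == 0)
                break;
        if (i < nword) {
            words[i].frequency++;                    /* seen before */
        } else if (nword < max_words) {
            strncpy(words[nword].word, tok, MAX_WORD_LEN);
            words[nword].word[MAX_WORD_LEN] = '\0';  /* strncpy may not terminate */
            words[nword].frequency = 1;
            nword++;
        }
    }
    return nword;
}
```

Because the comparison uses strcmp, "the" and "The" are counted separately, as the question requires.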

Why am I getting a hyphen at start of file?

I am learning C and I have tried to build a program that outputs its own source. This is my source:
#include <stdio.h>
int S = 512;
int main(){
FILE * fp;
fp = fopen("hello.c","r");
char * line = (char *) malloc(S);
int i = 0;
while (i == 0)
{
i = feof(fp);
printf("%s",line);
fgets(line,S,fp);
}
fclose(fp);
}
I have used the tcc compiler and I got this output:
But notice, I got a hyphen before #include. The rest of the output is correct.
So please can someone explain why I got this hyphen??
You're printing the first line before you've read anything.
#include <stdio.h>
int main(){
FILE *fp = fopen("hello.c", "r");
char line[256];
while (fgets(line, sizeof line, fp) != NULL)
printf("%s",line);
fclose(fp);
return 0;
}
@ooga gave you the correct answer.
The reason is that malloc doesn't initialize the memory before it returns it to you, unlike its sister calloc.
Most likely, on another platform or compiler, you'd get something different.
Some compilers use a debug heap that fills "uninitialized" memory with a specific value. In release mode you will probably get random garbage instead of a '-' every time.

C program not entering for loop, but executing everything right up until it

For some reason, when I run this program it refuses to enter a for loop and just hangs.
main()
{
char *buffer;
int chunk_offset, current_chunk_number, total_chunks;
int filelen;
filelen = ReadFileintoBuffer( buffer);
printf("file read \n"); // error break point 1
int events = filelen/eventlen; // number of 512 + 2 32 bit events events with timecode
int sp = 0;
int i,j;
int toterror[25];
printf("file of length %d has %d events \n", filelen, events);
printf(" i = %d \n", i);
for( i = 0; i < events; i++)
{
printf("analyzed %d events of %d", i, events);
sp=0;
int error[13];
Analyzeevent(buffer+i*eventlen, error);
if(error[0])
{
sp = 12;
}
for( j =1; j < 13; j++)
{
toterror[j+sp] += error[j];
}
}
printf("post for loop");
Printerrs(toterror, events);
exit(0);
}
It prints everything down to i = (9727988 in this particular case), then nothing. It all just stops. Any idea what happened? I learned C++, and programming in C is very strange and awkward for me right now. The compiler doesn't throw up any warnings or anything.
Thank you for your help in advance.
edit: for ring0 heres the code to ReadFileintoBuffer:
int ReadFileintoBuffer( char *buffer)
{
int filelen;
char text[200];
printf("Input File: " );
scanf( "%s" ,text);
FILE *file = 0;
int i;
//Open file
file = fopen(text, "rb");
if (!file)
{
fprintf(stderr, "Unable to open file \n");
exit(0);
}
//Get file length
fseek(file, 0, SEEK_END); // find the end of the file
filelen=ftell(file); // set the current pointer (currently at end from above) as file length
fseek(file, 0, SEEK_SET); // set pointer back to beginning of file
//Allocate memory
buffer=(char *)malloc(filelen+1);
if (!buffer)
{
fprintf(stderr, "Memory error! \n");
fclose(file);
return;
}
//Read file contents into buffer
fread(buffer, filelen, 1, file);
fclose(file); // clse the file, all in buffer now
return filelen;
}
You may be getting some buffering issues with printf. Try adding a \n to the print inside the for loop. (Like you have in some of the earlier prints)
It may be executing all the code but you aren't seeing the printed result.
You need to use a debugger - gdb
What is eventlen here? What is its type? What is its value and filelen's value?
int events = filelen/eventlen;
It is possible that events is zero or negative, in which case the body of this loop never runs:
for( i = 0; i < events; i++)
Looking at
char *buffer;
filelen = ReadFileintoBuffer( buffer);
it seems something is not going right.
We don't see the code of ReadFileintoBuffer but it cannot use the buffer parameter in any way:
buffer is a non allocated string
its value, and not its reference, is passed to the function
Thus, either
buffer should be allocated prior to calling ReadFileintoBuffer(), e.g. with malloc(), or
its address should be passed instead, like
filelen = ReadFileintoBuffer( &buffer);
It all depends on the ReadFileintoBuffer() function.
Edit based on ReadFileintoBuffer() code
The code has a problem with buffer as expected.
The local buffer parameter is allocated in the function, but its pointer value is not copied into the caller's buffer.
There are several ways to fix this - one of which is, as mentioned above, to pass the reference to the buffer pointer, and not its value to the function.
filelen = ReadFileintoBuffer( &buffer); // note the &
And the function itself - buffer related:
int ReadFileintoBuffer( char **buffer) // note the **
{
// ...
//Allocate memory ( using *buffer instead of buffer )
*buffer=(char *)malloc(filelen+1);
if (!*buffer)
{
// ...
}
//Read file contents into *buffer
fread(*buffer, filelen, 1, file);
// ...
}
This way the main() buffer variable allocation is actually performed by that function.
Note that the solution you approved does not fix the problem: it just empties the printf buffer... Unless buffer is fixed, your program cannot run correctly.
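Putting the pieces together, a complete version of the pointer-to-pointer approach could look like this sketch. It is not the asker's function: the name read_into_buffer is hypothetical, it takes an already opened FILE * instead of prompting for a file name, and it reports failure with -1 instead of calling exit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads all of `file` into a malloc'd, NUL-terminated
   buffer and stores its address in *out. Returns the number of bytes read,
   or -1 on failure (in which case *out is NULL). */
long read_into_buffer(FILE *file, char **out)
{
    assert(file != NULL && out != NULL);
    *out = NULL;
    if (fseek(file, 0, SEEK_END) != 0)
        return -1;
    long len = ftell(file);                /* size of the file in bytes */
    if (len < 0)
        return -1;
    if (fseek(file, 0, SEEK_SET) != 0)
        return -1;

    char *buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return -1;
    if (fread(buf, 1, (size_t)len, file) != (size_t)len) {
        free(buf);                         /* short read: give up cleanly */
        return -1;
    }
    buf[len] = '\0';
    *out = buf;                            /* hand the allocation to the caller */
    return len;
}
```

The caller declares char *buffer = NULL; passes &buffer, and calls free(buffer) when done.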
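If you are trying to track down a problem with intermittent printf calls, follow each one with fflush(stdout). A trailing \n has the same effect when stdout is line-buffered, as it usually is on a terminal. Otherwise you cannot conclude anything from an app that merely appears to hang.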
eventlen is not defined prior to this line:
int events = filelen/eventlen;
Use printf to see its value.

Resources