Segmentation fault in file accessing - c

May I ask why this piece of code results in a segmentation fault? I'm trying to read input from a text file and I can't figure out what the problem is.
using namespace std;
using namespace cv;
int main()
{
char str[50];
FILE *trainfile;
int k, n, maxval1, maxval2, classnum;
char dataArray[n][3];
trainfile = fopen("training.txt", "r+");
if(trainfile == NULL){
perror("Cannot open file.\n");
}else{
while(!feof(trainfile)){
fscanf(trainfile, "%s", str);
}
}
fclose(trainfile);
return 0;
}

int k, n, maxval1, maxval2, classnum;
char dataArray[n][3];
n is not initialized, so its value is indeterminate, and using it as an array size is undefined behavior.
Admittedly, dataArray is not used anyway.
The other problem in code is your data buffer:
char str[50];
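should be big enough to hold the longest whitespace-delimited token in the file, because that is what fscanf reads with %s (one token per call, not the whole file). If a token is longer than 49 characters, fscanf writes past the end of str, which is undefined behavior. A field width such as %49s prevents that.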
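As a minimal sketch of the bounded read described above: count_tokens is a hypothetical helper name, not something from the question. It limits each read to the buffer size and loops on the return value of fscanf instead of on feof.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts whitespace-delimited tokens in an open stream. */
static long count_tokens(FILE *fp)
{
    char str[50];
    long count = 0;

    assert(fp != NULL);
    /* %49s stores at most 49 characters plus the terminating NUL, so str
       cannot overflow. The loop ends as soon as fscanf fails to read a
       token, which covers both end of file and read errors. */
    while (fscanf(fp, "%49s", str) == 1) {
        count++;
    }
    return count;
}
```

Note that a token longer than 49 characters is read in several pieces rather than overflowing. A caller would also return early when fopen gives NULL, so that fclose is never called on a null pointer as it is in the question's code.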

One problem is that your buffer might not be big enough.
You should get the size of the file first, then make a dynamic buffer of that size, and then finally read the file.
fseek(trainfile, 0, SEEK_END); //Go to end
long size = ftell(trainfile); //Tell offset of end from beginning (ftell returns long)
char *buffer = malloc(size + 1); //Make a buffer of the right size, plus room for a terminating NUL
fseek(trainfile, 0, SEEK_SET); //Rewind the file
//Read file here with buffer
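Here is one way the steps above could be put together, with the error checks filled in. read_all is a hypothetical name, and the sketch assumes the stream is seekable and was opened in binary mode, so that the offset from ftell matches the number of bytes fread delivers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads an open, seekable stream into a newly allocated,
   NUL-terminated buffer. Returns NULL on failure; the caller frees the result. */
static char *read_all(FILE *fp, size_t *out_len)
{
    assert(fp != NULL && out_len != NULL);

    if (fseek(fp, 0, SEEK_END) != 0) return NULL;   /* go to end */
    long size = ftell(fp);                          /* offset of end */
    if (size < 0) return NULL;
    if (fseek(fp, 0, SEEK_SET) != 0) return NULL;   /* rewind */

    char *buffer = malloc((size_t)size + 1);        /* +1 for the NUL */
    if (buffer == NULL) return NULL;

    size_t got = fread(buffer, 1, (size_t)size, fp);
    buffer[got] = '\0';
    *out_len = got;
    return buffer;
}
```

The length is reported separately because a file may contain NUL bytes of its own, in which case strlen on the buffer would stop early.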

Related

code in C being killed when reading a 250MB file

I am trying to process a 250MB file using a program written in C.
The file is basically a dataset and I want to read just some of the columns and (more importantly) break one of them (which is originally a string) into a sequence of characters.
However, even though I have plenty of RAM available, the program is killed by Konsole (using KDE Neon) every time I run it.
The source is available below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main() {
FILE *arquivo;
char *line = NULL;
size_t len = 0;
int i = 0;
int j;
int k;
char *vetor[500];
int acertos[45];
FILE *licmat = fopen("licmat.csv", "w");
//creating the header
fprintf(licmat,"CO_CATEGAD,CO_UF_CURSO,ACERTO09,ACERTO10,ACERTO11,ACERTO12,ACERTO13,ACERTO14,ACERTO15,ACERTO16,ACERTO17,ACERTO18,ACERTO19,ACERTO20,ACERTO21,ACERTO22,ACERTO23,ACERTO24,ACERTO25,ACERTO26,ACERTO27,ACERTO28,ACERTO29,ACERTO30,ACERTO31,ACERTO32,ACERTO33,ACERTO34,ACERTO35\n");
if ((arquivo = fopen("MICRODADOS_ENADE_2017.csv", "r")) == NULL) {
printf ("\nError");
exit(0);
}
//reading one line at a time
while (getline(&line, &len, arquivo)) {
char *ptr = strsep(&line,";");
j=0;
//breaking the line into a vector based on ;
while(ptr != NULL)
{
vetor[j]=ptr;
j=j+1;
ptr = strsep(&line,";");
}
//filtering based on content
if (strcmp(vetor[4],"702")==0 && strcmp(vetor[33],"555")==0) {
//copying some info
fprintf(licmat,"%s,%s,",vetor[2],vetor[8]);
//breaking the string (32) into isolated characters
for (k=0;k<27;k=k+1) {
fprintf(licmat,"%c", vetor[32][k]);
if (k<26) {
fprintf(licmat,",");
}
}
fprintf(licmat,"\n");
}
i=i+1;
}
free(line);
fclose(arquivo);
fclose(licmat);
}
The output is perfect up to the point when the script is killed. The output file is just 640KB long and has about 10000 lines only.
What could be the issue?
It looks to me like you're mishandling the memory buffer managed by getline() - which allocates/reallocates as needed - by the use of strsep(), which seems to manipulate that same pointer value.
Once line has been updated to reflect some other element on the line, it's no longer pointing to the start of allocated memory, and then boom the next time getline() needs to do anything with it.
Use a different variable to pass to strsep():
while (getline(&line, &len, arquivo) > 0) { // getline returns -1 at end of file or on error; a blank line still returns 1
char *parseline = line;
char *ptr = strsep(&parseline,";");
// do the same thing later
The key thing here: you're not allowed to muck with the value of line other than to free() it at the end (which you do), and you can't let any other routine do it either.
Edit: updated to reflect getline() returning a negative value (-1) on error or at end of file (h/t to @user3121023)
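A small sketch of the same idea as a function, assuming glibc (where strsep needs _DEFAULT_SOURCE under strict standard modes). split_fields is a hypothetical name. Because the line pointer is passed by value and only a local copy is handed to strsep, the pointer that getline manages is never changed.

```c
#define _DEFAULT_SOURCE /* assumption: glibc; exposes strsep() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits line in place on ';' and stores up to
   max_fields field pointers. Returns the number of fields stored. */
static size_t split_fields(char *line, char *fields[], size_t max_fields)
{
    char *cursor = line;  /* strsep advances this copy, not the caller's pointer */
    char *tok;
    size_t n = 0;

    assert(line != NULL && fields != NULL);
    while (n < max_fields && (tok = strsep(&cursor, ";")) != NULL) {
        fields[n++] = tok;  /* empty fields come back as "" */
    }
    return n;
}
```

Inside the getline loop this would be called as split_fields(line, vetor, 500), and the returned count can be checked to be greater than 33 before vetor[33] is read, which also protects against short lines.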

"Invalid argument" when creating a file

Could you help me with creating a text file? Right now fopen returns NULL, so the file pointer fp is never valid.
Using the library errno.h and extern int errno I get "Value of errno: 22".
if (!fp)perror("fopen") gives me "Error opening file: Invalid argument".
In my main function I enter the name of the file:
void main()
{
float **x;
int i,j;
int l,c;
char name_file[30];
FILE *res;
/* some lines omitted */
printf("\nEnter the name of the file =>");
fflush (stdin);
fgets(name_file,30,stdin);
printf("Name of file : %s", name_file);
res=creation(name_file,l,c,x);
printf("\nThe created file\n");
readfile(res,name_file);
}
The function to create the text file:
FILE *creation(char *f_name,int l, int c, float **a) // l rows - c colums - a array
{ FILE *fp;
int i,j;
fp = fopen(f_name,"wt+"); // create for writing and reading
fflush(stdin);
/* The pointer to the file is NULL: */
if (!fp)perror("fopen"); // it's returning Invalid argument
printf("%d\n",fp); //0 ??
if(fp==NULL) { printf("File could not be created!\n"); exit(1); }
fflush(stdin);
for(i=0;i<l;i++)
{
for(j=0;j<c;j++)
{
fprintf(fp,"%3.2f",a[i][j]); // enter every score of the array in the text file
}
fprintf(fp,"\n");
}
return fp;
}
Function to read the file and check if it is correct:
void readfile(FILE *fp,char *f_name)
{
float a;
rewind(fp);
if(fp==NULL) { printf("File %s could not open\n",f_name); exit(1); }
while(fscanf(fp,"%3.2f",&a)!= EOF)
printf("\n%3.2f",a);
fclose(fp);
}
There are quite a few things wrong with your code.
1.
The correct signatures of main are
int main(void);
int main(int argc, char **argv);
int main(int argc, char *argv[]);
See What are the valid signatures for C's main() function?
2.
The behaviour of fflush(stdin) is undefined. See Using fflush(stdin).
fflush works with output buffers: it tells the library to write out
the buffered content. stdin is an input stream, so flushing it makes no sense.
3.
Use fgets like this:
char name_file[30];
fgets(name_file, sizeof name_file, stdin);
Using sizeof name_file is more robust because it always gives you
the correct size. If you later change the declaration of name_file to
a char array with fewer than 30 elements, but forget to change the size parameter in fgets, you
might end up with a buffer overflow.
4.
You are passing to creation the uninitialized pointer x, which points
nowhere. Inside that function it arrives as the parameter a, and you can neither read nor write through it.
You need to allocate memory before calling creation. At least judging
from the code you posted.
5.
fgets preserves the newline ('\n') character, so
name_file contains the newline character. POSIX file systems allow a newline
in a file name, but Windows does not, and fopen can fail there with EINVAL
(errno 22, "Invalid argument"), which is the likely reason for your error.
You almost certainly don't want newlines in your file names anyway, so
remove it before passing the name to fopen:
char name_file[30];
fgets(name_file, sizeof name_file, stdin);
size_t len = strlen(name_file); // needs #include <string.h>
if(len > 0 && name_file[len - 1] == '\n')
name_file[len - 1] = 0;
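Points 3 and 5 can be combined into a small helper. This is only a sketch: read_name is a hypothetical name, and it uses strcspn to cut the string at the first line-ending character, which also works when the input is empty or has no newline at all.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf and removes the
   trailing line ending. Returns 1 on success, 0 on end of file or error. */
static int read_name(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    if (fgets(buf, (int)size, in) == NULL) {
        return 0;
    }
    /* strcspn returns the index of the first '\r' or '\n', or the string
       length if neither is present, so this write is always in bounds. */
    buf[strcspn(buf, "\r\n")] = '\0';
    return 1;
}
```

In main this would replace the fflush(stdin) and fgets pair: call read_name(stdin, name_file, sizeof name_file) and only go on to creation when it returns 1.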

Getting the error message "assignment makes pointer from integer without a cast" while trying to write a parsing program

I'm trying to write a parsing program that will read the file /proc/stat and store its various tokens in arrays. This is the progress I have made so far. My problem is with the line
s = strtok(str, " ");
With this line I get the error message:
ass2.c:62:15: warning: assignment makes pointer from integer without a cast [enabled by default]
s = strtok(string, " ");
I'm not sure how to solve this issue. I'm just about a complete beginner with C, so I'm not familiar with the compiler's messages and I'm struggling to fix this. Below I have pasted the entire program's code so far.
//standard input/output file to help with io operations
#include<stdio.h>
//standard library files to help with exit and other standard functions
#include<stdlib.h>
//header file for usleep function
#include <unistd.h>
int main()
{
//FILE pointer will need to be declared initially, in this example the name is fp
FILE *fp;
//A character pointer that will store each line within the file; you will need to parse this line to extract useful information
char *str = NULL;
//size_t defined within C is a unsigned integer; you may need this for getline(..) function from stdio.h to allocate buffer dynamically
size_t len = 0;
//ssize_t is used to represent the sizes of blocks that can be read or written in a single operation through getline(..). It is similar to size_t, but must be a signed type.
ssize_t read;
int cpu_line1[5];
int cpu_line2[5];
int cpu_line3[5];
int cpu_line4[5];
int cpu_line5[5];
int page_line[3];
int swap_line[3];
int intr_line[2];
int ctxt_line[2];
int btime_line[2];
//a variable declared to keep track of the number of times we read back the file
unsigned int sample_count = 0;
//opening the file in read mode; this file must be closed after you are done through fclose(..); note that explicit location of the file to ensure file can be found
fp = fopen("/proc/stat", "r");
//checking if the file opening was successful; if not we do not want to proceed further and exit with failure code right away
if(fp == NULL)
{
exit(EXIT_FAILURE);
}
int i = 0;
char **string = NULL; //declaration of string
string = (char**)malloc(10*sizeof(char*)); //assign space for 10 pointers to array
for (i=0; i<10; i++) //allocate 50 bytes to each string in the array
{
string[i] = (char*)malloc(50*sizeof(char));
}
//a loop that will read one line in the file at a time; str will read the line; len will store the length of the file
while(sample_count < 1)
{
printf("\e[1;1H\e[2J"); //this line will make sure you have cleared the previous screen using C's powerful format specifiers
printf("----------------------------------------------------------------\n");//used for presentation
printf("Sample: %u\n", sample_count); //showing the sample count
while ((read = getline(&str, &len, fp)) != -1)
{
printf("Retrieved line: \n%sof length: %zu, allocated buffer: %u :\n", str, read, (unsigned int) len);
//You will then need to extract the useful information, including the name and the statistics
char *s = NULL;
s = strtok(str, " ");
sprintf(string[i], s);
printf("Test: %s", string[0]);
}
if (i=0)
{
cpu_line1[0] = atoi(strtok(NULL, " "));
cpu_line1[1] = atoi(strtok(NULL, " "));
cpu_line1[2] = atoi(strtok(NULL, " "));
}
printf("----------------------------------------------------------------\n"); //used for presentation
usleep(500000);//this will ensure time delay
rewind(fp);//rewind the file pointer to start reading from the beginning
sample_count++;//update the sample count
}
//once you are done, you should free the pointers to make your program memory efficient
free(str);
//once you are done, you should also close all file pointers to make your program memory efficient
fclose(fp);
return 0;
}
Since you did not #include the header file in which strtok is declared, the compiler assumes that the return type of the function is int. Hence, the warning.
Add
#include <string.h>
to fix the problem.
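With the header included, tokenizing one line could look roughly like the sketch below. parse_stat_line is a hypothetical helper, and the input format (a label followed by space-separated integers) is assumed from the shape of the cpu lines the question is trying to parse.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>   /* declares strtok, so its return type is char * */

/* Hypothetical helper: splits "label n1 n2 ..." in place. Stores the label
   in *name and up to max_values numbers in values. Returns the number of
   values parsed, or -1 if the line is empty. */
static int parse_stat_line(char *line, const char **name, long values[], int max_values)
{
    int n = 0;
    char *tok;

    assert(line != NULL && name != NULL && values != NULL);

    tok = strtok(line, " \n");          /* first call passes the string */
    if (tok == NULL) {
        return -1;
    }
    *name = tok;

    /* Later calls pass NULL to continue with the same string. */
    while (n < max_values && (tok = strtok(NULL, " \n")) != NULL) {
        values[n++] = strtol(tok, NULL, 10);
    }
    return n;
}
```

Checking each strtok result before converting it avoids handing NULL to a conversion function such as atoi when a line has fewer fields than expected.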

Trouble finding frequency of words from a file in C

I need to write code that will print the frequency of each word from a given file. Words like "the" and "The" will count as two different words. I've written some code so far, but the command prompt stops working when I try to run the program. I just need some guidance and to be pointed in the best direction for this code, or I would like to be told that this code needs to be abandoned. I'm not very good at this, so any help would be much appreciated.
#include <stdio.h>
#include <string.h>
#define FILE_NAME "input.txt"
struct word {
char wordy[2000];
int frequency;
} words;
int word_freq(const char *text, struct word words[]);
int main (void)
{
char *text;
FILE *fp = fopen(FILE_NAME, "r");
fread(text, sizeof(text[0]), sizeof(text) / sizeof(text[0]), fp);
struct word words[2000];
int nword;
int i;
nword = word_freq(text, words);
puts("\nWord frequency:");
for(i = 0; i < nword; i++)
printf(" %s: %d\n", words[i].wordy, words[i].frequency);
return 0;
}
int word_freq(const char *text, struct word words[])
{
char punctuation[] =" .,;:!?'\"";
char *tempstr;
char *pword;
int nword;
int i;
nword = 0;
strcpy(tempstr, text);
while (pword != NULL) {
for(i = 0; i < nword; i++) {
if (strcmp(pword, words[i].wordy) == 0)
break;
}
if (i < nword)
words[i].frequency++;
else {
strcpy(words[nword].wordy, pword);
words[nword].frequency= 1;
nword++;
}
pword = strtok(NULL, punctuation);
}
return nword;
}
First of all:
char *text;
FILE *fp = fopen(FILE_NAME, "r");
fread(text, sizeof(text[0]), sizeof(text) / sizeof(text[0]), fp);
This probably reads only 4 or 8 bytes of your file, because sizeof(text[0]) is 1 and sizeof(text) is the size of a pointer (4 or 8 bytes, depending on the platform), not the size of the file. You need to use ftell() or some other means to get the actual size of your data file in order to read it all into memory.
Next, you are storing this information into a pointer that has no memory allocated to it. text needs to be malloc'd or made to hold memory in some way. This is probably what is causing your program to fail to work, just to start.
There are so so SO many further issues that it will take time to explain them:
How you are using strcpy to blow up memory when you copy into tempstr, which points at no allocated memory.
How even if that weren't the case, it would probably copy the whole file at once, unless the file had NUL-terminated strings within, which it may, so perhaps this is ok.
How you compare words[i].wordy, even though it is not initialized and therefore garbage.
How, even if your file were read into memory correctly, you test pword in your loop condition while it is uninitialized, because strtok is never called on tempstr before the loop.
Please, get some help or ask your teacher about this because this code is seriously broken.
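For comparison, here is a rough sketch of how the counting function could look once those issues are addressed. The struct name, the 64-character word limit and the max_words parameter are assumptions made for this example, not part of the original code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD_LEN 64   /* assumed limit; longer words are truncated */

struct word_count {
    char text[MAX_WORD_LEN];
    int frequency;
};

/* Counts case-sensitive word frequencies in text. Returns the number of
   distinct words stored (at most max_words), or -1 if allocation fails. */
static int word_freq(const char *text, struct word_count words[], int max_words)
{
    const char *delims = " \t\r\n.,;:!?'\"";
    int nword = 0;

    assert(text != NULL && words != NULL);

    /* strtok modifies its input, so work on an allocated private copy. */
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL) {
        return -1;
    }
    strcpy(copy, text);

    /* The first strtok call takes the buffer; later calls take NULL. */
    for (char *p = strtok(copy, delims); p != NULL; p = strtok(NULL, delims)) {
        int i;
        for (i = 0; i < nword; i++) {
            if (strncmp(p, words[i].text, MAX_WORD_LEN - 1) == 0) {
                break;
            }
        }
        if (i < nword) {
            words[i].frequency++;
        } else if (nword < max_words) {
            strncpy(words[nword].text, p, MAX_WORD_LEN - 1);
            words[nword].text[MAX_WORD_LEN - 1] = '\0';
            words[nword].frequency = 1;
            nword++;
        }
    }
    free(copy);
    return nword;
}
```

The text passed in must be a NUL-terminated string, so the file contents need to be read into an allocated buffer and terminated before this is called.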

c - get file into array of chars

Hi, I have the code below, where I try to get all the lines of a file into an array. For example, if the file data.txt contains the following:
first line
second line
then I want the data array in the code below to contain the following:
data[0] = "first line";
data[1] = "second line"
My first question: Currently I am getting "Segmentation fault"... Why?
This is the exact output I get:
Number of lines is 7475613
Segmentation fault
My second question: Is there any better way to do what I am trying to do?
Thanks!!!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char* argv[])
{
FILE *f = fopen("data.txt", "rb");
fseek(f, 0, SEEK_END);
long pos = ftell(f);
fseek(f, 0, SEEK_SET);
char *bytes = malloc(pos);
fread(bytes, pos, 1, f);
int i =0;
int counter = 0;
for(; i<pos; i++)
{
if(*(bytes+i)=='\n') counter++;
}
printf("\nNumber of lines is %d\n", counter);
char* data[counter];
int start=0, end=0;
counter = 0;
int length;
for(i=0; i<pos; i++)
{
if(*(bytes+i)=='\n')
{
end = i;
length =end-start;
data[counter]=(char*)malloc(sizeof(char)*(length));
strncpy(data[counter],
bytes+start,
length);
counter = counter+1;
start = end+1;
}
}
free(bytes);
return 0;
}
The first line of data.txt in this case is not just '\n'; it is: "23454555 6346346 3463463".
Thanks!
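You need to malloc 1 more char for data[counter] to hold the terminating NUL.
After strncpy, you need to terminate the destination string yourself, because strncpy does not add a NUL when the source has none within the first length characters.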
Edit after edit of original question
Number of lines is 7475613
Whooooooaaaaaa, that's a bit too much for your computer!
If the size of a char * is 4, you want to reserve 29902452 bytes (about 30 MB) of automatic (stack) memory in the allocation of data, which is more than a typical default stack size allows.
You can allocate that memory dynamically instead:
/* char *data[counter]; */
char **data = malloc(counter * sizeof *data);
/* don't forget to free the memory when you no longer need it */
Edit: second question
My second question: Is there any better way to do what I am trying to do?
Not really; you're doing it right. But maybe you can code without the need to have all that data in memory at the same time.
Read and deal with a single line at a time.
You also need to free(data[counter]); in a loop ... and free(data); before the "you're doing it right" above is correct :)
And you need to check if each of the several malloc() calls succeeded LOL
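Putting those points together (one extra byte for the NUL, explicit termination, heap allocation for the pointer array, checked malloc calls, and a matching cleanup), a sketch could look like this. split_lines and free_lines are hypothetical names. Like the original loop, it only stores lines that end in '\n'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies each '\n'-terminated line of bytes[0..len)
   into its own NUL-terminated string. Returns a heap array of *count
   strings, or NULL if an allocation fails. */
static char **split_lines(const char *bytes, size_t len, size_t *count)
{
    size_t lines = 0, start = 0, n = 0;

    assert(bytes != NULL && count != NULL);

    for (size_t i = 0; i < len; i++) {
        if (bytes[i] == '\n') lines++;
    }

    /* Heap instead of a variable-length array, so a large line count
       cannot overflow the stack. */
    char **data = malloc((lines ? lines : 1) * sizeof *data);
    if (data == NULL) return NULL;

    for (size_t i = 0; i < len; i++) {
        if (bytes[i] != '\n') continue;
        size_t length = i - start;
        data[n] = malloc(length + 1);        /* +1 for the terminating NUL */
        if (data[n] == NULL) {
            while (n > 0) free(data[--n]);   /* undo earlier allocations */
            free(data);
            return NULL;
        }
        memcpy(data[n], bytes + start, length);
        data[n][length] = '\0';
        n++;
        start = i + 1;
    }
    *count = n;
    return data;
}

/* Frees everything split_lines allocated. */
static void free_lines(char **data, size_t count)
{
    for (size_t i = 0; i < count; i++) free(data[i]);
    free(data);
}
```

memcpy is used here instead of strncpy because the length is already known, which makes the explicit termination on the next line the only place the NUL comes from.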
First of all you need to check if the file got opened correctly or not:
FILE *f = fopen("data.txt", "rb");
if(!f)
{
fprintf(stderr,"Error opening file");
exit (1);
}
If there is an error opening the file and you don't check for it, you'll get a seg fault when you try to fseek on a NULL file pointer.
Apart from that I see no errors. Tried running the program, by printing the value of the data array at the end, it ran as expected.
One thing to note is that you're opening your file as binary - line termination disciplines may not work as you expect on your platform (UNIX is lf, Windows is cr-lf, some versions of MacOS are cr).

Resources