I'm trying to write a stream editor in C and I'm having a hard time dealing with strings. After reading in the lines of a File, I want to store them locally in an array of Strings. However, when I try to store the variable temp into the array of strings StoredEdits I get a segmentation fault (core dumped) error. Furthermore, if I uncomment the char* temp2 variable and save this into my array as a workaround, then the last value read in gets stored for every value in the array.
I assume this has to do with the fact that temp2 is a pointer. I've tried a million things like malloc'ing and free'ing this variable after each iteration, but nothing seems to work.
Any help would be greatly appreciated.
#define MAX_SIZE 100
typedef char String[MAX_SIZE];
int main(int argc, char* argv[]){
char** StoredEdits;
int index, numOfEdits;
FILE *EditFile;
char* temp;
//char* temp2;
StoredEdits = (char**)malloc(MAX_INPUT_SIZE*sizeof(String));
/*Check to see that edit file is passed in.*/
if(argc < 2){
printf("ERROR: Edit File not given\n");
return(EXIT_FAILURE);
}
printf("%s\n",argv[1]);
if( (EditFile = fopen(argv[1],"r")) != NULL ){
printf("file opened\n");
numOfEdits = 0;
while(fgets(temp, MAX_STRING_SIZE, EditFile) != NULL){
printf("%d %s",numOfEdits,temp);
//temp2 = temp;
StoredEdits[numOfEdits++] = temp;
//StoredEdits[numOfEdits++] = temp;
printf("Stored successfully\n");
}
..........
printf("%d\n",numOfEdits);
for(index=0;index<numOfEdits;index++){
printf("%d %s\n",index, StoredEdits[index]);
}
You need to initialize temp to point to valid storage.
temp = malloc(MAX_STRING_SIZE+1);
It looks like you may have intended to do something like this:
String temp;
using your typedef. This would be clearer as a regular char array, and the conventional name for it is buffer.
char buffer[MAX_STRING_SIZE+1];
Then, you should store in your array not temp itself, but a new string containing a copy of the contents. The POSIX function strdup should be helpful here. Note that strdup was not part of the C standard before C23, but it is available in most hosted implementations. Historically, it comes from the BSD branch.
StoredEdits[numOfEdits++] = strdup(temp);
Let me backpedal a little and say that if you're allocating new storage for temp inside the loop, then you should skip the strdup because, as Jim Balter says, this will leak memory. If you allocate temp outside of the loop, then it makes little difference whether you allocate it statically (by declaring a char []) or dynamically (with malloc).
By the way, this line will not buy you much:
typedef char String[MAX_SIZE];
For why, see the classic Kernighan (the K in K&R) essay Why Pascal is not my favorite Programming Language.
Also note, that my examples above do not check the pointer returned by malloc. malloc can fail. When malloc fails it will return a NULL pointer. If you try to store data through this pointer, Kaboom!
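Putting the pieces of this answer together, a rough sketch might look like the following. The names read_lines, copy_string and BUF_SIZE are made up for illustration, and it uses plain malloc plus strcpy rather than strdup so that it only relies on standard C. One reusable buffer is filled by fgets, and each stored entry is its own heap copy with the malloc result checked.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUF_SIZE 100  /* assumed maximum line length, including the newline */

/* Return a heap copy of s, or NULL if malloc fails. */
static char *copy_string(const char *s)
{
    char *p = malloc(strlen(s) + 1);
    if (p != NULL)
        strcpy(p, s);
    return p;
}

/* Read up to max lines from fp, storing an independent copy of each in out.
   Returns the number of lines stored. The caller frees every out[i]. */
size_t read_lines(FILE *fp, char **out, size_t max)
{
    char buffer[BUF_SIZE];  /* reused for every line */
    size_t n = 0;

    while (n < max && fgets(buffer, sizeof buffer, fp) != NULL) {
        char *copy = copy_string(buffer);
        if (copy == NULL)
            break;          /* out of memory: stop and keep what we have */
        out[n++] = copy;
    }
    return n;
}
```

Because every element of out points at separately allocated memory, overwriting buffer on the next iteration no longer changes the lines that were stored earlier.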
You're right that your problem comes from pointer semantics. You should copy the contents of the string from temp.
char *cpy = malloc(1 + strlen(temp));
if (cpy)
strcpy(cpy, temp);
//else handle error
StoredEdits[numOfEdits++] = cpy;
Others have already explained the reason for the error.
But from the program, I see that you tried to allocate a two-dimensional character array and then store each line read from the file in it.
StoredEdits = (char**)malloc(MAX_INPUT_SIZE*sizeof(String));
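If my assumption is right, StoredEdits needs to be declared as String * (a pointer to fixed-size rows) rather than char **, and you should then copy each line into its row with strcpy, like this:
strcpy(StoredEdits[numOfEdits], temp);
When the lines in a file vary in length, it is better to use an array of pointers, each pointing to its own character array.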
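For comparison, here is a small sketch of the fixed-size-row layout, assuming the question's MAX_SIZE and String typedef. The helpers alloc_rows and store_row are hypothetical names, not part of the original program.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 100
typedef char String[MAX_SIZE];

/* Allocate n rows of MAX_SIZE bytes each as one contiguous block.
   Returns NULL on failure; release the whole block with a single free(). */
String *alloc_rows(size_t n)
{
    String *rows = malloc(n * sizeof *rows);
    return rows;
}

/* Copy src into row i, truncating if needed so the row stays terminated. */
void store_row(String *rows, size_t i, const char *src)
{
    strncpy(rows[i], src, MAX_SIZE - 1);
    rows[i][MAX_SIZE - 1] = '\0';
}
```

Here rows[i] is a real 100-byte array rather than an uninitialized pointer, so copying into it is valid. The cost is that every line occupies MAX_SIZE bytes no matter how short it is.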
I am trying to write a function to convert a text file into a CSV file.
The input file has 3 lines with space-delimited entries. I have to find a way to read a line into a string and transform the three lines from the input file to three columns in a CSV file.
The files look like this :
Jake Ali Maria
24 23 43
Montreal Johannesburg Sydney
And I have to transform it into something like this:
Jake, 24, Montreal
...etc
I figured I could create a char **line variable that would hold three references to three separate char arrays, one for each of the three lines of the input file. I.e., my goal is to have *(line+i) store the i+1'th line of the file.
I wanted to avoid hardcoding char array sizes, such as
char line1 [999];
fgets(line1, 999, file);
so I wrote a while loop to fgets pieces of a line into a small buffer array of predetermined size, and then strcat and realloc memory as necessary to store the line as a string, with *(line+i) as a pointer to the string, where i is 0 for the first line, 1 for the second, etc.
Here is the problematic code:
#include <stdio.h>
#include<stdlib.h>
#include<string.h>
#define CHUNK 10
char** getLines (const char * filename){
FILE *file = fopen(filename, "rt");
char **lines = (char ** ) calloc(3, sizeof(char*));
char buffer[CHUNK];
for(int i = 0; i < 3; i++){
int lineLength = 0;
int bufferLength = 0;
*(lines+i) = NULL;
do{
fgets(buffer, CHUNK, file);
buffLength = strlen(buffer);
lineLength += buffLength;
*(lines+i) = (char*) realloc(*(lines+i), (lineLength +1)*sizeof(char));
strcat(*(lines+i), buffer);
}while(bufferLength ==CHUNK-1);
}
puts(*(lines+0));
puts(*(lines+1));
puts(*(lines+2));
fclose(file);
}
void load_and_convert(const char* filename){
char ** lines = getLines(filename);
}
int main(){
const char* filename = "demo.txt";
load_and_convert(filename);
}
This works as expected only for i=0. However, going through this with GDB, I see that I get a realloc(): invalid pointer error. The buffer loads fine, and it only crashes when I call 'realloc' in the for loop for i=1, when I get to the second line.
I managed to store the strings like I wanted in a small example I did to try to see what was going on, but the inputs were all on the same line. Maybe this has to do with fgets reading from a new line?
I would really appreciate some help with this, I've been stuck all day.
Thanks a lot!
***edit
I tried, as suggested, to use calloc instead of malloc to initialize the variable **lines, but I still have the same issue. I have added the modifications to the original code I uploaded.
***edit
After deleting the file and recompiling, the above now seems to work. Thank you to everyone for helping me out!
You allocate line (which is a misnomer since it's not a single line), which is a pointer to three char*s. You never initialize the contents of line (that is, you never make any of those three char*s point anywhere). Consequently, when you do realloc(*(line + i), ...), the first argument is uninitialized garbage.
To use realloc to do an initial memory allocation, its first argument must be a null pointer. You should explicitly initialize each element of line to NULL first.
Additionally, *(line+i) = (char *)realloc(*(line+i), ...) is still bad because if realloc fails to allocate memory, it will return a null pointer, clobber *(line + i), and leak the old pointer. You instead should split it into separate steps:
char* p = realloc(line[i], ...);
if (p == NULL) {
// Handle failure somehow.
exit(1);
}
line[i] = p;
A few more notes:
In C, you should avoid casting the result of malloc/realloc/calloc. It's not necessary since C allows implicit conversion from void* to other pointer types, and the explicit cast could mask an error where you accidentally omit #include <stdlib.h>.
sizeof(char) is, by definition, 1, so multiplying by it adds nothing.
When you're allocating memory, it's safer to get into a habit of using T* p = malloc(n * sizeof *p); instead of T* p = malloc(n * sizeof (T));. That way if the type of p ever changes, you won't silently be allocating the wrong amount of memory if you neglect to update the malloc (or realloc or calloc) call.
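As a sketch of how these points fit together, the helper below (append_chunk is a hypothetical name) grows a heap string with realloc through a temporary pointer, starts from a null pointer, and avoids both the cast and sizeof(char).

```c
#include <stdlib.h>
#include <string.h>

/* Append chunk to the heap string *line, whose current length is *len.
   *line may be NULL (with *len == 0) on the first call.
   Returns 0 on success; on failure returns -1 and leaves *line untouched. */
int append_chunk(char **line, size_t *len, const char *chunk)
{
    size_t add = strlen(chunk);
    char *p = realloc(*line, *len + add + 1);   /* +1 for the terminator */

    if (p == NULL)
        return -1;          /* the old block is still valid and still owned by the caller */
    memcpy(p + *len, chunk, add + 1);           /* copies the '\0' as well */
    *line = p;
    *len += add;
    return 0;
}
```

A caller would start with char *line = NULL; size_t len = 0; and call append_chunk(&line, &len, buffer) for each piece read, freeing line once at the end.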
Here, you have to zero your array of pointers (for example by using calloc()),
char **line = (char**)malloc(sizeof(char*)*3); //allocate space for three char* pointers
otherwise the reallocs
*(line+i) = (char *)realloc(*(line+i), (inputLength+1)*sizeof(char)); //+1 for the empty character
use an uninitialized pointer, leading to undefined behaviour.
That it works with i=0 is pure coincidence and is a typical pitfall when encountering UB.
Furthermore, when using strcat(), you have to make sure that the first parameter is already a zero-terminated string! This is not the case here, since at the first iteration, realloc(NULL, ...); leaves you with an uninitialized buffer. This can lead to strcat() writing past the end of your allocated buffer and lead to heap corruption. A possible fix is to use strcpy() instead of strcat() (this should even be more efficient here):
do{
if (fgets(buffer, CHUNK, file) == NULL) break; /* stop at end of file or on error */
buffLength = strlen(buffer);
lines[i] = realloc(lines[i], (lineLength + buffLength + 1));
strcpy(lines[i]+lineLength, buffer);
lineLength += buffLength;
}while(buffLength == CHUNK-1);
The check buffLength == CHUNK-1 (written as bufferLength in the question, a variable that is never updated) will not do what you want if the line (including the newline) is exactly CHUNK-1 bytes long. A better check might be while (buffer[buffLength-1] != '\n').
By the way, lines[i] is far more readable than *(lines+i) (which is semantically identical).
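For reference, a self-contained sketch of the chunked read could look like this. It assumes the same CHUNK size as the question, and read_line is a made-up name. It checks the return value of fgets, appends at a known offset instead of calling strcat, and stops on the newline rather than on the chunk length.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 10

/* Read one line of any length from fp, at most CHUNK-1 bytes at a time.
   Returns a heap string without the trailing newline, or NULL at end of
   file or on allocation failure. The caller frees the result. */
char *read_line(FILE *fp)
{
    char buffer[CHUNK];
    char *line = NULL;
    size_t length = 0;

    while (fgets(buffer, sizeof buffer, fp) != NULL) {
        size_t n = strlen(buffer);
        char *p = realloc(line, length + n + 1);
        if (p == NULL) {
            free(line);
            return NULL;
        }
        line = p;
        memcpy(line + length, buffer, n + 1);   /* append, including the '\0' */
        length += n;
        if (length > 0 && line[length - 1] == '\n') {
            line[length - 1] = '\0';            /* whole line read: drop the newline */
            break;
        }
    }
    return line;
}
```

Calling read_line three times on the sample file would yield the three rows, including the second one, which is exactly CHUNK-1 bytes long with its newline.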
As an exercise, I have built a simple program that, given a text file of N lowercase words and whitespace, populates a ragged array char *en[N].
It works without great problems, apart for one: it populates the ragged array with only the last word of the input.
#include<stdio.h>
#include<ctype.h>
int main(int argc, char *argv[]){
int i = 0, j = 0;
char *en[100];
char temp[20];
FILE *p = fopen(argv[1], "r");
char single;
while((single = fgetc(p)) != EOF){
if(!isspace(single)) /* Temporary store a single word */
temp[i++] = single;
else{
temp[i] = '\0';
en[j++] = temp; /* Save stored word in ragged array */
i = 0;
}
}
printf("%s\n", en[0]); /* Return the same than en[1] and en[99] */
printf("%s\n", en[1]);
printf("%s\n", en[99]);
return 0;
}
I cannot understand why it goes down to the end of the input file. I am unable to detect any major issue that would suggest a wrong approach.
Edit:
The reasoning behind my approach was that an array of *char can be initialized with this form:
p[0] = "abc";
This is the reasoning that I wrongly tried to apply in the code above, an error that @coderredoc brilliantly caught. As far as the dimensions of single words and inputs are concerned, I admit I did not pay much attention to them. The exercise is centered on a different topic. In any case, thanks a lot for your valuable suggestions!
All the elements of your array of char pointers point to the same char array, temp, whose contents end up being the last word read. That is why you get only the last word.
A possible solution is to change
en[j++] = temp;
to
en[j++] = strdup(temp);
Then you will achieve the behavior you want your program to have.
You just found out the awesomeness of pointers, congratulations!
Seriously, char *en[100] is an array of pointers. en[j++] = temp; assigns the pointer to the first value of temp to a pointer at en[j++]. And you do this over and over again. No surprise that you end up with an array of pointers, all of which point to the same array temp, which holds the contents of the last word.
What to learn from this: a pointer merely points to some memory, and no memory copying happens when you do en[j++] = temp;. You have to allocate the memory yourself and copy temp to that new memory yourself.
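To make the "allocate and copy yourself" step concrete, here is a hedged sketch. The function name split_words and the WORD_CAP limit are assumptions for illustration. It also reads into an int rather than a char so that EOF can be told apart from a real character.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define WORD_CAP 20   /* assumed longest word, including the terminator */

/* Read whitespace-separated words from fp and store a separate heap copy
   of each in words[]. Returns the number of words stored. */
size_t split_words(FILE *fp, char **words, size_t max_words)
{
    char temp[WORD_CAP];
    size_t len = 0, count = 0;
    int c;                               /* int, so EOF is distinguishable */

    do {
        c = fgetc(fp);
        if (c != EOF && !isspace(c)) {
            if (len < WORD_CAP - 1)
                temp[len++] = (char)c;   /* extra characters are dropped */
        } else if (len > 0) {
            char *copy;
            if (count == max_words)
                break;
            temp[len] = '\0';
            copy = malloc(len + 1);
            if (copy == NULL)
                break;
            memcpy(copy, temp, len + 1);
            words[count++] = copy;       /* each slot owns its own memory */
            len = 0;
        }
    } while (c != EOF);
    return count;
}
```

Only the slots below the returned count are valid, so printing something like en[99] is safe only if at least 100 words were actually read.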
I need to read in a file. The first line of the file is the number of lines in the file, and the function should return an array of strings, with the last element being NULL to indicate the end of the array.
char **read_file(char *fname)
{
char **dict;
printf("Reading %s\n", fname);
FILE *d = fopen(fname, "r");
if (! d) return NULL;
// Get the number of lines in the file
//the first line in the file is the number of lines, so I have to get 0th element
char *size;
fscanf(d, "%s[^\n]", size);
int filesize = atoi(size);
// Allocate memory for the array of character pointers
dict = NULL; // Change this
// Read in the rest of the file, allocting memory for each string
// as we go.
// NULL termination. Last entry in the array should be NULL.
printf("Done\n");
return dict;
}
I put some comments because I know that's what I'm to do, but I can't seem to figure out how to put it in actual code.
To solve this problem you need to do one of two things.
Read the file as characters then convert to integers.
Read the file directly as integers.
For the first, you would use fread to read into a char array and then use atoi to convert it to an integer.
For the second, you would use fscanf with the %d specifier to read directly into an int variable.
fscanf does not allocate memory for you. Passing it a random pointer as you have will only cause trouble. (I recommend avoiding fscanf.)
The question code has a flaw:
char *size;
fscanf(d, "%s[^\n]", size);
Although the above may compile, it will not function as expected at runtime. The problem is that fscanf() needs the memory address of where to write the parsed value. While size is a pointer that can store a memory address, it is uninitialized, and points to no specific memory in the process' memory map.
The following may be a better replacement:
fscanf(d, " %d%*c", &filesize);
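One way the commented steps in the question might be filled in is sketched below. This is an illustration under assumptions, not the asker's required solution: read_dict and LINE_CAP are made-up names, the function takes an already opened FILE *, and entries are assumed to fit in LINE_CAP bytes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LINE_CAP 256   /* assumed maximum length of one entry */

/* Read a count from the first line of fp, then up to that many lines.
   Returns a NULL-terminated array of heap strings, or NULL on error. */
char **read_dict(FILE *fp)
{
    char buffer[LINE_CAP];
    char **dict;
    int count, i;

    if (fgets(buffer, sizeof buffer, fp) == NULL)
        return NULL;
    if (sscanf(buffer, "%d", &count) != 1 || count < 0)
        return NULL;

    dict = malloc(((size_t)count + 1) * sizeof *dict);   /* +1 for the NULL */
    if (dict == NULL)
        return NULL;

    for (i = 0; i < count && fgets(buffer, sizeof buffer, fp) != NULL; i++) {
        buffer[strcspn(buffer, "\n")] = '\0';            /* strip the newline */
        dict[i] = malloc(strlen(buffer) + 1);
        if (dict[i] == NULL)
            break;
        strcpy(dict[i], buffer);
    }
    dict[i] = NULL;   /* terminator, also correct if the file ended early */
    return dict;
}
```

Reading the count with fgets followed by sscanf keeps the newline handling in one place, so the first entry is not accidentally read as an empty string.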
In my program I am getting a seg fault and I'm not sure the cause or how to find out the cause. Any help would be greatly appreciated!
In the code I am trying to read word by word, but I need to keep track of the line numbers. Then I am trying to create a linked list where the data is the word and line number.
(there are two files compiled together)
void main(int argc, char **argv){
file = fopen(argv[1],"r");
struct fileIndex *fIndex = NULL;
delimiters = " .,;:!-";/*strtok chars to seperate*/
int wCount = wordcount(file);/*number of words in file*/
char **str[wCount+1];/*where the lines are being stored*/
int j=0;
while(!feof(file)){/*inserting lines*/
fscanf(file, "%s", &str[j]);
j++;
}
char *token, *cp;
int i;
int len;
for(i = 0; str[i]; i++){/*checking to insert words*/
len = strlen(*str[i]);
cp = xerox(*str[i]);
token = strtok(cp, delimiters);
if(!present(fIndex, token)){
insert(fIndex, i+1,token);
}
while(token!=NULL){
token = strtok(NULL, delimiters);
if(!present(fIndex, token)){
insert(fIndex, i+1,token);
}
}
i++;
}
fclose(file);
}
int strcmpigncase(char *s1, char *s2){/*checks words*/
for(;*s1==*s2;s1++,s2++){
if(*s1=='\0')
return 0;
}
return tolower(*s2)-tolower(*s2);
}
present(struct fileIndex* fIndex, char *findIt){/*finds if word is in structure*/
struct fileIndex* current = fIndex;
while(current!=NULL){
current = current -> next;
if(strcmpigncase(current -> str, findIt)==0){
return current -> lineNum;
}
}
return 0;
}
void insert(struct fileIndex *head, int num, char *insert){/*inserts word into structure*/
struct fileIndex* node = malloc(sizeof(struct fileIndex));
node -> str = insert;
node -> lineNum = num;
node -> next = head;
head = node;
}
#define IN_WORD 1
#define OUT_WORD 0
int wordcount(FILE *input)/*number of words in file*/
{
FILE *open = input;
int cur; /* current character */
int lc=0; /* line count */
int state=OUT_WORD;
while ((cur=fgetc(open))!=EOF) {
if (cur=='\n')
lc++;
if (!isspace(cur) && state == OUT_WORD) {
state=IN_WORD;
}
else if (state==IN_WORD && isspace(cur)) {
state=OUT_WORD;
}
}
return lc;
}
char *xerox(char *s){
int i = strlen(s);
char *buffer = (char *)(malloc(i+1));
if(buffer == NULL)
return NULL;
char *t = buffer;
while(*s!='\0'){
*t=*s;
s++; t++;
}
*t = '\0';
return buffer;
}
This code has a fairly high rate of problems. I'll dissect just the first few lines to give an idea:
void main(int argc, char **argv){
main should return int, not void. Probably not causing your problem, but not right either.
file = fopen(argv[1],"r");
You really need to check the value of argc before trying to use argv[1]. Invoking the program without an argument may well lead to a problem. Depending on how you've invoked it, this could be the cause of your problem.
struct fileIndex *fIndex = NULL;
Unless you've included some headers you haven't shown, this shouldn't compile -- struct fileIndex doesn't seem to have been defined (nor does it seem to be defined anywhere I can see in the code you've posted).
delimiters = " .,;:!-";/*strtok chars to seperate*/
int wCount = wordcount(file);/*number of words in file*/
This (wordcount) reads to the end of the file, but does not rewind the file afterward.
char **str[wCount+1];/*where the lines are being stored*/
From your description, you don't really have any need to store lines (plural) at all. What you probably want is to read one line, then tokenize it and insert the individual tokens (along with the line number) into your index, then read the next line. From what you've said, however, there's no real reason to store more than one raw line at a time though.
int j=0;
while(!feof(file)){/*inserting lines*/
As noted above, you've previously read to the end of the file, and never rewound the file. Therefore, nothing inside this loop should ever execute, because as soon as you get here, feof(file) should return true. When/if you take care of that, this loop won't work correctly -- in fact, a loop of the form while (!feof(file)) is essentially always wrong. Under the circumstances, you want to check the result of your fscanf, with something like:
while (1 == fscanf(file, "%1023s", line))
...so you exit the loop when attempting to read fails.
fscanf(file, "%s", &str[j]);
What you have here is basically equivalent to the notorious gets -- you've done nothing to limit the input to the size of the buffer. As shown above, you normally want to use %[some_number]s, where some_number is one smaller than the size of the buffer you're using (though, of course, to do that you do need a buffer, which you don't have either).
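A minimal sketch of that pattern, with a real buffer and a matching field width, might look like this (count_words and WORD_BUF are illustrative names):

```c
#include <stdio.h>

#define WORD_BUF 1024

/* Count whitespace-separated words in fp. The field width in the format
   (1023) is one less than the buffer size, leaving room for the '\0'.
   A word longer than 1023 characters would be counted in pieces. */
size_t count_words(FILE *fp)
{
    char word[WORD_BUF];
    size_t n = 0;

    while (fscanf(fp, "%1023s", word) == 1)   /* loop ends when a read fails */
        n++;
    return n;
}
```

If the same stream has to be read again afterwards, it needs a rewind(fp) first, since this loop leaves it at end of file.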
You've also done nothing to limit the number of lines to the amount of space you've allocated (but, as with the individual lines, you haven't allocated any). I almost hesitate to mention this, however, because (as mentioned above) from your description you don't seem to have any reason to store more than one line anyway.
Your code also leaks all the memory it allocates -- you have calls to malloc, but not a single call to free anywhere.
Actually, some of the advice above is (at least more or less) wrong. It looks at how to fix individual lines of code, but in reality you probably want to structure the code a bit differently in general. Rather than read the file twice, once to count the words and again to index them, you probably want to read a line at a time (probably with fgets), then break the line into words, and count each word as you insert it into your index. Oh, and you almost certainly do not want to use a linked list for your index either. A tree or a hash table would make a great deal more sense for the job.
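A rough sketch of that line-at-a-time structure follows. It keeps a linked list only for brevity, skips the duplicate check, and assumes lines shorter than 1024 bytes. struct entry, push and build_index are hypothetical stand-ins for the question's struct fileIndex and insert. Note that push takes the address of the head pointer, otherwise the caller's list would never change.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct entry {              /* hypothetical stand-in for struct fileIndex */
    char *word;
    int line;
    struct entry *next;
};

/* Push a copy of word onto the list. Takes the address of the head
   pointer so that the caller's head is actually updated. */
static int push(struct entry **head, const char *word, int line)
{
    struct entry *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->word = malloc(strlen(word) + 1);
    if (node->word == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->word, word);
    node->line = line;
    node->next = *head;
    *head = node;
    return 0;
}

/* Read fp one line at a time, split each line into words, and record every
   word with its 1-based line number. Returns the list head (newest first). */
struct entry *build_index(FILE *fp)
{
    static const char delimiters[] = " .,;:!-\n";
    char line[1024];
    struct entry *head = NULL;
    int lineno = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        char *token;
        lineno++;
        for (token = strtok(line, delimiters); token != NULL;
             token = strtok(NULL, delimiters)) {
            if (push(&head, token, lineno) != 0)
                return head;    /* out of memory: return what we have */
        }
    }
    return head;
}
```

Only one raw line is held at a time, and each token is copied before the buffer is reused for the next line.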
I also disagree with the suggestion(s) in the direction of using a debugger on this code. A debugger is not likely to lead toward significantly better code -- it may help you find a few of the localized problems, but is unlikely to lead toward a significantly better program. Instead, I'd suggest a pencil and a piece of paper as the tools you really need to use. I believe your current problems stem primarily from not having thought about the problem enough to really understand what steps are needed to accomplish the goal, and a debugger isn't likely to help much in finding an answer to that question.
If you don't have a good debugger handy, a good fallback is to simply add a few printf statements at steps through the code, so you can see how far it gets before crashing.
In this code:
char **str[wCount+1];/*where the lines are being stored*/
int j=0;
while(!feof(file)){/*inserting lines*/
fscanf(file, "%s", &str[j]);
j++;
}
str is an array of pointers to char *s. In your loop you are reading each piece of input into a slot in it. There are a couple of problems.
I think there's a miscount in the number of *s vs. &s (I don't usually program with that many levels of pointer indirection to avoid having to think so hard about them ;-). &str[j] is the address of that array element, but that array element is a pointer to a pointer; now you have a pointer to a pointer to a pointer. If you had instead char *str[wCount+1], and read into str[j], I think it might match up. (Also I don't use fscanf much, so perhaps someone can confirm how best to use it.)
More obviously, you're not actually allocating any memory for the string data. You're only allocating it for the array itself. You probably want to allocate a fixed amount for each one (you can do that in the loop before each fscanf call). Remember that your fscanf could in practice read more than that fixed size, resulting in another memory error. Again, working around that requires an expert in fscanf usage.
Hope this helps for a start. If the printf suggestion finds a more specific point in the code where it fails, add that to the question.
What is wrong with strcpy() in this code?
void process_filedata(char *filename)
{
void* content;
const char * buffer;
char * temp;
char * row;
char * col;
int lsize,buflen,tmp,num_scan; //num_scan - number of characters scanned
int m=0,p=0,d=0,j=0; //m - machine, p - phase, d- delimiter, j - job
FILE *file_pointer = fopen("machinetimesnew.csv","r");
if(file_pointer == NULL)
{
error_flag = print_error("Error opening file");
if(error_flag) exit(1);
}
fseek(file_pointer, 0 ,SEEK_END);
lsize = ftell(file_pointer);
buflen = lsize;
rewind(file_pointer);
// content = (char*) malloc(sizeof(char)*lsize);
fread(content,1,lsize,file_pointer);
buffer = (const char*) content;
strcpy(temp,buffer);
row = strtok(temp,"\n");
...............
...............
I am getting a segmentation fault..
You're not allocating any space for temp. It's a wild pointer.
There are actually three segmentation faults here:
fread(content,1,lsize,file_pointer);
strcpy(temp,buffer);
row = strtok(temp,"\n");
The first one is fread() which is attempting to write to memory that does not yet exist as far as your process is concerned.
The second one is strcpy(), (expounding on the first) you are attempting to copy to a pointer that points to nothing. No memory (other than the pointer reference itself) has been allocated for temp, statically or dynamically.
Fix this via changing temp to look like this (allocating it statically):
char temp[1024];
Or use malloc() to dynamically allocate memory for it (as well as your other pointers, so they actually point to something), likewise for content. If you know the needed buffer size at compile time, use static allocation. If not, use malloc(). 'Knowing' is the subject of another question.
The third one is strtok(), which is going to modify temp in situ (in place), which it obviously cannot do, since temp was never allocated. In any event, don't expect temp to be the same once strtok() is done with it. By the name of the variable, I assume you know that.
Also, initializing a pointer is not the same thing as allocating memory for it:
char *temp = NULL; // temp is initialized
char *temp = (char *) malloc(size); // temp is allocated if malloc returns agreeably, cast return to not break c++
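Finally, be careful with unbounded copies such as strcpy(). strncpy() limits how much is written, but it does not null-terminate the destination when the source is at least as long as the limit, so you must add the terminator yourself.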
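When copying into a fixed-size buffer such as char temp[1024], it helps to have a copy that can never overrun the destination and always leaves it terminated. The sketch below is one way to do that. copy_bounded is a hypothetical helper, not a standard function.

```c
#include <string.h>

/* Copy src into dst (capacity dstsize bytes), always leaving dst terminated.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);

    if (dstsize == 0)
        return 0;                       /* nowhere to write even a terminator */
    if (len >= dstsize) {
        memcpy(dst, src, dstsize - 1);  /* keep as much as fits */
        dst[dstsize - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);          /* includes the '\0' */
    return 1;
}
```

The return value lets the caller decide whether a truncated copy is acceptable or should be treated as an error.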
Nothing's wrong with strcpy. You haven't initialised temp.
There's yet another mistake. fread does not add a nul character to the end of the buffer. That's because it only deals with arrays of bytes, not nul-terminated strings. So you need to do something like this (with content declared as char * rather than void *):
content = malloc(lsize + 1);
fread(content,1,lsize,file_pointer);
content[lsize] = 0;
temp = malloc(lsize + 1);
strcpy(temp, content);
or this:
content = malloc(lsize);
fread(content,1,lsize,file_pointer);
temp = malloc(lsize + 1);
memcpy(temp, content, lsize);
temp[lsize] = 0;
(Also, in real code you should check the results of fread and malloc.)
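With those checks filled in, a hedged sketch of the whole read might look like this. read_all is an illustrative name, and it assumes the stream is seekable.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read the whole of fp into a freshly allocated, NUL-terminated buffer.
   Returns NULL on any failure; the caller frees the result. */
char *read_all(FILE *fp)
{
    long size;
    char *content;
    size_t got;

    if (fseek(fp, 0, SEEK_END) != 0)
        return NULL;
    size = ftell(fp);
    if (size < 0)
        return NULL;
    rewind(fp);

    content = malloc((size_t)size + 1);     /* +1 for the terminator */
    if (content == NULL)
        return NULL;

    got = fread(content, 1, (size_t)size, fp);
    if (got != (size_t)size && ferror(fp)) {
        free(content);
        return NULL;
    }
    content[got] = '\0';                    /* fread does not add this */
    return content;
}
```

Terminating at the number of bytes actually read, rather than at the size reported by ftell, keeps the string valid even when a text-mode stream delivers fewer bytes than the file size. The returned buffer is writable, so it can be handed to strtok directly without a second copy.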
You didn't allocate memory for temp.
char * temp hasn't been initialized, and consequently no memory has been allocated for it.
try:
temp = (char *)malloc(SIZE);
where SIZE is however much memory you want to allocate for temp
This piece of code intrigues me:
if(file_pointer == NULL)
{
error_flag = print_error("Error opening file");
if(error_flag) exit(1);
}
Shouldn't you exit unconditionally if the file_pointer is NULL?