Segmentation fault with C program

The program reads input from a text file, splits it on spaces, and stores the words in an array. I'm using "<" in Linux to redirect the file to standard input. I need to create a dynamically allocated array to store the words.
If the input is "I am the cat in the hat",
then the array should look like:
index:
"I" "am" "the" "cat" "in" "the" "hat"
0 1 2 3 4 5 6
I am getting a segmentation fault and I can't figure out why. Any help would be appreciated.
code:
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
int main()
{
    int wordCount = 1, i;
    char ch,* wordsArray;
    while((ch = getc(stdin)) != EOF)
    {
        if(ch == ' ') {++wordCount;}
    }
    wordsArray =(char*)malloc(wordCount * sizeof(char));
    for(i = 0; i<wordCount; i++)
    {
        scanf("%s",&wordsArray[i]);
        printf("%s\n", wordsArray[i]);
    }
    return 0;
}
Edit: I have posted the actual code. I am no longer getting segmentation faults, but the words are not going into the array.
example output:
a.out<dict.txt
(null)
(null)
(null)
(null)
(null)
(null)
(null)

Briefly, you're asking scanf to scan strings into memory space that you've only allocated big enough for a few integers. Observe that your malloc call is given a number of ints to allocate and then you're assigning that to a char *. Those two things don't mix, and what's happening is you're scanning a bunch of bytes into your buffer that is too small, running off the end, and stomping on something else, then crashing. The compiler is letting you get away with this because both malloc and scanf are designed to work with various types (void*), so the compiler doesn't know to enforce the type consistency in this case.
If your goal is to save the word tokens from the input string into an array of words, you can do that but you need to allocate something quite different up front and manage it a bit differently.
But if your goal is simply to produce the output you want, there are actually simpler ways of doing it, like with using strtok or just looping through the input char by char and tracking where you hit spaces, etc.
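As a minimal sketch of the "allocate something quite different" route, the words can be kept in an array of char pointers, with each word in its own allocation. The names read_words and free_words, the 255-character word limit, and the doubling growth strategy are assumptions for illustration, not part of the original code. Reading in a single pass also sidesteps a likely cause of the empty output in the question: once the counting loop has consumed standard input up to EOF, nothing is left for scanf to read.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads whitespace-separated words from `in` into a
 * heap-allocated array of heap-allocated strings. Stores the number of words
 * in *count and returns the array. Returns NULL with *count == 0 on
 * allocation failure or when the input contains no words.
 * Assumption: words longer than 255 characters are split into pieces. */
char **read_words(FILE *in, size_t *count)
{
    char **words = NULL;
    size_t used = 0, capacity = 0;
    char buf[256];

    while (fscanf(in, "%255s", buf) == 1) {
        if (used == capacity) {
            /* Grow the pointer array geometrically. */
            size_t new_cap = capacity ? capacity * 2 : 8;
            char **tmp = realloc(words, new_cap * sizeof *tmp);
            if (tmp == NULL)
                goto fail;
            words = tmp;
            capacity = new_cap;
        }
        /* Each word gets its own buffer, including room for '\0'. */
        words[used] = malloc(strlen(buf) + 1);
        if (words[used] == NULL)
            goto fail;
        strcpy(words[used], buf);
        used++;
    }
    *count = used;
    return words;

fail:
    for (size_t i = 0; i < used; i++)
        free(words[i]);
    free(words);
    *count = 0;
    return NULL;
}

/* Releases everything read_words allocated. */
void free_words(char **words, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(words[i]);
    free(words);
}
```

With the file redirected via "<", a caller would pass stdin as the first argument and then print words[0] through words[count - 1] with "%s".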

What you're trying to accomplish is parsing standard input. Some functions are available to assist you, such as strtok() and strsep().
You will need an input buffer, which you can allocate with malloc() or declare as an array (I would use an array). As the functions parse the input string, you can store each word's address in an array of char pointers. You could also build a linked list.
Finally, you must buffer input, which adds some delicious complexity to the operation. Still, it's nothing a good C programmer can't handle. Have fun.
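The strtok() approach described above might look roughly like this. The helper name split_words and the MAX_WORDS limit are assumptions made for the sketch. Note that strtok() modifies the buffer in place, so the stored pointers are only valid while the buffer itself is.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORDS 64   /* assumed upper limit for this sketch */

/* Splits `line` in place on spaces, tabs and newlines, storing a pointer to
 * the start of each word in `words`. Returns the number of words stored,
 * which is at most `max`. */
size_t split_words(char *line, char *words[], size_t max)
{
    size_t n = 0;
    const char *delims = " \t\n";

    for (char *tok = strtok(line, delims);
         tok != NULL && n < max;
         tok = strtok(NULL, delims)) {
        words[n++] = tok;   /* points into `line`, no copy is made */
    }
    return n;
}
```

One way to use it is to fill a char array with fgets(line, sizeof line, stdin) and pass that array to split_words; each further line needs either its own buffer or the words copied out before the buffer is reused.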

Related

Why do I get additional weird characters in this C code with dynamically-allocated char arrays?

I'm quite new to dynamic memory allocation in general.
I've been looking for an error in this code for about 6 hours in the last 3 days now, it's driving me crazy, that's why I've decided to ask for help here.
Here's the code:
char ch;
char* line=(char*)calloc(1, sizeof(char));
if(input!=NULL) {
    for(int num=1; (ch=fgetc(input)) != EOF; num++) //input is the pointer to the in file
        if(ch!=' ') {
            line=(char*)realloc(line, sizeof(char)*num+1);
            strcat(line, &ch);
        }
        else
            break;
}
I'm trying to read from a file the first of two whitespace-separated words, where the total size is not predetermined (I'll need this to read even more from the file so it's important, this was "just to try").
This is for a single line, not multiple lines (char** I think would be used in that case), and the idea was to allocate the first character of the line and set it to zero, then reallocate the memory incrementing its size by one character.
If I "num++", it crashes; if I don't, its output will be, instead of "Nole", this: N☺o☺l☺e☺ (output is after the loop; how does it even increase if num remains the same?). I checked the ASCII codes and this is what I get: 78 1 111 1 108 1 101 1; there is a '1' after every character, which is THE SAME value as "num" (in fact, if num==2, then I get '2's instead of '1's). I've tried it with different compilers and different machines but I always get the same result and I cannot explain why.
I'm really going crazy, also because I'm gonna have an exam in about two weeks and this is basically the only thing I haven't learned yet among all the required topics.
Thank you so much in advance 😿
EOF is an int so you must use int ch;
As mentioned in comments, you pass a single ch to strcat and not a null terminated string, so it will go haywire. Quick fix: strcat(line, (char[2]){ch,'\0'});.
Or if you add a counter, you could just do line[count] = ch; which is much more efficient. Though in that case you'll have to remember to append the null terminator manually in the end.
Also, sizeof(char) is always 1 by the very definition of sizeof, so it's just a needlessly bloated way of writing 1.
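Putting those points together, a sketch of the counter-based version could look like the following. The function name read_first_word is hypothetical, and growing by one byte per character mirrors the question rather than being an efficient strategy.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads characters from `input` up to the first space
 * or EOF and returns them as a heap-allocated, null-terminated string.
 * Returns NULL on allocation failure. The caller frees the result. */
char *read_first_word(FILE *input)
{
    size_t len = 0;
    char *line = malloc(1);   /* room for the terminator only */
    int ch;                   /* int, so EOF stays distinguishable */

    if (line == NULL)
        return NULL;

    while ((ch = fgetc(input)) != EOF && ch != ' ') {
        /* Need space for the new character plus the terminator. */
        char *tmp = realloc(line, len + 2);
        if (tmp == NULL) {
            free(line);
            return NULL;
        }
        line = tmp;
        line[len++] = (char)ch;
    }
    line[len] = '\0';         /* terminate manually, as noted above */
    return line;
}
```

Assigning the result of realloc to a temporary first avoids leaking the old block if the call fails.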

What is the proper way to populate an array of Strings in C, such that each string is a single element in the array

I'm trying to initialize a 2D array of strings in C; which does not seem to work like any other language I've coded in. What I'm TRYING to do, is read input, and take all of the comments out of that input and store them in a 2d array, where each string would be a new row in the array. When I get to a character that is next line, I want to advance the first index of the array so that I can separate each "comment string". ie.
char arr[100][100];
<whatever condition>
arr[i][j] = "//First comment";
Then when I get to a '\n' I want to increment the first index such that:
arr[i+1][j] = "//Second comment";
I just want to be able to access each input as an individual element in my array. In Java I wouldn't need to do this, as each string would already be an individual element in a String array. I've only been working with c for 3 weeks now, and things that I used to take for granted as being simple, have proven to be quite frustrating in C.
My actual code is below. It gives me an infinite loop and prints out a ton of numbers:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const int MAXLENGTH = 500;
int main(){
    char comment[MAXLENGTH][MAXLENGTH];
    char activeChar;
    int cIndex = 0;
    int startComment = 0;
    int next = 0;
    while((activeChar = getchar()) != EOF){
        if(activeChar == '/'){
            startComment = 1;
        }
        if(activeChar == '\n'){
            comment[next][cIndex] = '\0';
            next++;
        }
        if(startComment){
            comment[next][cIndex] = activeChar;
            cIndex++;
        }
    }
    for(int x = 0 ; x < MAXLENGTH; x++){
        for (int j = 0; j < MAXLENGTH; j++){
            if(comment[x][j] != 0)
                printf("%s", comment[x][j]);
        }
    }
    return 0;
}
The problem you are having is that C was designed to be essentially a glorified assembler. That means the only things it has built in are those for which there is an obvious correct way to do them. Strings do not meet this criterion, so strings are not first-class citizens in C.
In particular there are at least three viable ways to deal with strings, C doesn't force you to use any of them, but instead allows you to use what you want for the job at hand.
Method 1: Static Array
This method appears to be similar to what you are trying to do, and is often used by new C programmers. In this method a string is just an array of characters exactly long enough to fit its contents. Assigning arrays becomes difficult, so this promotes using strings as immutables. It feels likely that this is how most JVMs would implement strings. C code: char my_string[] = "Hello";
Method 2: Static Bounded Array
This method is what you are doing. You decide that your strings must be shorter than a specified length, and pre-allocate a large enough buffer for them. In this case it is relatively easy to assign strings and change them, but they must not become longer than the set length. C code: char my_string[MAX_STRING_LENGTH] = "Hello";
Method 3: Dynamic Array
This is the most advanced and risky method. Here you dynamically allocate your strings so that they always fit their content. If they grow too big, you resize. They can be implemented many ways (usually as a single char pointer that is realloc'd as necessary in combination with method 2, occasionally as a linked list).
Regardless of how you implement strings, to C's eyes they are all just arrays of characters. The only caveat is that to use the standard library you need to null terminate your strings manually (although many [all?] of them specify ways to get around this by manually specifying the length).
This is why java strings are not primitive types, but rather objects of type String.
Interestingly enough, many languages actually use different String types for these solutions. For example Ada has String, Bounded_String, and Unbounded_String for the three methods above.
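To make the three methods concrete, here is a rough sketch. The helper names set_bounded and append_dynamic, and the value chosen for MAX_STRING_LENGTH, are assumptions for illustration only. Method 1 needs no helper: char my_string[] = "Hello"; creates an array of exactly six chars (five letters plus the terminator).

```c
#include <stdlib.h>
#include <string.h>

#define MAX_STRING_LENGTH 100   /* assumed bound for method 2 */

/* Method 2: copy into a bounded buffer of MAX_STRING_LENGTH chars,
 * truncating if `src` is too long and always null-terminating. */
void set_bounded(char dst[MAX_STRING_LENGTH], const char *src)
{
    strncpy(dst, src, MAX_STRING_LENGTH - 1);
    dst[MAX_STRING_LENGTH - 1] = '\0';
}

/* Method 3: append `suffix` to a heap-allocated string (or to NULL, which
 * is treated as an empty string), resizing so the result always fits.
 * Returns the possibly moved string, or NULL on failure, in which case
 * the original string is left untouched. */
char *append_dynamic(char *s, const char *suffix)
{
    size_t old_len = s ? strlen(s) : 0;
    char *tmp = realloc(s, old_len + strlen(suffix) + 1);

    if (tmp == NULL)
        return NULL;
    strcpy(tmp + old_len, suffix);
    return tmp;
}
```

The bounded version trades wasted space for simplicity; the dynamic version never wastes space but makes the caller responsible for calling free().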
Solution
Look at your code again: char arr[100][100]; which method is this, and what is it?
Obviously it is method 2 with a MAX_STRING_LENGTH of 100. So you could pretend the line says my_strings arr[100], which makes your issue apparent: this is not a 2D array of strings, but a 2D array of characters that represents a 1D array of strings. To create a 2D array of strings in C you would use char arr[WIDTH][HEIGHT][MAX_STRING_LENGTH], which is easy to get wrong. As above, however, you have some logic errors in your code, and you can probably solve this problem with just a 1D array of strings (a 2D array of chars).
comment is a 2D array of chars, which are single characters. In C, a string is simply an array of characters, so your definition of comment is one way to define a 1D array of strings.
As far as the loading goes, the only obvious potential problem is that you don't ever reset startComment to zero (but you should use a debugger to make sure it's being loaded correctly), however your code to print it out is wrong.
Using printf() with a %s tells it to start printing the string at whatever address you give it, but you're giving it individual characters, not whole strings, so it's interpreting each character in each string (because C is a horrible, horrible language) as an address in RAM and trying to print that RAM. To print an individual character, use %c instead of %s. Or, just make a 1D for loop:
for (int x = 0; x < MAXLENGTH; x++)
    printf("%s\n", comment[x]);
It's also a bit confusing that you use the same MAXLENGTH for the number of lines in the array and the length of the string in each line.
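For reference, a sketch of the loading step with these issues addressed might look like this. It rests on several assumptions that are not in the original code: a comment starts at "//" rather than at a single '/', it runs to the end of the line, and the row and column limits (MAX_LINES, MAX_CHARS) are separate constants. The function name collect_comments is hypothetical.

```c
#include <stdio.h>

#define MAX_LINES 100   /* assumed maximum number of comments */
#define MAX_CHARS 500   /* assumed maximum length of one comment */

/* Copies each "//" comment found in `in` into its own row of `comment`.
 * Returns the number of rows filled. Over-long comments are truncated. */
size_t collect_comments(FILE *in, char comment[][MAX_CHARS], size_t max_lines)
{
    size_t next = 0, col = 0;
    int in_comment = 0, prev = 0, ch;   /* int ch so EOF is detectable */

    while (next < max_lines && (ch = getc(in)) != EOF) {
        if (in_comment) {
            if (ch == '\n') {
                /* End of comment: terminate the row and reset the state. */
                comment[next][col] = '\0';
                next++;
                col = 0;
                in_comment = 0;
            } else if (col < MAX_CHARS - 1) {
                comment[next][col++] = (char)ch;
            }
        } else if (ch == '/' && prev == '/') {
            /* Second slash seen: start a new row with the "//" itself. */
            comment[next][0] = '/';
            comment[next][1] = '/';
            col = 2;
            in_comment = 1;
        }
        prev = ch;
    }
    if (in_comment) {
        /* Input ended without a trailing newline. */
        comment[next][col] = '\0';
        next++;
    }
    return next;
}
```

Printing then only needs to walk the rows that were actually filled, from 0 up to the returned count, using "%s\n" on each comment[x].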

Pointer mystery/noobish issue

I am originally a Java programmer who is now struggling with C and specifically C's pointers.
The idea on my mind is to receive a string, from the user, on a command line, into a character pointer. I then want to access its individual elements. The idea is later to devise a function that will reverse the elements' order. (I want to work with anagrams in texts.)
My code is
#include <stdio.h>
char *string;
int main(void)
{
    printf("Enter a string: ");
    scanf("%s\n",string);
    putchar(*string);
    int i;
    for (i=0; i<3;i++)
    {
        string--;
    }
    putchar(*string);
}
(Sorry, Code marking doesn't work).
What I am trying to do is to have a first shot at accessing individual elements. If the string is "Santillana" and the pointer is set at the very beginning (after scanf()), the content *string ought to be an S. If unbeknownst to me the pointer should happen to be set at the '\0' after scanf(), backing up a few steps (string-- repeated) ought to produce something in the way of a character with *string. Both these putchar()'s, though, produce a Segmentation fault.
I am doing something fundamentally wrong and something fundamental has escaped me. I would be eternally grateful for any advice about my shortcomings, most of all of any tips of books/resources where these particular problems are illuminated. Two thick C books and the reference manual have proved useless as far as this.
You haven't allocated space for the string. You'll need something like:
char string[1024];
You also should not be decrementing the variable string. If it is an array, you can't do that.
You could simply do:
putchar(string[i]);
Or you can use a pointer (to the proposed array):
char *str = string;
for (i = 0; i < 3; i++)
str++;
putchar(*str);
But you could shorten that loop to:
str += 3;
or simply write:
putchar(*(str+3));
Etc.
You should check that scanf() is successful. You should limit the size of the input string to avoid buffer (stack) overflows:
if (scanf("%1023s", string) != 1)
...something went wrong — probably EOF without any data...
Note that %s skips leading white space, and then reads characters up to the next white space (a simple definition of 'word'). Adding the newline to the format string makes little difference. You could consider "%1023[^\n]\n" instead; that looks for up to 1023 non-newlines followed by a newline.
You should start off avoiding global variables. Sometimes, they're necessary, but not in this example.
On a side note, using scanf(3) is bad practice. You may want to look into fgets(3) or similar functions that avoid common pitfalls that are associated with scanf(3).
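A sketch of the fgets() route, including the string reversal the question mentions as the eventual goal, could look like this. The helper names read_line and reverse_in_place are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from `in` into `buf` (which holds `size` bytes) and strips
 * the trailing newline, if any. Returns 1 on success, 0 on EOF or error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL)
        return 0;
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}

/* Reverses the characters of a null-terminated string in place. */
void reverse_in_place(char *s)
{
    size_t len = strlen(s);

    for (size_t i = 0; i < len / 2; i++) {
        char tmp = s[i];
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = tmp;
    }
}
```

With a local char string[1024]; in main, calling read_line(stdin, string, sizeof string) fills the array, after which string[0] is the first character and no pointer arithmetic outside the array is needed.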

how to write out a tokenized string

I am quite new to C and any examples I have found of my problem didn't seem to work, or I completely misunderstood what that solution was. I have a large file with data that looks like:
LYS 24L HB2 45.212 39.585 124.457 SC0 0.145 -0.795 0.585 0.157
on each line. I have tokenized the data already using strtok. What I need is from the second field, I wish to have the 24 stored as an integer for comparison, and the L to be stored as a char for comparison as well.
I tried using
sscanf(token[1], "%d%s", number, letter);
but I keep getting a segmentation fault. While experimenting further with sscanf, I tried simply printing "LYS" (to better understand my problem), but my program would only print L using the following code:
sscanf(token[0], "%c", &stemp);
letter = stemp;
printf("%c \n", letter);
However, if I change %c to %s (hoping to obtain the whole string), I get the segmentation fault again. Is there something I do not understand about sscanf? Why can I not read in a full string? Thank you in advance for your time and help!
Paul
I suspect the issue is actually that number and letter are of type int and char, respectively. scanf() needs addresses of memory locations in which to store the values, not the variables themselves; i.e.,
int number;
char letter[2];
sscanf(token[1], "%d%s", &number, letter);
I've made letter into an array of two characters, and am passing the address of the array; that matches the %s scan conversion that you've used.
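Since the question asks for the letter as a single char, an alternative sketch uses %c with the address of a char instead of %s with an array. The function name parse_residue is hypothetical; checking the return value of sscanf() shows whether both conversions succeeded.

```c
#include <stdio.h>

/* Hypothetical helper: parses a field such as "24L" into an integer and a
 * single letter. Returns 1 if both parts were read, 0 otherwise. */
int parse_residue(const char *field, int *number, char *letter)
{
    /* Both arguments are already pointers, so sscanf has valid addresses
     * to write through. */
    return sscanf(field, "%d%c", number, letter) == 2;
}
```

A caller would declare int number; char letter; and pass &number and &letter, then compare them directly, for example number == 24 && letter == 'L'.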

how to form an array of numbers , taken input from a file in C

The program should be able to make an array of numbers from a text file which reads like this
The data is given as this
123 2132 1100909 3213 89890
my code for it is
char a;
char d[100];
char array[100];
a=fgetc(fp) // where fp is a file pointer
if (a=='')
{
    d[count1]='/0';
    strcpy(&array[count],d);
    count=count+1;
    memset(d,'\0',100)
    count1=0;
}
else
{
    d[count1]=a;
    count1=count1+1;
}
a=fgetc(fp);
I am getting a segmentation fault now. I want to store each number in the array so that I can sort it.
Your (first) problem is here:
d[count1]='/0';
strcpy(&array[count],d);
You have written '/0', which isn't what you think it is. Assuming you meant '\0' (a null char literal), then you appear to be trying to manually terminate the string d before calling strcpy(). The problem is that what actually gets written to d is not a null byte, and so d is not null-terminated, and then strcpy() goes off and starts reading random memory after it, and copying that memory into array, until either the reading or the writing ends up outside of memory you're allowed to access, and you get a segmentation fault.
You also have some confusion about what array is. It's declared as an array of 100 chars, but you're treating it like an array of strings. Perhaps you meant to declare it as char *array[100]?
Hmm... as a first approximation, to read a single number, consider using fscanf(fp, "%d", &number);. To store the numbers you read, you'll probably want to create an array of numbers (e.g., int numbers[100];). To read more than one number, use a loop to read the numbers into the array.
Sidenote: fscanf isn't particularly forgiving of errors in the input (among other things) so for production code, you probably want to read a string, and parse numbers out of that, but for now, it looks like you probably just need to get something that works for correct input, not worry about handling incorrect input gracefully.
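A minimal sketch of that loop, followed by the sorting step the question asks about, might look like this. The function names read_numbers, compare_ints and sort_numbers are assumptions; qsort() is the standard library sort, and as noted above the sketch only handles well-formed input.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads up to `max` whitespace-separated integers from `fp` into `numbers`.
 * Stops at EOF or at the first token that is not a number.
 * Returns how many integers were stored. */
size_t read_numbers(FILE *fp, int numbers[], size_t max)
{
    size_t n = 0;

    while (n < max && fscanf(fp, "%d", &numbers[n]) == 1)
        n++;
    return n;
}

/* Comparison callback for qsort: ascending order, without the overflow
 * risk of returning x - y. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}

/* Sorts the first `n` entries of `numbers` in ascending order. */
void sort_numbers(int numbers[], size_t n)
{
    qsort(numbers, n, sizeof numbers[0], compare_ints);
}
```

Storing the values as int rather than as strings means they can be compared and sorted numerically without any further conversion.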
Is this actually how the code is written?
In d[count1]='/0'; I think you mean d[count1]='\0'; (already mentioned by Daniel Pryden).
There is also a semicolon missing at the end of memset(d,'\0',100)

Resources