sscanf cycle segmentation fault - c

I haven't programmed for a couple of years and I have a question about sscanf:
I want to split a string into several strings using sscanf, but calling sscanf in a loop gives me a segmentation fault. Why is that? How can I use sscanf in a loop without this happening?
Example:
int main() {
char str[100];
char mat[100][100]; int i = 0;
strcpy(str, "higuys\nilovestackoverflow\n2234\nhaha");
while (sscanf(str, "%s", mat[i]) == 1) i++;
}

int sscanf(const char *str, const char *format, ...);
while(sscanf(str,"%s", mat[i]) == 1) i++;
Since str is constant in the prototype it cannot be changed by sscanf (unless sscanf is very broken :)), so it successfully repeats over and over, returning 1 all the time...
So i increases, and at some point you're hitting a memory boundary and the system stops your harmful program.
If you want to read a multi-line string, use a loop with strtok for instance, that will go through your string and yield lines.
Note: my previous answer assumed that the earlier version of the question had a typo, an extra ; in the middle:
while(sscanf(str,"%s", mat[i]) == 1); i++;
There, sscanf always succeeds since str is the input and doesn't change (unlike when you're reading from a file using fscanf or fgets).
So it was just an infinite loop in that case.

sscanf stops at \n, stores the word higuys into the array mat[i] and returns 1. The loop condition is true, i gets incremented and the process goes on for the next element of mat as the destination with the same source string... every element of mat receives the same higuys string, and the loop continues, causing a buffer overflow, invoking undefined behavior and ultimately crashing.
Here is how to modify your code to make it work:
#include <stdio.h>
int main(void) {
const char *str = "higuys\nilovestackoverflow\n2234\nhaha";
char mat[100][100];
int i = 0, n = 0;
/* parse the multiline string */
while (i < 100 && sscanf(str, "%99s%n", mat[i], &n) == 1) { /* bound both the row index and the word length */
str += n;
i++;
}
/* output the array */
for (int j = 0; j < i; j++) {
printf("mat[%d] = %s\n", j, mat[j]);
}
return 0;
}

Related

How to return empty string from a function in C?

How should I return an empty string from a function? I tried using lcp[i] = ' ' but it creates an error. Then I used lcp[i] = 0 and it returned an empty string. However, I do not know if it's right.
Also, is it necessary to use free(lcp) in the caller function? Since I could not free and return at the same time.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_LEN 50
char *find_LCP(char str1[], char str2[]);
char *find_LCP(char str1[], char str2[]){
char * lcp = malloc(MAX_LEN * sizeof(char));
int a = strlen(str1);
int b = strlen(str2);
int min = a < b ? a : b;
for(int i = 0; i < min; i++){
if(str1[i] == str2[i])
lcp[i] = str1[i];
else
lcp[i] = 0;
}
return lcp;
}
int main()
{
char str1[MAX_LEN], str2[MAX_LEN];
char * lcp;
printf("Enter first word > ");
scanf("%s", str1);
printf("Enter second word > ");
scanf("%s", str2);
lcp = find_LCP(str1, str2);
printf("\nLongest common prefix: '%s'\n", lcp);
free(lcp);
return 0;
}
An "empty" string is just a string with the first byte zero, so you can write:
s[0] = 0;
However, it is not clear what you are trying to do. The LCP of "foo" and "fob" is "fo", not the empty string.
You can also return as soon as you find the first non-matching character, no need to go until the end.
Further, you can simply pass the output string as a parameter and have lcp be an array. That way you avoid both malloc and free:
char lcp[MAX_LEN];
...
find_LCP(lcp, str1, str2);
If you want to empty a string without a for loop, you can write:
lcp[0] = 0;
Setting characters to zero inside a for loop, as you did, also works.
It is more common to write the terminator as a character constant:
lcp[i] = '\0';
Both forms store the same value.
To zero a whole buffer without writing the loop yourself, you can use memset:
memset(buffer,0,strlen(buffer));
but this only zeroes the bytes up to the first NUL character.
If the buffer is an array (not a pointer), you can zero all of it with:
memset(buffer,0,sizeof(buffer));
Your program has a bug: if the two strings match over their whole common length (for example, two identical strings), lcp[i] = 0; never executes, so your function returns a string that is not NUL-terminated. This causes undefined behavior when you pass that string to printf in main.
The fix is easy: stop at the first mismatch and NUL-terminate the string after the loop:
int i;
for (i = 0; i < min; i++){
if(str1[i] == str2[i])
lcp[i] = str1[i];
else
break;
}
lcp[i] = 0;
As for the answer to the question, an empty string is one which has the NUL-terminator right at the start. We've already handled that as we've NUL-terminated the string outside the loop.
Also, is it necessary to use free(lcp) in the caller function?
In this case, it is not required as the allocated memory will get freed when the program exits, but I'd recommend keeping it because it is good practice.
As the comments say, you can use calloc instead of malloc which fills the allocated memory with zeros so you don't have to worry about NUL-terminating.
In the spirit of code golf: there is no need to calculate string lengths. Pick either string and iterate through it until the current character is either the null terminator or differs from the corresponding character in the other string. Store the index, then copy that number of bytes.
char *getlcp(const char *s1, const char *s2) {
int i = 0;
while (s1[i] == s2[i] && s1[i] != '\0') ++i;
char *lcp = calloc((i + 1), sizeof(*lcp));
memcpy(lcp, s1, i);
return lcp;
}
P.S. If you don't care about preserving one of the input strings, you can simplify the code even further: return just the index (the length of the common prefix, i.e. the position one past its last character) from the function, then put '\0' at that index in one of the strings.

fscanf in C with a text file with no spaces

I have a text file with names that looks as follows:
"MARY","PATRICIA","LINDA","BARBARA","ELIZABETH","JENNIFER","MARIA","SUSAN","MARGARET",
I have used the following code to attempt to put the names into an array:
char * names[9];
int i = 0;
FILE * fp = fopen("names.txt", "r");
for (i=0; i < 9; i++) {
fscanf(fp, "\"%s\",", names[i]);
}
The program comes up with a segmentation fault when I try to run it. I have debugged carefully, and I notice that the fault comes when I try and read in the second name.
Does anybody know why my code isn't working, and also why the segmentation fault is happening?
You have undefined behavior in your code, because you don't allocate memory for the pointers you write to in the fscanf call.
You have an array of nine uninitialized pointers, and as they are part of a local variable they have an indeterminate value, i.e. they will point to seemingly random locations. Writing to random locations in memory (which is what will happen when you call fscanf) will do bad things.
The simplest way to solve the problem is to use an array of arrays, like e.g.
char names[9][20];
This gives you an array of nine arrays, each sub-array being 20 characters (which allows you to have names up to 19 characters long).
To avoid writing out of bounds, you should also modify your call so that you don't read too many characters:
fscanf(fp, "\"%19s\",", names[i]);
There is, however, another problem with your use of fscanf: the format to read a string, "%s", reads until it finds whitespace in the input (or until the limit is reached, if a field width is provided). It does not stop at the closing quote or the comma.
In short: you can't use "%s" with fscanf to read your input.
Instead I suggest you read the whole line into memory at once, using fgets, and then split the string on the comma using e.g. strtok.
One way of handling arbitrarily long lines as input from a file (pseudoish-code):
#define SIZE 256
size_t current_size = SIZE;
char *buffer = malloc(current_size);
buffer[0] = '\0'; // Terminator at first character, makes the string empty
for (;;)
{
    // Read into temporary buffer; stop at end of file or on a read error
    char temp[SIZE];
    if (fgets(temp, sizeof(temp), file_pointer) == NULL)
        break;
    // Append to actual buffer
    strcat(buffer, temp);
    // If the last character is a newline (which `fgets` stores when it
    // reaches the end of the line) then the whole line has been read
    // and we are done
    size_t len = strlen(buffer);
    if (len > 0 && buffer[len - 1] == '\n')
        break;
    // Still more data to read from the line
    // Allocate a larger buffer
    current_size += SIZE;
    buffer = realloc(buffer, current_size);
    // Continues the loop to try and read the next part of the line
}
// After the loop the pointer `buffer` points to memory containing the whole line
[Note: The above code snippet doesn't handle allocation failures.]
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char *names[9], buff[32];
int i = 0;
FILE *fp = fopen("names.txt", "r");
for(i = 0; i < 9; i++) {
if(1==fscanf(fp, "\"%31[^\"]\",", buff)){// "%s" would not stop at the closing quote; the scanset %31[^"] does
size_t len = strlen(buff) + 1;
names[i] = malloc(len);// allocate the space needed for each string
memcpy(names[i], buff, len);
}
}
fclose(fp);
//check print & deallocate
for(i = 0; i< 9; ++i){
puts(names[i]);
free(names[i]);
}
return 0;
}
Try this:
for (i=0; i < 9; i++)
{
    names[i]=malloc(15);// the size must fit the longest name plus the terminator
    fscanf(fp, "\"%14[^\"]\",", names[i]);// "%s" would not stop at the closing quote
}

passing tokens from array to strcmp

What I am trying to do is break the user input into parts with whitespace as the delimiter, copy the parts into an array (tokenAr), and check whether tokenAr[0] (the first part) is equal to sHistory. If they are equal, check whether tokenAr[1] is "1", "2", etc., and execute the corresponding command stored in the history array. This is what I have tried so far, and it crashes. I am using TCC on Windows x64.
EDIT: I forgot to mention that I began learning C just two days ago.
EDIT2: I ran the program in a debugger and it raised an Access Violation (Segmentation Fault) on the line if(strcmp(tokenArPtr[0],sHistory)==0)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char *argv[])
{
int i=1; int j=1; int k=0;
char history[100][100] = {0};
char sKey[] = "exit";
char sInput[100];
char sHistory[]="history";
do
{
//gather user input
printf ("hshell> ");
fgets (sInput, 100, stdin);
strcpy(history[i],sInput);
i++;
//END_gather user input
//Tokenizing
char delims[] = " ";
char *tokenArPtr[5];
char *result = NULL;
result = strtok(sInput, delims);
tokenArPtr[0] = result;
while (result!=NULL)
{
puts(result);
result= strtok(NULL, delims);
tokenArPtr[k+1] = result;
puts(tokenArPtr[k]);
puts("=====");
k++;
}
k=0;
/*
//END_Tokenizing
if(strcmp(tokenArPtr[0],sHistory)==0)
{
for(j=1;j<i;j++)
{
printf("%d. %s \n",j,history[j]);
}
}
else if (strcmp (sKey,tokenArPtr[0]) != 0)
{
printf("\nCommand not found \n");
}*/
}while (strcmp (sKey,sInput) != 0);
return 0;
}
EDIT 3: I used the result variable instead of the tokenArPtr directly, but when debugging, I noticed that the values of the array are not being updated.
Which type does strtok return? char *. What is the type of tokenAr[k]? char. What type does strcmp expect as input? char * and char *. What is the type of tokenAr[0]? char.
See a problem? You should. The * is pretty significant.
Assuming tokenAr is declared like char *tokenAr[2];, how many char * values can tokenAr store? What happens when k exceeds 2? You need to ensure you don't overflow your tokenAr array.
history is uninitialised. Using an uninitialised variable is undefined behaviour. I suggest initialising it, like this: char history[100][100] = { 0 };
Which book are you reading?
While tokenizing, the loop will never end because the test is on the variable "result", which never changes... So you eventually overflow the "tokenAr" buffer. Modify your code to test "tokenAr".
Edit: And tokenAr should be an array... (I don't know how it can compile...)
There are many problems... First of all you should include string.h which will show you some errors in compilation.
I believe that the main problem is here:
char tokenAr[2];
result = strtok(sInput, delims);
while (result!=NULL)
{
tokenAr[k] = strtok(NULL, delims);
k++;
}
tokenAr should be an array of pointers, not chars. And are you sure that k will never exceed 2? An assertion would help debugging.

Why am I getting a segmentation fault?

I'm trying to write a program that takes a plaintext file as its argument, parses it, adds all the numbers together, and then prints out the sum. The following is my code:
#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
static int sumNumbers(char filename[])
{
int sum = 0;
FILE *file = fopen(filename, "r");
char *str;
while (fgets(str, sizeof BUFSIZ, file))
{
while (*str != '\0')
{
if (isdigit(*str))
{
sum += atoi(str);
str++;
while (isdigit(*str))
str++;
continue;
}
str++;
}
}
fclose(file);
return sum;
}
int main(int argc, char *argv[])
{
if (argc != 2)
{
fprintf(stderr, "Please enter the filename as the argument.\n");
exit(EXIT_FAILURE);
}
else
{
printf("The sum of all the numbers in the file is : %d\n", sumNumbers(argv[1]));
exit(EXIT_SUCCESS);
}
return 0;
}
And the text file I'm using is:
This a rather boring text file with
some random numbers scattered
throughout it.
Here is one: 87 and here is another: 3
and finally two last numbers: 12
19381. Done. Phew.
When I compile and try to run it, I get a segmentation fault.
You've not allocated space for the buffer. The pointer str is uninitialized, so it points to an arbitrary location. Your program therefore writes the data read from the file into memory you don't own, leading to the segmentation fault.
You need:
char *str;
str = malloc(BUFSIZ); // this is missing..also free() the mem once done using it.
or just:
char str[BUFSIZ]; // but then you can't do str++, you'll have to use another
// pointer say char *ptr = str; and use it in place of str.
EDIT:
There is another bug in:
while (fgets(str, sizeof BUFSIZ, file))
The 2nd argument should be BUFSIZ not sizeof BUFSIZ.
Why?
Because the 2nd argument is the maximum number of characters to be read into the buffer, including the null character. Since sizeof BUFSIZ is the size of an int (4 on your platform), fgets can read at most 3 characters at a time into the buffer. That is the reason why 19381 was being read as 193 and then 81<space>.
You haven't allocated any memory to populate str. fgets takes as its first argument a buffer, not an unassigned pointer.
Instead of char *str; you need to define a reasonably sized buffer, say, char str[BUFSIZ];
Because you've not allocated space for your buffer.
A number of people have already addressed the problem you asked about, but I've got a question in return. What exactly do you think this accomplishes:
if (isdigit(*str))
{
if (isdigit(*str))
{
sum += atoi(str);
str++;
while (isdigit(*str))
str++;
continue;
}
}
What's supposed to be the point of two successive if statements with the exact same condition? (Note for the record: neither one has an else clause).
You have declared char* str, but you have not set aside memory for it just yet. You will need to malloc memory for it.
Many memory related errors such as this one can be easily found with valgrind. I'd highly recommend using it as a debugging tool.
char *str;
str has no memory allocated for it. Either use malloc() to allocate some memory for it, or declare it as an array with a fixed size:
char str[MAX_SIZE];
Your program has several bugs:
It does not handle long lines correctly. When you read a buffer of some size it may happen that some number starts at the end of the buffer and continues at the beginning of the next buffer. For example, if you have a buffer of size 4, there might be the input The |numb|er 1|2345| is |larg|e., where the vertical lines indicate the buffer's contents. You would then count the 1 and the 2345 separately.
It calls isdigit with a char as argument. As soon as you read any "large" character (greater than SCHAR_MAX) the behavior is undefined. Your program might crash or produce incorrect results or do whatever it wants to do. To fix this, you must first cast the value to an unsigned char, for example isdigit((unsigned char) *str). Or, as in my code, you can feed it the value from the fgetc function, which is guaranteed to be a valid argument for isdigit.
You use a function that requires a buffer (fgets) but you fail to allocate the buffer. As others noted, the easiest way to get a buffer is to declare a local variable char buffer[BUFSIZ].
You use the str variable for two purposes: To hold the address of the buffer (which should remain constant over the whole execution time) and the pointer for analyzing the text (which changes during the execution). Make these two variables. I would call them buffer and p (short for pointer).
Here is my code:
#include <ctype.h>
#include <stdio.h>
static int sumNumbers(const char *filename)
{
int sum, num, c;
FILE *f;
if ((f = fopen(filename, "r")) == NULL) {
/* TODO: insert error handling here. */
}
sum = 0;
num = 0;
while ((c = fgetc(f)) != EOF) {
if (isdigit(c)) {
num = 10 * num + (c - '0');
} else if (num != 0) {
sum += num;
num = 0;
}
}
sum += num; /* add a number that runs up to the end of the file */
if (fclose(f) != 0) {
/* TODO: insert error handling here. */
}
return sum;
}
int main(int argc, char **argv) {
int i;
for (i = 1; i < argc; i++)
printf("%d\t%s\n", sumNumbers(argv[i]), argv[i]);
return 0;
}
Here is a function, that does your job:
static int sumNumbers(char* filename) {
int sum = 0;
FILE *file = fopen(filename, "r");
char buf[BUFSIZ], *str;
while (fgets(buf, BUFSIZ, file))
{
str=buf;
while (*str)
{
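if (isdigit((unsigned char)*str))
{
sum += strtol(str, &str, 10); /* str now points just past the digits */
}
else
str++; /* advance only when no number was consumed, so the terminator is never skipped */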
}
}
fclose(file);
return sum;
}
This doesn't include error handling, but works quite well. For your file, the output will be
The sum of all the numbers in the file is : 19483

How to iterate over a string in C?

Right now I'm trying this:
#include <stdio.h>
int main(int argc, char *argv[]) {
if (argc != 3) {
printf("Usage: %s %s sourcecode input", argv[0], argv[1]);
}
else {
char source[] = "This is an example.";
int i;
for (i = 0; i < sizeof(source); i++) {
printf("%c", source[i]);
}
}
getchar();
return 0;
}
This does also NOT work:
char *source = "This is an example.";
int i;
for (i = 0; i < strlen(source); i++){
printf("%c", source[i]);
}
I get the error
Unhandled exception at 0x5bf714cf (msvcr100d.dll) in Test.exe: 0xC0000005: Access violation while reading at position 0x00000054.
(loosely translated from german)
So what's wrong with my code?
You want:
for (i = 0; i < strlen(source); i++) {
sizeof gives you the size of the pointer, not the string. However, it would have worked if you had declared the pointer as an array:
char source[] = "This is an example.";
but if you pass the array to a function, that too will decay to a pointer. For strings it's best to always use strlen. And note what others have said about changing printf to use %c. Also, taking mmyers' comments on efficiency into account, it would be better to move the call to strlen out of the loop:
int len = strlen(source);
for (i = 0; i < len; i++) {
or rewrite the loop:
for (i = 0; source[i] != 0; i++) {
One common idiom is:
char* c = source;
while (*c) putchar(*c++);
A few notes:
In C, strings are null-terminated. You iterate while the read character is not the null character.
*c++ increments c and returns the dereferenced old value of c.
printf("%s") prints a null-terminated string, not a char. This is the cause of your access violation.
Rather than use strlen as suggested above, you can just check for the null character:
#include <stdio.h>
int main(int argc, char *argv[])
{
const char *const pszSource = "This is an example.";
const char *pszChar = pszSource;
while (pszChar != NULL && *pszChar != '\0')
{
printf("%c", *pszChar);
++pszChar;
}
getchar();
return 0;
}
An optimized approach:
for (char character = *string; character != '\0'; character = *++string)
{
putchar(character); // Do something with character.
}
Most C strings are null-terminated, meaning that as soon as the character becomes a '\0' the loop should stop. The *++string is moving the pointer one byte, then dereferencing it, and the loop repeats.
The reason why this is more efficient than strlen() is because strlen already loops through the string to find the length, so you would effectively be looping twice (one more time than needed) with strlen().
sizeof(source) returns the number of bytes required by the pointer char*. You should replace it with strlen(source) which will be the length of the string you're trying to display.
Also, you should probably replace printf("%s",source[i]) with printf("%c",source[i]) since you're displaying a character.
sizeof() includes the terminating null character. You should use strlen() (but put the call outside the loop and save it in a variable), but that's probably not what's causing the exception.
you should use "%c", not "%s" in printf - you are printing a character, not a string.
This should work
#include <stdio.h>
#include <string.h>
int main(int argc, char *argv[]){
char *source = "This is an example.";
int length = (int)strlen(source); //sizeof(source)=sizeof(char *) = 4 on a 32 bit implementation
for (int i = 0; i < length; i++)
{
printf("%c", source[i]);
}
}
The last element of a C string is always the integer value 0, hence the phrase "null-terminated string". Since integer 0 is treated as false in C, you can use the character itself as the loop condition of your for loop. When the loop reaches the last element, it finds a zero, the condition is false, and the loop ends.
for(int i = 0; string[i]; i++) { printf("Char at position %d is %c\n", i, string[i]); }
sizeof(source) is returning to you the size of a char*, not the length of the string. You should be using strlen(source), and you should move that out of the loop, or else you'll be recalculating the size of the string every loop.
By printing with the %s format modifier, printf is looking for a char*, but you're actually passing a char. You should use the %c modifier.
Just change sizeof with strlen.
Like this:
char *source = "This is an example.";
int i;
for (i = 0; i < strlen(source); i++){
printf("%c", source[i]);
}
This is 11 years old but relevant to someone who is learning C. I don't understand why we have all this discussion and disagreement about something so fundamental. A string literal in C, I.E. "Text between quotes" has an implicit null terminator after the last character. Don't let the name confuse you. The null terminator is equal to numeric 0. Its purpose is exactly what OP needs it for:
char source[] = "This is an example.";
for (int i = 0; source[i]; i++)
printf("%c", source[i]);
A char in C is a small integer (8 bits on most platforms) holding the numeric code of the corresponding character, which is its ASCII value for text like this. That means source[i] is nonzero until source[19], which is the null terminator after the final '.'. The null character has the value 0. This is where the loop terminates. The loop iterates through every character without needing the length of the array.
Replace sizeof with strlen and it should work.
sizeof(source) returns the size of a pointer, as source is declared as char *.
Correct way to use it is strlen(source).
Next:
printf("%s",source[i]);
expects a string, i.e. %s expects a char *, but you are passing a single character on each iteration. Hence use %c.
However, your way of accessing (iterating over) a string using the index i is correct, so there are no other issues in it.
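You need a pointer to the first char to have a string that %s can print.
printf("%s", source + i);
will not crash, although it prints the rest of the string starting at index i rather than a single character.
Plus, of course, you should have meant strlen(source), not sizeof(source).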
