Getting the length of a String Token - c

C Programming:
I'm attempting to get the length of each word inside a string, but I'm having a lot of trouble and get a segmentation fault no matter which method I use.
I was originally trying to use just "strlen(pstr)", but that caused a segmentation fault.
This is my latest attempt, which still causes a segmentation fault:
void format_text(char* text)
{
char* pstr;
char copied[1000];
int loop;
int strCnt = 0;
int temp;
int strSize[100];
int strNo[100];
pstr = strtok(text, " ");
while (pstr != NULL)
{
pstr = strtok(NULL, " ");
strcpy(copied, pstr);
strSize[strCnt] = strlen(copied);
strCnt++;
}
printf("the number of strings is: %d\n", strCnt);
for (loop = 0; loop < strCnt; loop++)
{
printf("The length of string %d is %d\n", loop + 1, strSize[loop]);
}
}
How can I get and print the length of each word(token)?

You don't need the copied string to get the length; you can simply use pstr. You also need to call strtok() after strlen(), otherwise you'll miss the length of the first word.
Here's what should work:
pstr = strtok(text, " ");
while (pstr != NULL)
{
strSize[strCnt++] = strlen(pstr);
pstr = strtok(NULL, " ");
}

The strtok() function returns NULL when there are no further tokens to return. Your code does not check for this and instead immediately calls strcpy(copied, pstr), which causes the crash. Check for NULL and terminate the loop before using pstr in any other way.
For bonus learning points, walk through this in your debugger to follow exactly what happens.
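Putting both answers together, a minimal sketch might look like the following. The helper name token_lengths and its max parameter are assumptions made for illustration; they are not part of the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores the length of each space-separated word of
 * `text` in `lengths` (at most `max` entries) and returns the word count.
 * Note that strtok() modifies `text` by writing '\0' over delimiters. */
size_t token_lengths(char *text, size_t lengths[], size_t max)
{
    size_t count = 0;
    char *tok = strtok(text, " ");

    while (tok != NULL && count < max) {
        lengths[count++] = strlen(tok);   /* use the token before advancing */
        tok = strtok(NULL, " ");          /* may be NULL: the loop test catches it */
    }
    return count;
}
```

Because the lengths are size_t values, they would be printed with %zu rather than %d. The text passed in must be a writable array, not a string literal.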

Related

Reading from an input file and storing words into an array [duplicate]

The end goal is to output a text file where repeating words are encoded as single digits instead. The current problem I'm having is reading words and storing them into an array.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define MAX_CODE 120
void main() {
FILE *inputfile = fopen("input.txt","rw");
char buffer[128];
char *token;
char *words[MAX_CODE];
int i = 0;
while(fgets(buffer, 128, inputfile)){
token = strtok(buffer," ");
printf("Token %d was %s",i,token);
while(token != NULL) {
words[i] = malloc(strlen(token)+1);
strcpy(words[i], token);
i++;
token = strtok(buffer," ");
}
}
for(int i = 0; i<3; i++) printf("%d\n%s\n",i,words[i]);
printf("End");
}
What I get is segmentation fault errors, or nothing. What I want is for words to be an array of strings. I'm allocating memory for each string, so where am I going wrong?
Your second call to strtok should pass NULL for the first argument. Otherwise, strtok will parse the first token over and over again.
token = strtok(buffer," ");
printf("Token %d was %s\n",i,token);
while(i < MAX_CODE && token != NULL) {
words[i] = malloc(strlen(token)+1);
strcpy(words[i], token);
i++;
token = strtok(NULL," ");
}
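The check against MAX_CODE is for safety's sake, in case you ever increase the size of your buffer or reduce the value of MAX_CODE. In your current code, the maximum number of space-delimited tokens a single 128-byte buffer can hold is 64. Because i is not reset between lines, the check also keeps words from overflowing when the file contains more than MAX_CODE words in total.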
From cppreference:
If str != NULL, the call is treated as the first call to strtok for this particular string. ...
If str == NULL, the call is treated as a subsequent calls to strtok: the function continues from where it left in previous invocation. ...
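As a rough sketch of how the corrected loop could be packaged, the function below copies the words of one line into an array. The name store_words and its parameters are assumptions for illustration only. It also treats the newline as a delimiter, since fgets() keeps it in the buffer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies each word of `line` into newly allocated
 * storage, appending to `words` starting at index `count`. Returns the new
 * count. The caller owns the copies and must free() them. */
int store_words(char *line, char *words[], int count, int max_words)
{
    /* fgets() keeps the trailing newline, so treat it as a delimiter too. */
    const char *delims = " \t\r\n";
    char *token = strtok(line, delims);

    while (token != NULL && count < max_words) {
        char *copy = malloc(strlen(token) + 1);
        if (copy == NULL)
            break;                      /* out of memory: stop early */
        strcpy(copy, token);
        words[count++] = copy;
        token = strtok(NULL, delims);   /* NULL continues the same string */
    }
    return count;
}
```

It would be called once per line, for example i = store_words(buffer, words, i, MAX_CODE); inside the while(fgets(...)) loop. Note also that "rw" is not one of the standard fopen() modes; "r" is enough for reading, and the returned FILE pointer should be checked for NULL before use.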

Explicitly ignoring NULL value but C behaves weird

I am writing a program to check a certain word in a string. I use strtok to chop up the string and store it in an array. There is no problem with that.
The problem comes when I try to check the value of the wordArray at a certain index and say that if it is not NULL, save into a variable, and if it is NULL, do nothing. However, it is not ignoring NULL.
My code is below:
// This is a string to consider
char line[] = "I am here";
// Array of pointers to later hold pointers to each word
char *wordArray[MAX_LINE_LEN];
// Below is the chopping function, this is working well
// First chop up the first word, using the original string
wordArray[0] = strtok(line, " ");
int i = 0;
// Then loop to chop up and save into wordArray
while(wordArray[i] != NULL){
i++;
wordArray[i] = strtok(NULL, " ");
}
// Print out the words in wordArray
for (int j = 0; j < i; j++) {
printf("Word at index %d in wordArray is: %s \n",j, wordArray[j]);
}
// This is a part I don't get
// First define a character array/pointer so that it's the same type with wordArray
char *word = "a word";
int i = 0;
// Check wordArray at a certain key, if not null, save the value into word variable
if (wordArray[i] != NULL) {
word = wordArray[i];
}
printf("Word is: %s \n", word);
When i = 0:
Word is: I
When i = 2:
Word is: here
When i = 3 (at this point it's doing the right thing - ignore the if statement):
Word is: a word
When i >= 4:
Word is:
Nothing prints out. What exactly is its problem? How do I fix this?
UPDATE:
Thanks for all the help! The problem is that wordArray had not been initialized with NULL values. Here's what I added:
for (int i = 0; i < MAX_LINE_LEN; i++) {
wordArray[i] = NULL;
}
This is an array of pointers, so I used NULL. For an array of characters, wordArray[i] = '\0' would be the appropriate choice, since '\0' is the null character.
// Array of pointers to later hold pointers to each word
char *wordArray[MAX_LINE_LEN];
wordArray is not initially set to NULL.
// Then loop to chop up and save into wordArray
while(wordArray[i] != NULL){
i++;
wordArray[i] = strtok(NULL, " ");
}
The loop above terminates when i reaches 3, because wordArray[3] receives the NULL returned by strtok. The elements from wordArray[4] onward are never written. Since wordArray lives on the stack and you do not initialize it, those elements can hold any value, so the condition below will not reliably fail.
if (wordArray[i] != NULL) {
word = wordArray[i];
}
You are lucky that you didn't get a hard fault: in this case word ends up holding an indeterminate pointer, and using it is undefined behavior.
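One way to make the NULL check meaningful is to clear the array before tokenizing and to bounds-check every lookup. The sketch below assumes two hypothetical helpers, split_and_clear and word_or_default, which do not appear in the original code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: clears `words`, then fills it with pointers to the
 * space-separated tokens of `line`. Returns the number of tokens stored. */
size_t split_and_clear(char *line, char *words[], size_t cap)
{
    size_t n = 0;
    char *tok;

    for (size_t k = 0; k < cap; k++)
        words[k] = NULL;               /* every unused slot is now a known NULL */

    for (tok = strtok(line, " "); tok != NULL && n < cap; tok = strtok(NULL, " "))
        words[n++] = tok;
    return n;
}

/* Hypothetical helper: returns words[idx] if it is in range and non-NULL,
 * otherwise `fallback`. */
const char *word_or_default(char *const words[], size_t cap, size_t idx,
                            const char *fallback)
{
    if (idx < cap && words[idx] != NULL)
        return words[idx];
    return fallback;
}
```

Declaring the array as char *wordArray[MAX_LINE_LEN] = {0}; has the same effect as the clearing loop, because the remaining elements of a partially initialized array are set to null pointers.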

Segmentation fault when using strlen on user input string

I'm trying to understand what's wrong with my code.
I have a string composed of words entered by the user.
I've terminated the string, so it should be OK.
Then I use another loop to reverse the order of the words. But when I call strlen on the last word of the string, I get a segmentation fault.
(The reversal part is not done yet, since I'm stuck on this problem.)
Why?
Here is the code:
char *frase, c;
int i=0;
int d=1;
frase = (char*)malloc(sizeof(char));
printf("Insert phrase: ");
while(c != '\n') {
c = getc(stdin);
frase = (char*)realloc(frase,d*sizeof(char)); //dynamic allocation
frase[i] = c;
d++;
i++;
}
//at the end i terminate the string
frase[i]='\0';
printf("\nInserted phrase: %s\n",frase);
// here I start the reversal, first of all I separate all the words
char *pch;
char s[2] = " ";
int messl=0;
pch = strtok (frase,s);
printf ("%s\n",pch);
messl += 1 + strlen(pch);
printf ("Lung stringa = %d\n",messl);
char *message = (char*) malloc(messl);
while (pch != NULL) {
pch = strtok (NULL, s);
printf("%s\n",pch);
messl += 1 + strlen(pch); //in the last cycle of the loop I get the error
}
//printf ("%s\n",message);
return 0;
In your code.
while(c != '\n')
at the very first iteration, c is uninitialised. Using the value of an automatic local variable that has not been explicitly initialized invokes undefined behaviour.
getc() returns an int, which may not fit into a char (EOF in particular). Change the type of c to int.
That said, since you mention that the segfault comes from strlen(), you need to check that the pointer passed to strlen() is not NULL. Add a NULL check on pch immediately after tokenizing.
The main problem is:
while (pch != NULL) {
pch = strtok (NULL, s);
printf("%s\n",pch);
messl += 1 + strlen(pch);
When strtok returns NULL, you go on to call printf and strlen on it. You need to immediately test pch upon calling strtok. For example the loop structure could be:
while ( (pch = strtok(NULL, s)) != NULL ) {
There are various other problems too, as other answerers and commenters have noted.
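The points raised in both answers can be combined in a sketch like the one below. The function names read_line and tokens_total_size, and the initial capacity of 16 bytes, are assumptions chosen for illustration. The reader uses an int for the character and always leaves room for the terminator, which the original loop did not.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into a growing heap buffer.
 * Returns a NUL-terminated string without the newline, or NULL on allocation
 * failure. The caller must free() the result. */
char *read_line(FILE *in)
{
    size_t len = 0, cap = 16;
    char *buf = malloc(cap);
    int c;                                  /* int, so EOF is representable */

    if (buf == NULL)
        return NULL;
    while ((c = getc(in)) != EOF && c != '\n') {
        if (len + 1 >= cap) {               /* keep room for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    buf[len] = '\0';
    return buf;
}

/* Hypothetical helper: sums strlen(token) + 1 over all tokens in `phrase`. */
size_t tokens_total_size(char *phrase, const char *delims)
{
    size_t total = 0;
    char *pch = strtok(phrase, delims);

    while (pch != NULL) {
        total += strlen(pch) + 1;
        pch = strtok(NULL, delims);         /* tested at the top of the loop */
    }
    return total;
}
```

Doubling the capacity avoids calling realloc() once per character, and assigning the result to a temporary pointer keeps the original buffer reachable if realloc() fails.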

segmentation fault while running the programme

I have written code for parsing a string into words. Here is the code. Can anyone help fix the segmentation fault that occurs at run time?
Calling function:
int main()
{
int count = 0, i; // count to hold numbr of words in the string line.
char buf[MAX_LENTHS]; // buffer to hold the string
char *options[MAX_ORGS]; // options to hold the words that we got after parsing.
printf("enter string");
scanf("%s",buf);
count = parser(buf,options); // calling parser
for(i = 0; i < count; ++i)
printf("option %d is %s", i, options[i]);
return 0;
}
Called function:
int parser(char str[], char *orgs[])
{
char temp[1000];//(char *)malloc(strlen(str)*sizeof(char));
int list = 0;
strcpy(temp, str);
*orgs[list]=strtok(str, " \t ");
while(((*orgs[list++]=strtok(str," \t"))!=NULL)&&MAX_ORGS>list)
list++;
printf("count =%d",list);
return list;
}
Note: I'm trying to learn C these days. Can anyone recommend a good tutorial (PDF) or site for learning about strings with pointers and passing strings to functions as arguments?
You are using strtok wrong.
(It is generally best to not use strtok at all, for all its problems and pitfalls.)
If you must use it, the proper way to use strtok is to call it ONCE with the string you want to "tokenize",
then call it again and again with NULL as an indication to continue parsing the original string.
I also think you're using the orgs array wrong.
Change this assignment
*orgs[list++]=strtok(str, " \t ");
to this:
orgs[list++]=strtok(str, " \t ");
Because orgs is an array of character-pointers.
orgs[x] is a character-pointer, which matches the return-type of strtok
Instead, you are referring to *orgs[x], which is just a character.
So you are trying to do:
[character] = [character-pointer];
which will result in "very-bad-things™".
Finally, note that you are incrementing list twice each time through your loop.
So basically you're only filling in the even-elements, leaving the odd-elements of orgs uninitialized.
Only increment list once per loop.
Basically, you want this:
orgs[list] = strtok(str, " \t");
/* list advances only after a non-NULL token, so it ends up as the token count */
while (orgs[list] != NULL && ++list < MAX_ORGS)
    orgs[list] = strtok(NULL, " \t");
PS You allocate space for temp, and strcpy into it.
But then it looks like you never use it. Explain what temp is for, or remove it.
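For readers who want to follow the advice to avoid strtok altogether, one possible alternative is to scan with the standard strspn() and strcspn() functions. This is a different technique from the one in the answer above, and the helper names next_token and count_tokens are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next token in *cursor without modifying the
 * string. On success, returns a pointer to the token's first character,
 * stores its length in *len, and advances *cursor past it. Returns NULL when
 * no tokens remain. */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    const char *start = *cursor + strspn(*cursor, delims);  /* skip delimiters */

    if (*start == '\0')
        return NULL;
    *len = strcspn(start, delims);                           /* token extent */
    *cursor = start + *len;
    return start;
}

/* Hypothetical helper: counts the tokens in `s`. */
size_t count_tokens(const char *s, const char *delims)
{
    size_t n = 0, len;

    while (next_token(&s, delims, &len) != NULL)
        n++;
    return n;
}
```

The tokens returned this way are not NUL-terminated, so they would be printed with a precision, as in printf("%.*s\n", (int)len, tok). In exchange, the input string is left untouched and no hidden state is shared between calls.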
char buf[MAX_LENTHS];
You have not defined the array size, i.e. MAX_LENTHS should be defined like
#define MAX_LENTHS 25
And, as Paul R says in his comment, you also need to initialize your array of character pointers
char *options[MAX_ORGS];
with something.
int parser(char str[], char *orgs[]){
int list=0;
orgs[list]=strtok(str, " \t\n");
while(orgs[list]!=NULL && ++list < MAX_ORGS)
orgs[list]=strtok(NULL," \t\n");
printf("count = %d\n",list);
return list;
}
int main(){
int count=0,i;
char buf[MAX_LENTHS];
char *options[MAX_ORGS];
printf("enter string: ");
fgets(buf, sizeof(buf), stdin);//input include space character
count=parser(buf,options);
for(i=0;i<count;++i)
printf("option %d is %s\n",i,options[i]);
return 0;
}

Compilation issues on linux

So I wrote the following code on Linux (Ubuntu) using the Emacs text editor. It is basically supposed to split the string on the delimiter passed in. When I ran it, it segfaulted. I ran it through GDB, and it gives me an error at strcpy (which I don't invoke, but which is probably called implicitly by sprintf). I didn't think I was doing anything wrong, so I booted into Windows and ran it through Visual Studio, where it works fine. I am new to writing C on Linux. I know the problem is in the while loop where I call sprintf() to write the token to the array, which is odd because the call outside the loop writes without causing an error. If anyone can tell me where I am going wrong, I would greatly appreciate it. Here is the code:
/* split()
Description:
- takes a string and splits it into substrings "on" the
<delimeter>*/
void split(char *string, char *delimiter)
{
int i;
int count = 0;
char *token;
//large temporary buffer to over compensate for the fact that we have
//no idea how many arguments will be passed with a command
char *bigBuffer[25];
for(i = 0; i < 25; i++)
{
bigBuffer[i] = (char*)malloc(sizeof(char) * 50);
}
//get the first token and add it to <tokens>
token = strtok(string, delimiter);
sprintf(bigBuffer[0], "%s", token);
//while we have not encountered the end of the string keep
//splitting on the delimeter and adding to <bigBuffer>
while(token != NULL)
{
token = strtok(NULL, delimiter);
sprintf(bigBuffer[++count], "%s", token);
}
//for(i = 0; i < count; i++)
//printf("i = %d : %s\n", i, bigBuffer[i]);
for(i = 0; i< 25; i++)
{
free(bigBuffer[i]);
}
} //end split()
You aren't checking for NULL from the return of strtok on the last iteration of the loop ... so strtok can return NULL, yet you still pass the NULL value in the token pointer to sprintf.
Change your while-loop to the following:
while(token = strtok(NULL, delimiter)) sprintf(bigBuffer[++count], "%s", token);
That way you can never pass a NULL pointer to sprintf, because the while-loop condition guarantees that token is non-NULL whenever sprintf is called with it as an argument.
You should ask gdb for a full backtrace (the bt command) of where your program crashed. The fact that you don't know precisely where it crashed means you didn't ask for one, and that information is important.
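A slightly more defensive variant of the loop is sketched below. It folds the first strtok call into the loop, so an input with no tokens never reaches the copy, and it bounds both the token count and the token length. The name split_bounded and the two macros are assumptions that mirror the sizes used in the question.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS 25   /* mirrors the 25 buffers in the question */
#define TOKEN_LEN  50   /* mirrors the 50-byte buffers in the question */

/* Hypothetical variant of split(): copies up to MAX_TOKENS tokens of `string`
 * into `out`, truncating any token longer than TOKEN_LEN - 1 characters.
 * Returns the number of tokens stored. */
int split_bounded(char *string, const char *delimiter,
                  char out[MAX_TOKENS][TOKEN_LEN])
{
    int count = 0;
    char *token = strtok(string, delimiter);

    while (token != NULL && count < MAX_TOKENS) {
        /* snprintf never writes more than TOKEN_LEN bytes, terminator included */
        snprintf(out[count++], TOKEN_LEN, "%s", token);
        token = strtok(NULL, delimiter);
    }
    return count;
}
```

Passing a NULL pointer for %s is undefined behavior, so the original code can appear to work with one toolchain and crash with another.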

Resources