I am writing a program to check a certain word in a string. I use strtok to chop up the string and store it in an array. There is no problem with that.
The problem comes when I try to check the value of the wordArray at a certain index and say that if it is not NULL, save into a variable, and if it is NULL, do nothing. However, it is not ignoring NULL.
My code is below:
// This is a string to consider
char line[] = "I am here";
// Array of pointers to later hold pointers to each word
char *wordArray[MAX_LINE_LEN];
// Below is the chopping function, this is working well
// First chop up the first word, using the original string
wordArray[0] = strtok(line, " ");
int i = 0;
// Then loop to chop up and save into wordArray
while(wordArray[i] != NULL){
i++;
wordArray[i] = strtok(NULL, " ");
}
// Print out the words in wordArray
for (int j = 0; j < i; j++) {
printf("Word at index %d in wordArray is: %s \n",j, wordArray[j]);
}
// This is a part I don't get
// First define a character array/pointer so that it's the same type with wordArray
char *word = "a word";
int i = 0;
// Check wordArray at a certain key, if not null, save the value into word variable
if (wordArray[i] != NULL) {
word = wordArray[i];
}
printf("Word is: %s \n", word);
When i = 0:
Word is: I
When i = 2:
Word is: here
When i = 3 (at this point it's doing the right thing - ignore the if statement):
Word is: a word
When i >= 4:
Word is:
Nothing prints out. What exactly is its problem? How do I fix this?
UPDATE:
Thanks to all the help! The problem is wordArray has not been initialized with NULL values. Here's what I add:
for (int i = 0; i < MAX_LINE_LEN; i++) {
wordArray[i] = NULL;
}
This is an array of pointers, so I used NULL. For an array of characters the equivalent would be wordArray[i] = '\0', since '\0' is the null character.
// Array of pointers to later hold pointers to each word
char *wordArray[MAX_LINE_LEN];
wordArray is not initially filled with NULL.
// Then loop to chop up and save into wordArray
while(wordArray[i] != NULL){
i++;
wordArray[i] = strtok(NULL, " ");
}
The loop above terminates when i reaches 3, because strtok returns NULL after the third word and that NULL is stored in wordArray[3]. wordArray[4] and everything after it are never written. Since wordArray lives on the stack and is not initialized, those elements can hold any value, so the condition below can be true for them:
if (wordArray[i] != NULL) {
word = wordArray[i];
}
For i >= 4, word then points to an arbitrary address, and passing it to printf is undefined behavior. You were lucky not to get a crash.
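As a rough sketch of the fix described above, the tokenizing loop can be bounded by the array capacity and the lookup can be guarded by the word count, so an uninitialized slot is never read. The names split_words, word_or_default and MAX_WORDS are made up for this illustration and do not come from the question.

```c
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 16  /* assumed capacity, chosen only for this sketch */

/* Splits line in place on spaces and stores pointers to the words.
   Stops before the array is full and always leaves a NULL sentinel
   after the last word. Returns the number of words stored. */
size_t split_words(char *line, char *words[], size_t capacity)
{
    size_t count = 0;
    if (capacity == 0)
        return 0;
    char *tok = strtok(line, " ");
    while (tok != NULL && count < capacity - 1) {
        words[count++] = tok;
        tok = strtok(NULL, " ");
    }
    words[count] = NULL;
    return count;
}

/* Returns words[index] if that slot holds a word, otherwise fallback.
   The index is checked against count, so slots past the sentinel are
   never read. */
const char *word_or_default(char *const words[], size_t count,
                            size_t index, const char *fallback)
{
    return (index < count && words[index] != NULL) ? words[index] : fallback;
}
```

Declaring the array as char *words[MAX_WORDS] = {0}; also sets every element to a null pointer up front, which has the same effect as the initialization loop in the update.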
Related
I am writing a simple shell for a school assignment and am stuck with a segmentation fault. Initially, my shell parses the user input to remove whitespace and the end-of-line character, and separates the words in the input line to store them in a char **args array. I can separate the words and print them without any problem, but when storing the words into the char **args array, if the number of arguments is greater than 1 and odd, I get a segmentation fault.
I know the problem is absurd, but I am stuck with it. Please help me.
This is my parser code and the problem occurs in it:
char **parseInput(char *input){
int idx = 0;
char **parsed = NULL;
int parsed_idx = 0;
while(input[idx]){
if(input[idx] == '\n'){
break;
}
else if(input[idx] == ' '){
idx++;
}
else{
char *word = (char*) malloc(sizeof(char*));
int widx = 0; // Word index
word[widx] = input[idx];
idx++;
widx++;
while(input[idx] && input[idx] != '\n' && input[idx] != ' '){
word = (char*)realloc(word, (widx+1)*sizeof(char*));
word[widx] = input[idx];
idx++;
widx++;
}
word = (char*)realloc(word, (widx+1)*sizeof(char*));
word[widx] = '\0';
printf("Word[%d] --> %s\n", parsed_idx, word);
if(parsed == NULL){
parsed = (char**) malloc(sizeof(char**));
parsed[parsed_idx] = word;
parsed_idx++;
}else{
parsed = (char**) realloc(parsed, (parsed_idx+1)*sizeof(char**));
parsed[parsed_idx] = word;
parsed_idx++;
}
}
}
int i = 0;
while(parsed[i] != NULL){
printf("Parsed[%d] --> %s\n", i, parsed[i]);
i++;
}
return parsed;
}
In your code you have the loop
while(parsed[i] != NULL) { ... }
The problem is that the code never sets any elements of parsed to be a NULL pointer.
That means the loop will go out of bounds, and you will have undefined behavior.
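You need to explicitly set the last element of parsed to be a NULL pointer after you have parsed the input. Note that the array must have room for that extra element: as written, parsed holds exactly parsed_idx pointers, so allocate one more (and handle the case where parsed is still NULL because no words were found):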
while(input[idx]){
// ...
}
parsed[parsed_idx] = NULL;
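One possible way to keep the array NULL-terminated at all times is to reserve the sentinel slot on every append. The helper below is only a sketch; append_word is a hypothetical name, not part of the original code. It also keeps the result of realloc in a temporary, so the old block is not lost if the call fails.

```c
#include <stdlib.h>

/* Appends word to a NULL-terminated pointer array, growing it as needed.
   *list may be NULL when *count is 0.
   Returns 0 on success, -1 if realloc fails (the old array stays valid). */
int append_word(char ***list, size_t *count, char *word)
{
    /* room for the existing entries, the new one, and the NULL sentinel */
    char **grown = realloc(*list, (*count + 2) * sizeof **list);
    if (grown == NULL)
        return -1;
    grown[*count] = word;
    grown[*count + 1] = NULL;
    *list = grown;
    *count += 1;
    return 0;
}
```

With this in place, a loop such as for (size_t i = 0; list[i] != NULL; i++) stops at the sentinel instead of running off the end of the allocation.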
On another couple of notes:
Don't assign back to the same pointer you pass to realloc. If realloc fails it returns a NULL pointer but does not free the old memory. If you assign back to the pointer you will lose it and have a memory leak. You also need to handle the case where realloc fails.
A loop like
int i = 0;
while (parsed[i] != NULL)
{
// ...
i++;
}
is almost exactly the same as
for (int i = 0; parsed[i] != NULL; i++)
{
// ...
}
Please use a for loop instead; it's usually easier to read and follow. Also, with a for loop the index variable (i in your code) lives in a separate scope and is not available outside the loop. Tighter scope for variables leads to fewer possible problems.
In C you shouldn't really cast the result of malloc (or realloc, or really any function returning void *). If you forget to #include <stdlib.h>, the cast can hide the missing declaration and lead to hard-to-diagnose problems.
Also, a beginner might find the -pedantic switch helpful when calling the compiler; it might have flagged some of the issues raised here. I personally am also a fan of -Wall, though many find it annoying instead of helpful.
I am working on a program in C, and I have two variables: file[] and tok[]. The idea is to iterate through file[] character by character and place the characters in tok[]. I can print the characters from file[] directly, but I can't place them into tok[]. How would I grab file[] character by character and place it character by character into tok[]?
My main() method (always returns 0 without any errors):
int main()
{
char file[] = "PRINT \"Hello, world!\"";
int filelen = strlen(file);
int i = 0;
char tok[] = "";
for (i = 0; i < filelen; i++) {
printf("%c \n", file[i]); // Print every char from variable file
tok[strlen(tok)+1] = file[i]; // Add the character to variable tok
printf("%s \n", tok); // Print tok
}
return 0;
}
You make a few errors:
char tok[] = "";
This allocates a fixed-length array of one element (just the terminating null character)! The memory is not automatically expanded when you add characters. Since you want to copy filelen characters, you should do:
char tok[filelen+1]; // note the "+1" for the terminating null character
In your loop, you repeatedly call strlen. Personally I find that a waste of CPU cycles and would prefer to use another index variable, for example:
int toklen= 0; // initially empty
...
tok[toklen++] = file[i]; // Add the character to variable tok
In your version you have added the character one position too far (indices in C go from 0..n-1).
After the loop you must still terminate the string with a null character:
tok[toklen] = '\0';
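Put together, the three points above (a large enough buffer, a separate length counter, and a final terminator) might look like the following sketch. copy_chars is a hypothetical helper name used only for illustration.

```c
#include <stddef.h>

/* Copies src into dst one character at a time.
   dst must have room for dst_size bytes; the copy is truncated if needed
   and dst is always null-terminated.
   Returns the number of characters copied, not counting the terminator. */
size_t copy_chars(char *dst, size_t dst_size, const char *src)
{
    size_t len = 0;
    if (dst_size == 0)
        return 0;
    while (src[len] != '\0' && len < dst_size - 1) {
        dst[len] = src[len];  /* indices run from 0, so write at len, not len + 1 */
        len++;
    }
    dst[len] = '\0';
    return len;
}
```

In the question's main, this could be called as copy_chars(tok, sizeof tok, file) once tok has been declared with at least filelen + 1 elements.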
So I'm getting a file with strings, and I want to tokenize each string whenever I come to a whitespace/newline. I am able to separate the tokens using delimiters, but I'm not able to copy them into an array.
int lexer(FILE* file){
char line[50];
char* delim;
int i = 0;
int* intptr = &i;
while(fgets(line,sizeof(line),file)){
printf("%s\n", line);
if(is_empty(line) == 1)
continue;
delim = strtok(line," ");
if(delim == NULL)
printf("%s\n", "ERROR");
while(delim != NULL){
if(delim[0] == '\n'){
//rintf("%s\n", "olala");
break;
}
tokenArray[*intptr] = delim;
printf("Token IN array: %s\n", tokenArray[*intptr]);
*intptr = *intptr + 1;
delim = strtok(NULL, " ");
}
If I run this I get the output:
Token IN array: 012
Token IN array: 23ddd
Token IN array: vs32
Token IN array: ,344
Token IN array: 0sdf
which is correct according to my text file, but when I try to reprint the array later in the same function, outside the loop:
*intptr = *intptr + 1;
delim = strtok(NULL, " ");
}
}
printf("%s\n", tokenArray[3]);
fclose(file);
return 0;
I don't get any output. I tried writing all the contents of the array to a txt file and got gibberish. I don't know what to do, please help.
First, your pointer to i is useless. Why not use i directly?
I'll assume that from now on.
Then, the real problem: strtok does not allocate the tokens for you. It returns pointers into line, and line is overwritten by every call to fgets. So the pointers stored in tokenArray all refer to the same buffer, and once the loop has finished they show whatever the last read left there. You have to allocate and copy each token that strtok returns.
Something like this would help:
tokenArray[i] = strdup(delim);
(instead of tokenArray[*intptr] = delim;) Note that I have replaced the index with i. Just do i++ afterwards.
BTW I wouldn't recommend using strtok for anything other than quick hacks. This function keeps internal state, so if several parts of your program call it, they can conflict (I made that mistake a long time ago). Check the manual for strtok_r in that case (r for reentrant).
tokenArray[*intptr] = delim;
In this line, delim is a pointer into a char array (line) whose content keeps changing as the while loop reads new lines. So in your case, the content that delim points to should be copied and the copy stored in tokenArray[*intptr], that is:
tokenArray[*intptr] = strdup(delim);
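To illustrate why the copy matters, here is a small sketch that stores an independent copy of every token, so the tokens survive when the line buffer is reused. dup_string is a hypothetical stand-in for POSIX strdup written with standard C only, and collect_tokens is likewise an invented name; neither comes from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for strdup: returns a malloc'd copy of s, or NULL on failure. */
char *dup_string(const char *s)
{
    size_t n = strlen(s) + 1;  /* +1 for the terminating null character */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Tokenises line in place on spaces and newlines and stores a separate
   copy of each token in out[]. Stops when out[] is full or an allocation
   fails. Returns the number of tokens stored; the caller frees each one. */
size_t collect_tokens(char *line, char *out[], size_t capacity)
{
    size_t count = 0;
    char *tok = strtok(line, " \n");
    while (tok != NULL && count < capacity) {
        char *copy = dup_string(tok);
        if (copy == NULL)
            break;
        out[count++] = copy;
        tok = strtok(NULL, " \n");
    }
    return count;
}
```

Because each entry owns its own memory, overwriting line with the next fgets call no longer changes what the array holds. The price is that every stored token has to be passed to free eventually.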
I have problems getting the following code to work. It parses a user's input into a char*[] and returns it. However, the char* commands[] does not accept any values and stays filled with NULL... What's going on here?
void* setCommands(int length){
char copy[strlen(commandline)]; //commandline is a char* read with gets();
strcpy(copy, commandline);
char* commands[length];
for (int x=0; x<length; x++)
commands[x] = "\0";
int i = 0;
char* temp;
temp = strtok (copy, " \t");
while (temp != NULL){
commands[i] = temp; //doesnt work here.. commands still filled with NULL afterwards
i++;
printf("word:%s\n", temp);
temp = strtok (NULL, " \t");
}
commands[i] = NULL;
for (int u=0; u<length; u++)
printf("%s ", commands[i]);
printf("\n");
return *commands;
}
You may assume, that commandline != NULL, length != 0
commands[i] = NULL;
for (int u=0; u<length; u++)
printf("%s ", commands[i]);
Take a very good look at that code. It uses u as the loop control variable but prints out the element based on i.
Hence, due to the fact you've set commands[i] to NULL in the line before the loop, you'll just get a series of NULLs.
Use commands[u] in the loop rather than commands[i].
In addition to that:
void* setCommands(int length){
char* commands[length];
:
return *commands;
}
will only return one pointer, the one to the first token, not the one to the array of token pointers. You cannot return addresses of local variables that are going out of scope (well, you can, but it may not work).
And, in any case, since that one pointer most likely points to yet another local variable (somewhere inside copy), it's also invalid.
If you want to pass back blocks of memory from functions, you'll need to look into using malloc, in this case both for the array of pointers and the strings themselves.
You have a number of issues... Your program will be exhibiting undefined behaviour currently, so until you address the issues you cannot hope to predict what's going on. Let's begin.
The following array is one character too short. You forgot to include a character for the string terminator ('\0'). This leads to a buffer overrun in strcpy, which might be partly responsible for the behaviour you are seeing.
char copy[strlen(commandline)]; // You need to add 1
strcpy(copy, commandline);
The next part is your return value, but it's a temporary (local array). You are not allowed to return this. You should allocate it instead.
// Don't do this:
char* commands[length];
for (int x=0; x<length; x++)
commands[x] = "\0"; // And this is not the right way to zero a pointer
// Do this instead (calloc will zero your elements):
char ** commands = calloc( length, sizeof(char*) );
It's possible for the tokenising loop to overrun your buffer because you never check for length, so you should add in a test:
while( temp != NULL && i < length )
And because of the above, you can't just blindly set commands[i] to NULL after the loop. Either test i < length or just don't set it (you zeroed the array beforehand anyway).
Now let's deal with the return value. Currently you have this:
return *commands;
That returns a pointer to the first token in your temporary string (copy). Firstly, it looks like you actually intended to return an array of tokens, not just the first token. Secondly, you can't return a temporary string. So, I think you meant this:
return commands;
Now, to deal with those strings... There's an easy way, and a clever way. The easy way has already been suggested: you call strdup on each token before shoving them in memory. The annoying part of this is that when you clean up that memory, you have to go through the array and free each individual token.
Instead, let's do it all in one hit, by allocating the array AND the string storage in one call:
char **commands = malloc( length * sizeof(char*) + strlen(commandline) + 1 );
char *copy = (char*)(commands + length);
strcpy( copy, commandline );
The only thing I didn't do above is zero the array. You can do this after the tokenising loop, by just zeroing the remaining values:
while( i < length ) commands[i++] = NULL;
Now, when you return commands, you return an array of tokens which also contains its own token storage. To free the array and all strings it contains, you just do this:
free( commands );
Putting it all together:
void* setCommands(int length)
{
// Create array and string storage in one memory block.
char **commands = malloc( length * sizeof(char*) + strlen(commandline) + 1 );
if( commands == NULL ) return NULL;
char *copy = (char*)(commands + length);
strcpy( copy, commandline );
// Tokenise commands
int i = 0;
char *temp = strtok(copy, " \t");
while( temp != NULL && i < length )
{
commands[i++] = temp;
temp = strtok(NULL, " \t");
}
// Zero any unused tokens
while( i < length ) commands[i++] = NULL;
return commands;
}
I am getting a junk character to be output at the very end of some text that I read in:
hum 1345342342 ~Users/Documents ecabd459 //line that was read in from stdin
event action: hum_?
event timestamp: 1345342342
event path: ~Users/Documents
event hash: ecabd459
At the end of the event action value there is a '_?' garbage character that is output as well. That can be rectified by setting the variable's last position to the null terminator (event.action[3] = '\0') which is all well and good, but I am perplexed by the fact that the other char array event.hash does not exhibit this type of behavior. I am creating/printing them in an identical manner, yet hash does not behave the same.
Note: I was considering maybe this was due to the hash value being followed strictly by a newline character(which I get rid of by the way), so I tested my program with re-ordered input to no avail (that is, added an additional space and word after the hash value's position on the line).
The relevant code is below:
struct Event{
char action[4];
long timestamp;
char* path;
char hash[9];
};
// parse line and return an Event struct
struct Event parseLineIntoEvent(char* line) {
struct Event event;
char* lineSegment;
int i = 0;
lineSegment = strtok(line, " ");
while (lineSegment != NULL) {
if (i > 3) {
printf("WARNING: input format error!\n");
break;
}
if (i == 0)
strncpy(event.action, lineSegment, sizeof(event.action)-1);
else if(i == 1)
event.timestamp = atoi(lineSegment);
else if(i == 2) {
event.path = malloc(sizeof(lineSegment));
strcpy(event.path, lineSegment);
} else if(i == 3)
strncpy(event.hash, lineSegment, sizeof(event.hash)-1);
lineSegment = strtok(NULL, " ");
i++;
} // while
return event;
} // parseLineIntoEvent()
int main (int argc, const char * argv[]) {
//...
printf("%s\n",line); //prints original line that was read in from stdin
struct Event event = parseLineIntoEvent(line);
printf("event action: %s\n", event.action);
printf("event timestamp: %lu\n", event.timestamp);
printf("event path: %s\n", event.path);
printf("event hash: %s\n", event.hash);
free(event.path);
free(line);
//...
return 0;
}
EDIT:
I read in a line with this function, which gets rid of the newline character:
// read in line from stdin, eliminating newline character if present
char* getLineFromStdin() {
char *text;
int textSize = 50*sizeof(char);
text = malloc(textSize);
if ( fgets(text, textSize, stdin) != NULL ) {
char *newline = strchr(text, '\n'); // search for newline character
if ( newline != NULL ) {
*newline = '\0'; // overwrite trailing newline
}
}
return text;
}
Thanks in advance!
This is a mistake:
event.path = malloc(sizeof(lineSegment));
sizeof(lineSegment) is sizeof(char*), the size of the pointer, whereas you need the string length plus one for the terminating null character:
event.path = malloc(sizeof(char) * (strlen(lineSegment) + 1));
To avoid having to insert null string terminators into action and hash you could initialise event:
struct Event event = { 0 };
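A minimal sketch of what the zero initialisation buys you, reusing the struct from the question. make_event is a hypothetical helper introduced only for this example.

```c
#include <string.h>

struct Event {
    char action[4];
    long timestamp;
    char *path;
    char hash[9];
};

/* Builds an Event whose action is copied from src, truncated to fit. */
struct Event make_event(const char *src)
{
    /* { 0 } sets every member to zero: both char arrays are filled with
       '\0' and path becomes a null pointer. */
    struct Event event = { 0 };

    /* Copies at most 3 characters; action[3] is left untouched and is
       still '\0' from the initialisation above. */
    strncpy(event.action, src, sizeof(event.action) - 1);
    return event;
}
```

This relies on the last byte of each array never being overwritten afterwards, which holds as long as every copy is limited to sizeof(array) - 1 bytes.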
From the Linux manual page:
The strncpy() function is similar, except that at most n bytes of src are copied.
Warning: If there is no null byte among the first n bytes of src, the string
placed in dest will not be null-terminated.
When doing strncpy you have to make sure the destination string is properly terminated.
Change the setting of the event.action field:
if (i == 0)
{
strncpy(event.action, lineSegment, sizeof(event.action)-1);
event.action[sizeof(event.action)-1] = '\0';
}
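Since the same pattern is needed for both action and hash, it could be wrapped in a small helper. The sketch below assumes a hypothetical function name, copy_field, which does not appear in the original code.

```c
#include <stddef.h>
#include <string.h>

/* Copies at most dst_size - 1 characters from src into dst and always
   writes a terminating null character, even when src is truncated. */
void copy_field(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

In the parser this would be called as copy_field(event.action, sizeof(event.action), lineSegment) and copy_field(event.hash, sizeof(event.hash), lineSegment), so neither field depends on what happened to be in memory beforehand.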
but I am perplexed by the fact that the other char array event.hash does not exhibit this type of behavior
You got unlucky. hash[8] may have gotten a '\0' by sheer (bad-)luck.
Try setting it to something "random" before your strtok loop
int i = 0;
event.hash[8] = '_'; /* forcing good-luck */
lineSegment = strtok(line, " ");
while (lineSegment != NULL) {
This is because the string "hum" takes only three elements of the four-element character array event.action, and the fourth element stays unset. Because nothing has been assigned to that element, it holds an indeterminate value rather than '\0'. When you printf this character array with %s, printing does not stop after the three valid characters; it continues until it happens to reach a null byte. This causes the garbage characters to show up.