I have been trying to make some kind of "my own shell". So, what I have been trying to do is get input with fgets() and execute it with execvp().
If I use execvp with an array made by me, it works as expected. However, if I try to do it with the results of fgets then I get no output.
main() {
char str[64];
char *array[sizeof(str)];
char *p = NULL;
int i = 0;
printf("my_shell >");
fgets(str, sizeof(str), stdin); // Use fgets instead of gets.
p = strtok(str," ");
while (p != NULL) {
array[i++] = p;
p = strtok(NULL, " ");
}
execvp(str, array);
}
As commented by user3386109, the solution was:
First, the array must have a NULL pointer at the end. Second, the delimiter string passed to both strtok calls should be " \n" (that's a space followed by a newline). You need the newline because fgets will put a newline character into your buffer, and you don't want that newline added to the array as an argument. Finally, put a perror("execvp failed"); after the execvp so that you get some indication of the problem when the execvp fails.
What would be the best way to imitate the functionality of gets with scanf?
Here is my current attempt
int main()
{
char cvalue[20]; //char array to store input string
int iloop=0; //integer variable for loop
for(iloop=0;iloop<20;iloop++) // for loop to get the string char by char
{
scanf("%c",&cvalue[iloop]); //getting input
if(cvalue[iloop]=='\n') //if input is newline skip further looping
break;
} // end of loop
cvalue[iloop]='\0'; //set end of the character for given input
printf("%s",cvalue); //printing the given string
return 0;
}
You could use scanf this way to work like gets (assuming a is a char array; note that nothing limits how many characters are written into it):
scanf("%[^\n]", a);
You need to observe the usual dangers of gets().
The challenges of using scanf() are:
1) Ensuring that \n is consumed. scanf("%[^\n]",... does not do this.
2) Ensuring that str gets a \0 if only a \n is read.
3) Dealing with EOF and I/O errors by returning 0.
4) Ensuring that leading whitespace is read into str, as scanf("%s" skips it.
#include <stdio.h>
// On success, the gets() returns str.
// If EOF encountered, the eof indicator is set (feof).
// If this happens before any characters could be read,
// pointer returned is a null pointer.
// If a read error occurs, the error (ferror) is set
// and a null pointer is also returned.
char *gets_via_scanf( char * str ) {
// Reads characters from stdin & saves them into str until \n or the end-of-file.
// \n, if found, is not copied into str.
int retval = scanf("%[^\n]",str); // %[ does not skip leading whitespace
if (retval == EOF) return 0;
if (retval == 0) {
*str = '\0'; // Happens when the user types only \n
}
char ch;
scanf("%c",&ch); // Consume leftover \n, could be done with getc()
return str;
}
Your attempt doesn't really imitate gets(), since gets() just keeps putting bytes into the supplied buffer until the end of line is reached. You should realize then that gets() is dangerous and should be avoided. It does not offer any protection from buffer overflow. So, it is also questionable to imitate it.
Given that, your attempt has a couple of flaws that I see. First, it loops to the complete size of the input buffer. This doesn't leave you any room to store the NUL terminator if the input line is 20 bytes or longer. This means that you may attempt to store the \0 at cvalue[20], which is outside the array boundary. You can fix this by shortening your for loop by one:
for(iloop=0;iloop<19;iloop++) // for loop to get the string char by char
The second flaw is that you do not check to see if the scanf() call succeeds. If you detect failure, you should also leave the loop:
if (scanf("%c",&cvalue[iloop]) != 1) { //getting input
break;
}
Below was my attempt at creating a safer version of gets() implemented with scanf().
#include <limits.h> /* CHAR_BIT */
#include <stdio.h>
char *getsn (char *s, size_t sz) {
char c;
char fmt[sizeof(sz) * CHAR_BIT + sizeof("[^\n]")];
if (sz == 0) return 0;
if (sz == 1) {
s[0] = '\0';
return s;
}
s[sz-2] = '\0'; /* sentinel: overwritten only if the buffer fills up */
snprintf(fmt, sizeof(fmt), "%%%lu%s", (unsigned long)sz-1, "[^\n]");
switch (scanf(fmt, s)) {
case 0: s[0] = '\0';
scanf("%c", &c); /* consume the lone \n */
return s;
case 1: /* read the next character; push it back unless it is the \n */
if (scanf("%c", &c) == 1 && s[sz-2] != '\0' && c != '\n') {
ungetc(c, stdin);
}
return s;
default: break;
}
return 0;
}
The safer version uses snprintf() to create a format string that limits how many characters should be stored by the scanf(). So if the provided sz parameter was 100, the resulting format string would be "%99[^\n]". Then, it makes sure to only strip out the \n from the input stream if it was actually encountered.
I'm writing a shell and I'm using getline() with stdin from the keyboard to take commands. I'm having trouble tokenizing the inputs though. I tried using \n as a delimiter in the strtok() function, but it seems not to be working.
For example, I included an if statement to check if the user typed "exit" in which case it will terminate the program. It's not terminating.
Here's the code I'm using:
void main() {
int ShInUse = 1;
char *UserCommand; // This holds the input
int combytes = 100;
UserCommand = (char *) malloc (combytes);
char *tok;
while (ShInUse == 1) {
printf("GASh: "); // print prompt
getline(&UserCommand, &combytes, stdin);
tok = strtok(UserCommand, "\n");
printf("%s\n", tok);
if(tok == "exit") {
ShInUse = 0;
printf("Exiting.\n");
exit(0);
}
}
if (tok == "exit")
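tok and "exit" are both pointers, so you are comparing two addresses, not the characters they point to. The comparison itself is well defined, but tok points into your input buffer and "exit" is a string literal stored elsewhere, so it is never true.
This is not the way to compare strings. Use strcmp instead.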
if (strcmp (tok, "exit") == 0)
As @Kirilenko stated, you can't compare strings using the == operator.
But that's not all. If you're using getline() you don't need to split the input into lines anyway, as getline() only reads a single line (it does keep the trailing newline, though, which you still have to strip before comparing). And if you did want to split the input on other delimiters, you'd have to call strtok() in a loop until it returns NULL.
I am getting a junk character to be output at the very end of some text that I read in:
hum 1345342342 ~Users/Documents ecabd459 //line that was read in from stdin
event action: hum_?
event timestamp: 1345342342
event path: ~Users/Documents
event hash: ecabd459
At the end of the event action value there is a '_?' garbage character that is output as well. That can be rectified by setting the variable's last position to the null terminator (event.action[3] = '\0') which is all well and good, but I am perplexed by the fact that the other char array event.hash does not exhibit this type of behavior. I am creating/printing them in an identical manner, yet hash does not behave the same.
Note: I was considering maybe this was due to the hash value being followed strictly by a newline character(which I get rid of by the way), so I tested my program with re-ordered input to no avail (that is, added an additional space and word after the hash value's position on the line).
The relevant code is below:
struct Event{
char action[4];
long timestamp;
char* path;
char hash[9];
};
// parse line and return an Event struct
struct Event parseLineIntoEvent(char* line) {
struct Event event;
char* lineSegment;
int i = 0;
lineSegment = strtok(line, " ");
while (lineSegment != NULL) {
if (i > 3) {
printf("WARNING: input format error!\n");
break;
}
if (i == 0)
strncpy(event.action, lineSegment, sizeof(event.action)-1);
else if(i == 1)
event.timestamp = atoi(lineSegment);
else if(i == 2) {
event.path = malloc(sizeof(lineSegment));
strcpy(event.path, lineSegment);
} else if(i == 3)
strncpy(event.hash, lineSegment, sizeof(event.hash)-1);
lineSegment = strtok(NULL, " ");
i++;
} // while
return event;
} // parseLineIntoEvent()
int main (int argc, const char * argv[]) {
//...
printf("%s\n",line); //prints original line that was read in from stdin
struct Event event = parseLineIntoEvent(line);
printf("event action: %s\n", event.action);
printf("event timestamp: %lu\n", event.timestamp);
printf("event path: %s\n", event.path);
printf("event hash: %s\n", event.hash);
free(event.path);
free(line);
//...
return 0;
}
EDIT:
I read in a line with this function, which gets rid of the newline character:
// read in line from stdin, eliminating newline character if present
char* getLineFromStdin() {
char *text;
int textSize = 50*sizeof(char);
text = malloc(textSize);
if ( fgets(text, textSize, stdin) != NULL ) {
char *newline = strchr(text, '\n'); // search for newline character
if ( newline != NULL ) {
*newline = '\0'; // overwrite trailing newline
}
}
return text;
}
Thanks in advance!
This is a mistake:
event.path = malloc(sizeof(lineSegment));
This allocates only sizeof(char*) bytes, the size of a pointer, when you need the length of the string plus one for the terminating null character:
event.path = malloc(sizeof(char) * (strlen(lineSegment) + 1));
To avoid having to insert null string terminators into action and hash you could initialise event:
struct Event event = { 0 };
From the Linux manual page:
The strncpy() function is similar, except that at most n bytes of src are copied.
Warning: If there is no null byte among the first n bytes of src, the string
placed in dest will not be null-terminated.
When doing strncpy you have to make sure the destination string is properly terminated.
Change the setting of the event.action field:
if (i == 0)
{
strncpy(event.action, lineSegment, sizeof(event.action)-1);
event.action[sizeof(event.action)-1] = '\0';
}
but I am perplexed by the fact that the other char array event.hash does not exhibit this type of behavior
You got unlucky. hash[8] may have gotten a '\0' by sheer (bad-)luck.
Try setting it to something "random" before your strtok loop
int i = 0;
event.hash[8] = '_'; /* forcing good-luck */
lineSegment = strtok(line, " ");
while (lineSegment != NULL) {
This is because the string "hum" takes only three elements of the 4-element character array event.action, and strncpy never writes the fourth element, so it stays unset. An unset element holds whatever value happened to be in that memory. When you printf this character array with %s, printing does not stop after the valid data; it continues until it happens to reach a '\0' byte. This causes the garbage character to show up.
Please explain to me the working of strtok() function. The manual says it breaks the string into tokens. I am unable to understand from the manual what it actually does.
I added watches on str and *pch to check how it works. When the first iteration of the while loop ran, the contents of str were only "this". How did the output shown below get printed on the screen?
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
Output:
Splitting string "- This, a sample string." into tokens:
This
a
sample
string
the strtok runtime function works like this
the first time you call strtok you provide a string that you want to tokenize
char s[] = "this is a string";
in the above string space seems to be a good delimiter between words so lets use that:
char* p = strtok(s, " ");
what happens now is that 's' is searched until the space character is found; that space is overwritten with '\0', the first token ('this') is returned, and p points to that token (string)
in order to get the next token and to continue with the same string, NULL is passed as the first argument, since strtok maintains a static pointer to your previously passed string:
p = strtok(NULL," ");
p now points to 'is'
and so on until no more spaces can be found, then the last string is returned as the last token 'string'.
more conveniently you could write it like this instead to print out all tokens:
for (char *p = strtok(s," "); p != NULL; p = strtok(NULL, " "))
{
puts(p);
}
EDIT:
If you want to keep the tokens independently of the original buffer, copy each one into its own storage, e.g. with strdup(p): the pointers returned by strtok point into the original string, which strtok modifies between iterations in order to terminate each token.
strtok() divides the string into tokens. A token is a run of characters that contains none of the delimiters; any delimiters in front of a token are skipped. In your case, the leading "-" and the space after it are skipped, so the first token starts at "T" and ends at the "," after "This". Here you get "This" as output. The rest of the string is split the same way, and the last token, "string", ends at the ".".
strtok maintains a static, internal reference pointing to the next available token in the string; if you pass it a NULL pointer, it will work from that internal reference.
This is the reason strtok isn't re-entrant; as soon as you pass it a new pointer, that old internal reference gets clobbered.
strtok doesn't change the parameter itself (str). It stores that pointer (in a local static variable). It can then change what that parameter points to in subsequent calls without having the parameter passed back. (And it can advance that pointer it has kept however it needs to perform its operations.)
From the POSIX strtok page:
This function uses static storage to keep track of the current string position between calls.
There is a thread-safe variant (strtok_r) that doesn't do this type of magic.
strtok will tokenize a string i.e. convert it into a series of substrings.
It does that by searching for delimiters that separate these tokens (or substrings). And you specify the delimiters. In your case, you want ' ' or ',' or '.' or '-' to be the delimiter.
The programming model to extract these tokens is that you hand strtok your main string and the set of delimiters. Then you call it repeatedly, and each time strtok returns the next token it finds, until it reaches the end of the main string, at which point it returns NULL. Another rule is that you pass the string in only the first time, and NULL for the subsequent times. This is how you tell strtok whether you are starting a new tokenizing session with a new string or retrieving tokens from the previous session. Note that strtok remembers its state for the tokenizing session, and for this reason it is not reentrant or thread safe (use strtok_r instead when that matters). Another thing to know is that it actually modifies the original string: it writes '\0' over the delimiters that terminate the tokens it finds.
One way to invoke strtok, succinctly, is as follows:
char str[] = "this, is the string - I want to parse";
char delim[] = " ,-";
char* token;
for (token = strtok(str, delim); token; token = strtok(NULL, delim))
{
printf("token=%s\n", token);
}
Result:
token=this
token=is
token=the
token=string
token=I
token=want
token=to
token=parse
The first time you call it, you provide the string to tokenize to strtok. And then, to get the following tokens, you just give NULL to that function, as long as it returns a non NULL pointer.
The strtok function records the string you first provided when you call it. (Which is really dangerous for multi-thread applications)
strtok modifies its input string. It places null characters ('\0') in it so that it will return bits of the original string as tokens. In fact strtok does not allocate memory. You may understand it better if you draw the string as a sequence of boxes.
To understand how strtok() works, one first needs to know what a static variable is. This link explains it quite well....
The key to the operation of strtok() is preserving the location of the last separator between successive calls (that's why strtok() continues to parse the very original string that was passed to it when it is invoked with a null pointer in successive calls).
Have a look at my own strtok() implementation, called zStrtok(), which has slightly different functionality from the one provided by strtok(): it uses only the first character of delim, and it reports consecutive delimiters instead of skipping them.
char *zStrtok(char *str, const char *delim) {
static char *static_str=0; /* var to store last address */
int index=0, strlength=0; /* integers for indexes */
int found = 0; /* check if delim is found */
/* delimiter cannot be NULL
* if no more char left, return NULL as well
*/
if (delim==0 || (str == 0 && static_str == 0))
return 0;
if (str == 0)
str = static_str;
/* get length of string */
while(str[strlength])
strlength++;
/* find the first occurrence of delim */
for (index=0;index<strlength;index++)
if (str[index]==delim[0]) {
found=1;
break;
}
/* if delim is not contained in str, return str */
if (!found) {
static_str = 0;
return str;
}
/* check for consecutive delimiters
*if first char is delim, return delim
*/
if (str[0]==delim[0]) {
static_str = (str + 1);
return (char *)delim;
}
/* terminate the string
* this assignment requires a writable array, so str has to
* be char[] rather than a pointer to a string literal
*/
str[index] = '\0';
/* save the rest of the string */
if ((str + index + 1)!=0)
static_str = (str + index + 1);
else
static_str = 0;
return str;
}
And here is an example usage
Example Usage
char str[] = "A,B,,,C";
printf("1 %s\n",zStrtok(str,","));
printf("2 %s\n",zStrtok(NULL,","));
printf("3 %s\n",zStrtok(NULL,","));
printf("4 %s\n",zStrtok(NULL,","));
printf("5 %s\n",zStrtok(NULL,","));
printf("6 %s\n",zStrtok(NULL,","));
Example Output
1 A
2 B
3 ,
4 ,
5 C
6 (null)
The code is from a string processing library I maintain on Github, called zString. Have a look at the code, or even contribute :)
https://github.com/fnoyanisi/zString
This is how I implemented strtok. It's not that great, but after working on it for 2 hours I finally got it to work. It does support multiple delimiters.
#include "stdafx.h"
#include <iostream>
using namespace std;
char* mystrtok(char str[], const char filter[])
{
static char *ptr = NULL; // where the next token starts
static int flag = 0; // set once the end of the string was reached
if(filter == NULL) {
return str;
}
if(str != NULL) { // a new string restarts the scan
ptr = str;
flag = 0;
}
if(ptr == NULL || flag == 1) {
return NULL;
}
char* ptrReturn = ptr;
for(int j = 0; ptr[j] != '\0'; j++) {
for(int i=0 ; filter[i] != '\0' ; i++) {
if( ptr[j] == filter[i]) {
ptr[j] = '\0';
ptr+=j+1;
return ptrReturn;
}
}
}
flag = 1; // no delimiter left: this is the last token
return ptrReturn;
}
int _tmain(int argc, _TCHAR* argv[])
{
char str[200] = "This,is my,string.test";
char *ppt = mystrtok(str,", .");
while(ppt != NULL ) {
cout<< ppt << endl;
ppt = mystrtok(NULL,", .");
}
return 0;
}
For those who are still having a hard time understanding the strtok() function, take a look at this pythontutor example; it is a great tool for visualizing your C (or C++, Python ...) code.
In case the link got broken, paste in:
#include <stdio.h>
#include <string.h>
int main()
{
char s[] = "Hello, my name is? Matthew! Hey.";
for (char *p = strtok(s," ,?!."); p != NULL; p = strtok(NULL, " ,?!.")) {
puts(p);
}
return 0;
}
Credits go to Anders K.
Here is my implementation, which uses a hash table (a 256-entry lookup table) for the delimiters, which means it is O(n) instead of O(n^2):
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define DICT_LEN 256
int *create_delim_dict(char *delim)
{
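int *d = (int*)malloc(sizeof(int)*DICT_LEN);
if(!d) {
return NULL; /* the caller checks for this */
}
memset((void*)d, 0, sizeof(int)*DICT_LEN);
size_t i;
for(i=0; i< strlen(delim); i++) {
d[(unsigned char)delim[i]] = 1; /* cast: plain char may be negative */
}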
return d;
}
char *my_strtok(char *str, char *delim)
{
static char *last, *to_free;
int *deli_dict = create_delim_dict(delim);
if(!deli_dict) {
/*this check if we allocate and fail the second time with entering this function */
if(to_free) {
free(to_free);
}
return NULL;
}
if(str) {
last = (char*)malloc(strlen(str)+1);
if(!last) {
free(deli_dict);
return NULL;
}
to_free = last;
strcpy(last, str);
}
while(deli_dict[(unsigned char)*last] && *last != '\0') {
last++;
}
str = last;
if(*last == '\0') {
free(deli_dict);
free(to_free);
deli_dict = NULL;
to_free = NULL;
return NULL;
}
while (*last != '\0' && !deli_dict[(unsigned char)*last]) {
last++;
}
if(*last != '\0') { /* do not step past the end of the string */
*last = '\0';
last++;
}
free(deli_dict);
return str;
}
int main()
{
char * str = "- This, a sample string.";
char *del = " ,.-";
char *s = my_strtok(str, del);
while(s) {
printf("%s\n", s);
s = my_strtok(NULL, del);
}
return 0;
}
strtok() stores a pointer to where it left off last time in a static variable, so on the second call, when we pass NULL, strtok() gets the pointer from that static variable.
If you pass the same string again, it starts again from the beginning.
Moreover, strtok() is destructive, i.e. it makes changes to the original string, so make sure you keep a copy of the original if you still need it.
One more problem with strtok() is that, because it stores the address in a static variable, calls from different threads (or interleaved tokenizing of two strings) overwrite each other's state. Use strtok_r() for this.
strtok overwrites each delimiter character (any character listed in the second argument) that ends a token with a null character ('\0'), and a null character is also what marks the end of a string.
http://www.cplusplus.com/reference/clibrary/cstring/strtok/
You can scan the char array looking for the delimiter: when you find it, print a newline; otherwise print the character.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *s;
s = malloc(1024 * sizeof(char));
if (s == NULL || scanf("%1023[^\n]", s) != 1) { // bound the read and check that it succeeded
free(s);
return 1;
}
s = realloc(s, strlen(s) + 1);
int len = strlen(s);
char delim =' ';
for(int i = 0; i < len; i++) {
if(s[i] == delim) {
printf("\n");
}
else {
printf("%c", s[i]);
}
}
free(s);
return 0;
}
So, this is a code snippet to help better understand this topic.
Printing Tokens
Task: Given a sentence, s, print each word of the sentence in a new line.
char *s;
s = malloc(1024 * sizeof(char));
scanf("%1023[^\n]", s); // the width keeps the read inside the 1024-byte buffer
s = realloc(s, strlen(s) + 1);
//logic to print the tokens of the sentence.
for (char *p = strtok(s," "); p != NULL; p = strtok(NULL, " "))
{
printf("%s\n",p);
}
Input: How is that
Result:
How
is
that
Explanation: Here the strtok() function is called in a for loop to print the tokens on separate lines.
The function takes the string and the delimiters ('break-points') as parameters and breaks the string into tokens at those delimiters. Each token is returned through 'p' in turn and used for printing.
strtok replaces the delimiter with the '\0' (null) character in the given string.
CODE
#include<iostream>
#include<cstring>
int main()
{
char s[]="30/4/2021";
std::cout<<(void*)s<<"\n"; // 0x70fdf0
char *p1=s; // same address as printed above; never hard-code it
std::cout<<p1<<"\n";
char *p2=strtok(s,"/");
std::cout<<(void*)p2<<"\n";
std::cout<<p2<<"\n";
char *p3=s; // the address of s again
std::cout<<p3<<"\n";
for(int i=0;i<=9;i++)
{
std::cout<<*p1;
p1++;
}
}
OUTPUT
0x70fdf0 // 1. address of string s
30/4/2021 // 2. print string s through ptr p1
0x70fdf0 // 3. this address is return by strtok to ptr p2
30 // 4. print string which pointed by p2
30 // 5. again assign address of string s to ptr p3 try to print string
30 4/2021 // 6. print characters of string s one by one using loop
Before tokenizing the string
I assigned the address of string s to a pointer (p1) and printed the string through it; the whole string is printed.
After tokenizing
strtok returns the address of string s to the pointer p2, but printing through p2 gives only "30", not the whole string. So strtok is not just returning an address; it also places a '\0' character where the delimiter was.
Cross check
1.
I again assigned the address of string s to a pointer (p3) and printed the string. It prints "30", because tokenizing updated the string with '\0' at the delimiter.
2.
Printing string s character by character in a loop shows that the first delimiter was replaced by '\0', so a blank appears where the '/' used to be.