I am trying to build a function that takes a string of chars and provides a list of those chars separated by a token.
This is what I have so far:
char * decode_args(char arguments[]){
char* token = strtok(arguments, "00");
while (token != NULL){
printf("%s\n", token);
token = strtok(NULL, "00");
}
return 0;
}
This function prints the desired values that I am looking for. For example:
>decode_args("77900289008764")
779
289
8764
The next step is to build an array that can be used in the execv command. The arguments need to be an array. An example here. I am a beginner so I don't even know if "array" is the right word. What data type should be built and how can I do that so I can call execv with the arguments that are currently being printed in a list?
For starters, let me explain a few things about storage and strings.
There are three basic storage durations: automatic, dynamic, and static. Static storage is usually split into two kinds, read-only and read-write. The dynamic and static ones will be useful for you soon.
Automatic variables are function parameters and local variables. When you call a function they are pushed onto the stack, and when the function returns the stack is unwound and they are gone.
Dynamic storage is what you allocate at run time with the malloc family. This is how we create a dynamic array. You need to give this memory back with free when you are done. If you don't, it is called a memory leak; you can check for leaks and other memory errors with the tool valgrind. It is quite useful for a systems programming class.
Static objects are the ones that stay there for the lifetime of the program.
If you define a global variable or static int i = 42, it creates a static read-write variable, so you can change it. Now here is the trick.
void foo() {
char *string1 = "hello, world"; //static read-only
char string2[] = "hello, world"; //automatic
}
So if you try to change string1 you will most likely get a segmentation fault (formally it is undefined behavior), but it is OK to change string2. I don't know why you don't get a segmentation fault by doing decode_args("77900289008764"), but I do get one on my machine. :D
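To make the literal-versus-array point concrete, here is a minimal sketch of copying a string literal into dynamic storage before tokenizing it. The name copy_string is a hypothetical helper, not something from the question; the caller is assumed to free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a modifiable heap copy of s,
 * or NULL if the allocation fails. The caller must free() it. */
char *copy_string(const char *s)
{
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s);
    copy = malloc(len + 1);          /* +1 for the terminating '\0' */
    if (copy != NULL)
        memcpy(copy, s, len + 1);    /* copies the terminator too */
    return copy;
}
```

The copy can be handed to strtok safely, whereas the literal itself cannot.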
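Now C strings. They are null terminated, which means there is a (char) 0 character at the end of each string that marks where it ends. strtok basically overwrites the delimiter that ends each token with that null character, so you end up with multiple substrings instead of one. Note that strtok treats its second argument as a set of delimiter characters, not as a pattern, so "00" means the same as "0". From your example, it turns "77900289008764\0" into the three strings "779\0", "289\0" and "8764\0" inside the same buffer.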
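A small sketch that shows both effects described above: the buffer is modified in place, and the delimiter argument acts as a set of characters. The name count_tokens is a hypothetical helper used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok finds in buf.
 * buf is modified: the delimiter that ends each token is
 * overwritten with '\0'. */
size_t count_tokens(char *buf, const char *delims)
{
    size_t n = 0;

    assert(buf != NULL && delims != NULL);
    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims))
        ++n;
    return n;
}
```

With the buffer "1050099" and the delimiter "00" this counts three tokens ("1", "5" and "99"), because every single '0' is a delimiter. A real two-character separator would have produced "105" and "99".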
So if I were you I would count the occurrences of "00" in the string and allocate that many character pointers, plus one for the terminating NULL. Something like:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char ** decode_args(char *arguments, char *delim) {
int num_substrings = 1; // Note that C doesn't zero-initialize locals like Java etc.
int len_delim = strlen(delim);
for (int i = 0;
arguments[i] != '\0' && arguments[i + 1] != '\0'; // Any of last 2 chars are terminator?
++i)
if (strncmp(arguments + i /* same as &arguments[i] */, delim, len_delim) == 0)
++num_substrings;
char **result = (char **) malloc(sizeof(char *) * (num_substrings + 1));
int i = 0;
char* token = strtok(arguments, delim);
while (token != NULL){
result[i] = token;
++i;
token = strtok(NULL, delim);
}
result[i] = NULL; //End of results. execv wants this as I remember
return result;
}
int main(int argc, char *argv[])
{
char str[] = "foo00bar00baz";
char **results = decode_args(str, "00");
for (int i = 0; results[i] != NULL; ++i) {
char *result = results[i];
puts(result);
}
free(results);
return 0;
}
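Since the goal is to pass the array to execv, here is a hedged sketch of that last step. It assumes a POSIX system and that argv[0] is the full path of an executable; run_argv is a hypothetical helper name. The array must end with a NULL pointer, which is exactly what decode_args above produces.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with the NULL-terminated argv array
 * in a child process and returns its exit status, or -1 on failure. */
int run_argv(char *const argv[])
{
    int status = 0;
    pid_t pid;

    assert(argv != NULL && argv[0] != NULL);
    pid = fork();
    if (pid < 0)
        return -1;                  /* fork failed */
    if (pid == 0) {
        execv(argv[0], argv);       /* only returns if exec failed */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The fork is there because execv replaces the calling process; if that is what you want, you can call execv directly instead.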
Try something like this:
#define MAX_ARGUMENTS 10
int decode_args(char arguments[], char ** pcListeArgs)
{
int iNumElet = 0;
char* token = strtok(arguments, "00");
while ((token != NULL) && (iNumElet < MAX_ARGUMENTS - 1))
{
size_t len = strlen(token);
pcListeArgs [iNumElet] = (char*) calloc (len+1, sizeof (char)); // calloc already zero-fills
if (pcListeArgs [iNumElet] == NULL)
break; // out of memory: report what we have so far
memcpy(pcListeArgs [iNumElet], token, len); // copy data
token = strtok(NULL, "00");
iNumElet++;
}
pcListeArgs [iNumElet] = NULL; // execv expects a NULL-terminated array
return iNumElet; // tokens beyond MAX_ARGUMENTS - 1 are dropped
}
And in the main:
int main() {
char *pListArgs[MAX_ARGUMENTS];
char args[] = "77900289008764";
int iNbArgs = decode_args (args, pListArgs);
if ( iNbArgs > 0)
{
for ( int i=0; i<iNbArgs; i++)
printf ("Argument number %d = %s\n", i, pListArgs[i]);
for ( int i=0; i<iNbArgs; i++)
free (pListArgs[i]);
}
return 0;
}
output:
Argument number 0 = 779
Argument number 1 = 289
Argument number 2 = 8764
I am trying to write a method that takes a string and splits it into two strings based on a delimiter string, similar to .split in Java:
char * split(char *tosplit, char *culprit) {
char *couple[2] = {"", ""};
int i = 0;
// Returns first token
char *token = strtok(tosplit, culprit);
while (token != NULL && i < 2) {
couple[i++] = token;
token = strtok(NULL, culprit);
}
return couple;
}
But I keep getting the Warnings:
In function ‘split’:
warning: return from incompatible pointer type [-Wincompatible-pointer-types]
return couple;
^~~~~~
warning: function returns address of local variable [-Wreturn-local-addr]
... and of course the method doesn't work as I hoped.
What am I doing wrong?
EDIT: I am also open to other ways of doing this besides using strtok().
A few things:
First, you are returning a pointer to a (sequence of) pointer(s) to char, i.e. a char **, while the function is declared to return a pointer to a (sequence of) character(s), i.e. a char *. Hence, the return type should be char **.
Second, you return the address of a local variable, which ceases to exist once the function has finished and must not be accessed afterwards.
Third, the array holds only 2 pointers, so the i < 2 check in your while-loop is essential; without it the loop would write beyond these bounds.
If you really want to split into two strings, the following method should work. Note that the static array is shared between calls, so each call overwrites the previous result:
char ** split(char *tosplit, char *culprit) {
static char *couple[2];
if ((couple[0] = strtok(tosplit, culprit)) != NULL) {
couple[1] = strtok(NULL, culprit);
}
return couple;
}
I'd caution your use of strtok, it probably does not do what you want it to. If you think it does anything like a Java split, read the man page and then re-read it again seven times. It is literally tokenizing the string based on any of the values in delim.
I think you are looking for something like this:
#include <stdio.h>
#include <string.h>
char* split( char* s, char* delim ) {
char* needle = strstr(s, delim);
if (!needle)
return NULL;
needle[0] = 0;
return needle + strlen(delim);
}
int main() {
char s[] = "Fluffy furry Bunnies!";
char* res = split(s, "furry ");
printf("%s%s\n", s, res );
}
Which prints out "Fluffy Bunnies!".
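The same strstr idea can be extended to split on every occurrence of a multi-character delimiter. This is only a sketch: split_on and its parameters are hypothetical names, the output array is supplied by the caller, and the pieces point into the original (modified) buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits s on each occurrence of the whole string
 * delim, storing at most max_out pointers into out. The pieces point
 * into s, which is modified. Returns the number of pieces stored. */
size_t split_on(char *s, const char *delim, char **out, size_t max_out)
{
    size_t n = 0;
    size_t dlen;

    assert(s != NULL && delim != NULL && out != NULL);
    dlen = strlen(delim);
    if (dlen == 0 || max_out == 0)
        return 0;
    while (n + 1 < max_out) {        /* keep one slot for the remainder */
        char *hit = strstr(s, delim);
        if (hit == NULL)
            break;
        *hit = '\0';                 /* terminate the current piece */
        out[n++] = s;
        s = hit + dlen;              /* continue after the delimiter */
    }
    out[n++] = s;                    /* whatever is left is the last piece */
    return n;
}
```

Unlike strtok, this keeps empty fields: "a0000b" split on "00" gives "a", "" and "b". When max_out is reached, the last piece holds the unsplit remainder.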
First of all, strtok modifies the memory of tosplit, so be certain that is what you wish to do. If so, then consider this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/*
* NOTE: unsafe (and leaky) implementation using strtok
*
* src is duplicated with strdup, so it may be a string literal;
* the duplicate is never freed (that is the leak).
* Returns:
* allocated, NULL-terminated array of items that you must free yourself
*
*/
char **__split(char *src, const char *delim)
{
size_t idx = 0;
char *next;
char **dest = NULL;
do {
dest = realloc(dest, (idx + 1)* sizeof(char *));
next = strtok(idx > 0 ? NULL:strdup(src), delim);
dest[idx++] = next;
} while(next);
return dest;
}
int main() {
int x = 0;
char **here = NULL;
here = __split("hello,there,how,,are,you?", ",");
while(here[x]) {
printf("here: %s\n", here[x]);
x++;
}
}
You can implement a much safer and non leaky version (note the strdup) of this but hopefully this is a good start.
The type of couple is char** but you have defined the function return type as char*. Furthermore you are returning the pointer to a local variable. You need to pass the pointer array into the function from the caller. For example:
#include <stdio.h>
#include <string.h>
char** split( char** couple, char* tosplit, char* culprit )
{
// Returns first token
char *token = strtok( tosplit, culprit);
for( int i = 0; token != NULL && i < 2; i++ )
{
couple[i] = token;
token = strtok(NULL, culprit);
}
return couple;
}
int main()
{
char* couple[2] = {"", ""};
char tosplit[] = "Hello World" ;
char** strings = split( couple, tosplit, " " ) ;
printf( "%s, %s", strings[0], strings[1] ) ;
return 0;
}
How would I assign the value from strtok() to an array that's in a struct? Inside my struct I have char *extraRoomOne and in my main I have:
while (token!= NULL)
{
token = strtok(NULL, " ");
certainRoom.extraRoomOne[counter] = token;
}
Compiler is telling me to dereference it, but when I do I get a seg fault.
typedef struct room{
char *extraRoomOne;
}room;
In main, all I had was room certainRoom;
Edit: changed char *extraRoomOne to char **extraRoomOne
Now I have:
token = strtok(NULL," ");
certainRoom.extraRoomOne = realloc(certainRoom.extraRoomOne,(counter + 1) * sizeof(char *));
certainRoom.extraRoomOne[counter] = malloc(strlen(token)+1);
strcpy(certainRoom.extraRoomOne[counter],token);
Is this the correct way of realloc and malloc? I increment the counter below each time as well
You should not do that assignment. strtok() returns a pointer into the string you passed in the first call, and it modifies that string as it goes by writing '\0' terminators over the delimiters, so the token is only valid for as long as that buffer is. Also, extraRoomOne[counter] is a single char, not a pointer, which is why the compiler complains. Instead, copy the token: first allocate space for it with malloc() and then copy it with strcpy().
size_t length;
length = strlen(token);
certainRoom.extraRoomOne = malloc(1 + length);
if (certainRoom.extraRoomOne != NULL)
strcpy(certainRoom.extraRoomOne, token);
Remember to include string.h and stdlib.h.
And if what you really want is to capture more than just one token, which would explain the while loop, you could do it this way:
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
typedef struct room{
char **tokens;
size_t count;
} room;
room
tokenizeString(char *string)
{
char *token;
room instance;
instance.tokens = NULL;
instance.count = 0;
token = strtok(string, " ");
while (token != NULL)
{
void *pointer;
size_t length;
pointer = realloc(instance.tokens, (1 + instance.count) * sizeof(char *));
if (pointer == NULL)
{
size_t i;
for (i = 0 ; i < instance.count ; ++i)
free(instance.tokens[i]);
free(instance.tokens);
instance.tokens = NULL;
instance.count = 0;
return instance;
}
instance.tokens = pointer;
length = strlen(token);
instance.tokens[instance.count] = malloc(1 + length);
if (instance.tokens[instance.count] == NULL)
break; /* out of memory: return the tokens copied so far */
strcpy(instance.tokens[instance.count], token);
instance.count += 1;
token = strtok(NULL, " ");
}
return instance;
}
int
main(int argc, char **argv)
{
room certainRoom;
size_t i;
if (argc < 2) /* argv[1] is required */
return -1;
certainRoom = tokenizeString(argv[1]);
for (i = 0 ; i < certainRoom.count ; ++i)
{
printf("%s\n", certainRoom.tokens[i]);
/* we are done working with this token, release it */
free(certainRoom.tokens[i]);
}
/* all tokens were released, now release the main container,
* note, that this only contained the pointers, the data was
* in the space pointed to by these pointers. */
free(certainRoom.tokens);
return 0;
}
I want to create a function in C, so that I can pass the function a string, and a delimiter, and it will return to me an array with the parts of the string split up based on the delimiter. Commonly used to separate a sentence into words.
e.g.: "hello world foo" -> ["hello", "world", "foo"]
However, I'm new to C and a lot of the pointer things are confusing me. I got an answer mostly from this question, but it does it inline, so when I try to separate it into a function the logistics of the pointers are confusing me:
This is what I have so far:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void split_string(char string[], char *delimiter, char ***result) {
char *p = strtok(string, delimiter);
int i, num_spaces = 0;
while (p != NULL) {
num_spaces++;
&result = realloc(&result, sizeof(char *) * num_spaces);
if (&result == NULL) {
printf("Memory reallocation failed.");
exit(-1);
}
&result[num_spaces - 1] = p;
p = strtok(NULL, " ");
}
// Add the null pointer to the end of our array
&result = realloc(split_array, sizeof(char *) * num_spaces + 1);
&result[num_spaces] = 0;
for (i = 0; i < num_spaces; i++) {
printf("%s\n", &result[i]);
}
free(&result);
}
int main(int argc, char *argv[]) {
char str[] = "hello world 1 foo";
char **split_array = NULL;
split_string(str, " ", &split_array);
return 0;
}
The gist of it being that I have a function that accepts a string, accepts a delimiter and accepts a pointer to where to save the result. Then it constructs the result. The variable for the result starts out as NULL and without memory, but I gradually reallocate memory for it as needed.
But I'm really confused as to the pointers, like I said. I know my result is of type char ** as a string it of type char * and there are many of them so you need pointers to each, but then I'm supposed to pass the location of that char ** to the new function, right, so it becomes a char ***? When I try to access it with & though it doesn't seem to like it.
I feel like I'm missing something fundamental here, I'd really appreciate insight into what is going wrong with the code.
You are confusing dereferencing with addressing (which is the complete opposite). By the way, split_array does not exist anywhere in the function; it is declared down in main. Even if you had the dereferencing and addressing correct, this would still have other issues.
I'm fairly sure you're trying to do this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void split_string(char string[], const char *delimiter, char ***result)
{
char *p = strtok(string, delimiter);
void *tmp = NULL;
int count=0;
*result = NULL;
while (p != NULL)
{
tmp = realloc(*result, (count+1)*sizeof **result);
if (tmp)
{
*result = tmp;
(*result)[count++] = p;
}
else
{ // failed to expand
perror("Failed to expand result array");
exit(EXIT_FAILURE);
}
p = strtok(NULL, delimiter);
}
// add null pointer
tmp = realloc(*result, (count+1)*sizeof(**result));
if (tmp)
{
*result = tmp;
(*result)[count] = NULL;
}
else
{
perror("Failed to expand result array");
exit(EXIT_FAILURE);
}
}
int main()
{
char str[] = "hello world 1 foo", **toks = NULL;
char **it;
split_string(str, " ", &toks);
for (it = toks; *it; ++it)
printf("%s\n", *it);
free(toks);
}
Output
hello
world
1
foo
Honestly this would be cleaner if the function result were used rather than an in/out parameter, but you chose the latter, so there you go.
Best of luck.
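To isolate the dereference-versus-address point from the tokenizing, here is a minimal sketch of an out-parameter on its own. alloc_slots is a hypothetical helper: the caller passes &toks, and the function writes through *result.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n slots (n > 0), sets each to NULL and
 * hands the array back through the out-parameter result.
 * Returns 0 on success, -1 if the allocation fails. */
int alloc_slots(size_t n, char ***result)
{
    char **slots;

    assert(result != NULL && n > 0);
    slots = malloc(n * sizeof *slots);
    if (slots == NULL)
        return -1;
    for (size_t i = 0; i < n; ++i)
        slots[i] = NULL;
    *result = slots;    /* dereference: writes into the caller's char ** */
    return 0;
}
```

On the caller's side this looks like char **toks = NULL; alloc_slots(4, &toks); followed by free(toks) when done. The & appears only at the call site; inside the function every access goes through *result.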
Language: C
I am trying to write a C function with the prototype char *strrev2(const char *string) as part of interview preparation. The closest (working) solution is below; however, I would like an implementation which does not use malloc. Is this possible? Since it returns a char *, if I use malloc the matching free would have to happen in another function.
char *strrev2(const char *string){
int l=strlen(string);
char *r=malloc(l+1);
for(int j=0;j<l;j++){
r[j] = string[l-j-1];
}
r[l] = '\0';
return r;
}
[EDIT] I have already written implementations using a buffer and without the char. Thanks tho!
No - you need a malloc.
Other options are:
Modify the string in-place, but since you have a const char * and you aren't allowed to change the function signature, this is not possible here.
Add a parameter so that the user provides a buffer into which the result is written, but again this is not possible without changing the signature (or using globals, which is a really bad idea).
You may do it this way and make the caller responsible for freeing the memory. Or you can let the caller pass in an already allocated char buffer, so that both the allocation and the free are done by the caller:
void strrev2(const char *string, char* output)
{
// place the reversed string onto 'output' here
}
For caller:
char buffer[100];
char *input = "Hello World";
strrev2(input, buffer);
// the reversed string now in buffer
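One way the stub above could be filled in is sketched below. It adds a buffer-size parameter so the function can refuse to overflow; the name strrev_into and the int return value are assumptions of this sketch, not part of the original signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of strrev2: writes the reverse of src into dst.
 * dst_size is the capacity of dst in bytes, including the terminator.
 * Returns 0 on success, -1 if dst is too small. */
int strrev_into(const char *src, char *dst, size_t dst_size)
{
    size_t len;

    assert(src != NULL && dst != NULL);
    len = strlen(src);
    if (len + 1 > dst_size)
        return -1;                   /* not enough room for text + '\0' */
    for (size_t i = 0; i < len; ++i)
        dst[i] = src[len - 1 - i];
    dst[len] = '\0';
    return 0;
}
```

The caller then writes strrev_into(input, buffer, sizeof buffer) and checks the return value before using buffer.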
You could use a static char[1024]; (1024 is an example size), store all strings used in this buffer and return the memory address which contains each string. The following code snippet may contain bugs but will probably give you the idea.
#include <stdio.h>
#include <string.h>
char* strrev2(const char* str)
{
static char buffer[1024];
static int last_access; //Points to leftmost available byte;
//Check if buffer has enough room for the new string plus its terminator
if( strlen(str) + 1 <= (size_t)(1024 - last_access) )
{
char* return_address = &(buffer[last_access]);
int i;
//FixMe - Make me faster
for( i = 0; i < strlen(str) ; ++i )
{
buffer[last_access++] = str[strlen(str) - 1 - i];
}
buffer[last_access] = 0;
++last_access;
return return_address;
}else
{
return 0;
}
}
int main()
{
char* test1 = "This is a test String";
char* test2 = "George!";
puts(strrev2(test1));
puts(strrev2(test2));
return 0 ;
}
Reverse the string in place (this needs a modifiable char *, so it cannot keep the const char * signature):
char *reverse (char *str)
{
register char c, *begin, *end;
begin = end = str;
while (*end != '\0') end ++;
if (begin == end) return str; /* empty string: avoid stepping end before str */
while (begin < --end)
{
c = *begin;
*begin++ = *end;
*end = c;
}
return str;
}
What would be an efficient way of converting a delimited string into an array of strings in C (not C++)? For example, I might have:
char *input = "valgrind --leak-check=yes --track-origins=yes ./a.out"
The source string will always have only a single space as the delimiter. And I would like a malloc'ed array of malloc'ed strings char *myarray[] such that:
myarray[0]=="valgrind"
myarray[1]=="--leak-check=yes"
...
Edit I have to assume that there are an arbitrary number of tokens in the inputString so I can't just limit it to 10 or something.
I've attempted a messy solution with strtok and a linked list I've implemented, but valgrind complained so much that I gave up.
(If you're wondering, this is for a basic Unix shell I'm trying to write.)
What about something like:
char string[] = "valgrind --leak-check=yes --track-origins=yes ./a.out"; /* an array, because strtok modifies it */
char** args = (char**)malloc(MAX_ARGS*sizeof(char*));
memset(args, 0, sizeof(char*)*MAX_ARGS);
char* curToken = strtok(string, " \t");
for (int i = 0; curToken != NULL && i < MAX_ARGS - 1; ++i) /* keep the last slot NULL */
{
args[i] = strdup(curToken);
curToken = strtok(NULL, " \t");
}
If you have all of the input in input to begin with, then you can never have more tokens than strlen(input). If you don't allow "" as a token, then you can never have more than (strlen(input)+1)/2 tokens. So unless input is huge you can safely write something like the following, where get_token_copy and skip_token are helpers you would write yourself:
char ** myarray = malloc( ((strlen(input)+1)/2) * sizeof(char*) );
int NumActualTokens = 0;
char * pToken;
while ((pToken = get_token_copy(input)) != NULL)
{
myarray[NumActualTokens++] = pToken;
input = skip_token(input);
}
myarray = realloc(myarray, NumActualTokens * sizeof(char*));
As a further optimization, you can keep input around and just replace spaces with \0 and put pointers into the input buffer into myarray[]. No need for a separate malloc for each token unless for some reason you need to free them individually.
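That optimization might look roughly like the sketch below: one allocation for the pointer array, with the tokens left inside the input buffer. The name split_in_place is hypothetical, and the input is assumed to be a modifiable buffer (not a string literal) that outlives the returned array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replaces spaces in input with '\0' and returns a
 * malloc'ed, NULL-terminated array of pointers into input. The token
 * count is stored in *count if count is not NULL. Free only the array. */
char **split_in_place(char *input, size_t *count)
{
    size_t max_tokens;
    size_t n = 0;
    char **argv;
    char *p = input;

    assert(input != NULL);
    max_tokens = strlen(input) / 2 + 1;      /* safe upper bound */
    argv = malloc((max_tokens + 1) * sizeof *argv);
    if (argv == NULL)
        return NULL;
    while (*p != '\0') {
        while (*p == ' ')
            *p++ = '\0';                     /* cut at each space */
        if (*p == '\0')
            break;
        argv[n++] = p;                       /* start of a token */
        while (*p != '\0' && *p != ' ')
            ++p;                             /* skip to its end */
    }
    argv[n] = NULL;                          /* terminator for execv */
    if (count != NULL)
        *count = n;
    return argv;
}
```

Because the tokens are not copied, a single free of the returned array is enough, which also keeps valgrind quiet.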
Were you remembering to malloc an extra byte for the terminating null that marks the end of string?
From the strsep(3) manpage on OSX:
char **ap, *argv[10], *inputstring;
for (ap = argv; (*ap = strsep(&inputstring, " \t")) != NULL;)
if (**ap != '\0')
if (++ap >= &argv[10])
break;
Edited for arbitrary # of tokens:
char **ap, **argv, *inputstring;
int arglen = 10;
argv = calloc(arglen, sizeof(char*));
for (ap = argv; (*ap = strsep(&inputstring, " \t")) != NULL;)
if (**ap != '\0')
if (++ap >= &argv[arglen])
{
arglen += 10;
argv = realloc(argv, arglen * sizeof(char*));
ap = &argv[arglen-10];
}
Or something close to that. The above may not work, but if not it's not far off. Building a linked list would be more efficient than continually calling realloc, but that's really beside the point. The point is how best to make use of strsep.
Looking at the other answers, they might look complex to a beginner in C because the code is so compact. So here is a version for a beginner: it might be easier to actually parse the string yourself instead of using strtok. Something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
char **parseInput(const char *str, int *nLen);
void resizeptr(char ***, int nLen);
int main(int argc, char **argv){
int maxLen = 0;
int i = 0;
char **ptr = NULL;
char *str = "valgrind --leak-check=yes --track-origins=yes ./a.out";
ptr = parseInput(str, &maxLen);
if (!ptr) printf("Error!\n");
else{
for (i = 0; i < maxLen; i++) printf("%s\n", ptr[i]);
}
for (i = 0; i < maxLen; i++) free(ptr[i]);
free(ptr);
return 0;
}
char **parseInput(const char *str, int *Index){
char **pStr = NULL;
const char *ptr = str;
int charPos = 0, indx = 0; /* charPos = length of the token being scanned */
for (;; ptr++){
if (*ptr && !isspace((unsigned char)*ptr)) charPos++;
else{
if (charPos > 0){ /* a token of charPos chars ends just before ptr */
resizeptr(&pStr, ++indx);
pStr[indx-1] = (char *)malloc(charPos + 1);
if (!pStr[indx-1]) return NULL;
strncpy(pStr[indx-1], ptr - charPos, charPos);
pStr[indx-1][charPos] = '\0';
charPos = 0;
}
if (!*ptr) break; /* end of input */
}
}
*Index = indx;
return pStr;
}
void resizeptr(char ***ptr, int nLen){
/* realloc(NULL, size) behaves like malloc(size), so one call covers both cases */
char **tmp = (char **)realloc(*ptr, nLen * sizeof(char *));
if (!tmp){
perror("error!");
exit(EXIT_FAILURE);
}
*ptr = tmp;
}
I slightly modified the code to make it easier. The only string function that I used was strncpy. Sure, it is a bit long-winded, but it grows the array of strings dynamically with realloc instead of using a hard-coded MAX_ARGS, so the array of pointers does not hog memory for many slots when only 3 or 4 would do. The simple parsing is done with isspace while iterating with the pointer. When a space (or the end of the string) is reached after a token, it grows the array of pointers by one and mallocs a buffer to hold the token.
Notice how the triple pointers are used in the resizeptr function.. in fact, I thought this would serve an excellent example of a simple C program, pointers, realloc, malloc, passing-by-reference, basic element of parsing a string...
Hope this helps,
Best regards,
Tom.