I'm working on a function, that has to take a dynamic char array, separate it at spaces, and put each word in an array of char arrays. Here's the code:
char** parse_cmdline(const char *cmdline)
{
char** arguments = (char**)malloc(sizeof(char));
char* buffer;
int lineCount = 0, strCount = 0, argCount = 0;
int spaceBegin = 0;
while((cmdline[lineCount] != '\n'))
{
if(cmdline[lineCount] == ' ')
{
argCount++;
arguments[argCount] = (char*)malloc(sizeof(char));
strCount = 0;
}
else
{
buffer = realloc(arguments[argCount], strCount + 1);
arguments[argCount] = buffer;
arguments[argCount][strCount] = cmdline[lineCount];
strCount++;
}
lineCount++;
}
arguments[argCount] = '\0';
free(buffer);
return arguments;
}
The problem is that somewhere along the way I get a Segmentation fault and I don't exacly know where.
Also, this current version of the function assumes that the string does not begin with a space, that is for the next version, i can handle that, but i can't find the reason for the seg. fault
This code is surely not what you intended:
char** arguments = (char**)malloc(sizeof(char));
It allocates a block of memory large enough for one char, and sets a variable of type char ** (arguments) to point to it. But even if you wanted only enough space in arguments for a single char *, what you have allocated is not enough (not on any C system you're likely to meet, anyway). It is certainly not long enough for multiple pointers.
Supposing that pointers are indeed wider than single chars on your C system, your program invokes undefined behavior as soon as it dereferences arguments. A segmentation fault is one of the more likely results.
The simplest way forward is probably to scan the input string twice: once to count the number of individual arguments there are, so that you can allocate enough space for the pointers, and again to create the individual argument strings and record pointers to them in your array.
Note, too, that the return value does not carry any accessible information about how much space was allocated, or, therefore, how many argument strings you extracted. The usual approach to this kind of problem is to allocate space for one additional pointer, and to set that last pointer to NULL as a sentinel. This is much akin to, but not the same as, using a null char to mark the end of a C string.
Edited to add:
The allocation you want for arguments is something more like this:
arguments = malloc(sizeof(*arguments) * (argument_count + 1));
That is, allocate space for one more object than there are arguments, with each object the size of the type of thing that arguments is intended to point at. The value of arguments is not accessed by sizeof, so it doesn't matter that it is indeterminate at that point.
Edited to add:
The free() call at the end is also problematic:
free(buffer);
At that point, variable buffer points to the same allocated block as the last element of arguments points to (or is intended to point to). If you free it then all pointers to that memory are invalidated, including the one you are about to return to the caller. You don't need to free buffer at that point any more than you needed to free it after any of the other allocations.
This is probably why you have a segmentation fault:
In char** arguments = (char**)malloc(sizeof(char));, you have used malloc (sizeof (char)), this allocates space for only a single byte (enough space for one char). This is not enough to hold a single char* in arguments.
But even if it was in some system, so arguments[argCount] is only reading allocated memory for argCount = 0. For other values of argCount, the array index is out of bounds - leading to a segmentation fault.
For example, if your input string is something like this - "Hi. How are you doing?", then it has 4 ' ' characters before \n is reached, and the value of argCount will go up till 3.
What you want to do is somthing like this:
char** parse_cmdline( const char *cmdline )
{
Allocate your array of argument pointers with length for 1 pointer and init it with 0.
char** arguments = malloc( sizeof(char*) );
arguments[0] = NULL;
Set a char* pointer to the first char in yor command line and remember the
beginn of the first argument
int argCount = 0, len = 0;
const char *argStart = cmdline;
const char *actPos = argStart;
Continue until end of command line reached.
If you find a blank you have a new argument which consist of th characters between argStart and actPos . Allocate and copy argument from command line.
while( *actPos != '\n' && *actPos != '\0' )
{
if( cmdline[lineCount] == ' ' && actPos > argStart )
{
argCount++; // increment number of arguments
arguments = realloc( arguments, (argCount+1) * sizeof(char*) ); // allocate argCount + 1 (NULL at end of list of arguments)
arguments[argCount] = NULL; // list of arguments ends with NULL
len = actPos - argStart;
arguments[argCount-1] = malloc( len+1 ); // allocate number of characters + '\0'
memcpy( arguments[argCount-1], actPos, len ); // copy characters of argument
arguments[argCount-1] = 0; // set '\0' at end of argument string
argStart = actPos + 1; // next argument starts after blank
}
actPos++;
}
return arguments;
}
some suggestions i would give is, before calling malloc, you might want to first count the number of words you have. then call malloc as char ** charArray = malloc(arguments*sizeof(char*));. This will be the space for the char ** charArray. Then each element in charArray should be malloced by the size of the word you are trying to store in that element. Then you may store that word inside that index.
Ex. *charArray = malloc(sizeof(word)); Then you can store it as **charArray = word;
Be careful with pointer arithmetic however.
The segmentation fault is definitly arising from you trying to access an element in an array in an undefined space. Which arises from you not mallocing space correctly for the array.
Related
I need to create 2 separate functions readLine() and readLines() in C. The first one has to return the pointer to the input string and the second one should return the array of pointers of input strings. readLines() is terminated with a new line character. I am getting some errors, probably something with memory, sometimes it works, sometimes it doesn't. Here is the code:
char* readLine() {
char pom[1000];
gets(pom);
char* s = (char*)malloc(sizeof(char) * strlen(pom));
strcpy(s, pom);
return s;
}
And here is readLines()
char** readLines() {
char** lines = (char**)malloc(sizeof(char*));
int i = 0;
do {
char pom[1000];
gets(pom);
lines[i] = (char*)malloc(sizeof(char) * strlen(pom));
strcpy(lines[i], pom);
i++;
} while (strlen(lines[i - 1]) != 0);
return lines;
}
In the main, I call these functions as
char* p = readLine();
char** lines = readLines();
When allocating memory for a string using malloc, you should allocate enough memory for the entire string, including the terminating null character.
In your case, strcpy will cause a buffer overflow, because the destination buffer isn't large enough.
You should change the line
char* s = (char*)malloc(sizeof(char) * strlen(pom));
to
char* s = (char*)malloc( sizeof(char) * (strlen(pom)+1) );
and change the line
lines[i] = (char*)malloc(sizeof(char) * strlen(pom));
to
lines[i] = (char*)malloc( sizeof(char) * (strlen(pom)+1) );
Also, the line
char** lines = (char**)malloc(sizeof(char*));
is wrong, as it only allocates enough memory for a single pointer. You need one pointer per line. Unfortunately, you don't know in advance how many lines there will be, so you also don't know how many pointers you will need. However, you can resize the buffer as required, by using the function realloc.
Although it is unrelated to your problem, it is worth noting that the function gets has been removed from the ISO C standard and should no longer be used. It is recommended to use fgets instead. See this question for further information: Why is the gets function so dangerous that it should not be used?
Also, in C, it is not necessary to cast the return value of malloc. This is only necessary in C++. See this question for further information: Do I cast the result of malloc?
In your code, you first increment i using the statement i++; and then you reconstruct the previous value of i by subtracting 1:
while (strlen(lines[i - 1]) != 0);
This is unnecessarily cumbersome. It would be better to write
while (strlen(lines[i++]) != 0);
and remove the line i++;. That way, you no longer have to subtract by 1.
i am trying to convert a string (example: "hey there mister") into a double pointer that's pointing to every word in the sentence.
so: split_string->|pointer1|pointer2|pointer3| where pointer1->"hey", pointer2->"there" and pointer3->"mister".
char **split(char *s) {
char **nystreng = malloc(strlen(s));
char str[strlen(s)];
int i;
for(i = 0; i < strlen(s); i++){
str[i] = s[i];
}
char *temp;
temp = strtok(str, " ");
int teller = 0;
while(temp != NULL){
printf("%s\n", temp);
nystreng[teller] = temp;
temp = strtok(NULL, " ");
}
nystreng[teller++] = NULL;
//free(nystreng);
return nystreng;
}
My question is, why isnt this working?
Your code has multiple problems. Among them:
char **nystreng = malloc(strlen(s)); is just wrong. The amount of space you need is the size of a char * times the number pieces into which the string will be split plus one (for the NULL pointer terminator).
You fill *nystreng with pointers obtained from strtok() operating on local array str. Those pointers are valid only for the lifetime of str, which ends when the function returns.
You do not allocate space for a string terminator in str, and you do not write one, yet you pass it to strtok() as if it were a terminated string.
You do not increment teller inside your tokenization loop, so each token pointer overwrites the previous one.
You have an essential problem here in that you do not know before splitting the string how many pieces there will be. You could nevertheless get an upper bound on that by counting the number of delimiter characters and adding 1. You could then allocate space for that many char pointers plus one. Alternatively, you could build a linked list to handle the pieces as you tokenize, then allocate the result array only after you know how many pieces there are.
As for str, if you want to return pointers into it, as apparently you do, then it needs to be dynamically allocated, too. If your platform provides strdup() then you could just use
char *str = strdup(s);
Otherwise, you'll need to check the length, allocate enough space with malloc() (including space for the terminator), and copy the input string into the allocated space, presumably with strcpy(). Normally you would want to free the string afterward, but you must not do that if you are returning pointers into that space.
On the other hand, you might consider returning an array of strings that can be individually freed. For that, you must allocate each substring individually (strdup() would again be your friend if you have it), and in that event you would want to free the working space (or allow it to be cleaned up automatically if you use a VLA).
There are two things you need to do -
char str[strlen(s)]; //size should be equal to strlen(s)+1
Extra 1 for '\0'. Right now you pass str (not terminated with '\0') to strtok which causes undefined behaviour .
And second thing ,you also need allocate memory to each pointer of nystring and then use strcpy instead of pointing to temp(don't forget space for nul terminator).
struct integer
{
int len;
char* str;
int* arr;
}int1, int2;
int main(void) {
printf("Please enter 1st number\n");
int1.str= str_input();
int1.len=chars_read-1;
int1.arr= func2(int1.len, int1.str);
printf(("\%c\n"), *int1.str);
printf("Please enter 2nd number\n");
int2.str = str_input();
int2.len=chars_read-1;
printf(("\n %c\n"), *int1.str );
int2.arr= func2(int2.len, int2.str);
if the input is 4363 and 78596 , the output is 4 and 7 respectively.
The output is not 4 and 4. Given that both are different objects, shouldn't both have different memory allocation?
Please note: this is NOT a typographical error. I have used the same *int1.str both times. the problem is that although I have made no changes in it, its value is changing. How?
I do not think that str_input() can make a difference.
char* str_input(void) {
char cur_char;
char* input_ptr = (char*)malloc(LINE_LEN * sizeof(char));
char input_string[LINE_LEN];
//while ((cur_char = getchar()) != '\n' && cur_char<='9' && cur_char>='0' && chars_read < 10000)
for(chars_read=1; chars_read<10000; chars_read++)
{
scanf("%c", &cur_char);
if(cur_char!='\n' && cur_char<='9' && cur_char>='0')
{
input_string[chars_read-1]= cur_char;
printf("%c\n", input_string[chars_read-1]);
}
else{
break;
}
}
input_string[chars_read] = '\n';
input_ptr = &input_string[0]; /* sets pointer to address of 0th index */
return input_ptr;
}
//chars_read is a global variable.
Thanks in advance.
you have printed the same variable, *int1.str
It will be helpful have the source code of str_input(), but it's probably that it returns a pointer to the same buffer in each call, so the second call to str_input() updates also the target of int1.str (beacuse it's pointing to the same char* than int2.str)
As noted elsewhere, both of the printf calls in your question pass *int1.str to printf.
However, if that is merely a typographical error in your question, and the second printf call passes *int2.str, then most likely the problem is that str_input returns the address of a fixed buffer (with static or, worse, automatic storage duration). Instead, str_input should use malloc or strdup to allocate new memory for each string and should return a pointer to that. Then the caller should free the memory.
Alternatively, str_input may be changed to accept a buffer and size passed to it by the caller, and the caller will have the responsibility of providing a different buffer for each call.
About the newly posted code
The code for str_input contains this line:
char* input_ptr = (char*)malloc(LINE_LEN * sizeof(char));
That declares input_ptr to be a char * and calls malloc to get space. Then input_ptr is set to contain the address of that space. Later, str_input contains this line:
input_ptr = &input_string[0];
That line completely ignores the prior value of input_ptr and overwrites it with the address of input_string[0]. So the address returned by malloc is gone. The str_input function returns the address of input_string[0] each time it is called. This is wrong. str_input must return the address of the allocated space each time.
Typically, a routine like this would use input_ptr throughout, doing its work in the char array at that address. It would not use a separate array, input_string, for its work. So delete the definition of input_string and change str_input to do all its work in the space pointed to by input_ptr.
Also, do not set the size of the buffer to LINE_LEN in one place but limit the number of characters in it with chars_read < 10000. Use the same limit in all places. Also allow one byte for a null character at the end (unless you are very careful never to perform any operation that requires a null byte at the end).
I am trying to remove first semicolon from a character arrary whose value is:
Input: ; Test: 876033074, 808989746, 825766962, ; Test1:
825766962,
Code:
char *cleaned = cleanResult(result);
printf("Returned BY CLEAN: %s\n",cleaned);
char *cleanResult(char *in)
{
printf("Cleaning this: %s\n",in);
char *firstOccur = strchr(in,';');
printf("CLEAN To Remove: %s\n",firstOccur);
char *restOfArray = firstOccur + 2;
printf("CLEAN To Remove: %s\n",restOfArray); //Correct Value Printed here
char *toRemove;
while ((toRemove = strstr(restOfArray + 2,", ;"))!=NULL)
{
printf("To Remove: %s\n",toRemove);
memmove (toRemove, toRemove + 2, strlen(toRemove + 2));
printf("Removed: %s\n",toRemove); //Correct Value Printed
}
return in;
}
Output (first semicolon still there): ; Test: 876033074,
808989746, 825766962; Test1: 825766962;
Regarding sizeof(cleaned): using sizeof to get the capacity of an array only works if the argument is an array, not a pointer:
char buffer[100];
const char *pointer = "something something dark side";
// Prints 100
printf("%zu\n", sizeof(buffer));
// Prints size of pointer itself, usually 4 or 8
printf("%zu\n", sizeof(pointer));
Although both a local array and a pointer can be subscripted, they behave differently when it comes to sizeof. Thus, you cannot determine the capacity of an array given only a pointer to it.
Also, bear this in mind:
void foo(char not_really_an_array[100])
{
// Prints size of pointer!
printf("%zu\n", sizeof(not_really_an_array));
// Compiles, since not_really_an_array is a regular pointer
not_really_an_array++;
}
Although not_really_an_array is declared like an array, it is a function parameter, so is actually a pointer. It is exactly the same as:
void foo(char *not_really_an_array)
{
...
Not really logical, but we're stuck with it.
On to your question. I'm unclear on what you're trying to do. Simply removing the first character of a string (in-place) can be accomplished with a memmove:
memmove( buffer // destination
, buffer + 1 // source
, strlen(buffer) - 1 // number of bytes to copy
);
This takes linear time, and assumes buffer does not contain an empty string.
The reason strcpy(buffer, buffer + 1) won't do is because the strings overlap, so this yields undefined behavior. memmove, however, explicitly allows the source and destination to overlap.
For more complex character filtering, you should consider traversing the string manually, using a "read" pointer and a "write" pointer. Just make sure the write pointer does not get ahead of the read pointer, so the string won't be clobbered while it is read.
void remove_semicolons(char *buffer)
{
const char *r = buffer;
char *w = buffer;
for (; *r != '\0'; r++)
{
if (*r != ';')
*w++ = *r;
}
*w = 0; // Terminate the string at its new length
}
You are using strcpy with overlapping input / output buffer, which results in undefined behavior.
You're searching for a sequence of three characters (comma space semicolon) and then removing the first two (the comma and the space). If you want to remove the semicolon too, you need to remove all three characters (use toRemove+3 instead of toRemove+2). You also need to add 1 to the strlen result to account for the NUL byte terminating the string.
If, as you say, you just want to remove the first semicolon and nothing else, you need to search for just the semicolon (which you can do with strchr):
if ((toRemove = strchr(in, ';')) // find a semicolon
memmove(toRemove, toRemove+1, strlen(toRemove+1)+1); // remove 1 char at that position
When I do this
char *paths[10];
paths[0] = "123456";
printf(1,"%s\n",paths[0]);
printf(1,"%c\n",paths[0][2]);
Output:
123456
3
But when I you do this
char *paths[10];
paths[0][0]='1';
paths[0][1]='2';
paths[0][2]='3';
paths[0][3]='4';
paths[0][4]='5';
paths[0][5]='6';
printf(1,"%s\n",paths[0]);
printf(1,"%c\n",paths[0][2]);
Output:
(null)
3
Why it is null in this case?
How to create a string array using characters in C? I am a bit new to C and feeling some difficulties to program in C
You have a lot of options provided by various answers, I'm just adding a few points.
You can create a string array as follows:
I. You can create an array of read only strings as follows:
char *string_array0[] = {"Hello", "World", "!" };
This will create array of 3 read-only strings. You cannot modify the characters of the string in this case i.e. string_array0[0][0]='R'; is illegal.
II. You can declare array of pointers & use them as you need.
char *string_array1[2]; /* Array of pointers */
string_array1[0] = "Hey there"; /* This creates a read-only string */
/* string_array1[0][0] = 'R';*/ /* This is illegal, don't do this */
string_array1[1] = malloc(3); /* Allocate memory for 2 character string + 1 NULL char*/
if(NULL == string_array1[1])
{
/* Handle memory allocation failure*/
}
string_array1[1][0] = 'H';
string_array1[1][1] = 'i';
string_array1[1][2] = '\0'; /* This is important. You can use 0 or NULL as well*/
...
/* Use string_array1*/
...
free(string_array1[1]); /* Don't forget to free memory after usage */
III. You can declare a two dimensional character array.
char string_array2[2][4]; /* 2 strings of atmost 3 characters can be stored */
string_array2[0][0] = 'O';
string_array2[0][1] = 'l';
string_array2[0][2] = 'a';
string_array2[0][3] = '\0';
string_array2[1][0] = 'H';
string_array2[1][1] = 'i';
string_array2[1][2] = '\0'; /* NUL terminated, thus string of length of 2 */
IV. You can use pointer to pointer.
char ** string_array3;
int i;
string_array3 = malloc(2*sizeof(char*)); /* 2 strings */
if(NULL == string_array3)
{
/* Memory allocation failure handling*/
}
for( i = 0; i < 2; i++ )
{
string_array3[i] = malloc(3); /* String can hold at most 2 characters */
if(NULL == string_array3[i])
{
/* Memory allocation failure handling*/
}
}
strcpy(string_array3[0], "Hi");
string_array3[1][0]='I';
string_array3[1][1]='T';
string_array3[1][2]='\0';
/*Use string_array3*/
for( i = 0; i < 2; i++ )
{
free(string_array3[i]);
}
free(string_array3);
Points to remember:
If you are creating read-only string, you cannot change the character
in the string.
If you are filling the character array, make sure you have
memory to accommodate NUL character & make sure you terminate your
string data with NUL character.
If you are using pointers &
allocating memory, make sure you check if memory allocation is done
correctly & free the memory after use.
Use string functions from
string.h for string manipulation.
Hope this helps!
P.S.: printf(1,"%s\n",paths[0]); looks shady
char *paths[10];
paths[0][0]='1';
paths is an array of pointers. paths is not initialized to anything. So, it has garbage values. You need to use allocate memory using malloc. You just got unlucky that this program actually worked silently. Think of what address location would paths[0][0] would yield to assign 1 to it.
On the other hand, this worked because -
char *paths[10];
paths[0] = "123456";
"123456" is a string literal residing in the reading only location. So, it returns the starting address of that location which paths[0] is holding. To declare correctly -
const char* paths[10];
paths is an array of 10 pointers to char. However, the initial elements are pointers which can not be dereferenced. These are either wild (for an array in a function) or NULL for a static array. A wild pointer is an uninitialized one.
I would guess yours is static, since paths[0] is NULL. However, it could be a coincidence.
When you dereference a NULL or uninitialized pointer, you're reading or writing to memory you don't own. This causes undefined behavior, which can include crashing your program.
You're actually lucky it doesn't crash. This is probably because the compiler sees you're writing a constant to paths[0][2], and changes your printf to print the constant directly. You can not rely on this.
If you want to have a pointer you're allowed to write to, do:
paths[0] = malloc(string_length + 1);
string_length is the number of characters you can write. The 1 gives you room for a NUL. When you're done, you have to free it.
For your second example, if you know the size of each string you can write e.g.
char paths[10][6];
paths[0][0]='1';
...
paths[0][5]='6';
This way you have 10 strings of length 6 (and use only the first string so far).
You can define the string yourself, right after
#inlcude <stdio.h>
like this
typedef char string[];
and in main you can do this
string paths = "123456";
printf("%s\n", paths);
return 0;
so your code would look like this
#include <stdio.h>
typedef char string[];
int main() {
string paths = "123456";
printf("%s", paths);
}