I'm just brushing up on some C for a class and I've run across a little something that makes me scratch my head. For this code:
char * findString(const char * s){
    /* Allocate space */
    char * ret = malloc(strlen(s) + 1);
    /* Copy characters */
    char * n;
    n = ret;
    for ( ;*s != 0; s++)
        if (isLetter(*s))
            *n++ = *s;
    *n = 0;
    /* return pointer to beginning of string */
    return ret;
}
(We're just assuming an isLetter that returns a 1/0).
The idea of the snippet is to take a string with a bunch of crap in it, and return a string that contains only the letters.
So, how does 'ret' work in this instance? I'm very confused by the function returning 'ret' when 'n = ret' is assigned above the for loop and 'ret' never gets set to anything afterwards. Obviously I'm missing something here. Help!
-R. L.
Both ret and n are pointers into the same block of memory. Their 'values' are simply memory addresses. Right after n = ret, *n and *ret are the same character, so writing through n changes the block that ret points to, while ret itself keeps its original value.
//make n point to the beginning of the block of memory pointed
//to by ret
n = ret;
//iterate through the string which was passed to
//the function
for ( ;*s != 0; s++)
    //if the current character is a letter:
    if (isLetter(*s))
        //set the character pointed to by n to
        //the current character in the string, and then
        //make n point to the next one.
        *n++ = *s;
Note that the loop increments n, and that after the loop the character at n is set to 0 (to null-terminate the string). Now n points to the end of the string, but since ret was never changed it still points to the beginning of the memory that you malloced before the loop. When you return it, you're returning a pointer to the new string, which is the string you passed to the function minus all non-letters.
Note that after this function returns, it is the caller's responsibility to free() the memory allocated by the function, lest ye roam into memory leaks.
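To make the ownership point concrete, here is a minimal sketch of the same idea with the allocation checked. The name keep_letters is made up for illustration, and the standard isalpha stands in for the question's isLetter helper, which is an assumption about what that helper does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of s containing only its letters,
   or NULL if allocation fails. The caller must free() the result.
   keep_letters is a hypothetical name; isalpha replaces isLetter. */
char *keep_letters(const char *s)
{
    assert(s != NULL);
    char *ret = malloc(strlen(s) + 1);  /* worst case: every character is a letter */
    if (ret == NULL)
        return NULL;

    char *n = ret;                      /* write cursor; ret keeps the start address */
    for (; *s != '\0'; s++)
        if (isalpha((unsigned char)*s))
            *n++ = *s;                  /* store through n, then advance n only */
    *n = '\0';

    return ret;                         /* still the address malloc returned */
}
```

A caller would do something like char *out = keep_letters("a1b2"); use out, and then call free(out); once it is done with the string.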
ret, while semantically being a string, is actually a pointer to the first character of the string. n is used as a pointer to the current position in that string. So ret stays pointing to the start of the string, while n moves along the string as it is filled in. The *n = 0 then adds the null terminator. Thus while ret doesn't get set to anything, the contents of the string it points to are set.
n and ret are pointers, which means they contain addresses. In this case they both start out containing the address of the same character buffer that was allocated with malloc. In this sense, n and ret are interchangeable until n is advanced.
Not much to it, really. Line 3 allocates an uninitialized buffer, ret, that is long enough to hold the argument even if it is all letters. That buffer will eventually be returned. The function then iterates through the argument and, for each character that is a letter, puts it into the return string by way of an intermediate pointer, n, which keeps track of the current position in the returned string.
ret is a pointer to the beginning of the string to be returned. You need another pointer, n, because this pointer will not always point to the beginning of the string to be returned: it walks along it, setting its characters. And, to be able to return a string, you must return a pointer to the beginning of the string, and you need to know where it ends (that's why you need the 0 added to the end).
Hope I helped!
So I have a buffer (array):
char *buf;
buf = malloc(1024);
buf looks like "foo\0bar\0foo\0bar\0\0\0\0\0\0\0..."
It contains strings separated by the null terminator. I need to separate every string. I tried using strtok() with \0 as the delimiter, but of course it didn't work. How can I achieve that? Afterwards, each string also needs to be "copied" somewhere else.
You can go through the array and copy every character except the \0 into another array/struct depending on what that "somewhere else" needs to be. So every string would end at \0.
Since what you have is not actually a string but a character array that may contain nulls, you can use the memchr function to search for nulls in the array. Then you can use strncpy or strcpy to copy out the individual strings.
char *p = buf;
char *list[1024];
int cnt = 0;
while (p && cnt < 1024) {   // stop when the buffer is exhausted or list is full
    char *n = memchr(p, 0, 1024 - (p-buf));
    if (n) {
        list[cnt++] = strdup(p);
    } else {
        int size = 1024 - (p-buf);
        list[cnt] = malloc(size + 1);
        strncpy(list[cnt], p, size);
        list[cnt++][size] = 0;
    }
    p = n;
    if (p) p++;
}
We start by setting p to the beginning of buf. Then on each iteration, we use memchr to look for the next null byte between p and the end of the array. If we find one, we can treat p as a string and use strdup to allocate space for and duplicate the string. If we don't find a null, we copy the remaining bytes to a newly allocated buffer and manually add a null byte.
Note that you'll need to know how large your buffer is so that you don't read past the end of it.
EDIT:
There was an issue with the code as originally written. After one iteration, p was pointing to a null byte, so memchr would keep returning a pointer to that byte. I added an increment past that byte at the end of the loop so it isn't checked again.
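For comparison, here is a sketch of the same memchr idea wrapped in a function that takes the buffer size as a parameter. It assumes that an empty string marks the end of the list, which matches the padded buffer in the question but is an assumption; split_nul_separated and its parameters are made-up names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies each NUL-terminated string found in buf[0..size) into out[],
   stopping at the first empty string (assumed end-of-list marker), at the
   end of the buffer, or when out[] is full. Returns the number of strings
   copied; the caller frees each of them. */
size_t split_nul_separated(const char *buf, size_t size, char **out, size_t max_out)
{
    assert(buf != NULL && out != NULL);
    size_t count = 0;
    size_t pos = 0;

    while (pos < size && count < max_out) {
        const char *start = buf + pos;
        const char *end = memchr(start, '\0', size - pos);
        size_t len = end ? (size_t)(end - start) : size - pos;

        if (len == 0)           /* empty string: treat as end of the list */
            break;

        char *copy = malloc(len + 1);
        if (copy == NULL)
            break;
        memcpy(copy, start, len);
        copy[len] = '\0';       /* terminate even if the buffer had no NUL */
        out[count++] = copy;

        pos += len + 1;         /* skip the string and its terminator */
    }
    return count;
}
```

Passing the size in, rather than hard-coding 1024, keeps the bounds check in one place.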
I'm working on a function, that has to take a dynamic char array, separate it at spaces, and put each word in an array of char arrays. Here's the code:
char** parse_cmdline(const char *cmdline)
{
    char** arguments = (char**)malloc(sizeof(char));
    char* buffer;
    int lineCount = 0, strCount = 0, argCount = 0;
    int spaceBegin = 0;
    while((cmdline[lineCount] != '\n'))
    {
        if(cmdline[lineCount] == ' ')
        {
            argCount++;
            arguments[argCount] = (char*)malloc(sizeof(char));
            strCount = 0;
        }
        else
        {
            buffer = realloc(arguments[argCount], strCount + 1);
            arguments[argCount] = buffer;
            arguments[argCount][strCount] = cmdline[lineCount];
            strCount++;
        }
        lineCount++;
    }
    arguments[argCount] = '\0';
    free(buffer);
    return arguments;
}
The problem is that somewhere along the way I get a segmentation fault and I don't know exactly where.
Also, this current version of the function assumes that the string does not begin with a space. That is for the next version and I can handle it, but I can't find the reason for the segmentation fault.
This code is surely not what you intended:
char** arguments = (char**)malloc(sizeof(char));
It allocates a block of memory large enough for one char, and sets a variable of type char ** (arguments) to point to it. But even if you wanted only enough space in arguments for a single char *, what you have allocated is not enough (not on any C system you're likely to meet, anyway). It is certainly not long enough for multiple pointers.
Supposing that pointers are indeed wider than single chars on your C system, your program invokes undefined behavior as soon as it dereferences arguments. A segmentation fault is one of the more likely results.
The simplest way forward is probably to scan the input string twice: once to count the number of individual arguments there are, so that you can allocate enough space for the pointers, and again to create the individual argument strings and record pointers to them in your array.
Note, too, that the return value does not carry any accessible information about how much space was allocated, or, therefore, how many argument strings you extracted. The usual approach to this kind of problem is to allocate space for one additional pointer, and to set that last pointer to NULL as a sentinel. This is much akin to, but not the same as, using a null char to mark the end of a C string.
Edited to add:
The allocation you want for arguments is something more like this:
arguments = malloc(sizeof(*arguments) * (argument_count + 1));
That is, allocate space for one more object than there are arguments, with each object the size of the type of thing that arguments is intended to point at. The value of arguments is not accessed by sizeof, so it doesn't matter that it is indeterminate at that point.
Edited to add:
The free() call at the end is also problematic:
free(buffer);
At that point, variable buffer points to the same allocated block as the last element of arguments points to (or is intended to point to). If you free it then all pointers to that memory are invalidated, including the one you are about to return to the caller. You don't need to free buffer at that point any more than you needed to free it after any of the other allocations.
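Here is one way the two-pass approach with a NULL sentinel could look. This is only a sketch: split_words and free_words are hypothetical names, and it assumes blanks are the only separators and that the line ends at '\n' or at the end of the string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Splits cmdline at blanks, stopping at '\n' or '\0'. Returns a
   NULL-terminated array of newly allocated strings, or NULL on
   allocation failure. */
char **split_words(const char *cmdline)
{
    assert(cmdline != NULL);
    size_t len = strcspn(cmdline, "\n");   /* length up to newline or end */

    /* Pass 1: count the words. */
    size_t count = 0;
    for (size_t i = 0; i < len; ) {
        while (i < len && cmdline[i] == ' ') i++;
        if (i < len) count++;
        while (i < len && cmdline[i] != ' ') i++;
    }

    /* One extra slot for the NULL sentinel. */
    char **args = malloc(sizeof *args * (count + 1));
    if (args == NULL)
        return NULL;

    /* Pass 2: copy each word into its own allocation. */
    size_t k = 0;
    for (size_t i = 0; i < len; ) {
        while (i < len && cmdline[i] == ' ') i++;
        size_t start = i;
        while (i < len && cmdline[i] != ' ') i++;
        if (i > start) {
            size_t wlen = i - start;
            args[k] = malloc(wlen + 1);
            if (args[k] == NULL) {          /* undo everything on failure */
                while (k > 0) free(args[--k]);
                free(args);
                return NULL;
            }
            memcpy(args[k], cmdline + start, wlen);
            args[k][wlen] = '\0';
            k++;
        }
    }
    args[k] = NULL;                         /* sentinel marks the end */
    return args;
}

/* Frees an array returned by split_words. */
void free_words(char **args)
{
    if (args == NULL) return;
    for (char **p = args; *p != NULL; p++)
        free(*p);
    free(args);
}
```

The caller walks the result until it hits NULL, so no separate count needs to be returned, and nothing is freed inside split_words on the success path.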
This is probably why you have a segmentation fault:
In char** arguments = (char**)malloc(sizeof(char));, you have used malloc(sizeof(char)), which allocates space for only a single byte (enough space for one char). This is not enough to hold a single char* in arguments.
Even if it were enough on some system, arguments[argCount] would stay within the allocated memory only for argCount = 0. For other values of argCount, the array index is out of bounds, leading to a segmentation fault.
For example, if your input string is something like "Hi. How are you doing?", then it has 4 ' ' characters before \n is reached, and the value of argCount will go up to 4.
What you want to do is something like this:
char** parse_cmdline( const char *cmdline )
{
Allocate your array of argument pointers with room for one pointer and initialize it to NULL.
    char** arguments = malloc( sizeof(char*) );
    arguments[0] = NULL;
Point actPos at the first character of your command line and remember the beginning of the first argument in argStart.
    int argCount = 0;
    const char *argStart = cmdline;
    const char *actPos = argStart;
Continue until the end of the command line is reached.
A blank, or the end of the line, closes an argument that consists of the characters between argStart and actPos. Allocate space for it and copy it from the command line.
    for( ;; actPos++ )
    {
        int atEnd = ( *actPos == '\n' || *actPos == '\0' );
        if( *actPos == ' ' || atEnd )
        {
            if( actPos > argStart ) // skip empty arguments caused by repeated blanks
            {
                size_t len = (size_t)( actPos - argStart );
                argCount++; // increment number of arguments
                arguments = realloc( arguments, (argCount+1) * sizeof(char*) ); // allocate argCount + 1 (NULL at end of list of arguments)
                arguments[argCount] = NULL; // list of arguments ends with NULL
                arguments[argCount-1] = malloc( len+1 ); // allocate number of characters + '\0'
                memcpy( arguments[argCount-1], argStart, len ); // copy characters of argument
                arguments[argCount-1][len] = '\0'; // set '\0' at end of argument string
            }
            argStart = actPos + 1; // next argument starts after blank
            if( atEnd )
                break;
        }
    }
    return arguments;
}
Some suggestions: before calling malloc, first count the number of words you have. Then allocate the pointer array with char ** charArray = malloc(wordCount * sizeof(char*));. This is the space for the array of pointers itself. Each element of charArray then needs its own allocation, sized for the word you are storing in that element, after which you can copy the word into it.
Ex. charArray[i] = malloc(strlen(word) + 1); Then you can store the word with strcpy(charArray[i], word);
Be careful with pointer arithmetic, however.
The segmentation fault definitely arises from accessing an array element outside the memory you allocated, which in turn comes from not allocating enough space for the array.
struct integer
{
    int len;
    char* str;
    int* arr;
}int1, int2;
int main(void) {
    printf("Please enter 1st number\n");
    int1.str= str_input();
    int1.len=chars_read-1;
    int1.arr= func2(int1.len, int1.str);
    printf(("\%c\n"), *int1.str);
    printf("Please enter 2nd number\n");
    int2.str = str_input();
    int2.len=chars_read-1;
    printf(("\n %c\n"), *int1.str );
    int2.arr= func2(int2.len, int2.str);
If the input is 4363 and 78596, the output is 4 and 7 respectively.
The output is not 4 and 4. Given that int1 and int2 are different objects, shouldn't they have separate memory allocations?
Please note: this is NOT a typographical error. I have used the same *int1.str both times. the problem is that although I have made no changes in it, its value is changing. How?
I do not think that str_input() can make a difference.
char* str_input(void) {
    char cur_char;
    char* input_ptr = (char*)malloc(LINE_LEN * sizeof(char));
    char input_string[LINE_LEN];
    //while ((cur_char = getchar()) != '\n' && cur_char<='9' && cur_char>='0' && chars_read < 10000)
    for(chars_read=1; chars_read<10000; chars_read++)
    {
        scanf("%c", &cur_char);
        if(cur_char!='\n' && cur_char<='9' && cur_char>='0')
        {
            input_string[chars_read-1]= cur_char;
            printf("%c\n", input_string[chars_read-1]);
        }
        else{
            break;
        }
    }
    input_string[chars_read] = '\n';
    input_ptr = &input_string[0]; /* sets pointer to address of 0th index */
    return input_ptr;
}
//chars_read is a global variable.
Thanks in advance.
You have printed the same variable, *int1.str, both times.
It would help to see the source code of str_input(), but it probably returns a pointer to the same buffer on each call, so the second call to str_input() also changes what int1.str points to (because int1.str and int2.str hold the same address).
As noted elsewhere, both of the printf calls in your question pass *int1.str to printf.
However, if that is merely a typographical error in your question, and the second printf call passes *int2.str, then most likely the problem is that str_input returns the address of a fixed buffer (with static or, worse, automatic storage duration). Instead, str_input should use malloc or strdup to allocate new memory for each string and should return a pointer to that. Then the caller should free the memory.
Alternatively, str_input may be changed to accept a buffer and size passed to it by the caller, and the caller will have the responsibility of providing a different buffer for each call.
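As a rough sketch of that second option, the reader below writes into a buffer owned by the caller. The name read_digits and the FILE * parameter are assumptions made for illustration (the original reads from standard input and tracks the length in a global); pass stdin to get the original behaviour.

```c
#include <assert.h>
#include <stdio.h>

/* Reads leading decimal digits from in into buf, whose capacity is size
   bytes including the terminator. Stops at the first non-digit (which is
   consumed), at EOF, or when buf is full. Returns the number of digits
   stored. read_digits is a hypothetical replacement for str_input. */
size_t read_digits(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);
    size_t n = 0;
    int c;
    while (n + 1 < size && (c = getc(in)) != EOF && c >= '0' && c <= '9')
        buf[n++] = (char)c;
    buf[n] = '\0';
    return n;
}
```

With two separate arrays, say char a[16], b[16];, calling read_digits(stdin, a, sizeof a) and then read_digits(stdin, b, sizeof b) leaves the first number untouched, because each call writes only into the buffer it was given.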
About the newly posted code
The code for str_input contains this line:
char* input_ptr = (char*)malloc(LINE_LEN * sizeof(char));
That declares input_ptr to be a char * and calls malloc to get space. Then input_ptr is set to contain the address of that space. Later, str_input contains this line:
input_ptr = &input_string[0];
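That line completely ignores the prior value of input_ptr and overwrites it with the address of input_string[0]. So the address returned by malloc is gone, and that block is leaked. The str_input function returns the address of input_string[0] each time it is called. This is wrong: input_string is an automatic array whose lifetime ends when str_input returns, so the returned pointer is dangling, and successive calls are likely to hand back the same stack address. str_input must return the address of the allocated space each time.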
Typically, a routine like this would use input_ptr throughout, doing its work in the char array at that address. It would not use a separate array, input_string, for its work. So delete the definition of input_string and change str_input to do all its work in the space pointed to by input_ptr.
Also, do not set the size of the buffer to LINE_LEN in one place but limit the number of characters in it with chars_read < 10000. Use the same limit in all places. Also allow one byte for a null character at the end (unless you are very careful never to perform any operation that requires a null byte at the end).
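Putting those points together, a version that works in the allocated space throughout might look roughly like this. It is a sketch under assumptions: MAX_DIGITS, read_digits_alloc, the FILE * parameter and the len_out parameter (replacing the global chars_read) are all invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_DIGITS 10000   /* single limit used everywhere (assumed value) */

/* Reads leading decimal digits from in into a newly allocated string and
   stores the digit count in *len_out. Returns NULL on allocation failure.
   The caller must free() the result. */
char *read_digits_alloc(FILE *in, size_t *len_out)
{
    assert(in != NULL && len_out != NULL);
    char *buf = malloc(MAX_DIGITS + 1);   /* +1 for the terminating null */
    if (buf == NULL)
        return NULL;

    size_t n = 0;
    int c;
    while (n < MAX_DIGITS && (c = getc(in)) != EOF && c >= '0' && c <= '9')
        buf[n++] = (char)c;               /* work directly in the heap buffer */
    buf[n] = '\0';

    *len_out = n;
    return buf;                           /* the address malloc returned */
}
```

Each call returns a distinct block, so the strings from two calls cannot overwrite each other, and the caller frees each one when finished.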
#include<stdio.h>
#include<malloc.h>
void my_strcpy(char *sour,char *dest){
    if(sour == NULL || dest == NULL){
        return;
    }
    while(*sour != '\0'){
        *dest++ = *sour++;
    }
    *dest = '\0';
}
int main(){
    char *d = NULL;
    char *s = "Angus Declan R";
    d = malloc(sizeof(char*));
    my_strcpy(s,d);
    printf("\n %s \n",d);
    return 0;
}
This function works fine and prints the string. My question is: since the pointer "dest" ends up pointing at the '\0', how does it print the whole string (given that it no longer points to the start of the string)?
It's true that dest will point to the end of the string. But you are not printing the string by using dest - you are printing the string by using d which is a different variable.
Remember that in C and C++ arguments are passed by value by default, so when you call my_strcpy the value of the variable d is copied into the variable dest, which is local to my_strcpy, and any changes to that variable will not affect d.
Also note that you are not allocating enough space for your d variable:
d = malloc(sizeof(char*));
This will allocate enough space for a pointer to character which will usually mean enough space for 4 (or maybe 8) characters. You should allocate enough space for the string you intend to copy plus one character for the terminating null byte. What is the size of the string you are trying to copy? Hint: strlen should help.
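For the string in the question, "Angus Declan R", that comes to 14 characters plus the terminator, so 15 bytes. A small sketch of sizing the allocation from the string rather than from the pointer type (copy_string is a made-up name):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of src, or NULL if allocation fails.
   The caller must free() the result. */
char *copy_string(const char *src)
{
    assert(src != NULL);
    size_t needed = strlen(src) + 1;   /* characters plus the terminating null */
    char *dst = malloc(needed);        /* not sizeof(char *): that is the pointer's size */
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, needed);          /* copies the terminator too */
    return dst;
}
```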
The d pointer passed in is a copy of the pointer, so the position/movement of the pointer in your function is not reflected back to where it was called in main. While d and dest both point to the same block of memory (at least initially) and any changes to that block of memory will be reflected on both ends, the dest pointer is only a copy.
my_strcpy(s,d);
C passes arguments by value. The value of d is passed to my_strcpy, which means the object d inside main is not modified by your my_strcpy function.
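A tiny sketch that isolates the pass-by-value behaviour: the function below moves its own copy of the pointer all the way to the terminator, and the caller's variable still holds the starting address afterwards. find_end is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Advances its own copy of p to the terminating null and returns that
   address. The caller's pointer variable is not affected. */
const char *find_end(const char *p)
{
    assert(p != NULL);
    while (*p != '\0')
        p++;            /* modifies the local copy only */
    return p;
}
```

After const char *d = "abc"; and const char *end = find_end(d);, d still points at 'a' while end points at the terminator, which is the same situation as d and dest in the question.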
I want to reinitialize d every time in a loop
char d[90];
while(ptr != NULL)
{
    printf("Word: %s\n",ptr);
    //int k = 0;
    strcpy(d, ptr);
    d[sizeof(d)-1] = '\0';
    //something more
    ....
    ....
}
There's no need to do anything "before" strcpy(). Calling strcpy() on the buffer d will overwrite whatever is in the buffer, and leave the buffer holding the string pointed at by ptr at the time of the call. There's no need for the assignment of the last character to '\0'.
Of course, if you're doing the explicit termination because you're not sure if the strcpy() will overwrite d, then you have a problem. You should use strlen() on ptr before the copy to make sure it fits, or use snprintf() if you have it.
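If snprintf() is available (it is standard since C99), a bounded copy could be sketched like this. copy_bounded is a made-up helper name, and reporting truncation through the return value is just one possible design.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies src into dst, whose capacity is size bytes. Returns 1 if the
   whole string fit and 0 if it had to be truncated. dst is always
   null-terminated. */
int copy_bounded(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && src != NULL && size > 0);
    int written = snprintf(dst, size, "%s", src);  /* length src would need */
    return written >= 0 && (size_t)written < size;
}
```

In the loop from the question this would be called as copy_bounded(d, sizeof d, ptr), and a return value of 0 tells you the word did not fit in d.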
memset(d, 0, 90)?
unwind is exactly right. You don't need to do anything before strcpy(). However, if you find yourself later needing to initialize some memory to a particular pattern you can use this:
void* pointerToMemory; // Initialize this appropriately
unsigned char initValue = 0; // Or whatever 8-bit value you want
size_t numBytesToInitialize; // Set this appropriately
memset(pointerToMemory, initValue, numBytesToInitialize);
Simply do strcpy(d, ptr). strcpy will add the null terminator at the end for you, assuming of course that d can hold every character of ptr plus the terminator.