Fundamental C pointer question using 'strcpy' - c

Man, pointers continue to give me trouble. I thought I understood the concept. (Basically, that you would use *ptr when you want to manipulate the actual memory saved at the location that ptr points to. You would just use ptr if you would like to move that pointer by doing things such as ptr++ or ptr--.) So, if that is the case, and you use the asterisk to manipulate the data that the pointer is pointing to, how does this work:
char *MallocAndCopy( char *line ) {
char *pointer;
if ( (pointer = malloc( strlen(line)+1 )) == NULL ) {
printf( "Out of memory!!! Goodbye!\n" );
exit( 0 );
}
strcpy( pointer, line );
return pointer;
}
malloc returns a pointer, so I understand why the "pointer" in the if condition does not utilize the asterisk. However, in the strcpy function, it is sending the CONTENTS of line to the CONTENTS of pointer. Shouldn't it be:
strcpy( *pointer, *line);
????? Or is my understanding of pointers correct and that is just the way that the strcpy function works?

A C-style string is an array of character bytes. When you pass an array around as a pointer to the array type, the pointer contains the address of the first element of the array.
The strcpy function takes the pointer to the first char of the source array (which is the start of the string) and the pointer to the first char in the destination array, and iterates over the source until it reaches a '\0' character, which terminates the string.
That's also why when you call malloc, the size that you pass to it is strlen(line)+1, because you need to allocate one more byte for the termination character (as far as I know).
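To make the two points above concrete, here is a minimal sketch. The helper names bytes_needed and decays_to_first_element are made up for illustration and are not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Bytes a copy of s needs: its characters plus the terminating '\0'. */
size_t bytes_needed(const char *s)
{
    return strlen(s) + 1;
}

/* Returns 1 when the decayed array name and &array[0] hold the same address. */
int decays_to_first_element(const char *array_name, const char *first_element)
{
    return array_name == first_element;
}
```

With char buf[8] = "hi"; calling decays_to_first_element(buf, &buf[0]) returns 1, and bytes_needed("abc") is 4.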

char* strcpy(char *destination, const char *source) is the signature for strcpy. It expects to get pointers as arguments. In your instance, pointer is already a pointer, and so is line.
strcpy will take the pointers source and destination, and copy the underlying bytes pointed at from source to destination until it hits the \0 (NUL) byte in the string pointed to by source. If it never encounters that byte (unterminated string), it just reads off into the abyss: the behavior is undefined, and a segmentation fault is a typical result.
If you used *pointer, you would actually be dereferencing the pointer and getting the char at the address the pointer points to.

The signature of strcpy is
char * strcpy ( char * destination, const char * source );
You are sending in pointer which is char*, and line which is also char*. Thus you are matching the signature exactly as expected. Sending in *pointer or *line would be sending 'the values these pointers point to' - which would be wrong.

strcpy takes one pointer and writes the contents it points to into the location pointed to by another pointer. It is like rewriting cell contents in a table. You don't necessarily have to cut and paste a part of the table; you can rewrite the numbers in it. In that case the pointer is just a hint about which cells you have to touch.

Look at the declaration of strcpy().
char *strcpy(
char *strDestination,
const char *strSource
);
You need to pass a pointer. So you don't dereference pointer and line. Because that would pass a char.

The * is inside strcpy.
Applying the * operator is called dereferencing, and is exactly the same as applying the [0].
strlen and strcpy need to know where the strings start so that they can access all of their elements, not just the first.
The type of pointer is char * while the type of *pointer is char, meaning a single character with no concept of neighboring characters.
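A small sketch of the type difference, using two hypothetical helpers (the names are mine, not from the question):

```c
#include <stddef.h>

/* *p and p[0] are the same expression: each yields one char. */
int deref_equals_index0(const char *p)
{
    return *p == p[0];
}

/* sizeof *p is the size of a single char, whatever the string's length. */
size_t size_of_pointee(const char *p)
{
    return sizeof *p;
}
```

size_of_pointee returns 1 for any string, which is why a bare *pointer cannot tell strcpy where the rest of the characters live.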

By writing *pointer, you obtain the character that it points to (which happens to be the first character of the string that pointer represents). Passing it to strcpy involves creating a copy of that character. It's impossible for strcpy to know where it should be writing, unless you give it a pointer to that data.
In a similar fashion, by providing *line you're only giving strcpy a copy of the first character of your string.
So, you're basically saying:
I have two strings. I'm not giving them to you. The first letter of one is C, the first letter of the other is µ. Copy the contents of the first to the second

I see why you expect to pass *pointer and *line to strcpy. But remember that pointer points to the location in memory where the content is stored and where strlen(line)+1 bytes of memory are reserved and allocated. Thus what strcpy does is copy strlen(line)+1 bytes (the characters plus the terminating '\0') starting from the address line into the corresponding locations starting from the address pointer. If you pass *pointer and *line, strcpy will have access to only the first char and not the rest of the string. Hope that clarifies your confusion about the pointers and why we need to pass pointers to strcpy.

Your basic understanding of strcpy is correct. The function indeed copies the contents of line over the contents of pointer. However, you have to pass pointers to the function and not the data itself. This is done for several reasons.
First, the data that is being copied is a string (that is, an array) and not a regular variable. When passing an array to a function, you can't cram the entire array into a single function argument so instead you have to pass a pointer to the beginning of the array. This is a limitation of array passing in C and is not specific to strcpy.
Also, passing data to a function provides the function with a copy of the variable's content. If you directly passed data (or de-referenced pointers) to strcpy, the function would be working with copies and not the original data. It would be unable to write data to the original string.
Essentially, by handing pointers to strcpy you are telling it the location of the source data and of the destination. The function handles all of the de-referencing internally. In human language, the call to strcpy can be thought of as "take the string that starts at memory address line and write a copy of it starting at memory address pointer". To communicate memory addresses, you use pointers.
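The copy-versus-address point can be sketched with two tiny functions. Both names are invented for this example.

```c
/* Receives a copy of one char; the caller's string is untouched. */
void set_by_value(char c)
{
    c = 'X';
    (void)c; /* only the local copy changed */
}

/* Receives an address; the write lands in the caller's buffer. */
void set_by_pointer(char *p)
{
    *p = 'X';
}
```

Given char s[] = "abc"; the call set_by_value(*s) leaves s as "abc", while set_by_pointer(s) turns it into "Xbc".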

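Consider the strcpy code as:
char * strcpy(char * dst, const char * src)
{
size_t i = 0 ;
/* copy each byte, including the terminating '\0', then stop */
while ((dst[i] = src[i]) != '\0')
i++ ;
return dst ;
}
The terminating '\0' (NUL) is the end of both the string and the loop.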

Your function is basically what strdup does. The function is defined on many platforms; if it isn't, you can define it this way:
char * my_strdup(const char* s)
{
size_t len = strlen(s);
char *r = malloc(len+1);
return r ? memcpy(r, s, len+1) : NULL;
}
memcpy is usually faster than strcpy for longer strings.

Related

Does setting a string char to null cause a memory leak in C?

This seems like a silly question, but I couldn't find the answer.
Anyways, if you set an arbitrary character to null in a string,
then free the string, does that cause a memory leak?
I suppose my knowledge of how the free function works is limited.
/*
char *
strchr(const char *s, int c);
char *
strrchr(const char *s, int c);
The strchr() function locates the first occurrence of c (converted to a
char) in the string pointed to by s. The terminating null character is
considered part of the string; therefore if c is ‘\0’, the functions
locate the terminating ‘\0’.
The strrchr() function is identical to strchr() except it locates the
last occurrence of c.
*/
char* string = strdup ("THIS IS, A STRING WITH, COMMAS!");
char* ch = strrchr( string, ',' );
*ch = 0;
free( string );
/*
The resulting string should be: "THIS IS, A STRING WITH"
When the string pointer is freed, does this result in a memory leak?
*/
Not a stupid question in my opinion.
TLDR: no you do not cause a memory leak.
Now the longer answer: free has no idea what a string is. If you pass it a char* or an int* it could not care less.
The way malloc and free work is the following: when you call malloc you supply a size and receive a pointer with the promise of that many bytes being reserved on the heap from the position of the pointer onwards. At that point the size and position are also saved internally in some way (how is an implementation detail).
Now when you call free it does not need to be told the size; it can just remove the entry your pointer belongs to, together with the size.
Addendum: also not every char* points to a string, it just so happens that "abcd" becomes a null terminated char* pointing to the 'a', but a char* itself points to a single char, not multiple
malloc only allocates the chunk of memory and gives you the reference to it. If you do not read or write outside the boundaries of this chunk you can do whatever you want with it.
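As a rough sketch of the same scenario without relying on strdup: the hypothetical helper truncated_length below copies a string to the heap, cuts it at the last comma, and frees the whole block with the original pointer.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the string length after cutting at the last comma, or -1 if malloc fails. */
long truncated_length(const char *text)
{
    size_t n = strlen(text) + 1;
    char *copy = malloc(n);
    char *comma;
    long len;

    if (copy == NULL)
        return -1;
    memcpy(copy, text, n);

    comma = strrchr(copy, ',');
    if (comma != NULL)
        *comma = '\0';          /* shortens the string, not the allocation */

    len = (long)strlen(copy);
    free(copy);                 /* releases all n bytes malloc handed out */
    return len;
}
```

free is given the pointer malloc returned, so the position of any '\0' inside the block plays no part in how much memory is released.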

What is the difference between using strcpy and equating the addresses of strings?

I am not able to understand the difference between strcpy function and the method of equating the addresses of the strings using a pointer.The code given below would make my issue more clear. Any help would be appreciated.
//code to take input of strings in an array of pointers
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
//suppose the array of pointers is of 10 elements
char *strings[10],string[50],*p;
int length, i;
//proper method to take inputs:
for(i=0;i<10;i++)
{
scanf(" %49[^\n]",string);
length = strlen(string);
p = (char *)malloc(length+1);
strcpy(p,string);//why use strcpy here instead of p = string
strings[i] = p; //why use this long way instead of writing directly strcpy(strings[i],string) by first defining malloc for strings[i]
}
return 0;
}
A short introduction into the magic of pointers:
char *strings[10],string[50],*p;
These are three variables with distinct types:
char *strings[10]; // an array of 10 pointers to char
char string[50]; // an array of 50 char
char *p; // a pointer to char
Then the following is done (10 times):
scanf(" %49[^\n]",string);
Read C string from input and store it into string considering that a 0 terminator must fit in also.
length = strlen(string);
Count non-0 characters until 0 terminator is found and store in length.
p = (char *)malloc(length+1);
Allocate memory on heap with length + 1 (for 0 terminator) and store address of that memory in p. (malloc() might fail. A check if (p != NULL) wouldn't hurt.)
strcpy(p,string);//why use strcpy here instead of p = string
Copy C string in string to memory pointed in p. strcpy() copies until (inclusive) 0 terminator is found in source.
strings[i] = p;
Assign p (the pointer to memory) to strings[i]. (After assignment strings[i] points to the same memory as p. The assignment is a pointer assignment, not an assignment of the value pointed to.)
Why strcpy(p,string); instead of p = string:
The latter would assign address of string (the local variable, probably stored on stack) to p.
The address of allocated memory (with malloc()) would have been lost. (This introduces a memory leak - memory in heap which cannot be addressed by any pointer in code.)
p would now point to the local variable in string (for every iteration in for loop). Hence afterwards, all entries of strings[10] would point to string finally.
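The difference can be sketched with two slots instead of ten. The function names are invented for the example, and the buffer size of 16 is an arbitrary assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Pointer assignment: both slots end up aliasing the one input buffer. */
int alias_shares_storage(void)
{
    char buffer[16];
    char *slots[2];

    strcpy(buffer, "first");
    slots[0] = buffer;
    strcpy(buffer, "second");   /* overwrites what slots[0] sees too */
    slots[1] = buffer;
    return slots[0] == slots[1] && strcmp(slots[0], "second") == 0;
}

/* malloc + strcpy: each slot owns its own copy. Returns -1 if malloc fails. */
int copy_keeps_values(void)
{
    char buffer[16];
    char *slots[2];
    int ok;

    strcpy(buffer, "first");
    slots[0] = malloc(strlen(buffer) + 1);
    if (slots[0] == NULL)
        return -1;
    strcpy(slots[0], buffer);

    strcpy(buffer, "second");
    slots[1] = malloc(strlen(buffer) + 1);
    if (slots[1] == NULL) {
        free(slots[0]);
        return -1;
    }
    strcpy(slots[1], buffer);

    ok = strcmp(slots[0], "first") == 0 && strcmp(slots[1], "second") == 0;
    free(slots[0]);
    free(slots[1]);
    return ok;
}
```

alias_shares_storage returns 1 because the first input was lost; copy_keeps_values returns 1 because both inputs survived.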
char *strings[10]---- --------->1.
strcpy(strings[i],string) ----->2.
strings[i] = string ----------->3.
p = (char *)malloc(length+1); -|
strcpy(p,string); |-> 4.
strings[i] = p;----------------|
1. strings is an array of pointers; each pointer must point to valid memory.
2. Will lead to undefined behavior, since strings[i] is not pointing to valid memory.
3. Works, but every pointer of strings will point to the same location, thus each will have the same contents.
4. Thus create the new memory first, copy the contents to it and assign that memory to strings[i].
strcpy copies a particular string into allocated memory. Assigning pointers doesn't actually copy the string, just sets the second pointer variable to the same value as the first.
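char *strcpy(char *destination, const char *source);
copies from source to destination until the function finds '\0'. strcpy does no bounds checking, so it overflows the destination when the source is longer than the space available; consider a bounded alternative such as strncpy or strlcpy. Note that strncpy does not write a terminating '\0' when the source fills the whole count, and strlcpy is not part of standard C. You can find useful information about these two functions at https://linux.die.net/man/3/strncpy - check where your code is going to run in order to help you choose the best option.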
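One way to use strncpy safely is to terminate the destination yourself. This is only a sketch; bounded_copy is a hypothetical helper, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Copies at most size-1 characters and always terminates dst.
   Returns 1 if all of src fit, 0 if it was truncated (or size is 0). */
int bounded_copy(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return 0;
    strncpy(dst, src, size - 1);
    dst[size - 1] = '\0';       /* strncpy alone may leave this out */
    return strlen(src) < size;
}
```

With a 4-byte destination, copying "abcdef" stores "abc" and returns 0, while copying "hi" stores "hi" and returns 1.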
In your code block you have this declaration
char *strings[10],string[50],*p;
This declares three variables, and they are quite different. p is an ordinary pointer, and must be made to point at valid memory (for example via malloc) before you can use it. string is not a pointer but an array of 50 characters (usually 1 byte each), allocated on the function stack directly so you can use it right away (though its contents are indeterminate until you write to it, so fill it before reading from it). Finally, strings is an array of 10 pointers, each element of which (strings[1], strings[9] etc.) must be pointed at allocated memory before use.
The only one of those which you can write into immediately is string, because the space is already allocated. Each of them can be addressed via subscripts - but in each case you must ensure that you do not walk off the end, otherwise you may incur a SIGSEGV "segmentation violation" and your program will crash. Or at least, it should, but you might instead get merely weird results.
Finally, memory allocated with malloc must be freed manually, otherwise you'll have memory leaks. Items allocated on the stack (string) do not need to be freed because the compiler handles that for you when the function ends.
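sizeof gives a quick way to see how the three declarations differ. A minimal sketch with invented function names:

```c
#include <stddef.h>

/* char string[50] is 50 bytes of storage, usable immediately. */
size_t bytes_in_char_array(void)
{
    char string[50];
    return sizeof string;
}

/* char *strings[10] is storage for 10 pointers, not for any characters. */
size_t pointers_in_array(void)
{
    char *strings[10];
    return sizeof strings / sizeof strings[0];
}

/* char *p is a single pointer; its size does not depend on what it points to. */
int pointer_has_pointer_size(void)
{
    char *p;
    return sizeof p == sizeof(char *);
}
```

The first function returns 50 and the second returns 10 on any conforming implementation; the actual byte size of a pointer is platform dependent.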

About pointers and strcpy() in C

I am practicing allocating memory using malloc() with pointers, but one observation about pointers is: why can strcpy() accept the str variable without *:
char *str;
str = (char *) malloc(15);
strcpy(str, "Hello");
printf("String = %s, Address = %p\n", str, (void *)str);
But with integers, we need * to give str a value.
int *str;
str = (int *) malloc(15);
*str = 10;
printf("Int = %d, Address = %p\n", *str, (void *)str);
it really confuses me why strcpy() accepts str, because in my own understanding, "Hello" will be passed to the memory location of str that will cause some errors.
In C, a string is (by definition) an array of characters. However (whether we realize it all the time or not) we almost always end up accessing arrays using pointers. So, although C does not have a true "string" type, for most practical purposes, the type pointer-to-char (i.e. char *) serves this purpose. Almost any function that accepts or returns a string will actually use a char *. That's why strlen() and strcpy() accept char *. That's why printf %s expects a char *. In all of these cases, what these functions need is a pointer to the first character of the string. (They then read the rest of the string sequentially, stopping when they find the terminating '\0' character.)
In these cases, you don't use an explicit * character. * would extract just the character pointed to (that is, the first character of the string), but you don't want to extract the first character, you want to hand the whole string (that is, a pointer to the whole string) to strcpy so it can do its job.
In your second example, you weren't working with a string at all. (The fact that you used a variable named str confused me for a moment.) You have a pointer to some ints, and you're working with the first int pointed to. Since you're directly accessing one of the things pointed to, that's why you do need the explicit * character.
The * is called indirection or dereference operator.
In your second code,
*str = 10;
assigns the value 10 to the memory address pointed by str. This is one value (i.e., a single variable).
OTOH, strcpy() copies the whole string in one call. It accepts two char * parameters, so you don't need the * to dereference to get the value while passing arguments.
You can use the dereference operator, without strcpy(), copying element by element, like
char *str;
int i;
str = (char *) malloc(15); //success check TODO
int len = strlen("Hello"); //need string.h header
for (i = 0; i < len; i ++)
*(str+i)= "Hello"[i]; // the * form. as you wanted
str[i] = 0; //null termination
Many string manipulation functions, including strcpy, by convention and design, accept the pointer to the first character of the array, not the pointer to the whole array, even though their values are the same.
This is because their types are different; e.g. a pointer to char[10] has a different type from that of a pointer to char[15], and passing around the pointer to the whole array would be impossible or very clumsy because of this, unless you cast them everywhere or make different functions for different lengths.
For this reason, they have established a convention of passing around a string with the pointer to its first character, not to the whole array, possibly with its length when necessary. Many functions that operate on an array, such as memset, work the same way.
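A short sketch of "same value, different type": the helper names are hypothetical, and the array length 10 is just the one used in the text above.

```c
#include <stddef.h>

/* A char * steps over one char at a time. */
size_t step_of_char_pointer(void)
{
    char a[10];
    char *p = a;
    return (size_t)((p + 1) - p);
}

/* A pointer to the whole array steps over all 10 chars at once. */
size_t step_of_array_pointer(void)
{
    char a[10];
    char (*pa)[10] = &a;
    return (size_t)((char *)(pa + 1) - (char *)pa);
}

/* Both pointers still refer to the same starting address. */
int same_start_address(void)
{
    char a[10];
    return (void *)a == (void *)&a;
}
```

The steps are 1 and 10 respectively, which is the practical reason string functions take the char * form: one signature then works for every length.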
Well, here's what happens in the first snippet :
You are first dynamically allocating 15 bytes of memory and storing its address in the char pointer, which points to a sequence of 1-byte data (a string).
Then you call strcpy(), which iterates over the string and copies characters, byte by byte, into the newly allocated memory space. Each character is a number from the ASCII table, e.g. character a = 97 (take a look at man ascii).
Then you pass this address to printf(), which reads from the string, byte by byte, and writes it to your terminal.
In the second snippet, the process is the same: you are still allocating 15 bytes and storing the address in an int * pointer. An int is typically a 4-byte data type.
When you do *str = 10, you are dereferencing the pointer to store the value 10 at the address pointed to by str. Recall what I wrote above: you could have done *str = 'a', and the integer at index 0 would have the value 97, even if you read it as an int. You can even print it if you like.
So why could strcpy() write into memory reached through an int * (after a cast to char *)? Because it's a memory space where it can write, byte by byte. With 4-byte ints you can store "Hell" in one int, then "o!" in the next one.
It's all about ease of use.
See there is a difference between = operator and the function strcpy.
* is the dereference operator. When you say *str, it means the value at the memory location pointed to by str.
Also as a good practice, use this
str = (char *) malloc( sizeof(char)*15 )
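It is because the size of a data type might be different on different platforms (sizeof(char) itself is always 1, but the habit matters for other types). Hence use the sizeof operator, which the compiler evaluates to the type's actual size, instead of hard-coding a byte count.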
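For types other than char, a common idiom is to take sizeof of the pointed-to object rather than naming the type. A minimal sketch, with make_ints as a made-up helper:

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocates count ints, each set to value. Returns NULL if malloc fails.
   The caller is responsible for calling free on the result. */
int *make_ints(size_t count, int value)
{
    int *p = malloc(count * sizeof *p); /* sizeof *p follows the type of p */
    size_t i;

    if (p == NULL)
        return NULL;
    for (i = 0; i < count; i++)
        p[i] = value;
    return p;
}
```

Compared with malloc(15) in the question's int example, this asks for exactly as many bytes as the requested number of ints needs on the current platform.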

Difference between char* and char** (in C)

I have written this code which is simple
#include <stdio.h>
#include <string.h>
void printLastLetter(char **str)
{
printf("%c\n",*(*str + strlen(*str) - 1));
printf("%c\n",**(str + strlen(*str) - 1));
}
int main()
{
char *str = "1234556";
printLastLetter(&str);
return 1;
}
Now, if I want to print the last char in a string I know the first line of printLastLetter is the right line of code. What I don't fully understand is what the difference is between *str and **str. The first one is an array of characters, and the second??
Also, what is the difference in memory allocation between char *str and str[10]?
Thanks
char* is a pointer to char, char ** is a pointer to a pointer to char.
char *ptr; does NOT allocate memory for characters, it allocates memory for a pointer to char.
char arr[10]; allocates 10 characters and arr holds the address of the first character. (though arr is NOT a pointer (not char *) but of type char[10])
For demonstration: char *str = "1234556"; is like:
char *str; // allocate a space for char pointer on the stack
str = "1234556"; // assign the address of the string literal "1234556" to str
As @Oli Charlesworth commented, if you use a pointer to a constant string, such as in the above example, you should declare the pointer as const - const char *str = "1234556"; so if you try to modify it, which is not allowed, you will get a compile-time error and not a run-time access violation error, such as segmentation fault. If you're not familiar with that, please look here.
Also see the explanation in the FAQ of newsgroup comp.lang.c.
char **x is a pointer to a pointer, which is useful when you want to modify an existing pointer outside of its scope (say, within a function call).
This is important because C is pass by copy, so to modify a pointer within another function, you have to pass the address of the pointer and use a pointer to the pointer like so:
#include <stdlib.h>
void modify(char **s)
{
free(*s); // free the old array
*s = malloc(10); // allocate a new array of 10 chars
}
int main()
{
char *s = malloc(5); // s points to an array of 5 chars
modify(&s); // s now points to a new array of 10 chars
free(s);
}
You can also use char ** to store an array of strings. However, if you dynamically allocate everything, remember to keep track of how long the array of strings is so you can loop through each element and free it.
As for your last question, char *str; simply declares a pointer with no memory allocated to it, whereas char str[10]; allocates an array of 10 chars on the local stack. The local array will disappear once it goes out of scope though, which is why if you want to return a string from a function, you want to use a pointer with dynamically allocated (malloc'd) memory.
Also, char *str = "Some string constant"; is also a pointer to a string constant. String constants are stored in the global data section of your compiled program and can't be modified. You don't have to allocate memory for them because they're compiled/hardcoded into your program, so they already take up memory.
The first one is an array of characters, and the second??
The second is a pointer to your array. Since you pass the address of str and not the pointer (str) itself, you need this to dereference.
printLastLetter( str );
and
printf("%c\n",*(str + strlen(str) - 1));
makes more sense unless you need to change the value of str.
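Put together, the single-pointer version might look like the sketch below. last_letter is a hypothetical name, and the empty-string check is an addition of mine.

```c
#include <stddef.h>
#include <string.h>

/* Returns the last character of str, or '\0' for an empty string. */
char last_letter(const char *str)
{
    size_t n = strlen(str);
    return n > 0 ? str[n - 1] : '\0';
}
```

last_letter("1234556") yields '6' with only one level of indirection.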
You might care to study this minor variation of your program (the function printLastLetter() is unchanged except that it is made static), and work out why the output is:
3
X
The output is fully deterministic - but only because I carefully set up the list variable so that it would be deterministic.
#include <stdio.h>
#include <string.h>
static void printLastLetter(char **str)
{
printf("%c\n", *(*str + strlen(*str) - 1));
printf("%c\n", **(str + strlen(*str) - 1));
}
int main(void)
{
char *list[] = { "123", "abc", "XYZ" };
printLastLetter(list);
return 0;
}
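If you want to check your reasoning, the two expressions can be split into separate functions (the names are mine). In the first, the offset is added to the char * obtained from *str, so it moves along the characters of the first string. In the second, the offset is added to str itself, so it moves along the array of pointers.

```c
#include <string.h>

/* Offset applied to *str: walks within the first string. */
char via_inner_offset(char **str)
{
    return *(*str + strlen(*str) - 1);
}

/* Offset applied to str: walks across the pointer array, then takes
   the first char of whichever string it lands on. Only valid when the
   array really has at least strlen(*str) elements. */
char via_outer_offset(char **str)
{
    return **(str + strlen(*str) - 1);
}
```

With the list above, strlen("123") is 3, so the second function reads list[2][0]. In the original program there is only one pointer, so the same expression reads past it and the behavior is undefined.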
char** is for a string of strings basically - an array of character arrays. If you want to pass multiple character array arguments you can use this assuming they're allocated correctly.
char **x;
*x would dereference and give you the first character array allocated in x.
**x would dereference that character array giving you the first character in the array.
**str is nothing else than (*str)[0]. As for the difference between *str and str[10] (in the declaration, I assume), I think it is that the former is just a pointer pointing to a constant string literal that may be stored somewhere in global static memory, whereas the latter allocates 10 bytes of memory on the stack where the literal is stored.
char * is a pointer to a memory location. for char * str="123456"; this is the first character of a string. The "" are just a convenient way of entering an array of character values.
str[10] is a way of reserving 10 characters in memory without saying what they are. (N.B. since the last character of a string is a NUL, this can actually only hold 9 letters.) When a function takes a * parameter you can use a [] parameter but not the other way round.
You are making it unnecessarily complicated by taking the address of str before using it as a parameter. In C you often pass the address of an object to a function because it is a lot faster than passing the whole object. But since it is already a pointer you do not make the function any better by passing a pointer to a pointer, assuming you do not want to change the pointer to point to a different string.
For your code snippet, a variable declared as *str holds the address of a char, and one declared as **str holds the address of a variable holding the address of a char. In other words, a pointer to a pointer.
Whenever you have *str, only enough memory is allocated to hold a pointer type variable (4 bytes on a 32-bit machine). With str[10], memory is already allocated for 10 chars.

Why does reading into a string buffer with scanf work both with and without the ampersand (&)?

I'm a little bit confused about something. I was under the impression that the correct way of reading a C string with scanf() went along the lines of
(never mind the possible buffer overflow, it's just a simple example)
char string[256];
scanf( "%s" , string );
However, the following seems to work too,
scanf( "%s" , &string );
Is this just my compiler (gcc), pure luck, or something else?
An array "decays" into a pointer to its first element, so scanf("%s", string) is equivalent to scanf("%s", &string[0]). On the other hand, scanf("%s", &string) passes a pointer-to-char[256], but it points to the same place.
Then scanf, when processing the tail of its argument list, will try to pull out a char *. That's the Right Thing when you've passed in string or &string[0], but when you've passed in &string you're depending on something that the language standard doesn't guarantee, namely that the pointers &string and &string[0] -- pointers to objects of different types and sizes that start at the same place -- are represented the same way.
I don't believe I've ever encountered a system on which that doesn't work, and in practice you're probably safe. None the less, it's wrong, and it could fail on some platforms. (Hypothetical example: a "debugging" implementation that includes type information with every pointer. I think the C implementation on the Symbolics "Lisp Machines" did something like this.)
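For reference, a sketch of the form the standard does guarantee, with a field width so the buffer cannot overflow. read_word is a hypothetical helper, it uses sscanf on a string instead of scanf on stdin so it can be tried without typing input, and the 8-byte buffer is an arbitrary choice.

```c
#include <stdio.h>

/* Reads one whitespace-delimited word of at most 7 characters into out.
   out must have room for at least 8 bytes. Returns sscanf's result. */
int read_word(const char *input, char *out)
{
    return sscanf(input, "%7s", out);
}
```

A caller would declare char word[8]; and pass word (not &word), which decays to the char * that %s expects.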
I think that this below is accurate and it may help.
Feel free to correct it if you find any errors. I'm new at C.
char str[]
array of values of type char, with its own address in memory
as many consecutive addresses as elements in the array
including termination null character '\0'
&str, &str[0] and str, all three represent the same location in memory which is address of the first element of the array str
char *strPtr = &str[0]; //declaration and initialization
alternatively, you can split this in two:
char *strPtr; strPtr = &str[0];
strPtr is a pointer to a char
strPtr points at array str
strPtr is a variable with its own address in memory
strPtr is a variable that stores value of address &str[0]
strPtr own address in memory is different from the memory address that it stores (address of array in memory a.k.a &str[0])
&strPtr represents the address of strPtr itself
I think that you could declare a pointer to a pointer as:
char **vPtr = &strPtr;
declares and initializes with address of strPtr pointer
Alternatively you could split in two:
char **vPtr;
vPtr = &strPtr;
vPtr points at strPtr pointer
vPtr is a variable with its own address in memory
vPtr is a variable that stores value of address &strPtr
final comment: you can not do str++, str address is a const, but
you can do strPtr++
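The notes above can be condensed into a small sketch. The function name is invented; the variable names follow the ones used in the notes.

```c
/* Advances strPtr by one and reads the result through the pointer to pointer. */
char second_char_via_pointer(void)
{
    char str[] = "abc";
    char *strPtr = &str[0];   /* points at the first element of str */
    char **vPtr = &strPtr;    /* points at the pointer itself */

    strPtr++;                 /* allowed: strPtr is a variable */
    /* str++ would not compile: an array is not a modifiable lvalue */
    return **vPtr;            /* follows vPtr to strPtr, then to str[1] */
}
```

The function returns 'b', since vPtr sees the updated value of strPtr.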

Resources