unsigned char *dup = malloc(size);
My question may be naive. What does dup[2] mean? Is it a pointer to the third char from the malloced memory or it's the value of the third char from the malloced memory? I have searched google but found no result to explain this. Many thanks for your time.
it's the value of the third char from the malloced memory?
This.
dup[2] is equivalent to *(dup + 2). The + 2 is scaled by the size of the pointed-to type, so it implicitly acts like + 2 * sizeof(unsigned char), which is 2 bytes here.
If you want a pointer to the third char in the memory, without dereferencing it, use the same expression without the dereferencing operator:
unsigned char *thirdChar = dup + 2;
dup[2] is semantically identical to *(dup + 2). So it is the value of the third byte pointed to by dup. That is, the memory addresses are:
dup, dup+1, dup+2, ....., dup+size-1
Note that malloc does not initialize the returned memory, so strictly speaking, the value of dup[2] could be anything.
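To make the equivalence concrete, here is a minimal sketch. The helper name index_matches_arithmetic is made up for illustration, and it uses calloc rather than malloc so that the bytes start out zeroed instead of indeterminate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if indexing and pointer arithmetic agree. */
int index_matches_arithmetic(size_t size)
{
    assert(size >= 3);                       /* we touch index 2 */
    unsigned char *dup = calloc(size, 1);    /* calloc zero-fills, unlike malloc */
    if (dup == NULL)
        return 0;

    dup[2] = 'x';                            /* write through the index form */
    unsigned char *third = dup + 2;          /* pointer to the third byte, no dereference */

    int same = (*(dup + 2) == dup[2])        /* same value ... */
            && (third == &dup[2])            /* ... and same address */
            && (*third == 'x');
    free(dup);
    return same;
}
```

Calling index_matches_arithmetic(10) should return 1: dup[2] is a value, while dup + 2 (or &dup[2]) is the pointer to it.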
Clearly, dup[2] in this case represents the third character of the string, and it is exactly the same as *(dup + 2).
The supporting code is as follows:
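#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
int main(void) {
    unsigned char *dup = malloc(10);
    if (dup == NULL) return 1;
    /* %9s limits the input to 9 characters plus the terminating '\0' */
    if (scanf("%9s", (char *)dup) != 1) { free(dup); return 1; }
    printf("%c", dup[2]);
    printf("\n%c", *(dup+2));
    free(dup);
    return 0;
}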
Both printf statements produce the same output, which makes this very clear.
I am working on a short program that reads a .txt file. Initially, I was playing around in the main function, and I had gotten my code to work just fine. Later, I decided to abstract it into a function. Now, I cannot seem to get my code to work, and I have been hung up on this problem for quite some time.
I think my biggest issue is that I don't really understand what is going on at a memory/hardware level. I understand that a pointer simply holds a memory address, and a pointer to a pointer simply holds a memory address to an another memory address, a short breadcrumb trail to what we really want.
Yet, now that I am introducing malloc() to expand the amount of memory allocated, I seem to lose sight of what's going on. In fact, I am not really sure how to think of memory at all anymore.
So, a char takes up a single byte, correct?
If I understand correctly, then by default a char* takes up a single byte of memory?
If we were to have a:
char* str = "hello"
Would it be say safe to assume that it takes up 6 bytes of memory (including the null character)?
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Now, if you would judge my interpretation. We are telling the compiler that we need "size" number of contiguous memory reserved for chars. If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
// Add characters to buffer
int i = 0;
char c;
while((c=fgetc(file))!=EOF){
*(buffer + i) = (char)c;
i++;
}
Adding the characters to the buffer and allocating the memory is what I cannot seem to wrap my head around.
If **buffer is pointing to *str which is equal to null, then how do I allocate memory to *str and add characters to it?
I understand that this is lengthy, but I appreciate the time you all are taking to read this! Let me know if I can clarify anything.
EDIT:
Whoa, my code is working now, thanks so much!
Although, I don't know why this works:
*((*buffer) + i) = (char)c;
So, a char takes up a single byte, correct?
Yes.
If I understand correctly, by default a char* takes up a single byte of memory.
Your wording is somewhat ambiguous. A char takes up a single byte of memory. A char * can point to one char, i.e. one byte of memory, or a char array, i.e. multiple bytes of memory.
The pointer itself takes up more than a single byte. The exact size is implementation-defined, usually 4 bytes (32-bit) or 8 bytes (64-bit). You can check the exact value with printf( "%zu\n", sizeof(char *) ).
If we were to have a char* str = "hello", would it be say safe to assume that it takes up 6 bytes of memory (including the null character)?
Yes.
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Do not cast the result of malloc. And sizeof(char) is by definition always 1.
If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Yes. Well, almost. str* makes no sense, and it's 10 chars, not 10 memory addresses. But str would point to the first of the 10 chars, yes.
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
No. You would write *buffer = malloc( size );. The idea is that the memory you are allocating inside the function can be addressed by the caller of the function. So the pointer provided by the caller -- str, which is NULL at the point of the call -- needs to be changed. That is why the caller passes the address of str, so you can write the pointer returned by malloc() to that address. After your function returns, the caller's str will no longer be NULL, but contain the address returned by malloc().
buffer is the address of str, passed to the function by value. Allocating to buffer would only change that (local) pointer value.
Allocating to *buffer, on the other hand, is the same as allocating to str. The caller will "see" the change to str after your file_read() returns.
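Putting these points together, a complete file_read might look like the sketch below. It is only one possible shape: the return convention (-1 on failure, otherwise the number of characters read), the size - 1 limit, and the added '\0' terminator are assumptions that do not come from the question. Note that c is declared as int so that EOF can be told apart from a real character.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: reads at most size-1 characters from filename into a freshly
 * allocated, NUL-terminated buffer. Returns the number of characters read,
 * or -1 on failure. The caller owns *buffer and must free() it. */
int file_read(const char *filename, int size, char **buffer)
{
    assert(buffer != NULL && size > 0);

    FILE *file = fopen(filename, "r");
    if (file == NULL)
        return -1;

    *buffer = malloc((size_t)size);   /* writes through to the caller's pointer */
    if (*buffer == NULL) {
        fclose(file);
        return -1;
    }

    int i = 0;
    int c;                            /* int, not char, so EOF is distinguishable */
    while (i < size - 1 && (c = fgetc(file)) != EOF) {
        (*buffer)[i] = (char)c;
        i++;
    }
    (*buffer)[i] = '\0';

    fclose(file);
    return i;
}
```

The caller would use it as in the question: declare char *str = NULL;, call file_read("filename.txt", size, &str);, and free(str) when done.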
Although, I don't know why this works: *((*buffer) + i) = (char)c;
buffer is the address of str.
*buffer is, basically, the same as str -- a pointer to char (array).
(*buffer) + i is pointer arithmetic -- the pointer *buffer plus i means a pointer to the element at index i of the array.
*((*buffer) + i) is dereferencing that pointer to the ith element -- a single char.
to which you are then assigning (char)c.
A simpler expression doing the same thing would be:
(*buffer)[i] = (char)c;
With char **buffer, buffer is the pointer to the pointer to the char, *buffer accesses the pointer to a char, and **buffer accesses the char value itself.
To pass back a pointer to a new array of chars, write *buffer = malloc(size).
To write values into the char array, write *((*buffer) + i) = c, or (probably simpler) (*buffer)[i] = c
See the following snippet demonstrating what's going on:
#include <stdio.h>
#include <stdlib.h>
void generate0to9(char** buffer) {
*buffer = malloc(11); // *buffer dereferences the pointer to the pointer buffer one time, i.e. it writes a (new) pointer value into the address passed in by `buffer`
for (int i=0;i<=9;i++) {
//*((*buffer)+i) = '0' + i;
(*buffer)[i] = '0' + i;
}
(*buffer)[10]='\0';
}
int main(void) {
char *b = NULL;
generate0to9(&b); // pass a pointer to the pointer b, such that the pointer`s value can be changed in the function
printf("b: %s\n", b);
free(b);
return 0;
}
Output:
b: 0123456789
So, I just sat down and decided to write a memory allocator. I was tired, so I just threw something together. What I ended up with was this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
if(next + n <= buffer + BUFSIZE){
next += n;
return (void *)(next - n);
}else{
return NULL;
}
}
void afree(void *c){
next = (char *)c;
}
int main(){
int *num = alloc(sizeof(int));
*num = 5643;
printf("%d: %d", *num, sizeof(int));
afree(num);
}
For some reason, this works. But I can not explain why it works. It may have to do with the fact that I am tired, but I really can not see why it works. So, this is what it should be doing, logically, and as I understand it:
It creates a char array with a pointer which points to the first element of the array.
When I call alloc with a value of 4 (which is the size of an int, as I have tested down below), it should set next to point to the fourth element of the array. It should then return a char pointer to the first 4 bytes of the array casted to a void pointer.
I then set that value to something greater than the max value of a char. C should realise that that isn't possible and should then truncate it to *num % sizeof(char).
I have one guess as to why this works: When the char pointer is casted to a void pointer and then gets turned into an integer it somehow changes the size of the pointer so that it is able to point to an integer. (I haven't only tried this memory allocator with integers, but with structures as well, and it seems to work with them as well).
Is this guess correct, or am I too tired to think?
EDIT:
EDIT 2: I think I've understood it. I realised that my phrasing from yesterday was quite bad. The thing which threw me off was the fact that the returned pointer actually points to a char, but I am still somehow able to store an integer value.
The allocator posted implements a mark and release allocation scheme:
alloc(size) returns a valid pointer if there are at least size unallocated bytes available in the arena. The available size is reduced accordingly. Note that this pointer can only be used to store bytes, as it is not properly aligned for anything else. Furthermore, from a strict interpretation of the C Standard, even if the pointer is properly aligned, using it as a pointer to any other type would violate the strict aliasing rule.
afree(ptr) resets the arena to the state it was in before alloc() returned ptr. It would be a useful extension to make afree(NULL) reset the arena to its initial state.
Note that the main() function attempts to use the pointer returned by alloc(sizeof(int)) as a pointer to int. This invokes undefined behavior because there is no guarantee that buffer is properly aligned for this, and because of the violation of the strict aliasing rule.
Note also that the printf format in printf("%d: %d", *num, sizeof(int)); is incorrect for the second argument, which has type size_t. It should be printf("%d: %zu", *num, sizeof(int)); or printf("%d: %d", *num, (int)sizeof(int)); if the C runtime library is too old to support %zu.
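As a rough illustration of the alignment point, the sketch below is a variant of the allocator that assumes a C11 compiler. The names arena_alloc, arena_release, store_int and load_int are invented for this example. It aligns the arena with _Alignas(max_align_t) and rounds every request up to that alignment, and it reads and writes values with memcpy, which avoids the aliasing question altogether. It is not a replacement for malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFSIZE 1024

/* Assumption: C11. _Alignas makes the arena suitably aligned for any object. */
static _Alignas(max_align_t) unsigned char arena[BUFSIZE];
static size_t used = 0;               /* bytes handed out so far */

/* Round n up to a multiple of the strictest fundamental alignment. */
static size_t round_up(size_t n)
{
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

void *arena_alloc(size_t n)
{
    size_t need = round_up(n);
    if (need < n || need > BUFSIZE - used)   /* overflow or arena exhausted */
        return NULL;
    void *p = arena + used;
    used += need;
    return p;
}

/* Mark-and-release: everything allocated at or after p is released.
 * Passing NULL resets the whole arena. */
void arena_release(void *p)
{
    if (p == NULL) {
        used = 0;
        return;
    }
    assert((unsigned char *)p >= arena && (unsigned char *)p <= arena + BUFSIZE);
    used = (size_t)((unsigned char *)p - arena);
}

/* Store and load an int via memcpy, which sidesteps the aliasing question. */
void store_int(void *dst, int value)
{
    memcpy(dst, &value, sizeof value);
}

int load_int(const void *src)
{
    int value;
    memcpy(&value, src, sizeof value);
    return value;
}
```

A caller would write void *p = arena_alloc(sizeof(int)); store_int(p, 5643); and later read the value back with load_int(p).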
Actually! I came up with a reason for the behaviour! This is what I was wondering, however, I wasn't too good at putting my thoughts into words yesterday (sorry). I modified my code to something like this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
if(next + n <= buffer + BUFSIZE){
next += n;
return (void *)(next - n);
}else{
return NULL;
}
}
void afree(void *c){
next = (char *)c;
}
int main(){
int *num = alloc(sizeof(int));
*num = 346455;
printf("%d: %d\n", *num, (int)sizeof(int));
printf("%d, %d, %d, %d", *(next - 4), *(next - 3), *(next - 2), *(next - 1));
afree(num);
}
Now, the last printf produces "87, 73, 5, 0".
If you convert all the values into a big binary value you get this: 00000000 00000101 01001001 01010111. If you take that binary value and convert it to decimal you get the original value of *num, which is 346455. So, basically it separates the integer into 4 bytes and puts them into different elements of the array. I think this is implementation-defined and has to do with little endian and big endian. Is this correct? My first prediction was that it would truncate the integer and basically set the value to (integer value) % sizeof(char).
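The byte-by-byte view can be checked with a small sketch like the one below. It assumes a fixed-width uint32_t instead of int so the byte count is known, and the function names are made up for illustration. On a little-endian machine the bytes of 346455 come out lowest-order first (87, 73, 5, 0), matching the output above; on a big-endian machine the order is reversed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the object representation of value into out[0..3], lowest address first. */
void bytes_of_u32(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
}

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);        /* look at the byte at the lowest address */
    return first == 1;
}

/* Rebuild a value from bytes stored least significant first,
 * independent of the host byte order. */
uint32_t from_little_endian(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

So nothing is truncated: the assignment through an int pointer writes all sizeof(int) bytes, and their order in memory is what depends on the platform.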
int *num = alloc(sizeof(int));
Says: 'here is a pointer (returned by alloc) that points to some space; let's say it points to an integer (int*).'
Then you say
*num = 5643;
Which says: set that integer to 5643.
Why wouldn't it work, given that alloc did in fact return a pointer to a block of good memory that can hold an integer?
I have to implement a function that returns the memory address of a pointer when I allocate it with malloc(). I know that malloc(value) allocates an area on the heap which is of size value.
I know how to implement the code for printing the memory address of that pointer:
void *s = malloc (size);
printf("%p\n",s);
My question is, how can I save the value printed by that code in an int or string (e.g. char *)?
Storing the value of the pointer (i.e. the memory location of some variable) in a string can be done much like you've used printf:
char buf[128];
void *s = malloc (size);
sprintf(buf, "%p\n",s);
To 'save' the value into an integer (type) you can do a simple cast:
void *s = malloc (size);
size_t int_value = (size_t)s;
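Since in C the width of a pointer depends on the platform, this (technically) isn't guaranteed to work quite right: size_t is not required to be wide enough to hold a pointer. Where it is available, uintptr_t from <stdint.h> is the integer type meant for this. Both of these methods can go wrong with wacky architectures or compilers.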
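A minimal sketch of both approaches, with hypothetical function names: it uses uintptr_t (an optional type in the standard, though present on mainstream platforms) for the integer form, and snprintf rather than sprintf so the buffer size is respected.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Converts a pointer to an integer wide enough to hold it.
 * uintptr_t is optional in the standard but present on mainstream platforms. */
uintptr_t pointer_to_integer(const void *p)
{
    return (uintptr_t)p;
}

/* Writes the textual form of p (as printf's %p would print it) into buf.
 * Returns 1 on success, 0 if buf was too small or formatting failed. */
int pointer_to_string(const void *p, char *buf, size_t bufsize)
{
    assert(buf != NULL && bufsize > 0);
    int n = snprintf(buf, bufsize, "%p", (void *)p);
    return n >= 0 && (size_t)n < bufsize;
}
```

The exact text produced by %p is implementation-defined, so the string is suitable for display or logging rather than for parsing.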
char buf[32] = {0};
snprintf(buf, sizeof buf, "%p\n", s);
then you can print it:
printf("%s\n", buf);
You've already saved the value as a bit pattern in s, so I assume you mean that you simply want the text output by printf as a string. The call you want is sprintf:
char text[255];
sprintf(text, "%p\n", s);
If you want the pointer address returned from your function, you can declare your function to return the pointer type:
int* myFunc(int n) {
int* p;
p = malloc(n*sizeof(int));
// more stuff
return p;
}
This is an alternative to the use of sprintf as suggested (very reasonably) by other answers.
Take your pick.
Note that on some systems an int would not be big enough to hold an int* value; using int* is not only clearer but safer as well.
Yes, sprintf() is the best option. It lets you format any value, including a pointer, into a string.
I'm reading K&R and I'm almost through the chapter on pointers. I'm not entirely sure if I'm going about using them the right way. I decided to try implementing itoa(n) using pointers. Is there something glaringly wrong about the way I went about doing it? I don't particularly like that I needed to set aside a large array to work as a string buffer in order to do anything, but then again, I'm not sure if that's actually the correct way to go about it in C.
Are there any general guidelines you like to follow when deciding to use pointers in your code? Is there anything I can improve on in the code below? Is there a way I can work with strings without a static string buffer?
/*Source file: String Functions*/
#include <stdio.h>
static char stringBuffer[500];
static char *strPtr = stringBuffer;
/* Algorithm: n % 10^(n+1) / 10^(n) */
char *intToString(int n){
int p = 1;
int i = 0;
while(n/p != 0)
p*=10, i++;
for(;p != 1; p/=10)
*(strPtr++) = ((n % p)/(p/10)) + '0';
*strPtr++ = '\0';
return strPtr - i - 1;
}
int main(){
char *s[3] = {intToString(123), intToString(456), intToString(78910)};
printf("%s\n",s[2]);
int x = stringToInteger(s[2]);
printf("%d\n", x);
return 0;
}
Lastly, can someone clarify for me what the difference between an array and a pointer is? There's a section in K&R that has me very confused about it; "5.5 - Character Pointers and Functions." I'll quote it here:
"There is an important difference between the definitions:
char amessage[] = "now is the time"; /*an array*/
char *pmessage = "now is the time"; /*a pointer*/
amessage is an array, just big enough to hold the sequence of characters and '\0' that
initializes it. Individual characters within the array may be changed but amessage will
always refer to the same storage. On the other hand, pmessage is a pointer, initialized
to point to a string constant; the pointer may subsequently be modified to point
elsewhere, but the result is undefined if you try to modify the string contents."
What does that even mean?
For itoa, the resulting string can't be longer than the digits of INT_MAX plus a minus sign, so you'd be safe with a buffer of that length. The number of digits in a positive number is floor(log10(number)) + 1, so you'd need a buffer of floor(log10(INT_MAX)) + 3 characters, which leaves room for the minus sign and the terminating \0.
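If you would rather not pull in the math library, the same sizing can be sketched with integer arithmetic only. Both the macro INT_STRING_MAX and the function count_digits below are hypothetical names; the macro is a deliberately generous upper bound (about one decimal digit per three bits), not an exact figure.

```c
#include <assert.h>
#include <limits.h>

/* Upper bound on the characters needed for any int: roughly one decimal
 * digit per 3 bits, plus sign and terminating '\0'. Deliberately generous. */
#define INT_STRING_MAX (sizeof(int) * CHAR_BIT / 3 + 3)

/* Number of decimal digits in n, ignoring the sign (0 counts as one digit). */
int count_digits(int n)
{
    int digits = 1;
    while (n / 10 != 0) {   /* division truncates toward zero, so negatives work */
        n /= 10;
        digits++;
    }
    return digits;
}
```

A caller could then declare char buffer[INT_STRING_MAX]; and be sure any int fits, or allocate exactly count_digits(n) + 2 bytes for a particular value.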
Also, generally it's not a good practice to return pointers to 'black box' buffers from functions. Your best bet here would be to provide a buffer as a pointer argument in intToString, so then you can easily use any type of memory you like (dynamic, allocated on stack, etc.). Here's an example:
char *intToString(int n, char *buffer) {
// ...
char *bufferStart = buffer;
for(;p != 1; p/=10)
*(buffer++) = ((n % p)/(p/10)) + '0';
*buffer++ = '\0';
return bufferStart;
}
Then you can use it as follows:
char *buffer1 = malloc(30);
char buffer2[15];
intToString(10, buffer1); // providing pointer to heap allocated memory as a buffer
intToString(20, &buffer2[0]); // providing pointer to a fixed-size array as a buffer
what the difference between an array and a pointer is?
The answer is in your quote - a pointer can be modified to be pointing to another memory address. Compare:
int a[] = {1, 2, 3};
int b[] = {4, 5, 6};
int *ptrA = &a[0]; // the ptrA now contains pointer to a's first element
ptrA = &b[0]; // now it's b's first element
a = b; // it won't compile
Also, an array's size and storage are fixed where it is defined, while pointers are suitable for any allocation mechanism.
Regarding your code:
You are using a single static buffer for every call to intToString: this is bad because the string produced by the first call to it will be overwritten by the next.
Generally, functions that handle strings in C should either return a new buffer from malloc, or they should write into a buffer provided by the caller. Allocating a new buffer is less prone to problems due to running out of buffer space.
You are also using a static pointer for the location to write into the buffer, and it never rewinds, so that's definitely a problem: enough calls to this function, and you will run off the end of the buffer and crash.
You already have an initial loop that calculates the number of digits in the function. So you should then just make a new buffer that big using malloc, making sure to leave space for the \0, write in to that, and return it.
Also, since i is not just a loop index, change it to something more obvious like length.
That is to say: get rid of the global variables, and instead, after computing length:
char *s, *result;
// compute length
s = result = malloc(length+1);
if (!s) return NULL; // out of memory
for(;p != 1; p/=10)
*(s++) = ((n % p)/(p/10)) + '0';
*s++ = '\0';
return result;
The caller is responsible for releasing the buffer when they're done with it.
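For reference, here is one way the whole function could look once the globals are gone. This is a sketch rather than a drop-in for the code above: it fills the buffer from the right instead of dividing by powers of ten, and it also handles zero and negative numbers, neither of which the original did. The name int_to_string is made up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Sketch: returns a malloc'ed decimal string for n, or NULL if out of memory.
 * The caller must free() the result. */
char *int_to_string(int n)
{
    /* Work with the magnitude as unsigned so INT_MIN does not overflow. */
    unsigned int magnitude = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;

    /* Count digits (0 still needs one). */
    size_t length = 1;
    for (unsigned int m = magnitude; m >= 10; m /= 10)
        length++;
    if (n < 0)
        length++;                      /* room for '-' */

    char *result = malloc(length + 1); /* +1 for '\0' */
    if (result == NULL)
        return NULL;

    /* Fill from the right, least significant digit first. */
    char *s = result + length;
    *s = '\0';
    do {
        *--s = (char)('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);
    if (n < 0)
        *--s = '-';

    assert(s == result);               /* every reserved byte was used */
    return result;
}
```

Each call returns an independent buffer, so char *a = int_to_string(123); and char *b = int_to_string(456); no longer overwrite each other; both must eventually be passed to free().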
Two other things I'd really recommend while learning about pointers:
Compile with all warnings turned on (-Wall etc.) and if you get a warning or an error, try to understand what caused it; warnings have things to teach you about how you're using the language
Run your program under Valgrind or some similar checker, which will make pointer bugs more obvious, rather than causing silent corruption
Regarding your last question:
char amessage[] = "now is the time"; - is an array. Unlike a pointer, an array cannot be reassigned to refer to something else; it names a fixed block of memory. If the array is defined inside a block, it ceases to exist at the end of that block (meaning you cannot return such an array from a function). You can however fiddle with the data inside the array as much as you like so long as you don't exceed the size of the array.
E.g. this is legal amessage[0] = 'N';
char *pmessage = "now is the time"; - is a pointer. A pointer points to a block in memory, nothing more. "now is the time" is a string literal, meaning it is stored inside the executable in a read only location. You cannot under any circumstances modify the data it is pointing to. You can however reassign the pointer to point to something else.
This is NOT legal: *pmessage = 'N'; it will most likely segfault (note that you can use the array syntax with pointers, *pmessage is equivalent to pmessage[0]).
If you compile it with gcc using the -S flag you can actually see "now is the time" stored in the read-only part of the generated assembly.
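The difference can be condensed into a short sketch. The function name is invented, and pmessage is declared const char * here (unlike in K&R) so that the compiler rejects an accidental write instead of leaving it to undefined behaviour at run time.

```c
#include <assert.h>

/* Hypothetical demo: capitalises the first letter of the array and returns
 * that letter as seen through the re-aimed pointer. */
char modify_array_copy(void)
{
    char amessage[] = "now is the time";        /* array: 16 writable bytes */
    const char *pmessage = "now is the time";   /* pointer to a string literal */

    assert(sizeof amessage == 16);              /* 15 characters + '\0' */

    amessage[0] = 'N';                          /* fine: the array is ours to change */
    /* pmessage[0] = 'N';  would not compile thanks to const; without const it
       compiles, but the behaviour is undefined. */

    pmessage = amessage;                        /* fine: a pointer can be re-aimed */
    return pmessage[0];
}
```

The reverse assignment, amessage = pmessage;, would be rejected by the compiler, because an array is not a variable that holds an address.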
One other thing to point out is that arrays decay to pointers when passed as arguments to a function. The following two declarations are equivalent:
void foo(char arr[]);
and
void foo(char* arr);
For how to use pointers and the difference between an array and a pointer, I recommend you read "Expert C Programming" (http://www.amazon.com/Expert-Programming-Peter-van-Linden/dp/0131774298/ref=sr_1_1?ie=UTF8&qid=1371439251&sr=8-1&keywords=expert+c+programming).
A better way to return strings from functions is to allocate dynamic memory (using malloc) and fill it with the required string, return this pointer to the calling function, and then free it there once it is no longer needed.
Sample code:
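#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_NAME_SIZE 20
char * func1(void)
{
    char * c1 = NULL;
    c1 = malloc(MAX_NAME_SIZE); /* MAX_NAME_SIZE bytes; sizeof(MAX_NAME_SIZE) would only be sizeof(int) */
    if (c1 == NULL)
        return NULL;
    strcpy(c1,"John");
    return c1;
}
int main(void)
{
    char * c2 = NULL;
    c2 = func1();
    if (c2 == NULL)
        return 1;
    printf("%s \n",c2);
    free(c2);
    return 0;
}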
And this works without the static strings.
I am trying to understand how pointer incrementing and dereferencing go together, and I did this to try it out:
#include <stdio.h>
int main(int argc, char *argv[])
{
char *words[] = {"word1","word2"};
printf("%p\n",words);
printf("%s\n",*words++);
printf("%p\n",words);
return 0;
}
I expected this code to do one of these:
First dereference then increase the pointer (printing word1)
First dereference then increase the value (printing ord1)
Dereference pointer + 1 (printing word2)
But the compiler won't even compile this, and gives this error: lvalue required as increment operand. Am I doing something wrong here?
You cannot increment an array, but you can increment a pointer. If you convert the array you declare to a pointer, you will get it to work:
#include <stdio.h>
int main(int argc, char *argv[])
{
const char *ww[] = {"word1","word2"};
const char **words = ww;
printf("%p\n",words);
printf("%s\n",*words++);
printf("%p\n",words);
return 0;
}
You need to put parentheses around the pointer dereference in the second printf, e.g.: printf("%s\n",(*words)++); Note that this increments the char pointer stored in words[0] rather than words itself. Also, if you're attempting to get number 2 in your list there, you need to use the prefix increment rather than postfix.
words is the name of the array, so ++ makes no sense on it. You can take a pointer to the array elements, though:
for (char ** p = words; p != words + 2; ++p)
{
printf("Address: %p, value: '%s'\n", (void*)(p), *p);
}
Instead of 2 you can of course use the more generic sizeof(words)/sizeof(*words).
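As a small sketch of the same idea, the hypothetical function below receives the array as a pointer parameter, so *words++ is legal inside it: the parameter is an ordinary pointer variable that can be incremented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums the lengths of count strings.
 * words is a pointer parameter here, so incrementing it is allowed. */
size_t total_length(const char **words, size_t count)
{
    assert(words != NULL || count == 0);
    size_t total = 0;
    const char **end = words + count;
    while (words != end) {
        const char *w = *words++;   /* read the current element, then step to the next */
        while (*w != '\0') {
            total++;
            w++;
        }
    }
    return total;
}
```

With const char *ww[] = {"word1", "word2"}; the call total_length(ww, sizeof(ww)/sizeof(*ww)) walks both strings, while ww itself stays an array in the caller and is never modified.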
The problem is with this line:
printf("%s\n",*words++);
It is read as *(words++), i.e. it tries to increment the array itself, which is a block of memory rather than a pointer variable. That doesn't make sense, it is a bit like trying to do:
int a = 1;
(&a)++; // move a so that it points to the next address
which is illegal in C.
The problem is caused by the distinction between arrays and pointers in C: (basically) an array is a block of memory (allocated at compile time), while a pointer is a pointer to a block of memory (not necessarily allocated at compile time). It is a common trip-up when using C, and there are other questions on SO about it (e.g. C: differences between char pointer and array).
(The fix is described in other answers, but basically you want to use a pointer to strings rather than an array of strings.)