Using sizeof with a dynamically allocated array - c

gcc 4.4.1 c89
I have the following code snippet:
#include <stdlib.h>
#include <stdio.h>
char *buffer = malloc(10240);
/* Check for memory error */
if(!buffer)
{
fprintf(stderr, "Memory error\n");
return 1;
}
printf("sizeof(buffer) [ %lu ]\n", (unsigned long)sizeof(buffer));
However, sizeof(buffer) always prints 4. I know that a char* is only 4 bytes, but I have allocated 10 KB of memory, so shouldn't the size be 10240? Am I thinking about this correctly?
Many thanks for any suggestions,

You are asking for the size of a char*, which is indeed 4, not the size of the buffer. The sizeof operator cannot return the size of a dynamically allocated buffer, only the size of types and objects whose size is known at compile time.

sizeof doesn't work on dynamic allocations (with some exceptions in C99). Your use of sizeof here is just giving you the size of the pointer. This code will give you the result you want:
char buffer[10240];
printf("sizeof(buffer) [ %lu ]\n", (unsigned long)sizeof(buffer));
If malloc() succeeds, the memory pointed to is at least as big as you asked for, so there's no reason to care about the actual size it allocated.
Also, you've allocated 10 kB, not 1 kB.

It is up to you to track the size of the memory if you need it. malloc returns only a pointer to uninitialized memory; the pointer carries no size information. The sizeof operator operates only on the buffer variable itself, which is a pointer.

No. buffer is a char *. It is a pointer to char data. The pointer only takes up 4 bytes (on your system).
It points to 10240 bytes of data (which, by the way, is not 1 KB but 10 KB), but the pointer doesn't know that. Consider:
int a1[3] = {0, 1, 2};
int a2[5] = {0, 1, 2, 3, 4};
int *p = a1;
// sizeof a1 == 12 (3 * sizeof(int))
// but sizeof p == 4
p = a2;
// sizeof a2 == 20
// sizeof p still == 4
It's the main difference between arrays and pointers. If it didn't work that way, sizeof p would change in the above code, which doesn't make sense for a compile-time constant.

Replace your sizeof by malloc_usable_size (the manpage indicates that this is non-portable, so may not be available with your particular C implementation).

Related

Memory, pointers, and pointers to pointers

I am working on a short program that reads a .txt file. Initially, I was playing around in the main function, and I had gotten my code to work just fine. Later, I decided to abstract it into a function. Now I cannot seem to get my code to work, and I have been hung up on this problem for quite some time.
I think my biggest issue is that I don't really understand what is going on at a memory/hardware level. I understand that a pointer simply holds a memory address, and a pointer to a pointer simply holds the memory address of another memory address, a short breadcrumb trail to what we really want.
Yet, now that I am introducing malloc() to expand the amount of memory allocated, I seem to lose sight of what's going on. In fact, I am not really sure how to think of memory at all anymore.
So, a char takes up a single byte, correct?
If I understand correctly, then by default a char* takes up a single byte of memory?
If we were to have a:
char* str = "hello";
Would it be safe to assume that it takes up 6 bytes of memory (including the null character)?
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Now, if you would judge my interpretation. We are telling the compiler that we need "size" number of contiguous memory reserved for chars. If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
// Add characters to buffer
int i = 0;
char c;
while((c=fgetc(file))!=EOF){
*(buffer + i) = (char)c;
i++;
}
Adding the characters to the buffer and allocating the memory is what I cannot seem to wrap my head around.
If **buffer is pointing to *str which is equal to null, then how do I allocate memory to *str and add characters to it?
I understand that this is lengthy, but I appreciate the time you all are taking to read this! Let me know if I can clarify anything.
EDIT:
Whoa, my code is working now, thanks so much!
Although, I don't know why this works:
*((*buffer) + i) = (char)c;
So, a char takes up a single byte, correct?
Yes.
If I understand correctly, by default a char* takes up a single byte of memory.
Your wording is somewhat ambiguous. A char takes up a single byte of memory. A char * can point to one char, i.e. one byte of memory, or a char array, i.e. multiple bytes of memory.
The pointer itself takes up more than a single byte. The exact size is implementation-defined, usually 4 bytes (32-bit) or 8 bytes (64-bit). You can check the exact value with printf( "%zu\n", sizeof(char *) ).
If we were to have a char* str = "hello", would it be safe to assume that it takes up 6 bytes of memory (including the null character)?
Yes, the string literal occupies 6 bytes. The pointer str itself takes up additional space, as described above.
And, if we wanted to allocate memory for some "size" unknown at compile time, then we would need to dynamically allocate memory.
int size = determine_size();
char* str = NULL;
str = (char*)malloc(size * sizeof(char));
Is this syntactically correct so far?
Do not cast the result of malloc. And sizeof(char) is by definition always 1, so malloc(size) is enough.
If size was equal to 10, then str* would point to the first address of 10 memory addresses, correct?
Yes. Well, almost. str* makes no sense, and it's 10 chars, not 10 memory addresses. But str would point to the first of the 10 chars, yes.
Now, if we could go one step further.
int size = determine_size();
char* str = NULL;
file_read("filename.txt", size, &str);
This is where my feet start to leave the ground. My interpretation is that file_read() looks something like this:
int file_read(char* filename, int size, char** buffer) {
// Set up FILE stream
// Allocate memory to buffer
buffer = malloc(size * sizeof(char));
No. You would write *buffer = malloc( size );. The idea is that the memory you are allocating inside the function can be addressed by the caller of the function. So the pointer provided by the caller -- str, which is NULL at the point of the call -- needs to be changed. That is why the caller passes the address of str, so you can write the pointer returned by malloc() to that address. After your function returns, the caller's str will no longer be NULL, but contain the address returned by malloc().
buffer is the address of str, passed to the function by value. Allocating to buffer would only change that (local) pointer value.
Allocating to *buffer, on the other hand, is the same as allocating to str. The caller will "see" the change to str after your file_read() returns.
Although, I don't know why this works: *((*buffer) + i) = (char)c;
buffer is the address of str.
*buffer is, basically, the same as str -- a pointer to char (array).
(*buffer) + i is pointer arithmetic -- the pointer *buffer plus i means a pointer to the ith element of the array.
*((*buffer) + i) is dereferencing that pointer to the ith element -- a single char.
to which you are then assigning (char)c.
A simpler expression doing the same thing would be:
(*buffer)[i] = (char)c;
With char **buffer, buffer is the pointer to the pointer to char, *buffer is the pointer to char, and **buffer is the char value itself.
To pass back a pointer to a new array of chars, write *buffer = malloc(size).
To write values into the char array, write *((*buffer) + i) = c, or (probably simpler) (*buffer)[i] = c
See the following snippet demonstrating what's going on:
#include <stdio.h>
#include <stdlib.h>
void generate0to9(char** buffer) {
    *buffer = malloc(11); // *buffer dereferences the pointer to the pointer buffer one time, i.e. it writes a (new) pointer value into the address passed in by `buffer`
    if (*buffer == NULL) return; // allocation failed; the caller sees NULL
    for (int i=0;i<=9;i++) {
        //*((*buffer)+i) = '0' + i;
        (*buffer)[i] = '0' + i;
    }
    (*buffer)[10]='\0';
}
int main(void) {
    char *b = NULL;
    generate0to9(&b); // pass a pointer to the pointer b, such that the pointer's value can be changed in the function
    if (b == NULL) return 1;
    printf("b: %s\n", b);
    free(b);
    return 0;
}
Output:
b: 0123456789

Copy a string to a malloc'd array of strings

I thought I understood the answer to this question but I don't. I understand the first result but I still don't know how to do the copy correctly. I tried the following code:
// TstStrArr.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include <string.h>
#include <malloc.h>
int main()
{
char ** StrPtrArr;
char InpBuf0[] = "TstFld0";
char InpBuf1[] = "TstFld1";
StrPtrArr = (char **)malloc(2 * sizeof(char *));
StrPtrArr[0] = (char *)malloc(10 + 1);
printf("inpbuf=%s sizeof=%2d ", InpBuf0, sizeof(StrPtrArr[0]));
strncpy_s(StrPtrArr[0], sizeof(StrPtrArr[0]), InpBuf0, _TRUNCATE);
printf("strptrarr=%s\n", StrPtrArr[0]);
StrPtrArr[1] = (char *)malloc(10 + 1);
printf("inpbuf=%s sizeof=%2d ", InpBuf1, sizeof(*StrPtrArr[1]));
strncpy_s(*StrPtrArr[1], sizeof(*StrPtrArr[1]), InpBuf1, _TRUNCATE); // error here
printf("*strptrarr=%s\n", StrPtrArr[1]);
free(StrPtrArr[0]);
free(StrPtrArr[1]);
free(StrPtrArr);
return 0;
}
The result I got was:
inpbuf=TstFld0 sizeof= 4 strptrarr=Tst
inpbuf=TstFld1 sizeof= 1
and the following error:
Exception thrown: write access violation.
destination_it was 0xFFFFFFCD.
The result I thought I'd get was either of the following:
inpbuf=TstFld1 sizeof=11 *strptrarr=TstFld1
inpbuf=TstFld1 sizeof= 1 *strptrarr=T
I understand the first copy copied the input buffer to the 4 byte pointer which was incorrect. I thought the second copy would copy the input buffer to the value of the dereferenced pointer of a size of 11 but it didn't. I'm guessing the copy was to the first character of the string in the array. I don't understand memory enough to know the significance of the address 0xFFFFFFCD but I guess it's in read-only memory thus causing the error.
What is the correct way to do the copy?
(I don't think it matters, but I'm using VS 2015 Community Edition Update 3.)
Why
strncpy_s(*StrPtrArr[1], sizeof(*StrPtrArr[1]), InpBuf1, _TRUNCATE);
?
*StrPtrArr[1] should be StrPtrArr[1], because StrPtrArr is of type char** and you need a char* here.
And sizeof(*StrPtrArr[1]) is the size of a single char, i.e. 1.
Actually, sizeof(StrPtrArr[1]) cannot provide the correct value either: it is the size of the pointer.
You should remember the size of the allocated memory and then use it like this:
size_t arrSize = 10 + 1;
StrPtrArr[1] = (char *)malloc(arrSize);
. . .
strncpy_s(StrPtrArr[1], arrSize, InpBuf1, _TRUNCATE);
The problem is that you are using sizeof when deciding how many characters to copy. However, you allocated a fixed number of characters which is not known to sizeof operator: sizeof StrPtrArr[0] is equal to the size of char pointer on your system (four bytes, judging from the output), not 10 + 1. Hence, you need to specify that same number again in the call to secure string copy.
It isn't as complicated as people seem to think.
char** array = calloc( n, sizeof(array[0]) ); // allocate array of pointers
// assign a dynamically allocated pointer:
size_t size = strlen(str) + 1;
array[i] = malloc(size);
memcpy(array[i], str, size);
I intentionally used calloc during allocation, since that sets all pointers to NULL. This gives the advantage that you can harmlessly call free() on the pointer, even before it is assigned to point at a string.
This in turn means that you can easily (re)assign a new string to an index at any time, in the following way:
void str_assign (char** dst, const char* src)
{
size_t size = strlen(src) + 1;
free(*dst);
*dst = malloc(size);
if(*dst != NULL)
{
memcpy(*dst, src, size);
}
}
...
str_assign(&array[i], "something");

Using calloc and manually inputting chars results in a crash

I have this code right here.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int *size;
int i = 0;
char buf[] = "Thomas was alone";
size = (int*)calloc(1, sizeof(buf)+1);
for(i=0; i<strlen(buf); i++)
{
*(size+i) = buf[i];
printf("%c", *(size+i));
}
free(size);
}
To my understanding calloc reserves a memspace the size of the first arg multiplied by the second, in this case 18. The length of buf is 17 and thus the for loop should not have any problems at all.
Running this program results in the expected results ( It prints Thomas was alone ), however it crashes immediately too. This persists unless I crank up the size of calloc ( like multiplied by ten ).
Am I perhaps understanding something wrongly?
Should I use a function to prevent this from happening?
int *size means you need:
size = calloc(sizeof(buf), sizeof(int));
You allocated enough space for an array of char, but not an array of int (unless you're on an odd system where sizeof(char) == sizeof(int), which is a theoretical possibility rather than a practical one). That means your code writes well beyond the end of the allocated memory, which is what leads to the crashing. Or you can use char *size in which case the original call to calloc() is OK.
Note that sizeof(buf) includes the terminating null; strlen(buf) does not. That means you overallocate slightly with the +1 term.
You could also perfectly sensibly write size[i] instead of *(size+i).
Change the type of size to char *.
You are using an int * and when you add to the pointer here, *(size+i), you go out of bounds.
Pointer arithmetic takes account of the pointed-to type, which in your case is int, not char. sizeof(int) is larger than sizeof(char) on your system.
You allocated space for a char array, not for an int array:
char is 1 byte in memory (by definition)
int is 4 bytes in memory (most often)
so you allocate 1 * (sizeof(buf) + 1) = 18 bytes.
For example, the elements of buf sit at consecutive addresses:
&buf[0] = 0x34523
&buf[1] = 0x34524
&buf[2] = 0x34525
&buf[3] = 0x34526
But when you use *(size + 1), the pointer does not move by 1 byte but by sizeof(int), so by 4 bytes.
So in memory it will look like:
&size[0] = 0x4560
&size[1] = 0x4564
&size[2] = 0x4568
&size[3] = 0x456C
so after a few iterations you are writing past the end of the allocated memory.
Change calloc(1, sizeof(buf) + 1); to calloc(sizeof(int), sizeof(buf) + 1); to have enough memory.
Second, I assume this is an example you are using to learn how this works?
My suggestion:
Use the same type for the pointer and for the data it points to.
When you assign between different types, use an explicit conversion, in this example
*(size+i) = (int)buf[i];

Need help creating a dynamic char array in C

I'm having trouble creating a dynamic char array. This is what I have so far.
char * arr;
arr = (char*)malloc (2 * sizeof (char));
It's not allocating space for only 2 characters, it's letting me enter up to arr[8] and then giving me strange errors after 8.
I also tried making a 2 dimensional char array. The first dimension allocates correctly, but then the second dimension has more space than I allow it to have and gets an error at around 12 characters or so. Any help would be greatly appreciated. I would prefer to make a 1 dimensional dynamic array if possible.
The line arr = (char*)malloc (2 * sizeof (char)); allocates memory for 2 bytes only, but you are overwriting memory by accessing 8 or more bytes. Accessing more than the two allocated bytes leads to unpredictable behavior. If you want more memory, use code like the following.
#define USER_SIZE 10
arr = (char*)malloc ( USER_SIZE * sizeof (char));
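Set USER_SIZE to the number of characters you need, and malloc will allocate that much memory.
Example of a 2D allocation ( 5 x 10 )
#include <stdlib.h>
#define ROW 5
#define COLUMN 10
int main(void)
{
    unsigned char **p = NULL;
    int row;
    p = malloc ( ROW * sizeof ( unsigned char *) ); /* array of ROW row pointers */
    if (p == NULL)
        return 1;
    for (row = 0; row < ROW; ++row )
    {
        p[row] = malloc (COLUMN * sizeof (unsigned char )); /* one row of COLUMN chars */
    }
    /* use p[row][column] here, after checking each p[row] for NULL */
    for (row = 0; row < ROW; ++row )
        free(p[row]);
    free(p);
    return 0;
}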
What you are doing is called a buffer overflow: writing beyond the bounds of the memory allocated by the malloc call. The compiler doesn't do bounds checking (it assumes you know what you are doing, and you only pay for what you use) and allows you to compile and run the program. However, this leads to undefined behaviour and your program may crash. You shouldn't rely on such behaviour.
You, the programmer, have to make sure that you don't access memory illegally. You should not cast the result of malloc. Also, malloc can fail to allocate memory, in which case it returns NULL, the null pointer, which you should check for. You can also combine the declaration and the allocation into one statement.
int length = 8; // you can also use a macro
char *arr = malloc(length * sizeof *arr);
if(arr) {
// malloc call successful
// do stuff with arr
}

char * buf = malloc(sizeof (char *) * 16) vs char buf[ sizeof (char *) * 16]

I'm reading some C code that does
char * buf = malloc(sizeof (char *) * 16)
instead of
char buf[sizeof (char *) * 16]
What's the difference? I think the first expression is unnecessary if realloc() is never called, or am I wrong?
char buf[sizeof(char*)*16] is an array with automatic storage, which generally lives on the stack. It is valid as long as buf is in scope, and there must be sufficient stack space.
malloc allocates memory from some heap. It is valid until this memory is free()ed. Generally, there is much more heap available.
Yann's note is correct.
This appears to be an array of pointers, since it allocates memory for 16 times the size of a char pointer. Pointer size can vary between systems: pointers are 32-bit (4 bytes) on some and 64-bit (8 bytes) on others.
char buf[sizeof(char *) * 16] is not an array of pointers; it's an array of chars whose number of elements equals the size of a char pointer times 16.
Dynamic Array
The first one is a dynamic array. The expression char * buf = malloc(sizeof (char *) * 16) allocates the storage on the heap at run time. The advantage is that you can reallocate it, i.e. resize it at run time with realloc(). However, you may have to allocate new memory every time you add a new element. Here's an example:
#include <stdlib.h>
#include <stdio.h>
int main(void) {
    int* array;
    int n, i;
    printf("Enter the number of elements: ");
    if (scanf("%d", &n) != 1 || n <= 0) {
        return 1; /* invalid element count */
    }
    array = malloc(n * sizeof *array); /* size chosen at run time */
    if (array == NULL) {
        return 1; /* allocation failed */
    }
    for (i=0; i<n; i++) {
        printf("Enter number %d: ", i);
        scanf("%d", &array[i]);
    }
    printf("\nThe Dynamic Array is: \n");
    for (i=0; i<n; i++) {
        printf("The value of %d is %d\n", i, array[i]);
    }
    printf("Size= %d\n", n);
    free(array);
    return 0;
}
Automatic (Static?) Array
The second expression char buf[sizeof (char *) * 16] just declares a boring automatic array. Its size is fixed. No dynamic resizing, reallocation etc.
Note: apologies for the earlier type cast before malloc. Casting the return value of malloc can keep the compiler from reporting an error if you do something wrong (for example, forgetting to include <stdlib.h>). This may be followed by undefined behavior at run time and debugging hell. Always avoid casting the result of malloc. Thanks @Lundin.
The main difference is that if this code is in a function, you can still use the memory pointed to by the former after the function returns.

Resources