How does this memory allocator work? - c

So, I just sat down and decided to write a memory allocator. I was tired, so I just threw something together. What I ended up with was this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
if(next + n <= buffer + BUFSIZE){
next += n;
return (void *)(next - n);
}else{
return NULL;
}
}
void afree(void *c){
next = (char *)c;
}
int main(){
int *num = alloc(sizeof(int));
*num = 5643;
printf("%d: %d", *num, sizeof(int));
afree(num);
}
For some reason, this works. But I can not explain why it works. It may have to do with the fact that I am tired, but I really can not see why it works. So, this is what it should be doing, logically, and as I understand it:
It creates a char array with a pointer which points to the first element of the array.
When I call alloc with a value of 4 (which is the size of an int, as I have tested down below), it should set next to point to the fourth element of the array. It should then return a char pointer to the first 4 bytes of the array, cast to a void pointer.
I then set that value to something greater than the max value of a char. C should realise that that isn't possible and should then truncate it to *num % sizeof(char).
I have one guess as to why this works: when the char pointer is cast to a void pointer and then gets turned into an int pointer, it somehow changes the size of the pointer so that it is able to point to an integer. (I haven't only tried this memory allocator with integers, but with structures too, and it seems to work with them as well.)
Is this guess correct, or am I too tired to think?
EDIT 2: I think I've understood it. I realised that my phrasing from yesterday was quite bad. The thing which threw me off was the fact that the returned pointer actually points to a char, but I am still somehow able to store an integer value.

The allocator posted implements a mark and release allocation scheme:
alloc(size) returns a valid pointer if there are at least size unallocated bytes available in the arena. The available size is reduced accordingly. Note that this pointer can only be used to store bytes, as it is not guaranteed to be properly aligned for anything else. Furthermore, from a strict interpretation of the C Standard, even if the pointer is properly aligned, using it as a pointer to any other type would violate the strict aliasing rule.
afree(ptr) resets the arena to the state it was in before alloc() returned ptr. It would be a useful extension to make afree(NULL) reset the arena to its initial state.
Note that the main() function attempts to use the pointer returned by alloc(sizeof(int)) as a pointer to int. This invokes undefined behavior because there is no guarantee that buffer is properly aligned for this, and because of the violation of the strict aliasing rule.
Note also that the format in printf("%d: %d", *num, sizeof(int)); is incorrect for the second argument, because sizeof yields a size_t. It should be printf("%d: %zu", *num, sizeof(int)); or printf("%d: %d", *num, (int)sizeof(int)); if the C runtime library is too old to support %zu.

Actually! I came up with a reason for the behaviour! This is what I was wondering; however, I wasn't too good at putting my thoughts into words yesterday (sorry). I modified my code to something like this:
#include <stdio.h>
#include <stdlib.h>
#define BUFSIZE 1024
char buffer[BUFSIZE];
char *next = buffer;
void *alloc(int n){
if(next + n <= buffer + BUFSIZE){
next += n;
return (void *)(next - n);
}else{
return NULL;
}
}
void afree(void *c){
next = (char *)c;
}
int main(){
int *num = alloc(sizeof(int));
*num = 346455;
printf("%d: %d\n", *num, (int)sizeof(int));
printf("%d, %d, %d, %d", *(next - 4), *(next - 3), *(next - 2), *(next - 1));
afree(num);
}
Now, the last printf produces "87, 73, 5, 0".
If you convert all the values into a big binary value you get this: 00000000 00000101 01001001 01010111. If you take that binary value and convert it to decimal you get the original value of *num, which is 346455. So, basically it separates the integer into 4 bytes and puts them into different elements of the array. I think this is implementation-defined and has to do with little endian and big endian. Is this correct? My first prediction was that it would truncate the integer and basically set the value to (integer value) % sizeof(char).

int *num = alloc(sizeof(int));
Says: 'here is a pointer (from alloc) that points to some space; let's say it points to an integer (int *).'
Then you say
*num = 5643;
Which says: set that integer to 5643.
Why wouldn't it work, given that alloc did in fact return a pointer to a block of good memory that can hold an integer?

Related

What is the C-language statement for function returning a pointer to an array?

What would be the C code for a function that accepts a pointer to a character as argument and returns a pointer to an array of integers?
I have a confusion here. My answer is as follows:
int * q (char *) [ ]
I'm not sure if I'm correct. If it's incorrect, what is the correct answer and, more importantly, what is the approach to answering it? In general, I would appreciate any general method for learning to interpret such questions and convert them to C code.
When dealing with functions, you basically need to treat arrays as pointers, because C cannot pass or return an array by value, and operators such as sizeof no longer report the array size once the array has decayed to a pointer.
For your purpose, int ** q (char *) is enough, although you would not be able to know the length of the returned array this way.
A pointer to an array of integers looks like:
int (*p)[];
where it is optional to have a dimension inside the square brackets.
So a function returning that would look like:
int (*func(char *))[];
Note that "pointer to array" is a different thing to "pointer to first element of array". Sometimes people say the former when they mean the latter. If you actually meant the latter then your function could be more simply:
int *func(char *);
The first form is rarely used because there is nothing you can do with the return value other than decay it to int * anyway. It would sometimes be useful to specify a dimension if the function is always to return a pointer to a fixed-size buffer, but in that case I would recommend using a typedef for readability:
typedef int ARR_4_INT[4];
ARR_4_INT * func(char *);
I think I see where your confusion is, and I'm not sure it has been cleared based on the answer you selected. int **q (char *) is a function that returns a pointer to pointer to int, which if that is what you need, that is fine, but understand, the only way a function can return a pointer to pointer to int is if (1) the address of an array of int (pointer to array) is passed as a parameter to q, or (2), pointers are allocated in q and the pointer to allocated pointers is returned.
You appear to want (1), but have selected an answer for (2). Take a look at cdecl.org (C-declarations tool), which can always help decipher declarations.
The best way to help you sort it out is probably an example of each. The following examples just play on the ASCII value of the character variable c (ASCII '5' by default -- decimal 53). In the first case, the function returns a pointer-to-int (which is a pointer to a block of memory holding zero or more integers). For example, here a block of memory is allocated to hold 53 integers filled with values from 53-105:
#include <stdio.h>
#include <stdlib.h>
int *q (char *c)
{
int *a = calloc (*c, sizeof *a);
if (a)
for (int i = 0; i < (int)*c; i++)
a[i] = *c + i;
return a;
}
int main (int argc, char **argv) {
char c = argc > 1 ? *argv[1] : '5';
int *array = NULL;
if ((array = q (&c)))
for (int i = 0; i < (int)c; i++)
printf ("array[%3d] : %d\n", i, array[i]);
free (array); /* don't forget to free mem */
return 0;
}
Now in the second case, you are returning a pointer-to-pointer-to-int (a pointer to a block of memory holding zero or more pointers to int -- which each in turn can be separately allocated to hold zero or more integers each). In this case each individual pointer is allocated with space to hold one int and the same values as above are assigned:
#include <stdio.h>
#include <stdlib.h>
int **q (char *c)
{
int **a = calloc (*c, sizeof *a); /* allocate *c pointers to int */
if (a) {
for (int i = 0; i < (int)*c; i++)
if ((a[i] = calloc (1, sizeof **a))) /* alloc 1 int per-pointer */
*(a[i]) = *c + i;
else {
fprintf (stderr, "error: memory exhausted.\n");
break;
}
}
return a;
}
int main (int argc, char **argv) {
char c = argc > 1 ? *argv[1] : '5';
int **array = NULL;
if ((array = q (&c)))
for (int i = 0; i < (int)c; i++) {
printf ("array[%3d] : %d\n", i, *(array[i]));
free (array[i]); /* free individually allocated blocks */
}
free (array); /* don't forget to free pointers */
return 0;
}
Example Use/Output
In each case, if no argument is passed to the program, the default output would be:
$ ./bin/qfunction2
array[ 0] : 53
array[ 1] : 54
array[ 2] : 55
...
array[ 50] : 103
array[ 51] : 104
array[ 52] : 105
Now it is entirely unclear which of the two cases you are after as you seem to explain you want a pointer to an array of integers returned. As I started my last answer with, you cannot return an array, all you can return is a pointer, but that pointer can point to a block of memory that can hold multiple integers, or it can point to a block of memory holding multiple pointers that can each in turn be allocated to hold multiple integers.
So the ambiguity comes in "Do you want the return to point to an array of integers or an array of pointers?" int *q (char *) is for the first, int **q (char*) is for the second.
Now, going forward, you will realize that you have not provided a way to know how many integers (or pointers) are being returned. (that requires an extra parameter or global variable (discouraged) at the very least). That is left for another day. (it is also why the examples are a play on the ASCII value of 53 or whatever character is the first character of the first argument, it provides a fixed value to know what has been allocated)
I'm happy to provide further help, but I will need a bit of clarification on what you are trying to accomplish.

What does pointer array from malloc mean?

unsigned char *dup = malloc(size);
My question may be naive. What does dup[2] mean? Is it a pointer to the third char from the malloced memory or it's the value of the third char from the malloced memory? I have searched google but found no result to explain this. Many thanks for your time.
it's the value of the third char from the malloced memory?
This.
dup[2] is equivalent to *(dup + 2). The + 2 implicitly acts like + 2 * sizeof(char) bytes.
If you want a pointer to the third char in the memory, without dereferencing it, then you use the same expression as above without the dereferencing operator:
unsigned char *thirdChar = dup + 2;
dup[2] is semantically identical to *(dup + 2). So it is the value of the third byte pointed to by dup. That is, the memory addresses are:
dup, dup+1, dup+2, ....., dup+size-1
Note that malloc does not initialize the returned memory, so strictly speaking, the value of dup[2] could be anything.
Clearly, dup[2] in this case represents the third character of the string, exactly as *(dup + 2) does.
The supporting code is as follows:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
unsigned char *dup = malloc(10);
if (dup == NULL)
return 1;
/* %9s leaves room for the terminating null in the 10-byte block */
if (scanf("%9s", (char *)dup) != 1) {
free(dup);
return 1;
}
printf("%c", dup[2]);
printf("\n%c", *(dup+2));
free(dup);
return 0;
}
Both printf statements produce the same output, which makes this clear.

Using calloc and manually inputting chars results in a crash

I have this code right here.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int *size;
int i = 0;
char buf[] = "Thomas was alone";
size = (int*)calloc(1, sizeof(buf)+1);
for(i=0; i<strlen(buf); i++)
{
*(size+i) = buf[i];
printf("%c", *(size+i));
}
free(size);
}
To my understanding calloc reserves a memspace the size of the first arg multiplied by the second, in this case 18. The length of buf is 17 and thus the for loop should not have any problems at all.
Running this program results in the expected results ( It prints Thomas was alone ), however it crashes immediately too. This persists unless I crank up the size of calloc ( like multiplied by ten ).
Am I perhaps understanding something wrongly?
Should I use a function to prevent this from happening?
int *size means you need:
size = calloc(sizeof(int), sizeof(buf));
You allocated enough space for an array of char, but not an array of int (unless you're on an odd system where sizeof(char) == sizeof(int), which is a theoretical possibility rather than a practical one). That means your code writes well beyond the end of the allocated memory, which is what leads to the crashing. Or you can use char *size in which case the original call to calloc() is OK.
Note that sizeof(buf) includes the terminal null; strlen(buf) does not. That means you overallocate slightly with the +1 term.
You could also perfectly sensibly write size[i] instead of *(size+i).
Change the type of size to char *.
You are using an int *, and when you add to the pointer here, *(size+i), you go out of bounds.
Pointer arithmetic takes account of the pointed-to type, which in your case is int, not char. sizeof(int) is larger than sizeof(char) on your system.
You allocate space for a char array, not for an int array:
char is 1 byte in memory (by definition)
int is 4 bytes in memory (most often)
so you allocate 1 * (sizeof(buf) + 1) = 18 bytes.
With a char pointer, consecutive elements sit at consecutive addresses, for example:
&buf[0] = 0x34523
&buf[1] = 0x34524
&buf[2] = 0x34525
&buf[3] = 0x34526
But *(size + 1) does not move the pointer by 1 byte; it moves it by sizeof(int), so by 4 bytes.
So in memory it looks like:
&size[0] = 0x4560
&size[1] = 0x4564
&size[2] = 0x4568
&size[3] = 0x456C
so after a few iterations you are writing past the end of the allocated block.
Change calloc(1, sizeof(buf) + 1); to calloc(sizeof(int), sizeof(buf) + 1); to have enough memory.
Second, I assume this is an example you are using to learn how it works?
My suggestions:
Use the same type for the pointer and the variable.
When you assign between different types, use an explicit conversion, in this example:
*(size+i) = (int)buf[i];

Why do I get a seg fault here? I want to put an integer into a char pointer array

#include <stdio.h>
#include <stdlib.h>
int main()
{
int num = 1;
char* test[8];
sprintf(test[0],"%d",num);
printf("%s\n",test[0]);
}
char *test[8] is an array of 8 char *, or pointers to strings, and since you don't specify, they're all set to garbage values. So sprintf is trying to write data to who-knows-where.
You should use char test[8] instead, which allocates an array of 8 char, and then sprintf(test, "%d", num);.
UPDATE: If you want to use char * pointers, you should allocate space:
char *test = malloc(8 /* see note below */);
sprintf(test, "%d", num);
If you want to use an array of char * pointers, it works the same:
char *test[8]; // 8 pointers to strings
test[0] = malloc(8); // allocate memory for the first pointer
sprintf(test[0], "%d", num);
Keep in mind you would have to call malloc for each of test[0] through test[7] individually.
Also, as mentioned in the comments, if your compiler supports it you should use snprintf(). It's like sprintf but it takes an extra parameter which is the size of the buffer:
snprintf(test, 8, "%d", num);
and guarantees not to use more space than you allow it. It's safer, and if you need to, snprintf returns the amount of space it actually wanted, so if you gave it too little room you can realloc and try again.
Note: some will say this should be malloc(8 * sizeof(char)) (or sizeof *test). They are wrong (in my objectively-correct opinion; note the sarcasm)! sizeof(char) is guaranteed to be 1, so this multiplication is unnecessary.
Some advocate the usage of TYPE *p = malloc(x * sizeof *p) so that if TYPE changes, you'll only need to change it in one place, and sizeof *p will adapt. I am one of these people, but in my opinion you will rarely need to upgrade a char * to another type. Since so many functions use char * and would need to be changed in such an upgrade, I'm not worried about making malloc lines more flexible.
sprintf() does not allocate space for the string; you must do that yourself beforehand.
Look at your warnings:
test.c: In function ‘main’:
test.c:8: warning: ‘test[0]’ is used uninitialized in this function
You allocate an array of 8 pointers, but use one without initializing it. You must call malloc and store the result in test[0] before you can write to the memory pointed to by test[0], and free it at the end.
A useful function, present in GNU and BSD, is asprintf, which will call malloc for you to allocate enough memory for the formatted string:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
int num = 1;
char* test[8];
asprintf(&test[0],"%d",num);
printf("%s\n",test[0]);
free(test[0]);
return 0;
}
(Note that you pass the address of your pointer to asprintf — since your pointer is test[0], its address is &test[0].)
You did allocate space, but you are passing the wrong thing. Try this:
#include <stdio.h>
#include <stdlib.h>
int main()
{
int num = 1;
char test[8];
sprintf(test,"%d",num);
printf("%s\n",test);
}
int main()
{
char *str[5];
sprintf(str[0], "%d",55);
printf("%s\n",str[0]);
return 0;
}
This may appear to work, but if you pass a variable instead of an integer constant, a segmentation fault occurs. The error happens when sprintf executes, because it writes through an uninitialized pointer into memory the program does not own. In both cases the behavior is undefined, so the version that appears to work is not safe either.

Problem with type-casting array of strings in C

I am trying to read a large list of English words from a text file to array of strings. The number of words is 2016415, and maximum length of a word is 69 characters.
If I define array like "char data[2016415][70]; " then I get stack overflow when I run the program.
So I am trying to use calloc() instead, however I can't understand how should I type-cast it so that it becomes equivalent to "char data[2016415][70];".
The following program returns "passing arg 1 of `fgets' makes pointer from integer without a cast" warning during compiling. And when I execute it, it gets "Exception: STATUS_ACCESS_VIOLATION" problem.
Can you help me?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void){
char *data; //data[2016415][70];
int i;
FILE *fsol;
fsol = fopen("C:\\Downloads\\abc\\sol2.txt","r");
data = (char*) calloc(2016415,70);
for(i=0;i<2016415;i++){
fgets(data[i] , 70 , fsol);
}
fclose(fsol);
return 0;
}
calloc just allocates a big swath of memory - not an array of pointers to other arrays.
fgets expects a pointer to the memory location it should dump its stuff at.
So instead of giving it the contents of data[i], you want to give it the address of data[i] so it can put its stuff there.
fgets(&data[i], 70, fsol);
You'll probably also need to adjust your loop so that it goes up by 70-odd characters at a time rather than one.
Okay, sorry about the previous suggestion. I forgot how horrible arrays can be. This one is tested with a small data set of 10 words, but it should scale to your word count. Note that fgets() keeps the line ending as part of the preceding word.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_WORD_CNT 2016415
#define MAX_WORD_LEN 70
int main(void)
{
char *data; //data[2016415][70];
int i;
FILE *fsol;
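fsol = fopen("C:\\Downloads\\abc\\sol2.txt","r");
// check that the file opened
if (fsol == NULL)
{
return 1;
}
data = (char*) calloc(MAX_WORD_CNT, MAX_WORD_LEN);
// check for valid allocation
if (data == NULL)
{
fclose(fsol);
return 1;
}
for(i=0; i<MAX_WORD_CNT; i++)
{
// each word gets its own MAX_WORD_LEN-byte slot; stop at end of file
if (fgets(&data[i * MAX_WORD_LEN], MAX_WORD_LEN, fsol) == NULL)
break;
}
fclose(fsol);
free(data);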
return 0;
}
Here's how I would allocate the array
char **data = malloc(MAX_WORD_CNT * sizeof(char *));
for(int i = 0; i < MAX_WORD_CNT; i++)
data[i] = malloc(MAX_WORD_LEN);
you might want to add some error checking for malloc though.
data is a pointer to char (also addressable as an array of char), so data[i] is a single char. fgets expects a pointer to char but you're passing it a single char; that's why you're getting the warning, you're trying to use a char (integer) as a pointer.
When you run the program, it then takes that single char argument and interprets it as a pointer to char, hence the access violation because the value of that char is not a valid address.
So, in your loop you should pass fgets a pointer into data and increment that by 70 with each iteration. You can use the "pointer to an array element" form &data[i] and increment i, or the simple pointer form, with another pointer variable initially set to data, and itself incremented.
The answer is simple: you DON'T cast it. Casting the results of malloc/calloc/etc. has no purpose but can have the side-effect of hiding a major bug if you forgot to include stdlib.h. The return type of these allocation functions, which is void *, will be implicitly converted to whatever pointer type you need.
If you really want to know the type, it's (char (*)[70]). But please don't actually obfuscate your program by writing that. (Unless you're actually writing C++, in which case you should have tagged your question C++ and not C, or better yet used new.)

Resources