How does MRI Ruby store the contents of a String? - c

Primer: This question is quite long, because I want to give an overview of my current understanding of the inner mechanisms of MRI and how I came to my conclusions. I want to understand the code better, so please correct me if any assumption I'm making is wrong.
I'm trying to find out where MRI Ruby stores the data part (aka the contents) of a String, because I'd like to create String objects which reuse memory allocated by another binary (same allocator of course).
Here's what I know so far:
RString: internal representation of a String.
struct RString {
struct RBasic basic;
union {
struct {
long len;
char *ptr;
union {
long capa;
VALUE shared;
} aux;
} heap;
char ary[RSTRING_EMBED_LEN_MAX + 1];
} as;
};
reference
From the above snippet I conclude that there are 2 ways the data can be stored:
on the heap via the heap struct (ptr points to data)
in the ary char array directly (probably some optimization)
I'm only interested in the heap case.
str_new0() seems to be the most common way to create a String from a pointer to some string data and a length.
static VALUE
str_new0(VALUE klass, const char *ptr, long len, int termlen)
{
VALUE str;
if (len < 0) {
rb_raise(rb_eArgError, "negative string size (or size too big)");
}
RUBY_DTRACE_CREATE_HOOK(STRING, len);
str = str_alloc(klass);
if (len > RSTRING_EMBED_LEN_MAX) {
RSTRING(str)->as.heap.aux.capa = len;
RSTRING(str)->as.heap.ptr = ALLOC_N(char, len + termlen);
STR_SET_NOEMBED(str);
}
else if (len == 0) {
ENC_CODERANGE_SET(str, ENC_CODERANGE_7BIT);
}
if (ptr) {
memcpy(RSTRING_PTR(str), ptr, len);
}
STR_SET_LEN(str, len);
TERM_FILL(RSTRING_PTR(str) + len, termlen);
return str;
}
reference
Memory is allocated with the macro ALLOC_N which is an alias for RB_ALLOC_N which expands to ruby_xmalloc2() which calls objspace_xmalloc2() which calls objspace_xmalloc0().
Phew
static void *
objspace_xmalloc0(rb_objspace_t *objspace, size_t size)
{
void *mem;
size = objspace_malloc_prepare(objspace, size);
TRY_WITH_GC(mem = malloc(size));
size = objspace_malloc_size(objspace, mem, size);
objspace_malloc_increase(objspace, mem, size, 0, MEMOP_TYPE_MALLOC);
return objspace_malloc_fixup(objspace, mem, size);
}
reference
So here we are. TRY_WITH_GC seems to check if the allocation mem = malloc(size) succeeds and if not it tries again after a GC run I think.
#define TRY_WITH_GC(alloc) do { \
objspace_malloc_gc_stress(objspace); \
if (!(alloc) && \
(!garbage_collect_with_gvl(objspace, TRUE, TRUE, TRUE, GPR_FLAG_MALLOC) || /* full/immediate mark && immediate sweep */ \
!(alloc))) { \
ruby_memerror(); \
} \
} while (0)
reference
Here's the first thing I'm unsure about: It seems to malloc just some memory (important: not in objspace). Is this the case? I don't know if they overwrote malloc somewhere to allocate GC friendly or whatever.
OK after that they mutate objspace with objspace_malloc_increase() and friends. I don't understand what these functions do. They do not seem to store the pointer mem in objspace, but maybe I overlooked it. I need clarification here.
As noted in the beginning I want to write code that creates a Ruby String, which uses memory allocated by some other binary, eg. C via FFI, of course with the system allocator. Do I have to register my "foreign" memory via the objspace_* functions? If yes, how does that exactly work? And are there subtleties when it comes to freeing the memory again? (I guess the GC does that, but what conditions must be true for this to work?)
I hope my question is not too vague, I can ask more precisely if necessary!
Thanks in advance!

Related

How to initialize array size in a library in C?

I'm creating a C-library with .h and .c files for a ring buffer. Ideally, you would initialize this ring buffer library in the main project with something like ringbuff_init(int buff_size); and the size that is sent, will be the size of the buffer. How can I do this when arrays in C needs to be initialized statically?
I have tried some dynamically allocating of arrays already, I did not get it to work. Surely this task is possible somehow?
What I would like to do is something like this:
int buffSize[];
int main(void)
{
ringbuffer_init(100); // initialize buffer size to 100
}
void ringbuffer_init(int buff_size)
{
buffSize[buff_size];
}
This obviously doesn't compile because the array should have been initialized at the declaration. So my question is really, when you make a library for something like a buffer, how can you initialize it in the main program (so that in the .h/.c files of the buffer library) the buffer size is set to the wanted size?
You want to use dynamic memory allocation. A direct translation of your initial attempt would look like this:
size_t buffSize;
int * buffer;
int main(void)
{
ringbuffer_init(100); // initialize buffer size to 100
}
void ringbuffer_init(size_t buff_size)
{
buffSize = buff_size;
buffer = malloc(buff_size * sizeof(int));
}
This solution here is however extremely bad. Let me list the problems here:
There is no check of the result of malloc. It could return NULL if the allocation fails.
Buffer size needs to be stored along with the buffer, otherwise there's no way to know its size from your library code. It isn't exactly clean to keep these global variables around.
Speaking of which, these global variables are absolutely not thread-safe. If several threads call functions of your library, results are inpredictible. You might want to store your buffer and its size in a struct that would be returned from your init function.
Nothing keeps you from calling the init function several times in a row, meaning that the buffer pointer will be overwritten each time, causing memory leaks.
Allocated memory must be eventually freed using the free function.
In conclusion, you need to think very carefully about the API you expose in your library, and the implementation while not extremely complicated, will not be trivial.
Something more correct would look like:
typedef struct {
size_t buffSize;
int * buffer;
} RingBuffer;
int ringbuffer_init(size_t buff_size, RingBuffer * buf)
{
if (buf == NULL)
return 0;
buf.buffSize = buff_size;
buf.buffer = malloc(buff_size * sizeof(int));
return buf.buffer != NULL;
}
void ringbuffer_free(RingBuffer * buf)
{
free(buf.buffer);
}
int main(void)
{
RingBuffer buf;
int ok = ringbuffer_init(100, &buf); // initialize buffer size to 100
// ...
ringbuffer_free(&buf);
}
Even this is not without problems, as there is still a potential memory leak if the init function is called several times for the same buffer, and the client of your library must not forget to call the free function.
Static/global arrays can't have dynamic sizes.
If you must have a global dynamic array, declare a global pointer instead and initialize it with a malloc/calloc/realloc call.
You might want to also store its size in an accompanying integer variable as sizeof applied to a pointer won't give you the size of the block the pointer might be pointing to.
int *buffer;
int buffer_nelems;
char *ringbuffer_init(int buff_size)
{
assert(buff_size > 0);
if ( (buffer = malloc(buff_size*sizeof(*buffer)) ) )
buffer_nelems = buff_size;
return buffer;
}
You should use malloc function for a dynamic memory allocation.
It is used to dynamically allocate a single large block of memory with the specified size. It returns a pointer of type void which can be cast into a pointer of any form.
Example:
// Dynamically allocate memory using malloc()
buffSize= (int*)malloc(n * sizeof(int));
// Initialize the elements of the array
for (i = 0; i < n; ++i) {
buffSize[i] = i + 1;
}
// Print the elements of the array
for (i = 0; i < n; ++i) {
printf("%d, ", buffSize[i]);
}
I know I'm three years late to the party, but I feel I have an acceptable solution without using dynamic allocation.
If you need to do this without dynamic allocation for whatever reason (I have a similar issue in an embedded environment, and would like to avoid it).
You can do the following:
Library:
int * buffSize;
int buffSizeLength;
void ringbuffer_init(int buff_size, int * bufferAddress)
{
buffSize = bufferAddress;
buffSizeLength = buff_size;
}
Main :
#define BUFFER_SIZE 100
int LibraryBuffer[BUFFER_SIZE];
int main(void)
{
ringbuffer_init(BUFFER_SIZE, LibraryBuffer ) // initialize buffer size to 100
}
I have been using this trick for a while now, and it's greatly simplified some parts of working with a library.
One drawback: you can technically mess with the variable in your own code, breaking the library. I don't have a solution to that yet. If anyone has a solution to that I would love to here it. Basically good discipline is required for now.
You can also combine this with #SirDarius 's typedef for ring buffer above. I would in fact recommend it.

malloc alternative for memory allocation as a stack

I am looking for a malloc alternative for c that will only ever be used as a stack. Something more like alloca but not limited in space by the stack size. It is for coding a math algorithm.
I will work with large amounts of memory (possibly hundreds of megabytes in use in the middle of the algorithm)
memory is accessed in a stack-like order. What I mean is that the next memory to be freed is always the memory that was most recently allocated.
would like to be able to run an a variety of systems (Windows and Unix-like)
as an extra, something that can be used with threading, where the stack-like allocate and free order applies just to individual threads. (ie ideally each thread has its own "pool" for memory allocation)
My question is, is there anything like this, or is this something that would be easy to implement?
This sounds like a perfect use for Obstack.
I've never used it myself since the API is really confusing, and I can't dig up an example right now. But it supports all the operations you want, and additionally supports streaming creation of the "current" object.
Edit: whipped up a quick example. The Obstack API shows signs of age, but the principle is sound at least.
You will probably want to look into tuning the align/block settings and likely use obstack_next_free and obstack_object_size if you do any fancy growing.
#include <obstack.h>
#include <stdio.h>
#include <stdlib.h>
void *xmalloc(size_t size)
{
void *rv = malloc(size);
if (rv == NULL)
abort();
return rv;
}
#define obstack_chunk_alloc xmalloc
#define obstack_chunk_free free
const char *cat(struct obstack *obstack_ptr, const char *dir, const char *file)
{
obstack_grow(obstack_ptr, dir, strlen(dir));
obstack_1grow(obstack_ptr, '/');
obstack_grow0(obstack_ptr, file, strlen(file));
return obstack_finish(obstack_ptr);
}
int main()
{
struct obstack main_stack;
obstack_init(&main_stack);
const char *cat1 = cat(&main_stack, "dir1", "file1");
const char *cat2 = cat(&main_stack, "dir1", "file2");
const char *cat3 = cat(&main_stack, "dir2", "file3");
puts(cat1);
puts(cat2);
puts(cat3);
obstack_free(&main_stack, cat2);
// cat2 and cat3 both freed, cat1 still valid
}
As you already found out, as long as it works with malloc you should use it and only come back when you need to squeeze out the last bit of performance.
An idea fit that case: You could use a list of blocks, that you allocate when needed. Using a list makes it possible to eventually swap out data in case you hit the virtual memory limit.
struct block {
size_t size;
void * memory;
struct block * next;
};
struct stacklike {
struct block * top;
void * last_alloc;
};
void * allocate (struct stacklike * a, size_t s) {
// add null check for top
if (a->top->size - (a->next_alloc - a->top->memory) < s + sizeof(size_t)) {
// not enough memory left in top block, allocate new one
struct block * nb = malloc(sizeof(*nb));
nb->next = a->top;
a->top = nb;
nb->memory = malloc(/* some size large enough to hold multiple data entities */);
// also set nb size to that size
a->next_alloc = nb->memory;
}
void * place = a->next_alloc;
a->next_alloc += s;
*((size_t *) a->next_alloc) = s; // store size to be able to free
a->next_alloc += sizeof (size_t);
return place;
}
I hope this shows the general idea, for an actual implementation there's much more to consider.
To swap out stuff you change that to a doubly linked list an keep track of the total allocated bytes. If you hit a limit, write the end to some file.
I have seen a strategy used in an old FORTRAN program that might be what you are looking for. The strategy involves use of a global array that is passed down to each function from main.
char global_buffer[SOME_LARGE_SIZE];
void foo1(char* buffer, ...);
void foo2(char* buffer, ...);
void foo3(char* buffer, ...);
int main()
{
foo1(global_buffer, ....);
}
void foo1(char* buffer, ...)
{
// This function needs to use SIZE1 characters of buffer.
// It can let the functions that it calls use buffer+SIZE1
foo2(buffer+SIZE1, ...);
// When foo2 returns, everything from buffer+SIZE1 is assumed
// to be free for re-use.
}
void foo2(char* buffer, ...)
{
// This function needs to use SIZE2 characters of buffer.
// It can let the functions that it calls use buffer+SIZE2
foo3(buffer+SIZE2, ...);
}
void foo3(char* buffer, ...)
{
// This function needs to use SIZE3 characters of buffer.
// It can let the functions that it calls use buffer+SIZE3
bar1(buffer+SIZE3, ...);
}

Get the length of an array with a pointer? [duplicate]

I've allocated an "array" of mystruct of size n like this:
if (NULL == (p = calloc(sizeof(struct mystruct) * n,1))) {
/* handle error */
}
Later on, I only have access to p, and no longer have n. Is there a way to determine the length of the array given just the pointer p?
I figure it must be possible, since free(p) does just that. I know malloc() keeps track of how much memory it has allocated, and that's why it knows the length; perhaps there is a way to query for this information? Something like...
int length = askMallocLibraryHowMuchMemoryWasAlloced(p) / sizeof(mystruct)
I know I should just rework the code so that I know n, but I'd rather not if possible. Any ideas?
No, there is no way to get this information without depending strongly on the implementation details of malloc. In particular, malloc may allocate more bytes than you request (e.g. for efficiency in a particular memory architecture). It would be much better to redesign your code so that you keep track of n explicitly. The alternative is at least as much redesign and a much more dangerous approach (given that it's non-standard, abuses the semantics of pointers, and will be a maintenance nightmare for those that come after you): store the lengthn at the malloc'd address, followed by the array. Allocation would then be:
void *p = calloc(sizeof(struct mystruct) * n + sizeof(unsigned long int),1));
*((unsigned long int*)p) = n;
n is now stored at *((unsigned long int*)p) and the start of your array is now
void *arr = p+sizeof(unsigned long int);
Edit: Just to play devil's advocate... I know that these "solutions" all require redesigns, but let's play it out.
Of course, the solution presented above is just a hacky implementation of a (well-packed) struct. You might as well define:
typedef struct {
unsigned int n;
void *arr;
} arrInfo;
and pass around arrInfos rather than raw pointers.
Now we're cooking. But as long as you're redesigning, why stop here? What you really want is an abstract data type (ADT). Any introductory text for an algorithms and data structures class would do it. An ADT defines the public interface of a data type but hides the implementation of that data type. Thus, publicly an ADT for an array might look like
typedef void* arrayInfo;
(arrayInfo)newArrayInfo(unsignd int n, unsigned int itemSize);
(void)deleteArrayInfo(arrayInfo);
(unsigned int)arrayLength(arrayInfo);
(void*)arrayPtr(arrayInfo);
...
In other words, an ADT is a form of data and behavior encapsulation... in other words, it's about as close as you can get to Object-Oriented Programming using straight C. Unless you're stuck on a platform that doesn't have a C++ compiler, you might as well go whole hog and just use an STL std::vector.
There, we've taken a simple question about C and ended up at C++. God help us all.
keep track of the array size yourself; free uses the malloc chain to free the block that was allocated, which does not necessarily have the same size as the array you requested
Just to confirm the previous answers: There is no way to know, just by studying a pointer, how much memory was allocated by a malloc which returned this pointer.
What if it worked?
One example of why this is not possible. Let's imagine the code with an hypothetic function called get_size(void *) which returns the memory allocated for a pointer:
typedef struct MyStructTag
{ /* etc. */ } MyStruct ;
void doSomething(MyStruct * p)
{
/* well... extract the memory allocated? */
size_t i = get_size(p) ;
initializeMyStructArray(p, i) ;
}
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
doSomething(s) ;
}
Why even if it worked, it would not work anyway?
But the problem of this approach is that, in C, you can play with pointer arithmetics. Let's rewrite doSomethingElse():
void doSomethingElse()
{
MyStruct * s = malloc(sizeof(MyStruct) * 10) ; /* Allocate 10 items */
MyStruct * s2 = s + 5 ; /* s2 points to the 5th item */
doSomething(s2) ; /* Oops */
}
How get_size is supposed to work, as you sent the function a valid pointer, but not the one returned by malloc. And even if get_size went through all the trouble to find the size (i.e. in an inefficient way), it would return, in this case, a value that would be wrong in your context.
Conclusion
There are always ways to avoid this problem, and in C, you can always write your own allocator, but again, it is perhaps too much trouble when all you need is to remember how much memory was allocated.
Some compilers provide msize() or similar functions (_msize() etc), that let you do exactly that
May I recommend a terrible way to do it?
Allocate all your arrays as follows:
void *blockOfMem = malloc(sizeof(mystruct)*n + sizeof(int));
((int *)blockofMem)[0] = n;
mystruct *structs = (mystruct *)(((int *)blockOfMem) + 1);
Then you can always cast your arrays to int * and access the -1st element.
Be sure to free that pointer, and not the array pointer itself!
Also, this will likely cause terrible bugs that will leave you tearing your hair out. Maybe you can wrap the alloc funcs in API calls or something.
malloc will return a block of memory at least as big as you requested, but possibly bigger. So even if you could query the block size, this would not reliably give you your array size. So you'll just have to modify your code to keep track of it yourself.
For an array of pointers you can use a NULL-terminated array. The length can then determinate like it is done with strings. In your example you can maybe use an structure attribute to mark then end. Of course that depends if there is a member that cannot be NULL. So lets say you have an attribute name, that needs to be set for every struct in your array you can then query the size by:
int size;
struct mystruct *cur;
for (cur = myarray; cur->name != NULL; cur++)
;
size = cur - myarray;
Btw it should be calloc(n, sizeof(struct mystruct)) in your example.
Other have discussed the limits of plain c pointers and the stdlib.h implementations of malloc(). Some implementations provide extensions which return the allocated block size which may be larger than the requested size.
If you must have this behavior you can use or write a specialized memory allocator. This simplest thing to do would be implementing a wrapper around the stdlib.h functions. Some thing like:
void* my_malloc(size_t s); /* Calls malloc(s), and if successful stores
(p,s) in a list of handled blocks */
void my_free(void* p); /* Removes list entry and calls free(p) */
size_t my_block_size(void* p); /* Looks up p, and returns the stored size */
...
really your question is - "can I find out the size of a malloc'd (or calloc'd) data block". And as others have said: no, not in a standard way.
However there are custom malloc implementations that do it - for example http://dmalloc.com/
I'm not aware of a way, but I would imagine it would deal with mucking around in malloc's internals which is generally a very, very bad idea.
Why is it that you can't store the size of memory you allocated?
EDIT: If you know that you should rework the code so you know n, well, do it. Yes it might be quick and easy to try to poll malloc but knowing n for sure would minimize confusion and strengthen the design.
One of the reasons that you can't ask the malloc library how big a block is, is that the allocator will usually round up the size of your request to meet some minimum granularity requirement (for example, 16 bytes). So if you ask for 5 bytes, you'll get a block of size 16 back. If you were to take 16 and divide by 5, you would get three elements when you really only allocated one. It would take extra space for the malloc library to keep track of how many bytes you asked for in the first place, so it's best for you to keep track of that yourself.
This is a test of my sort routine. It sets up 7 variables to hold float values, then assigns them to an array, which is used to find the max value.
The magic is in the call to myMax:
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
And that was magical, wasn't it?
myMax expects a float array pointer (float *) so I use &arr to get the address of the array, and cast it as a float pointer.
myMax also expects the number of elements in the array as an int. I get that value by using sizeof() to give me byte sizes of the array and the first element of the array, then divide the total bytes by the number of bytes in each element. (we should not guess or hard code the size of an int because it's 2 bytes on some system and 4 on some like my OS X Mac, and could be something else on others).
NOTE:All this is important when your data may have a varying number of samples.
Here's the test code:
#include <stdio.h>
float a, b, c, d, e, f, g;
float myMax(float *apa,int soa){
int i;
float max = apa[0];
for(i=0; i< soa; i++){
if (apa[i]>max){max=apa[i];}
printf("on i=%d val is %0.2f max is %0.2f, soa=%d\n",i,apa[i],max,soa);
}
return max;
}
int main(void)
{
a = 2.0;
b = 1.0;
c = 4.0;
d = 3.0;
e = 7.0;
f = 9.0;
g = 5.0;
float arr[] = {a,b,c,d,e,f,g};
float mmax = myMax((float *)&arr,(int) sizeof(arr)/sizeof(arr[0]));
printf("mmax = %0.2f\n",mmax);
return 0;
}
In uClibc, there is a MALLOC_SIZE macro in malloc.h:
/* The size of a malloc allocation is stored in a size_t word
MALLOC_HEADER_SIZE bytes prior to the start address of the allocation:
+--------+---------+-------------------+
| SIZE |(unused) | allocation ... |
+--------+---------+-------------------+
^ BASE ^ ADDR
^ ADDR - MALLOC_HEADER_SIZE
*/
/* The amount of extra space used by the malloc header. */
#define MALLOC_HEADER_SIZE \
(MALLOC_ALIGNMENT < sizeof (size_t) \
? sizeof (size_t) \
: MALLOC_ALIGNMENT)
/* Set up the malloc header, and return the user address of a malloc block. */
#define MALLOC_SETUP(base, size) \
(MALLOC_SET_SIZE (base, size), (void *)((char *)base + MALLOC_HEADER_SIZE))
/* Set the size of a malloc allocation, given the base address. */
#define MALLOC_SET_SIZE(base, size) (*(size_t *)(base) = (size))
/* Return base-address of a malloc allocation, given the user address. */
#define MALLOC_BASE(addr) ((void *)((char *)addr - MALLOC_HEADER_SIZE))
/* Return the size of a malloc allocation, given the user address. */
#define MALLOC_SIZE(addr) (*(size_t *)MALLOC_BASE(addr))
malloc() stores metadata regarding space allocation before 8 bytes from space actually allocated. This could be used to determine space of buffer. And on my x86-64 this always return multiple of 16. So if allocated space is multiple of 16 (which is in most cases) then this could be used:
Code
#include <stdio.h>
#include <malloc.h>
int size_of_buff(void *buff) {
return ( *( ( int * ) buff - 2 ) - 17 ); // 32 bit system: ( *( ( int * ) buff - 1 ) - 17 )
}
void main() {
char *buff = malloc(1024);
printf("Size of Buffer: %d\n", size_of_buff(buff));
}
Output
Size of Buffer: 1024
This is my approach:
#include <stdio.h>
#include <stdlib.h>
typedef struct _int_array
{
int *number;
int size;
} int_array;
int int_array_append(int_array *a, int n)
{
static char c = 0;
if(!c)
{
a->number = NULL;
a->size = 0;
c++;
}
int *more_numbers = NULL;
a->size++;
more_numbers = (int *)realloc(a->number, a->size * sizeof(int));
if(more_numbers != NULL)
{
a->number = more_numbers;
a->number[a->size - 1] = n;
}
else
{
free(a->number);
printf("Error (re)allocating memory.\n");
return 1;
}
return 0;
}
int main()
{
int_array a;
int_array_append(&a, 10);
int_array_append(&a, 20);
int_array_append(&a, 30);
int_array_append(&a, 40);
int i;
for(i = 0; i < a.size; i++)
printf("%d\n", a.number[i]);
printf("\nLen: %d\nSize: %d\n", a.size, a.size * sizeof(int));
free(a.number);
return 0;
}
Output:
10
20
30
40
Len: 4
Size: 16
If your compiler supports VLA (variable length array), you can embed the array length into the pointer type.
int n = 10;
int (*p)[n] = malloc(n * sizeof(int));
n = 3;
printf("%d\n", sizeof(*p)/sizeof(**p));
The output is 10.
You could also choose to embed the information into the allocated memory yourself with a structure including a flexible array member.
struct myarray {
int n;
struct mystruct a[];
};
struct myarray *ma =
malloc(sizeof(*ma) + n * sizeof(struct mystruct));
ma->n = n;
struct mystruct *p = ma->a;
Then to recover the size, you would subtract the offset of the flexible member.
int get_size (struct mystruct *p) {
struct myarray *ma;
char *x = (char *)p;
ma = (void *)(x - offsetof(struct myarray, a));
return ma->n;
}
The problem with trying to peek into heap structures is that the layout might change from platform to platform or from release to release, and so the information may not be reliably obtainable.
Even if you knew exactly how to peek into the meta information maintained by your allocator, the information stored there may have nothing to do with the size of the array. The allocator simply returned memory that could be used to fit the requested size, but the actual size of the memory may be larger (perhaps even much larger) than the requested amount.
The only reliable way to know the information is to find a way to track it yourself.

What is the opposite of calloc in C

It is more than a funny question. :-)
I wish to initialize an array in C, but instead of zeroing out the array with calloc. I want to set all element to one. Is there a single function that does just that?
I have used my question above to search in google, no answer. Hope you can help me out! FYI, I am first year CS student just starting to program in C.
There isn't a standard C memory allocation function that allows you to specify a value other than 0 that the allocated memory is initialized to.
You could easily enough write a cover function to do the job:
void *set_alloc(size_t nbytes, char value)
{
void *space = malloc(nbytes);
if (space != 0)
memset(space, value, nbytes);
return space;
}
Note that this assumes you want to set each byte to the same value. If you have a more complex initialization requirement, you'll need a more complex function. For example:
void *set_alloc2(size_t nelems, size_t elemsize, void *initializer)
{
void *space = malloc(nelems * elemsize);
if (space != 0)
{
for (size_t i = 0; i < nelems; i++)
memmove((char *)space + i * elemsize, initializer, elemsize);
}
return space;
}
Example usage:
struct Anonymous
{
double d;
int i;
short s;
char t[2];
};
struct Anonymous a = { 3.14159, 23, -19, "A" };
struct Anonymous *b = set_alloc2(20, sizeof(struct Anonymous), &a);
memset is there for you:
memset(array, value, length);
There is no such function. You can implement it yourself with a combination of malloc() and either memset() (for character data) or a for loop (for other integer data).
The impetus for the calloc() function's existence (vs. malloc() + memset()) is that it can be a nice performance optimization in some cases. If you're allocating a lot of data, the OS might be able to give you a range of virtual addresses that are already initialized to zero, which saves you the extra cost of manually writing out 0's into that memory range. This can be a large performance gain because you don't need to page all of those pages in until you actually use them.
Under the hood, calloc() might look something like this:
void *calloc(size_t count, size_t size)
{
// Error checking omitted for expository purposes
size_t total_size = count * size;
if (total_size < SOME_THRESHOLD) // e.g. the OS's page size (typically 4 KB)
{
// For small allocations, allocate from normal malloc pool
void *mem = malloc(total_size);
memset(mem, 0, total_size);
return mem;
}
else
{
// For large allocations, allocate directory from the OS, already zeroed (!)
return mmap(NULL, total_size, PROT_READ|PROT_WRITE, MAP_ANON|MAP_PRIVATE, -1, 0);
// Or on Windows, use VirtualAlloc()
}
}

how can i know the allocated memory size of pointer variable in c [duplicate]

This question already has answers here:
Determine size of dynamically allocated memory in C
(15 answers)
Closed 3 years ago.
I have faced some problem in this case can you please your ideas.
main()
{
char *p=NULL;
p=(char *)malloc(2000 * sizeof(char));
printf("size of p = %d\n",sizeof (p));
}
In this program Its print the 4 that (char *) value,but i need how many bytes allocated for
that.
You could also implement a wrapper for malloc and free to add tags (like allocated size and other meta information) before the pointer returned by malloc. This is in fact the method that a c++ compiler tags objects with references to virtual classes.
Here is one working example:
#include <stdlib.h>
#include <stdio.h>
void * my_malloc(size_t s)
{
size_t * ret = malloc(sizeof(size_t) + s);
*ret = s;
return &ret[1];
}
void my_free(void * ptr)
{
free( (size_t*)ptr - 1);
}
size_t allocated_size(void * ptr)
{
return ((size_t*)ptr)[-1];
}
int main(int argc, const char ** argv) {
int * array = my_malloc(sizeof(int) * 3);
printf("%u\n", allocated_size(array));
my_free(array);
return 0;
}
The advantage of this method over a structure with size and pointer
struct pointer
{
size_t size;
void *p;
};
is that you only need to replace the malloc and free calls. All other pointer operations require no refactoring.
There is no portable way but for windows:
#include <stdio.h>
#include <malloc.h>
#if defined( _MSC_VER ) || defined( __int64 ) /* for VisualC++ or MinGW/gcc */
#define howmanybytes(ptr) ((unsigned long)_msize(ptr))
#else
#error no known way
#endif
int main()
{
char *x=malloc(1234);
printf( "%lu", howmanybytes(x) );
return 0;
}
Although it may be possible that some libraries allows you to determine the size of an allocated buffer, it wouldn't be a standard C function and you should be looking at your library's own documentations for this.
However, if there are many places that you need to know the size of your allocated memory, the cleanest way you could do it is to keep the size next to the pointer. That is:
struct pointer
{
size_t size;
void *p;
};
Then every time you malloc the pointer, you write down the size in the size field also. The problem with this method however is that you have to cast the pointer every time you use it. If you were in C++, I would have suggested using template classes. However, in this case also it's not hard, just create as many structs as the types you have. So for example
struct charPtr
{
size_t size;
char *p;
};
struct intPtr
{
size_t size;
int *p;
};
struct objectPtr
{
size_t size;
struct object *p;
};
Given similar names, once you define the pointer, you don't need extra effort (such as casting) to access the array. An example of usage is:
struct intPtr array;
array.p = malloc(1000 * sizeof *array.p);
array.size = array.p?1000:0;
...
for (i = 0; i < array.size; ++i)
printf("%s%d", i?" ":"", array.p[i]);
printf("\n");
It is impossible to know how much memory was allocated by just the pointer. doing sizeof (p) will get the size of the pointer variable p which it takes at compile time, and which is the size of the pointer. That is, the memory the pointer variable takes to store the pointer variable p. Inside p the starting address of the memory block is stored.
Once you allocate some memory with malloc it will return the starting address of the memory block, but the end of the block cannot be found from it, as there is no terminator for a block. You define the end of the block therefore you need to identify it by any means, so store it somewhere. Therefore you need to preserve the block length somewhere to know where the block which is pointed to by p ends.
Note: Although the memory allocation structure keeps track of allocated and unallocated blocks, therefore we can know the allocated memory block length from these structures, but these structures are not available to be used by the users, unless any library function provides them. Therefore a code using such feature is not portable (pointed by #Rudy Velthuis) . Therefore it is the best to keep track of the structure yourself.
You need to keep track of it in a variable if you want to know it for later:
char *p = NULL;
int sizeofp = 2000*sizeof(char);
p = (char *)malloc(sizeofp);
printf("size of p = %d\n",sizeofp);
You cannot use the sizeof in this case, since p is a pointer, not an array, but since you allocate it, you already know:
main()
{
size_t arr_size = 2000;
char *p=NULL;
p=malloc(arr_size * sizeof(char));
printf("size of p = %d\n",arr_size);
}
Edit - If the malloc fails to allocate the size you wanted, it won't give you a pointer to a smaller buffer, but it will return NULL.

Resources