Obtain size of array via write permission check

To obtain the length of a null-terminated string, we simply write len = strlen(str). However, I often see posts here on SO saying that to get the size of an int array, for example, you need to keep track of it yourself, and that's what I normally do. But I have a question: could we obtain the size by using some sort of write-permission check that tests whether we have write permission to a block of memory? For example:
#include <stdbool.h>
#include <stdio.h>

int getSize(int *arr);
bool permissionTo(int *ptr);

int main(void)
{
    int arr[3] = {1,2,3};
    int size = getSize(arr) * sizeof(int);
}

int getSize(int *arr)
{
    int *ptr = arr;
    int size = 0;
    while( permissionTo(ptr) )
    {
        size++;
        ptr++;
    }
    return size;
}

bool permissionTo(int *ptr)
{
    /*............*/
}

No, you can't. Memory permissions don't have this granularity on most, if not all, architectures.
Almost all CPU architectures manage memory in pages. On most systems you'll run into today, one page is 4 KiB. There's no practical way to control permissions on anything smaller than that.
Most memory management is done by your libc allocating a largish chunk of memory from the kernel and then handing out smaller chunks of it to individual malloc calls. This is done for performance (among other things), because creating, removing or modifying a memory mapping is an expensive operation, especially on multiprocessor systems.
For the stack (as in your example), allocations are even simpler. The kernel knows that "this large area of memory will be used by the stack", and memory accesses to it simply cause the necessary pages to be allocated to back it. All the tracking your program does of stack allocations is one register.

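If what you are trying to achieve is an allocation that is comfortable to use because it carries its own size around, then do this:
Wrap malloc and free, prefixing the memory with its size internally (written from memory, not tested yet):
#include <stdlib.h>

/* Allocate numBytes and store the size in a hidden header before the block.
   The header is sizeof(size_t) bytes, so the returned pointer is only
   guaranteed to be aligned for size_t, not for every possible type. */
void *myMalloc(size_t numBytes) {
    size_t *mem = malloc(sizeof(size_t) + numBytes);
    if (mem == NULL)
        return NULL;
    mem[0] = numBytes;
    return mem + 1;
}

/* Step back over the header and free the real allocation. */
void myFree(void *memory) {
    if (memory != NULL)
        free((size_t *)memory - 1);
}

/* Read the size stored just before the user-visible block. */
size_t memlen(const void *memory) {
    return ((const size_t *)memory)[-1];
}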

Related

Memory Allocation, Recursive Function and Pure C [duplicate]

I know that on your hard drive, if you delete a file, the data is not (instantly) gone. The data is still there until it is overwritten. I was wondering if a similar concept existed in memory. Say I allocate 256 bytes for a string, is that string still floating in memory somewhere after I free() it until it is overwritten?

Your analogy is correct. The data in memory doesn't disappear or anything like that; the values may indeed still be there after a free(), though attempting to read from freed memory is undefined behaviour.

Generally, it does stay around, unless you explicitly overwrite the string before freeing it (like people sometimes do with passwords). Some library implementations automatically overwrite deallocated memory to catch accesses to it, but that is not done in release mode.

The answer depends highly on the implementation. On a good implementation, it's likely that at least the beginning (or the end?) of the memory will be overwritten with bookkeeping information for tracking free chunks of memory that could later be reused. However the details will vary. If your program has any level of concurrency/threads (even in the library implementation you might not see), then such memory could be clobbered asynchronously, perhaps even in such a way that even reading it is dangerous. And of course the implementation of free might completely unmap the address range from the program's virtual address space, in which case attempting to do anything with it will crash your program.
From the standpoint of an application author, you should simply treat free according to the specification and never access freed memory. But from the standpoint of a systems implementor or integrator, it might be useful to know (or design) the implementation, in which case your question becomes interesting.

If you want to check the behaviour of your implementation, the simple program below will do that for you. Note that it deliberately reads freed memory, which is undefined behaviour, so treat the output as an observation about one particular run, not as a guarantee.
#include <stdio.h>
#include <stdlib.h>

/* The number of memory bytes to test */
#define MEM_TEST_SIZE 256

/* Print every byte of the block in decimal */
void outputMem(unsigned char *mem, int length)
{
    int i;
    for (i = 0; i < length; i++) {
        printf("[%02d]", mem[i]);
    }
}

/* Count the bytes that no longer hold the pattern written in main() */
int bytesChanged(unsigned char *mem, int length)
{
    int i;
    int count = 0;
    for (i = 0; i < length; i++) {
        if (mem[i] != i % 256)
            count++;
    }
    return count;
}

int main(void)
{
    int i;
    unsigned char *mem = malloc(MEM_TEST_SIZE);
    if (mem == NULL)
        return 1;
    /* Fill memory with bytes */
    for (i = 0; i < MEM_TEST_SIZE; i++) {
        mem[i] = i % 256;
    }
    printf("After malloc and copy to new mem location\n");
    printf("mem = %p\n", (void *)mem);
    printf("Contents of mem: ");
    outputMem(mem, MEM_TEST_SIZE);
    free(mem);
    /* Undefined behaviour from here on: the block has been freed */
    printf("\n\nAfter free()\n");
    printf("mem = %p\n", (void *)mem);
    printf("Bytes changed in memory = %d\n", bytesChanged(mem, MEM_TEST_SIZE));
    printf("Contents of mem: ");
    outputMem(mem, MEM_TEST_SIZE);
    return 0;
}

Is it possible to increase char array while using it, WITHOUT malloc?

I have a char array, and we know that the size of a char is 1 byte. Now I have to collect some chars (with getchar(), of course) and simultaneously increase the array by 1 byte each time (without malloc; the only library allowed is stdio.h).
My idea would be to point to the array and somehow increase that array by 1 until there are no more chars to get OR you run out of memory...

Is it possible to increase char array while using it, WITHOUT malloc?
No.
You cannot increase the size of a fixed size array.
For that you need realloc() from <stdlib.h>, which it seems you are not "allowed" to use.

Is it possible to increase char array while using it, WITHOUT malloc?
Quick answer: No it is not possible to increase the size of an array without reallocating it.
Fun answer: Don't use malloc(), use realloc().
Long answer:
If the char array has static or automatic storage class, it is most likely impossible to increase its size at runtime, because keeping it at the same address would require objects that are present at higher addresses to be moved or reallocated elsewhere.
If the array was obtained by malloc, it might be possible to extend its size if no other objects have been allocated after it in memory. Indeed, realloc() to a larger size might return the same address. The problem is that this is impossible to predict, and if realloc returns a different address, the current space has been freed, so pointers to it are now invalid.
The efficient way to proceed with this reallocation is to increase the size geometrically, by a constant factor each time (2x, 1.5x, 1.625x ...), to minimize the number of reallocations and keep the total time linear as the array grows. You would need separate variables for the allocated size of the array and for the number of characters that you have stored into it.
Here is an example:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *a = NULL;
    size_t size = 0;    /* allocated size of the buffer */
    size_t count = 0;   /* number of characters stored */
    size_t i;
    int c;

    while ((c = getchar()) != EOF && c != '\n') {
        if (count >= size) {
            /* grow the buffer to roughly 1.5x its size */
            size_t new_size = size + size / 2 + 16;
            char *new_a = realloc(a, new_size);
            if (new_a == NULL) {
                fprintf(stderr, "out of memory for %zu bytes\n", new_size);
                free(a);
                return 1;
            }
            a = new_a;
            size = new_size;
        }
        a[count++] = c;
    }
    for (i = 0; i < count; i++) {
        putchar(a[i]);
    }
    free(a);
    return 0;
}

There are two ways to create space for the string without using dynamic memory allocation (malloc...): you can use a static array or an array with automatic storage duration. In both cases you need to specify a maximum size, which you might never reach, but always check against it.
#define BUFFER_SIZE 0x10000
Static
static char buffer[BUFFER_SIZE];
Or automatic (You need to ensure BUFFER_SIZE is smaller than the stack size)
int main() {
char buffer[BUFFER_SIZE];
...
}
There are also optimizations done by the operating system. It might lazily allocate the whole (static/automatic) buffer, so that only the used part is in physical memory. (This also applies to the dynamic memory allocation functions.) I found out that calloc (for big chunks) just allocates the virtual memory for the program; memory pages are cleared only when they are accessed (probably through page faults raised by the CPU). I compared it to an allocation with malloc and memset. The memset does unnecessary work if not all bytes/pages of the buffer are accessed by the program.
If you cannot allocate a buffer with malloc..., create a static/automatic array that is large enough and let the operating system allocate it for you. A zero-initialized static array does not take up that much space in the binary, because only its size is stored.

Why doesn't this memory eater really eat memory?

I want to create a program that will simulate an out-of-memory (OOM) situation on a Unix server. I created this super-simple memory eater:
#include <stdio.h>
#include <stdlib.h>
unsigned long long memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
void *memory = NULL;
int eat_kilobyte()
{
memory = realloc(memory, (eaten_memory * 1024) + 1024);
if (memory == NULL)
{
// realloc failed here - we probably can't allocate more memory for whatever reason
return 1;
}
else
{
eaten_memory++;
return 0;
}
}
int main(int argc, char **argv)
{
printf("I will try to eat %i kb of ram\n", memory_to_eat);
int megabyte = 0;
while (memory_to_eat > 0)
{
memory_to_eat--;
if (eat_kilobyte())
{
printf("Failed to allocate more memory! Stucked at %i kb :(\n", eaten_memory);
return 200;
}
if (megabyte++ >= 1024)
{
printf("Eaten 1 MB of ram\n");
megabyte = 0;
}
}
printf("Successfully eaten requested memory!\n");
free(memory);
return 0;
}
It eats as much memory as defined in memory_to_eat, which is now about 50 GB of RAM. It allocates memory 1 KB at a time, reports progress for every 1 MB, and prints exactly the point where it fails to allocate more, so that I know the maximum amount it managed to eat.
The problem is that it works. Even on a system with 1 GB of physical memory.
When I check top I see that the process eats 50 GB of virtual memory and only less than 1 MB of resident memory. Is there a way to create a memory eater that really does consume it?
System specifications: Linux kernel 3.16 (Debian) most likely with overcommit enabled (not sure how to check it out) with no swap and virtualized.

When your malloc() implementation requests memory from the system kernel (via an sbrk() or mmap() system call), the kernel only makes a note that you have requested the memory and where it is to be placed within your address space. It does not actually map those pages yet.
When the process subsequently accesses memory within the new region, the hardware raises a page fault and alerts the kernel to the condition. The kernel then looks up the page in its own data structures, and finds that you should have a zero page there, so it maps in a zero page (possibly first evicting a page from page-cache) and returns from the interrupt. Your process does not realize that any of this happened; the kernel's operation is perfectly transparent (except for the short delay while the kernel does its work).
This optimization allows the system call to return very quickly, and, most importantly, it avoids committing any resources to your process when the mapping is made. This allows processes to reserve rather large buffers that they never need under normal circumstances, without fear of gobbling up too much memory.
So, if you want to program a memory eater, you absolutely have to actually do something with the memory you allocate. For this, you only need to add a single line to your code:
int eat_kilobyte()
{
if (memory == NULL)
memory = malloc(1024);
else
memory = realloc(memory, (eaten_memory * 1024) + 1024);
if (memory == NULL)
{
return 1;
}
else
{
//Force the kernel to map the containing memory page.
((char*)memory)[1024*eaten_memory] = 42;
eaten_memory++;
return 0;
}
}
Note that it is perfectly sufficient to write to a single byte within each page (which contains 4096 bytes on x86). That's because all memory allocation from the kernel to a process is done at memory page granularity, which is, in turn, because the hardware does not allow paging at smaller granularities.

All the virtual pages start out copy-on-write mapped to the same zeroed physical page. To use up physical pages, you can dirty them by writing something to each virtual page.
If running as root, you can use mlock(2) or mlockall(2) to have the kernel wire up the pages when they're allocated, without having to dirty them. (Normal non-root users have a ulimit -l of only 64 KiB.)
As many others suggested, it seems that the Linux kernel doesn't really allocate the memory unless you write to it.
An improved version of the code, which does what the OP was wanting:
This also fixes the printf format string mismatches with the types of memory_to_eat and eaten_memory, using %zu to print size_t integers. The memory size to eat, in KiB, can optionally be specified as a command-line argument.
The messy design using global variables, and growing by 1k instead of 4k pages, is unchanged.
#include <stdio.h>
#include <stdlib.h>

size_t memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
char *memory = NULL;

void write_kilobyte(char *pointer, size_t offset)
{
    int size = 0;
    while (size < 1024)
    {   // writing one byte per page is enough, this is overkill
        pointer[offset + (size_t) size++] = 1;
    }
}

int eat_kilobyte()
{
    if (memory == NULL)
    {
        memory = malloc(1024);
    } else
    {
        memory = realloc(memory, (eaten_memory * 1024) + 1024);
    }
    if (memory == NULL)
    {
        return 1;
    }
    else
    {
        write_kilobyte(memory, eaten_memory * 1024);
        eaten_memory++;
        return 0;
    }
}

int main(int argc, char **argv)
{
    if (argc >= 2)
        memory_to_eat = atoll(argv[1]);
    printf("I will try to eat %zu kb of ram\n", memory_to_eat);
    int megabyte = 0;
    int megabytes = 0;
    while (memory_to_eat-- > 0)
    {
        if (eat_kilobyte())
        {
            printf("Failed to allocate more memory at %zu kb :(\n", eaten_memory);
            return 200;
        }
        if (++megabyte >= 1024)
        {
            megabytes++;
            printf("Eaten %i MB of ram\n", megabytes);
            megabyte = 0;
        }
    }
    printf("Successfully eaten requested memory!\n");
    free(memory);
    return 0;
}

A sensible optimisation is being made here. The runtime does not actually acquire the memory until you use it.
A simple memcpy will be sufficient to circumvent this optimisation. (You might find that calloc still optimises out the memory allocation until the point of use.)

Not sure about this one, but the only explanation that I can think of is that Linux is a copy-on-write operating system. When one calls fork, both processes point to the same physical memory. The memory is only copied once one process actually WRITES to it.
I think that here, the actual physical memory is only allocated when one tries to write something to it. Calling sbrk or mmap may well only update the kernel's memory bookkeeping. The actual RAM may only be allocated when we actually try to access the memory.

Basic Answer
As mentioned by others, allocating memory does not always commit the necessary RAM until the memory is used. This happens if you allocate a buffer larger than one page (usually 4 KiB on Linux).
One simple answer would be for your "eat memory" function to always allocate 1 KiB instead of increasingly larger blocks. This is because each allocated block starts with a header (a size for allocated blocks). So allocating a buffer of a size equal to or less than one page will always commit all of those pages.
Following Your Idea
To optimize your code as much as possible, you want to allocate blocks of memory aligned to 1 page size.
From what I can see in your code, you use 1024. I would suggest that you use:
int size;
size_t block_size;
size = getpagesize();
block_size = size - sizeof(void *) * 2;
What voodoo magic is this sizeof(void *) * 2?! When using the default memory allocation library (i.e. not SAN, fence, valgrind, ...), there is a small header just before the pointer returned by malloc() which includes a pointer to the next block and a size.
struct mem_header { void * next_block; intptr_t size; };
Now, using block_size, all your malloc() should be aligned to the page size we found earlier.
If you want to properly align everything, the first allocation needs to use an aligned allocation:
void *aligned = NULL;
int err = posix_memalign(&aligned, size, block_size); // returns 0 on success
char *p = aligned;
Further allocations (assuming your tool only does that) can use malloc(). They will be aligned.
p = malloc(block_size);
Note: please verify that it is indeed aligned on your system... it works on mine.
As a result you can simplify your loop with:
for(;;)
{
    p = malloc(block_size);
    if (p == NULL)
        break; // allocation failed, nothing left to eat
    *p = 1;
}
Until you create a thread, the malloc() does not use mutexes. But it still has to look for a free memory block. In your case, though, it will be one after the other and there will be no holes in the allocated memory so it will be pretty fast.
Can it be faster?
Further note about how memory is generally allocated in a Unix system:
the malloc() function and related functions will allocate a block in your heap; which at the start is pretty small (maybe 2Mb)
when the existing heap is full it gets grown using the sbrk() function; as far as your process is concerned, the memory address always increases, that's what sbrk() does (contrary to MS-Windows which allocates blocks all over the place)
using sbrk() once and then hitting the memory every "page size" bytes would be faster than using malloc()
char *p = malloc(size); // get current "highest address"
p += size;
p = (char *)((intptr_t)p & -(intptr_t)size); // clear bits (alignment)
size_t total_mem = (size_t)50 * 1024 * 1024 * 1024; // 50 GB
void *start = sbrk(total_mem); // returns (void *)-1 on failure, check before use
char *end = (char *)start + total_mem;
for (; p < end; p += size)
{
    *p = 1;
}
Note that the malloc() above may give you the "wrong" start address. But your process really doesn't do much, so I think you'll always be safe. That for() loop, however, is going to be as fast as possible. As mentioned by others, you'll get the total_mem of virtual memory allocated "instantly", and then the RSS memory is allocated each time you write to *p.
WARNING: Code not tested, use at your own risk.

memory leak (free function not working)

I am facing memory leak problem with the below code
static char **edits1(char *word)
{
int next_idx;
char **array = malloc(edits1_rows(word) * sizeof (char *));
if (!array)
return NULL;
next_idx = deletion(word, array, 0);
next_idx += transposition(word, array, next_idx);
next_idx += alteration(word, array, next_idx);
insertion(word, array, next_idx);
return array;
}
static void array_cleanup(char **array, int rows) {
int i;
for (i = 0; i < rows; i++)
free(array[i]);
}
static char *correct(char *word,int *count) {
char **e1, **e2, *e1_word, *e2_word, *res_word = word;
int e1_rows, e2_rows,max_size;
e1_rows = edits1_rows(word);
if (e1_rows) {
e1 = edits1(word);
*count=(*count)*300;
e1_word = max(e1, e1_rows,*count);
if (e1_word) {
array_cleanup(e1, e1_rows);
free(e1);
return e1_word;
}
}
*count=(*count)/300;
if((*count>5000)||(strlen(word)<=4))
return res_word;
e2 = known_edits2(e1, e1_rows, &e2_rows);
if (e2_rows) {
*count=(*count)*3000;
e2_word = max(e2, e2_rows,*count);
if (e2_word)
res_word = e2_word;
}
array_cleanup(e1, e1_rows);
array_cleanup(e2, e2_rows);
free(e1);
free(e2);
return res_word;
}
I don't know why free() is not working. I am calling this function "correct" in a thread, and multiple threads are running simultaneously. I am using Ubuntu.

You don't show where you allocate the actual arrays, you just show where you allocate the array of pointers. So it is quite possible that you have leaks elsewhere in the code you are not showing.
Furthermore, array_cleanup leaks, since it only frees the individual arrays (whose allocation you don't show). It doesn't free the array of pointers itself. The last line of that function should have been free(array);.
Your main problem is that you are using an obscure allocation algorithm. Instead, allocate true dynamic 2D arrays.

Answer based on digging for further information in comments.
Most malloc implementations usually don't return the memory to the operating system, but rather keep it for future calls to malloc. This is done because returning the memory to the operating system can impact performance quite a lot.
Furthermore, if you have certain allocation patterns, the memory that malloc keeps might not be easily reusable by future calls to malloc. This is called memory fragmentation and is a large topic of research for designing memory allocators.
Whatever htop/top/ps reports is not how much memory you have currently allocated inside your program with malloc, but all the various allocations that all libraries did, their reserves and such, which could be much more than you've allocated.
If you want an accurate assessment of how much memory you are leaking, you need to use a tool like valgrind or see if maybe the malloc you're using has diagnostic tools to help you with that.
