I know that on your hard drive, if you delete a file, the data is not (instantly) gone. The data is still there until it is overwritten. I was wondering if a similar concept existed in memory. Say I allocate 256 bytes for a string, is that string still floating in memory somewhere after I free() it until it is overwritten?
Your analogy is correct. The data in memory doesn't disappear or anything like that; the values may indeed still be there after a free(), though attempting to read from freed memory is undefined behaviour.
Generally, it does stay around, unless you explicitly overwrite the string before freeing it (like people sometimes do with passwords). Some library implementations automatically overwrite deallocated memory to catch accesses to it, but that is usually done only in debug builds, not in release mode.
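To illustrate the "overwrite before freeing" idea, here is a minimal sketch. The helper names wipe and wipe_and_free are made up for this example, and the caller is assumed to know the size of the buffer. Writing through a volatile pointer is a common way to discourage the compiler from optimizing the stores away, but it is not a formal guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: overwrite n bytes at p with zeros.
   The volatile qualifier discourages the compiler from removing
   stores to memory that is about to be freed. */
void wipe(void *p, size_t n)
{
    assert(p != NULL || n == 0);
    volatile unsigned char *bytes = p;
    for (size_t i = 0; i < n; i++) {
        bytes[i] = 0;
    }
}

/* Hypothetical helper: wipe a buffer of known size, then release it. */
void wipe_and_free(void *p, size_t n)
{
    if (p == NULL) {
        return;
    }
    wipe(p, n);
    free(p);
}
```

You would call wipe_and_free(password, password_size) instead of free(password). Where available, platform functions designed for this purpose are probably a better choice than a hand-written loop.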
The answer depends highly on the implementation. On a good implementation, it's likely that at least the beginning (or the end?) of the memory will be overwritten with bookkeeping information for tracking free chunks of memory that could later be reused. However the details will vary. If your program has any level of concurrency/threads (even in the library implementation you might not see), then such memory could be clobbered asynchronously, perhaps even in such a way that even reading it is dangerous. And of course the implementation of free might completely unmap the address range from the program's virtual address space, in which case attempting to do anything with it will crash your program.
From a standpoint of an application author, you should simply treat free according to the specification and never access freed memory. But from the standpoint of a systems implementor or integrator, it might be useful to know (or design) the implementation, in which case your question is then interesting.
If you want to verify the behaviour for your implementation, the simple program below will do that for you.
#include <stdio.h>
#include <stdlib.h>
/* The number of memory bytes to test */
#define MEM_TEST_SIZE 256
/* Print every byte of mem as a decimal value */
void outputMem(unsigned char *mem, int length)
{
    int i;
    for (i = 0; i < length; i++) {
        printf("[%02d]", mem[i]);
    }
}
/* Count the bytes that no longer hold the pattern written in main() */
int bytesChanged(unsigned char *mem, int length)
{
    int i;
    int count = 0;
    for (i = 0; i < length; i++) {
        if (mem[i] != i % 256)
            count++;
    }
    return count;
}
int main(void)
{
    int i;
    unsigned char *mem = malloc(MEM_TEST_SIZE);
    if (mem == NULL)
        return 1;
    /* Fill memory with bytes */
    for (i = 0; i < MEM_TEST_SIZE; i++) {
        mem[i] = i % 256;
    }
    printf("After malloc and copy to new mem location\n");
    printf("mem = %p\n", (void *)mem);
    printf("Contents of mem: ");
    outputMem(mem, MEM_TEST_SIZE);
    free(mem);
    /* Everything below reads freed memory on purpose: undefined behaviour */
    printf("\n\nAfter free()\n");
    printf("mem = %p\n", (void *)mem);
    printf("Bytes changed in memory = %d\n", bytesChanged(mem, MEM_TEST_SIZE));
    printf("Contents of mem: ");
    outputMem(mem, MEM_TEST_SIZE);
    return 0;
}
Follow these two questions:
Kernel zeroes memory?
If the heap is zero-initialized for security then why is the stack merely uninitialized?
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h> /* memset */
const size_t n = 4;
const size_t m = 0x10;
int main(void)
{
    int *p = malloc(m*sizeof(int));
    printf("%p ", (void *)p);
    /* reads memory that malloc did not initialize */
    for (size_t i = 0; i < m; ++i) {
        printf("%d", p[i]);
    }
    printf("\n");
    memset(p,9,m*sizeof(int));
    free(p);
    int *v = malloc(m*sizeof(int));
    printf("%p ", (void *)v);
    /* again reads uninitialized memory, possibly the block freed above */
    for (size_t j = 0; j < m; ++j) {
        printf("%x", v[j]);
    }
    printf("\n");
    free(v);
    return 0;
}
OUTPUT:
0xaaaae7082260 0000000000000000
0xaaaae7082260 0090909099090909909090990909099090909909090990909099090909909090990909099090909909090990909099090909
I have a question: within a process, the memory returned by the first call to malloc is zeroed. But when I call malloc again after freeing that first block, the new block has the same virtual address and the same contents as the first one.
My question: how does the kernel know that memory is being assigned to a process for the first time and needs to be zeroed?
And how does the kernel know that memory is being reassigned to the same process and doesn't need to be cleared?
Getting a chunk of memory from the OS for your memory pool and reusing memory already in your memory pool are two different things.
The OS may zero the memory when you first get it but it is up to the "malloc" implementation whether it zeros memory (either on free or malloc).
The answer to "how does the kernel know that the memory is first assigned to a process" is that the process (via the C library) makes a request to the kernel to allocate it some memory, so the kernel knows that the memory should not reveal its previous contents (and zeroing the allocated memory is one way of ensuring that information does not leak between processes).
The answer to "how does the kernel know that the memory is reassigned …" is "it doesn't" — that information is private to the process and the kernel has no knowledge of what the process does to reuse the memory.
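As a rough illustration of the "fresh memory from the kernel" half of this, the sketch below bypasses malloc and asks the kernel for anonymous pages directly. It assumes a Linux-like system where mmap supports MAP_ANONYMOUS and documents such mappings as zero-initialized; fresh_pages and count_nonzero are hypothetical helper names.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1 /* exposes MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: request len bytes of fresh anonymous memory
   straight from the kernel. Returns NULL on failure. */
void *fresh_pages(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: count the non-zero bytes in a region. */
size_t count_nonzero(const void *p, size_t len)
{
    const unsigned char *bytes = p;
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (bytes[i] != 0) {
            count++;
        }
    }
    return count;
}
```

On such a system count_nonzero(fresh_pages(len), len) should be 0, because the pages are new to the process. A block that malloc recycles from its own pool never goes through this path, which is why old contents can show up again.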
To obtain the length of a null-terminated string, we simply write len = strlen(str). However, I often see posts here on SO saying that to get the size of an int array, for example, you need to keep track of it on your own, and that's what I normally do. But I have a question: could we obtain the size by using some sort of write permission check that tests whether we have permission to write to a block of memory? For example:
#include <stdio.h>
#include <stdbool.h>
int getSize(int *arr);
bool permissionTo(int *ptr);
int main(void)
{
int arr[3] = {1,2,3};
int size = getSize(arr) * sizeof(int);
}
int getSize(int *arr)
{
int *ptr = arr;
int size = 0;
while( permissionTo(ptr) )
{
size++;
ptr++;
}
return size;
}
bool permissionTo(int *ptr)
{
/*............*/
}
No, you can't. Memory permissions don't have this granularity on most, if not all, architectures.
Almost all CPU architectures manage memory in pages. On most things you'll run into today one page is 4kB. There's no practical way to control permissions on anything smaller than that.
Most memory management is done by your libc allocating a largish chunk of memory from the kernel and then handing out smaller chunks of it to individual malloc calls. This is done for performance (among other things), because creating, removing or modifying a memory mapping is an expensive operation, especially on multiprocessor systems.
For the stack (as in your example), allocations are even simpler. The kernel knows that "this large area of memory will be used by the stack", and memory accesses to it simply cause the kernel to allocate the pages needed to back it. All the tracking your program does of stack allocations is one register.
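To make the page granularity concrete, here is a small sketch that asks the system for its page size and computes the start of the page containing a given address. It assumes a POSIX system with sysconf and a power-of-two page size; page_size and page_start are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: page size reported by the system, in bytes. */
long page_size(void)
{
    long size = sysconf(_SC_PAGESIZE);
    assert(size > 0);
    return size;
}

/* Hypothetical helper: start address of the page that contains p.
   Assumes the page size is a power of two. */
uintptr_t page_start(const void *p)
{
    uintptr_t mask = (uintptr_t)page_size() - 1;
    return (uintptr_t)p & ~mask;
}
```

A 12-byte array like arr in the question sits somewhere inside one such page, next to many other writable objects, so a permission probe would keep succeeding long after it walked off the end of the array.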
If you are trying to make an allocation more comfortable to use by having it carry its own size around, then do this:
Wrap malloc and free by prefixing the memory with its size internally (written from memory, not tested yet):
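#include <stdlib.h>
/* Allocates numBytes plus a hidden long in front that stores the size.
   Note: the returned pointer is aligned for long, but not necessarily
   for types with stricter alignment such as long double. */
void* myMalloc(long numBytes) {
    char* mem = malloc(numBytes+sizeof(long));
    if (mem == NULL)
        return NULL;
    ((long*)mem)[0] = numBytes;
    return mem+sizeof(long);
}
/* Steps back to the hidden size field and frees the whole block */
void myFree(void* memory) {
    char* mem = (char*)memory-sizeof(long);
    free(mem);
}
/* Returns the size that was passed to myMalloc */
long memlen(void* memory) {
    char* mem = (char*)memory-sizeof(long);
    return ((long*)mem)[0];
}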
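One caveat with a size prefix is alignment: if the prefix is smaller than the strictest alignment malloc guarantees, the pointer handed to the caller may be misaligned for some types. A possible variant, assuming a C11 compiler (for max_align_t), pads the header with a union. The names SizeHeader, sized_malloc, sized_len and sized_free are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Header stored in front of every block. The union with max_align_t pads
   the header so the bytes after it stay aligned for any object type. */
typedef union {
    size_t size;
    max_align_t align;
} SizeHeader;

/* Allocate num_bytes of user memory preceded by a hidden size header. */
void *sized_malloc(size_t num_bytes)
{
    if (num_bytes > SIZE_MAX - sizeof(SizeHeader)) {
        return NULL; /* the addition below would overflow */
    }
    SizeHeader *header = malloc(sizeof(SizeHeader) + num_bytes);
    if (header == NULL) {
        return NULL;
    }
    header->size = num_bytes;
    return header + 1; /* first byte after the header */
}

/* Return the size that was requested from sized_malloc. */
size_t sized_len(const void *memory)
{
    assert(memory != NULL);
    return ((const SizeHeader *)memory - 1)->size;
}

/* Free a block obtained from sized_malloc. */
void sized_free(void *memory)
{
    if (memory != NULL) {
        free((SizeHeader *)memory - 1);
    }
}
```

The cost is a few more bytes per allocation. As with the version above, this only works for pointers that came from the wrapper, never for plain malloc pointers or stack arrays.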
I am relatively new to 'C' and would appreciate some insight on this topic.
Basically, I am trying to create a 16 MB array and check if the memory content is initialized to zero or '\0' or some garbage value for a school project.
Something like this:
char buffer[16*1024*1024];
I know there is a limit on the size of the program stack and obviously I get a segmentation fault. Can this somehow be done using malloc()?
You can allocate the memory with malloc like so (note that malloc does not initialize it):
#define MEM_SIZE_16MB ( 16 * 1024 * 1024 )
char *buffer = malloc(MEM_SIZE_16MB * sizeof(char) );
if (buffer == NULL ) {
// unable to allocate memory. stop here or undefined behavior happens
}
You can then check the values in memory like so (note that this will print for a very, very long time):
for (int i = 0; i < MEM_SIZE_16MB; i++) {
    if( i%16 == 0 ) {
        // print a newline and the memory address every 16 bytes so
        // it's a little easier to read
        printf("\nAddr: %p: ", (void *)&buffer[i]);
    }
    // cast so that a signed char is not sign-extended when printed
    printf("%02x ", (unsigned char)buffer[i]);
}
printf("\n"); // one final newline
Don't forget to free the memory when finished
free(buffer);
Yes, you will probably need to do this using malloc(), and here's why:
When any program (process ... thread ...) is started, it is given a chunk of memory which it uses to store (among other things ...) "local" variables. This area is called "the stack." It most-certainly won't be big enough to store 16 megabytes.
But there's another area of memory which any program can use: its "heap." This area (as the name, "heap," is intended to imply ...) has no inherent structure: it's simply a pool of storage, and it's usually big enough to store many megabytes. You simply malloc() the number of bytes you need, and free() those bytes when you're through.
Simply define a type that corresponds to the structure you need to store, then malloc(sizeof(type)). The storage will come from the heap. (And that, basically, is what the heap is for ...)
Incidentally, there's a library function called calloc() which will reserve an area that is "known zero." Furthermore, it might use clever operating-system tricks to do so very efficiently.
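A minimal sketch of the calloc() route for the 16 MB case might look like the following. The helper names zeroed_buffer and all_zero and the BUFFER_SIZE macro are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BUFFER_SIZE ((size_t)16 * 1024 * 1024) /* 16 MB */

/* Hypothetical helper: return a zero-filled buffer of size bytes, or NULL
   if the allocation fails. calloc guarantees all bytes are zero. */
unsigned char *zeroed_buffer(size_t size)
{
    assert(size > 0);
    return calloc(size, sizeof(unsigned char));
}

/* Hypothetical helper: return 1 if every byte is zero, 0 otherwise. */
int all_zero(const unsigned char *buffer, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (buffer[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Because calloc() initializes the memory, reading it back is well defined, unlike reading a buffer that came straight from malloc(). Remember to free() the buffer when you are done.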
Sure:
int memSize = 16*1024*1024;
char* buffer = malloc( memSize );
if ( buffer != 0 )
{
// check contents
for ( int i = 0; i < memSize; i++ )
{
if ( buffer[i] != 0 )
{
// holler
break;
}
}
free( buffer );
}
Strictly speaking, code can not check if buffer is not zeroed without risking undefined behavior. Had the type been unsigned char, then no problem. But char, which may be signed, may have a trap value. Attempting to work with that value leads to UB.
char buffer[16*1024*1024];
// Potential UB
if (buffer[0]) ...
Better to use unsigned char which cannot have trap values.
#define N (16LU*1024*1024)
unsigned char *buffer = malloc(N);
if (buffer) {
for (size_t i = 0; i<N; i++) {
if (buffer[i]) Note_NonZeroValue();
}
}
// Clean-up when done.
free(buffer);
buffer = 0;
The tricky thing about C is that even if char does not have a trap value, some smart compiler could identify code is attempting something that is UB per the spec and then optimize if (buffer[0]) into nothing. Reading uninitialized non-unsigned char data is a no-no.
Is there a way to actually create dynamic arrays in C without having to use the stdlib?
Malloc requires the stdlib.h library, I am not allowed to use this in my project.
If anyone has any ideas, please share? Thanks.
malloc is not just a library, it is the way you interface with the Operating System to ask for more memory for the running process. Well, you could ask more memory and manage free/occupied memory yourself, but it would be wrong on many levels.
But I am inclined to believe that your project is going to run on some kind of platform which does not have an operating system, is it? [1] In that case, the quickest solution is to first statically allocate some memory in a big global array, and every time you need memory you ask a manager responsible for this big array.
Let me give you an example, for the sake of simplicity it will be tiny and not very functional, but it is a very good quick start.
#include <stddef.h> /* NULL */
#include <stdio.h> /* printf */
typedef char bool;
#define BLOCK_SIZE 1024 //Each allocation can be at most 1 KB
#define MAX_DATA (1024*1024*10) //Our program statically allocates 10MB
#define BLOCKS (MAX_DATA/BLOCK_SIZE)
typedef char Scott_Block[BLOCK_SIZE];
Scott_Block Scott_memory[BLOCKS];
bool Scott_used_memory[BLOCKS]; //1 if the block is in use, 0 if it is free
void* Scott_malloc(unsigned size) {
    if( size > BLOCK_SIZE )
        return NULL;
    unsigned int i;
    //linear search for the first free block
    for(i=0;i<BLOCKS;++i) {
        if( Scott_used_memory[i] == 0 ) {
            Scott_used_memory[i] = 1;
            return Scott_memory[i];
        }
    }
    return NULL;
}
void Scott_free(void* ptr) {
    //index of the block that ptr points into
    unsigned int pos = ((char*)(ptr)-Scott_memory[0])/BLOCK_SIZE;
    printf("Pos %u\n",pos);
    Scott_used_memory[pos] = 0;
}
I wrote this code to show how to emulate a memory manager. Let me point out a few improvements that may be done to it.
First, Scott_used_memory could be a bitmap instead of a bool array.
Second, it cannot allocate more than BLOCK_SIZE bytes at once; it should search for consecutive blocks to create a bigger block. But for that you would need more control data to record how many blocks an allocated void* occupies.
Third, the way free memory is searched (linearly) is very slow; usually the free blocks form a linked list.
But, like I said, this is a great quick start. And depending on your needs this may fulfill it very well.
1 If not, then you have absolutely no reason to not use malloc.
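The third improvement mentioned above (a linked list of free blocks) could be sketched roughly as follows. This is a separate toy pool, not a drop-in change to the Scott_ code; all pool_ names and sizes are invented for the example, and it avoids stdlib.h.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCK_SIZE 1024
#define POOL_BLOCKS 64

/* A free block stores the link to the next free block inside itself,
   so the list needs no extra memory. */
typedef union PoolBlock {
    union PoolBlock *next_free;
    unsigned char bytes[POOL_BLOCK_SIZE];
} PoolBlock;

static PoolBlock pool[POOL_BLOCKS];
static PoolBlock *free_list = NULL;
static int pool_ready = 0;

/* Thread every block onto the free list once. */
static void pool_init(void)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        pool[i].next_free = free_list;
        free_list = &pool[i];
    }
    pool_ready = 1;
}

/* Pop the first free block: constant time instead of a linear scan. */
void *pool_malloc(size_t size)
{
    if (!pool_ready) {
        pool_init();
    }
    if (size > POOL_BLOCK_SIZE || free_list == NULL) {
        return NULL;
    }
    PoolBlock *block = free_list;
    free_list = block->next_free;
    return block;
}

/* Push the block back onto the front of the free list. */
void pool_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    PoolBlock *block = ptr;
    assert(block >= pool && block < pool + POOL_BLOCKS);
    block->next_free = free_list;
    free_list = block;
}
```

Both operations only touch the head of the list, so their cost does not grow with the number of blocks. Like the original, this still cannot serve requests larger than one block.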
Well why not this program (C99)
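#include <stdio.h>
int main(int argc, char *argv[])
{
    int sizet, i;
    printf("Enter size:");
    /* a variable-length array needs a positive size, so reject bad input */
    if (scanf("%d",&sizet) != 1 || sizet <= 0)
        return 1;
    int array[sizet]; /* variable-length array (C99) */
    for(i = 0; i < sizet; i++){
        array[i] = i;
    }
    for(i = 0; i < sizet; i++){
        printf("%d ", array[i]);
    }
    printf("\n");
    return 0;
}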
Like a boss! :-)