I was reading in a book:
The virtual address space of a process on a 32 bit machine is 2^32 i.e. 4Gb of space. And every address seen in the program is a virtual address. The 4GB of space is further goes through user/kernel split 3-1GB.
To better understand this, I did malloc() of 5Gb space and tried to print the all addresses. If I print the addresses, How is the application going to print whole 5Gb address when It has only 3GB of virtual address space? Am I missing something here?
malloc() takes size_t as an argument. On 32 bit system it's an alias to some unsigned 32 bit integer type. This means that you just cannot pass any value bigger than 2^32-1 as an argument for malloc() making it impossible request allocation of more than 4GB of memory using this function.
The same is true for all other functions that can be used to allocate memory. Ultimately they all end up as either brk() or mmap syscall. The length argument of mmap() is also of type ssize_t an in case of brk() you have to provide a pointer for the new end of your allocated space. The pointer is again 32 bit.
So there is absolutely no way to tell kernel you would like to get more than 4GB of memory allocated with one call) And it's not an accident - this just wouldn't make any sense anyway.
Now it's true that you could do several calls to malloc or other function that allocates memory, requesting more than 4GB in total. If you try this, the subsequent call (that would cause extending allocated memory to more than 3GB) will fail as there is just no address space available.
So I guess that you either didn't check the malloc return value or you did try to run code like this (or something similar):
int main() {
assert(malloc(5*1<<30));
}
and assumed that you succeeded in allocating 5GB without verifying that your argument overflowed and instead of requesting 5368709120 bytes, you requested 1073741824. One example to verify this on Linux is to use:
$ ltrace ./a.out
__libc_start_main(0x804844c, 1, 0xbfbcea74, 0x80484a0, 0x8048490 <unfinished ...>
malloc(1073741824) = 0x77746008
$
There's already a good answer. Just in case, the size of your virtual address space is easily verifiable like this:
#include <stdlib.h>
#include <stdio.h>
int main()
{
size_t size = (size_t)-1L;
void *foo;
printf("trying to allocate %zu bytes\n", size);
if (!(foo = malloc(size)))
{
perror("malloc()");
}
else
{
free(foo);
}
}
> gcc -m32 -omalloc malloc.c && ./malloc
trying to allocate 4294967295 bytes
malloc(): Cannot allocate memory
This must fail because parts of your address space are already occupied: by the mapped part of the kernel, by mapped shared libraries and by your program, of course.
You cannot do this because there is no function for you to alloc 5GB memory.
Related
I'm studying dynamic memory allocation in C, and I want to ask a question - let us suppose we have a program that receives a text input from the user. We don't know how long that text will be, it could be short, it could also be extremely long, so we know that we have to allocate memory to store the text in a buffer. In cases in which we receive a very long text, is there a way to find out whether we have enough memory space to allocate more memory to the text? Is there a way to have an indication that there is no memory space left?
You can use malloc() function if it returned NULL that means there no enough mem space but if it returned address of the mem it means there are mem space available example:
void* loc = malloc(sizeof(string));
ANSI C has no standard functions to get the size of available free RAM.
You may use platform-specific solutions.
C - Check currently available free RAM?
In C we typically use malloc, calloc and realloc for allocation of dynamic memory. As it has been pointed out in both answers and comments these functions return a NULL pointer in case of failure. So typical C code would be:
SomeType *p = malloc(size_required);
if (p == NULL)
{
// ups... malloc failed... add error handling
}
else
{
// great... p now points to allocated memory that we can use
}
I like to add that on (at least) Linux systems, the return value from malloc (and friends) is not really an out-of-memory indicator.
If the return value is NULL, we know the call failed and that we didn't get any memory that we can use.
But even if the return value is non-NULL, there is no guarantee that the memory really is available.
From https://man7.org/linux/man-pages/man3/free.3.html :
By default, Linux follows an optimistic memory allocation
strategy. This means that when malloc() returns non-NULL there
is no guarantee that the memory really is available. In case it
turns out that the system is out of memory, one or more processes
will be killed by the OOM killer.
We don't know how long that text will be
Sure we do, we always set a maximum limit. Because all user input needs to be sanitised anyway - so we always require a maximum limit on every single user input. If you don't do this, it likely means that your program is broken since it's vulnerable to buffer overruns.
Typically you'll read each line of user input into a local array allocated on the stack. Then you can check if it is valid (are strings null terminated etc) before allocating dynamic memory and then copy it over there.
By checking the return value of malloc etc you'll see if there was enough memory left or not.
There is no standard library function that tells you how much memory is available for use.
The best you can do within the bounds of the standard library is to attempt the allocation using malloc, calloc, or realloc and check the return value - it it’s NULL, then the allocation operation failed.
There may be system-specific routines that can provide that information, but I don’t know of any off the top of my head.
I made a test on linux with 8GB RAM. The overcommit has three main modes 0, 1 and 2 which are default, unlimited, and never:
Default:
$ echo 0 > /proc/sys/vm/overcommit_memory
$ ./a.out
After loop: Cannot allocate memory
size 17179869184
size 400000000
log2(size)34.000000
This means 8.5 GB were successfuly allocated, just about the amount of physical RAM. I tried to tweak it, but without changing swap, which is only 4 GB.
Unlimited:
$ echo 1 > /proc/sys/vm/overcommit_memory
$ ./a.out
After loop: Cannot allocate memory
size 140737488355328
size 800000000000
log2(size)47.000000
48 bits is virtual address size. 140 TB. Physical is only 39 bits (500 GB).
No overcommmit:
$ echo 2 > /proc/sys/vm/overcommit_memory
$ ./a.out
After loop: Cannot allocate memory
size 2147483648
size 80000000
log2(size)31.000000
2 GB is just what free command declares as free. Available are 4.6 GB.
malloc() fails in the same way if the process's resources are restricted - so this ENOMEM does not really specify much. "Cannot allocate memory" (aka ENOMEM aka 12) just says "malloc failed, guess why" or rather "malloc failed, NO more MEMory for you now.".
Well here is a.out which allocates doubling sizes until error.
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <math.h>
int main() {
size_t sz = 4096;
void *p;
while (!errno) {
p = malloc(sz *= 2);
free(p);
}
perror("After loop");
printf("size %ld\n", sz);
printf("size %lx\n", sz);
printf("log2(size)%f\n", log2((double)sz));
}
But I don't think this kind of probing is very useful/
Buffer
we have to allocate memory to store the text in a buffer
But not the whole text at once. With a real buffer (not just an allocated memory as destination) you could read portions of the input and store them away (out of memory onto disk).
Only disadvantage is: if I cannot use a partial input, then all the buffered copying and saving is wasted.
I really wonder what happens if I type and type fast for a couple of billion years -- without a newline.
We can allocate much more than we have as RAM, but we only need a fraction of that RAM: the buffer. But Lundin's answer shows it is much easier (typical) to rely on newlines and maximum length.
getline(3)
This gnu/posix function has the malloc/realloc built in. The paramters are a bit complicated, because a new pointer and size can be returned by reference. And return value of -1 can also mean ENOMEM, not end-of-file.
fgets() is the line-truncating version.
fread() is newline independant, with fixed size. (But you asked about text input - long lines or long overall text, or both?)
Good Q, good As, good comments about "live input":
getline how to limit amount of input as you can with fgets
So I've got an interesting OS based problem for you. I've spent the last few hours conversing with anyone I know who's experienced with C programming, and nobody seems to be able to come up with a definitive answer as to why this behaviour is occurring.
I have a program that is intentionally designed to cause an extreme memory leak, (as an example of what happens when you don't free memory after allocating it). On 64 bit operating systems, (Windows, Linux, etc), it does what it should do. It fills physical ram, then fills the swap space of the OS. In Linux, the process is then terminated by the OS. In Windows however, it is not, and it continues running. The eventual result is a system crash.
Here's the code:
#include <stdlib.h>
#include <stdio.h>
void main()
{
while(1)
{
int *a;
a = (int*)calloc(65536, 4);
}
}
However, if you compile and run this code on a 32 bit Linux distribution, it has no effect on physical memory usage at all. It uses approximately 1% of my 4 GB of allocated RAM, and it never rises after that. I don't have a legitimate copy of 32 Bit Windows to test on, so I can't be certain this occurs on 32 bit Windows as well.
Can somebody please explain why the use of calloc will fill the physical ram of a 64 bit Linux OS, but not a 32 bit Linux OS?
The malloc and calloc functions do not technically allocate memory, despite their name. They actually allocate portions of your program's address space with OS-level read/write permissions. This is a subtle difference and is not relevant most of the time.
This program, as written, only consumes address space. Eventually, calloc will start returning NULL but the program will continue running.
#include <stdlib.h>
// Note main should be int.
int main() {
while (1) {
// Note calloc should not be cast.
int *a = calloc(65536, sizeof(int));
}
}
If you write to the addresses returned from calloc, it will force the kernel to allocate memory to back those addresses.
#include <stdlib.h>
#include <string.h>
int main() {
size_t size = 65536 * 4;
while (1) {
// Allocates address space.
void *p = calloc(size, 1);
// Forces the address space to have allocated memory behind it.
memset(p, 0, size);
}
}
It's not enough to write to a single location in the block returned from calloc because the granularity for allocating actual memory is 4 KiB (the page size... 4 KiB is the most common). So you can get by with just writing to each page.
What about the 64-bit case?
There is some bookkeeping overhead for allocating address space. On a 64-bit system, you get something like 40 or 48 bits of address space, of which about half can be allocated to the program, which comes to at least 8 TiB. On a 32-bit system this comes to 2 GiB or so (depending on kernel configuration).
So on a 64-bit system, you can allocate ~8 TiB, and a 32-bit system you can allocate ~2 GiB, and the overhead is what causes the problems. There is typically a small amount of overhead for each call to malloc or calloc.
See also Why malloc+memset is slower than calloc?
Is there a limit to the amount of memory that can be allocated from a program? By that I mean, is there any protection from a program, for example, that allocates memory in an infinite loop?
When would the call to malloc() return a NULL pointer?
Yes, there is a limit. What that limit is depends on many factors, including (but not limited to):
The instruction set of the program (32-bit binaries have a smaller address space than 64-bit binaries, for example).
How much memory the system has free. ("Memory" here includes virtual memory.)
Any artificial restrictions set by the system administrator or a privileged process (see, for example, setrlimit() and the (obsolete) ulimit() function).
When memory cannot be allocated, malloc() will return NULL. If the system is completely out of memory, your process may be terminated forcefully.
From Wikipedia,
The largest possible memory block malloc can allocate depends on the
host system, particularly the size of physical memory and the
operating system implementation. Theoretically, the largest number
should be the maximum value that can be held in a size_t type, which
is an implementation-dependent unsigned integer representing the size
of an area of memory. The maximum value is 2CHAR_BIT × sizeof(size_t)
− 1, or the constant SIZE_MAX in the C99 standard.
It depends on the operating system and the standard library.
On Linux,
When you run out of address space, malloc() will return NULL.
When you run out of both physical memory and swap space, the OOM killer will run and kill a process to free memory.
I am tackling this issue in Reverse.
See pointer stores addresses of memory blocks. If we able to find maximum address it can store then we can find memory allocated to our program.
Code
#include <stdio.h>
int main()
{
void *p;
printf("%zu",sizeof(p));
return 0;
}
Output
8
Understanding:
pointer size is 8 bytes.
8 bytes -> 64 bits
Max address it can store/ last memory block address: 2^64-1
Memory block addresses: 0, 1, 2, 3, ... 2^64-1
Memory allocated to program: 2^64 byte
Here's the code snippet:
void main() {
int i,*s;
for(i=1;i<=4;i++) {
s=malloc(sizeof(int));
printf("%lu \n",(unsigned long)s);
}
}
The size of int on my comp is 2 bytes, so shouldn't the printf command print address incremented by 16 bits, instead it prints the address as:
2215224120
2215224128
2215224136...
Why is this so?
How memory managed is entirely up to your operating system. It could allocate memory from all over the place, you can absolutely make no assumptions as to where the memory will be.
Most memory allocators also have some overhead, so even a simple 2-byte allocation might take up 8 bytes or more. Besides, addresses might need to be aligned for several reasons (like performance, and because some CPUs even crash when reading from unaligned addresses).
Bottom line - take the return value from malloc as it is, don't make any guesses or assumptions.
Its called alignment. Most CPUs have to align memory on some boundary, and its commonly 4 or 8. If you mis-align an address you will get a segfault or bus error.
malloc() does not provide any such guarantees. It just allocates some memory according to its own memory management decisions and returns you a pointer to that. In fact, many implementations use extra memory right before the pointer returned for memory management metadata.
malloc() gives you an abstraction on the underlying hardware, OS, drivers, etc. The memory allocation pattern may differ from machine to machine due to various parameters.
But the following are few things that always stays right about malloc()
The malloc() function allocates size bytes and returns a pointer to the allocated memory.
The memory is not initialized.
If size is 0,then malloc() returns either NULL, or a unique pointer value that can later be successfully passed to free().
The malloc() returns a pointer to the allocated memory that is suitably aligned for any kind of variable. On error, it returns NULL.
NULL may also be returned by a successful call to malloc() with a size of zero
On a side note, you can use %p format specifier for printing the pointers
I modified the program as follows
#include <stdlib.h>
int main(void) {
int i,*s;
printf("sizeof(int) = %zu \n", sizeof(int));
for(i=1;i<=4;i++) {
if ((s=malloc(sizeof(int))) == NULL) {
printf("unable to allocate memory \n");
return -1;
}
printf("%p \n",s);
}
return 0;
}
The output is as follows:
$ ./a.out
sizeof(int) = 4
0x9d5a008
0x9d5a018
0x9d5a028
0x9d5a038
$
You have no guarantees whatsoever about the pattern of addresses malloc returns to you.
I thought that I couldn't retrieve the length of an allocated memory block like the simple .length function in Java. However, I now know that when malloc() allocates the block, it allocates extra bytes to hold an integer containing the size of the block. This integer is located at the beginning of the block; the address actually returned to the caller points to the location just past this length value. The problem is, I can't access that address to retrieve the block length.
#include <stdlib.h>
#include <stdio.h>
int main(void)
{
char *str;
str = (char*) malloc(sizeof(char)*1000);
int *length;
length = str-4; /*because on 32 bit system, an int is 4 bytes long*/
printf("Length of str:%d\n", *length);
free(str);
}
**Edit:
I finally did it. The problem is, it keeps giving 0 as the length instead of the size on my system is because my Ubuntu is 64 bit. I changed str-4 to str-8, and it works now.
If I change the size to 2000, it produces 2017 as the length. However, when I change to 3000, it gives 3009. I am using GCC.
You don't have to track it by your self!
size_t malloc_usable_size (void *ptr);
But it returns the real size of the allocated memory block!
Not the size you passed to malloc!
What you're doing is definitely wrong. While it's almost certain that the word just before the allocated block is related to the size, even so it probably contains some additional flags or information in the unused bits. Depending on the implementation, this data might even be in the high bits, which would cause you to read the entirely wrong length. Also it's possible that small allocations (e.g. 1 to 32 bytes) are packed into special small-block pages with no headers, in which case the word before the allocated block is just part of another block and has no meaning whatsoever in relation to the size of the block you're examining.
Just stop this misguided and dangerous pursuit. If you need to know the size of a block obtained by malloc, you're doing something wrong.
I would suggest you create your own malloc wrapper by compiling and linking a file which defines my_malloc() and then overwiting the default as follows:
// my_malloc.c
#define malloc(sz) my_malloc(sz)
typedef struct {
size_t size;
} Metadata;
void *my_malloc(size_t sz) {
size_t size_with_header = sz + sizeof(Metadata);
void* pointer = malloc(size_with_header);
// cast the header into a Metadata struct
Metadata* header = (Metadata*)pointer;
header->size = sz;
// return the address starting after the header
// since this is what the user needs
return pointer + sizeof(Metadata);
}
then you can always retrieve the size allocated by subtracting sizeof(Metadata), casting that pointer to Metadata and doing metadata->size:
Metadata* header = (Metadata*)(ptr - sizeof(Metadata));
printf("Size allocated is:%lu", header->size); // don't quote me on the %lu ;-)
You're not supposed to do that. If you want to know how much memory you've allocated, you need to keep track of it yourself.
Looking outside the block of memory returned to you (before the pointer returned by malloc, or after that pointer + the number of bytes you asked for) will result in undefined behavior. It might work in practice for a given malloc implementation, but it's not a good idea to depend on that.
This is not Standard C. However, it is supposed to work on Windows operatings systems and might to be available on other operating systems such as Linux (msize?) or Mac (alloc_size?), as well.
size_t _msize( void *memblock );
_msize() returns the size of a memory block allocated in the heap.
See this link:
http://msdn.microsoft.com/en-us/library/z2s077bc.aspx
This is implementation dependent
Every block you're allocating is precedeed by a block descriptor. Problem is, it dependends on system architecture.
Try to find the block descriptor size for you own system. Try take a look at you system malloc man page.