why when declare a dynamic variable, reserves more than specified - c

I have the following code
char *str = (char *) malloc(sizeof(char)*5);
printf("Enter a string: ");
scanf("%s", str);
printf("%s\n", str);
This code supposed will reserve 5 places in memory ex: 5 * 8 bit, this mean that will stores five characters.
Now, when enter any number of characters (not up to five only), does not occur any error whether in compile time or in run time.
is this normal? or there is an error I did not understand in my code ?

C will not prevent you from shooting yourself in the foot. scanf will happily overwrite the buffer given to it, invoking undefined behavior. This error is not reliably detectable at runtime and will silently corrupt memory and break the runtime of your application in unpredictable ways.
It is your responsibility as the programmer to prevent this from happening - in this case, for example, by replacing scanf with much safer fgets.

You've allocated 5 bytes, but scanf will happily continue writing into *un*allocated memory. This is a buffer overflow, and the C runtime assumes you know what you are doing; no bounds checking is performed.
Don't use scanf. Use fgets to read a line of at most 5 bytes:
char *str = malloc(5);
fgets(str, 5, stdin);
If you type a line with more than 4 characters, fgets simply discards the extra characters.

You have indeed allocated space only for 5 bytes (i.e., strings up to 4 characters + the terminating NUL), but your scanf does not know that. It will blissfully overflow the allocated buffer. You are responsible for ensuring that this does not happen; you need to change your code accordingly. Overflowing the buffer is undefined behaviour so in theory “anything” may happen or not happen as a consequence, but in practice it tends to overwrite other things in adjacent memory, corrupting the contents of other variables and possibly even program code (leading to an exploit where a malicious user can craft the input string so as to execute arbitrary code).
(As an additional note, sizeof char is always 1 by definition, you do not need to multiply with it.)

You asked for 5 chars, you get memory that can contain at least 5 chars; the allocator is free to give you more memory for its internal reasons, but there's no standard way to know how much more it gave to you.
Besides, normally there's no immediate error even if you actually overflow a buffer like you did - the standard does not mandate bounds checking, it just says that this is "undefined behavior", i.e. anything can happen, from "your program seems to work" to "universe death" passing through nasal demons.
What actually happens in most implementation is that you will happily write over whatever happens to be after your buffer - typically other local variables or the return address for stack variables, other memory blocks and allocator's data structures for heap allocations. The effect usually is "impossible" bugs due to changing unrelated variables, heap corruption (typically discovered when you call free), segmentation faults and the like.
You must be very careful with this kind of errors, since buffer overflows not only undermine the stability of your application, but can also be exploited for security breaches. Thus, never carelessly write in a buffer - always use functions that allow you to specify the total size of the buffer and that stop at its boundaries.

When you allocate memory dynamically, more space will be allocated than the specified. Generally malloc implementations round the size requested to the multiple of 8 or 16 or some other 2^n. That's may be the one of the reason that you are not getting any error.

Related

How come this piece of code is working? It should crash due to buffer over-run but I am getting the output as "stackoverflow"

Thisprogram should crash due to buffer overrun. But I am getting output as "stackoverflow". How?
#include<stdio.h>
#include<string.h>
int main()
{
char *src;
char dest[10];
src = (char*)malloc(5);
strcpy(src, "stackoverflow");
printf("%s\n", src);
return 0;
}
It does crash due to a buffer overrun.
The behaviour of your code is undefined as you are overrunning your buffer. You can't expect the behaviour to be in any way predictable.
It's difficult - and not required by the c standard - to issue an appropriate diagnostic in such cases.
Buffer overflows are not guaranteed to crash you: they cause undefined behavior. While a lot of platforms make the sequence of events that may or may not culminate in a crash rather predictable, one very important thing to consider is that the possible crash almost never happens at the same time that the damage is caused.
In a stack buffer overflow, possible crashes happens when you read the value of a variable that sat on the stack and was overflowed onto, or when you return from the function and the return address has been overwritten.
However, you're not overflowing a stack buffer: you're overflowing a heap buffer that you got from malloc. Typically, possible crashes there happens when you free that buffer or try to use a buffer that happened to be contiguous to it (there is, on purpose, no way to predict this). You allocate only one buffer and never free it, so you're not going to observe any problem from a small overflow.
In addition, I don't know any mainstream malloc implementation on desktops that returns blocks of less than 32 bytes, so even though you said malloc(5), you probably have room for 32 bytes, so your short write is not overflowing on anything (although you must not rely on this).
The only case where an overflow will straight-up crash your program is if you overflow to a memory location that has not been assigned any meaning. For instance, if you do something like memset('c', dest, 100000000), that will probably happen because you'll be busting out of the memory area that is reserved to the stack and there is probably nothing next to it.
Copying to a buffer that is too small is undefined behavior; that doesn't necessarily mean it's guaranteed to crash. For all we know those other bytes occupying the "overflow\0" part of your string aren't being used anyway.
Because unless you are using some overrun-protection library/debugging tool, nothing will notice that you’re writing to memory you shouldn’t be. If you run this under valgrind it will display that you wrote to memory you shouldn’t have. But malloc(5) returns a pointer into a likely larger block of memory, so the chances of the buffer overflow resulting in trying to access an unmapped address is low. But if you had other malloc() calls, etc., you might notice the "overflow" part ending up in one of those other buffers—but it really depends on the implementation of malloc() and what code that overflow breaks won’t be deterministic.
Your buffer is allocated in the heap so your pointer src is pointing to buffer of char basicly of size 5 bytes because the size of char is 1 byte, however if the size of this allocated buffer + the added size by copying the string into this buffer doesn't exceed the size of the heap then it will work ,in the other hand if the total size try to overwrite an allocat memory by other pointer then you get the crash or the size exceed the heap size limitation you get the crash
As conclusion avoid this kind of code because you will get an unexpected behavior.

Char array can hold more than expected

I tried to run this code in C and expected runtime error but actually it ran without errors. Can you tell me the reason of why this happens?
char str[10];
scanf("%s",str);
printf("%s",str);
As I initialized the size of array as 10, how can code prints string of more than 10 letters?
As soon as you read or write from an array outside of its bounds, you invoke undefined behavior.
Whenever this happens, the program may do anything it wants. It is even allowed to play you a birthday song although it's not your birthday, or to transfer money from your bank account. Or it may crash, or delete files, or just pretend nothing bad happened.
In your case, it did the latter, but it is in no way guaranteed.
To learn further details about this phenomenon, read something about exploiting buffer overflows, this is a large topic.
C doesn't perform any bounds checking on the array. This can lead to buffer overflows attack on your executable.
The bound checking should be done at the user end to make it anti-buffer overflow.
Instead of typing in magic numbers when taking input from fgets in an array, always use the sizeof(array) - 1 operator on the array to take in that much, -1 for leaving a space for '\0' character.
This is a good question. And the answer
is that it there is indeed a memory problem
The string is read and stored from the address of str
up until the length of the actual read string demands,
and it exceeds the place you allocated for it.
Now, it may be not crash immediately, or even ever for
short programs, but it's very likely that when you expand
the program, and define other variables, this string will
overrun them, creating weird bugs of all kinds, and it may
eventually also crash.
In short, this is a real error, but it's not uncommon to have
memory bugs like this one which do not affect at first, but
do create bugs or crash the program later.

How can gets() exceed memory allocated by malloc()?

I have a doubt in malloc and realloc function, When I am using the malloc function for
allocating the memory for the character pointer 10 Bytes. But when I am assigning the value
for that character pointer, it takes more than 10 bytes if I try to assign. How it is possible.
For example:
main()
{
char *ptr;
ptr=malloc(10*sizeof(char));
gets("%s",ptr);
printf("The String is :%s",ptr);
}
Sample Output:
$./a.out
hello world this is for testing
The String is :hello world this is for testing
Now look at the output the number of characters are more than 10 bytes.
How this is possible, I need clear explanation.
Thanks in Advance.
That's why using gets is an evil thing. Use fgets instead.
malloc has nothing to do with it.
Don't use gets().
Admittedly, rationale would be useful, yes? For one thing, gets() doesn't allow you to specify the length of the buffer to store the string in. This would allow people to keep entering data past the end of your buffer, and believe me, this would be Bad News.
Detailed explanation:
First, let's look at the prototype for this function:
#include <stdio.h>
char *gets(char *s);
You can see that the one and only parameter is a char pointer. So then, if we make an array like this:
char buf[100];
we could pass it to gets() like so:
gets(buf)
So far, so good. Or so it seems... but really our problem has already begun. gets() has only received the name of the array (a pointer), it does not know how big the array is, and it is impossible to determine this from the pointer alone. When the user enters their text, gets() will read all available data into the array, this will be fine if the user is sensible and enters less than 99 bytes. However, if they enter more than 99, gets() will not stop writing at the end of the array. Instead, it continues writing past the end and into memory it doesn't own.
This problem may manifest itself in a number of ways:
No visible affect what-so-ever
Immediate program termination (a crash)
Termination at a later point in the programs life time (maybe 1 second later, maybe 15 days later)
Termination of another, unrelated program
Incorrect program behavior and/or calculation
... and the list goes on. This is the problem with "buffer overflow" bugs, you just can't tell when and how they'll bite you.
You just got an undefined behavior. (More information here)
Use fgets and not gets !
malloc reserves memory for your use. The rules are that you are permitted to use memory allocated this way and in other ways (as by defining automatic or static objects) and you are not permitted to use memory not allocated for your use.
However, malloc and the operating system do not completely enforce these rules. The obligation to obey them belongs to you, not to malloc or the operating system.
General-purpose operating systems have memory protection that prevents one process from reading or altering the memory of another process without permission. It does not prevent one process from reading or altering its own memory in improper ways. When you access bytes that you are not supposed to, there is no mechanism that always prevents this. The memory is there, and you can access it, but you should not.
gets is a badly designed routine, because it will write any amount of memory if the input line is long enough. This means you have no way to prevent it from writing more memory than you have allocated. You should use fgets instead. It has a parameter that limits the amount of memory it may write.
General-purpose operating systems allocate memory in chunks known as pages. The size of a page might be 4096 bytes. When malloc allocates memory for you, the smallest size it can get from the operating system is one page. When you ask for ten bytes, malloc will get a page, if necessary, select ten bytes in it, keep records that a small portion of the page has been allocated but the rest is available for other use, and return a pointer to the ten bytes to you. When you do further allocations, malloc might allocate additional bytes from the same page.
When you overrun the bytes that have been allocated to you, you are violating the rules. If no other part of your process is using those bytes, you might get away with this violation. But you might alter data that malloc is using to keep track of allocations, you might alter data that is part of a separate allocation of memory, or you might, if you go far enough, alter data that is in a separate page completely and is in use by a completely different part of your program.
A general-purpose operating system does provide some protection against improper use of memory within your process. If you attempt to access pages that are not mapped in your virtual address space or you attempt to modify pages that are marked read-only, a fault will be triggered, and your process will be interrupted. However, this protection only applies at a page level, and it does not protect against you incorrectly using the memory that is allocated to your process.
The malloc will reserve 10 bytes (in your case assuming the char have 1 byte) and will return the start point of the reserved area.
You do a gets, so it get the text you typed and write using your pointer.
Windows/Mac os x/ Unix (Advances OS'S) have protected memory.
That means, when you do a malloc/new the OS reserve that memory area for your program. IF another program tries to write in that area an segmentation fault happens because you wrote on an area that you should not write.
You reserved 10 bytes.
IF the byte 11, 12, 13, 14 are not yet reserved for another program it will not crash, if it is your program will access an protected area and crash.
OP: ... number of characters are more than 10 bytes. How this is possible?
A: Writing outside allocated memory as done by gets() is undefined behavior - UB. UB ranges from working just as you want to crash-and-burn.
The real issue is not the regrettable use of gets(), but the idea that C language should prevent memory access mis-use. C does not prevent it. The code should prevent it. C is not a language with lots of behind-the-scenes protection. If writing to ptr[10] is bad, don't do it. Don't call functions that might do it such as gets(). Like many aspects of life - practice safe computing.
C gives you lots of rope to do all sorts of things including enough rope to hang yourself.

malloc() in C not working as expected

I'm new to C. Sorry if this has already been answered, I could'n find a straight answer, so here we go..
I'm trying to understand how malloc() works in C. I have this code:
#define MAXLINE 100
void readInput(char **s)
{
char temp[MAXLINE];
printf("Please enter a string: ");
scanf("%s", temp);
*s = (char *)malloc((strlen(temp)+1)*sizeof(char)); // works as expected
//*s = (char *)malloc(2*sizeof(char)); // also works even when entering 10 chars, why?
strcpy ((char *)*s, temp);
}
int main()
{
char *str;
readInput(&str);
printf("Your string is %s\n", str);
free(str);
return 0;
}
The question is why doesn't the program crash (or at least strip the remaining characters) when I call malloc() like this:
*s = (char *)malloc(2*sizeof(char)); // also works even when entering 10 chars, why?
Won't this cause a buffer overflow if I enter a string with more than two characters? As I understood malloc(), it allocates a fixed space for data, so surely allocating the space for only two chars would allow the string to be maximum of one usable character ('0\' being the second), but it still is printing out all the 10 chars entered.
P.S. I'm using Xcode if that makes any difference.
Thanks,
Simon
It works out fine because you're lucky! Usually, a block a little larger than just 2 bytes is given to your program by your operating system.
If the OS actually gave you 16 bytes when you asked for 2 bytes, you could write 16 bytes without the OS taking notice of it. However if you had another malloc() in your program which used the other 14 bytes, you would write over that variables content.
The OS doesn't care about you messing about inside your own program. Your program will only crash if you write outside what the OS has given you.
Try to write 200 bytes and see if it crashes.
Edit:
malloc() and free() uses some of the heap space to maintain information about allocated memory. This information is usually stored in between the memory blocks. If you overflow a buffer, this information may get overwritten.
Yes writing more data into an allocated buffer is a buffer overflow. However there is no buffer overflow check in C and if there happens to be valid memory after your buffer than your code will appear to work correctly.
However what you have done is write into memory that you don't own and likely have corrupted the heap. Your next call to free or malloc will likely crash, or if not the next call, some later call could crash, or you could get lucky and malloc handed you a larger buffer than you requested, in which case you'll never see an issue.
Won't this cause a buffer overflow if I enter a string with more than two characters?
Absolutely. However, C does no bounds checking at runtime; it assumes you knew what you were doing when you allocated the memory, and that you know how much is available. If you go over the end of the buffer, you will clobber whatever was there before.
Whether that causes your code to crash or not depends on what was there before and what you clobbered it with. Not all overflows will kill your program, and overflow in the heap may not cause any (obvious) problems at all.
This is because even if you did not allocate the memory, the memory exists.
You are accessing data that is not yours, and probably that with a good debugger, or static analyzer you would have seen the error.
Also if you have a variable that is just behind the block you allocated it will probably be overriden by what you enter.
Simply this is one of the case of undefined behavior. You are unlucky that you are getting the expected result.
It does cause a buffer overflow. But C doesn’t do anything to prevent a buffer overflow. Neither do most implementations of malloc.
In general, a crash from a buffer overflow only occurs when...
It overflows a page—the unit of memory that malloc actually gets from the operating system. Malloc will fulfill many individual allocation requests from the same page of memory.
The overflow corrupts the memory that follows the buffer. This doesn’t cause an immediate crash. It causes a crash later when other code runs that depends upon the contents of that memory.
(...but these things depend upon the specifics of the system involved.)
It is entirely possible, if you are lucky, that a buffer overflow will never cause a crash. Although it may create other, less noticeable problems.
malloc() is the function call which is specified in Stdlib.h header file. If you are using arrays, you have to fix your memory length before utilize it. But in malloc() function, you can allocate the memory when you need and in required size. When you allocate the memory through malloc() it will search the memory modules and find the free block. even the memory blocks are in different places, it will assign a address and connect all the blocks.
when your process finish, you can free it. Free means, assigning a memory is in RAM only. once you process the function and make some data, you will shift the data to hard disk or any other permenant storage. afterwards, you can free the block so you can use for another data.
If you are going through pointer function, with out malloc() you can not make data blocks.
New() is the keyword for c++.
When you don't know when you are programming how big is the space of memory you will need, you can use the function malloc
void *malloc(size_t size);
The malloc() function shall allocate unused space for an object whose size in bytes is specified by size and whose value is unspecified.
how does it work is the question...
so
your system have the free chain list, that lists all the memory spaces available, the malloc search this list until it finds a space big enough as you required. Then it breaks this space in 2, sends you the space you required and put the other one back in the list. It breaks in pieces of size 2^n that way you wont have weird space sizes in your list, what makes it easy just like Lego.
when you call 'free' your block goes back to the free chain list.

Array is larger than allocated?

I have an array that's declared as char buff[8]. That should only be 8 bytes, but looking as the assembly and testing the code, I get a segmentation fault when I input something larger than 32 characters into that buff, whereas I would expect it to be for larger than 8 characters. Why is this?
What you're saying is not a contradiction:
You have space for 8 characters.
You get an error when you input more than 32 characters.
So what?
The point is that nobody told you that you would be guaranteed to get an error if you input more than 8 characters. That's simply undefined behaviour, and anything can (and will) happen.
You absolutely mustn't think that the absence of obvious misbehaviour is proof of the correctness of your code. Code correctness can only be verified by checking the code against the rules of the language (though some automated tools such as valgrind are an immense help).
Writing beyond the end of the array is undefined behavior. Undefined behavior means nothing (including a segmentation fault) is guaranteed.
In other words, it might do anything. More practical, it's likely the write didn't touch anything protected, so from the point of view of the OS everything is still OK until 32.
This raises an interesting point. What is "totally wrong" from the point of view of C might be OK with the OS. The OS only cares about what pages you access:
Is the address mapped for your process ?
Does your process have the rights ?
You shouldn't count on the OS slapping you if anything goes wrong. A useful tool for this (slapping) is valgrind, if you are using Unix. It will warn you if your process is doing nasty things, even if those nasty things are technically OK with the OS.
C arrays have no bound checking.
As other said, you are hitting undefined behavior; until you stay inside the bounds of the array, everything works fine. If you cheat, as far as the standard is concerned, anything can happen, including your program seeming to work right as well as the explosion of the Sun.
What happens in practice is that with stack-allocated variables you are likely to overwrite other variables on the stack, getting "impossible" bugs, or, if you hit a canary value put by the compiler, it may detect the buffer overflow on return from the function. For variables allocated in the so-called heap, the heap allocator may have given some more room than requested, so the mistake may be less easy to spot, although you may easily mess up the internal structures of the heap.
In both cases you can also hit a protected memory page, which will result in your program being terminated forcibly (for the stack this happens less often because usually you have to overwrite the entire stack to get to a protected page).
Your declaration char buff[8] sounds like a stack allocated variable, although it could be heap allocated if part of a struct. Accessing out of bounds of an array is undefined behaviour and is known as a buffer overrun. Buffer overruns on stack allocated memory may corrupt the current stack frame and possibly other stack frames in the call stack. With undefined behaviour, anything could happen, including no apparent error. You would not expect a seg fault immediately because the stack is typically when the thread starts.
For heap allocated memory, memory managers typically allocate large blocks of memory and then sub-allocate from those larger blocks. That is why you often don't get a seg fault when you access beyond the end of a block of memory.
It is undefined behaviour to access beyond the end of a memory block. And it is perfectly valid, according to the standard, for such out of bounds accesses to result in seg faults or indeed an apparently successful read or write. I say apparently successful because if you are writing then you will quite possibly produce a heap corruption by writing out of bounds.
Unless you are not telling us something you answered your owflown question.
declaring
char buff[8] ;
means that the compiler grabs 8 bytes of memory. If you try and stuff 32 char's into it you should get a seg fault, that's called a buffer overflow.
Each char is a byte ( unless you are doing unicode in which it is a word ) so you are trying to put 4x the number of chars that will fit in your buffer.
Is this your first time coding in C ?

Resources