How does creating a dynamically allocated string in C work?

How does creating a dynamically allocated string in C work? - c

I don't understand how dynamically allocated strings in C work. Below, I have an example where I think I have created a pointer to a string and allocated it 0 memory, but I'm still able to give it characters. I'm clearly doing something wrong, but what?
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char *argv[])
{
char *str = malloc(0);
int i;
str[i++] = 'a';
str[i++] = 'b';
str[i++] = '\0';
printf("%s\n", str);
return 0;
}

What you're doing is undefined behavior. It might appear to work now, but is not required to work, and may break if anything changes.
malloc normally returns a block of memory of the given size that you can use. In your case, it just so happens that there's valid memory outside of that block that you're touching. That memory is not supposed to be touched; malloc might use that memory for internal housekeeping, it might give that memory as the result of some malloc call, or something else entirely. Whatever it is, it isn't yours, and touching it produces undefined behavior.

Section 7.20.3 of the current C standard states in part:
"If the size of the space requested is zero, the behavior is
implementation defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object."
This will be implementation defined. Either it could send a NULL pointer or as mentioned something that cannot be referenced

Your are overwriting non-allocated memory. This might looks like working. But you are in trouble when you call free where the heap function tries to gives the memory block back.
Each malloc() returned chunk of memory has a header and a trailer. These structures hold at least the size of the allocated memory. Sometimes yout have additional guards. You are overwriting this heap internal structures. That's the reason why free() will complain or crash.
So you have an undefined behavior.

By doing malloc(0) you are creating a NULL pointer or a unique pointer that can be passed to free. Nothing wrong with that line. The problem lies when you perform pointer arithmetic and assign values to memory you have not allocated. Hence:
str[i++] = 'a'; // Invalid (undefined).
str[i++] = 'b'; // Invalid (undefined).
str[i++] = '\0'; // Invalid (undefined).
printf("%s\n", str); // Valid, (undefined).
It's always good to do two things:
Do not malloc 0 bytes.
Check to ensure the block of memory you malloced is valid.
... to check to see if a block of memory requested from malloc is valid, do the following:
if ( str == NULL ) exit( EXIT_FAILURE );
... after your call to malloc.

Your malloc(0) is wrong. As other people have pointed out that may or may not end up allocating a bit of memory, but regardless of what malloc actually does with 0 you should in this trivial example allocate at least 3*sizeof(char) bytes of memory.
So here we have a right nuisance. Say you allocated 20 bytes for your string, and then filled it with 19 characters and a null, thus filling the memory. So far so good. However, consider the case where you then want to add more characters to the string; you can't just out them in place because you had allocated only 20 bytes and you had already used them. All you can do is allocate a whole new buffer (say, 40 bytes), copy the original 19 characters into it, then add the new characters on the end and then free the original 20 bytes. Sounds inefficient doesn't it. And it is inefficient, it's a whole lot of work to allocate memory, and sounds like an specially large amount of work compared to other languages (eg C++) where you just concatenate strings with nothing more than str1 + str2.
Except that underneath the hood those languages are having to do exactly the same thing of allocating more memory and copying existing data. If one cares about high performance C makes it clearer where you are spending time, whereas the likes of C++, Java, C# hide the costly operations from you behind convenient-to-use classes. Those classes can be quite clever (eg allocating more memory than strictly necessary just in case), but you do have to be on the ball if you're interested in extracting the very best performance from your hardware.
This sort of problem is what lies behind the difficulties that operations like Facebook and Twitter had in growing their services. Sooner or later those convenient but inefficient class methods add up to something unsustainable.

Related

Dynamic memory allocation and pointers related concept doubts

On the first note: it is a new concept to me!!
I studied pointers and dynamic memory allocations and executed some program recently and was wondering in statement char*p="Computers" the string is stored in some memory location and the base address,
i.e the starting address of the string is stored in p, now I noticed I can perform any desired operations on the string, now my doubt is why do we use a special statement like malloc and calloc when we can just declare a string like this of the desired length.
If my understanding of the concept Is wrong please explain.
Thanks in advance.

In this declaration
char*p="Computers";
the pointer p is initialized by the address of the first character of the string literal "Computers".
String literals have the static storage duration. You may not change a string literal as for example
p[0] = 'c';
Any attempt to change a string literal results in undefined behavior.
The function malloc is used to allocate memory dynamically. For example if you want to create dynamically a character array that will contain the string "Computers" you should write
char *p = malloc( 10 ); // the same as `malloc( 10 * sizeof( char ) )`
strcpy( p, "Computers" );
You may change the created character array. For example
p[0] = 'c';
After the array is not required any more you should free the allocated memory like
free( p );
Otherwise the program can have a memory leak.

A simple answer to that would be by doing
char *p = "Computers";
you are basically declaring a fixed constant string. With that means you cannot edit anything inside the string. Trying to do so may result in Segmentation Fault. Using malloc and calloc would allow us to edit the string.
Simply do this on p[0] = 'c' and you will see the result

A statement like
char *p = "Computers";
is not an example of dynamic memory allocation. The memory for the string literal is set aside when the program starts up and held until the program terminates. You can’t resize that memory, and you’re not supposed to modify it (the behavior on doing so is undefined - it may work as expected, it may crash outright, it may do anything in between).
We use malloc, calloc, and realloc to allocate memory at runtime that needs to be writable, resizable, and doesn’t go away until we explicitly release it.
We have to use pointers to reference dynamically-allocated memory because that’s just how the language is designed, but pointers play a much larger role in C programming than just tracking dynamic memory.

as a novice, I described below as my own thinking…
Dynamic memory completely depends on the pointer. I mean without pointer knowledge you are able to cope up with dynamic memory allocation. (stdlib) library function where store calloc, malloc, relalloc and free.
malloc initialized no bit mentioned, calloc mainly used for the array.
realloc used to increase or decrease size.
To simply say it is not as hard as what you first think. if you declare an array[500] initial declare but you used 100 and the rest of 400 bits remove to use dynamic memory.

strcpy working no matter the malloc size?

I'm currently learning C programming and since I'm a python programmer, I'm not entirely sure about the inner workings of C. I just stumbled upon a really weird thing.
void test_realloc(){
// So this is the original place allocated for my string
char * curr_token = malloc(2*sizeof(char));
// This is really weird because I only allocated 2x char size in bytes
strcpy(curr_token, "Davi");
curr_token[4] = 'd';
// I guess is somehow overwrote data outside the allocated memory?
// I was hoping this would result in an exception ( I guess not? )
printf("Current token > %s\n", curr_token);
// Looks like it's still printable, wtf???
char *new_token = realloc(curr_token, 6);
curr_token = new_token;
printf("Current token > %s\n", curr_token);
}
int main(){
test_realloc();
return 0;
}
So the question is: how come I'm able to write more chars into a string than is its allocated size? I know I'm supposed to handle mallocated memory myself but does it mean there is no indication that something is wrong when I write outside the designated memory?
What I was trying to accomplish
Allocate a 4 char ( + null char ) string where I would write 4 chars of my name
Reallocate memory to acomodate the last character of my name

know I'm supposed to handle mallocated memory myself but does it mean there is no indication that something is wrong when I write outside the designated memory?
Welcome to C programming :). In general, this is correct: you can do something wrong and receive no immediate feedback that was the case. In some cases, indeed, you can do something wrong and never see a problem at runtime. In other cases, however, you'll see crashes or other behaviour that doesn't make sense to you.
The key term is undefined behavior. This is a concept that you should become familiar with if you continue programming in C. It means just like it sounds: if your program violates certain rules, the behaviour is undefined - it might do what you want, it might crash, it might do something different. Even worse, it might do what you want most of the time, but just occasionally do something different.
It is this mechanism which allows C programs to be fast - since they don't at runtime do a lot of the checks that you may be used to from Python - but it also makes C dangerous. It's easy to write incorrect code and be unaware of it; then later make a subtle change elsewhere, or use a different compiler or operating system, and the code will no longer function as you wanted. In some cases this can lead to security vulnerabilities, since unwanted behavior may be exploitable.

Suppose that you have an array as shown below.
int arr[5] = {6,7,8,9,10};
From the basics of arrays, name of the array is a pointer pointing to the base element of the array. Here, arr is the name of the array, which is a pointer, pointing to the base element, which is 6. Hence,*arr, literally, *(arr+0) gives you 6 as the output and *(arr+1) gives you 7 and so on.
Here, size of the array is 5 integer elements. Now, try accessing the 10th element, though the size of the array is 5 integers. arr[10]. This is not going to give you an error, rather gives you some garbage value. As arr is just a pointer, the dereference is done as arr+0,arr+1,arr+2and so on. In the same manner, you can access arr+10 also using the base array pointer.
Now, try understanding your context with this example. Though you have allocated memory only for 2 bytes for character, you can access memory beyond the two bytes allocated using the pointer. Hence, it is not throwing you an error. On the other hand, you are able to predict the output on your machine. But it is not guaranteed that you can predict the output on another machine (May be the memory you are allocating on your machine is filled with zeros and may be those particular memory locations are being used for the first time ever!). In the statement,
char *new_token = realloc(curr_token, 6); note that you are reallocating the memory for 6 bytes of data pointed by curr_token pointer to the new_tokenpointer. Now, the initial size of new_token will be 6 bytes.

Usually malloc is implemented such a way that it allocates chunks of memory aligned to paragraph (fundamental alignment) that is equal to 16 bytes.
So when you request to allocate for example 2 bytes malloc actually allocates 16 bytes. This allows to use the same chunk of memory when realloc is called.
According to the C Standard (7.22.3 Memory management functions)
...The pointer returned if the allocation succeeds is suitably aligned so
that it may be assigned to a pointer to any type of object
with a fundamental alignment requirement and then used to access such an
object or an array of such objects in the space allocated
(until the space is explicitly deallocated).
Nevertheless you should not rely on such behavior because it is not normative and as result is considered as undefined behavior.

No automatic bounds checking is performed in C.
The program behaviour is unpredictable.
If you go writing in the memory reserved for another process, you will end with a Segmentation fault, otherwise you will only corrupt data, ecc...

using malloc over array

May be similar question found on SO. But, I didn't found that, here is the scenario
Case 1
void main()
{
char g[10];
char a[10];
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
Case 2
void main()
{
char *g=malloc(sizeof(char)*10);
char *a=malloc(sizeof(char)*10);
scanf("%[^\n] %[^\n]",a,g);
swap(a,g);
printf("%s %s",a,g);
}
I'm getting same output in both case. So, my question is when should I prefer malloc() instead of array or vice-verse and why ?? I found common definition, malloc() provides dynamic allocation. So, it is the only difference between them ?? Please any one explain with example, what is the meaning of dynamic although we are specifying the size in malloc().

The principle difference relates to when and how you decide the array length. Using fixed length arrays forces you to decide your array length at compile time. In contrast using malloc allows you to decide the array length at runtime.
In particular, deciding at runtime allows you to base the decision on user input, on information not known at the time you compile. For example, you may allocate the array to be a size big enough to fit the actual data input by the user. If you use fixed length arrays, you have to decide at compile time an upper bound, and then force that limitation onto the user.
Another more subtle issue is that allocating very large fixed length arrays as local variables can lead to stack overflow runtime errors. And for that reason, you sometimes prefer to allocate such arrays dynamically using malloc.

Please any one explain with example, what is the meaning of dynamic although we are specifying the size.
I suspect this was significant before C99. Before C99, you couldn't have dynamically-sized auto arrays:
void somefunc(size_t sz)
{
char buf[sz];
}
is valid C99 but invalid C89. However, using malloc(), you can specify any value, you don't have to call malloc() with a constant as its argument.
Also, to clear up what other purpose malloc() has: you can't return stack-allocated memory from a function, so if your function needs to return allocated memory, you typically use malloc() (or some other member of the malloc familiy, including realloc() and calloc()) to obtain a block of memory. To understand this, consider the following code:
char *foo()
{
char buf[13] = "Hello world!";
return buf;
}
Since buf is a local variable, it's invalidated at the end of its enclosing function - returning it results in undefined behavior. The function above is erroneous. However, a pointer obtained using malloc() remains valid through function calls (until you don't call free() on it):
char *bar()
{
char *buf = malloc(13);
strcpy(buf, "Hello World!");
return buf;
}
This is absolutely valid.

I would add that in this particular example, malloc() is very wasteful, as there is more memory allocated for the array than what would appear [due to overhead in malloc] as well as the time it takes to call malloc() and later free() - and there's overhead for the programmer to remember to free it - memory leaks can be quite hard to debug.
Edit: Case in point, your code is missing the free() at the end of main() - may not matter here, but it shows my point quite well.
So small structures (less than 100 bytes) should typically be allocated on the stack. If you have large data structures, it's better to allocate them with malloc (or, if it's the right thing to do, use globals - but this is a sensitive subject).
Clearly, if you don't know the size of something beforehand, and it MAY be very large (kilobytes in size), it is definitely a case of "consider using malloc".
On the other hand, stacks are pretty big these days (for "real computers" at least), so allocating a couple of kilobytes of stack is not a big deal.

What happens to memory after '\0' in a C string?

Surprisingly simple/stupid/basic question, but I have no idea: Suppose I want to return the user of my function a C-string, whose length I do not know at the beginning of the function. I can place only an upper bound on the length at the outset, and, depending on processing, the size may shrink.
The question is, is there anything wrong with allocating enough heap space (the upper bound) and then terminating the string well short of that during processing? i.e. If I stick a '\0' into the middle of the allocated memory, does (a.) free() still work properly, and (b.) does the space after the '\0' become inconsequential? Once '\0' is added, does the memory just get returned, or is it sitting there hogging space until free() is called? Is it generally bad programming style to leave this hanging space there, in order to save some upfront programming time computing the necessary space before calling malloc?
To give this some context, let's say I want to remove consecutive duplicates, like this:
input "Hello oOOOo !!" --> output "Helo oOo !"
... and some code below showing how I'm pre-computing the size resulting from my operation, effectively performing processing twice to get the heap size right.
char* RemoveChains(const char* str)
{
if (str == NULL) {
return NULL;
}
if (strlen(str) == 0) {
char* outstr = (char*)malloc(1);
*outstr = '\0';
return outstr;
}
const char* original = str; // for reuse
char prev = *str++; // [prev][str][str+1]...
unsigned int outlen = 1; // first char auto-counted
// Determine length necessary by mimicking processing
while (*str) {
if (*str != prev) { // new char encountered
++outlen;
prev = *str; // restart chain
}
++str; // step pointer along input
}
// Declare new string to be perfect size
char* outstr = (char*)malloc(outlen + 1);
outstr[outlen] = '\0';
outstr[0] = original[0];
outlen = 1;
// Construct output
prev = *original++;
while (*original) {
if (*original != prev) {
outstr[outlen++] = *original;
prev = *original;
}
++original;
}
return outstr;
}

If I stick a '\0' into the middle of the allocated memory, does
(a.) free() still work properly, and
Yes.
(b.) does the space after the '\0' become inconsequential? Once '\0' is added, does the memory just get returned, or is it sitting there hogging space until free() is called?
Depends. Often, when you allocate large amounts of heap space, the system first allocates virtual address space - as you write to the pages some actual physical memory is assigned to back it (and that may later get swapped out to disk when your OS has virtual memory support). Famously, this distinction between wasteful allocation of virtual address space and actual physical/swap memory allows sparse arrays to be reasonably memory efficient on such OSs.
Now, the granularity of this virtual addressing and paging is in memory page sizes - that might be 4k, 8k, 16k...? Most OSs have a function you can call to find out the page size. So, if you're doing a lot of small allocations then rounding up to page sizes is wasteful, and if you have a limited address space relative to the amount of memory you really need to use then depending on virtual addressing in the way described above won't scale (for example, 4GB RAM with 32-bit addressing). On the other hand, if you have a 64-bit process running with say 32GB of RAM, and are doing relatively few such string allocations, you have an enormous amount of virtual address space to play with and the rounding up to page size won't amount to much.
But - note the difference between writing throughout the buffer then terminating it at some earlier point (in which case the once-written-to memory will have backing memory and could end up in swap) versus having a big buffer in which you only ever write to the first bit then terminate (in which case backing memory is only allocated for the used space rounded up to page size).
It's also worth pointing out that on many operating systems heap memory may not be returned to the Operating System until the process terminates: instead, the malloc/free library notifies the OS when it needs to grow the heap (e.g. using sbrk() on UNIX or VirtualAlloc() on Windows). In that sense, free() memory is free for your process to re-use, but not free for other processes to use. Some Operating Systems do optimise this - for example, using a distinct and independently releasble memory region for very large allocations.
Is it generally bad programming style to leave this hanging space there, in order to save some upfront programming time computing the necessary space before calling malloc?
Again, it depends on how many such allocations you're dealing with. If there are a great many relative to your virtual address space / RAM - you want to explicitly let the memory library know not all the originally requested memory is actually needed using realloc(), or you could even use strdup() to allocate a new block more tightly based on actual needs (then free() the original) - depending on your malloc/free library implementation that might work out better or worse, but very few applications would be significantly affected by any difference.
Sometimes your code may be in a library where you can't guess how many string instances the calling application will be managing - in such cases it's better to provide slower behaviour that never gets too bad... so lean towards shrinking the memory blocks to fit the string data (a set number of additional operations so doesn't affect big-O efficiency) rather than having an unknown proportion of the original string buffer wasted (in a pathological case - zero or one character used after arbitrarily large allocations). As a performance optimisation you might only bother returning memory if unusued space is >= the used space - tune to taste, or make it caller-configurable.
You comment on another answer:
So it comes down to judging whether the realloc will take longer, or the preprocessing size determination?
If performance is your top priority, then yes - you'd want to profile. If you're not CPU bound, then as a general rule take the "preprocessing" hit and do a right-sized allocation - there's just less fragmentation and mess. Countering that, if you have to write a special preprocessing mode for some function - that's an extra "surface" for errors and code to maintain. (This trade-off decision is commonly needed when implementing your own asprintf() from snprintf(), but there at least you can trust snprintf() to act as documented and don't personally have to maintain it).

Once '\0' is added, does the memory just get returned, or is it
sitting there hogging space until free() is called?
There's nothing magical about \0. You have to call realloc if you want to "shrink" the allocated memory. Otherwise the memory will just sit there until you call free.
If I stick a '\0' into the middle of the allocated memory, does (a.)
free() still work properly
Whatever you do in that memory free will always work properly if you pass it the exact same pointer returned by malloc. Of course if you write outside it all bets are off.

\0 is just one more character from malloc and free perspective, they don't care what data you put in the memory. So free will still work whether you add \0 in the middle or don't add \0 at all. The extra space allocated will still be there, it won't be returned back to the process as soon as you add \0 to the memory. I personally would prefer to allocate only the required amount of memory instead of allocating at some upper bound as that will just wasting the resource.

As soon as you get memory from heap by calling malloc(), the memory is yours to use. Inserting \0 is like inserting any other character. This memory will remain in your possession until you free it or until OS claims it back.

The \0is a pure convention to interpret character arrays as stings - it is independent of the memory management. I.e., if you want to get your money back, you should call realloc. The string does not care about memory (what is a source of many security problems).

malloc just allocates a chunk of memory .. Its upto you to use however you want and call free from the initial pointer position... Inserting '\0' in the middle has no consequence...
To be specific malloc doesnt know what type of memory you want (It returns onle a void pointer) ..
Let us assume you wish to allocate 10 bytes of memory starting 0x10 to 0x19 ..
char * ptr = (char *)malloc(sizeof(char) * 10);
Inserting a null at 5th position (0x14) does not free the memory 0x15 onwards...
However a free from 0x10 frees the entire chunk of 10 bytes..

free() will still work with a NUL byte in memory
the space will remain wasted until free() is called, or unless you subsequently shrink the allocation

Generally, memory is memory is memory. It doesn't care what you write into it. BUT it has a race, or if you prefer a flavor (malloc, new, VirtualAlloc, HeapAlloc, etc). This means that the party that allocates a piece of memory must also provide the means to deallocate it. If your API comes in a DLL, then it should provide a free function of some sort.
This of course puts a burden on the caller right?
So why not put the WHOLE burden on the caller?
The BEST way to deal with dynamically allocated memory is to NOT allocate it yourself. Have the caller allocate it and pass it on to you. He knows what flavor he allocated, and he is responsible to free it whenever he is done using it.
How does the caller know how much to allocate?
Like many Windows APIs have your function return the required amount of bytes when called e.g. with a NULL pointer, then do the job when provided with a non-NULL pointer (using IsBadWritePtr if it is suitable for your case to double-check accessibility).
This can also be much much more efficient. Memory allocations COST a lot. Too many memory allocations cause heap fragmentation and then the allocations cost even more. That's why in kernel mode we use the so called "look-aside lists". To minimize the number of memory allocations done, we reuse the blocks we have already allocated and "freed", using services that the NT Kernel provides to driver writers.
If you pass on the responsibility for memory allocation to your caller, then he might be passing you cheap memory from the stack (_alloca), or passing you the same memory over and over again without any additional allocations. You don't care of course, but you DO allow your caller to be in charge of optimal memory handling.

To elaborate on the use of the NULL terminator in C:
You cannot allocate a "C string" you can allocate a char array and store a string in it, but malloc and free just see it as an array of the requested length.
A C string is not a data type but a convention for using a char array where the null character '\0' is treated as the string terminator.
This is a way to pass strings around without having to pass a length value as a separate argument. Some other programming languages have explicit string types that store a length along with the character data to allow passing strings in a single parameter.
Functions that document their arguments as "C strings" are passed char arrays but have no way of knowing how big the array is without the null terminator so if it is not there things will go horribly wrong.
You will notice functions that expect char arrays that are not necessarily treated as strings will always require a buffer length parameter to be passed.
For example if you want to process char data where a zero byte is a valid value you can't use '\0' as a terminator character.

You could do what some of the MS Windows APIs do where you (the caller) pass a pointer and the size of the memory you allocated. If the size isn't enough, you're told how many bytes to allocate. If it was enough, the memory is used and the result is the number of bytes used.
Thus the decision about how to efficiently use memory is left to the caller. They can allocate a fixed 255 bytes (common when working with paths in Windows) and use the result from the function call to know whether more bytes are needed (not the case with paths due to MAX_PATH being 255 without bypassing Win32 API) or whether most of the bytes can be ignored...
The caller could also pass zero as the memory size and be told exactly how much needs to be allocated - not as efficient processing-wise, but could be more efficient space-wise.

You can certainly preallocate to an upperbound, and use all or something less.
Just make sure you actually use all or something less.
Making two passes is also fine.
You asked the right questions about the tradeoffs.
How do you decide?
Use two passes, initially, because:
1. you'll know you aren't wasting memory.
2. you're going to profile to find out where
you need to optimize for speed anyway.
3. upperbounds are hard to get right before
you've written and tested and modified and
used and updated the code in response to new
requirements for a while.
4. simplest thing that could possibly work.
You might tighten up the code a little, too.
Shorter is usually better. And the more the
code takes advantage of known truths, the more
comfortable I am that it does what it says.
char* copyWithoutDuplicateChains(const char* str)
{
if (str == NULL) return NULL;
const char* s = str;
char prev = *s; // [prev][s+1]...
unsigned int outlen = 1; // first character counted
// Determine length necessary by mimicking processing
while (*s)
{ while (*++s == prev); // skip duplicates
++outlen; // new character encountered
prev = *s; // restart chain
}
// Construct output
char* outstr = (char*)malloc(outlen);
s = str;
*outstr++ = *s; // first character copied
while (*s)
{ while (*++s == prev); // skip duplicates
*outstr++ = *s; // copy new character
}
// done
return outstr;
}

try to buffer overflow value allocated by malloc()

I'm a bit confused about malloc() function.
if sizeof(char) is 1 byte and the malloc() function accepts N bytes in argument to allocate, then if I do:
char* buffer = malloc(3);
I allocate a buffer that can to store 3 characters, right?
char* s = malloc(3);
int i = 0;
while(i < 1024) { s[i] = 'b'; i++; }
s[i++] = '$';
s[i] = '\0';
printf("%s\n",s);
it works fine. and stores 1024 b's in s.
bbbb[...]$
why doesn't the code above cause a buffer overflow? Can anyone explain?

malloc(size) returns a location in memory where at least size bytes are available for you to use. You are likely to be able to write to the bytes immediately after s[size], but:
Those bytes may belong to other bits of your program, which will cause problems later in the execution.
Or, the bytes might be fine for you to write to - they might belong to a page your program uses, but aren't used for anything.
Or, they might belong to the structures that malloc() has used to keep track of what your program has used. Corrupting this is very bad!
Or, they might NOT belong to your program, which will result in an immediate segmentation fault. This is likely if you access say s[size + large_number]
It's difficult to say which one of these will happen because accessing outside the space you asked malloc() for will result in undefined behaviour.
In your example, you are overflowing the buffer, but not in a way that causes an immediate crash. Keep in mind that C does no bounds checking on array/pointer accesses.
Also, malloc() creates memory on the heap, but buffer overflows are usually about memory on the stack. If you want to create one as an exercise, use
char s[3];
instead. This will create an array of 3 chars on the stack. On most systems, there won't be any free space after the array, and so the space after s[2] will belong to the stack. Writing to that space can overwrite other variables on the stack, and ultimately cause segmentation faults by (say) overwriting the current stack frame's return pointer.
One other thing:
if sizeof(char) is 1 byte
sizeof(char) is actually defined by the standard to always be 1 byte. However, the size of that 1 byte might not be 8 bits on exotic systems. Of course, most of the time you don't have to worry about this.

It is Undefined Behavior(UB) to write beyond the bounds of allocated memory.
Any behavior is possible, no diagnostic is needed for UB & any behavior can be encountered.
An UB does not necessarily warrant a segmentation fault.

In a way, you did overflow your 3 character buffer. However, you did not overflow your program's address space (yet). So you are well out of the bounds of s*, but you are overwriting random other data in your program. Because your program owns this data, the program doesn't crash, but still does very very wrong things, and the future behaviour is undefined.

In practice what this is doing is corrupting the heap. The effects may not appear immediately (in fact, that's part of what makes such errors a PITA to debug). However, you may trash anything else that happens to be in the heap, or in that part of your program's address space for that matter. It's likely that you have also trashed malloc() internal data structures, and so it's likely that subsequent malloc() or free() calls may crash your program, leading many programmers to (falsely) believe they've found a bug in malloc().

You're overflowing the buffer. It depends what memory you're overflowing into to get an error msg.

Did you try executing your code in release mode or did you try to free up the memory you of s? It is an undefined behavior.
It's a bit of a language hack, and a bit dubious about it's use.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight