Do all objects sit in the same address space in C? - c

I am trying to work out if the C standard requires that all addresses are in the same address space. If I have two objects of different type
double d;
int i;
I cannot do pointer arithmetic on their addresses, because they are pointers of different types. However, the standard says that I can point character-type pointers at them and get the address of the first byte of each object.
char *dp = (char *)&d;
char *ip = (char *)&i;
and with those I can do pointer arithmetic, and for example figure out how far apart they are in memory, (dp - ip). That is, of course, if doubles and ints sit in the same memory. They always do on the platforms I know, but is it guaranteed by the standard? Or is pointer arithmetic only allowed if my char pointers actually point at something with the same type?

Pointer arithmetic is only defined when the pointers have the same type and they point within the same object. More specifically, the standard says:
3 For subtraction, one of the following shall hold:
both operands have arithmetic type;
both operands are pointers to qualified or unqualified versions of compatible complete object types; or
the left operand is a pointer to a complete object type and the right operand has integer type.
and:
9 When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.
(For the purposes of interpreting the above, a single object is treated as an array with one element.)
Casting the pointer types to char * addresses the constraint in clause 3, but a pointer to d is not pointing to an element of i. So you can't subtract them.

Due to factors like ASLR, and the lack of specificity in the C specification regarding how variables are actually positioned in memory, you really can't trust the difference of two pointers to two different objects to represent anything.
Are things allocated on the stack in a top-down manner? Usually, sure, it's a long-standing convention, but it is not required to be that way. They could be heap allocated, or strewn about randomly. That's unlikely, but allowed.
In any protected mode operating system you are not seeing real memory addresses, they're user-space addresses that might look and feel very real, but they're remapped by the CPU to their actual location in memory, or perhaps not even, as that memory could have been swapped out to disk, compressed, or other more mysterious and confusing things that are all hidden away by the kernel and CPU.
While you can take the difference of two locations within a given allocation, as in through malloc or calloc, the difference between two arbitrary allocations or objects is really not meaningful. Not only does the kernel add an abstraction layer, it will deliberately scramble the allocations it gives you through Address Space Layout Randomization as a measure to make your allocations more unpredictable.
Why? To make it harder to weaponize a buffer overflow bug.
So if you're curious about the position of variables in memory, that's great, have a look, explore, but don't presume that the strategy used by your compiler, operating system, or CPU won't change in the future in some dramatic way.
On any modern 64-bit CPU and operating system there's a huge amount of address space to work with, like 18,446,744,073,709,551,616 possible bytes, and while large chunks of this are walled off and reserved, there's still a nearly inexhaustible amount of space left. That's also multiplied by the fact that each process has its own address space, so there's actually a lot more than that in theory to work with.
Fun fact: Before 64-bit CPUs took hold there were unusual 36-bit memory schemes where a 32-bit operating system and CPU could address more than 4GB of memory, but each individual process could only "see" 4GB since it used 32-bit pointers.

Memory allocated by malloc may be used for any object with a fundamental alignment requirement, which includes all the “built in” types (e.g., including special types a compiler might provide as an extension), per C 2018 7.22.3 1, and therefore all such objects must share the address space used by malloc.
Further, any types of objects can be put into a structure or union together and therefore must share an address space.

Related

In C, are the characters in an array (i.e. string) stored in individual registers or are there four characters per register?

I am writing a program in C (32 bit) where I output a string (15 to 40 characters long). I have elected to use pointers and calloc instead of a formal array declaration. My program functions totally fine so this isn't a question about logic or function, I am simply curious about what's "going on under the hood" of my C code.
My understanding: When I use calloc I am allocating a section of memory in units of bytes. Variables are stored in memory locations of size 32 bits (or 4 bytes). In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the pointer (ptr++;) to move to the next memory location.
My question: If memory locations are 32-bits and I am writing only 8-bits to that memory location, are the remaining 24-bits unused? If not, then are the pointers I'm using pointing to some kind of 8-bit sub-memory location, pointing to 8-bit sections of memory locations?
Register usage -- and, technically, even the existence of registers at all -- is a characteristic of the C implementation and the hardware on which it runs. There is therefore no definitive answer to your question at its level of generality. This is for the most part true of any question about "what's going on under the hood".
Speaking in terms of typical implementations for commodity hardware, though,
My understanding: When I use calloc I am allocating a section of memory in units of bytes.
A reasonable characterization.
Variables are stored in registers of size 32 bits (or 4 bytes).
No. Values are stored in registers. Implementations generally provide storage for the values of variables in regular memory, though those values may be copied into registers for computation.
Under some implementation-specific circumstances, certain variables might not have an associated memory location, their values instead being maintained only in registers. Generally speaking, however, this is never the case for variables or allocated space that is, was, or ever could be referenced by a pointer.
In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the pointer (ptr++;) to move to the next register.
No, absolutely not. Incrementing the pointer causes it to point to the next element of your dynamic storage, measured in units of the size of the pointed-to type. This has nothing to do with registers. Writing to the pointed-to object probably involves register use (because that's how CPUs work), but ultimately the character written ends up in regular memory.
My question: If registers are 32-bits and I am writing only 8-bits to that register, are the remaining 24-bits unused?
As I already explained, this question is based on a misconception. The target of your write is not a register. In any case, there are no gaps in memory between the elements you are writing.
It is conceivable that under some circumstances, a clever compiler might optimize your code to minimize writes to memory by collecting bytes in a register and performing writes in chunks of that size. Whether it can or will do so depends on the implementation and the options in effect.
If not, then are the pointers I'm using pointing to some kind of 8-bit sub-register allocation, pointing to 8-bit sections of registers?
Your pointers are (logically) pointing to main memory, which is (logically) addressable in byte-sized units. They are not pointing to registers.
No, there's no register involved; in general, registers are a scarce resource.
What actually happens is that you are writing the values into the memory locations pointed to by the returned pointer. Pointers and pointer arithmetic respect the data type, so the returned pointer, cast to the proper type, takes care of access.
I write characters using my pointer (i.e. *ptr = '!';) and then I increment the pointer (ptr++;) to move to the next register.
Not exactly; you are talking about the memory location pointed to by the pointer ptr. If ptr is defined as char *, ptr++ is the same as ptr = ptr + 1, which increases ptr by the size of the pointed-to data type, char. So, after the expression, ptr points to the next element in memory.
Those pointers are not certain to be stored in registers; normally they will just be stored on the stack.
This is an outcome of compiler optimizations.
In some compilers you can use the register keyword to request register usage (it is only a hint, though).
Also, there is no "next" register: registers do not have addresses. The register file is a special hardware unit integrated into the CPU, and registers are usually named by a certain set of bits.
I advise you to use your compiler or a disassembly tool to see exactly how it looks in assembly.
You can specify in C that a variable goes into a register, and most compilers will honor this as a hint, but where the variable goes depends on what kind of variable it is. Local variables will go on the stack, memory allocation functions will put data on the heap and give you its address, and constants and string literals will go into the read-only data segment.
As Sourav pointed out, you are using the term "register" wrongly. There is a kind of storage called a register, and there is a keyword register in C, but neither has much to do with pointers.
The typical size for an aligned memory access is 16/32/64 bits, depending on your architecture. You are thinking that your pointer is increased by that block size. This is not correct.
Depending on what type of pointer you have, the step size on incrementing differs. It is always the size of the corresponding data type in bytes:
a char * gets increased by 1 byte if you do ++,
while a long long * gets increased by 8.
As arrays can decay to pointers on some occasions, the mechanics are quite similar.
What you are thinking of is what happens if you declare two chars (or a char and an int) in a struct: their addresses may differ by a multiple of the alignment size, and the rest of the memory is "wasted" as padding.
But as you allocated the memory it is yours to control, you can pack it similar to an array.
There seems to be confusion about what a register is. A register is a storage location within the processor. Registers have different functions. However, programmers are generally concerned with GENERAL REGISTERS and the Process Status Register.
General Registers are scratch locations for performing computations. On some systems all operations are performed in registers. Thus, if you want to add two values, you have to load both into registers, then add them. Most non-RISC systems these days allow operations to take place directly to memory.
My understanding: When I use calloc I am allocating a section of memory in units of bytes. Variables are stored in registers of size 32 bits (or 4 bytes). In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the pointer (ptr++;) to move to the next register.
Your compiler may assign variables to exist in registers, rather than memory. However, any time you dereference a pointer (e.g. *ptr) you have to access memory.
If you call
char *ptr = calloc (...)
The variable ptr may (or may not) be placed in a register. It's all up to your compiler. The value returned by calloc is the location of memory, not registers.
What you should do to learn this is to generate assembly language code from your compiler. Most compilers have such an option and they typically interleave your C code with the generated assembly code.
If you do:
In my program, I write characters using my pointer (i.e. *ptr = '!';) and then I increment the pointer (ptr++;) to move to the next register.
Your generated code might look like (assuming ptr is mapped to R0):
MOVB '!', (R0)+
Which on several systems, moves the value '!' to the address pointed to by R0, then increments R0 by one.
My question: If registers are 32-bits and I am writing only 8-bits to that register, are the remaining 24-bits unused? If not, then are the pointers I'm using pointing to some kind of 8-bit sub-register allocation, pointing to 8-bit sections of registers?
In your case, you are not reading and writing bytes to registers. However, many systems do have REGISTER subdividing.

sizeof Pointer differs for data type on same architecture

I have been going through some posts and noticed that pointers can be different sizes according to sizeof depending on the architecture the code is compiled for and running on. Seems reasonable enough to me (ie: 4-byte pointers on 32-bit architectures, 8-byte on 64-bit, makes total sense).
One thing that surprises me is that the size of a pointer can differ based on the data type it points to. I would have assumed that, on a 32-bit architecture, all pointers would be 4-bytes in size, but it turns out that function pointers can be a different size (ie: larger than what I would have expected). Why is this, in the C programming language? I found an article that explains this for C++, and how the program may have to cope with virtual functions, but this doesn't seem to apply in pure C. Also, it seems the use of "far" and "near" pointers is no longer necessary, so I don't see those entering the equation.
So, in C, what justification, standard, or documentation describes why not all pointers are the same size on the same architecture?
Thanks!
The C standard lays down the law on what's required:
All data pointers can be converted to void* and back without loss of information.
All struct-pointers have the same representation+alignment and can thus be converted to each other.
All union-pointers have the same representation+alignment and can thus be converted to each other.
All character pointers and void pointers have the same representation+alignment.
All pointers to qualified and unqualified compatible types shall have the same representation+alignment. (For example unsigned / signed versions of the same type are compatible)
All function pointers have the same representation+alignment and can be converted to any other function pointer type and back again.
Nothing more is required.
The committee arrived at these guarantees by examining all current implementations and machines and codifying as many guarantees as they could.
On architectures where pointers are naturally word pointers instead of character pointers, you get data pointers of different sizes.
On architectures with different size code / data spaces (many micro-processors), or where additional info is needed for properly invoking functions (like Itanium, though they often hide that behind a data pointer), you get code pointers of a different size from data pointers.
So, in C, what justification, standard, or documentation describes why not all pointers are the same size on the same architecture?
C11 : 6.2.5 p(28):
A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. Similarly, pointers to qualified or unqualified versions of compatible types shall have the same representation and alignment requirements. All pointers to structure types shall have the same representation and alignment requirements as each other. All pointers to union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.
6.3.2.3 Pointers p(8):
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.
This clarifies that pointers to data and pointers to functions need not be the same size.
One additional point:
Q: So, is it safe to say that, while I don't have to explicitly use the far/near keywords when defining a pointer, this is handled automatically "under the hood" by the compiler?
A: http://www.unix.com/programming/45002-far-pointer.html
It's a historical anachronism from segmented architectures such as the
8086.
Back in the days of yore there was the 8080, this was an 8 bit
processor with 16 bit address bus, hence 16 bit pointers.
Along came the 8086; in order to support some level of backward
compatibility it adopted a segmented architecture which let you use
either 16 bit, 20 bit or 32 bit pointers depending on the day of the
week, where a pointer was a combination of a 16 bit segment register and
a 16 bit near offset. This led to the rise of tiny, small, medium,
large and huge memory models with near, far and huge pointers.
Other architectures such as 68000 did not adopt this scheme and had
what is called a flat memory model.
With the 80386 and true 32 bit mode, all pointers are 32 bit, but
ironically are now really near pointers but 32 bit wide, the operating
system hides the segments from you.
I compiled this on three different platforms; the char * pointer was identical in size to the function pointer in every case:
CODE:
#include <stdio.h>
int main (int argc, char *argv[]) {
    char *cptr = NULL;
    void (*fnptr)() = NULL;

    printf ("sizeof cptr=%zu, sizeof fnptr=%zu\n",
            sizeof (cptr), sizeof (fnptr));
    return 0;
}
RESULTS:
char ptr fn ptr
-------- ------
Win8/MSVS 2013 4 4
Debian7/i686/GCC 4 4
Centos/amd64/GCC 8 8
Some architectures support multiple kinds of address spaces. While nothing in the Standard would require that implementations provide access to all address spaces supported by the underlying platform, and indeed the Standard offers no guidance as to how such support should be provided, the ability to support multiple address spaces may make it possible for a programmer who is aware of them to write code that works much better than would otherwise be possible.
On some platforms, one address space will contain all the others, but accessing things in that address space will be slower (sometimes by 2x or more) than accessing things which are known to be in a particular part of it. On other platforms, there won't be any "master" address space, so different kinds of pointers will be needed to access things in different spaces.
I disagree with the claim that the existence of multiple address spaces should be viewed as a relic. On a number of ARM processors, it would be possible for a program to have up to 1K-4K (depending upon the exact chip) of globals which could be accessed twice as quickly as--and with less code than--"normal" global variables. I don't know of any ARM compilers that would exploit that, but there's no reason a compiler for the ARM couldn't do so.

Sizeof pointer for 16 bit and 32 bit

I was just curious to know what would the sizeof pointer return for a 16 bit and a 32 bit system
printf("%zu", sizeof(int16 *));
printf("%zu", sizeof(int32 *));
Thank you.
Short answer: On a 32-bit Intel 386 you will likely see these returning 4, while targeting a 16-bit 8086 you will most likely see either 2 or 4 depending on the memory model you selected.
The details
First, standard C does not mandate anything particular about pointers, only that they need to be able to "point to" the given variable, and that pointer arithmetic works within the data area of the given variable. Even a C interpreter with some exotic representation of pointers is possible, and given this flexibility, pointers truly might be of any size depending on what you target.
Usually, however, compilers do represent pointers by memory addresses, which makes several operations that are undefined by the C standard "usually work". The way the compiler chooses to represent a pointer depends on the targeted architecture: compiler writers obviously choose representations which are useful, efficient, or both.
An example of a useful representation is generic pointers on a Harvard architecture micro. They allow you to address both code and data RAM. On an 8-bit micro they might be encoded as one type byte plus 2 address bytes, which obviously implies that whenever you dereference such a pointer, more complex code has to be emitted to load the contents from the proper place.
That gives a good example of an efficient representation: why not have specific pointers then? One which points to code memory, another which points to data memory? Just 2 bytes (assuming a 16-bit address space, as usual for 8-bit micros such as the 8051), and no need to select by type.
But then you have multiple types of pointers (again the 8051: you will likely have at least one additional pointer type pointing within its internal RAM too...). The programmer then needs to think about which particular pointer type to use.
And of course the sizes also differ. On this hypothetical compiler targeting the 8051, you would have a generic pointer type of 3 bytes, an external data memory pointer type of 2 bytes, a code memory pointer of 2 bytes, and an internal RAM pointer type of 1 byte.
Also note that these are the types of the pointers themselves, not the types of the data they point to (function pointers are a little off here, as the fact that a pointer is a function pointer implies that it is of a different type than data pointers, while not having any specific syntax difference except that the type it points to is a function type).
Back to your 16-bit machine, assuming it is an 8086:
If you use some memory model where the compiler assumes you have a single data segment, you will likely get 2 byte data pointers if you don't specifically declare one near or far. Otherwise you will get 4 byte pointers by default. The representation of 2 byte pointers is usually simply the 16bit offset, while for 4 byte pointers it is a segment:offset pair. You can always apply a near or far specifier to explicitly make your pointers one or another type.
(How do near pointers work in a program which also uses far pointers? Simply, there is a default data segment generated by the compiler, and all nears are located within it. The compiler may permanently, or at least most of the time, keep the ds segment register filled with the default data segment, so data pointed to by nears can be accessed faster.)
The size of a pointer depends on the architecture. More precisely, it depends on the size of the addresses used in that architecture, which reflects the width of the bus used to access memory.
For example, on a 32-bit architecture the size of an address is 4 bytes:
sizeof (void *) == 4 bytes.
On a 64-bit architecture, addresses have size 8 bytes:
sizeof (void *) == 8 bytes.
Note that, on such architectures, all data pointers have the same size independently of the pointed-to type. So if you execute your code, the size of an int16 pointer and the size of an int32 pointer will be the same.
However, the size of a pointer on a 16-bit system should be 2 bytes. 16-bit systems usually have very little memory, and 2 bytes are enough to address all of its locations. To be more precise, with a 16-bit pointer the maximum memory you can directly address is 64 KB (65,536 bytes), very little compared to the amount of memory in a computer today.

Does the size of pointers vary in C? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
Can the Size of Pointers Vary Depending on what’s Pointed To?
Are there are any platforms where pointers to different types have different sizes?
Is it possible that the size of a pointer to a float in c differs from a pointer to int? Having tried it out, I get the same result for all kinds of pointers.
#include <stdio.h>
#include <stdlib.h>
int main()
{
    printf("sizeof(int*): %zu\n", sizeof(int*));
    printf("sizeof(float*): %zu\n", sizeof(float*));
    printf("sizeof(void*): %zu\n", sizeof(void*));
    return 0;
}
Which outputs here (OSX 10.6 64bit)
sizeof(int*): 8
sizeof(float*): 8
sizeof(void*): 8
Can I assume that pointers of different types have the same size (on one arch of course)?
Pointers are not always the same size on the same arch.
You can read more on the concept of "near", "far" and "huge" pointers, just as an example of a case where pointer sizes differ...
http://en.wikipedia.org/wiki/Intel_Memory_Model#Pointer_sizes
In days of old, using e.g. Borland C compilers on the DOS platform, there were a total of (I think) 5 memory models which could even be mixed to some extent. Essentially, you had a choice of small or large pointers to data, and small or large pointers to code, and a "tiny" model where code and data had a common address space of (If I remember correctly) 64K.
It was possible to specify "huge" pointers within a program that was otherwise built in the "tiny" model. So in the worst case it was possible to have different sized pointers to the same data type in the same program!
I think the standard doesn't even forbid this, so theoretically an obscure C compiler could do this even today. But there are doubtless experts who will be able to confirm or correct this.
Pointers to data must always be compatible with void *, so nowadays they would generally be realized as types of the same width.
This statement is not true for function pointers, however; they may have a different width. For that reason, in C99, casting function pointers to void * is undefined behavior.
As I understand it there is nothing in the C standard which guarantees that pointers to different types must be the same size, so in theory an int * and a float * on the same platform could be different sizes without breaking any rules.
There is a requirement that char * and void * have the same representation and alignment requirements, and there are various other similar requirements for different subsets of pointer types but there's nothing that encompasses everything.
In practise you're unlikely to run into any implementation that uses different sized pointers unless you head into some fairly obscure places.
Yes. It's uncommon, but this would certainly happen on systems that are not byte-addressable, e.g. a 16-bit system with 64 Kword = 128 KB of memory. On such systems, you can still have 16-bit int pointers. But a char pointer to an 8-bit char would need an extra bit to indicate high byte/low byte within the word, and thus you'd have 17-bit char pointers (in practice stored in 32 bits).
This might sound exotic, but many DSPs spend 99.x% of the time executing specialized numerical code. A sound DSP can be a bit simpler if all it has to deal with is 16-bit data, leaving the occasional 8-bit math to be emulated by the compiler.
I was going to write a reply saying that C99 has various pointer conversion requirements that more or less ensure that pointers to data have to be all the same size. However, on reading them carefully, I realised that C99 is specifically designed to allow pointers to be of different sizes for different types.
For instance on an architecture where the integers are 4 bytes and must be 4 byte aligned an int pointer could be two bits smaller than a char or void pointer. Provided the cast actually does the shift in both directions, you're fine with C99. It helpfully says that the result of casting a char pointer to an incorrectly aligned int pointer is undefined.
See the C99 standard. Section 6.3.2.3
Yes, the size of a pointer is platform dependent. More specifically, the size of a pointer depends on the target processor architecture and the "bit-ness" you compile for.
As a rule of thumb, on a 64bit machine a pointer is usually 64bits, on a 32bit machine usually 32 bits. There are exceptions however.
Since a pointer is just a memory address, it's always the same size regardless of what the memory it points to contains. So a pointer to a float, a char or an int are all the same size.
Can I assume that pointers of different types have the same size (on one arch of course)?
For the platforms with flat memory model (== all popular/modern platforms) pointer size would be the same.
For the platforms with segmented memory model, for efficiency, often there are platform-specific pointer types of different sizes. (E.g. far pointers in the DOS, since 8086 CPU used segmented memory model.) But this is platform specific and non-standard.
You should probably keep in mind that in C++ the size of a normal pointer might differ from the size of a pointer to a virtual method. Pointers to virtual methods have to preserve extra information to work properly with polymorphism. This is probably the only exception I'm aware of which is still relevant (since I doubt that the segmented memory model will ever make it back).
There are platforms where function pointers are a different size than other pointers.
I've never seen more variation than this. All other pointers must be at most sizeof(void*) since the standard requires that they can be cast to void* without loss of information.
A pointer is a memory address, and hence should be the same size on a specific machine: 32-bit machine => 4 bytes, 64-bit => 8 bytes.
Hence, irrespective of the datatype of the thing that the pointer is pointing to, the size of a pointer on a specific machine would be the same (since the space required to store a memory address would be the same).
Assumption: I'm talking about near pointers to data values, the kind you declared in your question.

Why does C need arrays if it has pointers?

If we can use pointers and malloc to create and use arrays, why does the array type exist in C? Isn't it unnecessary if we can use pointers instead?
Arrays are faster than dynamic memory allocation.
Arrays are "allocated" at "compile time" whereas malloc allocates at run time. Allocating takes time.
Also, C does not mandate that malloc() and friends are available in free-standing implementations.
Edit
Example of array
#define DECK_SIZE 52
int main(void) {
    int deck[DECK_SIZE];

    play(deck, DECK_SIZE);
    return 0;
}
Example of malloc()
int main(void) {
    size_t len = 52;
    int *deck = malloc(len * sizeof *deck);

    if (deck) {
        play(deck, len);
    }
    free(deck);
    return 0;
}
In the array version, the space for the deck array was reserved by the compiler when the program was created (but, of course, the memory is only reserved/occupied when the program is being run); in the malloc() version, space for the deck array has to be requested at every run of the program.
Arrays can never change size, malloc'd memory can grow when needed.
If you only need a fixed number of elements, use an array (within the limits of your implementation).
If you need memory that can grow or shrink during the running of the program, use malloc() and friends.
It's not a bad question. In fact, early C had no array types.
Global and static arrays are allocated at compile time (very fast). Other arrays are allocated on the stack at runtime (fast). Allocating memory with malloc (to be used for an array or otherwise) is much slower. A similar thing is seen in deallocation: dynamically allocated memory is slower to deallocate.
Speed is not the only issue. Array types are automatically deallocated when they go out of scope, so they cannot be "leaked" by mistake. You don't need to worry about accidentally freeing something twice, and so on. They also make it easier for static analysis tools to detect bugs.
You may argue that there is the function _alloca() which lets you allocate memory from the stack. Yes, there is no technical reason why arrays are needed over _alloca(). However, I think arrays are more convenient to use. Also, it is easier for the compiler to optimise the use of an array than a pointer with an _alloca() return value in it, since it's obvious what a stack-allocated array's offset from the stack pointer is, whereas if _alloca() is treated like a black-box function call, the compiler can't tell this value in advance.
EDIT, since tsubasa has asked for more details on how this allocation occurs:
On x86 architectures, the ebp register normally refers to the current function's stack frame, and is used to reference stack-allocated variables. For instance, you may have an int located at [ebp - 8] and a char array stretching from [ebp - 24] to [ebp - 9]. And perhaps more variables and arrays on the stack. (The compiler decides how to use the stack frame at compile time. C99 compilers allow variable-size arrays to be stack allocated; this is just a matter of doing a tiny bit of work at runtime.)
In x86 code, pointer offsets (such as [ebp - 16]) can be represented in a single instruction. Pretty efficient.
Now, an important point is that all stack-allocated variables and arrays in the current context are retrieved via offsets from a single register. If you call malloc there is (as I have said) some processing overhead in actually finding some memory for you. But also, malloc gives you a new memory address. Let's say it is stored in the ebx register. You can't use an offset from ebp anymore, because you can't tell what that offset will be at compile time. So you are basically "wasting" an extra register that you would not need if you used a normal array instead. If you malloc more arrays, you have more "unpredictable" pointer values that magnify this problem.
Arrays have their uses and should be used when you can: static allocation helps make programs more stable, and avoiding dynamic allocation is at times a necessity to ensure memory leaks can't happen.
They exist because some use cases demand them.
In a language such as BASIC, the set of allowed commands is fixed and known from the language itself. So what would be the benefit of using malloc to create the arrays, and then filling them in from strings?
If I have to define the names of the operations anyway, why not put them into an array?
C was written as a general purpose language, which means that it should be useful in any situation, so they had to ensure that it had the constructs to be useful for writing operating systems as well as embedded systems.
An array is, in effect, a shorthand for a pointer to the beginning of a block of memory, such as one returned by malloc.
But imagine trying to do matrix math by using pointer manipulations rather than vec[x] * vec[y]. It would be very prone to difficult-to-find errors.
See this question discussing space hardening and C. Sometimes dynamic memory allocation is just a bad idea, I have worked with C libraries that are completely devoid of malloc() and friends.
You don't want a satellite dereferencing a NULL pointer any more than you want air traffic control software forgetting to zero out heap blocks.
It's also important (as others have pointed out) to understand what is part of C and what extends it into various uniform standards (e.g. POSIX).
Arrays are a nice syntactic improvement over raw pointers. You can make all sorts of mistakes unknowingly when dealing with pointers — for example, stepping too far through memory because the arithmetic is scaled by the wrong element size.
Explanation by Dennis Ritchie about C history:
Embryonic C
NB existed so briefly that no full description of it was written. It supplied the types int and char, arrays of them, and pointers to them, declared in a style typified by
int i, j;
char c, d;
int iarray[10];
int ipointer[];
char carray[10];
char cpointer[];
The semantics of arrays remained exactly as in B and BCPL: the declarations of iarray and carray create cells dynamically initialized with a value pointing to the first of a sequence of 10 integers and characters respectively. The declarations for ipointer and cpointer omit the size, to assert that no storage should be allocated automatically. Within procedures, the language's interpretation of the pointers was identical to that of the array variables: a pointer declaration created a cell differing from an array declaration only in that the programmer was expected to assign a referent, instead of letting the compiler allocate the space and initialize the cell.
Values stored in the cells bound to array and pointer names were the machine addresses, measured in bytes, of the corresponding storage area. Therefore, indirection through a pointer implied no run-time overhead to scale the pointer from word to byte offset. On the other hand, the machine code for array subscripting and pointer arithmetic now depended on the type of the array or the pointer: to compute iarray[i] or ipointer+i implied scaling the addend i by the size of the object referred to.
These semantics represented an easy transition from B, and I experimented with them for some months. Problems became evident when I tried to extend the type notation, especially to add structured (record) types. Structures, it seemed, should map in an intuitive way onto memory in the machine, but in a structure containing an array, there was no good place to stash the pointer containing the base of the array, nor any convenient way to arrange that it be initialized. For example, the directory entries of early Unix systems might be described in C as
struct {
    int inumber;
    char name[14];
};
I wanted the structure not merely to characterize an abstract object but also to describe a collection of bits that might be read from a directory. Where could the compiler hide the pointer to name that the semantics demanded? Even if structures were thought of more abstractly, and the space for pointers could be hidden somehow, how could I handle the technical problem of properly initializing these pointers when allocating a complicated object, perhaps one that specified structures containing arrays containing structures to arbitrary depth?
The solution constituted the crucial jump in the evolutionary chain between typeless BCPL and typed C. It eliminated the materialization of the pointer in storage, and instead caused the creation of the pointer when the array name is mentioned in an expression. The rule, which survives in today's C, is that values of array type are converted, when they appear in expressions, into pointers to the first of the objects making up the array.
To summarize in my own words: if name above were just a pointer, any instance of that struct would contain an additional pointer, destroying the perfect mapping of it onto an external object (like a directory entry).