How many machine instructions can a single memory address store?

I'm new to GDB and am trying to examine memory. The title says it all. I compiled some C code and set a breakpoint on main. When I type x/x $eip, it gives me back what looks like a machine instruction: 0xd02404c7.
When I then try x/5x $eip, it gives back:
0x8048426 <main+9>: 0xd02404c7 0xe8080484 0xfffffebe 0x9066c3c9
0x8048436: 0x90669066
So I got a little confused. The space between addresses 0x8048426--0x8048436 is equal to 10. So it turns out that four instructions took "10 addresses". My questions are:
Can a memory address store a maximum of 4 machine instructions?
Why does it take "10 addresses" to store 4 machine instructions?
Is there any relationship between how many bits a processor has and how many machine instructions a single memory address can store?
Sorry if the question sounds silly.

The space between addresses 0x8048426--0x8048436 is equal to 10. So it turns out that four instructions took "10 addresses"
Not quite. It's equal to 0x10, which is a hexadecimal number, equal to 16 in decimal.
So those four values occupy 16 bytes.
Can a memory address store a maximum of 4 machine instructions?
Addresses have a granularity of 1 byte. That is, one address refers to exactly one byte.
Machine instructions take 1 or more bytes. So a single memory address, a single byte, can hold at most 1 machine instruction, at least on x86.
Why does it take "10 addresses" to store 4 machine instructions?
Each of the numbers you see is not an instruction. The 4 numbers on the first line are called words (4 bytes each here) and are what your CPU usually works with. They are a dump of 16 raw bytes of code, and instruction boundaries do not have to line up with word boundaries.
Is there any relationship between how many bits a processor has and how many machine instructions a single memory address can store?
Not really. A single memory address can store at most one instruction, because instructions are at least 1 byte long (for x86).
But "how many bits a processor has" can indicate that your processor has access to an extended or different instruction set.
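To make the "words are not instructions" point concrete, here is a minimal sketch that splits one of the printed words into the four bytes GDB actually read, assuming a little-endian target such as x86. The helper name word_to_bytes is made up for this illustration.

```c
#include <stdint.h>

/* Split a 32-bit word, as printed by GDB's x/x, into the 4 bytes that
 * sit at consecutive addresses on a little-endian machine such as x86.
 * out[0] is the byte at the lowest address. (Hypothetical helper.) */
void word_to_bytes(uint32_t word, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(word >> (8 * i));
}
```

For 0xd02404c7 this gives c7 04 24 d0, so the byte at 0x8048426 is 0xc7, the byte at 0x8048427 is 0x04, and so on. Running disas/r main in GDB shows how those bytes group into instructions of different lengths.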

Related

Why 2 raised to 32 power results in a number in bytes instead of bits?

I just restarted studying C programming. Now I'm studying memory storage capacity and the difference between bits and bytes. I came across this definition.
There is a calculation for a 32-bit system. I'm very confused, because in this calculation 2^32 = 4294967296 bytes, which is about 4 gigabytes. My question is: why does 2 raised to the 32nd power result in a number of bytes instead of bits?
Thanks for helping me.
Because the memory is byte-addressable (that is, each byte has its own address).
There are two ways to look at this:
A 32-bit integer can hold one of 2^32 different values. Thus, a uint32_t can represent the values from 0 to 4294967295.
A 32-bit address can represent 2^32 different addresses. And as Scott said, on a byte-addressable system, that means 2^32 different bytes can be addressed. Thus, a process with 32-bit pointers can address up to 4 GiB of virtual memory. Or, a microprocessor with a 32-bit address bus can address up to 4 GiB of RAM.
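The arithmetic can be sketched in a few lines of C. This is only an illustration; the function name addressable_bytes is an assumption, and the result counts bytes only because each address is taken to name one byte.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct addresses an N-bit pointer can form (1 <= N <= 63).
 * On a byte-addressable machine this is also the number of bytes. */
uint64_t addressable_bytes(unsigned bits)
{
    assert(bits >= 1 && bits <= 63);   /* 1 << 64 would not fit in uint64_t */
    return (uint64_t)1 << bits;
}
```

With bits = 32 the result is 4294967296, which is 4 GiB when each address refers to one byte.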
That description is really superficial and misses a lot of important considerations, especially as to how memory is defined and accessed.
Fundamentally an N-bit value has 2^N possible states, so a 16-bit value has 65,536 possible states. Additionally, memory is accessed as bytes, or 8-bit values. This was not always the case: older machines had different "word" sizes, anywhere from 4 to 36 bits per word, occasionally more, but over time the 8-bit word, or "byte", became the dominant form.
In every case a memory "address" contains one "word" or, on more modern machines, one "byte". Memory is measured in these units, like "kilowords" or "gigabytes", for reasons of simplicity, even though the individual memory chips themselves are specified in terms of bits. For example, a 1 gigabyte memory module often has eight 1-gigabit chips on it. These chips are read at the same time and the resulting data is combined to produce a single byte of memory.
By that article's wobbly definition this means a 16-bit CPU can only address 64KB of memory, which is wrong. DOS systems from the 1980s used two 16-bit values to represent a memory address, a segment and an offset, which the 8086 combined into a 20-bit address covering 1MB; the 80286 in protected mode could address 16MB using 24-bit physical addresses. This isn't the only way in which the raw pointer size and total addressable memory can differ.
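One well-known segment-and-offset scheme is 8086 real mode, where the segment is shifted left by 4 bits and added to the offset. The sketch below models only that calculation; the function name is an assumption, and the 20-bit wraparound of the original 8086 is deliberately not modeled.

```c
#include <stdint.h>

/* 8086 real mode: physical address = segment * 16 + offset.
 * Both inputs are 16-bit. The sum is returned unwrapped, so it can
 * exceed 0xFFFFF for the highest segment values. */
uint32_t real_mode_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Two 16-bit values therefore reach about 1 MB, far more than the 64KB a single 16-bit pointer could name.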
Some 32-bit systems also had an alternate 36-bit memory model that allowed addressing up to 64GB of memory, though an individual process was limited to a 4GB slice of the available memory.
In other words, for systems with a single pointer to a memory address and where the smallest memory unit is a byte, the maximum addressable memory is 2^N bytes.
Thankfully, since 64-bit systems are now commonplace and a computer with > 64GB of memory is not even exotic or unusual, addressing systems are a lot simpler now than when having to work around pointer-size limitations.
We say that memory is byte-addressable: you can think of a byte as the smallest unit of memory, so you read bytes, not bits. The reason might be that the smallest data type is 1 byte; even the boolean type in C/C++ is typically 1 byte.

Examining memory with x86 proccesor [duplicate]

I'm new to GDB and have a problem with it. I have an x86 processor, which means the eip register should hold 4 bytes. I compiled some C code and set a breakpoint on main(). Typing x/x $eip gives me back "0xd02404c7" (hexadecimal), which as far as I know is a machine language instruction. So my question is: if this machine instruction is 4 bytes in size, then the command "x/4x $eip" should display 16 bytes, and it shows me this:
0x8048426 <main+9>: 0xd02404c7 0xe8080484 0xfffffebe 0x9066c3c9
So I'm confused. If this is 16 bytes, then why does it show that it is all located at the same memory address, when 1 register in a 32-bit processor should contain only 4 bytes? Thank you.
Typing x/x $eip gives me back "0xd02404c7" (hexadecimal), which as far as I know is a machine language instruction.
No, it gives you raw bytes of your code. These raw bytes can "cover" less than one, exactly one, or several machine instructions. The shortest x86 instruction takes up just one byte. The longest takes 15 bytes.
So my question is: if this machine instruction is 4 bytes in size.
An address is 4 bytes, but the instruction itself may contain 1 to 15 bytes. You can see the relationship between bytes and instructions if you run (gdb) disas/r main
So every memory address can store 4 machine instructions?
Not at all. Every memory address corresponds to 1 byte of memory. That byte may contain an entire (single-byte) instruction, or it can be a start of multi-byte instruction, or it could not contain any instructions at all (if the address points to e.g. .data section).

How does one access individual characters of a string properly aligned in memory, on ARM platform?

Since (from what I have read) ARM9 platform may fail to correctly load data at an unaligned memory address, let's assume unaligned meaning that the address value is not multiple of 2 (i.e. not aligned on 16-bit), then how would one access say, fourth character on a string of characters pointed to by a properly aligned pointer?
char buf[] = "Hello world.";
buf[3]; // (buf + 3) is unaligned here, is it not?
Does compiler generate extra code, as opposed to the case when buf + 3 is properly aligned? Or will the last statement in the example above have undesired results at runtime - yielding something else than the fourth character, the second l in Hello?
Byte accesses don't have to be aligned. The compiler will generate a ldrb instruction, which does not need any sort of alignment.
If you're curious as to why, this is because ARM will load the entire aligned word that contains the target byte, and then simply select that byte out of the four it just loaded.
The concept to remember is that the compiler will try to optimize access based on the type in order to get the most efficiency out of your processor. So when accessing ints, it'll want to use things like the ldr instruction, which will fault on an unaligned access. For something like a char access, the compiler will work out some of the details for you. Where you have to be concerned are things like:
Casting pointers. If you cast a char * to an int * and the pointer is not aligned correctly, you'll get an alignment trap. In general, it's okay to cast down (from an int * to a char *), but not the other way around. You would not want to do this:
char buf[] = "12345678";
int *p = (int *)&buf[1]; // the cast compiles, but buf + 1 is generally not int-aligned
printf("0x%08X\n", (unsigned)*p); // *p is badness here!
Trying to pull data off the wire with structures. I've seen this done a lot, and it's just plain bad practice. Endianness issues aside, you can cause an alignment trap if the elements aren't aligned correctly for the platform.
FWIW, casting pointers is probably the number one issue I've seen in practice.
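A common way to avoid both problems is to copy the bytes out instead of dereferencing a cast pointer. The sketch below is one such approach, not the answer author's code; the function names load_u32 and load_le32 are assumptions, and the second one assumes the wire format is little-endian.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from any address, aligned or not, in host byte
 * order. memcpy leaves it to the compiler to pick loads that are safe
 * for the target CPU. */
uint32_t load_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Decode a little-endian 32-bit field from a wire buffer one byte at a
 * time, so neither host alignment nor host endianness matters. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
```

Either function can be pointed at buf + 1 from the example above without relying on an aligned address.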
There's a great book called Write Portable Code which goes over quite a few details about writing code for multiple platforms. The sample chapter on the linked site actually contains a section talking about alignment.
There's a little more that's going on too. Memory functions, like malloc, also give you back aligned blocks (generally on a double-word boundary) so that you can write in data and not hit an alignment fault.
One last bit, while newer ARMs can cope with unaligned accesses better, that does not mean they're performant. It just means they're tolerant. The same can be said for the X86 processors too. They'll do the unaligned access, but you're forcing extra memory fetches by doing so.
Most systems use byte based addressing. The address 0x1234 is in terms of bytes for example. Assume that I mean 8 bit bytes for this answer.
The definition of unaligned has to do with the size of the transfer. A 32 bit transfer for example is 4 bytes. 4 is 2 to the power 2, so if the lower 2 bits of the address are anything other than zeros then that address is an unaligned 32 bit transfer.
So using a table like this or just understanding powers of 2
bits   bytes   zero bits   address bits
8      1       0           []
16     2       1           [0]
32     4       2           [1:0]
64     8       3           [2:0]
128    16      4           [3:0]
the first column is the number of bits in the transfer. the second is the number of bytes that represents, the third is the number of bits at the bottom of the address that have to be zero to make it an aligned transfer, and the last column describes those bits.
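The rule in the table reduces to a single mask test. The sketch below is a minimal illustration; the name is_aligned is an assumption, and it assumes the transfer size is a power of two as in the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when addr is aligned for a transfer of size_bytes bytes.
 * size_bytes must be a power of two (1, 2, 4, 8, 16, ...). */
bool is_aligned(uintptr_t addr, uintptr_t size_bytes)
{
    assert(size_bytes != 0 && (size_bytes & (size_bytes - 1)) == 0);
    return (addr & (size_bytes - 1)) == 0;   /* low log2(size) bits are zero */
}
```

For a 1-byte transfer the mask is 0, so every address passes, which matches the empty last column in the first row.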
It is not possible to have an unaligned 8 bit transfer. Not on arm, not on any system. Please understand that.
16 bit transfers. Once we get into transfers of 16 bits or larger, you can START to talk about being unaligned. The problem with unaligned transfers has to do with the number of bus cycles. Say you are doing 16 bit transfers on a system with a 16 bit wide bus and 16 bit wide memories. That means that we have items in memory at these addresses for example, address on left, data on right:
0x0100 : 0x1234
0x0102 : 0x5678
If you want to do an aligned 16 bit transfer, the lsbit of your address must be zero: 0x100, 0x102, 0x104, etc. Unaligned transfers would be at addresses with the lsbit set: 0x101, 0x103, 0x105, etc. Why are they a problem? In this hypothetical system (there were and still are real systems like this), to get two bytes at address 0x0100 we only need to access the memory one time and take all 16 bits from that one address, resulting in 0x1234. But if we want 16 bits starting at address 0x0101, we have to do two memory transactions, at 0x0100 and 0x0102, take one byte from each, and combine those to get the result, which on a little endian system is 0x7812. That takes more clock cycles, more logic, etc. Inefficient and costly. Intel x86 and other systems from that era were 8 or 16 bit processors that used 8 bit memory, so everything larger than an 8 bit transfer took multiple clock cycles, and instructions themselves took multiple clock cycles to execute before the next one could start. Burning clock cycles and complicating the logic was not a concern (they saved themselves from pain in other ways).
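The 0x7812 result can be checked with a small model in which a byte array stands in for the little endian memory above. This is only a sketch; read16_le is a hypothetical helper, and index 0 of the array plays the role of address 0x0100.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian 16-bit read from a byte array standing in for memory.
 * offset may be odd; the byte at the lower address becomes the low
 * half of the result. */
uint16_t read16_le(const uint8_t *mem, size_t offset)
{
    return (uint16_t)(mem[offset] | (mem[offset + 1] << 8));
}
```

With the bytes 0x34, 0x12, 0x78, 0x56 stored in that order, offset 0 gives 0x1234, offset 2 gives 0x5678, and the unaligned offset 1 straddles both 16 bit locations and gives 0x7812.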
The older ARMs may or may not have been from that era, but post-Acorn, from ARMv4 to the present, ARM is a 32 bit system from the perspective of the size of the general purpose registers, and the data bus is 32 or 64 bits depending on your system (the newest ARMs have 64 bit registers and, I would assume, 128 bit busses if not already). The core that put ARM on the map, the ARM7TDMI, which is an ARMv4T, has, I assume, a 32 bit data bus. The ARM ARM (ARM Architecture Reference Manual) covering ARM7 and ARM9 changed its language on each revision (I have several revisions going back to the paper-only ones) with respect to phrases like UNPREDICTABLE RESULTS, and when and where it would list something as bad or broken. Some of this was legal. Understand that ARM does not make chips, they sell IP: back then it was masks for a particular foundry, today you get the source code to their core and you deal with it. So to survive you need a good legal defense, because your secrets are exposed to customers. Some of the items that were claimed not to be supported actually have deterministic results; if ARM were to find a clone (which is yet another legal discussion) where these unpredictable results were predictable and matched what ARM's logic does, the cloner would have to be pretty good at explaining why. The clones have been crushed when they have tried (that, or they legally became licensed ARM cores), so some of this is just interesting history. Another ARM manual described quite clearly what happens when you do an unaligned transfer on the older ARM7 systems. And it is a bit of a duh moment when you see it, quite obvious what was going on (just plain keep-it-simple-stupid logic).
The byte lanes rotated. On a 32 bit bus somewhere in the system, likely not on the amba/axi bus but inside the memory controller you would effectively get this:
0x0100 : 0x12345678
0x0101 : 0x78123456
0x0102 : 0x56781234
0x0103 : 0x34567812
Address on the left, resulting data on the right. Now why is that obvious, you ask, and what is the size of that transfer? The size of the transfer is irrelevant. Look at that address/data this way:
0x0100 : 0x12345678
0x0101 : 0xxx123456
0x0102 : 0xxxxx1234
0x0103 : 0xxxxxxx12
Using aligned transfers, 0x0100 is legal for 32, 16, and 8 bit, and if you look at the lower 8, 16, or 32 bits you get the right answer with the data as shown. For address 0x0101 only an 8 bit transfer is legal, and the right byte is already in the lower 8 bits of that data, so it is just copied over to the register's lower 8 bits. For address 0x0102, 8 and 16 bit are legal, aligned transfers, and 0x1234 is the right answer for 16 bit and 0x34 for 8. Lastly, at 0x0103 an 8 bit transfer is the only size without alignment issues, and 0x12 is the right answer.
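The rotated values listed above can be reproduced with a short model: take the aligned word that contains the address and rotate it right by 8 bits per byte of misalignment. This is an illustrative sketch of the described behavior, not code from any ARM document, and the name rotated_load is made up.

```c
#include <stdint.h>

/* Model of the rotated result described above for older ARM cores:
 * the aligned word containing addr is fetched, then rotated right by
 * 8 bits for each byte of misalignment. (Illustrative model only.) */
uint32_t rotated_load(uint32_t aligned_word, uint32_t addr)
{
    unsigned shift = 8u * (addr & 3u);
    if (shift == 0)
        return aligned_word;           /* avoid an undefined 32-bit shift */
    return (aligned_word >> shift) | (aligned_word << (32u - shift));
}
```

Feeding it 0x12345678 with addresses 0x0100 through 0x0103 gives 0x12345678, 0x78123456, 0x56781234 and 0x34567812, the same sequence as in the listing.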
This above information is all from publicly available documents, no secrets here or special insider knowledge, just generic programming experience.
ARM put an exception in, data abort or prefetch abort (Thumb is a separate topic), to discourage the use of unaligned transfers, as do other architectures. Unfortunately x86 has led people to be very lazy and to not care about the performance hit they incur when doing such a thing on an x86, which allows the transfer at the price of extra cycles and extra logic. The prefetch abort, if I remember correctly, was not on by default on the ARM7 platforms I used, but was on by default on the ARM9 platforms I used. My memory could be wrong, and since I don't know how the defaults worked, that could have been a strap option on the core, so it could have varied from chip to chip, vendor to vendor. You could disable it and do unaligned transfers so long as you understood what happened with the data (rotate, not spill over into the next word).
More modern ARM processors do support unaligned transfers and they behave as one would expect. I won't use 64 bit examples here, to save typing and space, but go back to that 16 bit example to paint the picture:
0x0100: 0x1234
0x0102: 0x5678
With a 16 bit wide system, memory and bus, little endian, if you did a 16 bit unaligned transfer at address 0x0101 you would expect to see 0x7812 and that is what you get now on the modern arm systems. But it is still a software controlled feature, you can enable exceptions on unaligned transfers and you will get a data abort instead of a completed transfer.
As far as your question goes, look at the ldrb instruction. That instruction does an 8 bit read from memory, and being 8 bit there is no such thing as unaligned: all addresses are valid. If buf[] happened to live at address 0x1234 then buf[3] is at address 0x1237, and that is a perfectly valid address for an 8 bit read. No alignment issues of any kind, no exceptions will fire. Where you would get into trouble is if you do one of these very ugly programming hacks:
char buf[]="hello world";
short *sptr;
int *iptr;
sptr=(short *)&buf[3];
iptr=(int *)&buf[3];
...
something=*sptr;
something=*iptr;
...
short_something=*(short *)&buf[3];
int_something=*(int *)&buf[3];
And then yes, you would need to worry about unaligned transfers, as well as hoping that you don't have any compiler optimization issues making the code not work as you had thought it would. +1 to jszakmeister for already covering this subtopic.
short answer:
char buf[]="hello world";
char is generally assumed to mean an 8 bit byte, so this is a quantity of 8 bit items. Certainly when compiled for ARM that is what you will get (or MIPS or x86 or PowerPC, etc). So accessing buf[X] for any X within that string cannot be unaligned, because
something = buf[X];
is an 8 bit transfer and you can't have unaligned 8 bit transfers. If you were to do this
short buf[]={1,2,1,2,3,2,1};
short is assumed to be 16 bits, though that is not always the case; for the ARM compilers I know it is 16 bits. But that doesn't matter: buf[X] here also cannot be unaligned, because the compiler computes the offset for you. The address of buf[X] is base_address_of_buf + (X<<1). And the compiler and/or linker will ensure, on ARM, MIPS, and other systems, that buf is placed on a 16 bit aligned address, so that math will always result in an aligned address.

Limits on Addressability?

I am reading some C text at the address:
https://cs.senecac.on.ca/~lczegel/BTP100/pages/content/compu.html
In the section "Addressable Memory" they say that "The maximum size of addressable primary memory depends upon the size of the address registers."
I do not understand why that is.
Can anyone give me a clear explanation, please?
Thanks a lot.
If you have 32-bit registers, then the highest address you can store in a single register is 2^32-1, so you can address 2^32 units (in modern computers, units are almost always bytes). A larger number simply won't fit.
You can get around this by using memory addresses that are larger than a single register can hold (and some CPUs/operating systems have features for doing so), but using addresses/pointers will be slower because it has to fiddle with multiple registers.
As an example, suppose you have 32-bit registers but 64-bit pointers and want to increment a pointer to find the next item in an array of char (++p). Instead of performing a simple increment instruction, the processor will have to
Increment the lower 32 bits;
check if the result is zero (overflow);
increment the upper half as well if overflow occurred.
Simplifying a bit, this means it has to perform a branch (if-then-else) instruction, which is one of the slowest and most complex instructions a modern CPU performs.
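The three steps can be written out in C as follows. This is a sketch under stated assumptions: the struct wide_ptr layout and the function name are hypothetical, and the explicit if mirrors the branch described above rather than what any particular compiler would emit.

```c
#include <stdint.h>

/* A 64-bit pointer kept in two 32-bit halves (hypothetical layout). */
struct wide_ptr {
    uint32_t lo;
    uint32_t hi;
};

/* ++p for the split pointer: bump the low half, then carry into the
 * high half if the low half wrapped around. */
void wide_ptr_increment(struct wide_ptr *p)
{
    p->lo += 1;          /* unsigned arithmetic wraps to 0 on overflow */
    if (p->lo == 0)
        p->hi += 1;
}
```

A pointer with lo = 0xFFFFFFFF and hi = 0 becomes lo = 0, hi = 1 after one increment.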
(See, e.g., x86 memory segmentation on Wikipedia for a multi-register addressing scheme used in Intel processors.)
Keeping it simple: the address registers are used to store and refer to addresses of memory; since their size and number is fixed, there is a maximum address.
Obviously you can't exploit more memory than what is addressable (because the machine wouldn't know how to refer to it), so the usable memory is in fact limited by the maximum address that can be expressed by the address registers.
If you have 1 address register, holding a 16 bit address, you can have a maximum of 2^16 addresses (0 through 2^16 - 1).
However many registers there are, the number of addresses each can point to is limited by its width (number of bits).
Thus, the maximum size of addressable primary memory depends upon the size of the address registers.

What is meant by "memory is 8 bytes aligned"?

While going through one project, I have seen that the memory data is "8 bytes aligned". Can anyone please explain what this means?
An object that is "8 bytes aligned" is stored at a memory address that is a multiple of 8.
Many CPUs will only load some data types from aligned locations; on other CPUs such access is just faster. There are also several other possible reasons for using memory alignment - without seeing the code it's hard to say why.
Aligned access is faster because the external bus to memory is not a single byte wide - it is typically 4 or 8 bytes wide (or even wider). This means that the CPU doesn't fetch a single byte at a time - it fetches 4 or 8 bytes starting at the requested address. As a consequence of this, the 2 or 3 least significant bits of the memory address are not actually sent by the CPU - the external memory can only be read or written at addresses that are a multiple of the bus width. If you requested a byte at address "9", the CPU would actually ask the memory for the block of bytes beginning at address 8, and load the second one into your register (discarding the others).
This implies that a misaligned access can require two reads from memory: If you ask for 8 bytes beginning at address 9, the CPU must fetch the 8 bytes beginning at address 8 as well as the 8 bytes beginning at address 16, then mask out the bytes you wanted. On the other hand, if you ask for the 8 bytes beginning at address 8, then only a single fetch is needed. Some CPUs will not even perform such a misaligned load - they will simply raise an exception (or even silently load the wrong data!).
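The one-fetch versus two-fetch cases can be counted with a few lines of arithmetic. The sketch below assumes memory is read in whole blocks of bus_width bytes starting at multiples of bus_width; the name fetches_needed is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bus-width blocks touched by an access of size bytes at
 * addr. bus_width is the bus width in bytes; size must be at least 1. */
unsigned fetches_needed(uint64_t addr, uint64_t size, uint64_t bus_width)
{
    assert(size >= 1 && bus_width >= 1);
    uint64_t first = addr / bus_width;              /* block holding the first byte */
    uint64_t last = (addr + size - 1) / bus_width;  /* block holding the last byte */
    return (unsigned)(last - first + 1);
}
```

With an 8-byte bus, 8 bytes at address 8 need one fetch, 8 bytes at address 9 need two, and a single byte at address 9 needs one.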
Memory alignment is important for performance in different ways. It has a hardware-related reason. Since the 80s there has been a growing gap between the speed of the CPU and the speed of memory: processor speed has grown faster than memory speed, and the difference gets bigger over time. (To give an example: on the Apple II the CPU ran at 1.023 MHz and the memory at twice that frequency, giving 1 cycle for the CPU and 1 cycle for the video. A modern PC runs at about 3GHz on the CPU, with memory at barely 400MHz.) One solution to the problem of relatively ever-slower memory is to access it on ever wider busses: instead of accessing 1 byte at a time, the CPU reads a 64 bit wide word from memory. This means that even if you read 1 byte from memory, the bus will deliver a whole 64 bit (8 byte) word. The memory will have these 8 byte units at addresses 0, 8, 16, 24, 32, 40, etc.: multiples of 8. If you access, for example, an 8 byte word at address 4, the hardware will have to read the word at address 0, keep only the high 4 bytes of that word, then read the word at address 8, keep only the low 4 bytes of that word, combine the two halves, and give the result to the register. As you can see, this is a quite complicated (thus slow) operation. This is the first reason one likes aligned memory access.
"X bytes aligned" means that the base address of your data must be a multiple of X. It can be required by special hardware such as a DMA engine, or used for faster access by the CPU, etc.
This is the case for the Cell processor, where data must be 16 bytes aligned in order to be copied to/from the co-processor.
If memory data is 8 bytes aligned, it means its address is a multiple of 8: (uintptr_t)&the_data % 8 == 0. In C, a structure that is 8 bytes aligned also has a size that is a multiple of 8, so that every element of an array of it stays aligned; if the members don't add up to that, padding is added manually or by the compiler. Some compilers provide directives for this: for VC, #pragma pack(8) sets the maximum alignment of structure members (and __declspec(align(8)) aligns a type), and for gcc, it is __attribute__((aligned(8))).
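Standard C11 also offers a portable spelling through <stdalign.h>. The sketch below is an illustration only: struct sample and is_8_byte_aligned are made-up names, and the address test assumes the usual conversion of a pointer to uintptr_t yields the numeric address.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* A 3-byte payload forced onto an 8-byte boundary (C11 alignas).
 * The compiler pads the struct so its size is a multiple of 8. */
struct sample {
    alignas(8) char data[3];
};

/* True when the object at p starts on an 8-byte boundary. */
bool is_8_byte_aligned(const void *p)
{
    return ((uintptr_t)p % 8) == 0;
}
```

Here alignof(struct sample) is 8 and sizeof(struct sample) is padded up to 8, so every element of an array of struct sample starts at a multiple of 8.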
