Understanding hexadecimal output from C pointers

I ran some C code in Code::Blocks and printed a memory address. The result was a 16-character hexadecimal number. This was an exercise from the freeCodeCamp YouTube tutorial on C, at 3:14:05.
https://www.youtube.com/watch?v=KJgsSFOSQv0&list=FLWOrEQtSUgIDNHMCI3yfvnQ&index=4
The YouTube presenter got an 8-character hexadecimal result. Does this mean I have twice the system RAM of the presenter? My Windows 10 laptop has 32 GB of RAM.
Edit: I was asked to post the relevant C code; here it is.
#include <stdio.h>

int main(void)
{
    int age = 30;
    int *pAge = &age;
    double gpa = 3.4;
    double *pGpa = &gpa;
    char grade = 'A';
    char *pGrade = &grade;
    /* %p expects a void pointer, so cast the address */
    printf("age's memory address: %p\n", (void *)&age);
    return 0;
}

Older Windows versions used 32-bit addresses. Newer versions support 64-bit data access and 64-bit addresses (given that the CPU is 64-bit too). So an 8-character result means a 32-bit build and a 16-character result means a 64-bit build. It has absolutely nothing to do with how much physical RAM you have, although it was problematic to expand old PCs beyond 4 GB of addressable memory, which was one of the main reasons for migrating to 64-bit.
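As a quick check, here is a minimal sketch (my addition, not from the question) that prints the pointer size of the build you are running; expect 4 bytes on a 32-bit build and 8 bytes on a 64-bit build:

#include <stdio.h>

int main(void)
{
    /* sizeof(void *) is 4 on a typical 32-bit build and 8 on a 64-bit build;
       that is what determines whether %p prints 8 or 16 hex digits */
    printf("pointer size: %zu bytes\n", sizeof(void *));
    return 0;
}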

Related

What are the memory-related ISA features?

I'm preparing for a Computer Architecture exam, and I can't seem to answer this question:
The following code is useful in checking a memory-related ISA feature. What can you determine using this function?
#define X 0
#define Y 1

int mystery_test(void)
{
    int i = 1;
    char *p = (char *) &i;
    if (p[0] == 1) return X;
    else return Y;
}
My thought was that it checks whether pointers and arrays are basically the same, but that isn't a memory-related feature, so I'm pretty sure my answer is wrong.
Could someone help, please? And also, what are the memory-related ISA features?
Thank you!
The answer from Retired Ninja is way more than you ever want to know, but the shorter version is that the mystery code is testing the endianness of the CPU.
We all know that memory in modern CPUs is byte oriented, but when it's storing a larger item - say, a 4-byte integer - what order does it lay down the components?
Imagine the integer value 0x11223344, which is four separate bytes (0x11 .. 0x44). They can't all fit at a single byte memory address, so the CPU has to put them in some kind of order.
Little-endian means the low part - the 0x44 - is in the lowest memory address, while big-endian puts the most significant byte first; here we're pretending that it's stored at memory location 0x9000 (chosen at random):
          Little-endian   Big-endian
0x9000:   0x44            0x11
0x9001:   0x33            0x22
0x9002:   0x22            0x33
0x9003:   0x11            0x44
It has to pick something, right?
The code you're considering stores the integer value 1 into a 4-byte (or 8-byte) chunk of memory, so it will end up in one of these two layouts:
          Little-endian   Big-endian
0x9000:   0x01            0x00
0x9001:   0x00            0x00
0x9002:   0x00            0x00
0x9003:   0x00            0x01
But by turning an int pointer into a char pointer, the code looks at only the byte at the lowest address, and the 0/1 value found there tells you whether this is big-endian or little-endian.
The return value is X/0 for little-endian or Y/1 for big-endian.
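If you want to see it run, here is a minimal, self-contained sketch of the same test (the main wrapper is my addition, not part of the exam question):

#include <stdio.h>

#define X 0  /* little-endian */
#define Y 1  /* big-endian */

int mystery_test(void)
{
    int i = 1;
    char *p = (char *) &i;       /* view the int as raw bytes */
    return (p[0] == 1) ? X : Y;  /* is the low byte stored first? */
}

int main(void)
{
    printf("%s-endian\n", mystery_test() == X ? "little" : "big");
    return 0;
}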
Good luck on your exam.

long int high and low bits pointers

I'm trying to implement an improved sequential multiplication algorithm in C, where the size of the product register is twice the size of the multiplicand and multiplier. On my system an int is 4 bytes and a long long int is 8 bytes. I wanted to access the higher and lower 32 bits independently, so I pointed at the lower and upper bits like this:
long long int product = 0;
int *high = (int *) &product; /* cast needed: the pointer types differ */
int *low  = (int *) &product;
low++;
but this didn't work, because I assumed that if an int is allotted 4 bytes then a long long int would be allotted 8 bytes, with the pointer pointing to the MSB of the allocated memory. I'm not sure if this is actually how allocation is done; can anyone please help me clear up this confusion?
I solved the problem by doing this:
long long int product = 0;
int *low  = (int *) &product;
int *high = (int *) &product;
high++;
but I'm still confused about why this works correctly.
You are probably using a computer that is little-endian. On a little-endian machine, the least significant byte comes first.
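Note that aliasing a long long through an int pointer is not guaranteed by the C standard; a portable way to split the halves is with shifts and masks. A minimal sketch (the variable names are mine):

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    uint64_t product = 0x1122334455667788ULL;

    /* extract the halves by value instead of by pointer aliasing */
    uint32_t low  = (uint32_t)(product & 0xFFFFFFFFu);  /* bits 0..31  */
    uint32_t high = (uint32_t)(product >> 32);          /* bits 32..63 */

    printf("high = %08" PRIx32 ", low = %08" PRIx32 "\n", high, low);
    return 0;
}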

why does an integer type need to be little-endian?

I am curious about little-endian,
and I know that most computers use the little-endian method.
So I practiced with a program; the source is below.
int main(void)
{
    int flag = 31337;
    char c[10] = "abcde";
    int flag2 = 31337;
    return 0;
}
When I looked at the stack via gdb,
I noticed values like 0x00007a69 ... 0x00007a69 ...
and 0x62610000 0x00656463 ...
So, I have two questions.
For one thing,
how can the value of char c[10] be below flag?
I expected the value of flag2 at the top of the stack, the value of char c[10] below flag2, and the value of flag below char c[10],
like this:
7a69
"abcde"
7a69
Second,
I expected the values to be stored little-endian.
Indeed, the value of "abcde" was stored as '6564636261'.
However, the value of 31337 didn't appear to be stored little-endian:
it was just '7a69',
and I thought it should be '697a'.
Why doesn't the integer type conform to little-endian?
There is some confusion in your understanding of endianness, the stack and compilers.
First, the locations of variables on the stack may not have anything to do with the order they appear in the code. The compiler is free to move them around as it wants, unless they are part of a struct, for example. Compilers usually try to use memory as efficiently as possible, so reordering is needed: for example, the order char, int, char, int would require 16 bytes (on a 32-bit machine with 4-byte alignment), whereas int, int, char, char would require only 12 bytes, as the sketch below shows.
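A minimal sketch of that padding effect, using structs so the layout is fixed (the struct names are mine, for illustration):

#include <stdio.h>

struct padded  { char a; int b; char c; int d; };  /* holes after a and c */
struct compact { int b; int d; char a; char c; };  /* members sorted by size */

int main(void)
{
    /* on a typical machine with 4-byte int alignment this prints 16 and 12 */
    printf("padded:  %zu bytes\n", sizeof(struct padded));
    printf("compact: %zu bytes\n", sizeof(struct compact));
    return 0;
}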
Second, there is no "endianness" in char arrays. They are just that: arrays of values. If you put "abcde" there, the values have to be in that order. If you used, say, UTF-16, then endianness would come into play, since one code unit (not necessarily one character) would require two bytes (on a typical 8-bit-byte machine), and those would be stored according to endianness.
The decimal value 31337 is 0x00007a69 as a 32-bit hexadecimal value. If you ask a debugger to show it as an integer, it will show it as such whatever the endianness. The only way to see how it sits in memory is to dump it as bytes; then it would be 0x69 0x7a 0x00 0x00 in little-endian order.
Also, even though little-endian is very popular, that's mainly because x86 hardware is popular. Many processors have used big-endian order (SPARC, PowerPC and MIPS, among others), and some (like older ARM processors) can run in either mode, depending on the requirements.
There is also the term "network byte order", which is in fact big-endian; it dates from before little-endian machines became the most popular.
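For instance, the POSIX htonl function converts a host-order 32-bit value to network (big-endian) order; dumping the bytes before and after makes the difference visible on a little-endian host. A small sketch (my addition, assuming a POSIX system):

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>  /* htonl, POSIX */

static void dump(const char *label, uint32_t v)
{
    unsigned char b[4];
    memcpy(b, &v, sizeof b);  /* copy out the raw bytes */
    printf("%s: %02x %02x %02x %02x\n", label, b[0], b[1], b[2], b[3]);
}

int main(void)
{
    uint32_t v = 0x11223344;
    dump("host order   ", v);         /* 44 33 22 11 on a little-endian host */
    dump("network order", htonl(v));  /* always 11 22 33 44 */
    return 0;
}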
Integer byte order is an arbitrary processor design decision. Why, for example, do you appear to be uncomfortable with little-endian? What makes big-endian a better choice?
Well, probably because you are a human, used to reading numbers from left to right; the machine hardly cares.
There is in fact a reasonable argument that it is intuitive for the least significant byte to be placed at the lowest address; but again, only from a human point of view.
GDB shows you 0x62610000 0x00656463 because it is interpreting the data (...abcde...) as 32-bit words on a little-endian system.
It could be either way, but the reasonable default is to use native endianness.
Data in memory is just a sequence of bytes. If you tell the debugger to show it as a sequence (array) of short ints, it changes what it displays. Many debuggers have advanced memory-view features that show memory content in various interpretations, including string, int (hex), int (decimal), float, and many more.
You got a few excellent answers already.
Here is a little code to help you understand how variables are laid out in memory on either a little-endian or a big-endian machine:
#include <stdio.h>

void show_var(const char *varname, unsigned char *ptr, size_t size)
{
    size_t i;
    printf("%s:\n", varname);
    for (i = 0; i < size; i++) {
        printf("pos %zu = %2.2x\n", i, *ptr++);
    }
    printf("--------\n");
}

int main(void)
{
    int flag = 31337;
    char c[10] = "abcde";
    show_var("flag", (unsigned char *)&flag, sizeof(flag));
    show_var("c", (unsigned char *)c, sizeof(c));
    return 0;
}
On my Intel i5 Linux machine it produces:
flag:
pos 0 = 69
pos 1 = 7a
pos 2 = 00
pos 3 = 00
--------
c:
pos 0 = 61
pos 1 = 62
pos 2 = 63
pos 3 = 64
pos 4 = 65
pos 5 = 00
pos 6 = 00
pos 7 = 00
pos 8 = 00
pos 9 = 00
--------

Separate bits into groups of 8

So I'm using this (it's from another question I asked):
unsigned char *y = resultado->informacion;
int tam = data->tamanho;
unsigned char resAfter;

for (int i = 0; i < tam; i++)
{
    unsigned char x = data->informacion[i];
    x <<= 3;
    if (i > 0)
    {
        resAfter = (resAfter << 5) | x;
    }
    else
    {
        resAfter = x;
    }
}
printf("resAfter es %02x\n", resAfter); /* %s would be wrong here: resAfter is a single byte, not a string */
So at the end I have this really long number (I'm estimating about 43 bits). How can I get groups of 8 bits? I think I'm getting something like (010101010101010.....000) and I want to separate it into groups of 8.
Another question: I know for sure that resAfter is going to have n bits, where n is a multiple of 8 plus 3. Is this possible, or is C going to complete the byte? That is, if I get 43 bits, will C pad them with 0 to make 48 bits? And is there a way to delete those 3 extra bits?
I'm new to C and bitwise operations, so sorry if what I'm doing is really bad.
Basically, in programming you deal with bytes (at least in most cases); in C you deal with types of specific sizes (depending on the system you run on).
That said, char usually has a size of 1 byte, and you can't really address single bits. You can operate on them (with << for instance) at the scale of single bits, but I don't know of any standard way to store fewer than 8 bits in a variable in C (though I may be wrong about that).
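Concretely, the usual approach is to accumulate the bits in an integer wide enough to hold them all and then peel them off 8 at a time with shifts and masks. A minimal sketch, assuming the 43 bits have already been packed into a uint64_t (the values and names here are mine, for illustration):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t bits = (1ULL << 42) | 0x123456789AULL; /* an example 43-bit value */
    int nbits = 43;

    /* drop the 3 surplus low bits so the count becomes a multiple of 8 */
    bits >>= nbits % 8;
    nbits -= nbits % 8;

    /* peel off 8 bits at a time, most significant group first */
    for (int shift = nbits - 8; shift >= 0; shift -= 8) {
        unsigned char group = (bits >> shift) & 0xFF;
        printf("%02x ", group);
    }
    printf("\n");
    return 0;
}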

how is data stored at bit level according to "Endianness"?

I read about endianness and understood squat...
so I wrote this:
typedef unsigned char BYTE; /* as in the Windows headers */

int main(void)
{
    int k = 0xA5B9BF9F;
    BYTE *b = (BYTE *)&k; // value at *b is 9F
    b++;                  // value at *b is BF
    b++;                  // value at *b is B9
    b++;                  // value at *b is A5
    return 0;
}
k was equal to A5 B9 BF 9F,
and the (byte) pointer "walk" output was 9F BF B9 A5,
so I get it: bytes are stored backwards... OK.
So now I wondered how it is stored at the BIT level...
I mean, is "9F" (1001 1111) stored as "F9" (1111 1001)?
So I wrote this:
int _tmain(int argc, _TCHAR* argv[])
{
    int k = 0xA5B9BF9F;
    void *ptr = &k;
    bool temp = TRUE;
    cout << "ready or not here I come \n" << endl;
    for (int i = 0; i < 32; i++)
    {
        temp = *( (bool*)ptr + i );
        if (temp)
            cout << "1 ";
        if (!temp)
            cout << "0 ";
        if (i == 7 || i == 15 || i == 23)
            cout << " - ";
    }
}
I get some random output;
even for numbers like "32" I don't get anything sensible.
Why?
Just for completeness, machines are described in terms of both byte order and bit order.
The Intel x86 is called Consistent Little Endian because it stores multi-byte values in LSB-to-MSB order as the memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31.
The Motorola 68000 is called Inconsistent Big Endian because it stores multi-byte values in MSB-to-LSB order as the memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31 (the same as Intel, which is why it is called 'Inconsistent' Big Endian).
The 32-bit IBM/Motorola PowerPC is called Consistent Big Endian because it stores multi-byte values in MSB-to-LSB order as the memory address increases. Its bit numbering convention is b0 = 2^31 and b31 = 2^0.
Under normal high level language use the bit order is generally transparent to the developer. When writing in assembly language or working with the hardware, the bit numbering does come into play.
Endianness, as you discovered by your experiment, refers to the order in which bytes are stored in an object.
Bits do not get stored differently: they're always 8 bits to a byte, and always "human readable" (high -> low).
Now that we've discussed that you don't need your code... About your code:
for (int i = 0; i < 32; i++)
{
    temp = *( (bool*)ptr + i );
    ...
}
This isn't doing what you think it's doing. You're iterating over 0-31, the number of bits in a word - good. But your temp assignment is all wrong :)
It's important to note that a bool* is the same size as an int* is the same size as a BigStruct*. All pointers on the same machine are the same size - 32 bits on a 32-bit machine, 64 bits on a 64-bit machine.
ptr + i is adding i bytes to the ptr address (since sizeof(bool) is 1), so when i > 3 you're reading a whole new word... this could possibly cause a segfault.
What you want to use is bit masks. Something like this should work (note the dereference: you test the pointed-to value, not the pointer itself):
for (int i = 0; i < 32; i++) {
    unsigned int mask = 1u << i;
    bool bit_is_one = (*static_cast<unsigned int*>(ptr) & mask) != 0;
    ...
}
Your machine almost certainly can't address individual bits of memory, so the layout of bits inside a byte is meaningless. Endianness refers only to the ordering of bytes inside multibyte objects.
To make your second program make sense (though there isn't really any reason to, since it won't give you any meaningful results) you need to learn about the bitwise operators - particularly & for this application.
Byte Endianness
On different machines this code may give different results:
#include <stdio.h>

union endian_example {
    unsigned long u;
    unsigned char a[sizeof(unsigned long)];
};

int main(void)
{
    union endian_example x;
    size_t i;
    x.u = 0x0a0b0c0d;
    for (i = 0; i < sizeof(unsigned long); i++) {
        printf("%u\n", (unsigned)x.a[i]);
    }
    return 0;
}
This is because different machines are free to store values in any byte order they wish. This is fairly arbitrary. There is no backwards or forwards in the grand scheme of things.
Bit Endianness
Usually you don't ever have to worry about bit endianness. The most common way to access individual bits is with shifts (>>, <<), but those are really tied to values, not bytes or bits. They perform an arithmetic operation on a value. That value is stored in bits (which are in bytes).
Where you may run into a problem in C with bit endianness is if you ever use a bit field. This is a rarely used (for this reason and a few others) "feature" of C that allows you to tell the compiler how many bits a member of a struct will use.
struct thing {
    unsigned y : 1; // y will be one bit and can have the values 0 and 1
    signed   z : 1; // z can only have the values 0 and -1
    unsigned a : 2; // a can be 0, 1, 2, or 3
    unsigned b : 4; // b is just here to take up the rest of the byte
};
Here the bit endianness is compiler-dependent. Should y be the most or least significant bit in a thing? Who knows? If you care about the bit ordering (for describing things like the layout of an IPv4 packet header, the control registers of a device, or just a storage format in a file), then you probably don't want to worry about some different compiler doing this the other way around. Also, compilers aren't always as smart about how they work with bit fields as one would hope.
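One way to see what your particular compiler chose is to overlay the bit-field struct with a raw byte and check where y lands. A minimal sketch (the union is my addition, for illustration):

#include <stdio.h>

struct thing {
    unsigned y : 1;
    signed   z : 1;
    unsigned a : 2;
    unsigned b : 4;
};

union view {
    struct thing  t;
    unsigned char raw;  /* the same first byte, seen as a plain number */
};

int main(void)
{
    union view v = {0};
    v.t.y = 1;
    /* prints 01 if the compiler put y in the least significant bit,
       80 if it put y in the most significant bit */
    printf("raw byte: %02x\n", v.raw);
    return 0;
}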
This line here:
temp = *( (bool*)ptr + i );
... when you do pointer arithmetic like this, the compiler moves the pointer on by the number you added times the sizeof of the thing you are pointing to. Because you cast your void* to a bool*, the compiler moves the pointer along by the size of one bool, which is at least one whole byte, so you step a byte at a time, not a bit at a time, and end up printing memory from further along than you thought.
You can't address the individual bits in a byte, so it's almost meaningless to ask which way round they are stored. (Your machine can store them whichever way it wants and you won't be able to tell.) The only time you might care is when you actually send bits out over a physical interface like I2C or RS-232, where you have to emit the bits one by one. Even then, though, the protocol defines which order to send the bits in, and the device driver code has to translate between "an int with value 0xAABBCCDD" and "a bit sequence 11100011... [whatever] in protocol order".
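As an illustration of that last point, here is a minimal sketch of a driver-style routine that emits a byte one bit at a time in an MSB-first protocol order (emit_bit is a hypothetical stand-in for whatever the real interface would do):

#include <stdio.h>

/* stand-in for writing one bit to a physical interface */
static void emit_bit(int bit)
{
    putchar(bit ? '1' : '0');
}

/* send a byte MSB-first, as such a protocol might specify */
static void send_byte_msb_first(unsigned char byte)
{
    for (int i = 7; i >= 0; i--)
        emit_bit((byte >> i) & 1);
}

int main(void)
{
    send_byte_msb_first(0x9F);  /* prints 10011111 */
    putchar('\n');
    return 0;
}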
