Understand the following line - c

I read this code in a library which is used to display a bitmap (.bmp) to an LCD.
I am having a really hard time understanding what is happening in the following lines, and how it happens.
Maybe someone can explain this to me.
uint16_t s, w, h;
uint8_t* buffer; // does get malloc'd
s = *((uint16_t*)&buffer[0]);
w = *((uint16_t*)&buffer[18]);
h = *((uint16_t*)&buffer[22]);
I guess it's not that hard for a real C programmer, but I am still learning, so I thought I'd just ask :)
As far as I understand this, it somehow sticks two uint8_t variables together into a uint16_t.
Thanks in advance for your help here!

In the code you've provided, buffer (which is an array of bytes) is read, and values are extracted into s, w and h.
The (uint16_t*)&buffer[n] syntax means that you're extracting the address of the nth byte of buffer, and casting it into a uint16_t*. The cast tells the compiler to look at this address as if it points at a uint16_t, i.e. a pair of uint8_ts.
The additional * in the code dereferences the pointer, i.e. extracts the value from this address. Since the address now points at a uint16_t, a uint16_t value is extracted.
As a result:
s gets the value of the first uint16_t, i.e. bytes 0 and 1.
w gets the value of the tenth uint16_t, i.e. bytes 18 and 19.
h gets the value of the twelfth uint16_t, i.e. bytes 22 and 23.

The code:
takes two bytes at positions 0 and 1 in the buffer, sticks them together into an unsigned 16-bit value, and stores the result in s;
it does the same with bytes 18/19, storing the result in w;
ditto for bytes 22/23 and h.
It is worth noting that the code uses the native endianness of the target platform to decide which of the two bytes represents the top 8 bits of the result, and which represents the bottom 8 bits.
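If the buffer holds a BMP header, the fields are stored least significant byte first, so the cast only gives the right numbers on a little-endian CPU. A minimal sketch of reading the same two bytes in a way that does not depend on the CPU's byte order (read_u16_le is a hypothetical helper name, not part of the library in the question):

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine buf[off] (low byte) and buf[off + 1]
   (high byte) into one 16-bit value, whatever the CPU's byte order is. */
static uint16_t read_u16_le(const uint8_t *buf, size_t off)
{
    return (uint16_t)(buf[off] | ((uint16_t)buf[off + 1] << 8));
}
```

With this, w = read_u16_le(buffer, 18); and h = read_u16_le(buffer, 22); would stick the same bytes together as the original lines do on a little-endian target.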

uint8_t* buffer; // pointer to 8 bit or simply one byte
Buffer points to memory address of bytes -> |byte0|byte1|byte2|....
(uint16_t*)&buffer[0] // &buffer[0] is actually the same as buffer
(uint16_t*)&buffer[0] equals (uint16_t*)buffer; it points to 16 bit or halfword
(uint16_t*)buffer points to memory: |byte0byte1 = halfword0|byte2byte3 = halfword1|....
w = *((uint16_t*)&buffer[18]);
Takes the memory address of byte 18 in buffer, then reinterprets this address as the address of a halfword, then reads the halfword at this address;
it's simply w = byte18 and byte19 stuck together forming a halfword
h = *((uint16_t*)&buffer[22]);
h = byte22 and byte23 stuck together
UPD More detailed explanation:
h = *((uint16_t*)&buffer[22]) =>
1) buffer[22] === the uint8_t (a.k.a. byte) at index 22 of buffer; let's call it byte22
2) &buffer[22] === &byte22 === address of byte22 in memory; it's of type uint8_t*, the same as buffer; let's call it byte22_address;
3) (uint16_t*)&buffer[22] = (uint16_t*)byte22_address; casts the address of a byte to the address of two bytes stuck together, i.e. the address of a halfword at the same address; let's call it halfword11_address;
4) h = *((uint16_t*)&buffer[22]) === *halfword11_address; the * operator takes the value at the address, that is halfword 11 (counting from 0), or bytes 22 and 23 stuck together;
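One caveat to the steps above: the cast assumes the address is suitably aligned for a uint16_t. A sketch of the same "take two bytes in native order" read done with memcpy, which has no alignment requirement (load_u16_native is a hypothetical name used only for illustration):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads bytes off and off + 1 of buf as one uint16_t
   in the host's native byte order, without needing buf + off to be aligned. */
static uint16_t load_u16_native(const uint8_t *buf, size_t off)
{
    uint16_t v;
    memcpy(&v, buf + off, sizeof v);  /* copies exactly two bytes */
    return v;
}
```

Which of the two bytes ends up as the high half still depends on the platform, exactly as with the cast.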

Related

Why does a reference to an int return only one memory address?

Example Program:
#include <stdio.h>
int main() {
int x = 0;
printf("%p", &x);
return 0;
}
I have read that most machines are byte-addressable, meaning that only one byte can be stored on a single memory address (e.g. 0xf4829cba stores the value 01101011). Assuming that x is a 32-bit integer, shouldn't the reference to the variable return four memory addresses, instead of one?
Please ELI5, as I am very confused right now.
Thank you so much for your time.
-Matt
The address (it's not a "reference") you're given is to the beginning of the memory where the variable is stored. The variable will then take as many bytes as needed according to its type. So if int is 32 bits in your target architecture, the address you get is of the first of four bytes used to store that int.
           +--------+
address--->| byte 0 |
           | byte 1 |
           | byte 2 |
           | byte 3 |
           +--------+
It may help to think in terms of objects [1] rather than bytes. Most useful data types in C take up more than a single byte.
As for an expression like &x evaluating to multiple addresses, think of it like the address to your house - you don't specify a distinct address for every room in the house, do you? No, for the purpose of telling other people where your house is, you only need to specify one address. For the purpose of knowing where an int or double or struct humongous object is, we only need to know the address of the first byte.
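To make the house analogy concrete, here is a small sketch showing that the address of an int is simply the address of its first byte, and that the remaining bytes follow it consecutively (byte_address is a hypothetical helper, not a standard function):

```c
#include <stddef.h>

/* Hypothetical helper: address of the i-th byte of the int that p points to.
   For i == 0 this is the same address that &x yields. */
static const unsigned char *byte_address(const int *p, size_t i)
{
    return (const unsigned char *)p + i;
}
```

For an int x, byte_address(&x, 0) compares equal to &x, and byte_address(&x, sizeof x - 1) is the last "room" of the same object.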
You can access and manipulate individual bytes in a larger object in several different ways. You can use bit masking operations like
int x = some_value;
unsigned char aByte = (x & 0xFF000000) >> 24; // isolate the MSB
or you can map the object onto an array of unsigned char using a union:
union {
int x;
unsigned char b[sizeof (int)];
} u;
u.x = some_value;
aByte = u.b[0]; // access the initial byte - depending on byte ordering, this
// may be the MSB or the LSB.
or by creating a pointer to the first byte:
int x = some_value;
unsigned char *b = (unsigned char *) &x;
unsigned char aByte = b[0];
Byte ordering is a thing - some architectures store multi-byte values starting at the most significant byte, others starting at the least significant byte:
For any address A
                 A+0 A+1 A+2 A+3
   Big endian   +---+---+---+---+
                |MSB|   |   |LSB|
                +---+---+---+---+   Little endian
                 A+3 A+2 A+1 A+0
The M68K chips that powered the original Macintosh were big-endian, while x86 is little-endian.
Bitwise operators like & and | work on the value rather than on its layout in memory, so byte ordering does not affect them - x & 0xFF000000 will always isolate the MSB [2]. When you map an object onto an array of unsigned char, the first element may map to the MSB, or it may map to the LSB, or it may map to something else (the old VAX architecture used a "middle-endian" ordering for 32-bit floats that either went 2301 or 1032, can't remember which offhand).
[1] In the C sense of a region of storage that may be used to hold a value, not the OOP sense of an instance of a class.
[2] Assuming 32-bit int and 8-bit bytes, anyway.
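A short sketch contrasting the two kinds of access described above, using uint32_t so the width is fixed (both function names are made up for this example):

```c
#include <stdint.h>
#include <string.h>

/* Most significant byte of the VALUE: the same answer on every platform. */
static uint8_t msb_by_shift(uint32_t x)
{
    return (uint8_t)(x >> 24);
}

/* First byte of the in-memory REPRESENTATION: depends on byte ordering. */
static uint8_t first_byte_in_memory(uint32_t x)
{
    uint8_t bytes[sizeof x];
    memcpy(bytes, &x, sizeof x);
    return bytes[0];
}
```

For 0x11223344, msb_by_shift always gives 0x11, while first_byte_in_memory gives 0x44 on a little-endian machine and 0x11 on a big-endian one.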

Pointer in arrays. How does it work "physically" in memory?

I have been wondering about pointers and can't find a source explaining them with details.
For example. Given an array int a[3]
There is a pointer pointing at 4 locations?
It starts as *(a+0) and points at address of a?
Then what does it do next? Int is minimum of 16 bits, so it needs to read 2 bytes, but every byte is given an address.
Does it mean that for a[0] the pointer points at the beginning address, then the program reads sizeof(int) bytes starting at the given address?
What would it do the next? Would it stop reading, give the result and
for a[1] would it point at address of &a+1*sizeof(int).
It would start reading at address of (&a+2(as 2 stands for already read addresses of 2 bytes)), start reading, so it would read another 2 bytes and on and on?
I can't quite understand these concepts.
PS: String consist of unsigned char which are 1 byte elements.
The post you mentioned doesn't explain what happens with elements larger than 1 byte. It also doesn't explain exactly what the program does beside "here is a string the program reads from memory". I assume that I am right, but nonetheless the title you mentioned is far away from what I asked about.
(since somebody wrote this already, one address stands for one byte)
54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69
+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
|----|----|----|----|----|----| | | | | | | | | | |
0----+----01---+----12---+----2----+----+----+----+----+----+----+----+----+----+
I specifically asked if
int a[2] means that the pointer first:
Points at memory address (54), the program reads data from 2 following addresses (54 to 55 as int takes 2 bytes), then the pointer points at address 54+2, the program starts reading from address range <56,57>. Then again, the pointer points at starting range of 58, the program reads at address of <58,59>
Is this logic correct? It isn't a string terminated by NULL.
My guess about strings is that the program would access memory byte address by byte address and read the values until it found NULL.
Arrays aren't strings.
Consider
int a[3] = {};
int b[300] = {};
These 2 arrays are "similar" in that they contain values of int and are different in these two major regards:
They are of different "size" - that is, a different amount of memory is reserved for each. The first array occupies memory that is reserved to hold at least 3 int values. However that is the minimum allocated memory (in this case - on a stack, so most likely it is also the precise amount of memory allocated for it)
They are located at different addresses in memory (again - in this case they are both allocated on a stack but it is still RAM)
You can just as easily take an address of the first element of either array:
int * p = a;
p = &a[0]; // same as above
p = b; // now p points to the first element of the second array
When you perform an indexing operation what the compiler does is: it takes the address of the first element and increments it by a value that is equal to the index times the size of each element (if there's no padding due to alignment, of course). In essence the compiler is doing this:
b[1] = 1;
*(p+1) = 1; // same as above
uint8_t * c = reinterpret_cast<uint8_t*>(p); // WARNING! Explanation follows
The last line will cause the compiler to reinterpret the pointer differently and "the same" address arithmetic "suddenly" works differently:
c[1] = 1; // this is NOT the same as b[1] = 1
In this case the compiler will only "move" the pointer 8-bits (not 16 or 32 bits, depending on your platform's sizeof(int)) and end up in the middle of that first int element of array b. Granted this can be useful (especially when dealing directly with hardware) but is super-duper-puper non-portable and you should avoid doing so at all times!
This is admittedly not a comprehensive answer but I was not aiming to provide one as the topic is very vast and there are plenty of resources on the Web that can provide you with many more details on this subject
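As a rough illustration of the indexing arithmetic described above, written in plain C: the distance in bytes between a[0] and a[i] is i times sizeof(int), which is what a[i] and *(a+i) both rely on (byte_offset_of_element is a hypothetical helper name):

```c
#include <stddef.h>

/* Hypothetical helper: distance in bytes from a[0] to a[i]. */
static size_t byte_offset_of_element(const int *a, size_t i)
{
    const unsigned char *base = (const unsigned char *)a;
    const unsigned char *elem = (const unsigned char *)(a + i); /* same as &a[i] */
    return (size_t)(elem - base);
}
```

So with a 2-byte int and an array starting at address 54, element 1 would sit at 56 and element 2 at 58, as the question guessed.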

C programming: words from byte array

I have some confusion regarding reading a word from a byte array. The background context is that I'm working on a MIPS simulator written in C for an intro computer architecture class, but while debugging my code I ran into a surprising result that I simply don't understand from a C programming standpoint.
I have a byte array called mem defined as follows:
uint8_t *mem;
//...
mem = calloc(MEM_SIZE, sizeof(uint8_t)); // MEM_SIZE is pre defined as 1024x1024
During some of my testing I manually stored a uint32_t value into four of the blocks of memory at an address called mipsaddr, one byte at a time, as follows:
for(int i = 3; i >=0; i--) {
*(mem+mipsaddr+i) = value;
value = value >> 8;
// in my test, value = 0x1084
}
Finally, I tested trying to read a word from the array in one of two ways. In the first way, I basically tried to read the entire word into a variable at once:
uint32_t foo = *(uint32_t*)(mem+mipsaddr);
printf("foo = 0x%08x\n", foo);
In the second way, I read each byte from each cell manually, and then added them together with bit shifts:
uint8_t test0 = mem[mipsaddr];
uint8_t test1 = mem[mipsaddr+1];
uint8_t test2 = mem[mipsaddr+2];
uint8_t test3 = mem[mipsaddr+3];
uint32_t test4 = (mem[mipsaddr]<<24) + (mem[mipsaddr+1]<<16) +
(mem[mipsaddr+2]<<8) + mem[mipsaddr+3];
printf("test4= 0x%08x\n", test4);
The output of the code above came out as this:
foo= 0x84100000
test4= 0x00001084
The value of test4 is exactly as I expect it to be, but foo seems to have reversed the order of the bytes. Why would this be the case? In the case of foo, I expected the uint32_t* pointer to point to mem[mipsaddr], and since it's 32-bits long, it would just read in all 32 bits in the order they exist in the array (which would be 00001084). Clearly, my understanding isn't correct.
I'm new here, and I did search for the answer to this question but couldn't find it. If it's already been posted, I apologize! But if not, I hope someone can enlighten me here.
It is (among others) explained here: http://en.wikipedia.org/wiki/Endianness
When storing data larger than one byte into memory, it depends on the architecture (that is, the CPU) in which order the bytes are stored. Either the most significant byte is stored first and the least significant byte last, or vice versa. When you read back the individual bytes through byte access operations, and then merge them to form the original value again, you need to consider the endianness of your particular system.
In your for-loop, you are storing your value byte-wise, starting with the most significant byte (counting down the index is a bit misleading ;-). Your memory looks like this afterwards: 0x00 0x00 0x10 0x84.
You are then reading the word back with a single 32 bit (four byte) access. Depending on your architecture, this will either become 0x00001084 (big endian) or 0x84100000 (little endian). Since you get the latter, you are working on a little endian system.
In your second approach, you are using the same order in which you stored the individual bytes (most significant first), so you get back the same value which you stored earlier.
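A compact sketch of the three operations discussed here: storing most significant byte first, reading back in that same order, and reading in the host's native order. The function names are made up for this example, and memcpy stands in for the pointer cast from the question:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store value at mem[addr..addr+3], most significant byte first. */
static void store_u32_be(uint8_t *mem, size_t addr, uint32_t value)
{
    for (int i = 3; i >= 0; i--) {
        mem[addr + (size_t)i] = (uint8_t)(value & 0xFFu);
        value >>= 8;
    }
}

/* Read the same four bytes back, most significant byte first. */
static uint32_t load_u32_be(const uint8_t *mem, size_t addr)
{
    return ((uint32_t)mem[addr] << 24) | ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] << 8) | (uint32_t)mem[addr + 3];
}

/* Read the four bytes in the host's own byte order (what the cast did). */
static uint32_t load_u32_native(const uint8_t *mem, size_t addr)
{
    uint32_t v;
    memcpy(&v, mem + addr, sizeof v);
    return v;
}
```

For a simulator of a big-endian machine, pairing store_u32_be with load_u32_be keeps the simulated memory layout independent of the host CPU.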
It seems to be a problem of endianness, maybe comes from casting (uint8_t *) to (uint32_t *)

Which of the following is the correct output for the program given below?

Assume the machine is 32-bit little-endian and sizeof(int) is 4 bytes.
Given the following program:
line1: #include<stdio.h>
line2: int main(void) {
line3: int arr[3]={2,3,4};
line4: char *p;
line5: p=(char*)arr;
line6: printf("%d",*p);
line7: p=p+1;
line8: printf("%d\n",*p);
line9: return 0;
}
What is the expected output?
A: 2 3
B: 2 0
C: 1 0
D: garbage value
One thing that is bothering me is the casting of the integer pointer to a character pointer.
How important is the cast?
What is the compiler doing at line 5? (p = (char *) arr;)
What is happening at line 7? (p = p + 1)
If the output is 20, then how is the 0 being printed?
(E) none of the above
However, provided that (a) you are on a little-endian machine (e.g. x86), and (b) sizeof(int) >= 2, this should print "20" (no space is printed between the two).
a) the casting is "necessary" to read the array one byte at a time instead of as a series of ints
b) this is just coercing the address of the first int into a pointer to char
c) increment the address stored in p by sizeof(char) (which is 1)
d) the second byte of the machine representation of the int is printed by line 8
(D), or compiler specific, as sizeof(int) (as well as endianness) is platform-dependent.
How important is the cast?
Casting, as a whole, is an integral (pun unintended) part of the C language.
And what would the compiler do at line 5?
It takes the address of the first element of arr and puts it in p.
And after line 5, what is going on at line 7?
It increments the pointer so it points to the next char from that memory address.
And if the output is 2 0, then how is the 0 being printed?
This is a combination of endianness and sizeof(int). Without the specs of your machine, there isn't much else I can do to explain.
However, assuming little endian and sizeof(int) == 4, we can see the following:
// lets mark these memory regions: |A|B|C|D|
int i = 2; // stored in memory as the bytes 02 00 00 00
char *ptr = (char *) &i; // now ptr points to 0x02 (A)
printf("%d\n", *ptr); // prints '2', because ptr points to 0x02 (A)
ptr++; // increment ptr, ptr now points to 0x00 (B)
printf("%d\n", *ptr); // prints '0', because ptr points to 0x00 (B)
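The walk-through above can be tried on the quiz's own array. This is only a sketch: first_two_bytes is a hypothetical helper that formats the two bytes the way lines 6 and 8 of the program print them, with no separator between the two "%d" conversions:

```c
#include <stdio.h>

/* Hypothetical helper: writes the first two bytes of arr into out,
   formatted like the quiz program's two printf calls. */
static int first_two_bytes(const int *arr, char *out, size_t out_size)
{
    const char *p = (const char *)arr;  /* line 5: view the ints as bytes */
    return snprintf(out, out_size, "%d%d", p[0], p[1]);  /* p[1] is *(p + 1) */
}
```

On a little-endian machine with sizeof(int) >= 2 and arr = {2, 3, 4}, out ends up holding "20".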
1. Importance of casting:
char *p;
This line declares a pointer to a character. That means it dereferences only one byte at a time, and pointer arithmetic on it also moves one byte at a time.
p=(char*)arr;
2. The cast to char * tells the compiler that the conversion is intentional; without it the compiler complains about an incompatible pointer type.
The cast does not change the address stored in p.
Because p is a pointer to a character, as described above, p=p+1 points to the next byte.
printf("%d\n",*p);
%d formats the value as a decimal integer, so the decimal form is shown.
Here *p is used, and as described above it dereferences only one byte. So now memory organisation comes into the picture,
that is, whether your machine is little-endian (LSB first) or big-endian (MSB first).
According to your answer, your machine is little-endian. So the first time the output is 2.
Then the next byte must be zero, so the output is 0.
Byte by byte:
2 is represented as 00-00-00-02 (byte-wise representation),
but in memory it is stored as
the four bytes 02-00-00-00:
the first memory byte is 02
and the second memory byte is 00

Can you explain how this ip_to_string function works?

#define IPTOSBUFFERS 12
char *iptos(u_long in)
{
static char output[IPTOSBUFFERS][3*4+3+1];
static short which;
u_char *p;
p = (u_char *)&in;
which = (which + 1 == IPTOSBUFFERS ? 0 : which + 1);
_snprintf_s(output[which], sizeof(output[which]), sizeof(output[which]),"%d.%d.%d.%d", p[0], p[1], p[2], p[3]);
return output[which];
}
Is there something I'm missing to understand it?
Annotated below for your enjoyment:
// This is the number of IP string buffers.
#define IPTOSBUFFERS 12
char *iptos(u_long in)
{
// 12 buffers, each big enough to hold maximum-sized IP address
// and nul terminator.
static char output[IPTOSBUFFERS][3*4+3+1];
// Last buffer used.
static short which;
// Get uns. char pointer to IP address.
u_char *p;
p = (u_char *)&in;
// Move to next string buffer, wrapping if necessary.
which = (which + 1 == IPTOSBUFFERS ? 0 : which + 1);
// Output IP address by accessing individual unsigned chars in it.
_snprintf_s(output[which], sizeof(output[which]), sizeof(output[which]),
"%d.%d.%d.%d", p[0], p[1], p[2], p[3]);
// Return the current buffer.
return output[which];
}
It works because the representation of an IPv4 address is a 32-bit value in memory and each of the four segments occupies one octet each. So it's a relatively simple matter to cast the address of the 32-bit integer to a four-char array then use that array to extract the individual segments. This is, of course, predicated on the data types having specific bit widths so it's not that portable.
The bizarre thing is the 12-IP-Address circular queue. Maybe that was so you could get up to 12 IP addresses at a time without the strings being overwritten although I don't think I've ever encountered a situation where more than two (maybe three for a proxy or pass-thru server) was required at the same time. I don't think it's for thread safety since the modification to which is inherently dangerous in a threaded environment.
Here's an answer, based on what seems to be confusing from the comments.
An IP address is often represented internally as 32 bits. It's often presented as 4 decimal fields, ranging from 0 to 255. To convert from the decimal representation to the 32-bit representation, simply convert the fields from decimal to binary (or hex) from left to right, and concatenate them.
Thus, 1.2.3.4 becomes the fields 0x01, 0x02, 0x03, and 0x04. Thus, the 32-bit (unsigned long) representation of them is: 0x01020304. Of course, this is subject to the byte ordering as well...
To print a 32-bit address as a string, just look at each of the four sets of 8 bits that compose it, and print them as decimal integers with dots in between.
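As a sketch of that last step, assuming the address is held as a plain value such as 0x01020304 (so shifts are used instead of a byte pointer, which sidesteps byte ordering) and that the caller supplies the buffer. ipv4_to_string is a hypothetical name, and this is not how iptos itself is written:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: formats a value such as 0x01020304 as "1.2.3.4"
   into a caller-supplied buffer; 16 bytes is enough for any address. */
static int ipv4_to_string(uint32_t addr, char *out, size_t out_size)
{
    return snprintf(out, out_size, "%u.%u.%u.%u",
                    (unsigned)((addr >> 24) & 0xFFu),
                    (unsigned)((addr >> 16) & 0xFFu),
                    (unsigned)((addr >> 8) & 0xFFu),
                    (unsigned)(addr & 0xFFu));
}
```

Passing the buffer in avoids the static circular queue from the original function, at the cost of the caller having to manage storage.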

Resources