C - calculate address of pointer - c

I have a program that sets a pointer p to some location and then computes p - *p. The address held in p is not specified in the program itself. How can I work out where p points?
#include <stdio.h>
typedef struct locals { int x; int y; } locals;
int main() {
locals Z; char *p;
Z.x = 0x00444342; Z.y = 0x01020304;
p = ((char *)&Z) + 4;
printf("%s\n", (p - *p));
}

I can explain how to calculate the address, given a few assumptions:
The size of int is 4 bytes
The system is little-endian
There is no padding in the struct
Thus the memory layout of locals is such that the first 4 bytes (Z.x) are
byte[0] = 0x42
byte[1] = 0x43
byte[2] = 0x44
byte[3] = 0x00
and the next 4 bytes (Z.y) are
byte[0] = 0x04
byte[1] = 0x03
byte[2] = 0x02
byte[3] = 0x01
Now for p = ((char *)&Z) + 4, p points to the beginning of Z plus 4 bytes, which brings us to byte[0] of Z.y.
Now about
printf("%s\n", (p - *p));
*p is the value of the first byte of Z.y, which is 0x04. So printf receives, as its second argument, the address of Z.y minus 4 bytes, which is byte[0] of Z.x.
Thus printf outputs every character up to the first '\0' (byte[3], the fourth byte of Z.x).
0x42, 0x43 and 0x44 are the ASCII codes for:
BCD

Related

How is an integer stored in C program?

Is the number 1 stored in memory as 00000001 00000000 00000000 00000000?
#include <stdio.h>
int main()
{
unsigned int a[3] = {1, 1, 0x7f7f0501};
int *p = a;
printf("%d %p\n", *p, p);
p = (long long)p + 1;
printf("%d %p\n", *p, p);
char *p3 = a;
int i;
for (i = 0; i < 12; i++, p3++)
{
printf("%x %p\n", *p3, p3);
}
return 0;
}
Why is 16777216 printed in the output?
An integer is stored in memory in different ways on different architectures. The most common ways are called little-endian and big-endian byte ordering.
See Endianness
(long long)p+1
|
v
Your memory: [0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, ...]
You increment p not as a pointer but as a long long number, so it points not to the next integer but to the next byte. You therefore read 0x00, 0x00, 0x00, 0x01, which is 0x1000000 (decimal 16777216) on a little-endian architecture.
Something to play with (assuming int is 32 bits wide):
#include <stdio.h>
#include <stdbool.h>
typedef union byte_rec {
struct bit_rec {
bool b0 : 1;
bool b1 : 1;
bool b2 : 1;
bool b3 : 1;
bool b4 : 1;
bool b5 : 1;
bool b6 : 1;
bool b7 : 1;
} bits;
unsigned char value;
} byte_t;
typedef union int_rec {
struct bytes_rec {
byte_t b0;
byte_t b1;
byte_t b2;
byte_t b3;
} bytes;
int value;
} int_t;
void printByte(byte_t *b)
{
printf(
"%d %d %d %d %d %d %d %d ",
b->bits.b0,
b->bits.b1,
b->bits.b2,
b->bits.b3,
b->bits.b4,
b->bits.b5,
b->bits.b6,
b->bits.b7
);
}
void printInt(int_t *i)
{
printf("%p: ", i);
printByte(&i->bytes.b0);
printByte(&i->bytes.b1);
printByte(&i->bytes.b2);
printByte(&i->bytes.b3);
putchar('\n');
}
int main()
{
int_t i1, i2;
i1.value = 0x00000001;
i2.value = 0x80000000;
printInt(&i1);
printInt(&i2);
return 0;
}
Possible output:
0x7ffea0e30920: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x7ffea0e30924: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Additional (based on the comment of @chqrlie):
I previously used the unsigned char type, but the C Standard allows only 3 types for bit fields, and 4 since C99. Additional implementation-defined types may be accepted, and gcc seemed fine with unsigned char for the bit field, but I've nevertheless changed it to the allowed type _Bool (since C99).
Noteworthy: the order of bit fields within an allocation unit is implementation-defined (on some platforms bit fields are packed left-to-right, on others right-to-left); see the Notes section in the reference.
Reference to bit fields: https://en.cppreference.com/w/c/language/bit_field
p = (long long)p + 1; is bad code: it has undefined behavior (UB), e.g. a bus fault and a rebooted machine, because C does not specify that it works. The newly formed address is not guaranteed to meet the alignment requirements of int *.
Don't do that.
To look at the bytes of a[]:
#include <stdio.h>
#include <stdlib.h>
void dump(size_t sz, const void *ptr) {
const unsigned char *byte_ptr = (const unsigned char *) ptr;
for (size_t i = 0; i < sz; i++) {
printf("%p %02X\n", (void*) byte_ptr, *byte_ptr);
byte_ptr++;
}
}
int main(void) {
unsigned int a[3] = {1, 1, 0x7f7f0501u};
dump(sizeof a, a);
}
As this is wiki, feel open to edit.
There are multiple instances of undefined behavior in your code:
in printf("%d %p\n", *p, p) you should cast p as (void *)p to ensure printf receives a void * as it expects. This is unlikely to pose a problem on most current targets, but some older systems, such as early Cray systems, had different representations for int * and void *.
in p = (long long)p + 1, you have implementation-defined behavior converting a pointer to an integer and implicitly converting the integral result of the addition back to a pointer. More importantly, this may create a pointer with incorrect alignment for accessing int in memory, resulting in undefined behavior when you dereference p. This would cause a bus error on many systems, e.g. most RISC architectures, but by chance not on Intel processors. It would be safer to compute the pointer as p = (void *)((intptr_t)p + 1); or p = (void *)((char *)p + 1); although this would still have undefined behavior because of alignment issues.
Is the number 1 stored in memory as 00000001 00000000 00000000 00000000?
Yes, your system seems to use little endian representation for int types. The least significant 8 bits are stored in the byte at the address of a, then the next least significant 8 bits, and so on. As can be seen in the output, 1 is stored as 01 00 00 00 and 0x7f7f0501 stored as 01 05 7f 7f.
Why is 16777216 printed in the output?
The second instance of printf("%d %p\n", *p, p) has undefined behavior. On your system, p points to the second byte of the array a and *p reads 4 bytes from this address, namely 00 00 00 01 (the last 3 bytes of 1 and the first byte of the next array element, also 1), which is the representation of the int value 16777216.
To dump the contents of the array as bytes, you should access it using a char * as you do in the last loop. Be aware that char may be signed on some systems, causing for example printf("%x\n", *p3); to output ffffff80 if p3 points to the byte with hex value 80. Using unsigned char * is recommended for consistent and portable behavior.

Pointers in C-Program

I have a question about pointers in C. For example:
int data[SIZE] = {2,4,5,1,0};
int *p = &data[2];
int **s=&p;
p++;
printf("%p ", *s);
Is the pointer *s here equal to *p, i.e. is the address in *s equal to *p?
*It may be an easy question, but we didn't spend enough time learning C.
After the declarations, you have the following:
s == &p // int ** == int **
*s == p == &data[2] // int * == int * == int *
**s == *p == data[2] == 5 // int == int == int == int
After p++:
*s == p == &data[3]
**s == *p == data[3] == 1
If you run
int data[5] = {2,4,5,1,0};
int *p = &data[2];
int **s=&p;
p++;
printf("*p: %d \n", *p);
printf("&p: %p \n", (void *)&p); // %p expects a void *
printf("s: %p \n", (void *)s);
printf("*s: %p \n", (void *)*s);
printf("**s: %d \n", **s);
you'll get:
*p: 1
&p: 0x7ffc69a74650
s: 0x7ffc69a74650
*s: 0x7ffc69a7466c
**s: 1
This shows that *p and **s yield the same value (1) and that &p == s, but &p and *s differ, because *s is one level of indirection further along.
Here's a possible execution of your program:
int data[SIZE] = {2,4,5,1,0};
// your memory looks like this:
// Address (name) -> Value
// 0x80 (data[0]) -> 2
// 0x84 (data[1]) -> 4
// 0x88 (data[2]) -> 5
// 0x8C (data[3]) -> 1
// 0x90 (data[4]) -> 0
int *p = &data[2];
// 0x80 (data[0]) -> 2
// 0x84 (data[1]) -> 4
// 0x88 (data[2]) -> 5
// 0x8C (data[3]) -> 1
// 0x90 (data[4]) -> 0
// 0xC0 (p) -> 0x88
int **s=&p;
// 0x80 (data[0]) -> 2
// 0x84 (data[1]) -> 4
// 0x88 (data[2]) -> 5
// 0x8C (data[3]) -> 1
// 0x90 (data[4]) -> 0
// 0xC0 (p) -> 0x88
// 0xD8 (s) -> 0xC0
p++;
// 0x80 (data[0]) -> 2
// 0x84 (data[1]) -> 4
// 0x88 (data[2]) -> 5
// 0x8C (data[3]) -> 1
// 0x90 (data[4]) -> 0
// 0xC0 (p) -> 0x8C
// 0xD8 (s) -> 0xC0
printf("%p ", *s); //Will print 0x8C
So no, *s (the value pointed to by s) won't be equal to *p (the value pointed to by p) but to p itself (the address stored in p, here &data[3]).

c: interpreting bytes of a given sequence as int16_t values and summing them

I am trying to figure out how to add sequential bytes in a block of data, starting at a given place (sequenceOffset) and running for a certain length (sequenceLength), by typecasting them to signed 16-bit integers (int16_t). The numbers can be negative or positive. I also cannot use any arrays, only pointer syntax.
*blockAddress points to the first byte of the memory region
*blockLength is number of bytes in the memory region
* sequenceOffset is the offset of the first byte of the sequence that
* is to be summed
* sequenceLength is the number of bytes in the sequence, and
* sequenceLength > 0
*
* Returns: the sum of the int16_t values obtained from the given sequence;
* if the sequence contains an odd number of bytes, the final byte
* is ignored; return zero if there are no bytes to sum
int16_t sumSequence16(const uint8_t* const blockAddress, uint32_t blockLength,
uint32_t sequenceOffset, uint8_t sequenceLength){
uint16_t sum = 0;
const uint8_t* curr = blockAddress; // deref
uint16_t pointer = *(uint16_t*)curr; // typecast to int16
for (uint16_t i = 0; i< sequenceLength; i++){
sum = sequenceOffset + (pointer +i +1);
}// for
an example of a test case:
--Summing sequence of 8 bytes at offset 113:
5D 5C 4E 6E FA B3 5D 4C
23645 28238 -19462 19549
You said the sum is: -7412
Should be: -13566
I'm not sure how to handle the case where I ignore the final byte if the sequence contains an odd number of bytes.
#include <stdint.h>
#include <stdio.h>
int16_t sumSequence16sane(const uint8_t* block, uint32_t length)
{
int16_t ret = 0;
while (length >= 2)
{
ret += block[1] << 8 | block[0];
block += 2;
length -= 2;
}
return ret;
}
int16_t sumSequence16(const uint8_t* const blockAddress, uint32_t blockLength,
uint32_t sequenceOffset, uint8_t sequenceLength)
{
return sumSequence16sane (blockAddress + sequenceOffset, sequenceLength);
}
int main()
{
uint8_t b[8] = { 0x5d, 0x5c, 0x4e, 0x6e, 0xfa, 0xb3, 0x5d, 0x4c };
printf("%d\n", sumSequence16sane(b, 8));
}
Some might prefer this inner loop. It's a bit more compact but potentially a bit more confusing:
for (; length >= 2; block += 2, length -= 2)
ret += block[1] << 8 | block[0];

C integer to binary conversion, splitting the result into two binary values [duplicate]


How to get the value of individual bytes of a variable?

I know that to get the number of bytes used by a variable type, you use sizeof(int) for instance. How do you get the value of the individual bytes used when you store a number with that variable type? (i.e. int x = 125.)
You have to know the number of bits (often 8) in each "byte". Then you can extract each byte in turn by shifting the int right and ANDing it with a mask. Assuming an int is 32 bits with 8-bit bytes, this gets the 4 bytes out of the_int:
int a = (the_int >> 24) & 0xff; // high-order (leftmost) byte: bits 24-31
int b = (the_int >> 16) & 0xff; // next byte, counting from left: bits 16-23
int c = (the_int >> 8) & 0xff; // next byte, bits 8-15
int d = the_int & 0xff; // low-order byte: bits 0-7
And there you have it: each byte is in the low-order 8 bits of a, b, c, and d.
You can get the bytes by using some pointer arithmetic:
int x = 12578329; // 0xBFEE19
for (size_t i = 0; i < sizeof(x); ++i) {
// Convert to unsigned char* because a char is 1 byte in size.
// That is guaranteed by the standard.
// Note that it is NOT required to be 8 bits in size.
unsigned char byte = *((unsigned char *)&x + i);
printf("Byte %zu = %u\n", i, (unsigned)byte); // %zu matches size_t
}
On my machine (Intel x86-64), the output is:
Byte 0 = 25 // 0x19
Byte 1 = 238 // 0xEE
Byte 2 = 191 // 0xBF
Byte 3 = 0 // 0x00
You could make use of a union but keep in mind that the byte ordering is processor dependent and is called Endianness http://en.wikipedia.org/wiki/Endianness
#include <stdio.h>
#include <stdint.h>
union my_int {
int val;
uint8_t bytes[sizeof(int)];
};
int main(int argc, char** argv) {
union my_int mi;
int idx;
mi.val = 128;
for (idx = 0; idx < sizeof(int); idx++)
printf("byte %d = %hhu\n", idx, mi.bytes[idx]);
return 0;
}
If you want to get that information, say for:
int value = -278;
(I selected that value because it isn't very interesting for 125 - the least significant byte is 125 and the other bytes are all 0!)
You first need a pointer to that value:
int* pointer = &value;
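You can now cast that to an unsigned char pointer, which addresses one byte at a time, and get the individual bytes by indexing. (Plain char may be signed, in which case bytes such as 0xEA would read as negative values.)
for (size_t i = 0; i < sizeof(value); i++) {
unsigned char thisbyte = *( ((unsigned char*) pointer) + i );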
// do whatever processing you want.
}
Note that the order of bytes for ints and other data types depends on your system - look up 'big-endian' vs 'little-endian'.
This should work:
int x = 125;
unsigned char *bytes = (unsigned char *) (&x);
unsigned char byte0 = bytes[0];
unsigned char byte1 = bytes[1];
...
unsigned char byteN = bytes[sizeof(int) - 1];
But be aware that the byte order of integers is platform dependent.
