#include <stdio.h>
int main()
{
    int x = 1023;
    char *p = (char *)&x;
    printf("%d %d %d %d\n", p[0], p[1], p[2], p[3]);
}
The result is: -1 3 0 0
This is how I imagined it:
int x = 1023 (4 bytes):
m m+1 m+2 m+3
-+----+----+----+----+----+----+-
| | 255| 3 | 0 | 0 | |
-+----+----+----+----+----+----+-
~~~~~~~~~~~~~~~~~~~~~
cast to char pointer:
m
-+----+----+----+----+----+----+-
| | 255| 3 | 0 | 0 | |
-+----+----+----+----+----+----+-
~~~~~~
Why is the result -1 3 0 0? Or where did I go wrong?
Whether char is signed char or unsigned char is implementation-defined. Your implementation uses signed char, whose values range from -128 to 127, so 255 is not a possible value; that byte pattern (0xFF) corresponds to -1 instead.
Change your pointer declaration to
unsigned char *p = (unsigned char*)&x;
and you'll get the result you expect.
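For example, a minimal sketch of the corrected program (the concrete values assume a little-endian machine):

#include <stdio.h>

int main(void)
{
    int x = 1023;                           /* 0x000003FF */
    unsigned char *p = (unsigned char *)&x; /* each byte now reads as 0..255 */

    /* On a little-endian machine this prints: 255 3 0 0 */
    printf("%d %d %d %d\n", p[0], p[1], p[2], p[3]);
    return 0;
}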
Related
Is the number 1 stored in memory as 00000001 00000000 00000000 00000000?
#include <stdio.h>
int main()
{
    unsigned int a[3] = {1, 1, 0x7f7f0501};
    int *p = a;
    printf("%d %p\n", *p, p);
    p = (long long)p + 1;
    printf("%d %p\n", *p, p);
    char *p3 = a;
    int i;
    for (i = 0; i < 12; i++, p3++)
    {
        printf("%x %p\n", *p3, p3);
    }
    return 0;
}
Why is 16777216 printed in the output?
An integer is stored in memory in different ways on different architectures. The most common ways are called little-endian and big-endian byte ordering.
See Endianness
            (long long)p + 1
                    |
                    v
Your memory: [0x01, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, ...]
You incremented p not as a pointer but as a long long number, so it points to the next byte rather than the next integer. Dereferencing it then reads the bytes 0x00, 0x00, 0x00, 0x01, which translates to 0x01000000 (decimal 16777216) on a little-endian architecture.
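If you actually want to read an int starting at the second byte, a memcpy-based sketch avoids the alignment problem (the 16777216 result still assumes a little-endian architecture):

#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned int a[3] = {1, 1, 0x7f7f0501};
    unsigned int v;

    /* memcpy has no alignment requirement, unlike dereferencing
       a misaligned int pointer */
    memcpy(&v, (const unsigned char *)a + 1, sizeof v);

    printf("%u\n", v); /* 16777216 on a little-endian machine */
    return 0;
}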
Something to play with (assuming int is 32 bits wide):
#include <stdio.h>
#include <stdbool.h>
typedef union byte_rec {
    struct bit_rec {
        bool b0 : 1;
        bool b1 : 1;
        bool b2 : 1;
        bool b3 : 1;
        bool b4 : 1;
        bool b5 : 1;
        bool b6 : 1;
        bool b7 : 1;
    } bits;
    unsigned char value;
} byte_t;

typedef union int_rec {
    struct bytes_rec {
        byte_t b0;
        byte_t b1;
        byte_t b2;
        byte_t b3;
    } bytes;
    int value;
} int_t;

void printByte(byte_t *b)
{
    printf(
        "%d %d %d %d %d %d %d %d ",
        b->bits.b0,
        b->bits.b1,
        b->bits.b2,
        b->bits.b3,
        b->bits.b4,
        b->bits.b5,
        b->bits.b6,
        b->bits.b7
    );
}

void printInt(int_t *i)
{
    printf("%p: ", i);
    printByte(&i->bytes.b0);
    printByte(&i->bytes.b1);
    printByte(&i->bytes.b2);
    printByte(&i->bytes.b3);
    putchar('\n');
}

int main()
{
    int_t i1, i2;
    i1.value = 0x00000001;
    i2.value = 0x80000000;
    printInt(&i1);
    printInt(&i2);
    return 0;
}
Possible output:
0x7ffea0e30920: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0x7ffea0e30924: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Additional (based on the comment of @chqrlie):
I had previously used the unsigned char type for the bit-fields, but the C Standard allows only three types there (four since C99). Additional implementation-defined types may be accepted, and gcc seems to be fine with unsigned char as a bit-field type, but I've changed it nevertheless to the allowed type _Bool (since C99).
Noteworthy: the order of bit-fields within an allocation unit (on some platforms, bit-fields are packed left-to-right, on others right-to-left) is implementation-defined (see the Notes section in the reference).
Reference to bit fields: https://en.cppreference.com/w/c/language/bit_field
p = (long long)p + 1; is bad code (undefined behavior, e.g. a bus fault and a re-booted machine) as it is not specified to work in C. The newly formed address is not certain to satisfy the alignment requirements of an int *.
Don't do that.
To look at the bytes of a[]:
#include <stdio.h>
#include <stdlib.h>
void dump(size_t sz, const void *ptr) {
    const unsigned char *byte_ptr = (const unsigned char *) ptr;
    for (size_t i = 0; i < sz; i++) {
        printf("%p %02X\n", (void*) byte_ptr, *byte_ptr);
        byte_ptr++;
    }
}

int main(void) {
    unsigned int a[3] = {1, 1, 0x7f7f0501u};
    dump(sizeof a, a);
}
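On a typical little-endian machine the output would look roughly like this (the addresses here are made up for illustration and will differ on every run):

0x7ffc1000 01
0x7ffc1001 00
0x7ffc1002 00
0x7ffc1003 00
0x7ffc1004 01
0x7ffc1005 00
0x7ffc1006 00
0x7ffc1007 00
0x7ffc1008 01
0x7ffc1009 05
0x7ffc100a 7F
0x7ffc100b 7F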
As this is a wiki answer, feel free to edit.
There are multiple instances of undefined behavior in your code:
in printf("%d %p\n", *p, p) you should cast p as (void *)p to ensure printf receives the void * it expects for %p. This is unlikely to pose a problem on most current targets, but some ancient systems had different representations for int * and void *, such as early Cray systems.
in p = (long long)p + 1, you have implementation-defined behavior converting a pointer to an integer and implicitly converting the integral result of the addition back to a pointer. More importantly, this may create a pointer with incorrect alignment for accessing int in memory, resulting in undefined behavior when you dereference p. This would cause a bus error on many systems, e.g. most RISC architectures, but by chance not on Intel processors. It would be safer to compute the pointer as p = (void *)((intptr_t)p + 1); or p = (void *)((char *)p + 1);, albeit this would still have undefined behavior because of the alignment issue.
Is the number 1 stored in memory as 00000001 00000000 00000000 00000000?
Yes, your system seems to use little-endian representation for int types. The least significant 8 bits are stored in the byte at the address of a, then the next least significant 8 bits, and so on. As can be seen in the output, 1 is stored as 01 00 00 00 and 0x7f7f0501 is stored as 01 05 7f 7f.
Why is 16777216 printed in the output?
The second instance of printf("%d %p\n", *p, p) has undefined behavior. On your system, p points to the second byte of the array a, and *p reads 4 bytes from this address, namely 00 00 00 01 (the last 3 bytes of the first 1 and the first byte of the next array element, also 1), which is the representation of the int value 16777216.
To dump the contents of the array as bytes, you should access it using a char * as you do in the last loop. Be aware that char may be signed on some systems, causing for example printf("%x\n", *p3); to output ffffff80 if p3 points to the byte with hex value 80. Using unsigned char * is recommended for consistent and portable behavior.
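A minimal sketch of the last loop rewritten with unsigned char (same array as in the question):

#include <stdio.h>

int main(void)
{
    unsigned int a[3] = {1, 1, 0x7f7f0501};
    unsigned char *p3 = (unsigned char *)a;

    for (int i = 0; i < 12; i++, p3++)
    {
        /* *p3 promotes to an int in the range 0..255, so a byte with
           value 0x80 prints as 80, never ffffff80 */
        printf("%02x %p\n", *p3, (void *)p3);
    }
    return 0;
}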
Can somebody explain to me what happens with the br member of the union after assigning str.a and str.b? Do we need to set that value before calling the function? I tried to run the code in the simulator https://pythontutor.com/render.html#mode=display, which says that the value of br is 516 before calling the function. How is that possible?
#include <stdio.h>

void f(short num, short *res) {
    if (num) {
        *res = *res * 10 + num % 10;
        f(num / 10, res);
    }
}

typedef union {
    short br;
    struct {
        char a, b;
    } str;
} un;

void main() {
    short res = 0;
    un x;
    x.str.a = 4;
    x.str.b = 2;
    f(x.br, &res);
    x.br = res;
    printf("%d %d %d\n", x.br, x.str.a, x.str.b);
}
Assuming that char is one byte and short is two bytes (the most common case), it's really simple.
Begin by drawing out the members of the union on a piece of paper, one member next to the other. Something like this:
br str
+---+ +---+
| | | | a
+---+ +---+
| | | | b
+---+ +---+
Now we do the assignments:
x.str.a = 4;
x.str.b = 2;
And write the results in the drawing:
br str
+---+ +---+
| 4 | | 4 | a
+---+ +---+
| 2 | | 2 | b
+---+ +---+
Assuming little endianness, as on a normal x86 or x86-64 system, the value of br will be 0x0204, which is 516 in decimal.
So that's where the value 516 is coming from.
The value of the short will depend on the computer's endianness. On a little-endian machine, a will correspond to the least significant byte and b to the most significant. Thus when those two bytes are read as a short, you get the number 0x0204 = 516 decimal.
As a side note, it is a bad idea to use short and char since those may be signed and negative. Use uint16_t and uint8_t instead, whenever dealing with binary arithmetic.
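A sketch of the same union with the fixed-width types from <stdint.h> (the 516 result still assumes little endianness and no padding between the struct members, which holds on common platforms):

#include <stdio.h>
#include <stdint.h>

typedef union {
    uint16_t br;
    struct {
        uint8_t a, b; /* guaranteed unsigned and exactly one byte each */
    } str;
} un;

int main(void)
{
    un x;
    x.str.a = 4;
    x.str.b = 2;
    /* Little endian: a is the low byte, b the high byte,
       so br == 2 * 256 + 4 == 516 */
    printf("%u\n", (unsigned)x.br);
    return 0;
}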
If you put some effort into your debugging you would see what is going on:
void f(short num, short* res)
{
    if (num)
    {
        *res = *res * 10 + num % 10;
        f(num / 10, res);
    }
}

typedef union
{
    short br;
    struct
    {
        char a, b;
    };
} un;

int main(void)
{
    short res = 0; un x;
    x.a = 4; x.b = 2;
    printf("Before br=0x%04x (%d) a=0x%02x b=0x%02x res = %d 0x%x\n", x.br, x.br, x.a, x.b, res, res);
    f(x.br, &res); x.br = res;
    printf("After br=0x%04x a=0x%02x b=0x%02x res = %d 0x%x\n", x.br, x.a, x.b, res, res);
}
result:
Before br=0x0204 (516) a=0x04 b=0x02 res = 0 0x0
After br=0x0267 a=0x67 b=0x02 res = 615 0x267
So your br was 516, and the f function reversed its decimal digits, giving 615, which is 0x0267. That value consists of the two bytes 0x02 and 0x67.
Your computer is little-endian, so the first byte in memory is 0x67 and the second one is 0x02, because this system stores the least significant byte first.
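You can check that by looking at the bytes of 615 directly; a minimal sketch, assuming a 2-byte short:

#include <stdio.h>

int main(void)
{
    short v = 615; /* 0x0267 */
    unsigned char *b = (unsigned char *)&v;

    /* On a little-endian machine this prints: 67 02 */
    printf("%02x %02x\n", b[0], b[1]);
    return 0;
}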
So I'm trying to shift these values left to store all this data into a 64-bit value. Unfortunately the numbers turn negative right at the first shift; what's causing this? Isn't it only supposed to store the first 41 bits of the mil_t time into x? Also, why would the remainder of userId and serial be zero?
long int gen_id(){
printf("Calling gen_id...\n");
unsigned long int mil_t = 1623638363132;
unsigned long int serial = 10000;
unsigned int rowId = 5;
unsigned int userId = 30000;
printf("Original mil_t: %ld\n", mil_t);
unsigned long int x = mil_t << 41;
printf("GEN_ID | MIL_T shift left 41: %ld\n", x);
unsigned long int tbusMod = userId % serial;
printf("GEN_ID | tbusMod = userId mod serial: %ld\n", tbusMod);
x += tbusMod << (64 - 41 - 13);
printf("GEN_ID | x1 : %ld\n", x);
x += (rowId % 1024);
printf("GEN_ID | x2 : %ld\n", x);
return x;
}
OUTPUT:
Original mil_t: 1623638647191
GEN_ID | MIL_T shift left 41: -4136565053832822784
GEN_ID | tbusMod = userId mod serial: 0
GEN_ID | x1 : -4136565053832822784
GEN_ID | x2 : -4136565053832822779
FINAL: -4136565053832822779
TOTAL BYTES: 68
Your program causes undefined behaviour by using the incorrect format specifier: %ld is only for long int, not for unsigned long int.
Instead use %lu to display x and tbusMod.
30000 divided by 10000 gives quotient 3 and remainder 0, which is why tbusMod is 0.
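A minimal sketch of the corrected call (assuming a 64-bit unsigned long, as your output suggests; note the shift still discards the upper bits of mil_t, since only its low 23 bits survive a shift by 41):

#include <stdio.h>

int main(void)
{
    unsigned long int mil_t = 1623638363132;
    unsigned long int x = mil_t << 41;

    /* %lu matches unsigned long int; %ld would reinterpret the same
       bits as signed and can print a negative number */
    printf("%lu\n", x);
    return 0;
}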
In the following code
#include <stdio.h>
int main() {
    union a {
        int i;
        char ch[2];
    };
    union a u;
    int b;
    u.ch[0] = 3;
    u.ch[1] = 2;
    printf("%d,%d,%d\n", u.ch[0], u.ch[1], u.i);
    return 0;
}
The output I get is
3,2,515
Can anyone explain me why the value of i is 515?
union a {
    int i;
    char ch[2];
};
union a u; /* initially it contains garbage data */
All members of a union share the same memory. In the above case a total of 4 bytes gets allocated for u, because 4 bytes (the maximum any member needs) is enough to store both i and ch.
ch[1] ch[0]
----------------------------------
| G | G | G | G | => G means garbage/junk data, because u wasn't initialized
----------------------------------
u
MSB LSB
When the statement u.ch[0] = 3; is executed, only ch[0] gets initialized:
ch[1] ch[0]
--------------------------------------
| G | G | G | 0000 0011 | => G means garbage/junk data, because u wasn't initialized
--------------------------------------
u
MSB LSB
And when u.ch[1] = 2; is executed, the next byte gets initialized:
ch[1] ch[0]
------------------------------------------
| G | G | 0000 0010 | 0000 0011 | => G means garbage/junk data, because u wasn't initialized
------------------------------------------
u
MSB LSB
As you can see above, out of 4 bytes only the first 2 got initialized; the remaining 2 bytes are still uninitialized, so when you print u.i, it's undefined behaviour.
If you want a predictable result, then initialize the union variable first, as in
union a u = { 0 }; /* all 4 bytes are initialized up front, so there is no chance of any junk data */
u.ch[0] = 3;
u.ch[1] = 2;
Now when you print u.i, it prints the data in the whole 4 bytes, which is 512 + 3 = 515 (on a little-endian processor).
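Putting it together, a sketch of the fully initialized version (the 515 result assumes a little-endian machine with a 4-byte int):

#include <stdio.h>

int main(void)
{
    union a {
        int i;
        char ch[2];
    };
    union a u = { 0 }; /* every byte of the union starts as zero */

    u.ch[0] = 3;
    u.ch[1] = 2;

    /* The bytes are now 03 02 00 00, i.e. i == 512 + 3 == 515 */
    printf("%d,%d,%d\n", u.ch[0], u.ch[1], u.i);
    return 0;
}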
I can only use these symbols:
! ~ & ^ | + << >>
Here is the table I need to achieve:
input | output
--------------
0 | 0
1 | 8
2 | 16
3 | 24
With the output I am going to left-shift a 32-bit int.
Ex.
int myFunction(int input);

int main()
{
    int myInt = 0xFFFFFFFF;
    myInt = (myInt << (myFunction(2)));
    // OUTPUT = 0xFFFF0000
}

int myFunction(int input)
{
    // Do some magic conversions here
}
Any ideas?
Well, if you want a function with f(0) = 0, f(1) = 8, f(2) = 16, f(3) = 24 and so on, then you'll have to implement f(x) = x * 8. Since 8 is a power of two, the multiplication can be replaced by shifting. Thus:
int myFunction(int input)
{
    return input << 3;
}
That's all.
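Plugged into your example it might look like the sketch below; unsigned is used here because left-shifting a value into the sign bit of a plain int is undefined behavior:

#include <stdio.h>

int myFunction(int input)
{
    return input << 3; /* input * 8, using only the allowed << operator */
}

int main(void)
{
    unsigned int myInt = 0xFFFFFFFFu;
    myInt = myInt << myFunction(2); /* shift left by 16 */
    printf("0x%X\n", myInt);        /* prints 0xFFFF0000 with a 32-bit unsigned int */
    return 0;
}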