How to shift bytes from char array into int - c

I would like to make an int variable out of a char array in C.
The char array looks like this:
buffer[0] = 0xcf
buffer[1] = 0x04
buffer[2] = 0x00
buffer[3] = 0x00
The shifting looks like this
x = (buffer[1] << 8 )| (buffer[0] << 0) ;
After that x looks like this:
x = 0xffff04cf
Right now everything would be fine, if the first two bytes weren't ff.
If I try this line
x = (buffer[3] << 24 )| (buffer[2] << 16)| (buffer[1] << 8)| (buffer[0] << 0) ;
it still looks
x = 0xffff04cf
Even when I try to shift in the zeros before or after I shift in 04cf, it still looks the same.
Is this the right idea, or what am I doing wrong?

The issue is that you declared buffer by means of a signed type, probably (signed) char. When applying operator <<, integral promotions will be performed, and as the value 0xcf in an 8-bit signed type represents a negative value (i.e. -49), it will remain a negative value (yet represented by more bits, i.e. 0xffffffcf). Note that -1 is represented as 0xFFFFFFFF and vice versa.
To overcome this issue, simply define buffer as
unsigned char buffer[4]
And if you weren't allowed to change the data type of buffer, you could write...
unsigned x = ( (unsigned char)buffer[1] << 8 ) | ( (unsigned char)buffer[0] << 0 );
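As a minimal sketch of the fix described above, the four bytes from the question can be combined by reading the buffer through unsigned char and widening each byte before shifting. The helper name le_bytes_to_u32 is hypothetical, and the sketch assumes 8-bit bytes with the least significant byte stored first, as in the question.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original answer): assemble a
 * little-endian 32-bit value from four bytes of a plain char buffer. */
static uint32_t le_bytes_to_u32(const char *buf)
{
    /* Reading through unsigned char avoids sign extension of bytes >= 0x80. */
    const unsigned char *p = (const unsigned char *)buf;

    /* Widen each byte to uint32_t before shifting, so that no shift
     * can reach the sign bit of a promoted int. */
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

With the bytes cf 04 00 00 from the question, this should yield 0x000004cf whether or not plain char is signed on the platform.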

For tasks like this I like using unions, for example:
union tag_int_chars {
char buffer[sizeof(int32_t)];
int32_t value;
} int_chars;
int_chars.value = 0x01234567;
int_chars.buffer[0] = 0xff;
This will automate the memory overlay without the need to shift. Set the value of the int and voila the chars have changed, change a char value and voila the int has changed.
The example will leave the int value = 0x012345ff on a little endian machine.
Another easy way is to use memcpy():
#include <string.h>
char buffer[sizeof(int32_t)];
int32_t value;
memcpy(&value, buffer, sizeof(int32_t)); // chars to int
memcpy(buffer, &value, sizeof(int32_t)); // int to chars
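Because both the union and memcpy() reuse the host's own byte layout, the result depends on endianness. The following sketch makes that dependency visible; the helper names host_is_little_endian and u32_from_host_bytes are hypothetical, and it assumes the host is either little or big endian.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* inspect the lowest-addressed byte */
    return first == 1;
}

/* Hypothetical helper: reinterpret four bytes in host byte order. */
static uint32_t u32_from_host_bytes(const unsigned char *buf)
{
    uint32_t value;
    memcpy(&value, buf, sizeof value);
    return value;
}
```

For the bytes cf 04 00 00, u32_from_host_bytes() should give 0x000004cf on a little-endian host and 0xcf040000 on a big-endian one, which is why the shift-based approach is preferable when the byte order of the data is fixed.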

Related

Algorithm to write two's complement integer in memory portably

Say I have the following:
int32 a = ...; // value of variable irrelevant; can be negative
unsigned char *buf = malloc(4); /* assuming octet bytes, this is just big
enough to hold an int32 */
Is there an efficient and portable algorithm to write the two's complement big-endian representation of a to the 4-byte buffer buf in a portable way? That is, regardless of how the machine we're running represents integers internally, how can I efficiently write the two's complement representation of a to the buffer?
This is a C question so you can rely on the C standard to determine if your answer meets the portability requirement.
Yes, you can certainly do it portably:
int32_t a = ...;
uint32_t b = a;
unsigned char *buf = malloc(sizeof a);
uint32_t mask = (1U << CHAR_BIT) - 1; // one-byte mask
for (int i = 0; i < sizeof a; i++)
{
int shift = CHAR_BIT * (sizeof a - i - 1); // downshift amount to put next
// byte in low bits
buf[i] = (b >> shift) & mask; // save current byte to buffer
}
At least, I think that's right. I'll make a quick test.
unsigned long tmp = a; // Converts to "twos complement"
unsigned char *buf = malloc(4);
buf[0] = tmp>>24 & 255;
buf[1] = tmp>>16 & 255;
buf[2] = tmp>>8 & 255;
buf[3] = tmp & 255;
You can drop the & 255 parts if you're assuming CHAR_BIT == 8.
If I understand correctly, you want to store the 4 bytes of an int32 inside a char buffer, in a specific order (e.g. lower byte first), regardless of how int32 is represented.
Let's first make those assumptions clear: CHAR_BIT == 8, two's complement, and sizeof(int32) == 4.
No, there is NO portable way in your code because you are trying to convert it to char instead of unsigned char. Storing a byte in char is implementation defined.
But if you store it in an unsigned char array, there are portable ways. You can right shift the value by 8 bits each time and mask the result with the bitwise AND operator & to form each byte of the resulting array:
// a is unsigned; buf is an unsigned char array
buf[0] = a & 0xFF;         // 1st byte
buf[1] = (a >> 8) & 0xFF;  // 2nd byte
buf[2] = (a >> 16) & 0xFF; // 3rd byte
buf[3] = (a >> 24) & 0xFF; // 4th byte
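The answers above can be combined into a round trip. This is a sketch only: the names store_be32 and load_be32 are hypothetical, it assumes octet bytes as the question does, and it writes the most significant byte first because the question asked for big endian.

```c
#include <stdint.h>

/* Hypothetical helper: write the two's complement big-endian form of a. */
static void store_be32(unsigned char *buf, int32_t a)
{
    uint32_t u = (uint32_t)a;   /* conversion to unsigned is modulo 2^32 */
    buf[0] = (unsigned char)((u >> 24) & 0xFFu);
    buf[1] = (unsigned char)((u >> 16) & 0xFFu);
    buf[2] = (unsigned char)((u >> 8) & 0xFFu);
    buf[3] = (unsigned char)(u & 0xFFu);
}

/* Hypothetical helper: read the value back without relying on an
 * implementation-defined unsigned-to-signed conversion. */
static int32_t load_be32(const unsigned char *buf)
{
    uint32_t u = ((uint32_t)buf[0] << 24)
               | ((uint32_t)buf[1] << 16)
               | ((uint32_t)buf[2] << 8)
               |  (uint32_t)buf[3];

    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;
    /* Negative value: UINT32_MAX - u fits in int32_t, so negate that. */
    return -(int32_t)(UINT32_MAX - u) - 1;
}
```

Storing -2 this way should produce the bytes ff ff ff fe, and load_be32() should return -2 again.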

How do I convert and break a 2 byte integer into 2 different chars in C?

I want to convert an unsigned int and break it into 2 chars. For example: If the integer is 1, its binary representation would be 0000 0001. I want the 0000 part in one char variable and the 0001 part in another binary variable. How do I achieve this in C?
If you insist that you have a sizeof(int)==2 then:
unsigned int x = (unsigned int)2; //or any other value it happens to be
unsigned char high = (unsigned char)(x>>8);
unsigned char low = x & 0xff;
If you have eight bits total (one byte) and you are breaking it into two 4-bit values:
unsigned char x=2;// or whatever
unsigned char high = (x>>4);
unsigned char low = x & 0xf;
Shift and mask off the part of the number you want. Unsigned ints are probably four bytes, and if you wanted all four bytes, you'd just shift by 16 and 24 for the higher order bytes.
unsigned char low = myuint & 0xff;
unsigned char high = (myuint >> 8) & 0xff;
This assumes 16-bit ints; check with sizeof! On my platform ints are 32-bit, so I will use a short in this code example. Mine wins the award for most disgusting in terms of pulling apart the pointer, but it is also the clearest for me to understand.
unsigned short number = 1;
unsigned char a;
a = *((unsigned char*)(&number)); // Grab char from first byte of the pointer to the int
unsigned char b;
b = *((unsigned char*)(&number) + 1); // Offset one byte from the pointer and grab second char
One method that works is as follows:
typedef union
{
unsigned char c[sizeof(int)];
int i;
} intchar__t;
intchar__t x;
x.i = 2;
Now x.c[] (an array) will reference the integer as a series of characters, although you will have byte endian issues. Those can be addressed with appropriate #define values for the platform you are programming on. This is similar to the answer that Justin Meiners provided, but a bit cleaner.
unsigned short s = 0xFFEE;
unsigned char b1 = (s >> 8)&0xFF;
unsigned char b2 = (((s << 8)>> 8) & 0xFF);
Simplest I could think of.
int i = 1; // 2-byte integer value 0x0001
unsigned char byteLow = (i & 0x00FF);
unsigned char byteHigh = ((i & 0xFF00) >> 8);
The value in byteLow is 0x01 and the value in byteHigh is 0x00.
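The shift-and-mask answers above avoid the endianness question entirely. A small sketch that splits a 16-bit value and joins it again, using the hypothetical names split_u16 and join_u16 and a fixed-width uint16_t instead of relying on sizeof(int) == 2:

```c
#include <stdint.h>

/* Hypothetical helper: split a 16-bit value into its high and low bytes. */
static void split_u16(uint16_t value, unsigned char *high, unsigned char *low)
{
    *high = (unsigned char)((value >> 8) & 0xFFu);
    *low  = (unsigned char)(value & 0xFFu);
}

/* Hypothetical helper: the inverse operation. */
static uint16_t join_u16(unsigned char high, unsigned char low)
{
    return (uint16_t)(((unsigned)high << 8) | low);
}
```

For 0xFFEE this should give high == 0xFF and low == 0xEE, and join_u16(0xFF, 0xEE) should return 0xFFEE on any platform.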

Convert decimal to char/string

Let's say I have this number
int x = 65535;
Which is the decimal representation of:
ÿÿ
I know how I can do it for a single char
#include <stdio.h>
int main() {
int f = 65535;
printf("%c", f);
}
But this will only give me "ÿ"
I would like to do this without using any external library, and preferably using C type strings.
#include <stdio.h>
int main() {
unsigned f = 65535; // initial value
// this will do the printf and f >>= 8 until f == 0
do {
printf("%c", f & 0xff); // print one char. The & 0xff keeps only the bits for one byte (8 bits)
f >>= 8; // shifts f right side for 8 bits
} while (f > 0);
}
Consider the value 65535, or 0xffff in hexadecimal: its positive value takes 2 bytes, 0xff and 0xff.
Printing f & 0xff keeps only the 8 least significant bits (0xffff & 0xff = 0xff).
f >>= 8 shifts the value 8 bits to the right, so 0xffff becomes 0x00ff (the right-hand 'ff' is gone).
f > 0 is true, since f == 0xff now.
The next iteration is the same, but f >>= 8 shifts 0x00ff to the right, giving 0x0000, so f is zero.
Thus the f > 0 condition is false and the loop ends.
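The loop traced above can also store the bytes instead of printing them, which makes the result easier to inspect. In this sketch the name unpack_bytes is hypothetical, and the caller is assumed to provide room for sizeof(unsigned) bytes on a platform with 8-bit bytes.

```c
#include <stddef.h>

/* Hypothetical helper: store the bytes of f, lowest byte first,
 * and return how many bytes were written (at least one). */
static size_t unpack_bytes(unsigned f, unsigned char *out)
{
    size_t n = 0;
    do {
        out[n++] = (unsigned char)(f & 0xffu); /* keep the low 8 bits */
        f >>= 8;                               /* drop them from f */
    } while (f > 0);
    return n;
}
```

For 65535 this should write two bytes, both 0xff, matching the two iterations described above.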
What you're looking for is bit masking: (x >> 8) & 0xFF for
the high order byte, and (x & 0xFF) for the lower. (Actually,
if int has 32 bits, it's (x >> 24) & 0xFF for the high order
byte. But given the values, and what you say you're expecting,
you probably want the second byte, and not the high order byte.)
What you have is a 16 bit (two bytes) unsigned number. A char is 8 bits (one byte). This means you have to extract the two bytes in the number to get them as separate character.
This is done with the bitwise operators. You can use bitwise and & and bitwise shift >> to accomplish that.
Something like
char buffer[9];
long value = 65535;
char *cur = buffer;
while (value > 0)
{
*cur++ = value % 0x100;
value /= 0x100;
}
*cur = 0;

32 bit hex in one variable

How can I put this hex 0x0a01 into a 32 bit var in C? What I'm trying to do is to parse a protocol. Part of it has a length value. The problem is that I'm getting the received packet as an array, so the length 0x0a01 would be 0x0a on, let's say, [1] and 0x01 on [2], and I want them both to be 0a01 in one var so I can run a compare to a constant or use it in a for loop.
A 32-bit value is an int on most current platforms (or int32_t, defined in stdint.h),
and bit operations are made for this:
int var = buff[1]<<8|buff[2];
<< is the left shift, so 0x0a gets transformed into 0x0a00, and | is the or operator, so that it gets combined properly
uint32_t var;
char buff[128];
...
var = ((buff[1] & 0xFF) << 8) | (buff[2] & 0xFF);
Notice we use (buff[1] & 0xFF) << 8 and not buff[1] << 8, because if buff is a char array and char is signed, sign extension would occur on the promoted buff[1] when the value is negative.
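A sketch of the masked version as a reusable function, assuming the length field is two bytes with the high byte first, as in the question. The name read_be16_field and the offset parameter are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 2-byte big-endian field starting at
 * buff[offset], even when plain char is signed. */
static uint32_t read_be16_field(const char *buff, size_t offset)
{
    /* & 0xFF discards the sign-extended upper bits of the promoted char. */
    uint32_t high = (uint32_t)(buff[offset] & 0xFF);
    uint32_t low  = (uint32_t)(buff[offset + 1] & 0xFF);
    return (high << 8) | low;
}
```

With 0x0a at index 1 and 0x01 at index 2, read_be16_field(buff, 1) should return 0x0a01, and bytes of 0x80 or above should not turn into ffff... values.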

Converting an int into a 4 byte char array (C)

Hey, I'm looking to convert an int that is input by the user into 4 bytes, which I am assigning to a character array. How can this be done?
Example:
Convert a user input of 175 to
00000000 00000000 00000000 10101111
Issue with all of the answers so far: converting 255 should result in 0 0 0 ff, although it prints out as 0 0 0 ffffffff
unsigned int value = 255;
buffer[0] = (value >> 24) & 0xFF;
buffer[1] = (value >> 16) & 0xFF;
buffer[2] = (value >> 8) & 0xFF;
buffer[3] = value & 0xFF;
union {
unsigned int integer;
unsigned char byte[4];
} temp32bitint;
temp32bitint.integer = value;
buffer[8] = temp32bitint.byte[3];
buffer[9] = temp32bitint.byte[2];
buffer[10] = temp32bitint.byte[1];
buffer[11] = temp32bitint.byte[0];
both result in 0 0 0 ffffffff instead of 0 0 0 ff
As another example, 175 as the input prints out as 0, 0, 0, ffffffaf when it should just be 0, 0, 0, af
The portable way to do this (ensuring that you get 0x00 0x00 0x00 0xaf everywhere) is to use shifts:
unsigned char bytes[4];
unsigned long n = 175;
bytes[0] = (n >> 24) & 0xFF;
bytes[1] = (n >> 16) & 0xFF;
bytes[2] = (n >> 8) & 0xFF;
bytes[3] = n & 0xFF;
The methods using unions and memcpy() will get a different result on different machines.
The issue you are having is with the printing rather than the conversion. I presume you are using char rather than unsigned char, and you are using a line like this to print it:
printf("%x %x %x %x\n", bytes[0], bytes[1], bytes[2], bytes[3]);
When any types narrower than int are passed to printf, they are promoted to int (or unsigned int, if int cannot hold all the values of the original type). If char is signed on your platform, then 0xff likely does not fit into the range of that type, and it is being set to -1 instead (which has the representation 0xff on a 2s-complement machine).
-1 is promoted to an int, and has the representation 0xffffffff as an int on your machine, and that is what you see.
Your solution is to either actually use unsigned char, or else cast to unsigned char in the printf statement:
printf("%x %x %x %x\n", (unsigned char)bytes[0],
(unsigned char)bytes[1],
(unsigned char)bytes[2],
(unsigned char)bytes[3]);
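To make the effect of the cast concrete, here is a sketch that formats a single char as hex into a string. The name format_byte_hex is hypothetical; the extra cast to unsigned is there only so that the argument matches what %x expects.

```c
#include <stdio.h>

/* Hypothetical helper: format one byte as lowercase hex into out
 * (capacity n). Returns the number of characters snprintf produced. */
static int format_byte_hex(char c, char *out, size_t n)
{
    /* Going through unsigned char first keeps 0xaf as 175, not -81. */
    return snprintf(out, n, "%x", (unsigned)(unsigned char)c);
}
```

For a char holding 0xaf this should produce "af". Without the unsigned char cast, a platform with signed char and a 32-bit int would be expected to produce "ffffffaf" instead.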
Do you want to address the individual bytes of a 32-bit int? One possible method is a union:
union
{
unsigned int integer;
unsigned char byte[4];
} foo;
int main()
{
foo.integer = 123456789;
printf("%u %u %u %u\n", foo.byte[3], foo.byte[2], foo.byte[1], foo.byte[0]);
}
Note: corrected the printf to reflect unsigned values.
In your question, you stated that you want to convert a user input of 175 to
00000000 00000000 00000000 10101111, which is big endian byte ordering, also known as network byte order.
A mostly portable way to convert your unsigned integer to a big endian unsigned char array, as you suggested from that "175" example you gave, would be to use C's htonl() function (defined in the header <arpa/inet.h> on Linux systems) to convert your unsigned int to big endian byte order, then use memcpy() (defined in the header <string.h> for C, <cstring> for C++) to copy the bytes into your char (or unsigned char) array.
The htonl() function takes in an unsigned 32-bit integer as an argument (in contrast to htons(), which takes in an unsigned 16-bit integer) and converts it to network byte order from the host byte order (hence the acronym, Host TO Network Long, versus Host TO Network Short for htons), returning the result as an unsigned 32-bit integer. The purpose of this family of functions is to ensure that all network communications occur in big endian byte order, so that all machines can communicate with each other over a socket without byte order issues. (As an aside, for big-endian machines, the htonl(), htons(), ntohl() and ntohs() functions are generally compiled to just be a 'no op', because the bytes do not need to be flipped around before they are sent over or received from a socket since they're already in the proper byte order)
Here's the code:
#include <stdio.h>
#include <arpa/inet.h>
#include <string.h>
int main() {
unsigned int number = 175;
unsigned int number2 = htonl(number);
unsigned char numberStr[4]; // unsigned, so %x does not show sign-extended values
memcpy(numberStr, &number2, 4);
printf("%x %x %x %x\n", numberStr[0], numberStr[1], numberStr[2], numberStr[3]);
return 0;
}
Note that, as caf said, you have to print the characters as unsigned characters using printf's %x format specifier.
The above code prints 0 0 0 af on my machine (an x86_64 machine, which uses little endian byte ordering), which is hex for 175.
You can try:
void CopyInt(int value, char* buffer) {
memcpy(buffer, &value, sizeof(int)); // copy from the address of value, not from value itself
}
Why would you need an intermediate cast to void * in C++?
Because C++ doesn't allow implicit conversion between unrelated pointer types, you need to use reinterpret_cast, or cast through void*.
int a = 1;
char * c = (char*)(&a); // In C++ this should be an intermediate cast to void*
The issue with the conversion (the reason it's giving you a ffffff at the end) is because your hex integer (that you are using the & binary operator with) is interpreted as being signed. Cast it to an unsigned integer, and you'll be fine.
On most current platforms an int has the same size as uint32_t and a char the same size as uint8_t.
I'll show how I resolved client-server communication, sending the current time (4 bytes, formatted as Unix epoch) in a byte array, and then rebuilding it on the other side. (Note: the protocol was to send 1024 bytes)
Client side
uint8_t message[1024];
uint32_t t = time(NULL);
uint8_t watch[4] = { t & 255, (t >> 8) & 255, (t >> 16) & 255, (t >> 24) & 255 };
message[0] = watch[0];
message[1] = watch[1];
message[2] = watch[2];
message[3] = watch[3];
send(socket, message, 1024, 0);
Server side
uint8_t res[1024];
uint32_t date;
recv(socket, res, 1024, 0);
date = (uint32_t)res[0] | ((uint32_t)res[1] << 8) | ((uint32_t)res[2] << 16) | ((uint32_t)res[3] << 24);
printf("Received message from client %d sent at %u\n", socket, (unsigned)date);
Hope it helps.
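The client and server halves above can be checked against each other without a socket. This is a sketch under the same convention as that answer (least significant byte first); the names pack_le32 and unpack_le32 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: write value into out[0..3], low byte first. */
static void pack_le32(uint8_t *out, uint32_t value)
{
    out[0] = (uint8_t)(value & 255u);
    out[1] = (uint8_t)((value >> 8) & 255u);
    out[2] = (uint8_t)((value >> 16) & 255u);
    out[3] = (uint8_t)((value >> 24) & 255u);
}

/* Hypothetical helper: rebuild the value on the receiving side. */
static uint32_t unpack_le32(const uint8_t *in)
{
    /* Widen before shifting so in[3] << 24 cannot overflow an int. */
    return  (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Packing a value into the first four bytes of the 1024-byte message and unpacking it again should return the same value on both little-endian and big-endian hosts, since only shifts are involved.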
You can simply use memcpy as follows:
unsigned int value = 255;
char bytes[4] = {0, 0, 0, 0};
memcpy(bytes, &value, 4);
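The problem arises because a plain char holding a value such as 0xff is negative when char is signed, and it is sign-extended when printf promotes it to int (unsigned char itself is still a 1-byte type). So with the union declared as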
union {
unsigned int integer;
char byte[4];
} temp32bitint;
cast to unsigned char while printing, so that the promotion to int (which C performs by default for printf arguments) does not sign-extend the value
printf("%u, %u \n", (unsigned char)Buffer[0], (unsigned char)Buffer[1]);

Resources