I receive a port number as 2 bytes (least significant byte first) and I want to convert it into an integer so that I can work with it. I've made this:
char buf[2]; //Where the received bytes are
char port[2];
port[0]=buf[1];
port[1]=buf[0];
int number=0;
number = (*((int *)port));
However, there's something wrong because I don't get the correct port number. Any ideas?
I receive a port number as 2 bytes (least significant byte first)
You can then do this:
int number = (buf[0] & 0xFF) | (buf[1] & 0xFF) << 8;
The & 0xFF masks guard against sign extension in case char is signed on your platform. If you make buf into an unsigned char buf[2];, you can simplify it to:
number = (buf[1] << 8) + buf[0];
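For instance, a minimal self-contained sketch (the byte values 0x90 and 0x1F are just an example, standing for port 8080):
#include <stdio.h>

int main(void)
{
    unsigned char buf[2] = {0x90, 0x1F};   /* least significant byte first: 0x1F90 = 8080 */
    int number = buf[0] | (buf[1] << 8);   /* combine the two bytes without pointer casts */
    printf("port = %d\n", number);         /* prints: port = 8080 */
    return 0;
}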
I appreciate this has already been answered reasonably. However, another technique is to define a macro in your code, e.g.:
// bytes_to_int_example.cpp
// Output: port = 514
// I am assuming that the bytes need to be treated as 0-255 and combined MSB -> LSB
// This creates a macro in your code that does the conversion and can be tweaked as necessary
#define bytes_to_u16(MSB,LSB) ((((unsigned int)(unsigned char)(MSB)) & 255) << 8 | (((unsigned char)(LSB)) & 255))
// Note: #define statements do not typically have semi-colons
#include <stdio.h>
int main()
{
    char buf[2];
    // Fill buf with example numbers
    buf[0] = 2; // (Least significant byte)
    buf[1] = 2; // (Most significant byte)
    // If endian is other way around swap bytes!
    unsigned int port = bytes_to_u16(buf[1], buf[0]);
    printf("port = %u \n", port);
    return 0;
}
If the first byte is the least significant (uint8_t is from <stdint.h>):
int number = (uint8_t)buf[0] | (uint8_t)buf[1] << 8;
If the first byte is the most significant:
int number = (uint8_t)buf[0] << 8 | (uint8_t)buf[1];
char buf[2]; //Where the received bytes are
int number;
number = *((int*)&buf[0]);
&buf[0] takes address of first byte in buf.
(int*) converts it to integer pointer.
Leftmost * reads integer from that memory address.
If you need to swap endianness:
char buf[2]; //Where the received bytes are
int number = 0; // zero the upper bytes, since only two bytes are assigned below
*((char*)&number) = buf[1];
*((char*)&number+1) = buf[0];
Related
I need to read 32-bit instructions from a binary file.
So what I have right now is:
unsigned char buffer[4];
fread(buffer,sizeof(buffer),1,file);
which will put 4 bytes in an array.
How should I approach connecting those 4 bytes together in order to process the 32-bit instruction later?
Or should I even start in a different way and not use fread?
My weird method right now is to create an array of ints of size 32 and then fill it with bits from the buffer array.
The answer depends on how the 32-bit integer is stored in the binary file. (I'll assume that the integer is unsigned, because it really is an id, and use the type uint32_t from <stdint.h>.)
Native byte order: The data was written out as an integer on this machine. Just read the integer with fread:
uint32_t op;
fread(&op, sizeof(op), 1, file);
Rationale: fread reads the raw representation of the integer into memory. The matching fwrite does the reverse: it writes the raw representation to the file. If you don't need to exchange the file between platforms, this is a good method to store and read data.
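For completeness, the matching write is just the mirror image (a sketch, reusing op and file from above):
fwrite(&op, sizeof(op), 1, file);   /* writes the raw in-memory representation of op */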
Little-endian byte order: The data is stored as four bytes, least significant byte first:
uint32_t op = 0u;
op |= getc(file); // 0x000000AA
op |= getc(file) << 8; // 0x0000BBaa
op |= getc(file) << 16; // 0x00CCbbaa
op |= (uint32_t)getc(file) << 24; // 0xDDccbbaa (cast avoids shifting into int's sign bit)
Rationale: getc reads a char and returns an integer between 0 and 255. (The case where the stream runs out and getc returns the negative value EOF is not considered here for brevity, viz. laziness.) Build your integer by shifting each byte you read by a multiple of 8 and ORing it with the existing value. The comments sketch how it works: the capital letters are the bits being read, the lower-case letters were already there, and zeros have not yet been assigned.
Big-endian byte order: The data is stored as four bytes, least significant byte last:
uint32_t op = 0u;
op |= (uint32_t)getc(file) << 24; // 0xAA000000 (cast avoids shifting into int's sign bit)
op |= getc(file) << 16; // 0xaaBB0000
op |= getc(file) << 8; // 0xaabbCC00
op |= getc(file); // 0xaabbccDD
Rationale: Pretty much the same as above, only that you shift the bytes in another order.
You can imagine little-endian and big-endian as writing the number one hundred and twenty-three (CXXIII) as either 321 or 123. The bit-shifting is similar to shifting decimal digits when dividing by or multiplying with powers of 10, only that you shift by 8 bits to multiply with 2^8 = 256 here.
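For instance, the same idea in code (the values here are examples only):
unsigned int d = 123;
unsigned int dec = d * 10 + 7;            /* append a decimal digit: 123 -> 1237 */
unsigned int hex = (0x12u << 8) | 0xAB;   /* append a byte: 0x12 -> 0x12AB, i.e. multiply by 256 */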
Add
unsigned int instruction;
memcpy(&instruction,buffer,4);
to your code. This will copy the 4 bytes of buffer into a single 32-bit variable. Hence you will get the 4 bytes connected :)
If you know that the int in the file is the same endian as the machine the program's running on, then you can read straight into the int. No need for a char buffer.
unsigned int instruction;
fread(&instruction,sizeof(instruction),1,file);
If you know the endianness of the int in the file, but not the machine the program's running on, then you'll need to add and shift the bytes together.
unsigned char buffer[4];
unsigned int instruction;
fread(buffer,sizeof(buffer),1,file);
//big-endian
instruction = ((unsigned)buffer[0]<<24) + (buffer[1]<<16) + (buffer[2]<<8) + buffer[3];
//little-endian
instruction = ((unsigned)buffer[3]<<24) + (buffer[2]<<16) + (buffer[1]<<8) + buffer[0];
Another way to think of this is that it's a positional number system in base-256, so you combine the bytes just like you combine digits in base-10:
257
= 2*100 + 5*10 + 7
= 2*10^2 + 5*10^1 + 7*10^0
So you can also combine them using Horner's rule.
//big-endian
instruction = (((unsigned)buffer[0]*256 + buffer[1])*256 + buffer[2])*256 + buffer[3];
//little-endian
instruction = (((unsigned)buffer[3]*256 + buffer[2])*256 + buffer[1])*256 + buffer[0];
#luser droog
There are two bugs in your code.
The size of the variable "instruction" is not guaranteed to be 4 bytes: for example, Turbo C assumes sizeof(int) to be 2. Obviously, your program fails in this case. But what is much more important and not so obvious: your program will also fail in case sizeof(int) is more than 4 bytes! To understand this, consider the following example:
#include <stdio.h>
int main()
{
    const unsigned char a[4] = {0x21,0x43,0x65,0x87};
    const unsigned char* p = a;
    unsigned long x = (((((p[3] << 8) + p[2]) << 8) + p[1]) << 8) + p[0];
    printf("%08lX\n", x);
    return 0;
}
This program prints "FFFFFFFF87654321" under amd64, because an unsigned char value is promoted to SIGNED int when it is used in arithmetic, so the intermediate result becomes negative and is sign-extended when converted to unsigned long! So, changing the type of the variable "instruction" from "int" to "long" does not solve the problem.
The only way is to write something like:
unsigned long instruction = 0;
unsigned char* p = buffer + 3;
for (int i = 0; i < 4; i++, p--) {
    instruction <<= 8;
    instruction += *p;
}
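Wrapped into a complete test program (a sketch; the example bytes are the same ones used above, stored least significant byte first):
#include <stdio.h>

int main()
{
    const unsigned char buffer[4] = {0x21, 0x43, 0x65, 0x87};
    unsigned long instruction = 0;
    const unsigned char* p = buffer + 3;
    for (int i = 0; i < 4; i++, p--) {
        instruction <<= 8;     /* make room for the next byte */
        instruction += *p;     /* append it in the low bits */
    }
    printf("%08lX\n", instruction);   /* prints 87654321 */
    return 0;
}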
Say I have the following:
int32 a = ...; // value of variable irrelevant; can be negative
unsigned char *buf = malloc(4); /* assuming octet bytes, this is just big
enough to hold an int32 */
Is there an efficient and portable algorithm to write the two's complement big-endian representation of a to the 4-byte buffer buf in a portable way? That is, regardless of how the machine we're running represents integers internally, how can I efficiently write the two's complement representation of a to the buffer?
This is a C question so you can rely on the C standard to determine if your answer meets the portability requirement.
Yes, you can certainly do it portably:
int32_t a = ...;
uint32_t b = a;
unsigned char *buf = malloc(sizeof a);
uint32_t mask = (1U << CHAR_BIT) - 1; // one-byte mask
for (int i = 0; i < sizeof a; i++)
{
int shift = CHAR_BIT * (sizeof a - i - 1); // downshift amount to put next
// byte in low bits
buf[i] = (b >> shift) & mask; // save current byte to buffer
}
At least, I think that's right. I'll make a quick test.
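For example, a quick test along these lines (the value -2 is just an example; it should come out as FF FF FF FE):
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int32_t a = -2;                       /* example value; two's complement is 0xFFFFFFFE */
    uint32_t b = a;
    unsigned char *buf = malloc(sizeof a);
    uint32_t mask = (1U << CHAR_BIT) - 1; /* one-byte mask */
    for (size_t i = 0; i < sizeof a; i++)
    {
        int shift = CHAR_BIT * (sizeof a - i - 1);
        buf[i] = (b >> shift) & mask;
    }
    for (size_t i = 0; i < sizeof a; i++)
        printf("%02X ", buf[i]);          /* expected: FF FF FF FE */
    printf("\n");
    free(buf);
    return 0;
}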
unsigned long tmp = a; // Converts to "twos complement"
unsigned char *buf = malloc(4);
buf[0] = tmp>>24 & 255;
buf[1] = tmp>>16 & 255;
buf[2] = tmp>>8 & 255;
buf[3] = tmp & 255;
You can drop the & 255 parts if you're assuming CHAR_BIT == 8.
If I understand correctly, you want to store 4 bytes of an int32 inside a char buffer, in a specific order(e.g. lower byte first), regardless of how int32 is represented.
Let's first be clear about the assumptions: CHAR_BIT = 8 (i.e. 8-bit bytes), two's complement, and sizeof(int32) = 4.
No, there is NO portable way in your code because you are trying to convert it to char instead of unsigned char. Storing a byte in char is implementation defined.
But if you store it in an unsigned char array, there are portable ways. You can right-shift the value by 8 bits each time and mask with the bitwise AND operator & to form each byte of the resulting array:
// a is unsigned
1st byte = a & 0xFF
2nd byte = a>>8 & 0xFF
3rd byte = a>>16 & 0xFF
4th byte = a>>24 & 0xFF
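In C, that sketch could look like this (assuming a 32-bit unsigned value; the helper name is made up):
#include <stdint.h>

void to_bytes_le(uint32_t a, unsigned char out[4])   /* hypothetical helper, lowest byte first */
{
    out[0] = a & 0xFF;           /* 1st byte */
    out[1] = (a >> 8) & 0xFF;    /* 2nd byte */
    out[2] = (a >> 16) & 0xFF;   /* 3rd byte */
    out[3] = (a >> 24) & 0xFF;   /* 4th byte */
}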
I want to convert an unsigned int and break it into 2 chars. For example: if the integer is 1, its binary representation would be 0000 0001. I want the 0000 part in one char variable and the 0001 part in another char variable. How do I achieve this in C?
If you insist that you have sizeof(int)==2, then:
unsigned int x = (unsigned int)2; //or any other value it happens to be
unsigned char high = (unsigned char)(x>>8);
unsigned char low = x & 0xff;
If you have eight bits total (one byte) and you are breaking it into two 4-bit values:
unsigned char x=2;// or whatever
unsigned char high = (x>>4);
unsigned char low = x & 0xf;
Shift and mask off the part of the number you want. Unsigned ints are probably four bytes, and if you wanted all four bytes, you'd just shift by 16 and 24 for the higher order bytes.
unsigned char low = myuint & 0xff;
unsigned char high = (myuint >> 8) & 0xff;
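For the higher-order bytes of a four-byte unsigned int, that would be (a sketch, continuing with myuint from above):
unsigned char byte2 = (myuint >> 16) & 0xff;   /* third byte */
unsigned char byte3 = (myuint >> 24) & 0xff;   /* fourth (highest) byte */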
This is assuming 16-bit ints; check with sizeof! On my platform ints are 32-bit, so I will use a short in this code example. Mine wins the award for most disgusting in terms of pulling apart the pointer, but it is also the clearest for me to understand.
unsigned short number = 1;
unsigned char a;
a = *((unsigned char*)(&number)); // Grab char from first byte of the pointer to the int
unsigned char b;
b = *((unsigned char*)(&number) + 1); // Offset one byte from the pointer and grab second char
One method that works is as follows:
typedef union
{
unsigned char c[sizeof(int)];
int i;
} intchar__t;
intchar__t x;
x.i = 2;
Now x.c[] (an array) will reference the integer as a series of characters, although you will have byte endian issues. Those can be addressed with appropriate #define values for the platform you are programming on. This is similar to the answer that Justin Meiners provided, but a bit cleaner.
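As a quick sketch of how the union might be used (continuing from the typedef above; the byte order you see depends on the machine):
#include <stdio.h>

int main(void)
{
    intchar__t x;                                /* the union defined above */
    x.i = 2;
    for (size_t k = 0; k < sizeof x.c; k++)
        printf("c[%zu] = %02X\n", k, x.c[k]);    /* e.g. 02 00 00 00 on a little-endian machine */
    return 0;
}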
unsigned short s = 0xFFEE;
unsigned char b1 = (s >> 8)&0xFF;
unsigned char b2 = (((s << 8)>> 8) & 0xFF);
Simplest I could think of.
int i = 1; // 2-byte integer value 0x0001
unsigned char byteLow = (i & 0x00FF);
unsigned char byteHigh = ((i & 0xFF00) >> 8);
value in byteLow is 0x01 and value in byteHigh is 0x00
I have an unsigned int number (2 byte) and I want to convert it to unsigned char type. From my search, I find that most people recommend to do the following:
unsigned int x;
...
unsigned char ch = (unsigned char)x;
Is this the right approach? I ask because unsigned char is 1 byte and we cast from 2-byte data to 1 byte.
To prevent any data loss, I want to create an array of unsigned char[] and save the individual bytes into the array. I am stuck at the following:
unsigned char ch[2];
unsigned int num = 272;
for(i=0; i<2; i++){
// how should the individual bytes from num be saved in ch[0] and ch[1] ??
}
Also, how would we convert the unsigned char[2] back to unsigned int.
Thanks a lot.
You can use memcpy in that case:
memcpy(ch, (char*)&num, 2); /* although sizeof(int) would be better */
Also, how would we convert the unsigned char[2] back to unsigned int.
The same way, just reverse the arguments of memcpy.
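For instance (a sketch; num2 is just a name for the value being rebuilt):
unsigned int num2 = 0;        /* zero it first in case sizeof(int) > 2 */
memcpy(&num2, ch, 2);         /* copy the two bytes back in */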
How about:
ch[0] = num & 0xFF;
ch[1] = (num >> 8) & 0xFF;
The converse operation is left as an exercise.
How about using a union?
union {
unsigned int num;
unsigned char ch[2];
} theValue;
theValue.num = 272;
printf("The two bytes: %d and %d\n", theValue.ch[0], theValue.ch[1]);
It really depends on your goal: why do you want to convert this to an unsigned char? Depending on the answer to that there are a few different ways to do this:
Truncate: This is what was recommended. If you are just trying to squeeze data into a function which requires an unsigned char, simply cast uchar ch = (uchar)x (but, of course, beware of what happens if your int is too big).
Specific endian: Use this when your destination requires a specific format. Usually networking code likes everything converted to big endian arrays of chars:
int n = sizeof x;
for(int y=0; n-->0; y++)
    ch[y] = (x>>(n*8))&0xff;
does that.
Machine endian. Use this when there is no endianness requirement, and the data will only occur on one machine. The order of the array will change across different architectures. People usually take care of this with unions:
union {int x; char ch[sizeof (int)];} u;
u.x = 0xf00;
//use u.ch
with memcpy:
uchar ch[sizeof(int)];
memcpy(&ch, &x, sizeof x);
or with the ever-dangerous simple casting (which is undefined behavior, and crashes on numerous systems):
unsigned char *ch = (unsigned char *)&x;
Of course, an array of chars large enough to contain a larger value has to be exactly as big as the value itself.
So you can simply pretend that this larger value already is an array of chars:
unsigned int x = 12345678;//well, it should be just 1234.
unsigned char* pChars;
pChars = (unsigned char*) &x;
pChars[0];//one byte is here
pChars[1];//another byte here
(Once you understand what's going on, it can be done without any variables, all just casting)
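For example, the cast-only form might look like this (a sketch using the same x as above):
unsigned char first  = ((unsigned char*)&x)[0];   /* one byte, no intermediate pointer variable */
unsigned char second = ((unsigned char*)&x)[1];   /* another byte */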
You just need to extract those bytes using the bitwise & operator. 0xFF is a hexadecimal mask to extract one byte. Please look at various bit operations here - http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/
An example program is as follows:
#include <stdio.h>
int main()
{
    unsigned int i = 0x1122;
    unsigned char c[2];
    c[0] = i & 0xFF;
    c[1] = (i>>8) & 0xFF;
    printf("c[0] = %x \n", c[0]);
    printf("c[1] = %x \n", c[1]);
    printf("i = %x \n", i);
    return 0;
}
Output:
$ gcc 1.c
$ ./a.out
c[0] = 22
c[1] = 11
i = 1122
$
Endorsing @abelenky's suggestion, using a union would be a more fail-proof way of doing this.
union unsigned_number {
unsigned int value; // An int is 4 bytes long
unsigned char index[4]; // A char is 1 byte long
};
The characteristic of this type is that the compiler will allocate memory only for the biggest member of our data structure unsigned_number, which in this case is going to be 4 bytes - since both members (value and index) have the same size. Had you defined it as a struct instead, we would have 8 bytes allocated in memory, since the compiler does its allocation for all the members of a struct.
Additionally, and here is where your problem is solved, the members of a union data structure all share the same memory location, which means they all refer to the same data - think of that like a hard link on GNU/Linux systems.
So we would have:
union unsigned_number my_number;
// Assigning decimal value 202050300 to my_number
// which is represented as 0xC0B0AFC in hex format
my_number.value = 0xC0B0AFC; // Representation: Binary - Decimal
// Byte 3: 00001100 - 12
// Byte 2: 00001011 - 11
// Byte 1: 00001010 - 10
// Byte 0: 11111100 - 252
// Printing out my_number one byte at time
for (int i = 0; i < (sizeof(my_number.value)); i++)
{
printf("index[%d]: %u, 0x%x\n", \
i, my_number.index[i], my_number.index[i]);
}
// Printing out my_number as an unsigned integer
printf("my_number.value: %u, 0x%x", my_number.value, my_number.value);
And the output is going to be:
index[0]: 252, 0xfc
index[1]: 10, 0xa
index[2]: 11, 0xb
index[3]: 12, 0xc
my_number.value: 202050300, 0xc0b0afc
And as for your final question, we wouldn't have to convert from unsigned char back to unsigned int, since the values are already there. You just have to choose which way you want to access them.
Note 1: I am using an integer of 4 bytes in order to ease the understanding of the concept. For the problem you presented you must use:
union unsigned_number {
unsigned short int value; // A short int is 2 bytes long
unsigned char index[2]; // A char is 1 byte long
};
Note 2: I have assigned byte 0 to 252 in order to point out the unsigned characteristic of our index field. Had it been declared as a signed char, we would have index[0]: -4, 0xfc as output.
Hey, I'm looking to convert an int that is input by the user into 4 bytes, which I am assigning to a character array. How can this be done?
Example:
Convert a user input of 175 to
00000000 00000000 00000000 10101111
Issue with all of the answers so far: converting 255 should result in 0 0 0 ff, although it prints out as 0 0 0 ffffffff.
unsigned int value = 255;
buffer[0] = (value >> 24) & 0xFF;
buffer[1] = (value >> 16) & 0xFF;
buffer[2] = (value >> 8) & 0xFF;
buffer[3] = value & 0xFF;
union {
unsigned int integer;
unsigned char byte[4];
} temp32bitint;
temp32bitint.integer = value;
buffer[8] = temp32bitint.byte[3];
buffer[9] = temp32bitint.byte[2];
buffer[10] = temp32bitint.byte[1];
buffer[11] = temp32bitint.byte[0];
both result in 0 0 0 ffffffff instead of 0 0 0 ff
Just another example: 175 as the input prints out as 0, 0, 0, ffffffaf when it should just be 0, 0, 0, af.
The portable way to do this (ensuring that you get 0x00 0x00 0x00 0xaf everywhere) is to use shifts:
unsigned char bytes[4];
unsigned long n = 175;
bytes[0] = (n >> 24) & 0xFF;
bytes[1] = (n >> 16) & 0xFF;
bytes[2] = (n >> 8) & 0xFF;
bytes[3] = n & 0xFF;
The methods using unions and memcpy() will get a different result on different machines.
The issue you are having is with the printing rather than the conversion. I presume you are using char rather than unsigned char, and you are using a line like this to print it:
printf("%x %x %x %x\n", bytes[0], bytes[1], bytes[2], bytes[3]);
When any types narrower than int are passed to printf, they are promoted to int (or unsigned int, if int cannot hold all the values of the original type). If char is signed on your platform, then 0xff likely does not fit into the range of that type, and it is being set to -1 instead (which has the representation 0xff on a 2s-complement machine).
-1 is promoted to an int, and has the representation 0xffffffff as an int on your machine, and that is what you see.
Your solution is to either actually use unsigned char, or else cast to unsigned char in the printf statement:
printf("%x %x %x %x\n", (unsigned char)bytes[0],
(unsigned char)bytes[1],
(unsigned char)bytes[2],
(unsigned char)bytes[3]);
Do you want to address the individual bytes of a 32-bit int? One possible method is a union:
union
{
unsigned int integer;
unsigned char byte[4];
} foo;
int main()
{
foo.integer = 123456789;
printf("%u %u %u %u\n", foo.byte[3], foo.byte[2], foo.byte[1], foo.byte[0]);
}
Note: corrected the printf to reflect unsigned values.
In your question, you stated that you want to convert a user input of 175 to
00000000 00000000 00000000 10101111, which is big endian byte ordering, also known as network byte order.
A mostly portable way to convert your unsigned integer to a big endian unsigned char array, as you suggested from that "175" example you gave, would be to use C's htonl() function (defined in the header <arpa/inet.h> on Linux systems) to convert your unsigned int to big endian byte order, then use memcpy() (defined in the header <string.h> for C, <cstring> for C++) to copy the bytes into your char (or unsigned char) array.
The htonl() function takes in an unsigned 32-bit integer as an argument (in contrast to htons(), which takes in an unsigned 16-bit integer) and converts it to network byte order from the host byte order (hence the acronym, Host TO Network Long, versus Host TO Network Short for htons), returning the result as an unsigned 32-bit integer. The purpose of this family of functions is to ensure that all network communications occur in big endian byte order, so that all machines can communicate with each other over a socket without byte order issues. (As an aside, for big-endian machines, the htonl(), htons(), ntohl() and ntohs() functions are generally compiled to just be a 'no op', because the bytes do not need to be flipped around before they are sent over or received from a socket since they're already in the proper byte order)
Here's the code:
#include <stdio.h>
#include <arpa/inet.h>
#include <string.h>
int main() {
    unsigned int number = 175;
    unsigned int number2 = htonl(number);
    unsigned char numberStr[4];    /* unsigned char, so the bytes print correctly below */
    memcpy(numberStr, &number2, 4);
    printf("%x %x %x %x\n", numberStr[0], numberStr[1], numberStr[2], numberStr[3]);
    return 0;
}
Note that, as caf said, you have to print the characters as unsigned characters using printf's %x format specifier.
The above code prints 0 0 0 af on my machine (an x86_64 machine, which uses little endian byte ordering), which is hex for 175.
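Going the other way is just the mirror image - a sketch using ntohl() and the same headers as above (the bytes here are the ones from the example):
unsigned char numberStr[4] = {0x00, 0x00, 0x00, 0xaf};  /* big-endian bytes as received */
unsigned int number2;
memcpy(&number2, numberStr, 4);        /* copy the raw bytes into an integer */
unsigned int number = ntohl(number2);  /* network byte order back to host order: 175 */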
You can try:
void CopyInt(int value, char* buffer) {
    memcpy(buffer, &value, sizeof(int));   /* copy the bytes of value, not the value cast to a pointer */
}
Why would you need an intermediate cast to void* in C++?
Because C++ doesn't allow implicit conversion between unrelated pointer types; you need reinterpret_cast, or casting through void* does the trick.
int a = 1;
char * c = (char*)(&a); //In C++ this should be an intermediate cast to void*
The issue with the conversion (the reason it's giving you a ffffff at the end) is because your hex integer (that you are using the & binary operator with) is interpreted as being signed. Cast it to an unsigned integer, and you'll be fine.
On most platforms, an int is equivalent to uint32_t and a char to uint8_t.
I'll show how I resolved client-server communication, sending the current time (4 bytes, as a Unix epoch timestamp) in a uint8_t array, and then rebuilding it on the other side. (Note: the protocol was to send 1024 bytes)
Client side
uint8_t message[1024];
uint32_t t = time(NULL);
uint8_t watch[4] = { t & 255, (t >> 8) & 255, (t >> 16) & 255, (t >> 24) & 255 };
message[0] = watch[0];
message[1] = watch[1];
message[2] = watch[2];
message[3] = watch[3];
send(socket, message, 1024, 0);
Server side
uint8_t res[1024];
uint32_t date;
recv(socket, res, 1024, 0);
date = res[0] + (res[1] << 8) + (res[2] << 16) + (res[3] << 24);
printf("Received message from client %d sent at %d\n", socket, date);
Hope it helps.
You can simply use memcpy as follows:
unsigned int value = 255;
char bytes[4] = {0, 0, 0, 0};
memcpy(bytes, &value, 4);
The problem arises because, when the bytes are passed to printf, each char is promoted to a 4-byte int rather than staying a 1-byte value as many think, so change it to
union {
unsigned int integer;
char byte[4];
} temp32bitint;
and cast to unsigned char while printing, so the default promotion to 'int' (which C performs on variadic arguments) does not sign-extend the value:
printf("%u, %u \n", (unsigned char)Buffer[0], (unsigned char)Buffer[1]);