Bit Manipulation on char array in c - c

If I am given a char array of size 8, where I know the first 3 bytes are the id, the next byte is the message, and the last 3 bytes are the values, how could I use bit manipulation to extract the message?
Example: a char array contains 9990111 (one integer per position), where 999 is the id, 0 is the message, and 111 is the value.
Any tips? Thanks!

Given:
the array contains {'9','9','9','0','1','1','1'}
Then you can convert with sscanf():
char buffer[8] = { '9', '9', '9', '0', '1', '1', '1', '\0' };
//char buffer[] = "9990111"; // More conventional but equivalent notation
int id;
int message;
int value;
if (sscanf(buffer, "%3d%1d%3d", &id, &message, &value) != 3) {
/* …conversion failed…inexplicably in this context… */
}
assert(id == 999);
assert(message == 0);
assert(value == 111);
But there's no bit manipulation needed there.

Well, if you want bit manipulation, no matter what, here it goes:
#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>
int main(void) {
char arr[8] = "9997111";
int msg = 0;
msg = ((ntohl(*(uint32_t *) arr)) & 0xff) - 48;
printf("%d\n", msg);
return 0;
}
Output:
7
Just remember one thing... this does not comply with strict aliasing rules. But you can use some memcpy() stuff to solve it.
Edit #1 (parsing it all, granting compliance with strict aliasing rules, and making you see that this does not make any sense):
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <arpa/inet.h>
int main(void) {
char arr[8] = "9997111";
uint32_t a[2];
unsigned int id = 0, msg = 0, val = 0;
memcpy(a, arr, 4);
memcpy(&a[1], arr + 4, 4);
a[0] = ntohl(a[0]);
a[1] = ntohl(a[1]);
id = ((((a[0] & 0xff000000) >> 24) - 48) * 100) + ((((a[0] & 0xff0000) >> 16)- 48) * 10) + (((a[0] & 0xff00) >> 8)- 48);
msg = (a[0] & 0xff) - 48;
val = ((((a[1] & 0xff000000) >> 24) - 48) * 100) + ((((a[1] & 0xff0000) >> 16)- 48) * 10) + (((a[1] & 0xff00) >> 8)- 48);
printf("%u\n", id);
printf("%u\n", msg);
printf("%u\n", val);
return 0;
}
Output:
999
7
111

The usual way would be to define a structure with members which are bit fields and correspond to the segmented information in your array. (Oh, re-reading your question: is the array filled with { '9', '9', ... }? Then you'd just sscanf the values with the proper offset into the array.)

You can use memcpy() to extract the values. Here is an example:
char *info = calloc(3 + 1, 1); // zero-filled, so the copied text stays NUL-terminated
char *info2 = calloc(1 + 1, 1);
char *info3 = calloc(3 + 1, 1);
memcpy(info, msgTest, 3);
memcpy(info2, msgTest + 3, 1);
memcpy(info3, msgTest + 4, 3);
printf("%s\n", msgTest);
printf("ID is %s\n", info);
printf("Code is %s\n", info2);
printf("Val is %s\n", info3);
Let's say msgTest = "0098457".
The print statements then output:
0098457
ID is 009
Code is 8
Val is 457
Hope this helps, Good luck!

Here is an example that uses neither malloc nor memcpy, which suits embedded devices where memory is limited. Note that there is no need to pack the struct because only 1 byte is examined. This is a C11 implementation. If you have, for example, 4 bytes to analyze, create another struct with 4 charbits members and point it at the data instead. This is consistent with common design patterns for embedded code. Be aware that the order of bit fields within a storage unit is implementation-defined, and that reading a char through a pointer to a different struct type is not strictly portable, so check the result on your compiler.
#include <stdio.h>
// start by creating a struct for the bits
typedef struct {
unsigned int bit0:1; //this is LSB
unsigned int bit1:1; //bit 1
unsigned int bit2:1;
unsigned int bit3:1;
unsigned int bit4:1;
unsigned int bit5:1;
unsigned int bit6:1;
unsigned int bit7:1;
}charbits; // exactly 8 one-bit fields: a char has no bit 8
int main(void)
{
// now assume we have a char to be converted into its bits
char a = 'a'; // ASCII code of 'a' is 97
charbits *x; // pointer used to view the character's bits
// first convert the address of a to a void pointer
void* p; // this is a void pointer
p=&a; // put the address of a into p
// now convert the void pointer to the struct pointer
x=(charbits *) p;
// now print the contents of the struct
printf("b0 %d b1 %d b2 %d b3 %d b4 %d b5 %d b6 %d b7 %d\n", x->bit0,x->bit1, x->bit2,x->bit3, x->bit4, x->bit5, x->bit6, x->bit7);
// 97 has bits like this 01100001
//b0 1 b1 0 b2 0 b3 0 b4 0 b5 1 b6 1 b7 0
// now we see that bit 0 is the LSB which is the first one in the struct
return 0;
}
// thank you and i hope this helps

Related

C integer to binary conversion, splitting the result into two binary values [duplicate]

I know that to get the number of bytes used by a variable type, you use sizeof(int) for instance. How do you get the value of the individual bytes used when you store a number with that variable type? (i.e. int x = 125.)
You have to know the number of bits (often 8) in each "byte". Then you can extract each byte in turn by ANDing the int with the appropriate mask. Imagine that an int is 32 bits, then to get 4 bytes out of the_int:
int a = (the_int >> 24) & 0xff; // high-order (leftmost) byte: bits 24-31
int b = (the_int >> 16) & 0xff; // next byte, counting from left: bits 16-23
int c = (the_int >> 8) & 0xff; // next byte, bits 8-15
int d = the_int & 0xff; // low-order byte: bits 0-7
And there you have it: each byte is in the low-order 8 bits of a, b, c, and d.
You can get the bytes by using some pointer arithmetic:
int x = 12578329; // 0xBFEE19
for (size_t i = 0; i < sizeof(x); ++i) {
// Convert to unsigned char* because a char is 1 byte in size.
// That is guaranteed by the standard.
// Note that it is NOT required to be 8 bits in size.
unsigned char byte = *((unsigned char *)&x + i);
printf("Byte %zu = %u\n", i, (unsigned)byte);
}
On my machine (Intel x86-64), the output is:
Byte 0 = 25 // 0x19
Byte 1 = 238 // 0xEE
Byte 2 = 191 // 0xBF
Byte 3 = 0 // 0x00
You could make use of a union but keep in mind that the byte ordering is processor dependent and is called Endianness http://en.wikipedia.org/wiki/Endianness
#include <stdio.h>
#include <stdint.h>
union my_int {
int val;
uint8_t bytes[sizeof(int)];
};
int main(int argc, char** argv) {
union my_int mi;
size_t idx; // same type as sizeof(int), avoids a signed/unsigned comparison
mi.val = 128;
for (idx = 0; idx < sizeof(int); idx++)
printf("byte %zu = %hhu\n", idx, mi.bytes[idx]);
return 0;
}
If you want to get that information, say for:
int value = -278;
(I selected that value because it isn't very interesting for 125 - the least significant byte is 125 and the other bytes are all 0!)
You first need a pointer to that value:
int* pointer = &value;
You can now cast that to an unsigned char pointer, which addresses one byte at a time, and get the individual bytes by indexing.
for (size_t i = 0; i < sizeof(value); i++) {
unsigned char thisbyte = *( ((unsigned char*) pointer) + i );
// do whatever processing you want.
}
Note that the order of bytes for ints and other data types depends on your system - look up 'big-endian' vs 'little-endian'.
This should work:
int x = 125;
unsigned char *bytes = (unsigned char *) (&x);
unsigned char byte0 = bytes[0];
unsigned char byte1 = bytes[1];
...
unsigned char byteN = bytes[sizeof(int) - 1];
But be aware that the byte order of integers is platform dependent.

combining MSB and LSB in short

I have a function that returns 1 byte:
uint8_t fun();
The function should run 9 times, so I get 9 bytes. I want to turn the last 8 into 4 short values. Here is what I've done, but I'm not sure that the values I get are correct:
char array[9];
.............
for ( i = 0; i< 9 ; i++){
array[i] = fun();
}
printf( " 1. Byte %x a = %d , b=%d c =%d d=%d\n" ,
array[0],
*(short*)&(array[1]),
*(short*)&(array[3]),
*(short*)&(array[5]),
*(short*)&(array[7]));
Is that right?
It's better to be explicit and join the 8-bit values into 16-bit values yourself:
uint8_t bytes[9];
uint16_t words[4];
words[0] = bytes[1] | (bytes[2] << 8);
words[1] = bytes[3] | (bytes[4] << 8);
words[2] = bytes[5] | (bytes[6] << 8);
words[3] = bytes[7] | (bytes[8] << 8);
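The above assumes the source bytes are in little-endian order (low byte first), by the way; the result does not depend on the host's endianness.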
You will get alignment problems. Any pointer to a short can be treated as a pointer to char, but the inverse is not guaranteed: array + 1 may not be suitably aligned for a short, and on some architectures a misaligned access traps.
IMHO, this would be safer:
struct {
char arr0;
union {
char array[8];
uint16_t sarr[4];
} u;
} s;
s.arr0 = fun();
for ( i = 0; i< 8 ; i++){
s.u.array[i] = fun();
}
printf( " 1. Byte %x a = %d , b=%d c =%d d=%d\n" ,
s.arr0,
s.u.sarr[0],
s.u.sarr[1],
s.u.sarr[2],
s.u.sarr[3]);
But I suppose you deal correctly with endianness on your machine and know how the conversion 2 chars <=> 1 short works ...
Try using a struct to arrange the data and shift operations to handle endianness.
// The existence of this function is assumed from the question.
extern unsigned char fun(void);
typedef struct
{
unsigned char Byte;
short WordA;
short WordB;
short WordC;
short WordD;
} converted_data;
void ConvertByteArray(converted_data* Dest, unsigned char* Source)
{
Dest->Byte = Source[0];
// The following assumes that the Source bytes are MSB first.
// If they are LSB first, you will need to swap the indices.
Dest->WordA = (((short)Source[1]) << 8) + Source[2];
Dest->WordB = (((short)Source[3]) << 8) + Source[4];
Dest->WordC = (((short)Source[5]) << 8) + Source[6];
Dest->WordD = (((short)Source[7]) << 8) + Source[8];
}
int main(void)
{
unsigned char array[9];
converted_data convertedData;
// Fill the array as per the question.
int i;
for ( i = 0; i< 9 ; i++)
{
array[i] = fun();
}
// Perform the conversion
ConvertByteArray(&convertedData, array);
// Note the use of %hd rather than %d to specify a short in the printf!
printf( " 1. Byte %x a = %hd , b=%hd c =%hd d =%hd\n",
(int)convertedData.Byte, // Cast as int because %x assumes an int.
convertedData.WordA,
convertedData.WordB,
convertedData.WordC,
convertedData.WordD );
return 0;
}

Read an int value using a char pointer and return it

static unsigned int read24(unsigned char *ptr)
{
unsigned int b0;
unsigned int b1;
unsigned int b2;
unsigned int b3;
b0 = *ptr++;
b1 = *ptr++;
b2 = *ptr++;
b3 = *ptr;
return ( ((b0 >> 24) & 0x000000ff) |
((b1 >> 8) & 0x0000ff00) |
((b2 << 8) & 0x00ff0000) |
(b3 << 24) & 0x00000000 // this byte is not important so make it zero
);
}
Here I have written a function and am trying to read 32 bits (4 bytes) using a char pointer and return those 32 bits (4 bytes). I doubt whether this will work properly. Also, am I wasting memory by defining 4 different integer variables? Is there a better way to write this function? Thank you for your time.
First, drop b3, since you're apparently meaning to read 24 bits you shouldn't even try to access that extra byte (what if it's not even allocated?).
Second, I think you have your shifts wrong. b0 will always be in the range [0..255], so if you >> 24, it'll become zero. There's also no need to mask anything out, since you're coming from unsigned char you know you'll only have 8 bits set. You probably want either:
return (b0 << 16) | (b1 << 8) | b2;
or
return (b2 << 16) | (b1 << 8) | b0;
depending on the endianness of your data.
As for using those intermediate ints, if you have a decent compiler it won't matter (the compiler will optimize them out). If, however, you're writing for an embedded platform or otherwise have a less than state-of-the-art compiler, eliding the intermediate ints may help performance. In that case, don't put multiple ptr++ expressions in the same statement; use ptr[n] instead to avoid undefined behavior from multiple unsequenced increments.
Well, I'm not too clear on what you're attempting to do. If I'm not mistaken, you want to take a char* and read the bytes it points to as an unsigned int (4 bytes on most systems).
If all you want is to view a char* set of bytes as an unsigned int, you can use a cast (this assumes ptr is suitably aligned for unsigned int, and the byte order of the result depends on the machine):
unsigned int* result = (unsigned int*)ptr;
If you want the same collection of bytes BUT you want the most significant byte to be equal to 0, then you can mask the value that is read:
unsigned int result = *(unsigned int*)ptr & 0x00FFFFFF;
Some additional info:
-Type casting converts a value to another type for the duration of one expression: the cast produces a temporary copy of the requested type and leaves the original variable unchanged.
Example:
unsigned int varX = 48;
//Prints "Ascii decimal value 48 corresponds with: 0"
printf ("Ascii decimal value 48 corresponds with: %c\n", (char)varX);
-Hexadecimal digits occupy 4 bits each, so two digits make one byte. So in your code:
0x000000ff -> 4 bytes of data
0x means that the digits that follow are hexadecimal.
The mask 0x000000ff therefore keeps the least significant byte and makes all the other bytes 0.
ANSI C accepts hexadecimal (prefix 0x), octal (prefix 0), and decimal constants.
Hope this helped!
When building your number from the individual pointers, you must shift the numbers to the left as you incrementally Or the values together. (for little endian machines). Think of it this way, after you read b0, that will be the least significant byte in your final number. Where do more significant bytes go? (to the left).
When you read a pointer value into b0, b1, b2, b3, all they hold is one byte each. They have no way of knowing where they came from in the original number, so there is no "relative" shifting required. You just start with the least significant byte, and incrementally shift each successive byte to the left by 1 byte more than the last.
Below, I have used all bytes in the building of the unsigned value from the unsigned char pointers as an example. You can simply omit bytes you do not need to meet your needs.
#include <stdio.h>
#include <stdlib.h>
#if defined(__LP64__) || defined(_LP64)
# define BUILD_64 1
#endif
#ifdef BUILD_64
# define BITS_PER_LONG 64
#else
# define BITS_PER_LONG 32
#endif
char *binstr (unsigned long n);
static unsigned int read24 (unsigned char *ptr);
int main (void) {
unsigned int n = 16975631;
unsigned int o = 0;
o = read24 ((unsigned char *)&n);
printf ("\n number : %u %s\n", n, binstr (n));
printf (" read24 : %u %s\n\n", o, binstr (o));
return 0;
}
static unsigned int read24 (unsigned char *ptr)
{
unsigned char b0;
unsigned char b1;
unsigned char b2;
unsigned char b3;
b0 = *ptr++; /* 00001111000001110000001100000001 */
b1 = *ptr++; /* b0 b1 b2 b3 */
b2 = *ptr++; /* b3 b2 b1 b0 */
b3 = *ptr; /* 00000001000000110000011100001111 */
/* cast before shifting: a promoted int shifted into its sign bit is undefined */
return (((unsigned int)b0) |
((unsigned int)b1 << 8 ) |
((unsigned int)b2 << 16) |
((unsigned int)b3 << 24));
}
/* simple return of binary string */
char *binstr (unsigned long n)
{
static char s[BITS_PER_LONG + 1] = {0};
char *p = s + BITS_PER_LONG;
if (!n) {
*s = '0';
return s;
}
while (n) {
*(--p) = (n & 1) ? '1' : '0';
n >>= 1;
}
return p;
}
Output
$ ./bin/rd_int_as_uc
number : 16975631 1000000110000011100001111
read24 : 16975631 1000000110000011100001111
Consider using the following approach for your task:
#include <string.h>
unsigned int read24b(unsigned char *ptr)
{
unsigned int data = 0;
memcpy(&data, ptr, 3);
return data;
}
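This applies if you want the bytes in memory order: on a little-endian machine ptr[0] becomes the least significant byte, while on a big-endian machine the three bytes land in the high-order end of data. I suppose that is not what you want...
Concerning your code: you must apply the mask and then shift left, e.g.: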
unsigned int read24(unsigned char *ptr)
{
unsigned char b0;
unsigned char b1;
unsigned char b2;
b0 = *ptr++;
b1 = *ptr++;
b2 = *ptr;
return ( (b0 & 0x0ff) << 16 |
(b1 & 0x0ff) << 8 |
(b2 & 0x0ff)
);
}

C, how to put number split over an array, into int

Let's say I have char array[10] and [0] = '1', [1] = '2', [2] = '3', etc.
How would I go about creating (int) 123 from these elements, using C?
I wish to implement this on an Arduino board, which is limited to just under 2 KB of SRAM, so resourcefulness and efficiency are key.
With thanks to Sourav Ghosh, I solved this with a custom function to suit:
long makeInt(char one, char two, char three, char four){
char tmp[5];
tmp[0] = one;
tmp[1] = two;
tmp[2] = three;
tmp[3] = four;
tmp[4] = '\0'; // strtol() needs a NUL-terminated string
char *ptr;
long ret;
ret = strtol(tmp, &ptr, 10);
return ret;
}
I think what you need to know is strtol(). Read details here.
Just to quote the essential part
long int strtol(const char *nptr, char **endptr, int base);
The strtol() function converts the initial part of the string in nptr to a long integer value according to the given base, which must be between 2 and 36 inclusive, or be the special value 0.
int i = ((array[0] << 24) & 0xff000000) |
((array[1] << 16) & 0x00ff0000) |
((array[2] << 8) & 0x0000ff00) |
((array[3] << 0) & 0x000000ff);
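Note that this packs four raw byte values into one int, most significant byte first; it does not convert the digit characters '1', '2', '3' into the number 123.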
If you have no library with strtol() or atoi() available, use this (the array must be NUL-terminated):
int result = 0;
for(char* p = array; *p; )
{
result += *p++ - '0';
if(*p) result *= 10; // more digits to come
}

Resources