Separate byte in groups

Separate byte in groups - c

So Im using this (its from another question I did),
unsigned char *y = resultado->informacion;
int i = 0;
int tam = data->tamanho;
unsigned char resAfter;
for (int i=0; i<tam;i++)
{
unsigned char x = data->informacion[i];
x <<= 3;
if (i>0)
{
resAfter = (resAfter << 5) | x;
}
else
{
resAfter = x;
}
}
printf("resAfter es %s\n", resAfter);
so at the end I have this really long number (Im estimating about 43 bits), how can I get groups of 8 bits, I think im gettin something like (010101010101010.....000) and I want to separate this in groups of 8.
Another question, I know for sure that resAfter is going to have n number of bits where n is a multiply of 8 plus 3, so my question is: is this possible? or c is going to complete the byte? like if I get 43 bits then c is going to fill them with 0 and complete so I have 48 bits; and is there a way to delete these 3 bits?
Im new on c and bitwise so sorry if what Im doing is reallly bad.

Basically in programming you deal with bytes (i think, at least in most cases), in C you deal with types of specific size (depending on system you run it on).
That said char usually has size of 1 byte, and I don't really think you can playing around with single bits. I mean u can do operation on them (<< for instance) in scale of single bits but i don't know of any standard way to preserve less than 8 bits in variable in C (though i may be wrong about it)

Related

How to analyze bytes of a variable's value in C

is it possible to divide for example an integer in n bits?
For example, since an int variable has a size of 32 bits (4 bytes) is it possible to divide the number in 4 "pieces" of 8 bits and put them in 4 other variables that have a size of 8 bits?

I solved using unsigned char *pointer pointing to the variable that I want to analyze bytes, something like this:
int x = 10;
unsigned char *p = (unsigned char *) &x;
//Since my cpu is little endian I'll print bytes from the end
for(int i = sizeof(int) - 1; i >= 0; i--)
//print hexadecimal bytes
printf("%.2x ", p[i]);

Yes, of course it is. But generally we just use bit operations directly on the bits (called bitops) using bitwise operators defined for all discrete integer types.
For instance, if you need to test the 5th least significant bit you can use x &= 1 << 4 to have x just to have the 5th bit set, and all others set to zero. Then you can use if (x) to test if it has been set; C doesn't use a boolean type but assumes that zero is false and any other value means true. If you store 1 << 4 into a constant then you have created a "(bit) mask" for that particular bit.
If you need a value 0 or 1 then you can use a shift the other way and use x = (x >> 4) & 1. This is all covered in most C books, so I'd implore you to read about these bit operations there.
There are many Q/A's here how to split integers into bytes, see e.g. here. In principle you can store those in a char, but if you may require integer operations then you can also split the int into multiple values. One problem with that is that an int is just defined to at least store values from -32768 to 32767. That means that the number of bytes in an int can be 2 bytes or more.
In principle it is also possible to use bit fields but I'd be hesitant to use those. With an int you will at least know that the bits will be stored in the least significant bits.

C - Method for setting all even-numbered bits to 1

I was charged with the task of writing a method that "returns the word with all even-numbered bits set to 1." Being completely new to C this seems really confusing and unclear. I don't understand how I can change the bits of a number with C. That seems like a very low level instruction, and I don't even know how I would do that in Java (my first language)! Can someone please help me! This is the method signature.
int evenBits(void){
return 0;
}
Any instruction on how to do this or even guidance on how to begin doing this would be greatly appreciated. Thank you so much!

Break it down into two problems.
(1) Given a variable, how do I set particular bits?
Hint: use a bitwise operator.
(2) How do I find out the representation of "all even-numbered bits" so I can use a bitwise operator to set them?
Hint: Use math. ;-) You could make a table (or find one) such as:
Decimal | Binary
--------+-------
0 | 0
1 | 1
2 | 10
3 | 11
... | ...
Once you know what operation to use to set particular bits, and you know a decimal (or hexadecimal) integer literal to use that with in C, you've solved the problem.

You must give a precise definition of all even numbered bits. Bits are numbered in different ways on different architectures. Hardware people like to number them from 1 to 32 from the least significant to the most significant bit, or sometimes the other way, from the most significant to the least significant bit... while software guys like to number bits by increasing order starting at 0 because bit 0 represents the number 20, ie: 1.
With this latter numbering system, the bit pattern would be 0101...0101, thus a value in hex 0x555...555. If you number bits starting at 1 for the least significant bit, the pattern would be 1010...1010, in hex 0xAAA...AAA. But this representation actually encodes a negative value on current architectures.
I shall assume for the rest of this answer that even numbered bits are those representing even powers of 2: 1 (20), 4 (22), 16 (24)...
The short answer for this problem is:
int evenBits(void) {
return 0x55555555;
}
But what if int has 64 bits?
int evenBits(void) {
return 0x5555555555555555;
}
Would handle 64 bit int but would have implementation defined behavior on systems where int is smaller.
Using macros from <limits.h>, you could mask off the extra bits to handle 16, 32 and 64 bit ints:
#include <limits.h>
int evenBits(void) {
return 0x5555555555555555 & INT_MAX;
}
But this code still makes some assumptions:
int has at most 64 bits.
int has an even number of bits.
INT_MAX is a power of 2 minus 1.
These assumptions are valid for most current systems, but the C Standard allows for implementations where one or more are invalid.

So basically every other bit has to be set to one? This is why we have bitwise operations in C. Imagine a regular bitarray. What you want is the right most even bit and set it to 1(this is the number 2). Then we just use the OR operator (|) to modify our existing number. After doing that. we bitshift the number 2 places to the left (<< 2), this modifies the bit array to 1000 compared to the previous 0010. Then we do the same again and use the or operator. The code below describes it better.
#include <stdio.h>
unsigned char SetAllEvenBitsToOne(unsigned char x);
int IsAllEvenBitsOne(unsigned char x);
int main()
{
unsigned char x = 0; //char is one byte data type ie. 8 bits.
x = SetAllEvenBitsToOne(x);
int check = IsAllEvenBitsOne(x);
if(check==1)
{
printf("shit works");
}
return 0;
}
unsigned char SetAllEvenBitsToOne(unsigned char x)
{
int i=0;
unsigned char y = 2;
for(i=0; i < sizeof(char)*8/2; i++)
{
x = x | y;
y = y << 2;
}
return x;
}
int IsAllEvenBitsOne(unsigned char x)
{
unsigned char y;
for(int i=0; i<(sizeof(char)*8/2); i++)
{
y = x >> 7;
if(y > 0)
{
printf("x before: %d\t", x);
x = x << 2;
printf("x after: %d\n", x);
continue;
}
else
{
printf("Not all even bits are 1\n");
return 0;
}
}
printf("All even bits are 1\n");
return 1;
}
Here is a link to Bitwise Operations in C

Need help understanding bitmaps, bitwise operations, and C

Disclaimer: I am asking these questions in relation to an assignment. The assignment itself calls for implementing a bitmap and doing some operations with that, but that is not what I am asking about. I just want to understand the concepts so I can try the implementation for myself.
I need help understanding bitmaps/bit arrays and bitwise operations. I understand the basics of binary and how left/right shift work, but I don't know exactly how that use is beneficial.
Basically, I need to implement a bitmap to store the results of a prime sieve (of Eratosthenes.) This is a small part of a larger assignment focused on different IPC methods, but to get to that part I need to get the sieve completed first. I've never had to use bitwise operations nor have I ever learned about bitmaps, so I'm kind of on my own to learn this.
From what I can tell, bitmaps are arrays of a bit of a certain size, right? By that I mean you could have an 8-bit array or a 32-bit array (in my case, I need to find the primes for a 32-bit unsigned int, so I'd need the 32-bit array.) So if this is an array of bits, 32 of them to be specific, then we're basically talking about a string of 32 1s and 0s. How does this translate into a list of primes? I figure that one method would evaluate the binary number and save it to a new array as decimal, so all the decimal primes exist in one array, but that seems like you're using too much data.
Do I have the gist of bitmaps? Or is there something I'm missing? I've tried reading about this around the internet but I can't find a source that makes it clear enough for me...

Suppose you have a list of primes: {3, 5, 7}. You can store these numbers as a character array: char c[] = {3, 5, 7} and this requires 3 bytes.
Instead lets use a single byte such that each set bit indicates that the number is in the set. For example, 01010100. If we can set the byte we want and later test it we can use this to store the same information in a single byte. To set it:
char b = 0;
// want to set `3` so shift 1 twice to the left
b = b | (1 << 2);
// also set `5`
b = b | (1 << 4);
// and 7
b = b | (1 << 6);
And to test these numbers:
// is 3 in the map:
if (b & (1 << 2)) {
// it is in...

You are going to need a lot more than 32 bits.
You want a sieve for up to 2^32 numbers, so you will need a bit for each one of those. Each bit will represent one number, and will be 0 if the number is prime and 1 if it is composite. (You can save one bit by noting that the first bit must be 2 as 1 is neither prime nor composite. It is easier to waste that one bit.)
2^32 = 4,294,967,296
Divide by 8
536,870,912 bytes, or 1/2 GB.
So you will want an array of 2^29 bytes, or 2^27 4-byte words, or whatever you decide is best, and also a method for manipulating the individual bits stored in the chars (ints) in the array.
It sounds like eventually, you are going to have several threads or processes operating on this shared memory.You may need to store it all in a file if you can't allocate all that memory to yourself.
Say you want to find the bit for x. Then let a = x / 8 and b = x - 8 * a. Then the bit is at arr[a] & (1 << b). (Avoid the modulus operator % wherever possible.)
//mark composite
a = x / 8;
b = x - 8 * a;
arr[a] |= 1 << b;
This sounds like a fun assignment!

A bitmap allows you to construct a large predicate function over the range of numbers you're interested in. If you just have a single 8-bit char, you can store Boolean values for each of the eight values. If you have 2 chars, it doubles your range.
So, say you have a bitmap that already has this information stored, your test function could look something like this:
bool num_in_bitmap (int num, char *bitmap, size_t sz) {
if (num/8 >= sz) return 0;
return (bitmap[num/8] >> (num%8)) & 1;
}

Combining two integers with bit-shifting

I am writing a program, I have to store the distances between pairs of numbers in a hash table.
I will be given a Range R. Let's say the range is 5.
Now I have to find distances between the following pairs:
1 2
1 3
1 4
1 5
2 3
2 4
2 5
3 4
3 5
4 5
that is, the total number of pairs is (R^2/2 -R). I want to store it in a hash table. All these are unsigned integers. So there are 32 bits. My idea was that, I take an unsigned long (64 bits).
Let's say I need to hash the distance between 1 and 5. Now
long k = 1;
k = k<<31;
k+=5;
Since I have 64 bits, I am storing the first number in the first 31 bits and the second number in the second 31 bits. This guarantees unique keys which can then be used for hashing.
But when I do this:
long k = 2;
k << 31;
k+= 2;
The value of k becomes zero.
I am not able to wrap my head around this shifting concept.
Essentially what I am trying to achieve is that,
An unsigned long has | 32bits | 32 bits |
Store |1st integer|2nd integer|
How can I achieve this to get unique keys for each pair?
I am running the code on a 64 bit AMD Opteron processor. sizeof(ulong) returns 8. So it is 64 bits. Do I need a long long in such a case?
Also I need to know if this will create unique keys? From my understanding , it does seem to create unique keys. But I would like a confirmation.

Assuming you're using C or something that follows vaguely similar rules, your problem is primarily with types.
long k = 2; // This defines `k` a a long
k << 31; // This (sort of) shifts k left, but still as a 32-bit long.
What you almost certainly want/need to do is convert k to a long long before you shift it left, so you're shifting in a 64-bit word.
unsigned long first_number = 2;
unsigned long long both_numbers = (unsigned long long)first_number << 32;
unsigned long second_number = 5;
both_numbers |= second_number;
In this case, if (for example) you print out both_numbers, in hexadecimal, you should get 0x0000000200000005.

The concept makes sense. As Oli has added, you want to shift by 32, not 31 - shifting by 31 will put it in the 31st bit, so if you shifted back to the right to try and get the first number you would end up with a bit missing, and the second number would be potentially huge because you could have put a 1 in the uppermost bit.
But if you want to do bit manipulation, I would do instead:
k = 1 << 32;
k = k|5;
It really should produce the same result though. Are you sure that long is 64 bits on your machine? This is not always the case (although it usually is, I think). If long is actually 32 bits, 2<<31 will result in 0.
How large is R? You can get away with a 32 bit sized variable if R doesn't go past 65535...

how is data stored at bit level according to "Endianness"?

I read about Endianness and understood squat...
so I wrote this
main()
{
int k = 0xA5B9BF9F;
BYTE *b = (BYTE*)&k; //value at *b is 9f
b++; //value at *b is BF
b++; //value at *b is B9
b++; //value at *b is A5
}
k was equal to A5 B9 BF 9F
and (byte)pointer "walk" o/p was 9F BF b9 A5
so I get it bytes are stored backwards...ok.
~
so now I thought how is it stored at BIT level...
I means is "9f"(1001 1111) stored as "f9"(1111 1001)?
so I wrote this
int _tmain(int argc, _TCHAR* argv[])
{
int k = 0xA5B9BF9F;
void *ptr = &k;
bool temp= TRUE;
cout<<"ready or not here I come \n"<<endl;
for(int i=0;i<32;i++)
{
temp = *( (bool*)ptr + i );
if( temp )
cout<<"1 ";
if( !temp)
cout<<"0 ";
if(i==7||i==15||i==23)
cout<<" - ";
}
}
I get some random output
even for nos. like "32" I dont get anything sensible.
why ?

Just for completeness, machines are described in terms of both byte order and bit order.
The intel x86 is called Consistent Little Endian because it stores multi-byte values in LSB to MSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31.
The Motorola 68000 is called Inconsistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31 (same as intel, which is why it is called 'Inconsistent' Big Endian).
The 32-bit IBM/Motorola PowerPC is called Consistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^31 and b31 = 2^0.
Under normal high level language use the bit order is generally transparent to the developer. When writing in assembly language or working with the hardware, the bit numbering does come into play.

Endianness, as you discovered by your experiment refers to the order that bytes are stored in an object.
Bits do not get stored differently, they're always 8 bits, and always "human readable" (high->low).
Now that we've discussed that you don't need your code... About your code:
for(int i=0;i<32;i++)
{
temp = *( (bool*)ptr + i );
...
}
This isn't doing what you think it's doing. You're iterating over 0-32, the number of bits in a word - good. But your temp assignment is all wrong :)
It's important to note that a bool* is the same size as an int* is the same size as a BigStruct*. All pointers on the same machine are the same size - 32bits on a 32bit machine, 64bits on a 64bit machine.
ptr + i is adding i bytes to the ptr address. When i>3, you're reading a whole new word... this could possibly cause a segfault.
What you want to use is bit-masks. Something like this should work:
for (int i = 0; i < 32; i++) {
unsigned int mask = 1 << i;
bool bit_is_one = static_cast<unsigned int>(ptr) & mask;
...
}

Your machine almost certainly can't address individual bits of memory, so the layout of bits inside a byte is meaningless. Endianness refers only to the ordering of bytes inside multibyte objects.
To make your second program make sense (though there isn't really any reason to, since it won't give you any meaningful results) you need to learn about the bitwise operators - particularly & for this application.

Byte Endianness
On different machines this code may give different results:
union endian_example {
unsigned long u;
unsigned char a[sizeof(unsigned long)];
} x;
x.u = 0x0a0b0c0d;
int i;
for (i = 0; i< sizeof(unsigned long); i++) {
printf("%u\n", (unsigned)x.a[i]);
}
This is because different machines are free to store values in any byte order they wish. This is fairly arbitrary. There is no backwards or forwards in the grand scheme of things.
Bit Endianness
Usually you don't have to ever worry about bit endianness. The most common way to access individual bits is with shifts ( >>, << ) but those are really tied to values, not bytes or bits. They preform an arithmatic operation on a value. That value is stored in bits (which are in bytes).
Where you may run into a problem in C with bit endianness is if you ever use a bit field. This is a rarely used (for this reason and a few others) "feature" of C that allows you to tell the compiler how many bits a member of a struct will use.
struct thing {
unsigned y:1; // y will be one bit and can have the values 0 and 1
signed z:1; // z can only have the values 0 and -1
unsigned a:2; // a can be 0, 1, 2, or 3
unsigned b:4; // b is just here to take up the rest of the a byte
};
In this the bit endianness is compiler dependant. Should y be the most or least significant bit in a thing? Who knows? If you care about the bit ordering (describing things like the layout of a IPv4 packet header, control registers of device, or just a storage formate in a file) then you probably don't want to worry about some different compiler doing this the wrong way. Also, compilers aren't always as smart about how they work with bit fields as one would hope.

This line here:
temp = *( (bool*)ptr + i );
... when you do pointer arithmetic like this, the compiler moves the pointer on by the number you added times the sizeof the thing you are pointing to. Because you are casting your void* to a bool*, the compiler will be moving the pointer along by the size of one "bool", which is probably just an int under the covers, so you'll be printing out memory from further along than you thought.
You can't address the individual bits in a byte, so it's almost meaningless to ask which way round they are stored. (Your machine can store them whichever way it wants and you won't be able to tell). The only time you might care about it is when you come to actually spit bits out over a physical interface like I2C or RS232 or similar, where you have to actually spit the bits out one-by-one. Even then, though, the protocol would define which order to spit the bits out in, and the device driver code would have to translate between "an int with value 0xAABBCCDD" and "a bit sequence 11100011... [whatever] in protocol order".