Structure for an array of bits in C

It has come to my attention that there is no builtin structure for a single bit in C. There are (unsigned) char, which is 8 bits (one byte), and larger types such as int and long, and so on (uint64_t, bool...)
I came across this while coding up a Huffman tree, where the encodings for certain characters were not necessarily exactly 8 bits long (like 00101), so there was no efficient way to store them. I had to fall back on makeshift solutions such as strings or boolean arrays, but these take far more memory.
But anyways, my question is more general: is there a good way to store an array of bits, or some sort of user-defined struct? I scoured the web for one but the smallest structure seems to be 8 bits (one byte). I tried things such as int a : 1 but it didn't work. I read about bit fields but they do not simply achieve exactly what I want to do. I know questions have already been asked about this in C++ and if there is a struct for a single bit, but mostly I want to know specifically what would be the most memory-efficient way to store an encoding such as 00101 in C.

If you're mainly interested in accessing a single bit at a time, you can take an array of unsigned char and treat it as a bit array. For example:
unsigned char array[125];
Assuming 8 bits per byte, this can be treated as an array of 1000 bits. The first 16 bits logically look like this:
     +---------------------------------------+---------------------------------------+
byte |                   0                   |                   1                   |
     +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
bit  |  0 |  1 |  2 |  3 |  4 |  5 |  6 |  7 |  8 |  9 | 10 | 11 | 12 | 13 | 14 | 15 |
     +----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
Let's say you want to work with bit b. You can then do the following:
Read bit b:
value = (array[b/8] & (1 << (b%8))) != 0;
Set bit b:
array[b/8] |= (1 << (b%8));
Clear bit b:
array[b/8] &= ~(1 << (b%8));
Dividing the bit number by 8 gets you the relevant byte. Similarly, mod'ing the bit number by 8 gives you the relevant bit inside of that byte. You then left shift the value 1 by the bit number to give you the necessary bit mask.
While there is integer division and modulus at work here, the divisor is a power of 2, so any decent compiler should replace them with bit shifting/masking.
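For reference, here is a minimal, self-contained sketch that wraps those three expressions in helper functions (the names bit_get, bit_set and bit_clear are just illustrative):
#include <stdio.h>
#include <string.h>

/* 1000 bits packed into 125 bytes */
static unsigned char array[125];

static int bit_get(const unsigned char *a, unsigned b)
{
    return (a[b / 8] & (1u << (b % 8))) != 0;
}

static void bit_set(unsigned char *a, unsigned b)
{
    a[b / 8] |= (1u << (b % 8));
}

static void bit_clear(unsigned char *a, unsigned b)
{
    a[b / 8] &= ~(1u << (b % 8));
}

int main(void)
{
    memset(array, 0, sizeof array);
    bit_set(array, 42);
    printf("%d\n", bit_get(array, 42));  /* prints 1 */
    bit_clear(array, 42);
    printf("%d\n", bit_get(array, 42));  /* prints 0 */
    return 0;
}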

It has come to my attention that there is no builtin structure for a single bit in C.
That is true, and it makes sense because essentially no machines have bit-addressable memory.
But anyways, my question is more general: is there a good way to store
an array of bits, or some sort of user-defined struct?
One generally uses an unsigned char or another unsigned integer type, or an array of such. Along with that you need some masking and shifting to set or read the values of individual bits.
I scoured the
web for one but the smallest structure seems to be 8 bits (one byte).
Technically, the smallest addressable storage unit ([[un]signed] char) could be larger than 8 bits, though you're unlikely ever to see that.
I tried things such as int a : 1 but it didn't work. I read about bit
fields but they do not simply achieve exactly what I want to do.
Bit fields can appear only as structure members. A structure object containing such a bitfield will still have a size that is a multiple of the size of a char, so that doesn't map very well onto a bit array or any part of one.
I
know questions have already been asked about this in C++ and if there
is a struct for a single bit, but mostly I want to know specifically
what would be the most memory-efficient way to store an encoding such
as 00101 in C.
If you need a bit pattern and a separate bit count -- such as if some of the bits available in the bit-storage object are not actually significant -- then you need a separate datum for the significant-bit count. If you want a data structure for a small but variable number of bits, then you might go with something along these lines:
struct bit_array_small {
    unsigned char bits;
    unsigned char num_bits;
};
Of course, you can make that larger by choosing a different data type for the bits member and, maybe, the num_bits member. I'm sure you can see how you might extend the concept to handling arbitrary-length bit arrays if you should happen to need that.
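For instance, the encoding 00101 from the question could be stored in that struct like this (a minimal sketch; the printing loop is only for illustration):
#include <stdio.h>

struct bit_array_small {
    unsigned char bits;      /* the bit pattern, least significant bits used */
    unsigned char num_bits;  /* how many of those bits are significant */
};

int main(void)
{
    /* the Huffman code 00101 is 5 significant bits with value 0b00101 == 5 */
    struct bit_array_small code = { 0x05, 5 };

    /* print the most significant of the stored bits first */
    for (int i = code.num_bits - 1; i >= 0; i--)
        putchar((code.bits >> i) & 1 ? '1' : '0');
    putchar('\n');   /* prints 00101 */
    return 0;
}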

If you really want the most memory efficiency, you can encode the Huffman tree itself as a stream of bits. See, for example:
https://www.siggraph.org/education/materials/HyperGraph/video/mpeg/mpegfaq/huffman_tutorial.html
Then just encode those bits as an array of bytes, with a possible waste of 7 bits.
But that would be a horrible idea. For the structure in memory to be useful, it must be easy to access. You can still do that very efficiently. Let's say you want to encode up to 12-bit codes. Use a 16-bit integer and bitfields:
struct huffcode {
    uint16_t length: 4,
             value: 12;
};
C will store this as a single 16-bit value, and allow you to access the length and value fields separately. The complete Huffman node would also contain the input code value, and tree pointers (which, if you want further compactness, can be integer indices into an array).
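A quick sketch of how such a packed code might be used (this assumes the compiler accepts uint16_t as a bit-field type and packs both fields into one 16-bit unit, which is the common case but not something the standard guarantees):
#include <stdint.h>
#include <stdio.h>

struct huffcode {
    uint16_t length: 4,   /* number of significant bits, 0..12 */
             value: 12;   /* the code bits themselves */
};

int main(void)
{
    struct huffcode c = { .length = 5, .value = 0x05 };  /* the code 00101 */

    printf("sizeof(struct huffcode) = %zu\n", sizeof(struct huffcode));
    for (int i = c.length - 1; i >= 0; i--)
        putchar((c.value >> i) & 1 ? '1' : '0');
    putchar('\n');  /* prints 00101 */
    return 0;
}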

You can make your own bit array in no time.
#define ba_set(ptr, bit) { (ptr)[(bit) >> 3] |= (char)(1 << ((bit) & 7)); }
#define ba_clear(ptr, bit) { (ptr)[(bit) >> 3] &= (char)(~(1 << ((bit) & 7))); }
#define ba_get(ptr, bit) ( ((ptr)[(bit) >> 3] & (char)(1 << ((bit) & 7))) ? 1 : 0 )
#define ba_setbit(ptr, bit, value) { if (value) { ba_set((ptr), (bit)) } else { ba_clear((ptr), (bit)); } }
#include <string.h>

#define BITARRAY_BITS (120)

int main(void)
{
    char mybits[(BITARRAY_BITS + 7) / 8];

    memset(mybits, 0, sizeof(mybits));
    ba_setbit(mybits, 33, 1);
    if (!ba_get(mybits, 33))
        return 1;
    return 0;
}

Related

Converting 32 bit number to four 8bit numbers

I am trying to convert the input from a device (always integer between 1 and 600000) to four 8-bit integers.
For example,
If the input is 32700, I want 188 127 00 00.
I achieved this by using:
32700 % 256
32700 / 256
The above works till 32700. From 32800 onward, I start getting incorrect conversions.
I am totally new to this and would like some help to understand how this can be done properly.
Major edit following clarifications:
Given that someone has already mentioned the shift-and-mask approach (which is undeniably the right one), I'll give another approach, which, to be pedantic, is not portable, machine-dependent, and possibly exhibits undefined behavior. It is nevertheless a good learning exercise, IMO.
For various reasons, your computer represents integers as groups of 8-bit values (called bytes); note that, although extremely common, this is not always the case (see CHAR_BIT). For this reason, values that are represented using more than 8 bits use multiple bytes (hence they use a number of bits which is a multiple of 8). For a 32-bit value, you use 4 bytes and, in memory, those bytes always follow each other.
We call a pointer a value containing the address in memory of another value. In that context, a byte is defined as the smallest (in terms of bit count) value that can be referred to by a pointer. For example, your 32-bit value, covering 4 bytes, will have 4 "addressable" cells (one per byte) and its address is defined as the first of those addresses:
|====================|
|  MEMORY  | ADDRESS |
|==========|=========|
|   ...    |   x-1   | <== Pointer to byte before
|----------|---------|
|  BYTE 0  |    x    | <== Pointer to first byte (also pointer to the 32-bit value)
|----------|---------|
|  BYTE 1  |   x+1   | <== Pointer to second byte
|----------|---------|
|  BYTE 2  |   x+2   | <== Pointer to third byte
|----------|---------|
|  BYTE 3  |   x+3   | <== Pointer to fourth byte
|----------|---------|
|   ...    |   x+4   | <== Pointer to byte after
|====================|
So what you want to do (split the 32-bit word into 8-bit words) has already been done by your computer, as it is imposed on it by its processor and/or memory architecture. To reap the benefits of this almost-coincidence, we are going to find where your 32-bit value is stored and read its memory byte-by-byte (instead of 32 bits at a time).
As all serious SO answers seem to do, let me cite the Standard (ISO/IEC 9899:2018, 6.2.5-20) to define the last thing I need (emphasis mine):
Any number of derived types can be constructed from the object and function types, as follows:
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. [...] Array types are characterized by their element type and by the number of elements in the array. [...]
[...]
So, as elements in an array are defined to be contiguous, a 32-bit value in memory, on a machine with 8-bit bytes, really is nothing more, in its machine representation, than an array of 4 bytes!
Given a 32-bit signed value:
int32_t value;
its address is given by &value. Meanwhile, an array of 4 8-bit bytes may be represented by:
uint8_t arr[4];
Notice that I use the unsigned variant because those bytes don't really represent a number per se, so interpreting them as "signed" would not make sense. Now, a pointer-to-array-of-4-uint8_t is defined as:
uint8_t (*ptr)[4];
and if I assign the address of our 32-bit value to such a pointer, I will be able to index each byte individually, which means that I will be reading the bytes directly, avoiding any pesky shifting-and-masking operations!
uint8_t (*bytes)[4] = (void *) &value;
I need to cast the pointer ("(void *)") to silence the whining compiler: &value's type is "pointer-to-int32_t" while I'm assigning it to a "pointer-to-array-of-4-uint8_t", and this type mismatch is caught by the compiler and pedantically warned against by the Standard. This is a first warning that what we're doing is not ideal!
Finally, we can access each byte individually by reading it directly from memory through indexing: (*bytes)[n] reads the n-th byte of value!
To put it all together, given a send_can(uint8_t) function:
for (size_t i = 0; i < sizeof(*bytes); i++)
    send_can((*bytes)[i]);
and, for testing purpose, we define:
void send_can(uint8_t b)
{
    printf("%hhu\n", b);
}
which prints, on my machine, when value is 32700:
188
127
0
0
Lastly, this shows yet another reason why this method is platform-dependent: the order in which the bytes of the 32-bit word are stored isn't always what you would expect from a theoretical discussion of binary representation, i.e.:
byte 0 contains bits 31-24
byte 1 contains bits 23-16
byte 2 contains bits 15-8
byte 3 contains bits 7-0
actually, AFAIK, the C language permits any of the 24 possible orderings of those 4 bytes (this is called endianness). Meanwhile, shifting and masking will always get you the n-th "logical" byte.
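For comparison, a minimal sketch of the portable shift-and-mask way to get the n-th logical byte, which works the same regardless of endianness (logical_byte is just an illustrative helper name):
#include <stdint.h>
#include <stdio.h>

/* extract logical byte n (0 = least significant) from a 32-bit value */
static uint8_t logical_byte(uint32_t v, unsigned n)
{
    return (uint8_t)((v >> (8 * n)) & 0xFF);
}

int main(void)
{
    uint32_t value = 32700;
    for (unsigned n = 0; n < 4; n++)
        printf("%u\n", logical_byte(value, n));  /* prints 188, 127, 0, 0 */
    return 0;
}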
It really depends on how your architecture stores an int. For example
8 or 16 bit system short=16, int=16, long=32
32 bit system, short=16, int=32, long=32
64 bit system, short=16, int=32, long=64
This is not a hard and fast rule; you need to check your architecture first. There is also long long, but some older compilers do not recognize it, and its size varies according to the architecture.
Some compilers have uint8_t etc defined so you can actually specify how many bits your number is instead of worrying about ints and longs.
Having said that, since you wish to convert a number into four 8-bit values, you could have something like:
unsigned long x = 600000UL; // you need UL to indicate it is unsigned long
unsigned int b1 = (unsigned int)(x & 0xff);
unsigned int b2 = (unsigned int)(x >> 8) & 0xff;
unsigned int b3 = (unsigned int)(x >> 16) & 0xff;
unsigned int b4 = (unsigned int)(x >> 24);
Using shifts is a lot faster than multiplication, division or mod. The byte order depends on the endianness you wish to achieve; you could reverse the assignments, using b1 with the formula for b4, and so on.
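A quick check of the above with the value from the question (a sketch; the printing is only for illustration):
#include <stdio.h>

int main(void)
{
    unsigned long x = 32700UL;
    unsigned int b1 = (unsigned int)(x & 0xff);         /* 188 */
    unsigned int b2 = (unsigned int)(x >> 8) & 0xff;    /* 127 */
    unsigned int b3 = (unsigned int)(x >> 16) & 0xff;   /* 0   */
    unsigned int b4 = (unsigned int)(x >> 24) & 0xff;   /* 0   */

    printf("%u %u %u %u\n", b1, b2, b3, b4);            /* prints 188 127 0 0 */
    return 0;
}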
You could do some bit masking.
600000 is 0x927C0
600000 / (256 * 256) gets you the 9, no masking yet.
(600000 & (255 * 256)) >> 8 gets you the 0x27 == 39. This uses an 8-bit-shifted mask of 8 set bits (255 * 256) and a right shift by 8 bits, the >> 8; the same byte could also be obtained as (600000 / 256) & 255.
600000 % 256 gets you the 0xC0 == 192 as you did it. Masking would be 600000 & 255.
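The same arithmetic written out as a compilable sketch, for illustration:
#include <stdio.h>

int main(void)
{
    unsigned long x = 600000UL;                      /* 0x000927C0  */
    unsigned long b2 = x / (256UL * 256UL);          /* 9           */
    unsigned long b1 = (x & (255UL * 256UL)) >> 8;   /* 0x27 == 39  */
    unsigned long b0 = x % 256UL;                    /* 0xC0 == 192 */

    printf("%lu %lu %lu\n", b2, b1, b0);             /* prints 9 39 192 */
    return 0;
}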
I ended up doing this:
unsigned char bytes[4];
unsigned long n;
n = (unsigned long) sensore1 * 100;
bytes[0] = n & 0xFF;
bytes[1] = (n >> 8) & 0xFF;
bytes[2] = (n >> 16) & 0xFF;
bytes[3] = (n >> 24) & 0xFF;
CAN_WRITE(0x7FD,8,01,sizeof(n),bytes[0],bytes[1],bytes[2],bytes[3],07,255);
I have been in a similar kind of situation while packing and unpacking huge custom packets of data to be transmitted/received. I suggest you try the approach below:
typedef union
{
    uint32_t u4_input;
    uint8_t  u1_byte_arr[4];
} UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;

UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;
un_t_mode_reg.u4_input = input; /* your 32 bit input */
// 1st byte = un_t_mode_reg.u1_byte_arr[0];
// 2nd byte = un_t_mode_reg.u1_byte_arr[1];
// 3rd byte = un_t_mode_reg.u1_byte_arr[2];
// 4th byte = un_t_mode_reg.u1_byte_arr[3];
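A self-contained sketch of this union trick (note that the order of the bytes seen in u1_byte_arr depends on the machine's endianness, which is exactly why the shift-and-mask answers are the portable ones):
#include <stdint.h>
#include <stdio.h>

typedef union
{
    uint32_t u4_input;
    uint8_t  u1_byte_arr[4];
} UN_COMMON_32BIT_TO_4X8BIT_CONVERTER;

int main(void)
{
    UN_COMMON_32BIT_TO_4X8BIT_CONVERTER un_t_mode_reg;

    un_t_mode_reg.u4_input = 32700;   /* your 32-bit input */
    for (int i = 0; i < 4; i++)
        printf("%u\n", (unsigned)un_t_mode_reg.u1_byte_arr[i]);  /* 188 127 0 0 on a little-endian machine */
    return 0;
}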
The largest positive value you can store in a 16-bit signed int is 32767. If you force a number bigger than that, you'll get a negative number as a result, hence unexpected values returned by % and /.
Use either unsigned 16-bit int for a range up to 65535 or a 32-bit integer type.

Bitshifting vs array indexing, which is more appropriate for usart interfaces on 32bit MCUs

I have an embedded project with a USART HAL. This USART can only transmit or receive 8 or 16 bits at a time (depending on the USART register I choose, i.e. single/double in/out). Since it's a 32-bit MCU, I figured I might as well pass around 32-bit fields, as (from what I have been led to understand) this is a more efficient use of bits for the MPU. The same would apply for a 64-bit MPU, i.e. pass around 64-bit integers. Perhaps that is misguided advice, or advice taken out of context.
With that in mind, I have packed the 8 bits into a 32-bit field via bit-shifting. I do this for both tx and rx on the usart.
The code for the 8-bit only register is as follows (the 16-bit register just needs half as many rounds of bit-shifting):
int zg_usartTxdataWrite(USART_data*      MPI_buffer,
                        USART_frameconf* MPI_config,
                        USART_error*     MPI_error)
{
    MPI_error = NULL;

    if (MPI_config != NULL) {
        zg_usartFrameConfWrite(MPI_config);
    }

    HPI_usart_data.txdata = MPI_buffer->txdata;

    for (int i = 0; i < USART_TXDATA_LOOP; i++) {
        if ((USART_STATUS_TXC & usart->STATUS) > 0) {
            usart->TXDATAX = (i == 0 ? (HPI_usart_data.txdata & USART_TXDATA_DATABITS)
                                     : (HPI_usart_data.txdata >> SINGLE_BYTE_SHIFT) & USART_TXDATA_DATABITS);
        }
        usart->IFC |= USART_STATUS_TXC;
    }

    return 0;
}
EDIT: RE-ENTERING LOGIC OF ABOVE CODE WITH ADDED DEFINES FOR CLARITY OF TERNARY OPERATOR IMPLICIT PROMOTION PROBLEM DISCUSSED IN COMMENTS SECTION
(the HPI_usart and USART_data structs are the same just different levels, I have since removed the HPI_usart layer, but for the sake of this example I will leave it in)
#define USART_TXDATA_LOOP  4
#define SINGLE_BYTE_SHIFT  8

typedef struct HPI_USART_DATA {
    ...
    uint32_t txdata;
    ...
} HPI_usart;

HPI_usart HPI_usart_data = {'\0'};

const uint8_t USART_TXDATA_DATABITS = 0xFF;

int zg_usartTxdataWrite(USART_data*      MPI_buffer,
                        USART_frameconf* MPI_config,
                        USART_error*     MPI_error)
{
    MPI_error = NULL;

    if (MPI_config != NULL) {
        zg_usartFrameConfWrite(MPI_config);
    }

    HPI_usart_data.txdata = MPI_buffer->txdata;

    for (int i = 0; i < USART_TXDATA_LOOP; i++) {
        if ((USART_STATUS_TXC & usart->STATUS) > 0) {
            usart->TXDATAX = (i == 0 ? (HPI_usart_data.txdata & USART_TXDATA_DATABITS)
                                     : (HPI_usart_data.txdata >> SINGLE_BYTE_SHIFT) & USART_TXDATA_DATABITS);
        }
        usart->IFC |= USART_STATUS_TXC;
    }

    return 0;
}
However, I now realize that this is potentially causing more issues than it solves, because I am essentially encoding these bits internally only to have to decode them almost immediately when they are passed to/from the different data layers. I feel like it's a clever and sexy solution, but I'm now trying to solve a problem that I shouldn't have created in the first place, like how to extract variable bit fields when there is an offset, i.e. in GPS NMEA sentences where the first 8 bits might be one relevant field and then the rest are 32-bit fields. So it ends up being like this:
32-bit array member 0:
|         bits 24-31         |         bits 16-23         |          bits 8-15         |          bits 0-7          |
|        8-bit Value         | 32-bit Value A, bits 24-31 | 32-bit Value A, bits 16-23 | 32-bit Value A, bits 8-15  |
32-bit array member 1:
|         bits 24-31         |         bits 16-23         |          bits 8-15         |          bits 0-7          |
|  32-bit Value A, bits 0-7  | 32-bit Value B, bits 24-31 | 32-bit Value B, bits 16-23 | 32-bit Value B, bits 8-15  |
32-bit array member 2:
|         bits 24-31         |         bits 16-23         |          bits 8-15         |          bits 0-7          |
|  32-bit Value B, bits 0-7  |           etc...           |            ....            |            ....            |
The above example requires manual decoding, which is fine I guess, but it's different for every NMEA sentence and just feels more manual than programmatic.
My question is this: bitshifting vs array indexing, which is more appropriate?
Should I just have assigned each incoming/outgoing value to a 32-bit array member and then just index that way? I feel like that is the solution since it would not only make it easier to traverse the data on other layers, but I would be able to eliminate all this bit-shifting logic and then the only difference between an rx or tx function would be the direction the data is going.
It does mean a small rewrite of the interface and the resulting gps module layer, but that feels like less work and also a cheap lesson early on in my project.
Also any thoughts and general experience on this would be great.
Since it's a 32-bit MCU, I figured I might as well pass around 32-bit fields
That's not really the programmer's call to make. Put the 8 or 16 bit variable in a struct. Let the compiler add padding if needed. Alternatively you can use uint_fast8_t and uint_fast16_t.
My question is this: bitshifting vs array indexing, which is more appropriate?
Array indexing is for accessing arrays. If you have an array, use it. If not, then don't.
While it is possible to chew through larger chunks of data byte by byte, such code must be written much more carefully, to prevent running into various subtle type conversion and pointer aliasing bugs.
In general, bit shifting is preferred when accessing data up to the CPU's word size, 32 bits in this case. It is fast and also portable, so that you don't have to take endianness into account. It is the preferred method of serialization/de-serialization of integers.
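As an illustration, a minimal sketch of such shift-based serialization/de-serialization, not tied to any particular HAL (the function names are just illustrative):
#include <assert.h>
#include <stdint.h>

/* serialize a 32-bit value into 4 bytes, least significant byte first */
static void u32_to_bytes(uint32_t v, uint8_t out[4])
{
    out[0] = (uint8_t)(v & 0xFF);
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)((v >> 16) & 0xFF);
    out[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* de-serialize 4 bytes (least significant first) back into a 32-bit value */
static uint32_t bytes_to_u32(const uint8_t in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}

int main(void)
{
    uint8_t buf[4];

    u32_to_bytes(0xDEADBEEFu, buf);
    assert(bytes_to_u32(buf) == 0xDEADBEEFu);  /* round-trips regardless of host endianness */
    return 0;
}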

Reading two 8 bit registers into 12 bit value of an ADXL362 in C

I'm querying an ADXL362 Digital Output MEMS Accelerometer for its axis data which it holds as two 8 bit registers which combine to give a 12 bit value and I'm trying to figure out how to combine those values. I've never been good at bitwise manipulation so any help would be greatly appreciated. I would imagine it is something like this:
number = Z_data_H << 8 | Z_data_L;
number = (number & ~(1<<13)) | (0<<13);
number = (number & ~(1<<14)) | (0<<14);
number = (number & ~(1<<15)) | (0<<15);
number = (number & ~(1<<16)) | (0<<16);
ADXL362 data sheet (page 26)
Z axis data register
Your first line should be what you need:
int16_t number;
number = (Z_data_H << 8) | Z_data_L;
The sign-extension bits mean that you can read the value as if it was a 16-bit signed integer. The value will simply never be outside the range of a 12-bit integer. It's important that you leave those bits intact in order to handle negative values correctly.
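A small self-contained sketch of that combination (combine_axis is an illustrative helper and the example bytes are made up; the uint16_t to int16_t conversion is implementation-defined on paper but wraps as expected on two's-complement machines):
#include <stdint.h>
#include <stdio.h>

/* combine the two ADXL362 data registers into one signed value;
   the high register already carries the sign-extension bits */
static int16_t combine_axis(uint8_t data_h, uint8_t data_l)
{
    return (int16_t)(((uint16_t)data_h << 8) | data_l);
}

int main(void)
{
    /* example: 0xFF 0x38 is the 12-bit two's-complement value -200 */
    printf("%d\n", combine_axis(0xFF, 0x38));  /* prints -200 */
    return 0;
}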
You just have to do:
signed short number;
number = Z_data_H << 8 | Z_data_L;
The left shift by 8 bits, combined with the OR of the low byte that you had already figured out, combines the 2 bytes correctly. Just use an appropriate data type so that the C code recognizes the sign of the 12-bit number correctly.
Note that short does not necessarily refer to a 16-bit value; that depends on your compiler and architecture, so you might want to watch out for that.

How to initialize the bits in a register using C in a readable manner

I have a 24 bit register that comprises a number of fields. For example, the 3 upper bits are "mode", the bottom 10 bits are "data rate divisor", etc. Now, I can just work out what has to go into these 24 bits and code it as a single hex number 0xNNNNNN. However, that is fairly unreadable to anyone trying to maintain it.
The question is, if I define each subfield separately what's the best way of coding it all together?
The classic way is to use the << left shift operator on constant values and combine all values with either + or |. For example:
*register_address = (SYNC_MODE << 21) | ... | DEFAULT_RATE;
Solution 1
The "standard" approach for this problem is to use a struct with bitfield members. Something like this:
typedef struct {
int divisor: 10;
unsigned int field1: 9;
char field2: 2;
unsigned char mode: 3;
} fields;
The numbers after each field name specify the number of bits used by that member. In the example above, field divisor uses 10 bits and can store values between -512 and 511 (signed integer) while mode can store unsigned values on 3 bits: between 0 and 7.
The range of values for each field follows the usual signed/unsigned rules, but the width is limited to the specified number of bits rather than the full width of the declared type (char/int/long). Of course, a char can still hold up to 8 bits, a short up to 16, and so on. The coercion rules are the usual ones for the types of the fields, taking their size into account (e.g. storing -5 in mode will convert it to unsigned, and the actual value will probably be 3).
There are several issues you need to pay attention to (some of them are also mentioned in the Notes section of the documentation page about bit fields):
the total amount of bits declared in the structure must be 24 (the size of your register);
because your structure uses 3 bytes, it's possible for some positions in arrays of such structures to behave strangely, because they span the allocation unit size (which is usually 4 or 8 bytes, depending on the hardware);
the order of the bit fields within the allocation unit is not guaranteed by the standard; depending on the architecture, the field mode may end up in either the most significant 3 bits or the least significant 3 bits of the final 3-byte pack; you can sort this out easily, though.
You probably need to handle the values you store in a fields structure all at once. For that you can embed the structure in a union:
typedef union {
    fields f;
    unsigned int a;
} reg;
reg x;
/* Access individual fields */
x.f.mode = 2;
x.f.divisor = 42;
/* Get the entire register */
printf("%06X\n", x.a);
Solution 2
An alternative way to do (kind of) the same thing is to use macros to extract the fields and to compose the entire register:
#define MAKE_REG(mode, field2, field1, divisor) \
    ((((mode)   & 0x07)   << 21) | \
     (((field2) & 0x03)   << 19) | \
     (((field1) & 0x01FF) << 10) | \
     ((divisor) & 0x03FF))
#define GET_MODE(reg) (((reg) & 0xE00000) >> 21)
#define GET_FIELD2(reg) (((reg) & 0x180000) >> 19)
#define GET_FIELD1(reg) (((reg) & 0x07FC00) >> 10)
#define GET_DIVISOR(reg) ((reg) & 0x0003FF)
The first macro assembles the mode, field2, field1 and divisor values into a 3-byte integer. The other set of macros extracts the values of the individual fields. All of them assume the processed numbers are unsigned.
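A quick usage sketch, assuming the MAKE_REG/GET_* macros above are in scope (the field values are made-up examples):
#include <stdio.h>

int main(void)
{
    /* a mode of 2 and a divisor of 42 are arbitrary example values */
    unsigned int reg = MAKE_REG(2, 0, 0, 42);

    printf("reg     = 0x%06X\n", reg);           /* 0x40002A */
    printf("mode    = %u\n", GET_MODE(reg));     /* 2  */
    printf("divisor = %u\n", GET_DIVISOR(reg));  /* 42 */
    return 0;
}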
Pros and cons
The struct (embedded in a union) solution:
(+) lets the compiler check the values you put into the fields (and issue warnings), and performs the correct conversions between signed and unsigned.
The macro solution:
(+) is not sensitive to memory alignment issues; you put the bits exactly where you want them;
(-) does not check the range of the values you put into the fields;
(-) makes handling signed values a little trickier; the macros suggested here work only for unsigned values, and more shifting is required in order to handle signed ones.

Need help understanding bitmaps, bitwise operations, and C

Disclaimer: I am asking these questions in relation to an assignment. The assignment itself calls for implementing a bitmap and doing some operations with that, but that is not what I am asking about. I just want to understand the concepts so I can try the implementation for myself.
I need help understanding bitmaps/bit arrays and bitwise operations. I understand the basics of binary and how left/right shift work, but I don't know exactly how that use is beneficial.
Basically, I need to implement a bitmap to store the results of a prime sieve (of Eratosthenes.) This is a small part of a larger assignment focused on different IPC methods, but to get to that part I need to get the sieve completed first. I've never had to use bitwise operations nor have I ever learned about bitmaps, so I'm kind of on my own to learn this.
From what I can tell, bitmaps are arrays of a bit of a certain size, right? By that I mean you could have an 8-bit array or a 32-bit array (in my case, I need to find the primes for a 32-bit unsigned int, so I'd need the 32-bit array.) So if this is an array of bits, 32 of them to be specific, then we're basically talking about a string of 32 1s and 0s. How does this translate into a list of primes? I figure that one method would evaluate the binary number and save it to a new array as decimal, so all the decimal primes exist in one array, but that seems like you're using too much data.
Do I have the gist of bitmaps? Or is there something I'm missing? I've tried reading about this around the internet but I can't find a source that makes it clear enough for me...
Suppose you have a list of primes: {3, 5, 7}. You can store these numbers as a character array: char c[] = {3, 5, 7} and this requires 3 bytes.
Instead, let's use a single byte such that each set bit indicates that the number is in the set, for example 01010100. If we can set the bit we want and later test it, we can use this to store the same information in a single byte. To set it:
char b = 0;
// want to set `3` so shift 1 twice to the left
b = b | (1 << 2);
// also set `5`
b = b | (1 << 4);
// and 7
b = b | (1 << 6);
And to test these numbers:
// is 3 in the map?
if (b & (1 << 2)) {
    // it is in...
}
You are going to need a lot more than 32 bits.
You want a sieve for up to 2^32 numbers, so you will need a bit for each one of those. Each bit will represent one number, and will be 0 if the number is prime and 1 if it is composite. (You could save a couple of bits by noting that the sieve really only needs to start at 2, since 0 and 1 are neither prime nor composite, but it is easier to just waste those bits.)
2^32 = 4,294,967,296
Divide by 8
536,870,912 bytes, or 1/2 GB.
So you will want an array of 2^29 bytes, or 2^27 4-byte words, or whatever you decide is best, and also a method for manipulating the individual bits stored in the chars (ints) in the array.
It sounds like eventually you are going to have several threads or processes operating on this shared memory. You may need to store it all in a file if you can't allocate all that memory to yourself.
Say you want to find the bit for x. Then let a = x / 8 and b = x - 8 * a. Then the bit is at arr[a] & (1 << b). (Avoid the modulus operator % wherever possible.)
//mark composite
a = x / 8;
b = x - 8 * a;
arr[a] |= 1 << b;
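Putting those pieces together, a minimal sketch of the bit helpers such a sieve might use, sized here for a small limit so it runs as-is (the full 2^32 sieve just needs the ~512 MB allocation discussed above):
#include <stdio.h>
#include <stdlib.h>

#define LIMIT 100u   /* small limit for illustration; 2^32 needs ~512 MB */

static void mark_composite(unsigned char *arr, unsigned x)
{
    arr[x >> 3] |= (unsigned char)(1u << (x & 7));
}

static int is_composite(const unsigned char *arr, unsigned x)
{
    return (arr[x >> 3] >> (x & 7)) & 1;
}

int main(void)
{
    unsigned char *arr = calloc((LIMIT + 7) / 8, 1);
    if (arr == NULL)
        return 1;

    for (unsigned i = 2; i * i < LIMIT; i++)
        if (!is_composite(arr, i))
            for (unsigned j = i * i; j < LIMIT; j += i)
                mark_composite(arr, j);

    for (unsigned i = 2; i < LIMIT; i++)
        if (!is_composite(arr, i))
            printf("%u ", i);    /* prints the primes below LIMIT */
    putchar('\n');

    free(arr);
    return 0;
}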
This sounds like a fun assignment!
A bitmap allows you to construct a large predicate function over the range of numbers you're interested in. If you just have a single 8-bit char, you can store Boolean values for each of the eight values. If you have 2 chars, it doubles your range.
So, say you have a bitmap that already has this information stored, your test function could look something like this:
bool num_in_bitmap (int num, char *bitmap, size_t sz) {
    if (num/8 >= sz) return 0;
    return (bitmap[num/8] >> (num%8)) & 1;
}
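A quick usage sketch of that predicate, with the function repeated verbatim and a one-byte bitmap whose bits 3 and 5 are set:
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

bool num_in_bitmap (int num, char *bitmap, size_t sz) {
    if (num/8 >= sz) return 0;
    return (bitmap[num/8] >> (num%8)) & 1;
}

int main(void)
{
    /* one-byte bitmap with bits 3 and 5 set */
    char bitmap[1] = { (1 << 3) | (1 << 5) };

    printf("%d\n", num_in_bitmap(5, bitmap, sizeof bitmap));  /* 1 */
    printf("%d\n", num_in_bitmap(4, bitmap, sizeof bitmap));  /* 0 */
    return 0;
}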
