Enum and strings in C

I have a char* string coming in. I need to store it accordingly.
The string can be any of these values: { UK, GD, BD, ER, WR, FL }.
If I want to keep them as an enumerated type, which data type is best to use? For 6 values, three bits would be enough, but how do I store three bits in C?

What you want is a Bit Field:
typedef struct {
    unsigned char val : 3; // use 3 bits
    unsigned char     : 5; // remaining 5 bits, unnamed padding
} valContainer;
...
valContainer x;
x.val = GD;
Do note that there isn't really a way to store less than one byte, as the definition of a byte is the smallest amount of memory the computer can address. This is just a method of having names associated with different bits in a byte.
Also, of course, 2 bits would not be enough for 6 values (2 bits hold only 4 distinct values), which is why val above is declared with 3 bits (8 distinct values).
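For the string-to-enum part of the question, a minimal sketch might look like this (the enum, the names table, and parse_val are just illustrative assumptions, not a fixed API):
#include <string.h>

typedef enum { UK, GD, BD, ER, WR, FL, VAL_UNKNOWN } val_t;

/* map the incoming char* to the corresponding enumerator */
val_t parse_val(const char *s)
{
    static const char *names[] = { "UK", "GD", "BD", "ER", "WR", "FL" };
    for (int i = 0; i < 6; i++) {
        if (strcmp(s, names[i]) == 0)
            return (val_t)i;
    }
    return VAL_UNKNOWN; /* no match */
}
With that in place, x.val = parse_val(incoming_string); stores the value in the bit field above.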

Just store them as an unsigned short. Unless you're storing other things in your struct to fill out a whole word, you're WAY prematurely optimizing. The compiler will have to pad out your data anyway.

As the answer by Eric Finn suggests, you can use bit fields to store a data element of 3 bits. However, this is only good if you have something else to store in the same byte.
struct {
    unsigned char value       : 3;
    unsigned char another     : 4;
    unsigned char yet_another : 5;
    // 12 bits declared so far; 4 more "padding" bits are unusable
} whatever;
If you want to store an array of many such small elements, you have to do it in a different way, for example, clumping 10 elements in each 32-bit word.
int n = ...; // number of elements to store
uint32_t *data = calloc((n + 9) / 10, sizeof(*data)); // round up to a whole number of words
for (int i = 0; i < n; i++)
{
    int value = read_string_and_convert_to_int();
    data[i / 10] &= ~(7 << (i % 10 * 3));  // clear the 3-bit slot for element i
    data[i / 10] |= value << (i % 10 * 3); // store the new value
}
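Reading an element back out of the packed array is the same arithmetic in reverse (a sketch, assuming the same 10-elements-per-uint32_t layout):
#include <stdint.h>

/* extract the 3-bit element stored at index i */
int get_element(const uint32_t *data, int i)
{
    return (data[i / 10] >> (i % 10 * 3)) & 7;
}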
If you want to have only one element (or a few), just use enum or int.

Related

What are bit vectors and how do I use them to convert chars to ints?

Here's the explanation for our task when implementing a set data structure in C "The set is constructed as a Bit vector, which in turn is implemented as an array of the data type char."
My confusion arises from the fact that almost all the functions we're given take in a set and an int, as shown in the function below, yet our array is made up of chars. How would I call functions that only take ints when I have an array of chars? Here's my attempt at calling the function in my main function, as well as the struct and an example of a function used.
int main() {
    set *setA = set_empty();
    set_insert("green", setA);
}
struct set {
    int capacity;
    int size;
    char *array;
};

void set_insert(const int value, set *s)
{
    if (!set_member_of(value, s)) {
        int bit_in_array = value; // To make the code easier to read

        // Increase the capacity if necessary
        if (bit_in_array >= s->capacity) {
            int no_of_bytes = bit_in_array / 8 + 1;
            s->array = realloc(s->array, no_of_bytes);
            for (int i = s->capacity / 8; i < no_of_bytes; i++) {
                s->array[i] = 0;
            }
            s->capacity = no_of_bytes * 8;
        }

        // Set the bit
        int byte_no = bit_in_array / 8;
        int bit = 7 - bit_in_array % 8;
        s->array[byte_no] = s->array[byte_no] | 1 << bit;
        s->size++;
    }
}
TL;DR: The types of the index (value in your case) and the indexed element of an array (array in your case) are independent from each other. There is no conversion.
Most digital systems these days store their values in bits, each of which can hold only 0 or 1.
An integer value can therefore be viewed as a binary number, a value to the base of 2. It is a sequence of bits, each of which assigned a power of two. See the Wikipedia page on two's complement for details. But this aspect is not relevant for your issue.
Relevant is the view that an integer value is a sequence of bits. The simplest integer type in C is the char. It commonly holds 8 bits. We can assign indexes to these bits, and therefore think of them as a "vector", mathematically. Some people start to count "from the left", others "from the right". Other common terms in this area are "MSB" and "LSB"; see this Wikipedia page for more.
To access an element of a vector, you use its index. For a common char this is a value between 0 and 7, inclusive. Remember, in CS we start counting from zero. The type of this index can be any integer wide enough to hold the value, for example an int. This is why you use an int in your case. This data type is independent from the type of the elements in the vector.
How to solve the problem, if you need more than 8 bits? Well, then you can use more chars. This is the reason why your structure holds (a pointer to) an array of chars. All n chars of the array represent a vector of n * 8 bits, and you call this amount the "capacity".
Another option is to use a wider type, like a long or even a long long. And you can build an array of elements of these types, too. However, the widths of such types are commonly not equal in all systems.
BTW, the mathematical "vector" is the same thing as an "array" in CS. Different science areas, different terms.
Now, what is a "set"? I hope your script explains that a bit better than I can... It is a collection that contains an element only once or not at all. All elements are distinct. In your case the elements are represented by (small) integers.
Given a vector of bits of arbitrary capacity, we can "map" an element of a set on a bit of this vector by its index. This is done by storing a 1 in the mapped bit, if the element is present in the set, or 0, if it is not.
To access the correct bit, we need the index of the single char in the array, and the index of the bit in this char. You calculate these values in the lines:
int byte_no = bit_in_array / 8;
int bit = 7 - bit_in_array % 8;
All variables are of type int, most probably because this is the common type. It can be any other integer type, like a size_t for example, as long as it can hold the necessary values, even different types for the different variables.
With these two values at hand, you can "insert" the element into the set. For this action, you set the respective bit to 1:
s->array[byte_no] = s->array[byte_no] | 1 << bit;
Please note that the shift operator << has a higher precedence than the bit-wise OR operator |. Some coding style rules request to use parentheses to make this clear, but you can also use this even clearer assignment:
s->array[byte_no] |= 1 << bit;
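For completeness, a set_member_of that matches the bit numbering used in set_insert could look like the following sketch (the version provided with your course code may differ in details):
#include <stdbool.h>

bool set_member_of(const int value, const set *s)
{
    if (value < 0 || value >= s->capacity) {
        return false; // outside the bits we have allocated
    }
    int byte_no = value / 8;
    int bit = 7 - value % 8;
    return (s->array[byte_no] >> bit) & 1;
}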

using bit-fields as representation for integers in c [duplicate]

In my C class we were given an assignment:
Write an interactive program (standard input/output). Define the new type set using typedef which can hold a set of integers in the range 0-127. The data structure has to be as efficient as possible in terms of storage (hint: working with bits). Also you need to define 6 global variables A,B,C,D,E,F of type set. All operations on sets in the program will be on these 6 variables.
The command read_set A,5,6,7,4,5,4,-1 will read the user's input of integers, where -1 marks the end of the input. Other commands a user can use: print_set A - prints the set in increasing order; union_set A,B,C - takes the union of two sets and saves the output in a third set; intersect_set A,B,C - determines the intersection of two sets and saves the output to a third set.
As far as I understand I need to use bit-fields. I could create a table of integers from 0-127. Then I could create the 6 variables A,B,C,D,E,F using the set type definition, giving 128 bit-fields to each variable. Then if a user inputs 15, I would turn on the bit which represents 15 in the data type. I'm really not sure if this is the way, because it's not clear to me how I would arrange bit-fields such that I can turn on exactly the 15th bit when I need to; I would somehow need to convert an integer to a bit-field name... Also, print_set prints the set in increasing order, so how could I re-arrange bit-fields for this?
Really hope you have some ideas.
Yes, each of the sets called A, B, C, D, E and F is represented by a couple of unsigned long long integers like this:
typedef struct {
    unsigned long long high;
    unsigned long long low;
} Set;
See https://en.wikipedia.org/wiki/C_data_types
This gives you 128 bits of data in a Set (64 bits for the high numbers 64 to 127, and 64 bits for the low numbers 0 to 63).
Then you just need to do some bit manipulation like this: http://www.tutorialspoint.com/ansi_c/c_bits_manipulation.htm
For a number between 0 and 63, you'd shift 1 to the left x times and then set that bit on the "low" field.
For a number between 64 and 127, you'd shift 1 to the left x-64 times and then set that bit on the "high" field.
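In code, that might look like the following sketch (the function names set_add and set_contains are made up for the example):
#include <stdbool.h>

typedef struct {
    unsigned long long high; /* bits for values 64..127 */
    unsigned long long low;  /* bits for values 0..63 */
} Set;                       /* same layout as shown above */

void set_add(Set *s, int x)
{
    if (x < 64)
        s->low |= 1ULL << x;
    else
        s->high |= 1ULL << (x - 64);
}

bool set_contains(const Set *s, int x)
{
    if (x < 64)
        return (s->low >> x) & 1;
    return (s->high >> (x - 64)) & 1;
}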
Hope this helps!
Using bit-fields for this assignment will prove very cumbersome because of alignment issues, and you cannot define arrays of bit-fields anyway. I would suggest using an array of bytes (unsigned char) and packing values into this array; a 7-bit value spans at most 2 bytes.
The array for count values should be allocated with a size of (count * 7 + 7) / 8 bytes. In order to conserve space, you can store small sets in an integer and larger sets in an allocated array.
The datatype would look like:
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct set {
    size_t count;
    union {
        uintptr_t v;       /* small sets: values packed into this integer */
        unsigned char *a;  /* large sets: values packed into an allocated byte array */
    };
} set;
Here is how to extract the n-th value:
int get_7bits(const set *s, size_t n) {
    if (s == NULL || n >= s->count) {
        return -1;
    } else if (s->count <= sizeof(uintptr_t) * CHAR_BIT / 7) {
        /* small set: all values packed into the integer member */
        return (s->v >> (n * 7)) & 127;
    } else {
        /* large set: values packed into the allocated byte array */
        size_t i = n * 7 / CHAR_BIT;
        int shift = n * 7 % CHAR_BIT;
        if (shift <= CHAR_BIT - 7) {
            /* value fits in one byte */
            return (s->a[i] >> shift) & 127;
        } else {
            /* value spans 2 bytes */
            return ((s->a[i] | (s->a[i + 1] << CHAR_BIT)) >> shift) & 127;
        }
    }
}
You can write the other access functions and complete your assignment.
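As a starting point, a matching setter following the same packing layout might look like this (a sketch; growing the array when count changes still has to be handled elsewhere):
/* store a 7-bit value at position n, mirroring get_7bits */
int set_7bits(set *s, size_t n, unsigned value)
{
    if (s == NULL || n >= s->count) {
        return -1;
    }
    value &= 127;
    if (s->count <= sizeof(uintptr_t) * CHAR_BIT / 7) {
        /* small set: clear the slot in the integer, then write the new value */
        s->v &= ~((uintptr_t)127 << (n * 7));
        s->v |= (uintptr_t)value << (n * 7);
    } else {
        size_t i = n * 7 / CHAR_BIT;
        int shift = n * 7 % CHAR_BIT;
        if (shift <= CHAR_BIT - 7) {
            /* value fits in one byte */
            s->a[i] = (unsigned char)((s->a[i] & ~(127u << shift)) | (value << shift));
        } else {
            /* value spans 2 bytes: rebuild the 2-byte window */
            unsigned two = s->a[i] | (s->a[i + 1] << CHAR_BIT);
            two = (two & ~(127u << shift)) | (value << shift);
            s->a[i] = (unsigned char)(two & ((1u << CHAR_BIT) - 1));
            s->a[i + 1] = (unsigned char)(two >> CHAR_BIT);
        }
    }
    return 0;
}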

C variable smaller than 8-bit

I'm writing a C implementation of Conway's Game of Life and am pretty much done with the code, but I'm wondering what is the most efficient way to store the net in the program.
The net is two dimensional and stores whether cell (x, y) is alive (1) or dead (0). Currently I'm doing it with unsigned char like that:
struct:
typedef struct {
int rows;
int cols;
unsigned char *vec;
} net_t;
allocation:
n->vec = calloc( n->rows * n->cols, sizeof(unsigned char) );
filling:
i = ( n->cols * (x - 1) ) + (y - 1);
n->vec[i] = 1;
searching:
if( n->vec[i] == 1 )
but I don't really need 0-255 values - I only need 0 or 1 - so I feel that doing it like that is a waste of space; as far as I know, the 8-bit char is the smallest type in C.
Is there any way to do it better?
Thanks!
The smallest declarable / addressable unit of memory you can address/use is a single byte, implemented as unsigned char in your case.
If you want to really save on space, you could make use of masking off individual bits in a character, or using bit fields via a union. The trade-off will be that your code will execute a bit slower, and will certainly be more complicated.
#include <stdio.h>

union both {
    struct {
        unsigned char b0 : 1;
        unsigned char b1 : 1;
        unsigned char b2 : 1;
        unsigned char b3 : 1;
        unsigned char b4 : 1;
        unsigned char b5 : 1;
        unsigned char b6 : 1;
        unsigned char b7 : 1;
    } bits;
    unsigned char byte;
};

int main()
{
    union both var;

    var.byte = 0xAA;
    if (var.bits.b0) {
        printf("Yes\n");
    } else {
        printf("No\n");
    }
    return 0;
}
References
Union and Bit Fields, Accessed 2014-04-07, <http://www.rightcorner.com/code/CPP/Basic/union/sample.php>
Access Bits in a Char in C, Accessed 2014-04-07, <https://stackoverflow.com/questions/8584577/access-bits-in-a-char-in-c>
Struct - Bit Field, Accessed 2014-04-07, <http://cboard.cprogramming.com/c-programming/10029-struct-bit-fields.html>
Unless you're working on an embedded platform, I wouldn't be too concerned about the size your net takes up by using an unsigned char to store only a 1 or 0.
To address your specific question: char is the smallest of the C data types. char, signed char, and unsigned char are all only going to take up 1 byte each.
If you want to make your code smaller you can use bit fields to decrease the amount of space you take up, but that will increase the complexity of your code.
For a simple exercise like this, I'd be more concerned about readability than size. One way you can make it more obvious what you're doing is switch to a bool instead of a char.
#include <stdbool.h>
typedef struct {
int rows;
int cols;
bool *vec;
} net_t;
You can then use true and false which, IMO, will make your code much easier to read and understand when all you need is 1 and 0.
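For illustration, the allocate/fill/search steps from the question translate directly; here is a minimal sketch using the net_t with bool declared above:
#include <stdbool.h>
#include <stdlib.h>

int main(void)
{
    net_t n = { .rows = 10, .cols = 10, .vec = NULL };

    n.vec = calloc((size_t)n.rows * n.cols, sizeof(bool));
    if (n.vec == NULL)
        return 1;

    int x = 3, y = 4;                     // 1-based coordinates, as in the question
    int i = n.cols * (x - 1) + (y - 1);
    n.vec[i] = true;                      // cell (x, y) is alive

    if (n.vec[i]) {
        // cell is alive
    }

    free(n.vec);
    return 0;
}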
It will take up at least as much space as the way you're doing it now, but like I said, consider what's really important in the program you're writing for the platform you're writing it for... it's probably not the size.
The smallest types in C, as far as I know, are char (-128 to 127), signed char (-128 to 127), and unsigned char (0 to 255). All of them take a whole byte, so if you are storing multiple single-bit values in different variables, you can instead use one unsigned char as a group of bits.
unsigned char lives = 128;
At this moment, lives has the decimal value 128, which is 10000000 in binary, so now you can use a bitwise operator to get a single bit from this variable (like an array of bits):
if ((lives >> 7) == 1) {
    // This code will run if the 8th bit from the right (decimal 128) is set
}
It's a little complex, but in the end you get a bit array: instead of using multiple variables to store single TRUE/FALSE values, you can use a single unsigned char variable to store 8 TRUE/FALSE values.
Note: I have been out of the C/C++ world for a while, so double-check the details, but lives >> 7 is the right shift for the example above.
You're correct that a char is the smallest type - and it is typically (8) bits, though this is a minimum requirement. And sizeof(char) or (unsigned char) is (1). So, consider using an (unsigned) char to represent (8) columns.
How many chars are required per row? It's (cols / 8), but we have to round up for an integer value:
int byte_cols = (cols + 7) / 8;
or:
int byte_cols = (cols + 7) >> 3;
which you may wish to store within the net_t data structure. Then:
calloc(n->rows * n->byte_cols, 1) is sufficient for a contiguous bit vector.
Address columns and rows by x and y respectively. Setting (x, y) (relative to 0):
n->vec[y * byte_cols + (x >> 3)] |= (1 << (x & 0x7));
Clearing:
n->vec[y * byte_cols + (x >> 3)] &= ~(1 << (x & 0x7));
Searching:
if (n->vec[y * byte_cols + (x >> 3)] & (1 << (x & 0x7)))
/* ... (x, y) is set... */
else
/* ... (x, y) is clear... */
These are bit manipulation operations. And it's fundamentally important to learn how (and why) this works. Google the term for more resources. This uses an eighth of the memory of a char per cell, so I certainly wouldn't consider it premature optimization.
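Pulled together as helper functions, this approach might look like the following sketch (net_t here carries byte_cols, as suggested above; the names are just for illustration):
#include <stdlib.h>

typedef struct {
    int rows;
    int cols;
    int byte_cols;        /* (cols + 7) / 8 */
    unsigned char *vec;
} net_t;

void net_set(net_t *n, int x, int y)
{
    n->vec[y * n->byte_cols + (x >> 3)] |= (1 << (x & 0x7));
}

void net_clear(net_t *n, int x, int y)
{
    n->vec[y * n->byte_cols + (x >> 3)] &= ~(1 << (x & 0x7));
}

int net_get(const net_t *n, int x, int y)
{
    return (n->vec[y * n->byte_cols + (x >> 3)] >> (x & 0x7)) & 1;
}

net_t *net_new(int rows, int cols)
{
    net_t *n = malloc(sizeof(*n));
    if (n == NULL)
        return NULL;
    n->rows = rows;
    n->cols = cols;
    n->byte_cols = (cols + 7) / 8;
    n->vec = calloc((size_t)rows * n->byte_cols, 1);
    if (n->vec == NULL) {
        free(n);
        return NULL;
    }
    return n;
}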

Converting a byte array to an int array in C

I have some code below that is supposed to convert a C (Arduino) 8-bit byte array to a 16-bit int array, but it only seems to partially work. I'm not sure what I'm doing wrong.
The byte array is in little-endian byte order. How do I convert it to an int (two bytes per entry) array?
In layman's terms, I want to merge every two bytes.
Currently, for an input byte array of {0x10, 0x00, 0x00, 0x00, 0x30, 0x00}, the output int array is {1, 0, 0}. The output should be {1, 0, 3}.
The code below is what I currently have:
I wrote this function based on a solution in Stack Overflow question Convert bytes in a C array as longs.
I also have this solution, based off the same code, which works fine for converting a byte array to a long (32-bit) array: http://pastebin.com/TQzyTU2j.
/**
 * Convert the retrieved bytes into a set of 16 bit ints
 **/
int *byteA2IntA(byte *byte_slice, int sizeOfB, int *ret_array) {
    // Variable that stores the addressed int to be stored in SRAM
    int currentInt;
    int sizeOfI = sizeOfB / 2;
    if (sizeOfB % 2 != 0) ++sizeOfI;

    for (int i = 0; i < sizeOfB; i += 2) {
        currentInt = 0;
        if (byte_slice[i] == '\0') {
            break;
        }
        if (i + 1 < sizeOfB)
            currentInt = (currentInt << 8) + byte_slice[i + 1];
        currentInt = (currentInt << 8) + byte_slice[i + 0];
        *ret_array = currentInt;
        ret_array++;
    }
    // Pointer to the return array in the parent scope.
    return ret_array;
}
What is the meaning of this line of code?
if(i + 1 < sizeOfB) currentInt = (currentInt << 8) + byte_slice[i+1];
Here currentInt is always 0 and 0 << 8 = 0.
Also, what you do is, for each pair of bytes (let me call them uint8_t from now on), pack an int (let me call it uint16_t from now on) by doing the following:
You take the rightmost uint8_t
You shift it 8 positions to the left
You add the leftmost uint8_t
Is this really what you want?
Supposing you have byte_slice[] = {1, 2}, you pack a 16 bit integer with the value 513 (2<<8 + 1)!
Also, you don't need to return the pointer to the array of uint16_t as the caller has already provided it to the function.
If you use the return value of your function, as Joachim said, you get a pointer that no longer points to position [0] of the uint16_t array; it points past the last element written.
Vincenzo has a point (or two); you need to be clear about what you're trying to do:
Combine two bytes into one 16-bit int, one byte being the MSB and one byte being the LSB:
int16_t result = (byteMSB << 8) | byteLSB;
Convert an array of bytes into 16-bit
for (i = 0; i < num_of_bytes; i++)
{
    myint16array[i] = mybytearray[i];
}
Copy an array of data into another one
memcpy(dest, src, num_bytes);
That will (probably, platform/compiler dependent) have the same effect as my 1st example.
Also, beware of using ints, as that suggests signed values; use unsigned types, which are safer and probably faster.
The problem is most likely that you increase ret_array and then return it. When you return it, it will point to one place beyond the destination array.
Save the pointer at the start of the function, and use that pointer instead.
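Putting the fixes from these answers together, a corrected version of the function might look like this sketch (it returns the start of the destination array, keeps the little-endian order, and drops the '\0' early exit, since zero bytes appear to be valid data in your example):
/* Convert a little-endian byte array into an array of 16-bit ints (byte is the Arduino type) */
int *byteA2IntA(const byte *byte_slice, int sizeOfB, int *ret_array)
{
    int sizeOfI = (sizeOfB + 1) / 2;                  // round up for an odd byte count
    for (int i = 0; i < sizeOfI; i++) {
        int currentInt = byte_slice[2 * i];           // low byte
        if (2 * i + 1 < sizeOfB)
            currentInt |= byte_slice[2 * i + 1] << 8; // high byte
        ret_array[i] = currentInt;
    }
    return ret_array;                                 // still points at element [0]
}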
Consider using a struct. This is kind of a hack, though.
Off the top of my head it would look like this.
struct customINT16 {
    byte ByteHigh;
    byte ByteLow;
};
So in your case you would write:
struct customINT16 myINT16;
myINT16.ByteHigh = BYTEARRAY[0];
myINT16.ByteLow = BYTEARRAY[1];
You'll have to go through a pointer to cast it, though:
int *intpointer = (int *)&myINT16;
INTARRAY[0] = *intpointer;

How to convert from integer to unsigned char in C, given integers larger than 256?

As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars to write some data to a file (I have to use this function, so I can't just make my own purpose built function that works differently BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However, am I right in thinking that an unsigned char can only have a max value of 255 because it is 1 byte long? Do I therefore need to use 4 unsigned chars for every integer? Casting doesn't seem to work with larger values for the integer. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You could extract the individual bytes of a 32-bit int:
#include <stdint.h>

void pack32(uint32_t val, uint8_t *dest)
{
    dest[0] = (val & 0xff000000) >> 24;
    dest[1] = (val & 0x00ff0000) >> 16;
    dest[2] = (val & 0x0000ff00) >> 8;
    dest[3] = (val & 0x000000ff);
}

uint32_t unpack32(const uint8_t *src)
{
    uint32_t val;

    val  = (uint32_t)src[0] << 24;
    val |= (uint32_t)src[1] << 16;
    val |= (uint32_t)src[2] << 8;
    val |= (uint32_t)src[3];
    return val;
}
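A quick round trip with these helpers (the byte values shown assume the big-endian order chosen above):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t buf[4];
    uint32_t v;

    pack32(4095u, buf);            /* buf becomes { 0x00, 0x00, 0x0F, 0xFF } */
    v = unpack32(buf);
    printf("%u\n", (unsigned)v);   /* prints 4095 */
    return 0;
}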
Unsigned char is generally 1 byte, therefore you can decompose any other type into an array of unsigned chars (e.g. for a 4-byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte into the file.
The following example should write a number (of any data type) to the file. I am not sure if it works for your case, since you are forcing the cast to unsigned char * instead of void *.
#include <stdio.h>

int homework(unsigned char *foo, size_t size)
{
    // open the file for binary writing
    FILE *f = fopen("work.txt", "wb");
    if (f == NULL)
        return 1;

    // write the data to the file, byte by byte
    fwrite(foo, sizeof(unsigned char), size, f);

    fclose(f);
    return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right; a char/byte is only 8 bits, so it can hold 2^8 distinct values, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char *p = (char *)&x;
for (int i = 0; i < sizeof(x); i++)
{
    // Do something with p[i]
}
(This isn't officially C because of the order of declaration but whatever... it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
If you have to write an array of integers, then just convert the array into a pointer to char and run through the array:
int main()
{
    int data[] = { 1, 2, 3, 4, 5 };
    size_t size = sizeof(data) / sizeof(data[0]); // Number of integers.
    unsigned char *out = (unsigned char *)data;

    for (size_t loop = 0; loop < (size * sizeof(int)); ++loop)
    {
        MyProfSuperWrite(out + loop); // Write 1 unsigned char
    }
}
Now, people have mentioned that values up to 4095 will fit in fewer bits than a normal integer. Probably true. Thus you can save space by not writing out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the values and process the incoming data is not worth the savings you would get (maybe if the data were the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space, but it will cost in processing time and maintenance.
The part of the assignment that says "integers whose values can be up to 4095 using this function (that only takes unsigned chars)" should be giving you a huge hint. 4095 unsigned is 12 bits.
You can store the 12 bits in a 16-bit short, but that is somewhat wasteful of space - you are only using 12 of the 16 bits of the short. Since you are dealing with more than 1 byte in the conversion, you may need to deal with the endianness of the result. This is the easiest option.
You could also do a bit field or some packed binary structure if you are concerned about space. That is more work.
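For instance, the two-bytes-per-value route might look like this sketch (it fixes the byte order explicitly, so endianness stops being an issue):
/* write one value (0..4095) as two bytes, low byte first */
void pack12(unsigned value, unsigned char *dest)
{
    dest[0] = value & 0xFF;         /* low 8 bits */
    dest[1] = (value >> 8) & 0x0F;  /* high 4 bits; the top 4 bits of this byte go unused */
}

unsigned unpack12(const unsigned char *src)
{
    return src[0] | ((unsigned)(src[1] & 0x0F) << 8);
}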
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for "4095" plus the terminating '\0'

// array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
    sprintf(num, "%d", array[i]);
    // Call your function that expects a pointer to chars
    printfunc(num);
}
Without information on the function you are directed to use (its arguments, return value and semantics, i.e. the definition of its behaviour), it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char *)array, sizeof(array));
or
theFunction((unsigned char *)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char *)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.
