Operating on the Rightmost/Leftmost n Bits, Not All the Bits, of an Integer-Type Variable in C

In a programming task, I have to add a smaller integer in variable B (data type int)
to a larger integer (a 20-digit decimal number) in variable A (data type long long int),
then compare A with variable C, which is as large an integer (data type long long int) as A.
What I realized: since I add a small B to A,
I don't need to check all the digits of A when I compare it with C; in other words, we don't need to check all the bits of A and C.
Given that I know how many bits from the right I need to check, say n bits,
is there a way/technique to check only those specific n bits from the right (not all the bits of A and C) to make the program faster in the C programming language?
Comparing all the bits takes more time, and since I am working with large numbers, the program becomes slower.
Every time I search on Google, bit masking comes up, which uses all the bits of A and C; that doesn't do what I am asking for, so I am probably not using the correct terminology. Please help.
Addition:
Initial comments on this post made me think there is no way, but I found the following in
Bit Manipulation by University of Colorado Boulder
(#cuboulder, after 7:45):
...the bit band region is accessed via a bit band alias; each bit in a
supported bit band region has its own unique address, and we can access
that bit using a pointer to its bit band alias location; the least
significant bit in an alias location can be set or cleared, and that
will be mapped to the bit in the corresponding data or peripheral
memory; unfortunately this will not help you if you need to write to
multiple bit locations in memory; dependent operations only allow a
single bit to be cleared or set...
Is the above what I am asking for? If yes, then
where can I find the details, as a beginner?
Updated question:
Is there a way/technique to check only those specific n bits from the right (not all the bits of A and C) in the C programming language (or any other language) that makes the program faster?

Your assumption that comparing fewer bits is faster might be true in some cases, but it is probably not true in most cases.
I'm only familiar with x86 CPUs. An x86-64 processor has 64-bit-wide registers. These can be accessed as 64-bit registers, but their lower bits also as 32-, 16-, and 8-bit registers. There are processor instructions which work with the 64-, 32-, 16-, or 8-bit part of a register. Comparing 8 bits is one instruction, but so is comparing 64 bits.
If using the 32-bit comparison were faster than the 64-bit comparison, you could gain some speed. But it seems there is no speed difference for current processor generations. (Check out the cmp instruction via the link to uops.info from @harold.)
If your long long data type is actually bigger than the word size of your processor, then it's a different story. E.g., if your long long is 64 bits but you are on a 32-bit processor, then the comparison cannot be handled in one register and you need multiple instructions. So if you know that comparing only the lower 32 bits would be enough, this could save some time.
Also note that comparing only, e.g., 20 bits would actually take more time than comparing 32 bits: you would have to mask off the 12 highest bits and then compare, so you need a bitwise AND instruction on top of the comparison itself.
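A minimal sketch of what that extra masking looks like in C (the 0xFFFFF constant selects the low 20 bits; the function and variable names are just illustrative):

#include <stdbool.h>
#include <stdint.h>

/* Compare only the low 20 bits of a and c: needs an extra XOR/AND
   on top of the comparison, so it is not cheaper than a full compare. */
static bool low20_equal(uint64_t a, uint64_t c) {
    return ((a ^ c) & 0xFFFFFu) == 0;
}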
As you see, this is very processor-specific, and you are down at the processor's opcode level. As @RawkFist wrote in his comment, you could try to get the C compiler to create such instructions, but that does not automatically mean it is even faster.
All of this is only relevant if these operations are executed a lot. I'm not sure what you are doing. If, e.g., you add many values B to A and compare the result to C each time, it might be faster to start with C, subtract the B values from it, and compare with 0, because the compare operation internally works like a subtraction. So instead of an add and a compare instruction, a single subtraction would be enough within the loop. But modern CPUs and compilers are very smart and optimize a lot, so maybe the compiler already performs such or similar optimizations automatically.
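A rough sketch of that transformation (the loop shape and names are assumptions, not the OP's actual code; both versions are equivalent if no overflow occurs):

#include <stdbool.h>

/* Original shape: accumulate Bs into A, compare against C each time. */
bool reaches_c_v1(long long A, long long C, const int *B, int n) {
    for (int i = 0; i < n; i++) {
        A += B[i];
        if (A == C) return true;    /* add + compare per iteration */
    }
    return false;
}

/* Rewritten shape: count the remaining distance down toward 0. */
bool reaches_c_v2(long long A, long long C, const int *B, int n) {
    long long rest = C - A;
    for (int i = 0; i < n; i++) {
        rest -= B[i];
        if (rest == 0) return true; /* the subtraction already sets the flags */
    }
    return false;
}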

Try this question.
Is there a way/technique to check only those specific n bits from the right (not all the bits of A and C) in the C programming language (or any other language) that makes the program faster?
Yes - when A + B != C. We can short-cut the comparison once a difference is found: from least to most significant.
No - when A + B == C. All bits need comparison.
Now back to OP's original question
Is there a way/technique to check only those specific n bits from the right (not all the bits of A and C) in the C programming language (or any other language) that makes the program faster?
No. In order to do so, we would need to out-think the compiler. A well-enabled compiler will itself notice any "tricks" available for a long long + int == long long comparison and emit efficient code.
Yet what about really long compares? How about a custom uint1000000 for A and C?
For long compares of a custom type, a quick compare can be had.
First, select a fast working type. unsigned is a prime candidate.
typedef unsigned ufast;
Now define the wide integer.
#include <limits.h>
#include <stdbool.h>
#define UINT1000000_N (1000000/(sizeof(ufast) * CHAR_BIT))
typedef struct {
    // Least significant first
    ufast digit[UINT1000000_N];
} uint1000000;
Perform the addition and compare one "digit" at a time.
bool uint1000000_fast_offset_compare(const uint1000000 *A, unsigned B,
                                     const uint1000000 *C) {
    ufast carry = B;
    for (unsigned i = 0; i < UINT1000000_N; i++) {
        ufast sum = A->digit[i] + carry;
        if (sum != C->digit[i]) {
            return false;              // mismatch: short-circuit, least significant first
        }
        carry = sum < A->digit[i];     // unsigned wrap-around detects the carry
    }
    return true;
}
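A hedged usage sketch (the digit values below are invented purely to exercise the function):

int main(void) {
    static uint1000000 A, C;   // static: zero-initialized digits
    A.digit[0] = 41;
    C.digit[0] = 42;
    // Does A + 1 == C? Here: yes, so this returns 0.
    return uint1000000_fast_offset_compare(&A, 1, &C) ? 0 : 1;
}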

Related

Explain how specific C #define works

I have been looking at some of the code at http://www.netlib.org/fdlibm/ to see how some functions work, and I was looking at the code for e_log.c, where in some parts it says:
hx = __HI(x); /* high word of x */
lx = __LO(x); /* low word of x */
The code for __HI(x) and __LO(x) is:
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
which I really don't understand, because I am not familiar with this style of C. Can someone please explain to me what __HI(x) and __LO(x) are doing?
Later in the code for the function there is also the statement:
__HI(x) = hx|(i^0x3ff00000);
Can someone please explain to me:
how it is possible to assign to something that looks like a function call (I generally work with Python, so I don't really know what is going on)?
what are __HI(x) and __LO(x) doing?
what does the program mean by the "high word" and "low word" of x?
The final purpose of my analysis is to understand this code in order to port it to a Python implementation.
These macros use compiler-dependent properties to access the representations of double types.
In C, all objects other than bit-fields are represented as sequences of bytes. The fdlibm code you are looking at is designed for implementations where int is four bytes and the double type is represented using eight bytes in a format defined by the IEEE-754 floating-point specification. That format is called binary64 or IEEE-754 basic 64-bit binary floating-point. It is also designed for an implementation where the C compiler guarantees that aliasing via pointer conversions is supported. (This is not guaranteed by the C standard, but C implementations may support it.)
Consider a double object named x. Given these macros:
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
When __LO(x) is used in source code, it is replaced by *(int*)&x. The &x takes the address of x. The address of x has type double *. The cast (int *) converts this to int *, a pointer to an int. Then * dereferences this pointer, resulting in a reference to the int that is at the low-address part of x.
When __HI(x) is used in the source code, (int*)&x again points to the low-address part of x. Adding 1 changes it to point to the high-address part. Then * dereferences this, resulting in a reference to the int that is at the high-address part.
The routines in fdlibm are special mathematical routines. To operate, they need to examine and modify the bytes that represent double values. The __LO and __HI macros give them this access.
These definitions of __HI and __LO work for implementations that store the double values in little-endian order (with the “least significant” part of the double in the lower-addressed memory location). The fdlibm code may contain alternate definitions for big-endian systems, likely selected by some #if statement.
In the code __HI(x) = hx|(i^0x3ff00000);, the constant 0x3ff00000 is the high word of the double value 1.0: the biased exponent 0x3ff (1023, the exponent bias) shifted into the exponent field. Without context, we cannot say precisely what is happening here, but the code appears to be merging hx with some value derived from i. It is likely completing some computation of the bytes representing a new double value it is creating and storing those bytes in the “high” part of a double object.
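As a side note, the same words can be inspected without relying on pointer aliasing by copying the representation out with memcpy. A minimal sketch, assuming (like the macros above) a 64-bit double, 32-bit int, and little-endian order:

#include <stdio.h>
#include <string.h>

int main(void) {
    double x = 1.0;
    unsigned words[2];            // words[0] = low word, words[1] = high word
    memcpy(words, &x, sizeof x);  // well-defined, unlike the pointer casts
    printf("lo=0x%08x hi=0x%08x\n", words[0], words[1]); // 1.0 -> hi is 0x3ff00000
    return 0;
}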
I am adding a reply to complement (not substitute) the one already present.
hx = __HI(x); /* high word of x */
lx = __LO(x); /* low word of x */
Comments are useful... even if in this case the macro names could be clear enough. "High" and "low" refer to the two halves of an integer representation, typically 16 or 32 bits wide; for the two halves of an 8-bit value the usual term is "nibble".
If we take a 16-bit unsigned integer, which can range from 0 to 65535 (0x0000 to 0xFFFF in hex), for example 0x1234, the two halves are:
0x1234
  ^^------ upper half, or "high"
    ^^---- lower half, or "low"
Note that "lower" means the less significant part. The arithmetic way to get the two halves, assuming 16 bits, is a bitwise AND with 0xFF to get lo(), and a right shift by 8 bits (a division by 256) to get high().
Now, in memory the number 0x1234 is written in two consecutive locations, either as 0x12 then 0x34 if big-endian, or 0x34 then 0x12 if little-endian. Given this, other ways are possible to read a single half directly from memory, without calculation: to get the lo() of 0x1234 on a little-endian machine, it is possible to read the single byte in the first location.
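A minimal sketch of both extraction styles (the direct memory read assumes little-endian byte order, as described above):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint16_t v = 0x1234;
    uint8_t lo = v & 0xFF;           // arithmetic way: mask
    uint8_t hi = v >> 8;             // arithmetic way: shift
    uint8_t lo_mem = *(uint8_t *)&v; // memory way: first byte on little-endian
    printf("lo=0x%02x hi=0x%02x lo_mem=0x%02x\n", lo, hi, lo_mem);
    return 0;
}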
From the question:
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
Both __LO and __HI peek directly into memory (the endianness-dependent way) rather than computing the halves with masks and shifts (the sure way). Note that the integer being split in two has twice the size of the machine's int: x is a 64-bit double accessed as two 32-bit ints. And there is another caveat: these macros can read the halves, but they can also be used to write the two halves separately. In fact, from the question:
__HI(x) = hx|(i^0x3ff00000);
the result is to set only the HI part (upper, most significant half) of x. Note also that the constant used, 0x3ff00000, is a 32-bit value: it operates on one 32-bit half of the 64-bit x.
Hope this is clear enough to translate the C to Python. Work with 64-bit integers: when you need the LO() part, take a bitwise AND with 0xFFFFFFFF; to get HI(), shift right by 32 bits or do the equivalent division.
When HI and LO are on the left of an assignment, only that half of the value is written; you should construct the two halves separately and sum them up (or bitwise-OR them together).
Hope it helps...
#define A B
is a preprocessor directive that substitutes literal A with literal B all over the source code before the compilation.
#define A(x) B
is a function-like preprocessor macro which uses a parameter x in order to do a parameterized preprocessor substitution. In this case, B can be a function of x as well.
Your macros
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
// called as
__HI(x) = hx|(i^0x3ff00000);
Since it is just a matter of code substitution, the assignment is perfectly legit. Why? Because in both cases the macro expands to an lvalue, something that designates an object and can therefore appear on the left of an assignment.
That lvalue is in both cases an object of type int:
take x's address
cast it to a pointer to int
dereference it (in the case of __LO())
add 1 and then dereference it (in the case of __HI()).
What they will actually point to depends on the architecture, because pointer arithmetic is architecture-dependent. Endianness also has to be taken into account.
What we can say is that they are designed to access the lower and the higher halves of a data type whose size is 2*sizeof(int) (so, if for example int is 32 bits wide, they allow access to the lower 32 bits and to the upper 32 bits). Furthermore, from the macro names we understand that it targets a little-endian architecture (the least significant half comes first).
In order to port code containing these macros to Python, you will need to work at a higher level, since Python does not support pointers.
These tips don't solve your specific task, but they give you a working method for this task and similar ones:
A way to understand what a macro does is to check how it is actually expanded by the preprocessor. This can be done on most compilers through the -E compiler option (see the sketch after this list).
Use a debugger to understand the functionality: set a breakpoint just before the use of the macro, and analyze its effects on addresses and variables.
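For instance, a minimal file to feed through the preprocessor with gcc -E demo.c (the file name and function are just an example):

/* demo.c */
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x

double x;
int hx, lo;

void f(void) {
    lo = __LO(x);   /* expands to: lo = *(int*)&x;     */
    hx = __HI(x);   /* expands to: hx = *(1+(int*)&x); */
}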

Declaring the array size in C

It's quite embarrassing, but I really want to know... I needed to make a conversion program that converts decimal (base 10) to binary and hex. I used arrays to store values and everything worked out fine, but I declared the array as int arr[1000]; because I thought 1000 was just an OK number, not too big, not too small... Someone in class said, "Why would you declare an array of 1000? Integers are 32 bits." I was too embarrassed to ask what that meant, so I didn't say anything. But does this mean that I can just declare the array as int arr[32]; instead? I'm using C, by the way.
No. The int type typically has a 32-bit size, but when you declare
int arr[1000];
you are reserving space for 1000 integers, i.e., 32,000 bits, while with
int arr[32];
you can store up to 32 integers.
You are practically asking yourself a question like this: if an apple weighs 32 grams, do I want my bag to
contain 1000 apples or 32 apples?
Don't be embarrassed. Fear is your enemy, and in the end you will be perceived based on contexts that you have no hope of significantly influencing. Anyway, to answer your question: your approach is incorrect. You should declare the array with a size determined entirely by the number of positions actually used.
Concretely, if you access the array at 87 distinct positions (from 0 to 86), then you need a size of 87.
0 to 4,294,967,295 is the maximum possible range of numbers you can store in 32 bits. If your number is outside this range, you cannot store it in 32 bits. Since each bit will occupy one index location of your array, if your number falls in that range, an array size of 32 will do fine. For example, the number 9 would be stored in the array as a[] = {1, 0, 0, 1}.
To know the representable range, the formula is 0 to (2^n - 1), where n is the number of bits. So with an array size of 4, i.e., 4 bits, you can only store numbers in the range 0 to 15.
In C, the integer data type can typically store up to 2,147,483,647, or 4,294,967,295 if you are using an unsigned integer. Since the maximum value an int can store in C fits within 32 bits, it is safe to say that an array size of 32 is enough: you will never require more than 32 bits to express any number held in an int.
I will use
#include <limits.h>
int a = 42;
char bin[sizeof a * CHAR_BIT + 1];
char hex[sizeof a * CHAR_BIT / 4 + 1];
I think this covers all possibilities.
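A hedged sketch of filling those buffers (the conversion loop is illustrative, not from the original post; a is switched to unsigned to keep the bit loop simple):

#include <limits.h>
#include <stdio.h>

int main(void) {
    unsigned a = 42;
    char bin[sizeof a * CHAR_BIT + 1];
    char hex[sizeof a * CHAR_BIT / 4 + 1];

    int nbits = (int)(sizeof a * CHAR_BIT);
    for (int i = 0; i < nbits; i++)            // most significant bit first
        bin[i] = (a >> (nbits - 1 - i)) & 1u ? '1' : '0';
    bin[nbits] = '\0';

    snprintf(hex, sizeof hex, "%x", a);        // hex via the printf family

    printf("%s\n%s\n", bin, hex);
    return 0;
}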
Consider also that the 'int' type is ambiguous: its size generally depends on the machine you're working on, and at minimum its guaranteed range is only -32767 to +32767:
https://en.wikipedia.org/wiki/C_data_types
May I suggest using the stdint.h types instead?
int32_t / uint32_t
What you did is okay, if that is precisely what you want to do. C is a language that lets you do whatever you want, whenever you want. The reason you were berated for the declaration is 'hogging' memory; the thought being: how DARE YOU take up space that is possibly never used... it is inefficient.
And it is. But who cares, if you just want to run a program with a simple purpose? A block of 1000 16- or 32-bit integers is weeeeeensy teeeeny tiny compared to the way-back times, when it was necessary to watch over how much RAM you were taking up. So go ahead.
But what they should have said next is how to avoid that. More on that at the end; first, a thing about built-in data types in C.
An int can be 16 or 32 bits depending on your compiler and its settings; the standard only guarantees minimum ranges. A long int is at least 32 bits.
Consider:
short int x = 10;    // declares an integer that is at least 16 bits
signed int x = 10;   // commonly a 32-bit integer, negative and positive range
unsigned int x = 10; // the same size, but only 0 to positive values
To specifically ask for an integer of at least 32 bits, you declare it 'long':
long int x = 10;          // at least 32 bits
unsigned long int x = 10; // at least 32 bits, 0 to positive values
Typical nomenclature (on Windows, at least) is to call a 16-bit value a WORD and a 32-bit value a DWORD (double word). But why would you want to type
long int x = 10;
instead of
int x = 10;
? For a few reasons. Some compilers may handle int as a 16-bit WORD, keeping up with older standards. But the main reason is to maintain a convention of strongly typed code that reads directly as what you intend it to do. This also helps readability: you will KNOW, when you see it, what size it is, and be reminded while coding. Many code mishaps happen for lack of attention to code practices and naming things well. Save yourself hours of headache later on by learning good habits now. Create YOUR OWN style of coding, and take a look at other styles just to get an idea of what the industry may expect; in the end you will find your own way.
On to the array issue ---> So, I expect you know that the array takes up memory right when the program runs. Right then, wham - the RAM for that array is set aside just for your program. It is locked out from use by any other resource, service, etc the operating system is handling.
But wouldn't it be neat if you could just use the memory you needed when you wanted, and then let it go when done? Inside the program - as it runs. So when your program first started, the array (so to speak) would be zero. And when you needed a 'slot' in the array, you could just add one.... use it, and then let it go - or add another - or ten more... etc.
That is called dynamic memory allocation. And it requires the use of a data type that you may not have encountered yet. Look up "Pointers in C" to get an intro.
If you are coding in regular C, there are a few functions that assist in performing dynamic allocation of memory:
malloc and free ~ declared in the stdlib.h header
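A minimal malloc/free sketch (the length 42 is arbitrary, purely for illustration):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    size_t n = 42;                       // decided at run time, not compile time
    int *arr = malloc(n * sizeof *arr);  // grab exactly what we need
    if (arr == NULL) return 1;           // allocation can fail: always check

    for (size_t i = 0; i < n; i++)
        arr[i] = (int)i;
    printf("%d\n", arr[n - 1]);

    free(arr);                           // give the memory back when done
    return 0;
}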
in C++ they are implemented differently. Look for:
new and delete
A common construct for handling dynamic 'arrays' is called a "linked-list." Look that up too...
Don't let someone get you flustered with code concepts. Next time, just say your program is designed to handle exactly what you have intended. That usually stops the discussion.
Atomkey

When to use bit-fields in C

On the question 'why do we need to use bit-fields?', searching on Google I found that bit fields are used for flags.
Now I am curious,
Is it the only way bit-fields are used practically?
Do we need to use bit fields to save space?
A way of defining bit fields, from the book:
struct {
    unsigned int is_keyword : 1;
    unsigned int is_extern  : 1;
    unsigned int is_static  : 1;
} flags;
Why do we use int?
How much space is occupied?
I am confused about why we use int, and not short or something smaller than an int.
As I understand it, only 1 bit is occupied in memory, not the whole unsigned int value. Is that correct?
A quite good resource is Bit Fields in C.
The basic reason is to reduce the used size. For example, if you write:
struct {
    unsigned int is_keyword;
    unsigned int is_extern;
    unsigned int is_static;
} flags;
You will use at least 3 * sizeof(unsigned int), or 12 bytes, to represent three small flags that should only need three bits.
So if you write:
struct {
    unsigned int is_keyword : 1;
    unsigned int is_extern  : 1;
    unsigned int is_static  : 1;
} flags;
This uses up the same space as one unsigned int, so 4 bytes. You can throw 32 one-bit fields into the struct before it needs more space.
This is sort of equivalent to the classical home brew bit field:
#define IS_KEYWORD 0x01
#define IS_EXTERN 0x02
#define IS_STATIC 0x04
unsigned int flags;
But the bit field syntax is cleaner. Compare:
if (flags.is_keyword)
against:
if (flags & IS_KEYWORD)
And it is obviously less error-prone.
Now I am curious, [are flags] the only way bitfields are used practically?
No, flags are not the only way bitfields are used. They can also be used to store values larger than one bit, although flags are more common. For instance:
typedef enum {
    NORTH = 0,
    EAST  = 1,
    SOUTH = 2,
    WEST  = 3
} directionValues;

struct {
    unsigned int alice_dir : 2;
    unsigned int bob_dir   : 2;
} directions;
Do we need to use bitfields to save space?
Bitfields do save space. They also allow an easier way to set values that aren't byte-aligned. Rather than bit-shifting and using bitwise operations, we can use the same syntax as setting fields in a struct. This improves readability. With a bitfield, you could write
directions.alice_dir = WEST;
directions.bob_dir = SOUTH;
However, to store multiple independent values in the space of one int (or other type) without bitfields, you would need to write something like:
#define ALICE_OFFSET 0
#define BOB_OFFSET 2
directions &= ~(3<<ALICE_OFFSET); // clear Alice's bits
directions |= WEST<<ALICE_OFFSET; // set Alice's bits to WEST
directions &= ~(3<<BOB_OFFSET); // clear Bob's bits
directions |= SOUTH<<BOB_OFFSET; // set Bob's bits to SOUTH
The improved readability of bitfields is arguably more important than saving a few bytes here and there.
Why do we use int? How much space is occupied?
The space of an entire int is occupied. We use int because in many cases, it doesn't really matter. If, for a single value, you use 4 bytes instead of 1 or 2, your user probably won't notice. For some platforms, size does matter more, and you can use other data types which take up less space (char, short, uint8_t, etc.).
As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?
No, that is not correct. The entire unsigned int will exist, even if you're only using 8 of its bits.
Another place where bitfields are common is hardware registers. If you have a 32-bit register where each bit has a certain meaning, you can describe it elegantly with a bitfield.
Such a bitfield is inherently platform-specific. Portability does not matter in this case.
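A hedged sketch of the idea (the register layout and address below are invented for illustration, not a real device):

#include <stdint.h>

/* Hypothetical 32-bit control register; the layout is an assumption. */
typedef struct {
    uint32_t enable    : 1;   // bit 0: peripheral on/off
    uint32_t mode      : 3;   // bits 1-3: operating mode
    uint32_t prescaler : 8;   // bits 4-11: clock divider
    uint32_t reserved  : 20;  // bits 12-31: unused
} ctrl_reg_t;

/* A real driver would overlay this on a device address, e.g.:
   volatile ctrl_reg_t *CTRL = (volatile ctrl_reg_t *)0x40000000;  (address invented)
   CTRL->enable = 1;  CTRL->prescaler = 7; */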
We use bit fields mostly (though not exclusively) for flag structures - bytes or words (or possibly larger things) in which we try to pack tiny (often 2-state) pieces of (often related) information.
In these scenarios, bit fields are used because they correctly model the problem we're solving: what we're dealing with is not really an 8-bit (or 16-bit or 24-bit or 32-bit) number, but rather a collection of 8 (or 16 or 24 or 32) related, but distinct pieces of information.
The problems we solve using bit fields are problems where "packing" the information tightly has measurable benefits and/or "unpacking" the information doesn't have a penalty. For example, if you're exposing 1 byte through 8 pins and the bits from each pin go through their own bus that's already printed on the board so that it leads exactly where it's supposed to, then a bit field is ideal. The benefit in "packing" the data is that it can be sent in one go (which is useful if the frequency of the bus is limited and our operation relies on frequency of its execution), and the penalty of "unpacking" the data is non-existent (or existent but worth it).
On the other hand, we don't use bit fields for booleans in other cases like normal program flow control, because of the way computer architectures usually work. Most common CPUs don't like fetching one bit from memory - they like to fetch bytes or integers. They also don't like to process bits - their instructions often operate on larger things like integers, words, memory addresses, etc.
So, when you try to operate on bits, it's up to you or the compiler (depending on what language you're writing in) to write out additional operations that perform bit masking and strip the structure of everything but the information you actually want to operate on. If there are no benefits in "packing" the information (and in most cases, there aren't), then using bit fields for booleans would only introduce overhead and noise in your code.
To answer the original question »When to use bit-fields in C?« … according to the book "Write Portable Code" by Brian Hook (ISBN 1-59327-056-9, I read the German edition ISBN 3-937514-19-8) and to personal experience:
Never use the bitfield idiom of the C language; do it yourself instead.
A lot of implementation details are compiler-specific, especially in combination with unions, and things are not guaranteed across different compilers and different endiannesses. If there's even a tiny chance your code has to be portable and will be compiled for different architectures and/or with different compilers, don't use it.
We had this case when porting code from a little-endian microcontroller with some proprietary compiler to another big-endian microcontroller with GCC, and it was not fun. :-/
This is how I have used flags (host byte order ;-) ) since then:
#define SOME_FLAG        (1 << 0)
#define SOME_OTHER_FLAG  (1 << 1)
#define AND_ANOTHER_FLAG (1 << 2)
/* test flag */
if (someint & SOME_FLAG) {
    /* do this */
}
/* set flag */
someint |= SOME_FLAG;
/* clear flag */
someint &= ~SOME_FLAG;
No need for a union of the int type with some bitfield struct then. If you read lots of embedded code, those test, set, and clear patterns become common, and you spot them easily in your code.
Why do we need to use bit-fields?
When you want to store data that needs less than one byte, several such items can be packed into a structure using bit fields.
In the embedded world, when each bit or group of bits of a 32-bit register has a different meaning, you can also use bit fields to make the code more readable.
I found that bit fields are used for flags. Now I am curious, is it the only way bit-fields are used practically?
No, this is not the only way; you can use them in other ways too.
Do we need to use bit fields to save space?
Yes.
As I understand only 1 bit is occupied in memory, but not the whole unsigned int value. Is it correct?
No. Memory can only be occupied in multiples of bytes.
Bit fields can be used to save memory space (though using them for this purpose is rare). They are used where there are memory constraints, e.g., when programming embedded systems.
But they should be used only when really required, because we cannot take the address of a bit field, so the address-of operator & cannot be used with them.
A good usage would be implementing a chunk to translate to, and from, Base64 or any other unaligned data structure.
struct base64enc {
    unsigned int e1 : 6;
    unsigned int e2 : 6;
    unsigned int e3 : 6;
    unsigned int e4 : 6;
}; // I don't know if declaring a 4-byte array would have the same effect.

struct base64dec {
    unsigned char d1;
    unsigned char d2;
    unsigned char d3;
};

union base64chunk {
    struct base64enc enc;
    struct base64dec dec;
};

union base64chunk b64c;
// You can assign three characters to b64c.dec, and instantly get four 0-63 codes from b64c.enc.
This example is a bit naive, since Base64 must also consider padding (i.e., a string whose length l is not a multiple of 3). But it works as a sample of accessing unaligned data structures.
Another example: using this feature to break a TCP packet header (or another network protocol's packet header) into its components, although that is a more advanced and less end-user-oriented example. In general, this is useful for PC internals, operating systems, drivers, and encoding systems.
Another example: analyzing a float number.
struct _FP32 {
    unsigned int sign     : 1;
    unsigned int exponent : 8;
    unsigned int mantissa : 23;
};

union FP32_t {
    struct _FP32 parts;
    float number;
};
(Disclaimer: I don't know the file name / type name where this is applied, but in C this would be declared in a header. The order in which a compiler allocates bit-fields within a unit is implementation-defined, so on many little-endian compilers the fields must be declared mantissa-first for this overlay to line up. Also, I don't know how this could be done for 64-bit floating-point numbers, since the mantissa needs 52 bits and, on a 32-bit target, ints have 32 bits.)
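A hedged usage sketch (it assumes the overlay lines up on the target, per the disclaimer above):

#include <stdio.h>

/* reuses union FP32_t from the answer above */
int main(void) {
    union FP32_t f;
    f.number = -1.5f;
    printf("sign=%u exponent=%u mantissa=0x%06x\n",
           (unsigned)f.parts.sign, (unsigned)f.parts.exponent,
           (unsigned)f.parts.mantissa); /* -1.5f -> sign=1, exponent=127, mantissa=0x400000 */
    return 0;
}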
Conclusion: As the concept and these examples show, this is a rarely used feature, mostly for internal purposes rather than day-to-day software.
To answer the parts of the question no one else answered:
Ints, not Shorts
The reason to use ints rather than shorts, etc. is that in most cases no space will be saved by doing so.
Modern computers have a 32 or 64 bit architecture and that 32 or 64 bits will be needed even if you use a smaller storage type such as a short.
The smaller types are only useful for saving memory if you can pack them together (for example a short array may use less memory than an int array as the shorts can be packed together tighter in the array). For most cases, when using bitfields, this is not the case.
Other uses
Bitfields are most commonly used for flags, but there are other things they are used for. For example, one way to represent a chess board, used in a lot of chess algorithms, is to use a 64-bit integer to represent the board (8*8 squares) and set flags in that integer to give the positions of all the white pawns. Another integer holds all the black pawns, etc.
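A minimal bitboard sketch (the square numbering is an assumption: 0 = a1 through 63 = h8):

#include <stdint.h>
#include <stdio.h>

typedef uint64_t bitboard;  // one bit per square

static bitboard set_square(bitboard b, int sq) { return b | (UINT64_C(1) << sq); }
static int      has_square(bitboard b, int sq) { return (b >> sq) & 1; }

int main(void) {
    bitboard white_pawns = 0;
    for (int file = 0; file < 8; file++)     // put pawns on rank 2 (squares 8..15)
        white_pawns = set_square(white_pawns, 8 + file);
    printf("pawn on e2? %d\n", has_square(white_pawns, 12)); // e2 = square 12
    return 0;
}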
You can use bit-fields to expand the set of unsigned types that wrap. Ordinarily you only have the widths 8, 16, 32, 64, ..., but with bit-fields you can have any width:

#include <stdio.h>

struct a {
    unsigned int b : 3;  /* wraps from 7 back to 0 */
};

int main(void) {
    struct a w = { 0 };
    while (1) {
        printf("%u\n", w.b++);  /* prints 0, 1, ..., 7, 0, 1, ... */
        getchar();
    }
}
Bit fields let us make good use of memory space.
As far as I know, in real-world programming, if we need to, we can use a Boolean type instead of declaring flags as integers and then building bit fields.
If the values are also ones we use often, not only do we save space, we can also gain performance, since we do not pollute the caches.
However, caching is also the danger in using bit fields, since concurrent reads and writes to different bits cause a data race, and updates to completely separate bits might overwrite new values with old ones...
Bitfields are much more compact, and that is an advantage.
But don't forget that packed structures are slower to access than normal structures. They are also more difficult to construct, since the programmer must define the number of bits to use for each field. This is a disadvantage.
Why do we use int? How much space is occupied?
One answer to this question that I haven't seen mentioned in any of the other answers is that the C standard guarantees support for int. Specifically:
A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation defined type.
It is common for compilers to allow additional bit-field types, but not required. If you're really concerned about portability, int is the best choice.
Nowadays, microcontrollers (MCUs) have peripherals such as I/O ports, ADCs, and DACs on the chip along with the processor.
Before MCUs became available with the needed peripherals, we would access some of our hardware by connecting to the buffered address and data buses of the microprocessor. A pointer would be set to the memory address of the device, and if the device saw its address along with the R/W signal, and maybe a chip select, it would be accessed.
Oftentimes we would want to access individual bits, or small groups of bits, on the device.
In our project, we used this to extract a page table entry and page directory entry from a given memory address:
union VADDRESS {
    struct {
        ULONG64 BlockOffset : 16;
        ULONG64 PteIndex    : 14;
        ULONG64 PdeIndex    : 14;
        ULONG64 ReservedMBZ : (64 - (16 + 14 + 14));
    };
    ULONG64 AsULONG64;
};
Now suppose we have an address:
union VADDRESS tempAddress;
tempAddress.AsULONG64 = 0x1234567887654321;
Now we can access PTE and PDE from this address:
cout << tempAddress.PteIndex;

On embedded platforms, is it more efficient to use unsigned int instead of (implicitly signed) int?

I've got into this habit of always using unsigned integers where possible in my code, because the processor can do divides by powers of two on unsigned types, which it can't with signed types. Speed is critical for this project. The processor operates at up to 40 MIPS.
My processor has an 18 cycle divide, but it takes longer than the single cycle barrel shifter. So is it worth using unsigned integers here to speed things up or do they bring other disadvantages? I'm using a dsPIC33FJ128GP802 - a member of the dsPIC33F series by Microchip. It has single cycle multiply for both signed and unsigned ints. It also has sign and zero extend instructions.
For example, it produces this code when mixing signed and unsigned integers.
026E4 97E80F mov.b [w15-24],w0
026E6 FB0000 se w0,w0
026E8 97E11F mov.b [w15-31],w2
026EA FB8102 ze w2,w2
026EC B98002 mul.ss w0,w2,w0
026EE 400600 add.w w0,w0,w12
026F0 FB8003 ze w3,w0
026F2 100770 subr.w w0,#16,w14
I'm using C (GCC for dsPIC.)
I think we all need to know a lot more about the peculiarities of your processor to answer this question. Why can't it do divides by powers of two on signed integers? As far as I remember, the operation is nearly the same for both, using an arithmetic instead of a logical shift right. I.e.:
10/2:  00001010 goes to 00000101
-10/2: 11110110 goes to 11111011
(One caveat: an arithmetic shift rounds toward negative infinity, while C's signed division truncates toward zero, so for odd negative values the compiler has to emit a small fixup after the shift.)
Maybe you should write some simple code doing an unsigned divide and a signed divide and compare the compiled output.
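A minimal pair to compare (compile with optimization and inspect the assembly, e.g. gcc -O2 -S):

unsigned udiv8(unsigned x) { return x / 8; } /* usually a single logical shift  */
int      sdiv8(int x)      { return x / 8; } /* shift plus a sign-correction fixup */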
Also, benchmarking is a good idea. It doesn't need to be precise: just have an array of a few thousand numbers, start a timer, divide them a few million times, and time how long it takes. Maybe do it a few billion times if your processor is fast. E.g.:
int s_numbers[] = { /* etc. etc. */ };
int s_array_size = sizeof(s_numbers) / sizeof(s_numbers[0]);
unsigned int u_numbers[] = { /* etc. etc. */ };
unsigned int u_array_size = sizeof(u_numbers) / sizeof(u_numbers[0]);
int i;
int s_result;
unsigned int u_result;
/* Start timer. */
for (i = 0; i < 100000000; i++)
{
    s_result = s_numbers[i % s_array_size] / s_numbers[(i + 1) % s_array_size];
}
/* Stop timer and print difference. */
/* Repeat for unsigned integers. */
Written in a hurry to show the principle, please forgive any errors.
It won't give precise benchmarking but should give a general idea of which is faster.
I don't know much about the instruction set available on your processor, but a quick look makes me think it has instructions that may be used for both arithmetic and logical shifts. That should mean that shifting a signed value costs about the same as shifting an unsigned value, and dividing by powers of 2 using shifts should also cost the same for each. (My knowledge of this comes from a quick glance at some intrinsic functions for a C compiler that targets your processor family.)
That being said, if you are working with values that are to be interpreted as unsigned, you might as well declare them unsigned. For the last few years I've been using the types from stdint.h more and more, and I usually end up using the unsigned versions, because my values are either inherently unsigned or I'm just using them as bit arrays.
Generate assembly both ways and count cycles.
I'm going to guess the unsigned divides by powers of two are faster, because they can simply do a right shift as needed without worrying about sign extension.
As for disadvantages: detecting arithmetic overflow, values silently wrapping to huge numbers because an unsigned type can't go negative, etc. Nothing blocking, just different things to watch out for.

Fixed-point multiplication in a known range

I'm trying to multiply A*B in 16-bit fixed point while keeping as much accuracy as possible. A is 16-bit, in the unsigned integer range; B is divided by 1000 and always between 0.001 and 9.999. It's been a while since I dealt with problems like this, so:
I know I can just do A*B/1000 after moving to 32-bit variables, then strip back to 16 bits.
I'd like to make it faster than that
I'd like to do all the operations without moving to 32-bit (since I've got 16-bit multiplication only)
Is there any easy way to do that?
Edit: A will be between 0 and 4000, so all possible results are in the 16-bit range too.
Edit: B comes from the user, set digit by digit in the X.XXX mask; that's why the operation is /1000.
No, you have to go to 32 bits. In general, the product of two 16-bit numbers always needs up to 32 bits to hold the result.
You should check the instruction set of the CPU you're working on, because most multiply instructions on 16-bit machines have an option to return the result as a 32-bit integer directly.
This would help you a lot, because
short testfunction(short a, short b)
{
    int A32 = a;
    int B32 = b;
    return A32 * B32 / 1000;
}
would otherwise force the compiler to do a 32-bit * 32-bit multiply. On your machine this could be very slow, or even done in multiple steps using 16-bit multiplies only.
A little bit of inline assembly or even better a compiler intrinsic could speed things up a lot.
Here is an example for the Texas Instruments C64x+ DSP which has such intrinsics:
short test(short a, short b)
{
    int product = _mpy(a, b); // calculates product, returns 32-bit integer
    return product / 1000;
}
Another thought: you're dividing by 1000. Was that constant your choice? It would be much faster to use a power of two as the base for your fixed-point numbers, and 1024 is close. Why don't you
return (a * b) / 1024;
instead? The compiler can optimize this into a right shift by 10 bits, which ought to be much faster than reciprocal-multiplication tricks.
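If the change of base is acceptable, a minimal sketch with rounding (the +512 is half the divisor; the function name and ranges are illustrative, and whether the scale change from 1000 to 1024 is tolerable depends on the application):

#include <stdint.h>

/* b stored in 1/1024 steps instead of 1/1000; a in [0, 4000] keeps the result in 16 bits */
static uint16_t fixmul1024(uint16_t a, uint16_t b)
{
    uint32_t p = (uint32_t)a * b;        /* still a 16x16 -> 32 bit product */
    return (uint16_t)((p + 512) >> 10);  /* divide by 1024, rounded to nearest */
}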
