rlcf instruction with pic 18F4550 in C compiler - c

I'm new to hardware programming with a C compiler for the PIC 18F4550 from Microchip.
My question is: can someone give me an example of how to rotate bits and read the resulting carry with the 'rlcf' instruction from C?
This instruction rotates the bits to the left and places the leftmost bit in the carry flag, and you should then read that value back from the carry.
I know how it works, but I cannot find any example code that fits the way I write my code.
This is the data input I receive. It must be converted into binary values and then rotated.
unsigned int red = 1206420333240;
Thanks in advance!

You don't have access to carry bits in a C compiler, you'd have to use assembly to get to them.
Also your value is too big for an unsigned int on a PIC18, which is a 16 bit number with a maximum of 65535 decimal, 0xFFFF hex.
How you write assembly inside a C file varies depending on the compiler. In Hitech C, the following syntax is valid
asm("RLCF REG,0,0");//replace REG with your register and consider the d and a flags.
asm("BC 5"); //branch if carry
But note that this is rotating one byte, not a two-byte number. You need to chain together two rotates of two registers to rotate a 16-bit number.
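The chaining of two rotates can be sketched in portable C, which may help when checking results before dropping to assembly. This is only an emulation under assumptions: rlcf8 and rlcf16 are hypothetical helper names, and the carry is kept in an ordinary variable because C cannot read the hardware carry flag.

```c
#include <stdint.h>

/* Emulates one RLCF step on a single byte: bit 7 moves into *carry,
   and the old *carry (0 or 1) moves into bit 0. Hypothetical helper. */
void rlcf8(uint8_t *reg, uint8_t *carry)
{
    uint8_t out = (uint8_t)(*reg >> 7);
    *reg = (uint8_t)((uint8_t)(*reg << 1) | (*carry & 1u));
    *carry = out;
}

/* Rotates a 16-bit value held in two bytes: low byte first, so that its
   outgoing bit becomes the incoming bit of the high byte. */
void rlcf16(uint8_t *hi, uint8_t *lo, uint8_t *carry)
{
    rlcf8(lo, carry);
    rlcf8(hi, carry);
}
```

Starting from hi = 0x40, lo = 0x81 and a clear carry, one call to rlcf16 leaves hi = 0x81, lo = 0x02 and carry = 0.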

Related

Operating Rightmost/Leftmost n-Bits, Not All the Bits of A Integer Type Data Variable

In a programming-task, I have to add a smaller integer in variable B (data type int)
to a larger integer (20 decimal integer) in variable A (data type long long int),
then compare A with variable C which is also as large integer (data type long long int) as A.
What I realized, since I add a smaller B to A,
I don't need to check all the digits of A when I compare that with C, in other words, we don't need to check all the bits of A and C.
Given that I know, how many bits from the right I need to check, say n-bits,
is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language?
Because for comparing all the bits take more time, and since I am working with large number, the program becomes slower.
Every time I search on Google, bit-masking appears, which uses all the bits of A and C. That doesn't do what I am asking for, so probably I am not using the correct terminology. Please help.
Addition:
Initial comments of this post made me think there is no way but i found the following -
Bit Manipulation by University of Colorado Boulder
(#cuboulder, after 7:45)
...the bit band region is accessed via a bit band alias, each bit in a
supported bit band region has its own unique address and we can access
that bit using a pointer to its bit band alias location, the least
significant bit in an alias location can be set or cleared and that
will be mapped to the bit in the corresponding data or peripheral
memory, unfortunately this will not help you if you need to write to
multiple bit locations in memory dependent operations only allow a
single bit to be cleared or set...
Is the above what I am asking for? If yes, then
where can I find the details as a beginner?
Updated question:
Is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language (or any other language) that makes the program faster?
Your assumption that comparing fewer bits is faster might be true in some cases but is probably not true in most cases.
I'm only familiar with x86 CPUs. An x86-64 processor has 64 bit wide registers. These can be accessed as 64 bit registers, but the lower bits can also be accessed as 32, 16 and 8 bit registers. There are processor instructions which work with the 64, 32, 16 or 8 bit part of the registers. Comparing 8 bits is one instruction but so is comparing 64 bits.
If using the 32 bit comparison would be faster than the 64 bit comparison you could gain some speed. But it seems like there is no speed difference for current processor generations. (Check out the "cmp" instruction with the link to uops.info from #harold.)
If your long long data type is actually bigger than the word size of your processor, then it's a different story. E.g. if your long long is 64 bit but you are on a 32 bit processor then these values do not fit in one register and you would need multiple instructions. So if you know that comparing only the lower 32 bits would be enough this could save some time.
Also note that comparing only e.g. 20 bits would actually take more time than comparing 32 bits. You would have to mask off the 12 highest bits in addition to comparing 32 bits. So you would need a comparison and a bitwise and instruction.
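The compare-plus-mask idea can be written in C as follows. This is a sketch only: low_bits_equal is a hypothetical name, and whether it is any faster than a plain 64 bit comparison depends on the processor.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the n least significant bits of a and c are equal.
   The XOR marks differing bits; the mask discards the bits above bit n-1. */
bool low_bits_equal(uint64_t a, uint64_t c, unsigned n)
{
    /* Shifting by 64 would be undefined, so treat n >= 64 as "all bits". */
    uint64_t mask = (n >= 64) ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
    return ((a ^ c) & mask) == 0;
}
```

Note that this costs an XOR, an AND and a test, which is more work than the single comparison a == c.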
As you see this is very processor specific, and you are at the processor's opcode level. As #RawkFist wrote in his comment you could try to get the C compiler to create such instructions but that does not automatically mean that this is even faster.
All of this is only relevant if these operations are executed a lot. I'm not sure what you are doing. If e.g. you add many values B to A and compare them to C each time it might be faster to start with C, subtract the B values from it and compare with 0. Because the compare-operation works internally like a subtraction. So instead of an add and a compare instruction a single subtraction would be enough within the loop. But modern CPUs and compilers are very smart and optimize a lot. So maybe the compiler automatically performs such or similar optimizations.
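A minimal sketch of the subtract-and-compare-with-zero idea, assuming the task is to add a series of values B to A and find the point where the total reaches C. The function name first_match and the array interface are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first element after which a plus the running
   sum of b[0..i] equals c, or -1 if that never happens. */
ptrdiff_t first_match(uint64_t a, uint64_t c, const uint32_t *b, size_t n)
{
    uint64_t remaining = c - a;   /* computed once, outside the loop */

    for (size_t i = 0; i < n; i++) {
        remaining -= b[i];        /* one subtraction per step ... */
        if (remaining == 0) {     /* ... and a test against zero */
            return (ptrdiff_t)i;
        }
    }
    return -1;
}
```

Because the arithmetic is unsigned, it wraps modulo 2^64, so remaining == 0 holds exactly when a plus the running sum equals c modulo 2^64. An optimizing compiler may well produce similar code from the straightforward add-and-compare loop.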
Try this question.
Is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language (or any other language) that makes the program faster?
Yes - when A + B != C. We can short-cut the comparison once a difference is found: from least to most significant.
No - when A + B == C. All bits need comparison.
Now back to OP's original question
Is there a way/technique to check only those specific n-bits from the right (not all the bits of A, C) to make the program faster in c programming language (or any other language) that makes the program faster?
No. In order to do so, we need to out-think the compiler. A well enabled compiler itself will notice any "tricks" available for long long + (signed char)int == long long and emit efficient code.
Yet what about really long compares? How about a custom uint1000000 for A and C?
For long compares of a custom type, a quick compare can be had.
First, select a fast working type. unsigned is a prime candidate.
typedef unsigned ufast;
Now define the wide integer.
#include <limits.h>
#include <stdbool.h>
#define UINT1000000_N (1000000/(sizeof(ufast) * CHAR_BIT))
typedef struct {
    // Least significant first
    ufast digit[UINT1000000_N];
} uint1000000;
Perform the addition and compare one "digit" at a time.
bool uint1000000_fast_offset_compare(const uint1000000 *A, unsigned B,
                                     const uint1000000 *C) {
    ufast carry = B;
    for (unsigned i = 0; i < UINT1000000_N; i++) {
        ufast sum = A->digit[i] + carry;
        if (sum != C->digit[i]) {
            return false;
        }
        carry = sum < A->digit[i];
    }
    return true;
}

How do you convert the remainder of a division operation to a fixed point in C?

I understand the concept of fixed point pretty well at this point, but I'm having trouble making a logical jump.
I'm working with M68000 CPUs using gcc with no standard libraries of any sort. Using DIVU/DIVS opcodes, I can obtain the quotient and the remainder. Given a Q16.16 fixed point value stored in an unsigned 32bit memory space, I know I can put the quotient in the upper 16 bits. However, how does one convert the integer remainder into the fractional portion of the fixed point value?
I'm sure this is something simple and I'm just missing it. Any help would be greatly appreciated.
The way to think about it is that fixed point numbers are actually integers that hold the value of your number times some fixed multiplier. You want to build your fixed point operations out of the integer operations you have available in your hardware.
So for a 16.16 fixed-point format, your multiplier is 65536 (2^16), so if you want to do a divide c = a/b, the numbers (integers) you have to work with are actually a' = a * 65536 and b' = b * 65536 and you want to find c' = c * 65536. So substituting into the desired c = a/b, you have
c'/65536 = (a'/65536) / (b'/65536) = a'/b'
c' = 65536 * a' / b'
So you actually want to first (integer) multiply the fixed-point value of a by 65536 (left shift by 16), then do an integer divide by the fixed point value of b, and that will give you the fixed point value of c. The issue is that the first multiply will almost certainly overflow 32 bits, so you need a 64 bit (actually only 48 bit) intermediate. So if you're using a 68020+ with a 64/32 DIVS.L instruction (divides a 64 bit value in a pair of registers by a 32 bit value), you're fine. You don't need the remainder at all.
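In C, the formula c' = 65536 * a' / b' with a wide intermediate might look like the sketch below. The type name q16_16 and the function name q16_div are assumptions, the values are treated as unsigned, and b must be nonzero. On a CPU without a wide divide, the compiler has to synthesize the 64 bit division, typically through a runtime helper.

```c
#include <stdint.h>

typedef uint32_t q16_16;   /* unsigned Q16.16: value = raw / 65536 */

/* Divides two Q16.16 values. The dividend is widened to 64 bits before
   the shift so that a' * 65536 cannot overflow. b must not be zero. */
q16_16 q16_div(q16_16 a, q16_16 b)
{
    return (q16_16)(((uint64_t)a << 16) / b);
}
```

For example, dividing 3.0 (0x00030000) by 2.0 (0x00020000) gives 0x00018000, which is 1.5.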
If you're using a pure 68000 that doesn't have the wide divide, you'll need to do 16-bit long division on the values (where you use 16 bit numbers as "digits", so you're dividing a 3-"digit" number by a 2-"digit" one)
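For the simpler case in the question, where two plain 16 bit integers are divided and both quotient and remainder are available, the fractional part can be obtained with a second division: shift the remainder left by 16 and divide by the same divisor. The sketch below assumes unsigned operands and a nonzero divisor; ratio_to_q16_16 is a hypothetical name. Since the remainder is smaller than the divisor, the second quotient always fits in 16 bits, which matches what a 32/16 divide such as DIVU can deliver.

```c
#include <stdint.h>

/* Converts n / d into unsigned Q16.16 using only 32/16 style divisions. */
uint32_t ratio_to_q16_16(uint16_t n, uint16_t d)
{
    uint32_t q = (uint32_t)n / d;    /* integer part (the quotient)  */
    uint32_t r = (uint32_t)n % d;    /* remainder, always r < d      */
    uint32_t frac = (r << 16) / d;   /* r < d, so frac < 65536       */

    return (q << 16) | frac;
}
```

For example, 7 / 2 yields 0x00038000 (3.5), and 1 / 3 yields 0x00005555, the closest Q16.16 value below one third.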

"Bit-fields are assigned left to right on some machines and right to left on others"- unable to get the concept from "The C Programming Language" book

I was going through the text "The C Programming Language" by Kernighan and Ritchie. While discussing about bit-fields at the end of that section, the authors say:
"Fields are assigned left to right on some machines and right to left on others. This means that although fields are useful for maintaining internally-defined data structures, the question of which end comes first has to be carefully considered when picking apart externally-defined data; programs that depend on such things are not portable."
- The C Programming Language [2e] by Kernighan & Ritchie [Section 6.9, p.150]
Strictly I do not get the meaning of these lines. Can anyone please explain me with a possible diagram?
PS: Well I have taken a computer organization and architecture course. I know how computers deal with bits and bytes. In a computer system, the smallest unit of information is a single bit which can be either 0 or 1. 8 such bits form a byte. Memories are byte-addressable, which means that each byte in the memory has an address associated with it. But usually, the processors have word lengths as 2 bytes (very old systems),4 bytes, 8 bytes... This means in one memory cycle, the CPU can take up a word length number of bytes from the main memory and put it inside its registers. Now how these bytes are placed in registers depends on the endianness of the system.
But I do not get what the authors mean by "left to right" or "right to left". The words seem like they are related to the endianness but endianness depends on the CPU and C compilers have nothing to do with it... The question which comes to my mind is "left to right" of "what"? What object are the authors referring to?
When a structure contains bit-fields, the C implementation uses some storage unit to hold them (or multiple storage units if needed). The storage unit might be one eight-bit byte or it might be four bytes, or it might be other sizes—this is a determination made by each C implementation. The C standard only requires that it be addressable, which effectively means it has to be a whole number of bytes.
Once we have a storage unit, it is some number of bits. Say it is 32 bits, and number the bits from 31 to 0, where, if we consider the bits to represent a binary numeral, bit 0 represents 2^0, and bit 31 represents 2^31. Note that Kernighan and Ritchie are imprecise to use “left” and “right” here. There is no inherent left or right. We usually write numerals with the most significant digits on the left, so we might consider bit 31 to be the leftmost and bit 0 to be the rightmost.
Now we have a storage unit with some number of bits and some labeling for those bits (31 to 0 or left to right). Say you want to put two bit-fields in them, say fields of width 7 and 5.
Which 7 of the bits from bit 31 to bit 0 are used for the first field? Which 5 of the bits are used for the second field?
We could use bits 31-25 for the first field and bits 24-20 for the second field. Or we could use bits 6-0 for the first field and bits 11-7 for the second field.
In theory, we could also use bits 27-21 for the first field and bits 15-11 for the second field. However, the C standard does say that “If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit” (C 2018 6.7.2.1 11). “Adjacent” is not formally defined, but we can assume it means consecutively numbered bits. So, if the C implementation puts the first field in bits 31-25, it is required to put the second field in bits 24-20. Conversely, if it puts the first field in bits 6-0, it must put the second field in 11-7.
Thus, the C standard requires an implementation to arrange successive bit-fields in a storage unit from left-to-right or from right-to-left, but it does not say which.
(I do not see anything in the standard that says the first field must start at one end of the storage unit or the other, rather than somewhere in the middle. That would lead to wasting some bits.)
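To see which choice a particular compiler makes for the 7 bit and 5 bit fields from this example, the storage unit can be copied out and inspected. This is a sketch under assumptions: the names two_fields and raw_bits are hypothetical, and the result also depends on the byte order of the machine.

```c
#include <string.h>

struct two_fields {
    unsigned int first  : 7;
    unsigned int second : 5;
};

/* Returns the leading bytes of the struct reinterpreted as an unsigned int,
   so the bit positions chosen by the implementation can be inspected. */
unsigned int raw_bits(unsigned int first, unsigned int second)
{
    struct two_fields f;
    unsigned int raw = 0;
    size_t n = sizeof f < sizeof raw ? sizeof f : sizeof raw;

    memset(&f, 0, sizeof f);
    f.first = first;
    f.second = second;
    memcpy(&raw, &f, n);
    return raw;
}
```

Printing raw_bits(0x7F, 0) and raw_bits(0, 0x1F) with %x shows where each field landed. The two results differ between implementations, but they should never share a set bit.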
When you write:
struct {
    unsigned int version: 4;
    unsigned int length: 4;
    unsigned char dcsn;
};
you end up with a big headache you weren't expecting because your code is non-portable.
When you set version to 4 and length to 5, some systems may set the first byte of the structure to 0x45 and other systems may set the first byte of the structure to 0x54.
When I went to college this thing was #ifdef'd as follows (incorrect):
struct {
#if BIG_ENDIAN
    unsigned int version: 4;
    unsigned int length: 4;
#else
    unsigned int length: 4;
    unsigned int version: 4;
#endif
    unsigned char dcsn;
};
but this is still rolling the dice as there's no rule that the order of the bits in the bytes in a bitfield corresponds to the order of bytes in the word in the machine. I would not be surprised that when you cross-compile the bit order in the struct comes from the host machine's rules while the bit order of integers comes from the target machine's rules (as it must). In theory the code could be corrected by having a separate #ifdef for BIG_ENDIAN_BITFIELD but I've never seen it done.
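A different technique, shown here only as an alternative sketch, is to avoid bit-fields for externally defined data and pack the byte with shifts and masks, whose results are fully defined by the C standard for unsigned values. The function names are hypothetical, and the layout (version in the high four bits, so that version 4 and length 5 give 0x45) is an assumption about the external format.

```c
#include <stdint.h>

/* Packs two 4-bit values into one byte: version in bits 7-4,
   length in bits 3-0. The layout is an assumed external format. */
uint8_t pack_version_length(uint8_t version, uint8_t length)
{
    return (uint8_t)(((version & 0x0Fu) << 4) | (length & 0x0Fu));
}

uint8_t unpack_version(uint8_t byte)
{
    return (uint8_t)(byte >> 4);
}

uint8_t unpack_length(uint8_t byte)
{
    return (uint8_t)(byte & 0x0Fu);
}
```

With this approach pack_version_length(4, 5) is 0x45 on every conforming compiler, regardless of how that compiler orders bit-fields.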
Here is some demonstration code. The only goal is to demonstrate what you are asking about. Clean coding etc. is neglected.
#include <stdio.h>
#include <stdint.h>
union
{
    uint32_t Everything;
    struct
    {
        uint32_t FirstMentionedBit : 1;
        uint32_t FewOTherBits :30;
        uint32_t LastMentionedBit : 1;
    } bitfield;
} Demonstration;
int main()
{
    Demonstration.Everything = 0;
    Demonstration.bitfield.LastMentionedBit = 1;
    printf("%x\n", Demonstration.Everything);
    Demonstration.Everything = 0;
    Demonstration.bitfield.FirstMentionedBit = 1;
    printf("%x\n", Demonstration.Everything);
    return 0;
}
If you use this here https://www.tutorialspoint.com/compile_c_online.php
the output is
80000000
1
But in other environments it might easily be
1
80000000
This is because compilers are free to consider the first mentioned bit the MSB or the LSB and correspondingly the last mentioned bit to be the LSB or MSB.
And that is what your quote describes.

Explain how specific C #define works

I have been looking at some of the codes at http://www.netlib.org/fdlibm/ to see how some functions work and I was looking at the code for e_log.c and in some parts of the code it says:
hx = __HI(x); /* high word of x */
lx = __LO(x); /* low word of x */
The code for __HI(x) and __LO(x) is:
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
which I really don't understand because I am not familiar with this type of C. Can someone please explain to me what __HI(x) and __LO(x) are doing?
Also later in the code for the function there is a statement:
__HI(x) = hx|(i^0x3ff00000);
Can someone please explain to me:
how is it possible to make a function equal to something (I generally work with python so I don't really know what is going on)?
what are __HI(x) and __LO(x) doing?
what does the program mean by "high word" and "low word" of x?
The final purpose of my analysis is understanding this code in order to port it into a Python implementation
These macros use compiler-dependent properties to access the representations of double types.
In C, all objects other than bit-fields are represented as sequences of bytes. The fdlibm code you are looking at is designed for implementations where int is four bytes and the double type is represented using eight bytes in a format defined by the IEEE-754 floating-point specification. That format is called binary64 or IEEE-754 basic 64-bit binary floating-point. It is also designed for an implementation where the C compiler guarantees that aliasing via pointer conversions is supported. (This is not guaranteed by the C standard, but C implementations may support it.)
Consider a double object named x. Given these macros:
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
When __LO(x) is used in source code, it is replaced by *(int*)&x. The &x takes the address of x. The address of x has type double *. The cast (int *) converts this to int *, a pointer to an int. Then * dereferences this pointer, resulting in a reference to the int that is at the low-address part of x.
When __HI(x) is used in the source code, (int*)&x again points to the low-address part of x. Adding 1 changes it to point to the high-address part. Then * dereferences this, resulting in a reference to the int that is at the high-address part.
The routines in fdlibm are special mathematical routines. To operate, they need to examine and modify the bytes that represent double values. The __LO and __HI macros give them this access.
These definitions of __HI and __LO work for implementations that store the double values in little-endian order (with the “least significant” part of the double in the lower-addressed memory location). The fdlibm code may contain alternate definitions for big-endian systems, likely selected by some #if statement.
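A way to read the same two words without relying on pointer aliasing or on byte order is to copy the bytes of the double into a 64 bit integer and split that integer arithmetically. This is a sketch, not fdlibm's code: high_word and low_word are hypothetical names, and it assumes double is the 64 bit IEEE-754 binary64 format with the same byte order as uint64_t.

```c
#include <stdint.h>
#include <string.h>

/* Upper 32 bits of the binary64 pattern: sign, exponent and the top
   20 bits of the significand. */
uint32_t high_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* defined way to reinterpret bytes */
    return (uint32_t)(bits >> 32);
}

/* Lower 32 bits of the binary64 pattern: the rest of the significand. */
uint32_t low_word(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return (uint32_t)(bits & 0xFFFFFFFFu);
}
```

Under those assumptions high_word(1.0) is 0x3ff00000 and low_word(1.0) is 0. The same shift and mask carry over directly to a Python port once the bit pattern of the float is available as an integer.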
In the code __HI(x) = hx|(i^0x3ff00000);, the value 0x3ff00000 is the high word of the double value 1.0: the exponent field holds the bias 0x3ff (1023) and the upper significand bits are zero. Without context, we cannot say precisely what is happening here, but the code appears to be merging hx with some value from i. It is likely completing some computation of the bytes representing a new double value it is creating and storing those bytes in the “high” part of a double object.
I add a reply to integrate the one already present (not substitute).
hx = __HI(x); /* high word of x */
lx = __LO(x); /* low word of x */
Comments are useful... even if in this case the macro name could be clear enough. "high" and "low" refer to the two halves of an integer representation, typically a 16 or 32 bit one, because the halves of an 8-bit value are called "nibbles".
If we take a 16-bit unsigned integer which can range from 0 to 65535, or in hex 0x0000 to 0xFFFF, for example 0x1234, the two halves are:
0x1234
^^-------------------- lower half, or "low"
^^---------------------- upper half, or "high"
Note that "lower" means the less significant part. The correct way to get the two halves, assuming 16 bits, is to do a bitwise AND with 0xFF to get the low half, and to shift right by 8 bits (divide by 256) to get the high half.
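The AND and shift for the 16 bit case can be sketched as two small functions. The names lo8 and hi8 are hypothetical.

```c
#include <stdint.h>

/* Less significant half of a 16-bit value. */
uint8_t lo8(uint16_t v)
{
    return (uint8_t)(v & 0xFFu);
}

/* More significant half of a 16-bit value. */
uint8_t hi8(uint16_t v)
{
    return (uint8_t)(v >> 8);
}
```

For 0x1234 these return 0x34 and 0x12 respectively, independent of how the machine stores the bytes in memory.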
Now, inside a CPU the number 0x1234 is written in two consecutive locations, either as 0x12 then 0x34 if big-endian, or 0x34 then 0x12 if little-endian. Given this, other ways are possible to read single halves, reading the correct one directly from memory without calculation. To get the lo() of 0x1234 in a little endian machine, it is possible to read the single byte in the first location.
From the question:
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
Here both macros peek directly into memory (the less sure way): __LO reads the int at the lowest address of x, and __HI reads the int right after it. This means that the value to be split in two is twice the size of an int: if int is 32 bits, the value being split is 64 bits long. And there is another caveat: those macros can read the halves, but they can also be used to write the two halves separately. In fact, from the question:
__HI(x) = hx|(i^0x3ff00000);
the result is to set only the HI part (upper, most significant) of x. Note also the value used, 0x3ff00000, which is a 32-bit constant, consistent with x being 64 bits long and each half being 32 bits.
Hope this is clear enough to translate C to Python. You can work with the 64-bit integer that holds the bit pattern of x. To get the LO() part, use a bitwise AND with 0xFFFFFFFF; to get HI(), shift right by 32 bits or do the equivalent division.
When HI and LO are to the left of an assignment, only that half of the value is written, and you should construct separately the two halves and sum them up (or bitwise or them together).
Hope it helps...
#define A B
is a preprocessor directive that substitutes literal A with literal B all over the source code before the compilation.
#define A(x) B
is a function-like preprocessor macro which uses a parameter x in order to do a parameterized preprocessor substitution. In this case, B can be a function of x as well.
Your macros
#define __HI(x) *(1+(int*)&x)
#define __LO(x) *(int*)&x
// called as
__HI(x) = hx|(i^0x3ff00000);
Since it is just a matter of code substitution, the assignment is perfectly legit. Why? Because in both cases the macro is replaced by an lvalue, an expression that designates an object and can therefore appear on the left of an assignment.
That lvalue is in both cases an object of type int:
take x's address
cast it to a pointer to int
dereference it (in case of __LO())
Add 1 and then dereference it in case of __HI ().
What it will actually point to depends on the architecture, because the size of int, and therefore the step taken by the pointer arithmetic, is architecture dependent. Also endianness has to be taken into account.
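The point that such a macro expands to something assignable can be tried out without touching a double. The sketch below applies the same pattern to an array of two int, which avoids the aliasing question; LO_OF, HI_OF and demo_assign_through_macro are hypothetical names, not part of fdlibm.

```c
/* Same shape as __LO and __HI, with extra parentheses for safety. */
#define LO_OF(p) (*(int *)&(p))
#define HI_OF(p) (*(1 + (int *)&(p)))

/* Writes through the macro and returns the element that was changed. */
int demo_assign_through_macro(void)
{
    int pair[2] = {10, 20};

    /* After preprocessing this line reads: (*(1 + (int *)&(pair))) = 99; */
    HI_OF(pair) = 99;

    return pair[1] + LO_OF(pair) - 10;   /* 99 + 10 - 10 */
}
```

The assignment compiles because the expansion is a dereferenced pointer, which names a memory location rather than a temporary value.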
What we can say is that they are designed in order to access the lower and the higher halves of a data type whose size is 2*sizeof (int) (so, if for example int is 32 bits wide, they will allow access to the lower 32 bits and to the upper 32 bits). Furthermore, from the macro definitions we understand that it is a little-endian architecture (the least significant part comes first).
In order to port code containing these macros to Python you will need to do it at a higher level, since Python does not support pointers.
These tips don't solve your specific task, but provide to you a working method for this task and similar:
A way to understand what a macro does is checking how it is actually translated by the preprocessor. This can be done on most compilers through the -E compiler option.
Use a debugger to understand the functionality: set a breakpoint just before the call to the macro, and analyze its effects on addresses and variables.

AVR uint8_t doesn't get correct value

I have a uint8_t that should contain the result of a bitwise calculation. The debugger says the variable is set correctly, but when I check the memory, the variable is always 0. The code proceeds as if the variable were 0, no matter what the debugger tells me. Here's the code:
temp = (path_table & (1 << current_bit)) >> current_bit;
//temp is always 0, debugger shows correct value
if (temp > 0) {
DS18B20_send_bit(pin, 0x01);
} else {
DS18B20_send_bit(pin, 0x00);
}
Temp's a uint8_t, path_table's a uint64_t and current_bit's a uint8_t. I've tried to make them all uint64_t but nothing changed. I've also tried using unsigned long long int instead. Nothing again.
The code always enters the else clause.
Chip's Atmega4809, and uses uint64_t in other parts of the code with no issues.
Note - If anyone knows a more efficient/compact way to extract a single bit from a variable I would really appreciate it if you could share ^^
1 is an integer constant, of type int. The expression 1 << current_bit also has type int, but for 16-bit int, the result of that expression is undefined when current_bit is larger than 14. The behavior being undefined in your case, then, it is plausible that your debugger presents results for the overall expression that seem inconsistent with the observed behavior. If you used an unsigned int constant instead, i.e. 1u, then the shift would be well defined for current_bit up to 15, but shifting by 16 or more (the width of a 16-bit unsigned int) would still be undefined, so the upper bits of path_table could not be tested that way either.
Solve this problem by performing the computation in a type wide enough to hold the result. Here's a compact, correct, and pretty clear way to correct your code to do that:
DS18B20_send_bit(pin, (path_table & (((uint64_t) 1) << current_bit)) != 0);
Or if path_table has an unsigned type then I prefer this, though it's more of a departure from your original:
DS18B20_send_bit(pin, (path_table >> current_bit) & 1);
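Wrapped in a function, the shift-then-mask form might look like the sketch below. The name get_bit is hypothetical, and bit is assumed to be in the range 0 to 63, because shifting a 64 bit value by 64 or more is undefined.

```c
#include <stdint.h>

/* Returns bit number `bit` (0 = least significant) of value as 0 or 1.
   The shift is done on the uint64_t itself, so no narrow int is involved. */
uint8_t get_bit(uint64_t value, uint8_t bit)
{
    return (uint8_t)((value >> bit) & 1u);
}
```

A call such as DS18B20_send_bit(pin, get_bit(path_table, current_bit)) would then replace the if/else from the question.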
Realization #1 here is that the AVR core is 1980s-1990s technology. It is not an x64 PC that chews 64 bit numbers for breakfast, but an extremely inefficient 8-bit MCU. As such:
It likes 8 bit arithmetic.
It will struggle with 16 bit arithmetic, by doing tricks with 16 bit index registers, double accumulators or whatever 8 bit core tricks it prefers to do.
It will literally take ages to execute 32 bit arithmetic, by invoking software libraries inline.
It will probably melt through the floor if attempting 64 bit arithmetic.
Before you do anything else, you need to get rid of all 64 bit arithmetic and radically minimize the use of 32 bit arithmetic. Period. There should be no single variable of uint64_t in your code or you are doing it very very wrong.
With this realization also comes the fact that 8 bit MCUs typically have an int type which is 16 bits.
In the code 1<<current_bit, the integer constant 1 is of type int. Meaning that if current_bit is 15 or larger, you will shift bits into the sign bit of this temporary int. This is always a bug. Strictly speaking this is undefined behavior. In practice, you might end up with random change of sign of your numbers.
To avoid this, never use any form of bitwise operators on signed numbers. When mixing integer constants such as 1 with bitwise operators, change them to 1u to avoid bugs like the one mentioned.
If anyone knows a more efficient/compact way to extract a single bit from a variable i would really appreciate if you could share
The most efficient way in C is: uint8_t variable; ... if(variable & (1u << bits)). This should translate to the relevant "branch if bit set" instruction.
My general advice would be to find your tool chain's disassembler and look at the machine code that the C code actually generated. You don't have to be an assembler guru to read it; peeking at the instruction set should be enough.

Resources