I found this code while I was learning how to make a Virtual Machine. But I haven't got a clue what this function does. Do any of you know what this function is doing?
void decode( int instr )
{
instrNum = (instr & 0xF000) >> 12;
reg1 = (instr & 0xF00 ) >> 8;
reg2 = (instr & 0xF0 ) >> 4;
reg3 = (instr & 0xF );
imm = (instr & 0xFF );
}
The variable instr = 1.
The function is saving specific sets of 4 bits (called nibbles) from the variable instr into other variables instrNum, reg1, etc (these other variables must have a global scope as they're not defined here).
Consider for example if instr was 0x1234
instrNum = (0x1234 & 0xF000) >> 12;
= (0x1000) >> 12;
= 1
reg1 = (0x1234 & 0xF00) >> 8;
= (0x0200) >> 8;
= 2
reg2 = (0x1234 & 0xF0) >> 4;
= (0x0030) >> 4;
= 3
reg3 = (0x1234 & 0xF);
= (0x0004);
= 4
imm = (0x1234 & 0xFF);
= (0x0034);
= 52
So it's taking each nibble of the variable instr and saving it into a separate variable. The last variable, imm, gets the low byte. & and >> are bitwise operators: AND masks off the bits you want to keep, and the right shift moves them down to the least significant position.
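As a quick check of the walkthrough above, here is a minimal sketch of the same decoding written without globals. The struct `decoded_instr` and the function `decode_fields` are hypothetical names introduced only for this example; the original code stores its results in global variables instead.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields; the original uses globals. */
struct decoded_instr {
    unsigned instrNum; /* bits 15..12 */
    unsigned reg1;     /* bits 11..8  */
    unsigned reg2;     /* bits 7..4   */
    unsigned reg3;     /* bits 3..0   */
    unsigned imm;      /* bits 7..0 (overlaps reg2 and reg3) */
};

static struct decoded_instr decode_fields(uint16_t instr)
{
    struct decoded_instr d;
    d.instrNum = (instr & 0xF000u) >> 12; /* keep top nibble, move it down */
    d.reg1     = (instr & 0x0F00u) >> 8;
    d.reg2     = (instr & 0x00F0u) >> 4;
    d.reg3     =  instr & 0x000Fu;        /* already at bit 0, no shift */
    d.imm      =  instr & 0x00FFu;        /* low byte */
    return d;
}
```

Calling `decode_fields(0x1234)` gives 1, 2, 3 and 4 for the four nibbles and 0x34 (52) for `imm`, matching the worked example.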
Why it's saving these is anyone's guess, we would need to know what type those variables are and what they're used for, but that's what is happening anyway
Those are bit operations, which are often used to compactly store some flags within a single integer. This function "reads" bits from the argument instr and writes the results to other fields.
This function seems to decode an instruction instr into a 4-bit instruction code (instrNum) and up to three 4-bit register codes (reg1 to reg3). Your virtual machine also seems to have an encoding for 8-bit immediate operands (imm). My guess at the VM's 16-bit instruction layout, based on the masks and shifts: bits 15..12 hold the instruction code, bits 11..8 reg1, bits 7..4 reg2 and bits 3..0 reg3, with imm occupying bits 7..0 in place of reg2 and reg3.
I'm trying to write a VM (LC-3), and on this ADD instruction I encountered this statement. Basically the "register0" is the DR register, but I don't really understand what is actually shifting and why 9. Also the AND operator with the 0x7 value.
|15|14|13|12|11|10|9|8|7|6|5|4|3|2|1|0|
| 0001 | DR | SR1 |0| 00| SR2 |
Could anyone please explain it to me in detail?
ADD {
/* destination register (DR) */
uint16_t r0 = (instr >> 9) & 0x7;
/* first operand (SR1) */
uint16_t r1 = (instr >> 6) & 0x7;
/* whether we are in immediate mode */
uint16_t imm_flag = (instr >> 5) & 0x1;
if (imm_flag) {
uint16_t imm5 = sign_extend(instr & 0x1F, 5);
reg[r0] = reg[r1] + imm5;
} else {
uint16_t r2 = instr & 0x7;
reg[r0] = reg[r1] + reg[r2];
}
update_flags(r0);
}
What it's doing is isolating the 3 bits that represent the DR register so they become a standalone number.
Let's say the entire sequence looks like this:
1110101101101011
^^^
DR
Shifting 9 bits right gives this:
1110101
and & 0x7 (bitwise AND) isolates the 3 lowest bits:
101
Similar operations are performed to isolate the values of SR1 and the immediate mode flag. Depending on that flag, SR2 may also be required, but as it's already in the lowest 3 bits, no shifting is needed.
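To make the shift-and-mask steps concrete, here is a small sketch that applies them to the example bit sequence 1110101101101011 (0xEB6B). The helper names `add_dr`, `add_sr1`, `add_imm_flag` and `add_imm5` are hypothetical, and `sign_extend` is one possible implementation of the helper the question calls but does not show.

```c
#include <stdint.h>

/* Sketch of a sign-extension helper: if the top bit of the bit_count-wide
   value is set, fill all higher bits with ones. */
static uint16_t sign_extend(uint16_t x, int bit_count)
{
    if ((x >> (bit_count - 1)) & 1u) {
        x |= (uint16_t)(0xFFFFu << bit_count);
    }
    return x;
}

/* DR lives in bits 11..9: shift it down to bit 0, keep 3 bits. */
static uint16_t add_dr(uint16_t instr)       { return (instr >> 9) & 0x7; }

/* SR1 lives in bits 8..6. */
static uint16_t add_sr1(uint16_t instr)      { return (instr >> 6) & 0x7; }

/* Bit 5 selects immediate mode. */
static uint16_t add_imm_flag(uint16_t instr) { return (instr >> 5) & 0x1; }

/* The 5-bit immediate is already in bits 4..0, so only masking is needed. */
static uint16_t add_imm5(uint16_t instr)     { return sign_extend(instr & 0x1F, 5); }
```

For 0xEB6B, `add_dr` returns 5 (binary 101), which is the value isolated step by step above.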
I am using GCC struct bit fields in an attempt to interpret 8-byte CAN message data. I wrote a small program as an example of one possible message layout. The code and the comments should describe my problem. I assigned the 8 bytes so that all 5 signals should equal 1. As the output shows on an Intel PC, that is hardly the case. All CAN data that I deal with is big endian, and the fact that the signals are almost never aligned on 8-bit boundaries makes htonl() and friends useless in this case. Does anyone know of a solution?
#include <stdio.h>
#include <netinet/in.h>
typedef union
{
unsigned char data[8];
struct {
unsigned int signal1 : 32;
unsigned int signal2 : 6;
unsigned int signal3 : 16;
unsigned int signal4 : 8;
unsigned int signal5 : 2;
} __attribute__((__packed__));
} _message1;
int main()
{
_message1 message1;
unsigned char incoming_data[8]; //This is how this message would come in from a CAN bus for all signals == 1
incoming_data[0] = 0x00;
incoming_data[1] = 0x00;
incoming_data[2] = 0x00;
incoming_data[3] = 0x01; //bit 1 of signal 1
incoming_data[4] = 0x04; //bit 1 of signal 2
incoming_data[5] = 0x00;
incoming_data[6] = 0x04; //bit 1 of signal 3
incoming_data[7] = 0x05; //bit 1 of signal 4 and signal 5
for(int i = 0; i < 8; ++i){
message1.data[i] = incoming_data[i];
}
printf("signal1 = %x\n", message1.signal1);
printf("signal2 = %x\n", message1.signal2);
printf("signal3 = %x\n", message1.signal3);
printf("signal4 = %x\n", message1.signal4);
printf("signal5 = %x\n", message1.signal5);
}
Because struct packing order varies between compilers and architectures, the best option is to use a helper function to pack/unpack the binary data instead.
For example:
#include <stdint.h>
static inline void message1_unpack(uint32_t *fields,
const unsigned char *buffer)
{
const uint64_t data = (((uint64_t)buffer[0]) << 56)
| (((uint64_t)buffer[1]) << 48)
| (((uint64_t)buffer[2]) << 40)
| (((uint64_t)buffer[3]) << 32)
| (((uint64_t)buffer[4]) << 24)
| (((uint64_t)buffer[5]) << 16)
| (((uint64_t)buffer[6]) << 8)
| ((uint64_t)buffer[7]);
fields[0] = data >> 32; /* Bits 32..63 */
fields[1] = (data >> 26) & 0x3F; /* Bits 26..31 */
fields[2] = (data >> 10) & 0xFFFF; /* Bits 10..25 */
fields[3] = (data >> 2) & 0xFF; /* Bits 2..9 */
fields[4] = data & 0x03; /* Bits 0..1 */
}
Note that because the consecutive bytes are interpreted as a single unsigned integer (in big-endian byte order), the above will be perfectly portable.
Instead of an array of fields, you could use a structure, of course; but it does not need to have any resemblance to the on-the-wire structure at all. However, if you have several different structures to unpack, an array of (maximum-width) fields usually turns out to be easier and more robust.
All sane compilers will optimize the above code just fine. In particular, GCC with -O2 does a very good job.
The inverse, packing those same fields to a buffer, is very similar:
static inline void message1_pack(unsigned char *buffer,
const uint32_t *fields)
{
const uint64_t data = (((uint64_t)(fields[0] )) << 32)
| (((uint64_t)(fields[1] & 0x3F )) << 26)
| (((uint64_t)(fields[2] & 0xFFFF )) << 10)
| (((uint64_t)(fields[3] & 0xFF )) << 2)
| ( (uint64_t)(fields[4] & 0x03 ) );
buffer[0] = data >> 56;
buffer[1] = data >> 48;
buffer[2] = data >> 40;
buffer[3] = data >> 32;
buffer[4] = data >> 24;
buffer[5] = data >> 16;
buffer[6] = data >> 8;
buffer[7] = data;
}
Note that the masks define the field length (0x03 = 0b11 (2 bits), 0x3F = 0b111111 (6 bits), 0xFF = 0b11111111 (8 bits), 0xFFFF = 0b1111111111111111 (16 bits)); and the shift amount depends on the bit position of the least significant bit in each field.
To verify that such functions work, pack, unpack, repack, and re-unpack a buffer that is all zeros except for one field set to all ones, and check that the data stays correct over both round trips. That usually suffices to detect the typical bugs (wrong bit shift amounts, typos in masks).
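The round-trip check described above could be sketched as follows. This is a table-driven variant written for this example rather than the exact functions from the answer: `unpack_fields`, `pack_fields`, `roundtrip_ok` and the `lsb`/`width` tables are hypothetical names, and the layout (32, 6, 16, 8 and 2 bits, most significant field first) is taken from the message in the question.

```c
#include <stdint.h>
#include <string.h>

#define NFIELDS 5
/* Bit position of each field's least significant bit, and its width. */
static const unsigned lsb[NFIELDS]   = { 32, 26, 10, 2, 0 };
static const unsigned width[NFIELDS] = { 32,  6, 16, 8, 2 };

/* Mask with the low w bits set (w is at most 32 in this layout). */
static uint64_t mask_of(unsigned w) { return (UINT64_C(1) << w) - 1; }

static void unpack_fields(uint32_t *fields, const unsigned char *buffer)
{
    uint64_t data = 0;
    for (int i = 0; i < 8; i++)
        data = (data << 8) | buffer[i];           /* big-endian byte order */
    for (int i = 0; i < NFIELDS; i++)
        fields[i] = (uint32_t)((data >> lsb[i]) & mask_of(width[i]));
}

static void pack_fields(unsigned char *buffer, const uint32_t *fields)
{
    uint64_t data = 0;
    for (int i = 0; i < NFIELDS; i++)
        data |= ((uint64_t)fields[i] & mask_of(width[i])) << lsb[i];
    for (int i = 0; i < 8; i++)
        buffer[i] = (unsigned char)(data >> (56 - 8 * i));
}

/* Returns 1 if a message with only field `which` set to all ones
   survives two pack/unpack round trips unchanged. */
static int roundtrip_ok(int which)
{
    uint32_t in[NFIELDS] = { 0 }, out[NFIELDS];
    unsigned char buf1[8], buf2[8];

    in[which] = (uint32_t)mask_of(width[which]);
    pack_fields(buf1, in);
    unpack_fields(out, buf1);
    if (memcmp(in, out, sizeof in) != 0)
        return 0;
    pack_fields(buf2, out);
    unpack_fields(out, buf2);
    return memcmp(buf1, buf2, sizeof buf1) == 0
        && memcmp(in, out, sizeof in) == 0;
}
```

Running `roundtrip_ok` for each of the five field indices exercises every mask and shift once. Unpacking the eight bytes from the question (00 00 00 01 04 00 04 05) with `unpack_fields` yields 1 for all five signals, which is the result the bit-field union failed to produce.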
Note that documentation will be key to ensure the code remains maintainable. I'd personally add comment blocks before each of the above functions, similar to
/* message1_unpack(): Unpack 8-byte message to 5 fields:
field[0]: Foobar. Bits 32..63.
field[1]: Buzz. Bits 26..31.
field[2]: Wahwah. Bits 10..25.
field[3]: Cheez. Bits 2..9.
field[4]: Blop. Bits 0..1.
*/
with the field "names" reflecting their names in documentation.
I want to read and write from/to an unsigned char according to the table below:
for example I have following variables:
unsigned char hsi_div = 0x01; /* HSI/2 */
unsigned char cpu_div = 0x05; /* Fmaster/32 */
I want to write hsi_div to bits 4,3 and cpu_div to bits 2,1,0 (imagine the whole char is named CLK_DIVR):
CLK_DIVR |= hsi_div << 4; //not correct!
CLK_DIVR |= cpu_div << 2; //not correct!
And let's say I want to read the register back to make sure I did it correctly:
if( ((CLK_DIVR << 4) - 1) & hsi_div) ) { /* SET OK */ }
if( ((CLK_DIVR << 2) - 1) & cpu_div) ) { /* SET OK */ }
Is there something wrong with my bitwise operations!? I do not get correct behaviour.
I assume CLK_DIVR is a hardware peripheral register which should be qualified volatile. Such registers should be set up with as few writes as possible. You change all write-able bits, so just
CLK_DIVR = (uint8_t)((hsi_div << 3) | (cpu_div << 0));
Note the use of a fixed-width type. That makes mentioning that it is an 8-bit register unnecessary. According to the excerpt, the upper bits are read-only, so they are not changed when writing. The cast keeps the compiler from issuing a truncation warning, which is one of the recommended warnings to always enable (included in -Wconversion for gcc).
The shift count is actually the bit the field starts (the LSbit). A shift count of 0 means "no shifting", so the shift-operator is not required. I still use it to clarify I meant the field starts at bit 0. Just let the compiler optimize, concentrate on writing maintainable code.
Note: Your code bit-ors into whatever is already in the register. Bit-or can only set bits, not clear them. Additionally, the shift counts were wrong.
Not sure, but if the excerpt is for an ARM Cortex-M CPU (STM32Fxxxx?), reducing external bus-cycles becomes more relevant, as the ARM can take quite some cycles for an access.
For the HSIDIV bit field you want:
hw_register = (hw_register & ~0x18) | (hsi_value & 0x03) << 0x03;
This clears bits 3 and 4 of the register, masks the new value to 2 bits wide, then shifts it to bit positions 3 and 4.
The CPUDIV field is:
hw_register = (hw_register & ~0x7) | (cpu_value & 7);
Reading the register:
hsi_value = (hw_register & 0x18) >> 3;
cpu_value = hw_register & 0x07;
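The clear-then-set pattern and the matching read-back can be sketched with a plain variable standing in for the hardware register. The macro and function names here (`HSIDIV_MASK`, `field_set`, `field_get` and so on) are hypothetical; the bit positions (HSIDIV in bits 4..3, CPUDIV in bits 2..0) are the ones used in the answers above.

```c
#include <stdint.h>

#define HSIDIV_SHIFT 3u
#define HSIDIV_MASK  (0x03u << HSIDIV_SHIFT)  /* bits 4..3 = 0x18 */
#define CPUDIV_SHIFT 0u
#define CPUDIV_MASK  (0x07u << CPUDIV_SHIFT)  /* bits 2..0 = 0x07 */

/* Return `reg` with the field selected by mask/shift replaced by `value`.
   Bits outside the field are left untouched. */
static uint8_t field_set(uint8_t reg, uint8_t mask, unsigned shift, uint8_t value)
{
    return (uint8_t)((reg & (uint8_t)~mask) | (((unsigned)value << shift) & mask));
}

/* Extract the field selected by mask/shift as a standalone number. */
static uint8_t field_get(uint8_t reg, uint8_t mask, unsigned shift)
{
    return (uint8_t)((reg & mask) >> shift);
}
```

Starting from a register value of 0xFF, setting HSIDIV to 0x01 gives 0xEF and then setting CPUDIV to 0x05 gives 0xED; a plain `|=` would have left both fields at all ones. On real hardware the register would be a volatile object, and you would normally compute the final value first and write it once.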
Just
CLK_DIVR |= hsi_div << 3;
CLK_DIVR |= cpu_div << 0;
Since hsi_div is a 2-bit value, you have to shift it left by three positions to skip over the CPUDIV field. cpu_div already sits at the low end of the register, so it needs no shift.
I built a virtual machine in C. And for this I have the Instruction
pushc <const>
I saved the command and the value in 32 Bit. The First 8 Bit are for the command and the rest for the value.
8 Bit -> Opcode
24 Bit -> Immediate value
For this I make a macro
#define PUSHC 1 //1 is for the command value in the Opcode
#define IMMEDIATE(x) ((x) & 0x00FFFFFF)
UPDATE:
#define SIGN_EXTEND(i) ((i) & 0x00800000 ? (i) | 0xFF000000 : (i))
Then, for testing, I load this into an unsigned int array:
Update:
unsigned int code[] = { (PUSHC << 24 | IMMEDIATE(2)),
(PUSHC << 24 | SIGN_EXTEND(-2)),
...};
later in my code I want to get the Immediate value of the pushc command and push this value to a stack...
I get every Instruction (IR) from the array and built my stack.
UPDATE:
void exec(unsigned int IR){
unsigned int opcode = (IR >> 24) & 0xff;
unsigned int imm = (IR & 0xffffff);
switch(opcode){
case PUSHC: {
stack[sp] = imm;
sp = sp + 1;
break;
}
}
...
}
}
Just use a bitwise AND to keep only the lower 24 bits, then use the result in the case:
const uint8_t opcode = (IR >> 24) & 0xff;
const uint32_t imm = (IR & 0xffffff);
switch(opcode)
{
case PUSHC:
stack[sp] = imm;
break;
}
I shifted around the extraction of the opcode to make the case easier to read.
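Masking alone returns the immediate as an unsigned 24-bit pattern, so a negative constant such as -2 comes back as 0xFFFFFE. If signed immediates are wanted, one possible approach is to sign-extend from bit 23 when decoding. The sketch below assumes two's-complement encoding of the 24-bit field; `encode`, `opcode_of` and `immediate_of` are hypothetical helper names, and the macros are restated with explicit unsigned types.

```c
#include <stdint.h>

#define PUSHC 1u
#define IMMEDIATE(x) ((uint32_t)(x) & 0x00FFFFFFu)

/* Pack an 8-bit opcode and a 24-bit immediate into one 32-bit word. */
static uint32_t encode(uint32_t opcode, int32_t value)
{
    return (opcode << 24) | IMMEDIATE(value);
}

static uint32_t opcode_of(uint32_t ir)
{
    return (ir >> 24) & 0xFFu;
}

/* Recover the signed immediate: flip bit 23, then subtract its weight.
   This maps 0x800000..0xFFFFFF onto the negative range without relying
   on implementation-defined conversions. */
static int32_t immediate_of(uint32_t ir)
{
    uint32_t imm = ir & 0x00FFFFFFu;
    return (int32_t)(imm ^ 0x00800000u) - 0x00800000;
}
```

With these helpers, `encode(PUSHC, -2)` produces 0x01FFFFFE, and `immediate_of` on that word returns -2 again. Note that sign extension belongs on the decoding side; applying it before packing would overwrite the opcode bits.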
I need to put 8 bytes, which I received on an arbitrary machine that my code runs on, into big-endian order. I believe that I can use the htobe64 function for that, but I'm not sure about its portability, i.e. the availability of endian.h across different machine architectures and operating systems when compiling my code. Is this a safe, i.e. portable, method to use, or is it better to use a different approach?
Please use the following, portable approach:
#include <stdint.h>
void write64be(unsigned char out[8], uint64_t in)
{
out[0] = in >> 56 & 0xff;
out[1] = in >> 48 & 0xff;
out[2] = in >> 40 & 0xff;
out[3] = in >> 32 & 0xff;
out[4] = in >> 24 & 0xff;
out[5] = in >> 16 & 0xff;
out[6] = in >> 8 & 0xff;
out[7] = in >> 0 & 0xff;
}
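The matching read direction follows the same idea. Below is a minimal sketch with a hypothetical `read64be` counterpart; `write64be` is restated in loop form only so that the block compiles on its own, and it produces the same bytes as the unrolled version above.

```c
#include <stdint.h>

/* Same result as the unrolled version: most significant byte first. */
static void write64be(unsigned char out[8], uint64_t in)
{
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)((in >> (56 - 8 * i)) & 0xff);
}

/* Hypothetical inverse: assemble a 64-bit value from 8 big-endian bytes. */
static uint64_t read64be(const unsigned char in[8])
{
    uint64_t value = 0;
    for (int i = 0; i < 8; i++)
        value = (value << 8) | in[i];
    return value;
}
```

Because both functions work on the numeric value through shifts rather than on its memory representation, they behave the same on little-endian and big-endian hosts. Writing 0x0102030405060708 yields the bytes 01 02 03 04 05 06 07 08, and reading them back returns the original value.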