C - Stripping least significant byte [duplicate]

I am working on an assignment and it is asking me to calculate a checksum by stripping off the least significant byte of a one's complement version of an integer...
This is the part of the assignment outline I am confused by:
"The CHECKSUM field (MM) value is calculated by taking the least significant byte of the 1’s Complement value of the sum of the COUNT, ADDRESS
and DATA fields of the record"
I'm a little unclear on what this means, as I haven't really worked with one's complements or LSBs in C.
What I have so far is:
int checkSum(int count, int address, char* data)
{
    int i = 0;
    int dataTotal = 0;
    for (i = 0; i < strlen(data); i += 2)
    {
        dataTotal += (getIntFromHex(data[i]) * 16) + getIntFromHex(data[i + 1]);
    }
    int checksum = ~(count + address + dataTotal) & 1;
    printf("Checksum: %.2X\n", checksum);
    return checksum;
}
I didn't really expect this to work but I've done some research and this is what I came up with.
I need some clarification on what is meant by the least significant byte.
P.S. The for loop is simply there to total up the data. It isn't important here, but the code uses the variable, so I copied the whole thing to avoid confusion.

I need some clarification on what is meant by the least significant byte.
The least significant byte means the number mod 256, a result from zero to 255.
unsigned leastSignificantByte(unsigned j)
{
    return j & 0xff;
}
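For example, applied to the checkSum function in the question, the final mask should cover the whole low byte rather than just the low bit. A minimal sketch of the corrected line, keeping the question's variable names:
int checksum = ~(count + address + dataTotal) & 0xff; /* 0xff keeps the low byte; & 1 kept only the low bit */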

Related

How to store a binary string into uint8_t array bits? [closed]

I have a binary string (only 0s and 1s), "0101011111011111000001001001110110". For Huffman encoding, I want to store each char of the string as a bit in a uint8_t array.
If I write the binary string as-is into a file, it occupies 35 bytes. If each binary char of the string is stored as a bit in a uint8_t array, it fits in ~5 bytes.
#include <stdio.h>
#include <stdint.h>

static uint8_t out_buffer[1024];
static uint32_t bit_pos = 0;

void printbuffer(void)
{
    printf("Just printing bits\n");
    uint32_t i;
    for (i = 0; i < bit_pos; i++) {
        printf("%c", (out_buffer[i / 8] & 1 << (i % 8)) ? '1' : '0');
    }
}

void append_to_bit_array(char* in, int len, uint8_t* buf)
{
    int i;
    printbuffer();
    for (i = 0; i < len; i++) {
        if (in[i] == '1') /* compare against '1': the characters '0' and '1' are both non-zero */
        {
            buf[bit_pos / 8] |= 1 << (bit_pos % 8);
        }
        bit_pos++;
    }
}
You first need to decide what order you want the bits in the bytes: does the first bit go in the most significant bit of the first byte, or the least significant? You also need a strategy to deal with the extra 0 to 7 bits in the last byte. Those could look like another Huffman code and give you extraneous symbols when decoding. Either you will need a count of symbols to decode, or you will need an end symbol that you add to your set before Huffman coding, and send that symbol at the end.
Learn the bitwise operators in C noted in your tag, and use those to place each bit, one by one, into the sequence of bytes. Those are, at a minimum, the shifts (<< and >>), and (&), and or (|).
For example, 1 << n gives you a one bit in position n. a |= 1 << n would set that bit in a, given that a is initialized to zero. On the decoding end, you can use & to see if a bit is set. E.g. a & (1 << n) would be non-zero if bit n in a is set.
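Putting that advice together, here is a minimal sketch (the function name pack_bits and the MSB-first bit order are my choices, not from the question) that packs a string of '0'/'1' characters into bytes:
#include <stdint.h>
#include <string.h>

/* Pack '0'/'1' characters into bytes, first bit into the most significant
   bit of the first byte. The 0 to 7 unused bits of the last byte stay 0,
   so the decoder still needs a symbol count or an end symbol, as noted
   above. Returns the number of bytes used. */
size_t pack_bits(const char *in, uint8_t *out)
{
    size_t len = strlen(in);
    memset(out, 0, (len + 7) / 8);
    for (size_t i = 0; i < len; i++) {
        if (in[i] == '1') /* '0' and '1' are both non-zero, so compare, don't truth-test */
            out[i / 8] |= 1u << (7 - i % 8);
    }
    return (len + 7) / 8;
}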

memcpy long long int (casting to char*) into char array

I was trying to split a long long into 8 characters, where the first 8 bits become the first character, the next 8 bits the second, etc.
I tried two methods. First, I shifted and cast the type, and it went well.
But I failed when using memcpy: the result was reversed (the first 8 bits became the last character). Shouldn't the memory be consecutive and in the same order? Or am I messing something up?
void num_to_str(void){
    char str[100005] = {0};
    unsigned long long int ans = 0;
    scanf("%llu", &ans);
    for(int j = 0; j < 8; j++){
        str[j] = (unsigned char)(ans >> (56 - 8 * j));
    }
    printf("%s\n", str);
    return;
}
This works great:
input : 8102661169684245760
output : program
However, the following doesn't act as I expected.
void num_to_str(void){
    char str[100005] = {0};
    unsigned long long int ans = 0;
    scanf("%llu", &ans);
    memcpy(str, (char *)&ans, 8);
    for(int i = 0; i < 8; i++)
        printf("%c", str[i]);
    return;
}
This works unexpectedly:
input : 8102661169684245760
output : margorp
P.S. I couldn't even use printf("%s", str) or puts(str);
I assume that the first character was stored as '\0'.
I am a beginner, so I'd be grateful if someone could help me out.
The order of bytes within a binary representation of a number within an encoding scheme is called endianness.
In a big-endian system bytes are ordered from the most significant byte to the least significant.
In a little-endian system bytes are ordered from the least significant byte to the most significant one.
There are other byte orders (mixed-endian), but they are considered esoteric nowadays, so you won't find them in practice.
If you run your program on a little-endian system (e.g. x86), you get exactly the results you observed.
You can read more:
https://en.wikipedia.org/wiki/Endianness
You may wonder why anyone sane would design and use a little-endian system, where bytes are reversed from the order we humans are used to (we write digits in big-endian order). But there are advantages. You can read about some here: The reason behind endianness?
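If you need a specific byte order regardless of the host, extract the bytes with shifts, which operate on values rather than on memory; this is why the shift version of num_to_str above is portable. A minimal sketch (the name write_big_endian is mine):
#include <stdint.h>

/* Shifts see the value, not its memory layout, so this produces
   big-endian output on both little- and big-endian hosts. memcpy,
   by contrast, copies whatever order the host keeps in memory. */
void write_big_endian(uint64_t v, unsigned char out[8])
{
    for (int i = 0; i < 8; i++)
        out[i] = (unsigned char)(v >> (56 - 8 * i));
}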

writing an 8 bit checksum in C [closed]

I am having trouble writing an algorithm for a 1-byte / 8-bit checksum.
Obviously, with 8 bits, any sum over a decimal value of 255 has to wrap around; the most significant bits fold back in. I think I am doing it correctly.
Here is the code...
#include <stdio.h>

int main(void)
{
    /* buf and length were not declared in the original post; example data for illustration: */
    unsigned char buf[] = { 0x10, 0x20, 0xFF };
    int length = sizeof buf;

    int check_sum = 0;       //checksum
    int lcheck_sum = 0;      //left checksum bits
    int rcheck_sum = 0;      //right checksum bits
    short int mask = 0x00FF; // 16 bit mask
    //Create the frame - sequence number (S) and checksum 1 byte
    int c;
    //calculate the checksum
    for (c = 0; c < length; c++)
    {
        check_sum = (int)buf[c] + check_sum;
        printf("\n Check Sum %d ", check_sum); //debug
    }
    printf("\nfinal Check Sum %d", check_sum); //debug
    //Take checksum and make it an 8 bit checksum
    if (check_sum > 255) //if greater than 8 bits then encode bits
    {
        lcheck_sum = check_sum;
        lcheck_sum >>= 8; //shift 8 bits to the right
        rcheck_sum = check_sum & mask;
        check_sum = lcheck_sum + rcheck_sum;
    }
    //Take the complement
    check_sum = ~check_sum;
    //Truncate - to get rid of the 8 bits to the right and keep the 8 LSBs
    check_sum = check_sum & mask;
    printf("\nTruncated and complemented final Check Sum %d\n", check_sum);
    return 0;
}
Short answer: you are not doing it correctly, even if the algorithm would be as your code implies (which is unlikely).
Standard warning: Do not use int if your variable might wrap (undefined behaviour) or you want to right-shift potentially negative values (implementation defined). OTOH, for unsigned types, wrapping and shifting behaviour is well defined by the standard.
Further note: Use stdint.h types if you need a specific bit-size! The built-in standard types (including char) are not guaranteed to have a specific width.
Normally, an 8-bit checksum of a byte buffer is calculated as follows:
#include <stdint.h>
#include <stddef.h>

uint8_t chksum8(const unsigned char *buff, size_t len)
{
    unsigned int sum; // nothing gained in using smaller types!
    for ( sum = 0 ; len != 0 ; len-- )
        sum += *(buff++); // parentheses not required!
    return (uint8_t)sum;
}
It is not clear what you are doing with all the typecasts and shifts; since uint8_t is guaranteed to be exactly 8 bits wide, the upper bits are guaranteed to be "cut off" by the conversion in the return statement.
Just compare this and your code and you should be able to see if your code will work.
Also note that there is not the single checksum algorithm. I did not invert the result in my code, nor did I fold upper and lower bytes as you did (the latter is pretty uncommon, as it does not add much more protection).
So, you have to verify the algorithm to use. If it really requires folding the two bytes of a 16-bit result, change sum to uint16_t and fold the bytes as follows:
uint16_t sum;
...
// replace return with:
while ( sum > 0xFFU )
    sum = (sum & 0xFFU) + ((sum >> 8) & 0xFFU);
return sum;
This cares about any overflow from adding the two bytes of sum (the loop could also be unrolled, as the overflow can only occur once).
Sometimes, CRC algorithms are called "checksums", but these are actually a very different beast (mathematically, they are the remainder of a binary polynomial division) and require much more processing (either at run-time, or to generate a lookup table). OTOH, CRCs provide much better detection of data corruption - but not against deliberate manipulation.
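If your assignment really does require the fold-and-complement variant, the whole recipe might look like the following sketch (the order fold, complement, truncate is an assumption; verify it against your specification):
#include <stdint.h>
#include <stddef.h>

/* Sum the bytes, fold any carry above 8 bits back in, complement,
   and keep only the least significant byte. */
uint8_t chksum8_folded(const unsigned char *buf, size_t len)
{
    unsigned int sum = 0;
    while (len-- != 0)
        sum += *buf++;
    while (sum > 0xFFu) /* fold carries back into the low byte */
        sum = (sum & 0xFFu) + (sum >> 8);
    return (uint8_t)~sum; /* complement; the conversion truncates to 8 bits */
}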

how to typecast byte array to 8 byte-size integer [closed]

I was asked an interview question: given a 6-byte input obtained from a big-endian machine, implement a function to convert/typecast it to 8 bytes, assuming we do not know the endianness of the machine running the function.
The point of the question seems to be testing my understanding of endianness, because I was asked whether I knew endianness before this question.
I do not know how to answer the question, e.g. do I need to pad the 6 bytes to 8 bytes first? And how? Here is my code. Is it correct?
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

bool isBigEndian(){
    int num = 1;
    char* b = (char*)(&num);
    return b ? false : true;
}

long long* convert(char* arr[]){ //size is 6
    long long* res = (long long*)malloc(sizeof(long long)); //...check res is NULL...
    if (isBigEndian()){
        for(int i = 0; i < 6; i++)
            memset(res, i+2, arr[i]);
    }
    else {
        for(int i = 0; i < 6; i++)
            memset(res, i+2, arr[6-1-i]);
    }
    return res; //assume caller will free res.
}
Update: to answer those saying my question is not clear, I just found a link, Convert Bytes to Int / uint in C, with a similar question. Based on my understanding of that, the endianness of the host does matter. Suppose the input is char array[] = {01,02,03,04,05,06}: if the host is little endian, the output is stored as 00,00,06,05,04,03,02,01; if big endian, the output will be stored as 00,00,01,02,03,04,05,06. In both cases the 00,00 is padded at the beginning.
I think I understand now: on the other machine, suppose there is a number xyz = 010203040506; because that machine is big endian and 01 is the MSB, it is stored as char array = {01,02,03,04,05,06}, where 01 has the lowest address. Then on this machine, if it is also big endian, the value should be stored as {00,00,01,02,03,04,05,06}, where 01 is still the MSB, so that it casts to the same number int_64 xyz2 = 0000010203040506. But if this machine is little endian, it should be stored as {00,00,06,05,04,03,02,01}, where 01, the MSB, has the highest address, in order for int_64 xyz2 = 0000010203040506.
Please let me know if my understanding is incorrect. And can anybody tell me why 00,00 is always padded at the beginning no matter the endianness? Shouldn't it be padded at the end if this machine is little endian, since 00 is the most significant byte?
Before moving on, you should have asked for clarification.
What exactly does converting mean here? Padding each char with 0's? Prefixing each char with 0's?
I will assume that each char should be prefixed with 0's. This is a possible solution:
#include <stdint.h>
#include <limits.h>

#define DATA_WIDTH 6

uint64_t convert(unsigned char data[]) {
    uint64_t res;
    int i;
    res = 0;
    for (i = 0; i < DATA_WIDTH; i++) {
        res = (res << CHAR_BIT) | data[i];
    }
    return res;
}
To append 0's to each char, we could, instead, use this inside the for:
res = (res << CHAR_BIT) | (data[i] << 2);
In an interview, you should always note the limitations of your solution. This solution assumes that the implementation provides the uint64_t type (it is not required by the C standard).
The fact that the input is big endian is important because it lets you know that data[0] corresponds to the most significant byte, and it must remain so in your result. This solution works no matter what the target machine's endianness is.
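A usage sketch, assuming it is compiled together with the convert definition above:
#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    unsigned char data[DATA_WIDTH] = {0x01, 0x02, 0x03, 0x04, 0x05, 0x06};
    printf("%" PRIx64 "\n", convert(data)); /* prints 10203040506 on any host */
    return 0;
}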
I don't understand why you think malloc is necessary. Why not just something like this?
long long convert(unsigned char data[])
{
    long long res;
    res = 0;
    for( int i = 0; i < 6; ++i)
        res = (res << 8) + data[i];
    return res;
}

Converting little endian to big endian using Bitshift Operators

I am working on endianness. My little-endian program works and gives the correct output, but I am not able to get my head around big-endian. Below is what I have so far.
I know I have to use bit shifts, and I don't think I am doing a good job of it. I tried asking my TAs and prof, but they are not much help.
I have been following this link (convert big endian to little endian in C [without using provided func]) to understand more, but still cannot make it work. Thank you for the help.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    FILE* input;
    FILE* output;
    input = fopen(argv[1], "r");
    output = fopen(argv[2], "w");
    int value, value2;
    int i;
    int zipcode, population;
    while(fscanf(input, "%d %d\n", &zipcode, &population) != EOF)
    {
        for(i = 0; i < 4; i++)
        {
            population = ((population >> 4) | (population << 4));
        }
        fwrite(&population, sizeof(int), 1, output);
    }
    fclose(input);
    fclose(output);
    return 0;
}
I'm answering not to give you the answer but to help you solve it yourself.
First ask yourself this: how many bits are in a byte? (hint: 8) Next, how many bytes are in an int? (hint: probably 4) Picture this 32-bit integer in memory:
+--------+
0x|12345678|
+--------+
Now picture it on a little-endian machine, byte-wise. It would look like this:
+--+--+--+--+
0x|78|56|34|12|
+--+--+--+--+
What shift operations are required to get the bytes into the correct spot?
Remember, when you use a bitwise operator like >>, you are operating on bits, not on bytes in memory. So, for a 32-bit int, 1 << 24 is the integer value 1 converted into the processor's opposite endianness.
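For reference, the destination of those hints is the usual shift-and-mask byte swap; a sketch for a 32-bit value (the name swap32 is mine):
#include <stdint.h>

/* Move each byte of a 32-bit value into its mirrored position. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}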
"little-endian" and "big-endian" refer to the order of bytes (we can assume 8 bits here) in a binary representation. When referring to machines, it's about the order of the bytes in memory: on big-endian machines, the address of an int will point to its highest-order byte, while on a little-endian machine the address of an int will refer to its lowest-order byte.
When referring to binary files (or pipes or transmission protocols etc.), however, it refers to the order of the bytes in the file: a "little-endian representation" will have the lowest-order byte first and the highest-order byte last.
How does one obtain the lowest-order byte of an int? That's the low 8 bits, so it's (n & 0xFF) (or ((n >> 0) & 0xFF), the usefulness of which you will see below).
The next lowest-order byte is ((n >> 8) & 0xFF).
The next lowest-order byte is ((n >> 16) & 0xFF) ... or (((n >> 8) >> 8) & 0xFF).
And so on.
So you can peel off bytes from n in a loop and output them one byte at a time ... you can use fwrite for that, but it's simpler just to use putchar or putc.
You say that your teacher requires you to use fwrite. There are two ways to do that: 1) use fwrite(&n, 1, 1, filePtr) in a loop as described above. 2) Use the loop to reorder your int value by storing the bytes in the desired order in a char array rather than outputting them, then use fwrite to write it out. The latter is probably what your teacher has in mind.
Note that, if you just use fwrite to output your int it will work ... if you're running on a little-endian machine, where the bytes of the int are already stored in the right order. But the bytes will be backwards if running on a big-endian machine.
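A sketch of the second approach (the name write_le32 is mine): peel the bytes into an array in the desired order, then fwrite the array once:
#include <stdio.h>

/* Write a 32-bit value to fp in little-endian file order on any host:
   peel off bytes from least to most significant, then fwrite once. */
void write_le32(unsigned int n, FILE *fp)
{
    unsigned char bytes[4];
    for (int i = 0; i < 4; i++)
        bytes[i] = (unsigned char)((n >> (8 * i)) & 0xFF);
    fwrite(bytes, 1, sizeof bytes, fp);
}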
The problem with most answers to this question is portability. I've provided a portable answer here, but this received relatively little positive feedback. Note that C defines undefined behavior as: behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements.
The answer I'll give here won't assume that int is 16 bits in width; it'll give you an idea of how to represent "larger int" values. It's the same concept, but uses a dynamic loop rather than two fputcs.
Declare an array of sizeof(int) unsigned chars: unsigned char big_endian[sizeof(int)];
Separate the sign and the absolute value.
int sign = value < 0;
value = sign ? -value : value;
Loop from sizeof(int) down to 0, writing the least significant bytes:
size_t foo = sizeof(int);
do {
    big_endian[--foo] = value % (UCHAR_MAX + 1);
    value /= (UCHAR_MAX + 1);
} while (foo > 0);
Now insert the sign: big_endian[0] |= sign << (CHAR_BIT - 1);
Simple, yeah? Little endian is equally simple. Just reverse the order of the loop to go from 0 to sizeof(int), instead of from sizeof(int) to 0:
size_t foo = 0;
do {
    big_endian[foo++] = value % (UCHAR_MAX + 1);
    value /= (UCHAR_MAX + 1);
} while (foo < sizeof(int));
The portable methods make more sense, because they're well defined.
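Assembled into one function, the big-endian steps above might look like this sketch (the function name is mine; note the INT_MIN caveat, since -value overflows there):
#include <limits.h>
#include <stddef.h>

/* Fill big_endian[] with a sign-and-magnitude, big-endian rendering of
   value, without assuming anything about the host's byte order. */
void to_big_endian(int value, unsigned char big_endian[sizeof(int)])
{
    int sign = value < 0;
    value = sign ? -value : value; /* caveat: overflows for INT_MIN */
    size_t foo = sizeof(int);
    do {
        big_endian[--foo] = value % (UCHAR_MAX + 1);
        value /= (UCHAR_MAX + 1);
    } while (foo > 0);
    big_endian[0] |= sign << (CHAR_BIT - 1);
}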
