8 Byte Number as Hex in C

I am given a number, for example n = 10, and I want to calculate its length in hex, big endian, and save it in an 8-byte char pointer. In this example I would like to get the following string:
"\x00\x00\x00\x00\x00\x00\x00\x50".
How do I do that automatically in C with, for example, sprintf?
I am not even able to get "\x50" into a char array:
char tmp[1];
sprintf(tmp, "\x%x", 50); // version 1
sprintf(tmp, "\\x%x", 50); // version 2
Versions 1 and 2 don't work.

I am given a number, for example n = 10, and I want to calculate its length in hex
Repeatedly divide by 16 to count the hexadecimal digits. A do ... while ensures the result is 1 when n == 0.
int hex_length = 0;
do {
    hex_length++;
} while (n /= 16); // note: this consumes n; work on a copy if the value is still needed
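Wrapped as a function, so the caller's value is preserved (a minimal sketch; the name hex_digit_count is illustrative):

unsigned hex_digit_count(unsigned long long n) {
    unsigned count = 0;
    do {
        count++;          // count one digit per division
    } while (n /= 16);    // n is a local copy here
    return count;
}

For example, hex_digit_count(10) returns 1 and hex_digit_count(0x50) returns 2.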
save it in an 8-byte char pointer.
C cannot force your system to use an 8-byte pointer, so if your system uses 4-byte char pointers, we are out of luck. Let us assume OP's system uses 8-byte pointers. An integer may be converted to a pointer, but this may or may not result in a valid pointer.
assert(sizeof (char*) == 8);
char *char_pointer = (char *) n;
printf("%p\n", (void *) char_pointer);
In this example I would like to get the following string: "\x00\x00\x00\x00\x00\x00\x00\x50".
In C, a string includes the various characters up to and including a null character. "\x00\x00\x00\x00\x00\x00\x00\x50" is not a valid C string, yet it is a valid string literal. Code cannot construct string literals at run time; they are part of the source code. Further, the relationship between n == 10 and "\x00...\x00\x50" is unclear. Instead, perhaps the goal is to store n into an 8-byte array (big endian).
char buf[8];
for (int i = 7; i >= 0; i--) { // i = 7, not 8: buf[8] would be out of bounds
    buf[i] = (char) n;
    n /= 256;
}
OP's code certainly will fail, as it attempts to store a string into a buffer that is too small. Further, "\x%x" is not valid code, as \x begins a hexadecimal escape sequence that must be followed by hex digits.
char tmp[1];
sprintf(tmp, "\x%x", 50); // version 1
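If the goal really is the printable text rather than raw bytes, sprintf can build it byte by byte. A minimal sketch (the buffer size and names are illustrative; "\\x%02x" writes a literal backslash followed by two hex digits):

#include <stdio.h>

int main(void) {
    unsigned long long n = 0x50; // 0x50 so the output matches the string in the question
    char out[8 * 4 + 1];         // "\xNN" is 4 characters per byte, plus the NUL
    char *p = out;
    for (int i = 7; i >= 0; i--) {
        unsigned byte = (unsigned)(n >> (8 * i)) & 0xFF; // big endian: most significant first
        p += sprintf(p, "\\x%02x", byte);
    }
    printf("%s\n", out); // \x00\x00\x00\x00\x00\x00\x00\x50
}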

Just do:
#include <math.h>

int i;
...
int length = (int) floor(log(i) / log(16)) + 1; // valid for i > 0
This will give you (in length) the number of hexadecimal digits needed to represent i (without the 0x prefix, of course).
log(i) / log(base) is the log-base of i. The log16 of i gives you the exponent to which 16 must be raised to get back i: 16^log16(i) = i.
Taking the floor of this exponent and adding one yields the number of digits. (The often-seen ceil(log16(i)) form is off by one for exact powers of 16, such as i == 16, and log(0) is undefined, so i == 0 needs a special case. Floating-point rounding near exact powers of 16 can also bite; the integer divide-by-16 loop above is more robust.)

Related

memcpy long long int (casting to char*) into char array

I was trying to split a long long into 8 characters, where the first 8 bits become the first character, the next 8 bits the second, and so on.
I was using two methods. First, I shifted and cast the type, and that went well.
But I failed when using memcpy: the result is reversed (the first 8 bits become the last character). Shouldn't the memory be consecutive and in the same order? Or am I messing something up...
void num_to_str() {
    char str[100005] = {0};
    unsigned long long int ans = 0;
    scanf("%llu", &ans);
    for (int j = 0; j < 8; j++) {
        str[j] = (unsigned char)(ans >> (56 - 8 * j)); // most significant byte first
    }
    printf("%s\n", str);
    return;
}
This works great:
input : 8102661169684245760
output : program
However, the following doesn't act as I expected:
void num_to_str() {
    char str[100005] = {0};
    unsigned long long int ans = 0;
    scanf("%llu", &ans);
    memcpy(str, (char *)&ans, 8);
    for (int i = 0; i < 8; i++)
        printf("%c", str[i]);
    return;
}
This works unexpectedly:
input : 8102661169684245760
output : margorp
PS: I couldn't even use printf("%s", str) or puts(str).
I assume that the first character was stored as '\0'.
I am a beginner, so I'll be grateful if someone can help me out.
The order of bytes within the binary representation of a number is called endianness.
In a big-endian system, bytes are ordered from the most significant byte to the least significant.
In a little-endian system, bytes are ordered from the least significant byte to the most significant one.
There are other byte orders, but they are considered esoteric nowadays, so you won't find them in practice.
If you run your program on a little-endian system (e.g. x86), you get exactly these results.
You can read more here:
https://en.wikipedia.org/wiki/Endianness
You may wonder why anyone sane would design and use a little-endian system, where bytes are reversed from the order we humans are used to (we use big endian when we write digits). But there are advantages. You can read about some here: The reason behind endianness?
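A quick way to check which byte order your machine uses (a minimal sketch):

#include <stdio.h>

int main(void) {
    unsigned int x = 1;
    unsigned char *p = (unsigned char *)&x;
    // On a little-endian machine the least significant byte is stored first,
    // so *p is 1; on a big-endian machine *p is 0.
    printf(*p ? "little endian\n" : "big endian\n");
}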

C programming why does the address of char array increment from 0012FF74 to 0012FF75?

Here's the code:
char chararray[] = {68, 97, 114, 105, 110}; /* 1 byte each */
int i;
printf("chararray\n");
printf("---------\n");
for (i = 0; i < 5; i++)
    printf("%p\n", (void *)(chararray + i));
Output:
chararray
---------
0012FF74
0012FF75
0012FF76
0012FF77
0012FF78
Now I'm trying to understand this in terms of hexadecimal, bits and bytes.
I understand that a char is 1 byte, so it's supposed to increment by 1 byte, which is 8 bits.
But I don't understand why the address only increases by 1 in hex. One hex digit represents only 4 bits, correct? So I'm kind of confused; it seems like it's only incrementing by 4 bits.
Any help clearing this up is greatly appreciated, thanks!
It's true that if you represent a byte in hex, it is made of two hex digits, where each digit stands for 4 bits.
However, the addresses you are seeing are addresses of bytes, not the contents of those bytes. Each byte gets its own address, and the addresses are sequential, just as if we numbered the bytes: byte 0, byte 1, byte 2, byte 3, ...
The address in a pointer points to a byte, not to a bit. Your pointer is of type char *, so when it is incremented, the address increases by sizeof(char). If, however, you used a different type, such as int, your pointer would increase by sizeof(int) on each increment, even if it is pointing to a char [] array.
On my machine, sizeof(int)==4, for example.
I wrote this code:
#include <stdio.h>
int main()
{
char str[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
int *a = (int *)str; /* cast needed: char * and int * are incompatible types */
printf("Char\tAddr\n");
while ((char *)a <= &str[25])
{
printf("%c\t%p\n", *a, (void *)a);
a++;
}
return 0;
}
Output:
Char Addr
A 00D5F9BC
E 00D5F9C0
I 00D5F9C4
M 00D5F9C8
Q 00D5F9CC
U 00D5F9D0
Y 00D5F9D4
Every fourth character of the string is printed, because the pointer advances by sizeof(int) == 4 on each increment.
First, pointer arithmetic like (chararray + i), where chararray points to a char (i.e. is of type char*), increases the value of the pointer chararray by i * sizeof(char). Note that sizeof(char) is 1 by definition.
Second, a pointer represents a memory address, an integral value that indicates a position in an (absolutely or relatively) addressed memory block, e.g. on the heap, on the stack, or in some other data segment. See, for example, the following statements in this online C standard draft:
6.3.2.3 Pointers
(5) An integer may be converted to any pointer type. ...
(6) Any pointer type may be converted to an integer type. ...
So when viewing the value of a pointer, we can think of an integral value, just like 256 or 1024 (when "viewed" in decimal format), or 0x100 or 0x400 (when viewed in hexadecimal format). Note that 256 in decimal is equivalent to 100 in hexadecimal, and this has nothing to do with bits and bytes.
Adding 1 to an integral value of 256 (or 0x100) gives 257 (or 0x101), regardless of whether this value stands for a position in a memory block or for oranges sold in the department store. So it's all about "outputting" integral values in hex format.
See the following code illustrating this:
#include <stdio.h>

int main()
{
    char chararray[] = {68, 97, 114, 105, 110};
    for (int i = 0; i < 5; i++) {
        char *ptr = chararray + i;
        unsigned long ptrAsIntegralVal = (unsigned long)ptr;
        printf("ptr: %p; in decimal format: %lu\n", (void *)ptr, ptrAsIntegralVal);
    }
}
Output:
ptr: 0x7fff5fbff767; in decimal format: 140734799804263
ptr: 0x7fff5fbff768; in decimal format: 140734799804264
ptr: 0x7fff5fbff769; in decimal format: 140734799804265
ptr: 0x7fff5fbff76a; in decimal format: 140734799804266
ptr: 0x7fff5fbff76b; in decimal format: 140734799804267
Using hexadecimal is just another way of representing a number; it has nothing to do with bits and bytes. One byte is 8 bits whether you write its address as a hexadecimal or a decimal number. So the address just increases by one = 1 byte = 8 bits.

Get bits from number string

If I have a number string (char array), one digit is one char, so the space needed for a four-digit number is 5 bytes, including the null termination.
unsigned char num[] = "1024";
printf("%zu", sizeof(num)); // 5
However, 1024 can be written as
unsigned char binaryNum[2];
binaryNum[0] = 0b00000100;
binaryNum[1] = 0b00000000;
How can the conversion from string to binary be done efficiently?
In my program I will work with ≈30-digit numbers, so the space gain would be big.
My goal is to create data packets to be sent over UDP/TCP.
I would prefer not to use libraries for this task, since the space the code may take up is small.
EDIT:
Thanks for the quick response.
char num = 0b00000100 // "4"
--------------------------
char num = 0b00011000 // "24"
-----------------------------
char num[2];
num[0] = 0b00000100;
num[1] = 0b00000000;
// num now contains 1024
I would need ≈ 10 bytes to contain my number in binary form. So, if, as suggested, I parse the digits one by one, starting from the back, how would that build up to the final big binary number?
In general, converting a number from string representation to an integer is easy, because each character can be parsed separately. E.g. to convert "1024" to 1024 you can just look at the '1', convert it to 1, multiply the running total by 10, then convert the '0' and add it, multiply by 10, and so on until you have parsed the whole string.
For binary it is not so easy. E.g. you can convert 4 to 100 and 2 to 010, but 42 is not 100 010 or 110 or anything like that, because a decimal digit does not map to a fixed group of bits. So your best bet is to convert the whole thing to a number first and then convert that number to binary using mathematical operations (bit shifts and such). This will work fine for numbers that fit in one of the built-in integer types, but if you want to handle arbitrarily large numbers you will need a BigInteger implementation, which seems to be a problem for you since the code has to be small.
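For numbers that do fit in a built-in type, "convert, then shift" looks like this (a minimal sketch; strtoull parses the decimal string, and the shifts peel off big-endian bytes as in the binaryNum example above):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    unsigned long long v = strtoull("1024", NULL, 10);
    unsigned char bin[2];
    bin[0] = (unsigned char)(v >> 8);   // 0b00000100
    bin[1] = (unsigned char)(v & 0xFF); // 0b00000000
    printf("%u %u\n", bin[0], bin[1]);  // 4 0
}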
From your question I gather that you want to compress the string representation in order to transmit the number over a network, so I am offering a solution that does not strictly convert to binary but will still use fewer bytes than the string representation and is easy to use. It is based on the fact that you can store a digit 0..9 in 4 bits, and so you can fit two digits in a byte. Hence you can store an n-digit number in n/2 bytes. The algorithm could be as follows:
Take the last character, '4'.
Subtract '0' to get 4 (i.e. an int with value 4).
Strip the last character.
Repeat to get 2.
Concatenate the two into a single byte: digits[0] = (4 << 4) + 2.
Do the same for the next two digits: digits[1] = (0 << 4) + 1.
Your representation in memory will now look like
 4    2    0    1
0100 0010 0000 0001
 digits[0]  digits[1]
i.e.
digits = { 66, 1 }
This is not quite the binary representation of 1024, but it is shorter and it allows you to easily recover the original number by reversing the algorithm.
You even have six nibble values left that you don't use for storing digits (everything above 1001), which you can use for other things, such as a sign, a decimal point, byte order, or an end-of-number delimiter.
I trust that you will be able to implement this, should you choose to use it.
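A sketch of the packing half, under the scheme above (names are illustrative; unpacking reverses the shifts):

#include <stdio.h>
#include <string.h>

// Pack a digit string two digits per byte, starting from the back:
// "1024" -> { (4 << 4) | 2, (0 << 4) | 1 } == { 66, 1 }.
size_t pack_digits(const char *s, unsigned char *out) {
    size_t len = strlen(s), n = 0;
    for (size_t i = 0; i < len; i += 2) {
        unsigned hi = (unsigned)(s[len - 1 - i] - '0');                     // last digit
        unsigned lo = (i + 1 < len) ? (unsigned)(s[len - 2 - i] - '0') : 0; // digit before it
        out[n++] = (unsigned char)((hi << 4) | lo);
    }
    return n;
}

int main(void) {
    unsigned char buf[16];
    size_t n = pack_digits("1024", buf);
    for (size_t i = 0; i < n; i++)
        printf("%u ", buf[i]); // prints: 66 1
    printf("\n");
}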
If I understand your question correctly, you would want to do this:
Convert your string representation into an integer.
Convert the integer into binary representation.
For step 1:
You could loop through the string
Subtract '0' from the char
Multiply it by 10^n (depending on its position) and add it to a sum.
For step 2 (for int x), in general:
x%2 gives you the least-significant-bit (LSB).
x /= 2 "removes" the LSB.
For example, take x = 6.
x%2 = 0 (LSB), x /= 2 -> x becomes 3
x%2 = 1, x /= 2 -> x becomes 1
x%2 = 1 (MSB), x /= 2 -> x becomes 0.
So we see that (6)decimal == (110)binary.
On to the implementation (for N=2, where N is maximum number of bytes):
int x = 1024;
int n = -1, p = 0, p_ = 0, i = 0, ex = 1; // you can use smaller integer types if you are strict on memory usage
unsigned char num[N] = {0};
for (p = 0; p < (N * 8); p++, p_++) {
    if (p % 8 == 0) { n++; p_ = 0; } // every 8 bits: 1) move to the next array element, 2) reset the bit position (start at 2^0 again)
    for (i = 0; i < p_; i++) ex *= 2; // ex = pow(2, p_); without using the math.h library
    num[n] += ex * (x % 2);           // add (2^p_ * LSB) to num[n]
    x /= 2;                           // "remove" the last bit before checking the next
    ex = 1;                           // reset the exponent
}
We can check the result for x = 1024:
for (i = 0; i < N; i++)
    printf("num[%d] = %d\n", i, num[i]); // num[0] = 0 (0b00000000), num[1] = 4 (0b00000100)
(Note that this stores the least significant byte in num[0], the reverse of the binaryNum layout in the question.)
To convert an up-to-30-digit decimal number, represented as a string, into a series of bytes, effectively a base-256 representation, takes up to 13 bytes (the ceiling of 30/log10(256)).
Simple algorithm:
dest = 0
for each digit of the string (starting with the most significant):
    dest *= 10
    dest += digit
As C code
#include <ctype.h>
#include <string.h>

#define STR_DEC_TO_BIN_N 13

unsigned char *str_dec_to_bin(unsigned char dest[STR_DEC_TO_BIN_N], const char *src) {
    // dest[] = 0
    memset(dest, 0, STR_DEC_TO_BIN_N);
    // for each digit ...
    while (isdigit((unsigned char) *src)) {
        // dest[] = 10*dest[] + *src
        // with dest[0] as the most significant digit
        int sum = *src++ - '0'; // consume the digit (the original snippet never advanced src)
        for (int i = STR_DEC_TO_BIN_N - 1; i >= 0; i--) {
            sum += dest[i] * 10;
            dest[i] = sum % 256;
            sum /= 256;
        }
        // If sum is non-zero, it means dest[] overflowed
        if (sum) {
            return NULL;
        }
    }
    // If we stopped on something other than the null character ...
    if (*src) {
        return NULL;
    }
    return dest;
}
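A quick usage sketch (assuming the function above; stdio.h added for the printout):

#include <stdio.h>

int main(void) {
    unsigned char buf[STR_DEC_TO_BIN_N];
    if (str_dec_to_bin(buf, "1024")) {
        for (int i = 0; i < STR_DEC_TO_BIN_N; i++)
            printf("%02x", buf[i]); // 13 big-endian bytes; the last two are 04 00
        printf("\n");
    }
}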

How to convert from integer to unsigned char in C, given integers larger than 256?

As part of my CS course I've been given some functions to use. One of these functions takes a pointer to unsigned chars and writes some data to a file (I have to use this function, so I can't just make my own purpose-built function that works differently, BTW). I need to write an array of integers whose values can be up to 4095 using this function (that only takes unsigned chars).
However, am I right in thinking that an unsigned char can only have a max value of 256 because it is 1 byte long? Do I therefore need to use 4 unsigned chars for every integer? But casting doesn't seem to work with larger integer values. Does anyone have any idea how best to convert an array of integers to unsigned chars?
Usually an unsigned char holds 8 bits, with a max value of 255. If you want to know this for your particular compiler, print out CHAR_BIT and UCHAR_MAX from <limits.h>. You could extract the individual bytes of a 32-bit int:
#include <stdint.h>
void
pack32(uint32_t val, uint8_t *dest)
{
    dest[0] = (val & 0xff000000) >> 24;
    dest[1] = (val & 0x00ff0000) >> 16;
    dest[2] = (val & 0x0000ff00) >> 8;
    dest[3] = (val & 0x000000ff);
}
uint32_t
unpack32(const uint8_t *src)
{
    uint32_t val;
    val  = (uint32_t)src[0] << 24; // cast first, so we never shift into the sign bit of int
    val |= (uint32_t)src[1] << 16;
    val |= (uint32_t)src[2] << 8;
    val |= (uint32_t)src[3];
    return val;
}
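A quick round trip using the two functions above (a usage fragment; assumes stdio.h as well):

uint8_t buf[4];
pack32(4095u, buf);            // buf = { 0x00, 0x00, 0x0F, 0xFF }
printf("%u\n", unpack32(buf)); // 4095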
An unsigned char generally occupies 1 byte, therefore you can decompose any other type into an array of unsigned chars (e.g. for a 4-byte int you can use an array of 4 unsigned chars). Your exercise is probably about generics. You should write the file as a binary file using the fwrite() function, and just write byte after byte into the file.
The following example should write a number (of any data type) to the file. I am not sure it matters that you are forced to cast to unsigned char * instead of void *.
int homework(unsigned char *foo, size_t size)
{
    // open file for binary writing
    FILE *f = fopen("work.txt", "wb");
    if (f == NULL)
        return 1;
    // write the data to the file, byte by byte
    fwrite(foo, sizeof(char), size, f); // the original used foo+i with i uninitialized
    fclose(f);
    return 0;
}
I hope the given example at least gives you a starting point.
Yes, you're right: a char/byte holds only 8 bits, so it has 2^8 distinct values, which is zero to 2^8 - 1, or zero to 255. Do something like this to get the bytes:
int x = 0;
char* p = (char*)&x;
for (int i = 0; i < sizeof(x); i++)
{
//Do something with p[i]
}
(Declaring i inside the for loop isn't valid in C89, but whatever... it's more readable. :) )
Do note that this code may not be portable, since it depends on the processor's internal storage of an int.
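For instance (a minimal sketch; the commented output assumes a little-endian machine):

#include <stdio.h>

int main(void) {
    int x = 4095; // 0x00000FFF
    unsigned char *p = (unsigned char *)&x;
    for (size_t i = 0; i < sizeof(x); i++)
        printf("%u ", p[i]); // little endian: 255 15 0 0
    printf("\n");
}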
If you have to write an array of integers, then just convert the array into a pointer to unsigned char and run through the bytes.
int main()
{
    int data[] = { 1, 2, 3, 4, 5 };
    size_t size = sizeof(data) / sizeof(data[0]); // Number of integers.
    unsigned char* out = (unsigned char*)data;
    for (size_t loop = 0; loop < (size * sizeof(int)); ++loop)
    {
        MyProfSuperWrite(out + loop); // Write 1 unsigned char
    }
}
Now, people have mentioned that 4095 will fit in fewer bits than a normal integer. Probably true. Thus you can save space by not writing out the top bits of each integer. Personally I think this is not worth the effort. The extra code to write the values and process the incoming data is not worth the savings you would get (maybe if the data were the size of the Library of Congress). Rule one: do as little work as possible (it's easier to maintain). Rule two: optimize if asked (but ask why first). You may save space, but it will cost you in processing time and maintenance.
The part of the assignment reading "integers whose values can be up to 4095 using this function (that only takes unsigned chars)" should be giving you a huge hint: 4095 unsigned is 12 bits.
You can store the 12 bits in a 16-bit short, but that is somewhat wasteful of space; you are only using 12 of the 16 bits. Since you are dealing with more than 1 byte in the conversion, you may also need to deal with the endianness of the result. This is the easiest option.
You could also use a bit-field or some packed binary structure if you are concerned about space, as in the sketch below. That is more work.
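One way to pack two 12-bit values into 3 bytes (a hypothetical sketch of the "packed binary" idea; the function name and layout are illustrative):

#include <stdint.h>

void pack12_pair(uint16_t a, uint16_t b, uint8_t out[3]) {
    out[0] = (uint8_t)(a >> 4);                       // high 8 bits of a
    out[1] = (uint8_t)(((a & 0x0F) << 4) | (b >> 8)); // low 4 bits of a, high 4 bits of b
    out[2] = (uint8_t)(b & 0xFF);                     // low 8 bits of b
}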
It sounds like what you really want to do is call sprintf to get a string representation of your integers. This is a standard way to convert from a numeric type to its string representation. Something like the following might get you started:
char num[5]; // Room for "4095" plus the terminating NUL
int i;
// array is the array of integers, and arrayLen is its length
for (i = 0; i < arrayLen; i++)
{
    sprintf(num, "%d", array[i]);
    // Call your function that expects a pointer to chars
    printfunc(num);
}
Without information on the function you are directed to use regarding its arguments, return value and semantics (i.e. the definition of its behaviour), it is hard to answer. One possibility is:
Given:
void theFunction(unsigned char* data, int size);
then
int array[SIZE_OF_ARRAY];
theFunction((unsigned char*)array, sizeof(array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(*array));
or
theFunction((unsigned char*)array, SIZE_OF_ARRAY * sizeof(int));
All of which will pass all of the data to theFunction(), but whether that makes any sense will depend on what theFunction() does.

ASCII and printf

I have a little (big, dumb?) question about ints and chars in C. I remember from my studies that "chars are little integers and vice versa," and that's okay with me. If I need to use small numbers, the best way is to use a char type.
But in code like this:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
int i= atoi(argv[1]);
printf("%d -> %c\n",i,i);
return 0;
}
I can pass any number I want as the argument. With 0-127 I obtain the expected results (the standard ASCII table), but even with bigger or negative numbers it seems to work...
Here is some example:
-181 -> K
-182 -> J
300 -> ,
301 -> -
Why? It seems to me that it's wrapping around the ASCII table, but I don't understand how.
When you pass an int corresponding to the "%c" conversion specifier, the int is converted to an unsigned char and then written.
The values you pass are being converted to different values when they are outside the range of an unsigned char (0 to UCHAR_MAX). The system you are working on probably has UCHAR_MAX == 255.
When converting an int to an unsigned char:
If the value is larger than UCHAR_MAX, (UCHAR_MAX+1) is subtracted from the value as many times as needed to bring it into the range 0 to UCHAR_MAX.
Likewise, if the value is less than zero, (UCHAR_MAX+1) is added to the value as many times as needed to bring it into the range 0 to UCHAR_MAX.
Therefore:
(unsigned char)-181 == (-181 + (255+1)) == 75 == 'K'
(unsigned char)-182 == (-182 + (255+1)) == 74 == 'J'
(unsigned char)300 == (300 - (255+1)) == 44 == ','
(unsigned char)301 == (301 - (255+1)) == 45 == '-'
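You can verify this on your own machine (a minimal sketch, assuming UCHAR_MAX == 255 as above):

#include <stdio.h>

int main(void) {
    int vals[] = { -181, -182, 300, 301 };
    for (int i = 0; i < 4; i++)
        printf("%d -> %c\n", vals[i], vals[i]); // K, J, comma, hyphen
}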
The %c format specifier interprets the corresponding argument as a character, not as an integer. However, when you lie to printf and pass an int where you told it to expect a char, its internal manipulation of the value (to get a char back, since a char is promoted to an int when passed through varargs anyway) happens to yield the values you see.
My guess is that %c takes the first byte of the value provided and formats that as a character. On a little-endian system such as a PC running Windows, that byte would represent the least-significant byte of any value passed in, so consecutive numbers would always be shown as different characters.
You told it the number is a char, so it's going to try every way it can to treat it as one, despite being far too big.
Looking at what you got, since J and K are in that order, I'd say it's using the integer % 128 to make sure it fits in the legal range.
Edit: Please disregard this "answer".
Because you are on a little-endian machine :)
Seriously, this is undefined behavior. Try changing the code to printf("%d -> %c, %c\n", i, i, '4'); and see what happens then...
When you use %c in a printf statement, it can access only the lowest byte of the integer passed.
Hence anything greater than 255 is treated as n % 256.
For example, an input of 321 yields the output 'A' (since 321 % 256 == 65).
What atoi does is convert the string to a numerical value, so that "1234" becomes 1234 and not just a sequence of the ordinal values of its characters.
Example:
char *x = "1234"; // x[0] = 49, x[1] = 50, x[2] = 51, x[3] = 52 (see the ASCII table)
int y = atoi(x); // y = 1234
int z = (int)x[0]; // z = 49 which is not what one would want
