How to ANSI-C cast from unsigned int * to char *? - c

I want these two print functions to do the same thing:
unsigned int Arraye[] = {0xffff,0xefef,65,66,67,68,69,0};
char Arrage[] = {0xffff,0xefef,65,66,67,68,69,0};
printf("%s", (char*)(2+ Arraye));
printf("%s", (char*)(2+ Arrage));
where Arraye is an unsigned int array. Normally I would change the type, but the problem is that most of the array holds numbers, and only this particular section should be printed as ASCII. Currently, the unsigned int array prints as "A" and the char array prints as the desired "ABCDE".

This is how the unsigned int version will be arranged in memory, assuming 32-bit big endian integers.
00 00 ff ff 00 00 ef ef 00 00 00 41 00 00 00 42
00 00 00 43 00 00 00 44 00 00 00 45 00 00 00 00
This is how the char version will be arranged in memory, assuming 8-bit characters. Note that 0xffff does not fit in a char.
ff ef 41 42 43 44 45 00
So you can see, casting is not enough. You'll need to actually convert the data.
If you know that your system uses 32-bit wchar_t, you can use the l length modifier for printf.
printf("%ls", 2 + Arraye);
This is NOT portable. The alternative is to copy the unsigned int array into a char array by hand, something like this:
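void print_istr(unsigned int const *s)
{
unsigned int const *p;
char *s2, *p2;
for (p = s; *p; p++); /* find the terminating 0 */
s2 = malloc(p - s + 1); /* needs <stdlib.h>; one char per element plus '\0' */
if (s2 == NULL) return;
for (p = s, p2 = s2; (*p2 = (char)*p); p2++, p++);
fputs(s2, stdout);
free(s2);
}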

As Dietrich said, a simple cast will not do, but you don't need a complicated conversion either. Simply loop over your array.
unsigned int Arraye[] = {0xffff,0xefef,65,66,67,68,69,0};
char Arrage[] = {0xffff,0xefef,65,66,67,68,69,0};
unsigned int *p;
for(p = Arraye+2; *p; p++)
printf("%c", (char)*p);
printf("%s", (char*)(2+ Arrage));
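Both answers boil down to narrowing each element into a separate char buffer. Here is a minimal sketch, assuming the unsigned int sequence is zero-terminated and every value fits in a char. `uint_to_cstr` is a hypothetical helper name, not something from the original answers:

```c
#include <stddef.h>

/* Copies a zero-terminated unsigned int sequence into dst as chars.
   Writes at most cap bytes including the terminator; returns the
   number of characters copied (excluding the terminator). */
size_t uint_to_cstr(const unsigned int *src, char *dst, size_t cap)
{
    size_t n = 0;
    if (cap == 0)
        return 0;
    while (src[n] != 0 && n + 1 < cap) {
        dst[n] = (char)src[n];   /* narrowing: assumes the value fits in a char */
        n++;
    }
    dst[n] = '\0';
    return n;
}
```

With the question's array, `uint_to_cstr(Arraye + 2, buf, sizeof buf)` fills `buf` with the text "ABCDE", which can then be passed to `printf("%s", buf)`. Passing the capacity explicitly means a short buffer truncates the output instead of overflowing.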

Related

Conversion from float to char[32] (or vice-versa) in C

I have two variables: a float named diff with a value like 894077435904.000000 (not always only with zero in the decimal part) and a char[32] which is the result of a double-sha256 calculation. I need to do a comparison between them (if(hash < diff) { //do something } ), but for this I need to convert one to the type of the other.
Is there a way to accomplish this? For example, converting the float to a char* (and using strcmp to do the comparison) or the char* to float (and using the above approach - if it's even possible, considering the char* is 256 bits, or 32 bytes long)?
I have tried converting float to char* like this:
char hex_str[2*sizeof(diff)+1];
snprintf(hex_str, sizeof(hex_str), "%0*lx", (int)(2*sizeof diff), (long unsigned int)diff);
printf("%s\n", hex_str);
When I have diff=894077435904.000000 I get hex_str=d02b2b00. How can I verify if this value is correct? Using this converter I obtain different results.
It is explained in great detail here.
Create an array of 32 unsigned bytes, set all its values to zero.
Extract the top byte from the difficulty and subtract that from 32.
Copy the bottom three bytes of the difficulty into the array, starting at the offset computed in step 2.
This array now contains the difficulty in raw binary. Use memcmp to compare it to the hash in raw binary.
Example code:
#include <stdio.h>
#include <string.h>
char* tohex="0123456789ABCDEF";
void computeDifficulty(unsigned char* buf, unsigned j)
{
memset(buf, 0, 32);
int offset = 32 - (j >> 24);
buf[offset] = (j >> 16) & 0xffu;
buf[offset + 1] = (j >> 8) & 0xffu;
buf[offset + 2] = j & 0xffu;
}
void showDifficulty(unsigned j)
{
unsigned char buf[32];
computeDifficulty(buf, j);
printf("%x -> ", j);
for (int i = 0; i < 32; ++i)
printf("%c%c ", tohex[buf[i] >> 4], tohex[buf[i] & 0xf]);
printf("\n");
}
int main()
{
showDifficulty(0x1b0404cbu);
}
Output:
1b0404cb -> 00 00 00 00 00 04 04 CB 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
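The example stops before the last step, the memcmp comparison. A rough sketch of that step follows, assuming the hash is held as 32 bytes with the most significant byte first (a hash stored in the opposite order would need reversing before the comparison). `expand_target` and `hash_below_target` are illustrative names, and the range check on the length byte is an addition that the original code does not have:

```c
#include <string.h>

/* Expands a compact difficulty (top byte = length, low three bytes =
   mantissa) into a 32-byte big-endian target. Returns 0 on success,
   -1 if the length byte would place the mantissa outside the buffer. */
int expand_target(unsigned char target[32], unsigned long compact)
{
    unsigned len = (unsigned)((compact >> 24) & 0xffu);
    if (len < 3 || len > 32)
        return -1;
    memset(target, 0, 32);
    target[32 - len]     = (unsigned char)((compact >> 16) & 0xffu);
    target[32 - len + 1] = (unsigned char)((compact >> 8) & 0xffu);
    target[32 - len + 2] = (unsigned char)(compact & 0xffu);
    return 0;
}

/* Returns 1 if hash < target, 0 if not, -1 on a malformed compact value.
   Assumes hash is 32 bytes, most significant byte first. */
int hash_below_target(const unsigned char hash[32], unsigned long compact)
{
    unsigned char target[32];
    if (expand_target(target, compact) != 0)
        return -1;
    return memcmp(hash, target, 32) < 0;
}
```

memcmp compares bytes as unsigned char from the lowest index upward, which matches numeric ordering only when both buffers are most significant byte first.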

Why padding works based on datatype

I am wondering about the behavior of the program below, since the padding seems to depend on the adjacent data type in C.
#include <stdio.h>
struct abc{
char a1;
int a2;
}X;
struct efg
{
char b1;
double b2;
}Y;
int main()
{
printf("Size of X = %zu\n",sizeof(X));
printf("Size of Y = %zu\n",sizeof(Y));
return 0;
}
Output of the program
root#root:~$./mem
Size of X = 8
Size of Y = 16
In structure abc 3 bytes are padded, whereas in structure efg 7 bytes are padded.
Is this how padding is designed?
Padding is being added to avoid the members crossing a word boundary when they don't need to; alignment, as some have said in comments. There is a nice explanation about it here:
http://www.geeksforgeeks.org/structure-member-alignment-padding-and-data-packing/
The size of the largest member does have an effect on the padding. Each member is aligned to its own alignment requirement, and the struct as a whole takes the strictest alignment among its members; its size is rounded up to a multiple of that alignment so that every element of an array of such structs stays properly aligned.
Because of this, an interesting detail is that you can often save space if you order your struct members by size, with the largest members declared first. Here's some code to illustrate that (I always find looking at a dump of the actual memory helps with things like this, rather than just the size)
#include <stdio.h>
// Inefficient ordering: to avoid members unnecessarily crossing word
// boundaries, extra padding is inserted.
struct X {
unsigned long a; // 8 bytes
unsigned char b; // 1 byte + 3 bytes padding
unsigned int c; // 4 bytes
unsigned char d; // 1 byte + 7 bytes trailing padding
};
// By ordering the struct this way, we use the space more
// efficiently. The two chars get packed into a single word.
struct Y {
unsigned long a; // 8 bytes
unsigned int c; // 4 bytes
unsigned char b; // 1 byte
unsigned char d; // 1 byte + 2 bytes trailing padding
};
struct X x = {.a = 1, .b = 2, .c = 3, .d = 4};
struct Y y = {.a = 1, .b = 2, .c = 3, .d = 4};
// Print out the data at some memory location, in hex
void print_mem (void *ptr, unsigned int num)
{
int i;
unsigned char *bptr = (unsigned char *)ptr;
for (i = 0; i < num; ++i) {
printf("%.2X ", bptr[i]);
}
printf("\n");
}
int main (void)
{
print_mem(&x, sizeof(struct X)); // This one will be larger
print_mem(&y, sizeof(struct Y)); // This one will be smaller
return 0;
}
And the output from running the above code:
01 00 00 00 00 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 00 00 00 00
01 00 00 00 00 00 00 00 03 00 00 00 02 04 00 00
There are various subtleties to this, and I'm sure it works a bit differently on various implementations. See http://www.catb.org/esr/structure-packing for more in-depth details about struct ordering and packing.
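The padding in the question's two structs can also be read off directly with offsetof instead of being inferred from sizeof. This is a small sketch, assuming a C11 compiler for `_Alignof`. The helper names `gap_abc`, `gap_efg`, and `report_layout` are made up for illustration:

```c
#include <stddef.h>
#include <stdio.h>

struct abc { char a1; int a2; };
struct efg { char b1; double b2; };

/* Padding bytes inserted between the leading char and the next member. */
size_t gap_abc(void) { return offsetof(struct abc, a2) - sizeof(char); }
size_t gap_efg(void) { return offsetof(struct efg, b2) - sizeof(char); }

/* Prints offset, size, and alignment so the padding can be read off directly. */
void report_layout(void)
{
    printf("abc: a2 at %zu, sizeof %zu, alignof(int) %zu\n",
           offsetof(struct abc, a2), sizeof(struct abc), _Alignof(int));
    printf("efg: b2 at %zu, sizeof %zu, alignof(double) %zu\n",
           offsetof(struct efg, b2), sizeof(struct efg), _Alignof(double));
}
```

The gap after the char is the alignment of the following member minus one. On a platform where int is aligned to 4 and double to 8, that gives the 3 and 7 bytes seen in the question.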

call a vararg function with an array?

In the example below, I would like to pass the contents of an array to a function that receives a variable number of arguments.
In other words, I would like to pass the contents of foo to printf by value, so that these arguments are passed on the stack.
#include <stdarg.h>
#include <stdio.h>
void main()
{
int foo[] = {1,2,3,4};
printf("%d, %d, %d, %d\n", foo);
}
I know this example looks stupid because I can use printf("%d, %d, %d, %d\n", 1,2,3,4);. Just imagine I'm calling void bar(char** a, ...) instead and the array is something I receive from RS232...
EDIT
In other words, I would like to avoid this:
#include <stdarg.h>
#include <stdio.h>
void main()
{
int foo[] = {1,2,3,4};
switch(sizeof(foo))
{
case 1: printf("%d, %d, %d, %d\n", foo[0]); break;
case 2: printf("%d, %d, %d, %d\n", foo[0], foo[1]); break;
case 3: printf("%d, %d, %d, %d\n", foo[0], foo[1], foo[2]); break;
case 4: printf("%d, %d, %d, %d\n", foo[0], foo[1], foo[2], foo[3]); break;
...
}
}
I would like to pass to printf the content of foo by value and thus, pass these arguments on the stack.
You cannot pass an array by value. Not by "normal" function call, and not by varargs either (which is, basically, just a different way of reading the stack).
Whenever you use an array as argument to a function, what the called function receives is a pointer.
The easiest example for this is the char array, a.k.a. "string".
#include <string.h>
int main()
{
char buffer1[100];
char buffer2[] = "Hello";
strcpy( buffer1, buffer2 ); // destination first, then source
}
What strcpy() "sees" is not two arrays, but two pointers:
char * strcpy( char * restrict s1, const char * restrict s2 )
{
// Yes I know this is a naive implementation in more than one way.
char * rc = s1;
while ( ( *s1++ = *s2++ ) );
return rc;
}
(This is why the size of the array is only known in the scope the array was declared in. Once you pass it around, it's just a pointer, with no place to put the size information.)
The same holds true for passing an array to a varargs function: What ends up on the stack is a pointer to the (first element of) the array, not the whole array.
You can pass an array by reference and do useful things with it in the called function if:
you pass the (pointer to the) array and a count of elements (think argc / argv), or
caller and callee agree on a fixed size, or
caller and callee agree on the array being "terminated" in some way.
Standard printf() does the last one for "%s" and strings (which are terminated by '\0'), but is not equipped to do so with, as in your example, an int[] array. So you would have to write your own custom printme().
In no case are you passing the array "by value". If you think about it, it wouldn't make much sense to copy all elements to the stack for larger arrays anyway.
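The first option (pointer plus element count) is usually the simplest replacement for the switch in the question. Here is a minimal sketch of such a custom function. `format_ints` is a hypothetical name, and it writes into a caller-supplied buffer rather than straight to stdout so the result is easy to check:

```c
#include <stddef.h>
#include <stdio.h>

/* Formats n ints from a into out as "1, 2, 3". Returns the number of
   characters written (excluding the terminator), or -1 if the buffer
   is too small. */
int format_ints(char *out, size_t cap, const int *a, size_t n)
{
    size_t used = 0;
    if (cap == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n; ++i) {
        int w = snprintf(out + used, cap - used, "%s%d", i ? ", " : "", a[i]);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;           /* output would have been truncated */
        used += (size_t)w;
    }
    return (int)used;
}
```

For the question's array, `format_ints(buf, sizeof buf, foo, sizeof foo / sizeof foo[0])` handles any element count with one loop, and the resulting string can be handed to `puts` or any other output routine.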
As already said, you cannot pass an array by value through va_arg directly. It is possible, though, if the array is packed inside a struct. It is not portable, but one can do some things when the implementation is known.
Here is an example that might help.
void call(size_t siz, ...);
struct xx1 { int arr[1]; };
struct xx10 { int arr[10]; };
struct xx20 { int arr[20]; };
void call(size_t siz, ...)
{
va_list va;
va_start(va, siz);
struct xx20 x = va_arg(va, struct xx20);
printf("HEXDUMP:%s\n", HEXDUMP(&x, siz));
va_end(va);
}
int main(void)
{
struct xx10 aa = { {1,2,3,4,5,[9]=-1}};
struct xx20 bb = { {[10]=1,2,3,4,5,[19]=-1}};
struct xx1 cc = { {-1}};
call(sizeof aa, aa);
call(sizeof bb, bb);
call(sizeof cc, cc);
}
Will print following (HEXDUMP() is one of my debug functions, it's obvious what it does).
HEXDUMP:
0x7fff1f154160:01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 ................
0x7fff1f154170:05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x7fff1f154180:00 00 00 00 ff ff ff ff ........
HEXDUMP:
0x7fff1f154160:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x7fff1f154170:00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x7fff1f154180:00 00 00 00 00 00 00 00 01 00 00 00 02 00 00 00 ................
0x7fff1f154190:03 00 00 00 04 00 00 00 05 00 00 00 00 00 00 00 ................
0x7fff1f1541a0:00 00 00 00 00 00 00 00 00 00 00 00 ff ff ff ff ................
Tested on Linux x86_64 compiled with gcc 5.1 and Solaris SPARC9 compiled with gcc 3.4
I don't know if it is helpful, but it's maybe a start. As can be seen, using the biggest struct in the function's va_arg allows smaller arrays to be handled if the size is known.
But be careful, it is probably full of undefined behaviour (for example, if you call the function with a struct whose array is smaller than 4 ints, it doesn't work on Linux x86_64 because the struct is passed in registers, not as an array on the stack; on your embedded processor it might work).
Short answer: No, you can't do it, it's impossible.
Slightly longer answer: Well, maybe you can do it, but it's super tricky. You are basically trying to call a function with an argument list that is not known until run time. There are libraries that can help you dynamically construct argument lists and call functions with them; one library is libffi: https://sourceware.org/libffi/.
See also question 15.13 in the C FAQ list: How can I call a function with an argument list built up at run time?
See also these previous Stackoverflow questions:
C late binding with unknown arguments
How to call functions by their pointers passing multiple arguments in C?
Calling a variadic function with an unknown number of parameters
OK, look at this example from my code. This is one simple way.
void my_printf(char const * frmt, ...)
{
va_list argl;
unsigned char const * tmp;
unsigned char chr;
va_start(argl,frmt);
while ((chr = (unsigned char)*frmt) != (char)0x0) {
frmt += 1;
if (chr != '%') {
dbg_chr(chr);
continue;
}
chr = (unsigned char)*frmt;
frmt += 1;
switch (chr) {
...
case 'S':
tmp = va_arg(argl,unsigned char const *);
dbg_buf_str(tmp,(uint16_t)va_arg(argl,int));
break;
case 'H':
tmp = va_arg(argl,unsigned char const *);
dbg_buf_hex(tmp,(uint16_t)va_arg(argl,int));
break;
case '%': dbg_chr('%'); break;
}
}
va_end(argl);
}
Here dbg_chr(uint8_t byte) writes the byte to the USART and enables the transmitter.
Use example:
#define TEST_LEN 0x4
uint8_t test_buf[TEST_LEN] = {'A','B','C','D'};
my_printf("This is hex buf: \"%H\"",test_buf,TEST_LEN);
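As mentioned above, variadic arguments might be passed as a struct-packed array. This depends on how the platform's calling convention lays out a struct passed by value, so it is not guaranteed by the C standard:
void logger(char * bufr, uint32_t * args, uint32_t argNum) {
// buf is assumed to be a char array declared at file scope
memset(buf, 0, sizeof buf);
struct {
uint32_t ar[16];
} argStr = {{0}};
for(uint8_t a = 0; a < argNum && a < 16; a += 1)
argStr.ar[a] = args[a];
snprintf(buf, sizeof buf, bufr, argStr);
// snprintf already null-terminates buf, so no strcat is needed
pushStr(buf, strlen(buf));
}
The author reports that this was tested and works with the GNU C compiler.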

Incorrect hex representations of characters with char but correct with unsigned char

I was writing a function that prints the "hexdump" of a given file. The function is as stated below:
bool printhexdump (FILE *fp) {
long unsigned int filesize = 0;
char c;
if (fp == NULL) {
return false;
}
while (! feof (fp)) {
c = fgetc (fp);
if (filesize % 16 == 0) {
if (filesize >= 16) {
printf ("\n");
}
printf ("%08lx ", filesize);
}
printf ("%02hx ", c);
filesize++;
}
printf ("\n");
return true;
}
However, on certain files, some invalid integer representations get printed, for example:
00000000 4d 5a ff90 00 03 00 00 00 04 00 00 00 ffff ffff 00 00
00000010 ffb8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030 00 00 00 00 00 00 00 00 00 00 00 00 ff80 00 00 00
00000040 ffff
Except for the last ffff, which is caused by the EOF value, the ff90, ffff, ffb8 etc. are wrong. However, if I change char to unsigned char, I get the correct representation:
00000000 4d 5a 90 00 03 00 00 00 04 00 00 00 ff ff 00 00
00000010 b8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00000030 00 00 00 00 00 00 00 00 00 00 00 00 80 00 00 00
00000040 ff
Why would the above behaviour happen?
Edit: the treatment of c by printf() should be the same since the format specifiers don't change. So I'm not sure how char would get sign extended while unsigned char won't?
Q: the treatment of c by printf() should be the same since the format specifiers don't change.
A: OP is correct, the treatment of c by printf() did not change. What changed was what was passed to printf(). As char or unsigned char, c goes through the usual integer promotions typically to int. char, if signed, gets a sign extension. A char value like 0xFF is -1. An unsigned char value like 0xFF remains 255.
Q: So I'm not sure how char would get sign extended while unsigned char won't?
A: They both got a sign extension. char may be negative, so its sign extension may be 0 or 1 bits. unsigned char is always positive, so its sign extension is 0 bits.
Solution
char c;
printf ("%02x ", (unsigned char) c);
// or
printf ("%02hhx ", c);
// or
unsigned char c;
printf ("%02x ", c);
// or
printf ("%02hhx ", c);
char can be a signed type, and in that case values 0x80 to 0xff get sign-extended before being passed to printf.
(char)0x80 is sign-extended to -128, which in unsigned short is 0xff80.
[edit] To be clearer about promotion: the value stored in a char is eight bits, and in that eight-bit representation a value like 0x90 will represent either -112 or 144, depending on whether the char is signed or unsigned. This is because the most significant bit is taken as the sign bit for signed types, and a magnitude bit for unsigned types. If that bit is set, it either makes the value negative (by subtracting 128) or it makes it larger (by adding 128), depending on whether or not it's a signed type.
The promotion from char to int will always happen, but if char is signed then converting it to int requires that the sign bit be extended up to the sign bit of the int so that the int represents the same value as the char did.
Then printf gets hold of it, but printf doesn't know whether the original type was signed or unsigned, and it doesn't know that it used to be a char. What it does know is that the format specifier is for an unsigned hexadecimal short, so it prints that number as if it were unsigned short. The bit pattern for -112 in a 16-bit int is 1111111110010000; formatted as hex, that's ff90.
If your char is unsigned then 0x90 does not represent a negative value, and when you convert it to an int nothing needs to be changed in the int to make it represent the same value. The rest of the bit pattern is all zeroes and printf doesn't need those to display the number correctly.
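The promotion arithmetic described here can be checked directly. This is a small sketch, assuming 8-bit char, two's complement, and a 16-bit unsigned short (true on common platforms, but not guaranteed by the standard). The function names are illustrative only:

```c
/* The byte 0x90 read as a signed char and promoted to int. */
int promoted_signed(void)   { return (signed char)-112; }   /* bit pattern 0x90 */

/* The same byte read as an unsigned char and promoted to int. */
int promoted_unsigned(void) { return (unsigned char)0x90; }

/* What %hx effectively prints: the int converted to unsigned short. */
unsigned short as_hx(int promoted) { return (unsigned short)promoted; }
```

Under those assumptions `as_hx(promoted_signed())` is 0xff90 and `as_hx(promoted_unsigned())` is 0x90, matching the two outputs in the question.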
Because in unsigned char the most significant bit has a different meaning than that of signed char.
For example, 0x90 in binary is 10010000, which is 144 decimal when unsigned, but -112 decimal when signed (two's complement).
Whether or not char is signed is platform-dependent. This means that the sign bit may or may not be extended depending on your machine, and thus you can get different results.
However, using unsigned char ensures that there is no sign extension (because there is no sign bit anymore).
The problem is simply caused by the format. %02hx takes an int. When you take a character below 128, all is fine: it is positive and will not change when converted to an int.
Now, let's take a char of 128 or above, say 0x90. As an unsigned char, its value is 144; it will be converted to an int value of 144 and be printed as 90. But as a signed char, its value is -112 (still 0x90); it will be converted to an int of value -112 (0xff90 for a 16-bit int) and be printed as ff90.
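Putting the answers together, a corrected version of the question's function might look like the sketch below. It keeps the return value of fgetc in an int, which avoids both the sign extension and the stray trailing ffff from printing EOF. The extra `out` parameter and the name `hexdump` are additions made here for illustration:

```c
#include <stdbool.h>
#include <stdio.h>

/* Writes a hexdump of `in` to `out`; returns false on a NULL stream. */
bool hexdump(FILE *in, FILE *out)
{
    unsigned long offset = 0;
    int c;                        /* int, so EOF is distinguishable from 0xff */

    if (in == NULL || out == NULL)
        return false;

    while ((c = fgetc(in)) != EOF) {
        if (offset % 16 == 0) {
            if (offset >= 16)
                fputc('\n', out);
            fprintf(out, "%08lx ", offset);
        }
        fprintf(out, "%02x ", (unsigned)c);   /* c is already 0..255 here */
        offset++;
    }
    fputc('\n', out);
    return true;
}
```

fgetc returns each byte as an unsigned char converted to int, so no cast through char is needed. Testing the return value against EOF also replaces the `while (!feof(fp))` loop, which runs one iteration too many.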

What is the exact usage of shift operator in C

I thought the shift operator shifts the memory of the integer or the char on which it is applied, but the output of the following code came as a surprise to me.
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(void) {
uint64_t number = 33550336;
unsigned char *p = (unsigned char *)&number;
size_t i;
for (i=0; i < sizeof number; ++i)
printf("%02x ", p[i]);
printf("\n");
//shift operation
number = number<<4;
p = (unsigned char *)&number;
for (i=0; i < sizeof number; ++i)
printf("%02x ", p[i]);
printf("\n");
return 0;
}
The system on which it ran is little endian and produced the following output:
00 f0 ff 01 00 00 00 00
00 00 ff 1f 00 00 00 00
Can somebody provide some reference to the detailed working of the shift operators?
I think you've answered your own question. The machine is little endian, which means the bytes are stored in memory with the least significant byte first (at the lowest address, printed leftmost by your loop). So your memory represents:
00 f0 ff 01 00 00 00 00 => 0x0000000001fff000
00 00 ff 1f 00 00 00 00 => 0x000000001fff0000
As you can see, the second is the same as the first value, shifted left by 4 bits.
Everything is right:
(1 * (256^3)) + (0xff * (256^2)) + (0xf0 * 256) = 33 550 336
(0x1f * (256^3)) + (0xff * (256^2)) = 536 805 376
33 550 336 * (2^4) = 536 805 376
Shifting left by 4 bits is the same as multiplying by 2^4.
I think your printf confuses you. Here are the values:
33550336 = 0x01FFF000
33550336 << 4 = 0x1FFF0000
Can you read your output now?
It doesn't shift the memory, but the bits. So you have the number:
00 00 00 00 01 FF F0 00
After shifting this number 4 bits (one hexadecimal digit) to the left you have:
00 00 00 00 1F FF 00 00
Which is exactly the output you get, when transformed to little endian.
Your loop is printing bytes in the order they are stored in memory, and the output would be different on a big-endian machine. If you want to print the value in hex, use %016llx (casting the uint64_t to unsigned long long, or use the PRIx64 macro from <inttypes.h>). Then you'll see what you expect:
0000000001fff000
000000001fff0000
The second value is left-shifted by 4.
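To inspect individual bytes without depending on the machine's byte order, the bytes can be extracted with shifts rather than through a pointer into memory. This is a brief sketch; `bytes_msb_first` is a hypothetical helper name:

```c
#include <stdint.h>

/* Splits v into 8 bytes, most significant first, using only shifts and
   masks, so the result does not depend on the machine's byte order. */
void bytes_msb_first(uint64_t v, unsigned char out[8])
{
    for (int i = 0; i < 8; ++i)
        out[i] = (unsigned char)((v >> (56 - 8 * i)) & 0xffu);
}
```

For 33550336 this yields 00 00 00 00 01 ff f0 00, and for 33550336 << 4 it yields 00 00 00 00 1f ff 00 00, on both little-endian and big-endian machines. Shifts operate on the value and not on its storage layout.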

Resources