Get last two characters from char array elements - arrays

Trying to print the array elements, like below:

typedef struct Info
{
    char MacAdd[2];
} Info;

Info *info;   /* points at a filled-in Info */
char MAC[100];
sprintf(MAC, "%x:%x%c", info->MacAdd[0], info->MacAdd[1], '\0');
printf("MAC %s", MAC);

The output I get is --> ffffff98:ffffffa4
How can I get output like ---> 98:a4

The MacAdd array in your Info structure is declared as an array of char. But char is usually a signed type.
When you call sprintf (or printf), the default argument promotions take place. Among other things, type char is promoted to int. And since 0x98 as a signed char is actually negative (it's -104), it is sign extended when that happens -- that's where the extra ff's come from. The value ffffff98 is the 32-bit two's complement representation of -104, so the value has been preserved.
There are at least three ways to fix this, in what I'd consider the order of most to least attractive:
Redeclare the MacAdd array as an array of unsigned char.
Change the printf format to "%hhx:%hhx". The hh tells printf that your value was originally a char, and this asks printf to undo the promotion and, in effect, strip the unwanted ff's back off.
Change the call to sprintf(MAC, "%x:%x", info->MacAdd[0] & 0xff, info->MacAdd[1] & 0xff);
As @Aconcagua points out in a comment, once you get rid of the ff's, you will probably also want to tweak your format a little bit more (e.g. %02x) so that single-digit bytes are zero-padded.
Footnote: Solution 3 above is reasonably terrible, and I wouldn't seriously recommend it. It's what we used to use back in the days of Ritchie's original C compiler, when neither unsigned char nor %hhx had been invented yet.
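Here is a minimal, self-contained sketch of fixes 1 and 2 together; the MAC bytes are invented just to reproduce the question's example:

#include <stdio.h>

typedef struct Info
{
    unsigned char MacAdd[2];   /* fix 1: unsigned char, so no sign extension */
} Info;

int main(void)
{
    Info info = { { 0x98, 0xa4 } };
    char MAC[100];

    /* fix 2 (also works with plain char): hh strips the promotion back off,
     * and the 02 width zero-pads single-digit bytes as noted above */
    sprintf(MAC, "%02hhx:%02hhx", info.MacAdd[0], info.MacAdd[1]);
    printf("MAC %s\n", MAC);   /* prints: MAC 98:a4 */
    return 0;
}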

Related

read function in C: uint8_t and char array buffer differences

In some code I saw online, someone used a uint8_t array instead of a char array as the buffer for the read function in C.
What are the differences?
Thanks
The C standard allows char to be signed or unsigned. It also allows it to be more than eight bits.
uint8_t, if it is defined, is unsigned and eight bits. This allows programmers to be completely sure of the type that will be used. In particular, signed char types sometimes cause problems with bitwise and shift operations, due to how these operations are defined (or are not defined) when negative values are involved.
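As a small sketch of the point above (the input file is an arbitrary example): bytes read into a uint8_t buffer can be combined with shifts and masks without any risk of sign extension, which is exactly where a plain (possibly signed) char buffer can surprise you:

#include <fcntl.h>    /* open */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>   /* read, close */

int main(void)
{
    uint8_t buf[4];
    int fd = open("/dev/urandom", O_RDONLY);   /* arbitrary byte source */
    if (fd < 0)
        return 1;

    if (read(fd, buf, sizeof buf) == (ssize_t)sizeof buf) {
        /* Assemble a 32-bit value from the bytes. With uint8_t each byte is
         * 0..255; with a signed char buffer, a byte >= 0x80 would be sign
         * extended here and corrupt the upper bits. */
        uint32_t value = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16)
                       | ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];
        printf("%08x\n", value);
    }
    close(fd);
    return 0;
}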
Every char corresponds to a number (see the ASCII table here). I think people use uint8_t to avoid some problems. (Sorry, I don't use C much; I come from C++.)

Why can I printf with the wrong specifier and still get output?

My question involves the memory layout and mechanics behind the C printf() function. Say I have the following code:
#include <stdio.h>

int main()
{
    short m_short;
    int m_int;

    m_int = -5339876;
    m_short = m_int;
    printf("%x\n", m_int);
    printf("%x\n", m_short);
    return 0;
}
On GCC 7.5.0 this program outputs:
ffae851c
ffff851c
My question is, where is the ffff actually coming from in the second hex number? If I'm correct, those fs should be outside the bounds of the short, but printf is getting them from somewhere.
When I properly format with specifier %hx, the output is rightly:
ffae851c
851c
As far as I have studied, the compiler simply truncates the top half of the number, as shown in the second output. So in the first output, is the program actually reading memory it shouldn't to get those first four f's? Or does the compiler behind the scenes still reserve a full int even for a short, sign-extended, with the high half being undefined behavior if used?
Note: I am performing research, in a real-world application, I would never try to abuse the language.
When a char or short (including signed and unsigned versions) is used as a function argument where there is no specific type (as with the ... arguments to printf(format,...))1, it is automatically promoted to an int (assuming it is not already as wide as an int2).
So printf("%x\n", m_short); has an int argument. What is the value of that argument? In the assignment m_short = m_int;, you attempted to assign it the value −5339876 (represented with bytes 0xffae851c). However, −5339876 will not fit in this 16-bit short. In assignments, a conversion is automatically performed, and, when a conversion of an integer to a signed integer type does not fit, the result is implementation-defined. It appears your implementation, as many do, uses two’s complement and simply takes the low bits of the integer. Thus, it puts the bytes 0x851c in m_short, representing the value −31460.
Recall that this is being promoted back to int for use as the argument to printf. In this case, it fits in an int, so the result is still −31460. In a two’s complement int, that is represented with the bytes 0xffff851c.
Now we know what is being passed to printf: An int with bytes 0xffff851c representing the value −31460. However, you are printing it with %x, which is supposed to receive an unsigned int. With this mismatch, the behavior is not defined by the C standard. However, it is a relatively minor mismatch, and many C implementations let it slide. (GCC and Clang do not warn even with -Wall.)
Let’s suppose your C implementation does not treat printf as a special known function and simply generates code for the call as you have written it, and that you later link this program with a C library. In this case, the compiler must pass the argument according to the specification of the Application Binary Interface (ABI) for your platform. (The ABI specifies, among other things, how arguments are passed to functions.) To conform to the ABI, the C compiler will put the address of the format string in one place and the bits of the int in another, and then it will call printf.
The printf routine will read the format string, see %x, and look for the corresponding argument, which should be an unsigned int. In every C implementation and ABI I know of, an int and an unsigned int are passed in the same place. It may be a processor register or a place on the stack. Let’s say it is in register r13. So the compiler designed your calling routine to put the int with bytes 0xffff851c in r13, and the printf routine looked for an unsigned int in r13 and found bytes 0xffff851c.
So the result is that printf interprets the bytes 0xffff851c as if they were an unsigned int, formats them with %x, and prints “ffff851c”.
Essentially, you got away with this because (a) a short is promoted to an int, which is the same size as the unsigned int that printf was expecting, and (b) most C implementations are not strict about mismatching integer types of the same width with printf. If you had instead tried printing an int using %ld, you might have gotten different results, such as “garbage” bits in the high bits of the printed value. Or you might have a case where the argument you passed is supposed to be in a completely different place from the argument printf expected, so none of the bits are correct. In some architectures, passing arguments incorrectly could corrupt the stack and break the program in a variety of ways.
Footnotes
1 This automatic promotion happens in many other expressions too.
2 There are some technical details regarding these automatic integer promotions that need not concern us at the moment.
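A small sketch of the chain of conversions described above, assuming a 16-bit short, a 32-bit int, and two's complement (as on the asker's platform):

#include <stdio.h>

int main(void)
{
    int   m_int   = -5339876;       /* bit pattern 0xffae851c */
    short m_short = (short)m_int;   /* implementation-defined: commonly keeps
                                       the low 16 bits, 0x851c, i.e. -31460 */

    printf("%d\n", (int)m_short);             /* -31460: the promoted value */
    printf("%x\n", (unsigned)(int)m_short);   /* ffff851c: the same bits viewed as unsigned */
    printf("%hx\n", m_short);                 /* 851c: %hx undoes the promotion */
    return 0;
}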

what happens when we type cast from lower datatype to higher datatype

Does the accessible memory actually change, or does the cast just tell the compiler to treat the variable as the mentioned type?
Example:
#include <stdio.h>

int main()
{
    char a;
    a = 123456789;
    printf("ans is %d\n", (int)a);
}
Compiler warning:
overflow in implicit constant conversion (a = 123456789)
Output:
ans is 21
Here I know why it's causing overflow. But I want to know how memory is accessed when an overflow occurs.
This is kind of simple: since char holds only one byte (typically 8 bits), only a single byte of 123456789 will be copied to a. Exactly how depends on whether char is signed or unsigned (it's implementation-specific which one it is). For the exact details see e.g. this integer conversion reference.
What typically happens (I haven't seen any compiler do any different) is that the last byte of the value is copied, unmodified, into a.
For 123456789, if you view the hexadecimal representation of the value it will be 0x75bcd15. Here you can easily see that the last byte is 0x15 which is 21 in decimal.
What happens with the cast to int when you print the value is actually nothing that wouldn't happen anyway: when calling variable-argument functions like printf, values of a type smaller than int are promoted to int. Your printf call is exactly equal to
printf("ans is %d\n",a);

Question regarding C argument promotions [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 11 years ago.
I've been studying how to use loops to make my code more efficient, so that I can repeat a particular block of code without typing it over and over again. After attempting to use what I've learned so far to program something, I feel it's time to move on to the next chapter and learn how to use control statements to let the program make decisions.
But before I advance to that, I still have a few questions about earlier material that I need an expert's help with. They're about data types.
A. Character Type
I extracted the following from the book C Primer Plus, 5th ed.:
Somewhat oddly, C treats character constants as type int rather than char. For example, on an ASCII system with a 32-bit int and an 8-bit char, the code:
char grade = 'B';
represents 'B' as the numerical value 66 stored in a 32-bit unit, but grade winds up with 66 stored in an 8-bit unit. This characteristic of character constants makes it possible to define a character constant such as 'FATE', with four separate 8-bit ASCII codes stored in a 32-bit unit. However, attempting to assign such a character constant to a char variable results in only the last 8 bits being used, so the variable gets the value 'E'.
So the next thing I did after reading this was, of course, to follow along: I stored the word FATE in a char variable, compiled, and printed it with printf() to see what gets stored. But instead of getting the character 'E' printed out, what I get is 'F'.
Does this mean there's a mistake in the book? Or is there something I misunderstood?
The passage above says that C treats character constants as type int. So to try it out, I assigned a number bigger than 255 (e.g. 356) to a char variable.
Since 356 is within the range of a 32-bit int (I'm running Windows 7), I expected it to print 356 when I used the %d specifier.
But instead of printing 356, it gives me 100, which is the value of the last 8 bits.
Why does this happen? I thought char == int == 32 bits? (Although the book does mention earlier that char is only a byte.)
B. Int and Floating Type
I understand that when a value of type short is passed to a variadic function, or to a function without a prototype, it is automatically promoted to int.
The same happens with floating-point types: when a float is passed, it is converted to double. That is why there is no specifier for float; there is only %f for double and %Lf for long double.
But why is there a specifier for short, which is also promoted, and not one for float? Why don't they just give float a modifier like %hf or something? Is there anything logical or technical behind this?
A lot of questions in one question... Here are answers to a couple:
This characteristic of character constants makes it possible to define a character constant such as 'FATE', with four separate 8-bit ASCII codes stored in a 32-bit unit. However, attempting to assign such a character constant to a char variable results in only the last 8 bits being used, so the variable gets the value 'E'.
This is actually implementation defined behavior. So yes, there's a mistake in the book. Many books on C are written with the assumption that the only C compiler in the world is the one the author used when testing the examples.
The compiler the author used treated the characters in 'FATE' as the bytes of an integer with 'F' being the most significant byte and 'E' being the least significant. Your compiler treats the characters in the literal as bytes of an integer with 'F' being the least significant byte and 'E' the most significant. For example, the first method is how MSVC treats the value, while MinGW (a GCC compiler targeting Windows) treats the literal in the second way.
As far as there being no format specifier for printf() that expects a float, only specifiers that expect a double: this is because the values passed to printf() for formatting are part of the variable argument list (the ... in printf()'s prototype). There is no type information about these arguments, so, as you mentioned, the compiler must always promote them (from C99 6.5.2.2/6 "Function calls"):
If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions.
And C99 6.5.2.2/7 "Function calls"
The ellipsis notation in a function prototype declarator causes argument type conversion to stop after the last declared parameter. The default argument promotions are performed on trailing arguments.
So in effect, it's impossible to pass a float to printf() - it will always be promoted to a double. That's why the format specifiers for floating point values expect a double.
Also, because of the automatic promotion applied to short, I'm honestly not sure the h specifier for formatting a short is strictly necessary (though it is necessary with the n specifier if you want the count of characters written to the stream placed in a short). It might be in C to support the n specifier, for historical reasons, or for something I'm just not thinking of.
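As a small illustration of the default argument promotions quoted above (nothing here is specific to any one compiler):

#include <stdio.h>

int main(void)
{
    float f = 1.5f;
    short s = 300;

    /* %f expects a double; the float argument has already been promoted,
     * so there is nothing a hypothetical "%hf" could add. */
    printf("%f\n", f);    /* 1.500000 */

    /* %hd says the value started life as a short; the argument itself
     * still arrives as an int after promotion. */
    printf("%hd\n", s);   /* 300 */
    return 0;
}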
First, a char is by definition exactly 1 byte wide. Then the standard more or less says that the sizes should be:
sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)
The exact sizes vary (except for char) by system and compiler but on a 32 bit Windows the sizes with GCC and VC are (AFAIK):
sizeof(short) == 2 (byte)
sizeof(int) == sizeof(long) == 4 (byte)
Your observation of 'F' versus 'E' in this case is a typical endianness issue (little vs. big endian, how a "word" is stored in memory).
Now what happens to your value? You have a variable that is 8 bits wide. You assign a bigger value ('FATE' or 356), but the compiler knows it is only allowed to store 8 bits, so it cuts off all the other bits.
To A:
3.) This is due to the different byte orderings of big- and little-endian CPU architectures. You get the first byte on a little-endian CPU (e.g. x86) and the last byte on a big-endian CPU (e.g. PPC). Actually, you always get the lowest 8 bits when the conversion from int to char is done, but the characters in the int are stored in reversed order.
7.) A char can only hold 8 bits, so everything else gets truncated the moment you assign the int to a char variable, and it can never be recovered from the char variable later.
To B:
3.) You might sometimes want to print only the lowest 16 bits of an int variable, regardless of what is in the upper half. It is not uncommon to pack multiple integer values into a single variable for certain optimizations. This works well for integer types but does not make much sense for floating-point types, which don't support bitwise operations directly; that might be why there is no separate specifier for float in printf.
char is 1 byte long. The number of bits in a byte can be 8, 16, or 32; on general-purpose computers it is usually 8. So the maximum number a char can represent depends on the bit length of a byte. To check it, see the limits.h header file, where it is defined as CHAR_BIT.
What char x = 'FATE' does will probably depend on the byte ordering in which the machine/compiler interprets 'FATE', so this is system/compiler dependent. Someone please confirm/correct this.
If your system has 8-bit bytes, then when you do c = 356, only the lower 8 bits of the binary representation of 356 will be stored in the variable, because char data is always allocated 1 byte of storage. So %d will print 100, because the upper bits were lost when you assigned the value to the variable, and only the lower 8 bits are left.
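For completeness, a hedged sketch of both experiments from part A; every result below is implementation-defined, so different compilers may print different characters:

#include <stdio.h>

int main(void)
{
    int  word = 'FATE';      /* multicharacter constant: value is implementation-defined
                                (most compilers warn about it) */
    char c    = (char)word;  /* keeps only the low byte of that value */

    printf("word = %#x\n", word);
    printf("c    = %c\n", c);    /* may print 'E' or 'F', depending on the compiler */

    char d = (char)356;          /* 356 = 0x164; the low byte is 0x64, i.e. 100 */
    printf("d    = %d\n", d);    /* 100 */
    return 0;
}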

C compiler flag to ignore sign

I am currently dealing with code purchased from a third party contractor. One struct has an unsigned char field while the function that they are passing that field to requires a signed char. The compiler does not like this, as it considers them to be mismatched types. However, it apparently compiles for that contractor. Some Googling has told me that "[i]t is implementation-defined whether a char object can hold negative values". Could the contractor's compiler basically ignore the signed/unsigned type and treat them the same? Or is there a compiler flag that will treat them the same?
C is not my strongest language--just look at my tags on my user page--so any help would be much appreciated.
Actually char, signed char and unsigned char are three different types. From the standard (ISO/IEC 9899:1990):
6.1.2.5 Types
...
The three types char, signed char and unsigned char are collectively called the character types.
(and in C++, for instance, you have to, or at least should, write three overloads of a function if you want to cover every flavour of char argument)
Plain char might be treated as signed or unsigned by the compiler, but the standard says (also in 6.1.2.5):
An object declared as type char is large enough to store any member of the basic execution character set. If a member of the required source character set in 5.2.1 is stored in a char object, its value is guaranteed to be positive. If other quantities are stored in a char object, the behavior is implementation-defined: the values are treated as either signed or nonnegative integers.
and
An object declared as type signed char occupies the same amount of storage as a ''plain'' char object.
The characters referred to in 5.2.1 are A-Z, a-z, 0-9, space, tab, newline and the following 29 graphic characters:
! " # % & ' ( ) * + , - . / :
; < = > ? [ \ ] ^ _ { | } ~
Answer
All of that I interpret to basically mean that ASCII characters with a value less than 128 are guaranteed to be positive. So if the values stored are always less than 128, it should be safe (from a value-preserving perspective), although it is not good practice.
This is compiler-dependent. For example, in VC++ there's a compiler option and a corresponding _CHAR_UNSIGNED macro defined if that option instructs to use unsigned char by default.
I take it that you're talking about fields explicitly typed signed char and unsigned char, so the mismatch is explicit. If one of them were simply char, it might match in whatever compiler the contractor is using (IIRC, it's implementation-defined whether char is signed or unsigned), but not in yours. In that case, you might be able to get by with a command-line option or something to change yours.
Alternatively, the contractor might be using a compiler, or compiler options, that allow him to compile while ignoring errors or warnings. Do you know what sort of compilation environment he has?
In any case, this is not good C. If one of the types is just char, it relies on implementation-defined behavior, and therefore isn't portable. If not, it's flat wrong. I'd take this up with the contractor.
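A minimal sketch of the situation and the cast-based workaround; the struct and function names are invented for illustration, and the cast is only value-preserving here because the value stays below 128, as discussed above:

#include <stdio.h>

struct packet {
    unsigned char flags;        /* the third-party struct's unsigned char field */
};

/* stand-in for the contractor's function that wants a signed char */
static void consume(signed char c)
{
    printf("%d\n", c);
}

int main(void)
{
    struct packet p = { 0x41 };

    /* The explicit cast silences the type mismatch; it is safe here because
     * 0x41 < 128, so the value is representable in both character types. */
    consume((signed char)p.flags);
    return 0;
}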
