Why is a cast needed in printf? - c

To print a number of type off_t, the following piece of code was recommended:
off_t a;
printf("%llu\n", (unsigned long long)a);
Why isn't the format string enough?
What would the problem be if the value were not cast?

The format string doesn't tell the compiler to perform a cast to unsigned long long, it just tells printf that it's going to receive an unsigned long long. If you pass in something that's not an unsigned long long (which off_t might not be), then printf will simply misinterpret it, with surprising results.
The reason for this is that the compiler doesn't have to know anything about format strings. A good compiler will give you a warning message if you write printf("%d", 3.0), but what can a compiler do if you write printf(s, 3.0), with s being a string determined dynamically at run-time?
Edited to add: As Keith Thompson points out in the comments below, there are many places where the compiler can perform this sort of implicit conversion. printf is rather exceptional, in being one case where it can't. But if you declare a function to accept an unsigned long long, then the compiler will perform the conversion:
#include <stdio.h>
#include <sys/types.h>

int print_llu(unsigned long long ull)
{
    return printf("%llu\n", ull); // O.K.; already converted
}

int main(void)
{
    off_t a = 0; // initialized, so the only issue below is the argument type

    printf("%llu\n", a);                     // WRONG! Undefined behavior!
    printf("%llu\n", (unsigned long long)a); // O.K.; explicit conversion
    print_llu((unsigned long long)a);        // O.K.; explicit conversion
    print_llu(a);                            // O.K.; implicit conversion
    return 0;
}
The reason for this is that printf is declared as int printf(const char *format, ...), where the ... is a "variadic" or "variable-arguments" notation, telling the compiler that it can accept any number and types of arguments after the format. (Obviously printf can't really accept any number and types of arguments: it can only accept the number and types that you tell it to, using format. But the compiler doesn't know anything about that; it's left to the programmer to handle it.)
Even with ..., the compiler does do some implicit conversions, such as promoting char to int and float to double. But these conversions are not specific to printf, and they do not, and cannot, depend on the format string.
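For instance, those default promotions are why the following is fine even though no cast appears (a minimal sketch):

#include <stdio.h>

int main(void)
{
    char c = 'A';
    float f = 1.5f;

    /* Under the default argument promotions, c is passed as int and
       f as double, which is exactly what %c and %f expect. */
    printf("%c %f\n", c, f);
    return 0;
}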

The problem is you don't know how big an off_t is. It could be a 64-bit type or a 32-bit type (or perhaps something else). If you use %llu but do not pass an unsigned long long, you get undefined behavior; in practice it might just print garbage.
Not knowing how big it is, the easy way out is to cast it to the biggest reasonable type your system supports, e.g. an unsigned long long. That way using %llu is safe, as printf will receive an unsigned long long because of the cast.
(E.g. on Linux, off_t is 32 bits by default on a 32-bit machine, and 64 bits if you enable large file support before including the relevant system headers: #define _FILE_OFFSET_BITS 64 in the source, or -D_FILE_OFFSET_BITS=64 on the compiler command line.)

The signature of printf looks like this:
int printf(const char *format, ...);
The ... ("varargs") notation indicates that anything can follow, and by the rules of C you can pass anything to printf as long as you include a format string. C simply does not have any construct for describing restrictions on the types of the objects passed. This is why you must use casts so that the objects passed have exactly the needed type.
This is typical of C: it walks a line between rigidity and trusting the programmer. An unrelated example is that you may use char * (without const) to refer to string literals, but if you modify them, your program may crash.
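To illustrate that last point, a minimal sketch (the crash is possible, not guaranteed, because the behavior is undefined):

int main(void)
{
    char *s = "hello"; /* allowed: a string literal converts to char * */
    s[0] = 'H';        /* undefined behavior: the literal may live in read-only memory */
    return 0;
}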

Related

long int argument != long int parameter

When passing 1113355579999 as an argument, the value changes inside the function to 959050335.
Call (main.c):
printf("%d\n", FindCommonDigit(1113355579999, 123457));
Function (ex4.c):
int FindCommonDigit(long int n1, long int n2) { printf("%d\n", n1); }
What's the problem?
Worth mentioning: the value changes before it even gets to the printf.
The decimal number 1113355579999 is too large to be accommodated by a 32-bit integer, which is a common size for type long int, and is in fact the size of long int in your MSVC environment. On a C implementation that provides 32-bit long ints, that constant has type long long int.
You can pass a long long int to a parameter of type long int, but if the value is too large for long int then the resulting behavior is implementation-defined. Possibly the least-significant 32 bits are retained, which, in the case of your particular number, would result in the number 959050335 (look familiar?). To pass the argument into the function without loss of fidelity, the function parameter must have a type that can accommodate the argument. On a conforming C implementation, long long int will suffice.
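You can reproduce that number by masking off everything above the low 32 bits (a sketch; assumes a 64-bit long long):

#include <stdio.h>

int main(void)
{
    long long big = 1113355579999LL;

    /* Keep only the least-significant 32 bits, as a 32-bit long would. */
    printf("%lld\n", big & 0xFFFFFFFFLL); /* prints 959050335 */
    return 0;
}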
Having received the argument correctly, the function must also present it correctly to printf(), else the behavior is undefined. The formatting directive for a long long int expressed in decimal is %lld.
Putting that together, you appear to want this:
int FindCommonDigit(long long int n1, long long int n2) {
    printf("%lld\n", n1);
    return /* ... something ... */;
}
You do need the function to return an int, else the behavior is again undefined.
Additionally, as @pmg observed in comments, a prototype for that function must be in scope at the point where it is called. That would be this ...
int FindCommonDigit(long long int n1, long long int n2);
... near the top of the source file in which the function is used (i.e. main.c). You can put that directly into the file if you like, but you should consider instead putting the prototype into a header file and #include-ing that. The latter is particularly useful if the function will be used in multiple source files.
Note that only long long int is guaranteed to be large enough to store the result of that calculation (or, indeed, the input values you're using).
You will also need to ensure that you use your compiler in a C99-compatible mode (for example, using the -std=gnu99 option to gcc), because the long long int type was not introduced until C99.
1113355579999 is too large to fit in your platform's long ints.

Explicit cast of pointer to long long

I need to cast a pointer to a long long, and would prefer to do it in a way that gcc doesn't complain on either 32 or 64-bit architectures about converting pointer to ints of different size. And before anyone asks, yes, I know what I'm doing, and I know what I'm casting to -- my specific use case is wanting to send a stack trace (the pointers themselves being the subject here) over the network when an application error occurs, so there is no guarantee the sender and receiver will have the same word size. I've therefore built a struct holding the message data with, among other entries, an array of "unsigned long long" values (guaranteed minimum 64-bits) to hold the pointers. And yes, I know "long long" is not guaranteed to be only 64-bits, but all compilers I'm using for both source and destination implement it as 64-bits. Because the header (and source) with the struct will be used on both architectures, "uintptr_t" doesn't seem like a workable solution (because, according to the definition in stdint.h, its size is architecture-dependent).
I thought about getting tricky with anonymous unions, but this feels a little too hackish to me...I'm hoping there's a way with some double-cast magic or something to do this in C99 (since anonymous unions weren't standard until C11).
EDIT:
typedef struct error_msg_t {
    int msgid;
    int len;
    pid_t pid;
    int si_code;
    int signum;
    int errno;
    unsigned long long stack[20];
    char err_msg[];
} error_msg_t;
...
void **stack;
...
msg.msgid = ERROR_MSG;
msg.len = sizeof(error_msg_t) + strlen(err_msg) + 1;
msg.pid = getpid();
...
for (i=0; i<stack_depth; i++)
msg.stack[i] = (unsigned long long)stack[i];
Warning (on a 32-bit compile) about casting to integer of different size occurs on the last line.
Probably your best bet is to double cast to spell it out to the compiler what you want to do (as suggested by Max).
I would recommend wrapping it up into a macro so that the code intention is clear from the macro name.
#define PTR_TO_UINT64(x) (uint64_t)(uintptr_t)(x)
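Applied to the loop from the question, that might look like this (a sketch; uint64_t and uintptr_t come from <stdint.h>, and the other variables are the ones from the question):

#include <stdint.h>

#define PTR_TO_UINT64(x) (uint64_t)(uintptr_t)(x)

/* ... */
for (i = 0; i < stack_depth; i++)
    msg.stack[i] = PTR_TO_UINT64(stack[i]); /* pointer -> uintptr_t -> uint64_t,
                                               so no truncation warning */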

Pass char* to method expecting unsigned char*

I am working on some embedded device which has an SDK. It has a method like:
MessageBox(u8*, u8*); // u8 is typedef'd as unsigned char when I checked
But I have seen in their examples calling code like:
MessageBox("hi","hello");
passing char pointers without a cast. Can this be well defined? I am asking because I ran some tool over the code, and it was complaining about the mismatch above:
messageBox("Status", "Error calculating \rhash");
diy.c 89 Error 64: Type mismatch (arg. no. 1) (ptrs to signed/unsigned)
diy.c 89 Error 64: Type mismatch (arg. no. 2) (ptrs to signed/unsigned)
Sometimes I get different opinions on this, and that confuses me even more. So to sum up: by using their API the way described above, is this a problem? Will it crash the program?
It would also be nice to hear the correct way to pass strings to SDK functions expecting unsigned char * without causing a constraint violation.
It is a constraint violation, so technically it is not well defined, but in practice, it is not a problem. Yet you should cast these arguments to silence these warnings. An alternative to littering your code with ugly casts is to define an inline function:
static inline unsigned char *ucstr(const char *str) { return (unsigned char *)str; }
And use that function wherever you need to pass strings to the APIs that (mistakenly) take unsigned char * arguments:
messageBox(ucstr("hi"), ucstr("hello"));
This way you will not get warnings while keeping some type safety.
Also note that messageBox should take const char * arguments. This SDK uses questionable conventions.
The problem comes down to it being implementation-defined whether char is unsigned or signed.
Compilers for which there is no error will be those for which char is actually unsigned. Some of those (notably the ones that are actually C++ compilers, where char and unsigned char are distinct types) will issue a warning. With these compilers, converting the pointer to unsigned char * will be safe.
Compilers which report an error will be those for which char is actually signed. If the compiler (or host) uses an ASCII or similar character set, and the characters in the string are printable, then converting the string to unsigned char * (or, better, to const unsigned char *, which avoids dropping constness from string literals) is technically safe. However, those conversions are potentially unsafe for implementations that use different character sets, or for strings that contain non-printable characters (e.g. values of type signed char that are negative, corresponding to values of unsigned char greater than 127). I say potentially unsafe because what happens depends on what the called function does. For example, does it check the values of individual characters? Does it check the individual bits of individual characters in the string? The latter is, if the called function is well designed, one reason it will accept a pointer to unsigned char.
What you need to do therefore comes down to what you can assume about the target machine, and its char and unsigned char types - and what the function is doing with its argument. The most general approach (in the sense that it works for all character sets, and regardless of whether char is signed or unsigned) is to create a helper function which copies the array of char to a different array of unsigned char. The working of that helper function will depend on how (and if) you need to handle the conversion of signed char values with values that are negative.
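A minimal sketch of such a helper for the simplest case, where the bytes can be copied verbatim (the function name is illustrative, and the caller must free the result):

#include <stdlib.h>
#include <string.h>

/* Copy a NUL-terminated char string into a freshly allocated
   unsigned char buffer, byte for byte. Returns NULL on failure. */
static unsigned char *to_unsigned_copy(const char *s)
{
    size_t n = strlen(s) + 1; /* include the terminator */
    unsigned char *out = malloc(n);
    if (out != NULL)
        memcpy(out, s, n);
    return out;
}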

Using size_t for specifying the precision of a string in C's printf

I have a structure to represent strings in memory looking like this:
typedef struct {
    size_t l;
    char *s;
} str_t;
I believe using size_t makes sense for specifying the length of a char string. I'd also like to print this string using printf("%.*s\n", str.l, str.s). However, the * precision expects an int argument, not size_t. I haven't been able to find anything relevant about this. Is there some way to use this structure correctly, without a cast to int in the printf() call?
printf("%.*s\n", (int)str.l, str.s)
// ^^^^^ use a type cast
Edit
OK, I didn't read the question properly. You don't want to use a type cast, but I think, in this case: tough.
Either that or simply use fwrite
fwrite(str.s, str.l, 1, stdout);
printf("\n");
You could do a macro
#define STR2(STR) (int const){ (STR).l }, (char const*const){ (STR).s }
and then use this as printf("%.*s\n", STR2(str)).
Beware that this evaluates STR twice, so be careful with side effects, but you probably knew that already.
Edit:
I am using compound literals so that these are implicit conversions. If things go wrong, there is a better chance that the compiler will warn you than with an explicit cast.
E.g. if STR had a field .l that is a pointer and you only put a cast to int, all compilers would happily convert that pointer to int. Similarly, the .s field really has to correspond to a char * or something compatible, otherwise you'd see a warning or error.
There is no guarantee that size_t is an int, or that its values can be represented in an int. It's just part of C's legacy of not defining the exact size of an int, coupled with the concern that size_t might need to address large memory areas (ones holding more than INT_MAX values).
The most common error concerning size_t is to assume that it is equivalent to unsigned int. Such bugs were common in old code, and from personal experience they make porting from a 32-bit to a 64-bit architecture a pain, as you need to undo that assumption.
At best, you can use a cast. If you really want to get rid of the cast, you could alternatively discard the use of size_t.
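If you keep size_t, a middle ground is to guard the cast so it can never overflow (a sketch; the print_str wrapper and the fwrite fallback are illustrative):

#include <limits.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    size_t l;
    char *s;
} str_t;

/* Use the %.*s cast only when the length fits in an int;
   otherwise fall back to fwrite, which takes a size_t directly. */
static void print_str(str_t str)
{
    if (str.l <= INT_MAX) {
        printf("%.*s\n", (int)str.l, str.s);
    } else {
        fwrite(str.s, 1, str.l, stdout);
        putchar('\n');
    }
}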

sizeof (int) == sizeof (void*)?

Is there an integer type with the same size as pointer? Guaranteed on all microarchitectures?
According to this Wikipedia page, in C99 your stdint.h header might declare intptr_t and uintptr_t, but that of course requires C99 and a compiler implementor that has chosen to implement this optional part of the standard.
So in general I think this one is tough.
Simply put, no. Not guaranteed on all architectures.
My question is: why? If you want to allocate a type big enough to store a void*, the best thing to allocate is (surprisingly enough :-) a void*. Why is there a need to fit it within an int?
EDIT: Based on your comments to your duplicate question, you want to store special values of the pointer (1,2,3) to indicate extra information.
NO!! Don't do this!! There is no guarantee that 1, 2 and 3 aren't perfectly valid pointers. That may be the case in systems where you're required to align pointers on 4-byte boundaries but, since you asked about all architectures, I'm assuming you have portability as a high value.
Find another way to do it that's correct. For example, use the union (syntax from memory, may be wrong):
typedef struct {
    int isPointer;
    union {
        int intVal;
        void *ptrVal;
    } val;
} myType;
Then you can use the isPointer 'boolean' to decide whether you should treat val as an integer (val.intVal) or a pointer (val.ptrVal).
EDIT:
If execution speed is of prime importance, then the typedef solution is the way to go. Basically, you'll have to define the integer you want for each platform you want to run on, which you can do with conditional compilation. I would also add a runtime check to ensure you've compiled correctly for each platform, thus (I'm defining the macro in the source here, but you would normally pass it as a compiler flag, like "cc -DPTRINT_INT"):
#include <stdio.h>

#define PTRINT_SHORT

#ifdef PTRINT_SHORT
typedef short ptrint;
#endif
#ifdef PTRINT_INT
typedef int ptrint;
#endif
#ifdef PTRINT_LONG
typedef long ptrint;
#endif
#ifdef PTRINT_LONGLONG
typedef long long ptrint;
#endif

int main(void) {
    if (sizeof(ptrint) != sizeof(void*)) {
        printf("ERROR: ptrint typedef doesn't match void* for this platform.\n");
        printf("  sizeof(void*    ) = %zu\n", sizeof(void*));
        printf("  sizeof(ptrint   ) = %zu\n", sizeof(ptrint));
        printf("  =================\n");
        printf("  sizeof(short    ) = %zu\n", sizeof(short));
        printf("  sizeof(int      ) = %zu\n", sizeof(int));
        printf("  sizeof(long     ) = %zu\n", sizeof(long));
        printf("  sizeof(long long) = %zu\n", sizeof(long long));
        return 1;
    }
    /* rest of your code here */
    return 0;
}
On my system (Ubuntu 8.04, 32-bit), I get:
ERROR: ptrint typedef doesn't match void* for this platform.
sizeof(void* ) = 4
sizeof(ptrint ) = 2
=================
sizeof(short ) = 2
sizeof(int ) = 4
sizeof(long ) = 4
sizeof(long long) = 8
In that case, I'd know I needed to compile with PTRINT_INT (or long). There may be a way of catching this at compile time with #if, but I couldn't be bothered researching it at the moment. If you strike a platform where there's no integer type sufficient for holding a pointer, you're out of luck.
Keep in mind that using special pointer values (1,2,3) to represent integers may also not work on all platforms - this may actually be valid memory addresses for pointers.
Still, if you're going to ignore my advice, there's not much I can do to stop you. It's your code after all :-). One possibility is to check all your return values from malloc and, if you get 1, 2 or 3, just malloc again (i.e., have a mymalloc() which does this automatically). This'll be a minor memory leak but it'll guarantee no clashes between your special pointers and real pointers.
The C99 standard defines standard int types:
7.18.1.4 Integer types capable of holding object pointers
The following type designates a signed integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
intptr_t
The following type designates an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer:
uintptr_t
These types are optional.
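Where they are available, the guaranteed round trip looks like this (a minimal sketch):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int x = 42;
    void *p = &x;

    uintptr_t u = (uintptr_t)p; /* pointer -> integer */
    void *q = (void *)u;        /* integer -> pointer */

    printf("%s\n", p == q ? "round trip OK" : "mismatch");
    return 0;
}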
C99 also defines size_t and ptrdiff_t:
The types are
ptrdiff_t
which is the signed integer type of the result of subtracting two pointers;
size_t
which is the unsigned integer type of the result of the sizeof operator; ...
The architectures I've seen have the maximum size of an object equal to the whole memory, so sizeof(size_t) == sizeof(void*), but I'm not aware of anything that is both portable to C89 (which size_t is) and guaranteed to be large enough (which uintptr_t is).
This would be true on a standard 32 bit system, but there certainly are no guarantees, and you could find lots of architectures where it isn't true. For example, a common misconception is that sizeof(int) on x86_64 would be 8 (since it's a 64 bit system, I guess), which it isn't. On x86_64, sizeof(int) is still 4, but sizeof(void*) is 8.
The standard solution to this problem is to write a small program which checks the sizes of all int types (short int, int, long int) and compares them to void*. If there is a match, it emits a piece of code which defines the intptr type. You can put this in a header file and use the new type.
It's simple to include this code in the build process (using make, for example)
No, the closest you will come to a portable pointer-capable integer type would be intptr_t and ptrdiff_t.
No.
The C standard does not even specify exact sizes for the integer types, only minimum ranges. Combine that with all the architectures out there (8/16/32/64-bit etc.) and there is no way to guarantee anything.
The int data type would be the answer on most architectures, but there is NO guarantee of this for ANY (micro)architecture.
The answer seems to be "no", but if all you need is a type that can act as both, you can use a union:
union int_ptr_t {
    int i;
    void *p;
};
Usually sizeof(void*) depends on the memory bus width (although not necessarily: the pre-RISC AS/400 had a 48-bit address bus but 64-bit pointers), and int is usually as big as the CPU's general-purpose registers (there are exceptions here too: SGI's C compiler used 32-bit ints on 64-bit MIPS).
So there is no guarantee.
