How can a char variable accept Pointer(NULL) as its value? - c

I understand that a char variable can hold a null character (1 byte), i.e. \0, as its value, but I don't understand how a char variable in my program below accepts a pointer (4 bytes) as its value and still works properly.
#include<stdio.h>
int main()
{
char p[10]="Its C";
printf("%s\n",p);
p[3]='\0'; // assigning null character
printf("%s\n",p);
p[2]=NULL; // assigning null pointer to a char variable
printf("%s\n",p);
p[1]=(void *)0; // assigning null pointer to a char variable
printf("%s\n",p);
return 0;
}
Note: GCC Compiler (32 Bit Linux Platform).

The NULL macro is required to expand to "an implementation-defined null pointer constant".
A null pointer constant is defined as "An integer constant expression with the value 0, or such an expression cast to type void *". Counterintuitively, this definition does not require the expansion of NULL to be an expression of pointer type. A common implementation is:
#define NULL 0
A null pointer constant, when used in a context that requires a pointer, may be implicitly converted to a pointer value; the result is a null pointer. It may also be explicitly converted using a cast, such as (int*)NULL.
But there's no requirement that an expression that qualifies as a null pointer constant may only be used in such a context. Which means that if the implementation chooses to define NULL as above, then this:
char c = NULL; // legal but ugly
is legal and initializes c to the null character.
Such an initialization is non-portable (since NULL may also expand to ((void*)0)) and misleading, so it should be avoided, but a compiler is likely to let it through without warning; NULL is expanded to 0 by the preprocessing phase of the compiler, and later phases see it as char c = 0;, which is legal and innocuous -- though personally I'd prefer char c = '\0';.
I just tried your example on my own 32-bit Ubuntu system, with gcc 4.7. With no options specified, the compiler warned about both p[2]=NULL; and p[1]=(void *)0;:
c.c:8:9: warning: assignment makes integer from pointer without a cast [enabled by default]
c.c:10:9: warning: assignment makes integer from pointer without a cast [enabled by default]
The second warning is to be expected from any C compiler; the first indicates that NULL is actually defined as ((void*)0) (running the code through gcc -E confirms this).
The compiler didn't simply "accept" these assignments; it warned you about them. The C language standard merely requires a "diagnostic" for any violation of the language rules, even a syntax error; that diagnostic may legally be a non-fatal warning message. You can make gcc behave more strictly with -std=c89 -pedantic-errors; replace c89 by c99 or c11 to enforce rules from later versions of the standard. (EDIT: I see from comments that you're using a web interface to the compiler that hides warnings; see my comment on your question for a workaround. Warnings are important.)
If you post C code that produces compiler warnings please show us the warnings and pay close attention to them yourself. They often indicate serious problems, even illegalities, in your program.
A language-lawyer quibble: it's not even clear that this:
char c = (void*)0;
specifies a conversion from void* to char. My own view is that, since it violates a constraint, it has no defined semantics. Most compilers that don't reject it will treat it as if it were a void*-to-char conversion, and it's also been argued that this is the required behavior. But you can avoid such questions if you simply pay attention to compiler warnings and/or don't write code like that in the first place.
(The rules are a bit different for C++, but you're asking about C so I won't get into that.)

NULL is a macro, and on almost every platform it is defined this way:
#ifndef __cplusplus
#define NULL ((void *)0)
#else /* C++ */
#define NULL 0
#endif /* C++ */
(from stddef.h from my Ubuntu)
and when you write
p[2]=NULL;
it is the same as
p[2]=(void *)0; //for c
p[2]=0; //for c++
which, in practice, has the same effect as
p[2] = 0; // the 0 is converted to char 0 for C --> '\0'

Because NULL is replaced by 0 in some compilers and by ((void*)0) in others.
The value 0 in itself is a valid value for char, but with the conversion to (void*) you're technically casting the 0 to a pointer type, which is why the compiler gives a warning.
Note that if the compiler replaces NULL with 0, an integer constant, it will simply and silently be converted to a char.

On your platform, a pointer is generally a numerical value treated as a memory address. Since the char type is numeric, a null pointer (memory address 0x00) is being stored in p[1].
The 32-bit value of the pointer (in this case, 0x00000000) is truncated to 8-bit char length: 0x00.

Try compiling this with the -Wall option and you will see that there are implicit conversions taking place.

Related

Why are there two ways of expressing NULL in C?

According to §6.3.2.3 ¶3 of the C11 standard, a null pointer constant in C can be defined by an implementation as either the integer constant expression 0 or such an expression cast to void *. In C the null pointer constant is defined by the NULL macro.
My implementation (GCC 9.4.0) defines NULL in stddef.h in the following ways:
#define NULL ((void *)0)
#define NULL 0
Why are both of the above expressions considered semantically equivalent in the context of NULL? More specifically, why do there exist two ways of expressing the same concept rather than one?
Let's consider this example code:
#include <stddef.h>
int *f(void) { return NULL; }
int g(int x) { return x == NULL ? 3 : 4; }
We want f to compile without warnings, and we want g to cause an error or a warning (because an int variable x was compared to a pointer).
In C, #define NULL ((void*)0) gives us both (GCC warning for g, clean compile for f).
However, in C++, #define NULL ((void*)0) causes a compile error for f. Thus, to make it compile in C++, <stddef.h> has #define NULL 0 for C++ only (not for C). Unfortunately, this also prevents the warning from being reported for g. To fix that, C++11 uses built-in nullptr instead of NULL, and with that, C++ compilers report an error for g, and they compile f cleanly.
((void *)0) has stronger typing and could lead to better compiler or static analyser diagnostics, since implicit conversions between pointers and plain integers aren't allowed in standard C.
0 is likely allowed for historical reasons, from a pre-standard time when everything in C was pretty much just integers and wild implicit conversions between pointers and integers were allowed, though possibly resulting in undefined behavior.
Ancient K&R 1st edition provides some insight (7.14 the assignment operator):
The compilers currently allow a pointer to be assigned to an integer, an integer to a pointer, and a pointer to a pointer of another type. The assignment is a pure copy operation, with no conversion. This usage is nonportable, and may produce pointers which cause addressing exceptions when used. However, it is guaranteed that assignment of the constant 0 to a pointer will produce a null pointer distinguishable from a pointer to any object.
Few things in C are more confusing than null pointers. The C FAQ list devotes an entire section to the topic, and to the myriad misunderstandings that eternally arise. And we can see that those misunderstandings never go away, as some of them are being recycled even in this thread, in 2022.
The basic facts are these:
C has the concept of a null pointer, a distinguished pointer value which points definitively nowhere.
The source code construct by which a null pointer is requested — a null pointer constant — fundamentally involves the token 0.
Because the token 0 has other uses, ambiguity (not to mention confusion) is possible.
To help reduce the confusion and ambiguity, for many years the token 0 as a null pointer constant has been hidden behind the preprocessor macro NULL.
To provide some type safety and further reduce errors, it's attractive to have the macro definition of NULL include a pointer cast.
However, and most unfortunately, enough confusion crept in along the way that properly mitigating it all has become almost impossible. In particular, there is so very much extant code that says things like strbuf[len] = NULL; (in an obvious but basically wrong attempt to null-terminate a string) that it is believed in some circles to be impossible to actually define NULL with an expansion including either the explicit cast or the hypothetical future (or extant in C++) new keyword nullptr.
See also Why not call nullptr NULL?
Footnote (call this point 3½): It's also possible for a null pointer — despite being represented in C source code as an integer constant 0 — to have an internal value that is not all-bits-0. This fact adds massively to the confusion whenever this topic is discussed, but it doesn't fundamentally change the definition.
There is just one way to express NULL in C, it's a single 4-character token.
But hold on, when going into its definition it gets more interesting.
NULL has to be defined as a null pointer constant, meaning an integer constant with value 0 or such cast to void*.
As an integer constant is just an expression of integer type with a few restrictions to guarantee static evaluation, there are infinite possibilities for any wanted value.
Of all those possibilities, only an integer literal with value 0 is also a null pointer constant in C++, for what it's worth.
The reason for such variation is history and precedent (everyone did it differently, void* was late to the party, and existing code/implementations trumps all), reinforced with backwards-compatibility which preserves it.
6.3.2.3 Pointers
[...]
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.
If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.
[...]
6.6 Constant expressions
[...]
Description
2 A constant expression can be evaluated during translation rather than runtime, and accordingly may be used in any place that a constant may be.
Constraints
3 Constant expressions shall not contain assignment, increment, decrement, function-call, or comma operators, except when they are contained within a subexpression that is not evaluated.
4 Each constant expression shall evaluate to a constant that is in the range of representable values for its type.
Semantics
5 An expression that evaluates to a constant is required in several contexts. If a floating expression is evaluated in the translation environment, the arithmetic range and precision shall be at least as great as if the expression were being evaluated in the execution environment.
6 An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts.
Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof or _Alignof operator.
C was originally developed on machines where a null pointer constant and the integer constant 0 had the same representation. Later, some vendors ported the language to mainframes where a different special value triggered a hardware trap when used as a pointer, and wanted to use that value for NULL. These companies discovered that so much existing code type-punned between integers and pointers, they had to recognize 0 as a special constant that could implicitly convert to a null pointer constant. ANSI C incorporated this behavior, at the same time as they introduced the void* as a pointer that implicitly converts to any type of object pointer. This allowed NULL to be used as a safer alternative to 0.
I’ve seen some code that (possibly tongue-in-cheek) detected one of these machines by testing if ((char*)1 == 0).
why do there exist two ways of expressing the same concept rather than one?
History.
NULL started as 0 and later better programming practices encouraged ((void *)0).
First, there are more than 2 ways:
#define NULL ((void *)0)
#define NULL 0
#define NULL 0L
#define NULL 0LL
#define NULL 0u
...
Before void * (Pre C89)
Before void * and void existed, #define NULL some_integer_type_of_zero was used.
It was useful to have the size of that integer type to match the size of object pointers. Consider the below. With 16-bit int and 32-bit long, it is useful for the type of zero used to match the width of an object pointer.
Consider printing pointers.
double x;
printf("%ld\n", &x); // On systems where an object pointer was same size as long
printf("%ld\n", NULL);// Would like to use the same specifier for NULL
With 32-bit object pointers, #define NULL 0L is better.
double x;
printf("%d\n", &x); // On systems where an object pointer was same size as int
printf("%d\n", NULL);// Would like to use the same specifier for NULL
With 16-bit object pointers, #define NULL 0 is better.
C89
After the birth of void and void *, it was natural for the null pointer constant to have pointer type. This allowed the bit pattern of ((void*)0) to be non-zero. This was useful in some architectures.
printf("%p\n", NULL);
With 16-bit object pointers, #define NULL ((void*)0) works above.
With 32-bit object pointers, #define NULL ((void*)0) works.
With 64-bit object pointers, #define NULL ((void*)0) works.
With 16-bit int, #define NULL ((void*)0) works.
With 32-bit int, #define NULL ((void*)0) works.
We now have independence of the int/long/object pointer size. ((void*)0) works in all cases.
Using #define NULL 0 creates issues when passing NULL as a ... argument, hence the irksome need to do printf("%p\n", (void*)NULL); for highly portable code.
With #define NULL ((void*)0), code like char n = NULL; will more likely raise a warning, unlike #define NULL 0.
C11
With the advent of _Generic, we can distinguish, for better or worse, NULL as a void *, int, long, ...
According to §6.3.2.3 ¶3 of the C11 standard, a null pointer constant in C can be defined by an implementation as either the integer constant expression 0 or such an expression cast to void *.
No, that is a misleading paraphrase of the language spec. The actual language of the cited paragraph is
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. [...]
Implementations don't get to choose between those alternatives. Both are forms of a null pointer constant in the C language. They can be used interchangeably for the purpose.
Moreover, not only the specific integer constant expression 0 can serve in this role, but any integer constant expression with value 0 can do. For example, 1 + 2 + 3 + 4 - 10 is such an expression.
Additionally, do not confuse null pointer constants generally with the macro NULL. The latter is defined by conforming implementations to expand to a null pointer constant, but that doesn't mean that the replacement text of NULL is the only null pointer constant.
My implementation (GCC 9.4.0) defines NULL in stddef.h in the following ways:
#define NULL ((void *)0)
#define NULL 0
Not both at the same time, of course.
Why are both of the above expressions considered semantically equivalent in the context of NULL?
Again with the reversal. It's not "the context of NULL". It's pointer context. There is nothing particularly special about the macro NULL itself to distinguish contexts in which it appears from contexts where its replacement text appears directly.
And I guess you're asking for rationale for paragraph 6.3.2.3/3, as opposed to "because 6.3.2.3/3". There is no published rationale for C11. There is one for C99, which largely serves for C90 as well, but it does not address this issue.
It should be noted, however, that void (and therefore void *) was an invention of the committee that developed the original C language specification ("ANSI C" / C89 / C90). There was no possibility of an "integer constant expression cast to type void *" before then.
More specifically, why do there exist two ways of expressing the same concept rather than one?
Are there, really?
If we accept an integer constant expression with value 0 as a null pointer constant (a source-code entity), and we want to convert it to a runtime null pointer value, then which pointer type do we choose? Pointers to different object types do not necessarily have the same representation, so this actually matters. Type void * seems the natural choice to me, and that's consistent with the fact that, alone of all pointer types, void * can be converted to other object pointer types without a cast.
But then, in a context where 0 is being interpreted as a null pointer constant, casting it to void * is a no-op, so (void *) 0 expresses exactly the same thing as 0 in such a context.
What's really going on here
At the time the ANSI committee was working, many existing C implementations accepted integer-to-pointer conversions without a cast, and although the meaning of most such conversions was implementation and / or context specific, there was wide acceptance that converting constant 0 to a pointer yielded a null pointer. That use was by far the most common one of converting an integer constant to a pointer. The committee wanted to impose stricter rules on type conversions, but it did not want to break all the existing code that used 0 as a constant representing a null pointer.
So they hacked the spec.
They invented a special kind of constant, the null pointer constant, and provided rules around it that made it compatible with existing use. A null pointer constant, regardless of lexical form, can be implicitly converted to any pointer type, yielding a null pointer (value) of that type. Otherwise, no implicit integer-to-pointer conversions are defined.
But the committee preferred that null pointer constants should actually have pointer type without conversion (which 0 does not, pointer context or no), so they provided for the "cast to type void *" option as part of the definition of a null pointer constant. At the time, that was a forward-looking move, but the general consensus now appears to be that it was the right direction to aim.
And why do we still have the "integer constant expression with value 0"? Backwards compatibility. Consistency with conventional idioms such as {0} as a universal initializer for objects of any type. Resistance to change. Perhaps other reasons as well.
The "why" - it is for historical reasons. NULL was used in various implementations before it was added to a standard. And at the time it was added to a C standard, implementations defined NULL usually as 0, or as 0 cast to some pointer. At that point you wouldn't want to make one of them illegal, because whichever you made illegal, you'd break half the existing code.
The C11 standard allows for a null pointer constant to be defined either as the integer constant expression 0 or as an expression that is cast to void *. The use of the NULL macro makes it easier for programmers to use the null pointer constant in their code, as they don't have to remember which of these definitions the implementation uses.
Using a macro also makes it easier to change the underlying definition of the null pointer constant in the future, if necessary. For example, if the implementation decided to change the definition of NULL to be a different integer constant expression, they could do so by simply modifying the definition of the NULL macro. This would not require any changes to the code that uses the NULL macro, as long as the code uses the NULL macro consistently.
There are two definitions of the NULL macro provided in the example you gave because some systems may define NULL as an expression that is cast to void *, while others may define it as the integer constant expression 0. By providing both definitions, the stddef.h header can be used on a wide range of systems without requiring any modifications.

Casting an address of subroutine into void pointer

Is it okay to cast a function's address to a void pointer, even though the size of a function pointer is not always the same as the size of an opaque pointer?
I have already searched for opaque pointers and for casting function pointers. I found out that function pointers and normal pointers are not the same on some systems.
void (*fptr)(void) = (void *) 0x00000009; // is that legal???
I feel I should do this
void (*fptr)(void)= void(*)(void) 0x00000009;
It did work fine, though I expected some errors or at least warnings.
I'm using the Keil ARM compiler.
No, the problem is that you cannot go between void* and function pointers, they are not compatible types. void* is the generic pointer type for object pointers only.
In the second case you have a minor syntax error, should be (void(*)(void)). Fixing that, we have:
void (*fptr)(void) = (void *) 0x00000009; // NOT OK
void (*fptr)(void) = (void(*)(void)) 0x00000009; // PERHAPS OK (but likely not on ARM)
Regarding the former, it is simply not valid C (but might be supported as a non-standard compiler extension). Specifically, it violates the rules of simple assignment, C17 6.5.16.1. Both operands of = must be compatible pointer types.
Regarding the latter, the relevant part is C17 6.3.2.3/5
An integer may be converted to any pointer type. Except as previously specified, the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation.
Some special cases exist for null pointers, but that does not apply in your case.
So you can probably cast the value 9 to a function pointer - you have to check the Keil compiler manual regarding implementation-defined behavior. What meaningful purpose the resulting function pointer will have from there on, I have no idea.
That is because the address 9 is certainly not an aligned address on ARM. Rather, it is 1 byte past where I'd expect to find the address of the non-maskable interrupt ISR or some such. So what are you actually trying to do here? Grab the address of an ISR from the vector table?
void (*fptr)(void) = (void *) 0x00000009;
is not legal according to standard C as such.
An integer constant expression with value 0, or such an expression cast to void *, is a null pointer constant, and that can be assigned to a function pointer:
void (*fptr)(void) = 0;
This only applies to the null pointer constant. It does not even apply to a variable of type void * that contains a null pointer. The constraints (C11 6.5.16.1) for simple assignments include
the left operand is an atomic, qualified, or unqualified pointer, and the right is a null pointer constant; or
In strictest sense the standard does not provide a mechanism to convert a pointer-to-void to a pointer to function at all! Not even with a cast. However it is available on most common platforms as a documented common extension C11 J.5.7 Function pointer casts:
A pointer to an object or to void may be cast to a pointer to a function, allowing data to be invoked as a function (6.5.4).
A pointer to a function may be cast to a pointer to an object or to void, allowing a function to be inspected or modified (for example, by a debugger) (6.5.4).
but it is not required at all by the C standard - indeed it is possible to use C in a platform where the code can be executed only from memory that cannot be accessed as data at all.
The first expression is perfectly legal in C, as the void * pointer type is assignment and parameter-passing compatible with any other pointer type, though you can run into trouble if pointers are a different size than integers. It's not good programming style, and there's apparently no reason to assign the integer literal 9 to a function pointer. I cannot guess why you are doing so.
Despite that, there are a few (well, very few) cases in history where this has been done, e.g. to provide the special values SIG_DFL and SIG_IGN for the signal(2) system call. One can assume nobody will ever use those values to call the function the pointer designates. Indeed, you can use small nonzero integers, which fall in page zero of a process's virtual address space, so that the pointers cannot be dereferenced (you cannot call the functions, or you'll get a segmentation violation immediately), while still having several distinct special values besides the NULL pointer itself.
But the second expression is not legal. It's not valid for an expression to start with a type identifier, so the subexpression to the right of the = sign is invalid. To do a correct assignment, with a valid cast, you have to write:
void (*ptr)(void) = (void (*)(void)) 0x9; /* wth to write so many zeros? */
(enclosing the whole type mark in parenthesis) then, you can call the function as:
(*ptr)();
or simply as:
ptr();
Just writing
void(*ptr)(void) = 9;
is also legal, while the integer to pointer conversion is signalled by almost every compiler with a warning. You'll get an executable from there.
If the integer is 0, then the compiler will shut up, as 0 is converted automatically to the NULL pointer.
EDIT
To illustrate the simple use I mentioned above in the first paragraph, from the file <sys/signal.h> of FreeBSD 12.0:
File /usr/include/sys/signal.h
#define SIG_DFL ((__sighandler_t *)0)
#define SIG_IGN ((__sighandler_t *)1)
#define SIG_ERR ((__sighandler_t *)-1)
/* #define SIG_CATCH ((__sighandler_t *)2) See signalvar.h */
#define SIG_HOLD ((__sighandler_t *)3)
all those definitions are precisely of the type mentioned in the question, an integer value cast to a pointer to function, in order to permit special values to represent non executable/non callback values. The type __sighandler_t is defined as:
typedef void __sighandler_t(int);
below.
From CLANG:
$ cc -std=c17 -c pru.c
$ cat pru.c
void (*ptr)(void) = (void *)0x9;
you get no warning at all.
Without the cast:
$ cc -std=c11 pru.c
pru.c:1:8: warning: incompatible integer to pointer conversion
initializing 'void (*)(void)' with an expression of type 'int'
[-Wint-conversion]
void (*ptr)(void) = 0x9;
^ ~~~
1 warning generated.
(Only a warning, not an error)
With a zero literal:
$ cc -std=c11 -c pru.c
$ cat pru.c
void (*ptr)(void) = 0x0;
you get no warning at all either.

Can int store the base address of string in C?

Why does this code run without error?
#include <stdio.h>
int main() {
int i="string"; //the base of string can be stored in a character pointer
printf("%s\n",i);
printf("%d",i);
return 0;
}
//compiling on ideone.com language c
OUTPUT:
string
134513984 //some garbage(address of "string")
Please explain whether pointers in C have this kind of flexibility. I tried it in C++, which gives the error: cannot convert ‘const char*’ to ‘int*’ in initialization
No, you cannot assume this in general. In part, this is because int may not be the same size as char * (in fact, on many 64-bit compilers it will not be the same size).
If you want to store a pointer as an integer, the appropriate type to use is actually intptr_t, from <stdint.h>. This is an integer which is guaranteed to be able to hold a pointer's value.
However, the circumstances when you'd actually want to do this are somewhat rare, and when you do do this you should also include an explicit cast:
intptr_t i=(intptr_t)"string"; //the base of string can be stored in a character pointer
This also complicates printing its value; to be portable, you'll need to use a macro from <inttypes.h>:
printf("%"PRIiPTR,i);
To print the original string, you should also cast:
printf("%s", (char *)i);
In general, no: the C standard states that conversions from pointers to integers are implementation defined. Further, this can be problematic on systems where sizeof(char *) and sizeof(int) are different (e.g. x86-64), for two reasons:
int i = "string"; can lose information if, for example, a 64-bit pointer cannot fit in a 32-bit integer.
printf expects a pointer to be passed in, but gets a smaller integer. It winds up reading some garbage into the full pointer, and can crash your code (or worse).
Often times, however, compilers are "smart" enough to "fix" arguments to printf. Further, you seem to be running on a platform where pointers and integers are the same size, so you got lucky.
If you compiled this program with warnings (which you should) you'd get the following complaints:
main.c:3:9: warning: incompatible pointer to integer conversion initializing 'int' with an expression of type 'char [7]' [-Wint-conversion]
int i="string"; //the base of string can be stored in a character pointer
^ ~~~~~~~~
main.c:4:19: warning: format specifies type 'char *' but the argument has type 'int' [-Wformat]
printf("%s\n",i);
~~ ^
%d
2 warnings generated.
Warnings generally mean you're doing something that could cause unexpected results.
Most C compilers will let you do this, but that doesn't make it a good idea. Here, the address of the character array "string" gets stored in i. The printf options are determining how the integer is interpreted (as an address or an integer). This can be problematic when char* is not the same size as an int (e.g. on most 64 bit machines).
The C++ compiler is more picky and won't let you compile code like this. C compilers are much more willing, although they will usually generate warnings letting the programmer know it is a bad idea.
Your code is ill-formed in both C and C++. It is illegal to do
int i = "string";
in both languages. In both languages conversion from a pointer to an integer requires an explicit cast.
The only reason your C compiler accepted it is that it was configured by default for rather loose error checking. (A rather typical situation with C compilers.) Tighten up your C compiler settings and it should issue an error for the above initialization. I.e. you can use an explicit conversion
int i = (int) "string";
with implementation-dependent results, but you can't legally do it implicitly.
In any case, the warning your compiler emitted for the above initialization is already a sufficient form of a diagnostic message for this violation.

Sizeof a function that returns void in C [duplicate]

What would this statement yield?
void *p = malloc(sizeof(void));
Edit: An extension to the question.
If sizeof(void) yields 1 in GCC compiler, then 1 byte of memory is allocated and the pointer p points to that byte and would p++ be incremented to 0x2346? Suppose p was 0x2345. I am talking about p and not *p.
The type void has no size; that would be a compilation error. For the same reason you can't do something like:
void n;
EDIT.
To my surprise, doing sizeof(void) actually does compile in GNU C:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc -w - && ./a.out
1
However, in C++ it does not:
$ echo 'int main() { printf("%d", sizeof(void)); }' | gcc -xc++ -w - && ./a.out
<stdin>: In function 'int main()':
<stdin>:1: error: invalid application of 'sizeof' to a void type
<stdin>:1: error: 'printf' was not declared in this scope
If you are using GCC and you are not using compilation flags that remove compiler specific extensions, then sizeof(void) is 1. GCC has a nonstandard extension that does that.
In general, void is an incomplete type, and you cannot use sizeof for incomplete types.
Although void may stand in place for a type, it cannot actually hold a value. Therefore, it has no size in memory. Getting the size of a void isn’t defined.
A void pointer is simply a language construct meaning a pointer to untyped memory.
void has no size. In both C and C++, the expression sizeof (void) is invalid.
In C, quoting N1570 6.5.3.4 paragraph 1:
The sizeof operator shall not be applied to an expression that
has function type or an incomplete type, to the parenthesized name of
such a type, or to an expression that designates a bit-field member.
(N1570 is a draft of the 2011 ISO C standard.)
void is an incomplete type. This paragraph is a constraint, meaning that any conforming C compiler must diagnose any violation of it. (The diagnostic message may be a non-fatal warning.)
The C++ 11 standard has very similar wording. Both editions were published after this question was asked, but the rules go back to the 1989 ANSI C standard and the earliest C++ standards. In fact, the rule that void is an incomplete type to which sizeof may not be applied goes back exactly as far as the introduction of void into the language.
gcc has an extension that treats sizeof (void) as 1. gcc is not a conforming C compiler by default, so in its default mode it doesn't warn about sizeof (void). Extensions like this are permitted even for fully conforming C compilers, but the diagnostic is still required.
Taking the size of void is a GCC extension.
sizeof() cannot be applied to incomplete types, and void is an incomplete type that cannot be completed.
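To make the incomplete-type point concrete, here is a minimal sketch. The function names are hypothetical and exist only for illustration. It shows that sizeof works on char and on a pointer to void (both complete types), while the sizeof(void) case is left commented out because standard C rejects it.

```c
#include <stddef.h>

/* sizeof(char) is 1 by definition in standard C. */
size_t size_of_char(void)
{
    return sizeof(char);
}

/* A pointer to void is a complete type, so sizeof is fine here. */
size_t size_of_void_pointer(void)
{
    return sizeof(void *);
}

/* The function below is a constraint violation in standard C.
   GCC accepts it only as an extension, so it stays commented out:

size_t size_of_void(void)
{
    return sizeof(void);
}
*/
```

The exact value of sizeof(void *) is platform-dependent (commonly 4 or 8), so nothing above assumes a particular number.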
In C, sizeof(void) == 1 in GCC, but this appears to depend on your compiler.
In C++, I get:
In function 'int main()':
Line 2: error: invalid application of 'sizeof' to a void type
compilation terminated due to -Wfatal-errors.
To the 2nd part of the question: Note that sizeof(void *)!= sizeof(void).
On a 32-bit architecture, sizeof(void *) is 4 bytes, but that is not what determines the effect of p++. The amount by which a pointer is incremented depends on the size of the type it points to. With GCC treating sizeof(void) as 1, p++ increases p by 1 byte, so 0x2345 becomes 0x2346.
While sizeof(void) perhaps makes no sense in itself, it is important when you're doing any pointer math.
E.g.
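A small sketch of the rule that a pointer increment depends on the pointed-to type, using only standard C (no void arithmetic). The helper names are hypothetical. Each function measures, in bytes, how far p + 1 is from p.

```c
#include <stddef.h>

/* Byte distance between an int pointer and the next int. */
size_t int_stride(void)
{
    int pair[2] = {0, 0};
    int *ip = pair;
    /* Casting to char* lets the subtraction count bytes, not ints. */
    return (size_t)((char *)(ip + 1) - (char *)ip);
}

/* Byte distance between a double pointer and the next double. */
size_t double_stride(void)
{
    double pair[2] = {0.0, 0.0};
    double *dp = pair;
    return (size_t)((char *)(dp + 1) - (char *)dp);
}
```

Each stride equals sizeof of the element type, which is why an extension that permits arithmetic on void* has to pick some size for void.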
void *p;
while(...)
p++;
If sizeof(void) is considered 1 then this will work.
If sizeof(void) is considered 0 then you hit an infinite loop.
Most C++ compilers chose to raise a compile error when trying to get sizeof(void).
When compiling C, gcc is not conforming and chose to define sizeof(void) as 1. It may look strange, but it has a rationale. In pointer arithmetic, adding or subtracting one unit means adding or subtracting the size of the object pointed to. Thus defining sizeof(void) as 1 helps define void* as a pointer to a byte (an untyped memory address). Otherwise you would get surprising behavior from pointer arithmetic, like p+1 == p when p is a void*. Such pointer arithmetic on void pointers is not allowed in C++ but works fine when compiling C with gcc.
The standard-conforming way would be to use char* for that kind of purpose (pointer to byte).
Another similar difference between C and C++ when using sizeof occurs when you define an empty struct like:
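As a sketch of the char* approach, the hypothetical helper below steps through untyped memory without relying on the GCC extension. The name advance_bytes and its signature are assumptions made for illustration.

```c
#include <stddef.h>

/* Returns the address n bytes past p. The arithmetic is done on an
   unsigned char pointer, which standard C allows, instead of on void*. */
void *advance_bytes(void *p, size_t n)
{
    unsigned char *bytes = p;  /* void* converts implicitly in C */
    return bytes + n;          /* converts back to void* on return */
}
```

For example, given char buf[8], advance_bytes(buf, 3) yields the same address as &buf[3]. The caller is still responsible for keeping the result inside (or one past the end of) the original object.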
struct Empty {
} empty;
Using gcc as my C compiler sizeof(empty) returns 0.
Using g++ the same code will return 1.
I'm not sure what the C and C++ standards state on this point, but I believe giving empty structs/objects a nonzero size helps with reference management: it prevents two references to different consecutive objects, the first one being empty, from getting the same address. If references are implemented using hidden pointers, as is often done, ensuring different addresses helps when comparing them.
But this merely avoids one surprising behavior (corner-case comparison of references) by introducing another one (empty objects, even PODs, consume at least 1 byte of memory).

Assigning a string to a variable of type int

Why is it that I can assign a string to a variable of type int? Eg. the following code compiles correctly:
int main(int argv, char** argc){
int a="Hello World";
printf(a);
}
Also, the program doesn't compile when I assign a string to a variable of a different type, namely double and char.
I suppose that what is actually going on is that the compiler executes int* a = "Hello World"; and when I write double a="Hello World";, it executes that line of code as it is.
Is this correct?
In fact, that assignment is a constraint violation, requiring a diagnostic (possibly just a warning) from any conforming C implementation. The C language standard does not define the behavior of the program.
EDIT : The constraint is in section 6.5.16.1 of the C99 standard, which describes the allowed operands for a simple assignment. The older C90 standard has essentially the same rules. Pre-ANSI C (as described in K&R1, published in 1978) did allow this particular kind of implicit conversion, but it's been invalid since the 1989 version of the language.
What probably happens if it does compile is that
int a="Hello World";
is treated as if it were
int a = (int)"Hello World";
The cast takes a pointer value and converts to int. The meaning of such a conversion is implementation-defined; if char* and int are of different sizes, it can lose information.
Some conversions may be done implicitly, for example between different arithmetic types. This one may not.
Your compiler should complain about this. Crank up the warning levels until it does. (Tell us what compiler you're using, and we can tell you how to do that.)
EDIT :
The printf call:
printf(a);
has undefined behavior in C90, and is a constraint violation, requiring a diagnostic, in C99, because you're calling a variadic function with no visible prototype. If you want to call printf, you must have a
#include <stdio.h>
(In some circumstances, the compiler won't tell you about this, but it's still incorrect.) And even given a visible declaration, the call is still wrong, since printf's first parameter is of type const char* and you're passing it an int.
And if your compiler doesn't complain about
double a="Hello World";
you should get a better compiler. That (probably) tries to convert a pointer value to type double, which doesn't make any sense at all.
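For comparison, here is a minimal sketch of what the question's program presumably intended: a pointer variable for the string, a visible declaration of printf, and a fixed format string. The wrapper name print_greeting is hypothetical. With stricter options such as gcc -std=c99 -pedantic -Wall -Wextra, a version along these lines should compile without the diagnostics discussed above.

```c
#include <stdio.h>

/* Prints the greeting and returns printf's result: the number of
   characters written, or a negative value on error. */
int print_greeting(void)
{
    const char *a = "Hello World";  /* a pointer object for a pointer value */
    return printf("%s\n", a);       /* fixed format; a is passed as data */
}
```

Calling print_greeting() from main prints Hello World followed by a newline.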
"Hello World" is an array of characters ending with char '\0'
When you assign its value to an int a, you assign the address of the first character in the array to a. GCC is trying to be kind to you.
When you print it, then it goes to where a points and prints all the characters until it reaches char '\0'.
It will compile because (on a 32-bit system) int, int *, and char * all fit in 32-bit registers, whereas double is 64 bits wide and char is 8 bits wide.
When compiled, I get the following warnings:
[11:40pm][wlynch#wlynch /tmp] gcc -Wall foo.c -o foo
foo.c: In function ‘main’:
foo.c:4: warning: initialization makes integer from pointer without a cast
foo.c:5: warning: passing argument 1 of ‘printf’ makes pointer from integer without a cast
foo.c:5: warning: format not a string literal and no format arguments
foo.c:5: warning: format not a string literal and no format arguments
foo.c:6: warning: control reaches end of non-void function
As you can see from the warnings, "Hello World" is a pointer, and it is being converted to an integer automatically.
The code you've given will not always work correctly though. A pointer is sometimes larger than an int. If it is, you could get a truncated pointer, and then a very odd fault when you attempt to use that value.
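If a pointer really must be stored in an integer, one hedged alternative is uintptr_t from <stdint.h>. That type is optional in the standard, so its availability is an assumption here, though common platforms provide it. The helper names below are hypothetical.

```c
#include <stdint.h>

/* Returns 1 if converting p to uintptr_t and back yields the same pointer. */
int round_trips(const char *p)
{
    uintptr_t bits = (uintptr_t)(const void *)p;          /* explicit cast required */
    const char *back = (const char *)(const void *)bits;
    return back == p;
}

/* Returns 1 if int is too narrow to hold a pointer on this platform,
   which is the case where storing a pointer in an int truncates it. */
int int_is_narrower_than_pointer(void)
{
    return sizeof(int) < sizeof(void *);
}
```

On a typical 64-bit system int_is_narrower_than_pointer() returns 1, which is exactly the situation where int a = "Hello World"; loses part of the address.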
It produces a warning on compilers like GCC and clang.
warning: initialization makes integer from pointer without a cast [enabled by default]
The string literal "Hello World" is an array of char that decays to a char * pointing at its first character, so you are assigning a pointer to an int (i.e. converting the address of the first character in the string to an int value). On my compilers, gcc 4.6.2 and clang-mac-lion, assigning the string to int, unsigned long long, or char all produce warnings, not errors.
This is not behavior to rely on, quite frankly. Not to mention, your printf(a); is also a dangerous use of printf.

Resources