C pointer addition and subtraction in sect. 6.5.6

I am trying to understand paragraphs 8 and 9 of C99 sect. 6.5.6 (Additive operators).
Does para 8 mean:
int a [4];
int *p = a;
p --; /* undefined behaviour */
p = a + 4; /* okay */
p --; /* okay */
p += 2; /* undefined behaviour */
p = a;
p += 5 - 5; /* okay */
p = p + 5 - 5; /* undefined behaviour */
For paragraph 9, my understanding had been that ptrdiff_t is always large enough to hold the difference of 2 pointers. But the wording:
'provided the value fits in an object of type ptrdiff_t' seems to suggest this understanding is wrong. Is my understanding wrong, or did C99 mean something else?
You can find a link to the draft standards here:
http://cboard.cprogramming.com/c-programming/84349-c-draft-standards.html

I don't think your interpretation is correct. In the version I have (n1256) paragraph 9 states:
If the result is not representable in an object of that type, the
behavior is undefined
That is it. If the difference is larger than PTRDIFF_MAX or smaller than PTRDIFF_MIN, the behavior is undefined.
Notice that this places the burden on the programmer to check if the difference fits in ptrdiff_t. A "lazy" platform implementation could just choose a narrow type for ptrdiff_t and leave you dealing with that.
Checking for that would not be straightforward, since you can't do the subtraction without provoking UB. You'd have to use the information that the two pointers point inside (or just beyond) the same object, and where the boundaries of that surrounding object are.
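One way to use that information, sketched under the assumption that the element count of the enclosing array is known: two pointers into (or one past) an array of n elements differ by at most n in magnitude, so it is enough to compare n against PTRDIFF_MAX before subtracting. The helper names below are hypothetical, not from any library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the difference of any two pointers into
   (or one past the end of) an array of n_elems elements fits in
   ptrdiff_t. Such a difference always lies in [-n_elems, n_elems]. */
static bool diff_is_representable(size_t n_elems)
{
    return (uintmax_t)n_elems <= (uintmax_t)PTRDIFF_MAX;
}

/* Subtracts only after the bound has been checked. Returns false and
   leaves *out untouched if the subtraction might overflow.
   Precondition: hi and lo point into the same n_elems-element array. */
static bool checked_diff(const int *hi, const int *lo, size_t n_elems,
                         ptrdiff_t *out)
{
    if (!diff_is_representable(n_elems))
        return false;
    *out = hi - lo;
    return true;
}

/* Small demonstration: distance from the start of a 4-element array to
   its one-past-the-end position. */
static ptrdiff_t demo_checked_diff(void)
{
    int a[4] = {0};
    ptrdiff_t d = 0;
    return checked_diff(a + 4, a, 4, &d) ? d : -1;
}
```

The check never performs the subtraction it is guarding, which is the point: the bound comes from the object's size, not from the pointers.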

I agree with your understanding of paragraph 8. The standard says
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
It seems that C assumes that there is no pointer overflow inside an array, so you can increment/decrement pointers while you stay inside the array. If the result pointer is leaving the array, an overflow might occur and behaviour is undefined.
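A minimal sketch of the pattern paragraph 8 permits: the pointer walks from the first element up to the one-past-the-end position, which may be formed and compared but never dereferenced. The function names are illustrative only.

```c
#include <stddef.h>

/* Sums n ints by walking a pointer from a to a + n. Forming a + n is
   allowed by 6.5.6 paragraph 8; dereferencing it would not be. */
static int sum_ints(const int *a, size_t n)
{
    const int *end = a + n;   /* one past the last element */
    int total = 0;
    for (const int *p = a; p != end; ++p)
        total += *p;
    return total;
}

/* Small demonstration on a 4-element array. */
static int demo_sum(void)
{
    int a[4] = {1, 2, 3, 4};
    return sum_ints(a, 4);
}
```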
Regarding paragraph 9, I guess the standard takes into account that you might, for example, have an architecture with 32-bit pointers and 32-bit data types. The difference of two 32-bit pointers is in fact a sign plus 32 bits (so 33 bits), so not every pointer difference fits into a 32-bit ptrdiff_t. With a two's complement architecture this is not a problem, but it might be a problem on other architectures.

Related

C Pointer Arithmetic for Unusual Architectures

I'm trying to get a better understanding of the C standard. In particular I am interested in how pointer arithmetic might work in an implementation for an unusual machine architecture.
Suppose I have a processor with 64 bit wide registers that is connected to RAM where each address corresponds to a cell 8 bits wide. An implementation for C for this machine defines CHAR_BIT to be equal to 8. Suppose I compile and execute the following lines of code:
char *pointer = 0;
pointer = pointer + 1;
After execution, pointer is equal to 1. This gives one the impression that in general data of type char corresponds to the smallest addressable unit of memory on the machine.
Now suppose I have a processor with 12 bit wide registers that is connected to RAM where each address corresponds to a cell 4 bits wide. An implementation of C for this machine defines CHAR_BIT to be equal to 12. Suppose the same lines of code are compiled and executed for this machine. Would pointer be equal to 3?
More generally, when you increment a pointer to a char, is the address equal to CHAR_BIT divided by the width of a memory cell on the machine?
Would pointer be equal to 3?
Well, the standard doesn't say how pointers are implemented. The standard tells what is to happen when you use a pointer in a specific way but not what the value of a pointer shall be.
All we know is that adding 1 to a char pointer will make the pointer point at the next char object, wherever that is. It says nothing about the pointer's value.
So when you say that
pointer = pointer + 1;
will make the pointer equal 1, it's wrong. The standard doesn't say anything about that.
On most systems a char is 8 bits and pointers are (virtual) memory addresses referencing an 8-bit addressable memory location. On such systems, incrementing a char pointer will increase the pointer value (aka memory address) by 1. However, on unusual architectures there is no way to tell.
But if you have a system where each memory address references 4 bits and a char is 12 bits, it seems a good guess that ++pointer will increase the pointer by three.
Pointers are incremented by at least the width of the datatype they "point to", but are not guaranteed to increment by exactly that size.
For memory alignment purposes, there are many times where a pointer might increment to the next memory word alignment past the minimum width.
So, in general, you cannot assume this pointer to be equal to 3. It very well may be 3, 4, or some larger number.
Here is an example.
struct char_three {
char a;
char b;
char c;
};
struct char_three* my_pointer = 0;
my_pointer++;
/* I'd be shocked if my_pointer was now 3 */
Memory alignment is machine specific. One cannot generalize about it, except that most machines define a WORD as the first address that can be aligned to a memory fetch on the bus. Some machines can specify addresses that don't align with the bus fetches. In such a case, selecting two bytes that span the alignment may result in loading two WORDS.
Most systems don't accept WORD loads on non-aligned boundaries without complaining. This means that a bit of boilerplate assembly is applied to translate the fetch to the preceding WORD boundary, if maximum density is desired.
Most compilers prefer speed to maximum density of data, so they align their structured data to take advantage of WORD boundaries, avoiding the extra calculations. This means that in many cases, data that is not carefully aligned might contain "holes" of bytes that are not used.
If you are interested in details of the above summary, you can read up on Data Structure Alignment which will discuss alignment (and as a consequence) padding.
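A small sketch of what can be checked portably here: whatever padding the compiler adds, incrementing a pointer to a struct advances it by exactly sizeof of that struct when measured in bytes. The actual value of sizeof(struct char_three) is left to the implementation, so the code reports it rather than assuming 3 or 4.

```c
#include <stddef.h>

struct char_three {
    char a;
    char b;
    char c;
};

/* Distance in bytes between two consecutive array elements. Only the
   addresses are used, so the elements need no initialization. */
static size_t stride_in_bytes(void)
{
    struct char_three arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}

/* Number of padding bytes after the last member, if the compiler
   added any. */
static size_t trailing_padding(void)
{
    return sizeof(struct char_three)
         - (offsetof(struct char_three, c) + sizeof(char));
}
```

Printing stride_in_bytes() and trailing_padding() on a given compiler shows directly whether that compiler pads this struct.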
char *pointer = 0;
After execution, pointer is equal to 1
Not necessarily.
This special case gives you a null pointer, since 0 is a null pointer constant. Strictly speaking, such a pointer is not supposed to point at a valid object. If you look at the actual address stored in the pointer, it could be anything.
Null pointers aside, the C language expects you to do pointer arithmetic by first pointing at an array. Or in case of char, you can also point at a chunk of generic data such as a struct. Everything else, like your example, is undefined behavior.
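As a sketch of the "chunk of generic data" case: an unsigned char pointer may step across the bytes of any object, treating it as an array of sizeof(object) bytes. The struct and function names below are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

struct pair { int x; double y; };   /* hypothetical example type */

/* Counts zero bytes in an object's representation by stepping an
   unsigned char pointer from its first byte to one past its last. */
static size_t count_zero_bytes(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    const unsigned char *end = p + size;
    size_t zeros = 0;
    for (; p != end; ++p)
        if (*p == 0)
            ++zeros;
    return zeros;
}

/* Small demonstration: after memset every byte, padding included, is 0. */
static size_t demo_zero_bytes(void)
{
    struct pair v;
    memset(&v, 0, sizeof v);
    return count_zero_bytes(&v, sizeof v);
}
```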
An implementation of C for this machine defines CHAR_BIT to be equal to 12
The C standard defines char to be equal to a byte, so your example is a bit weird and contradictory. Pointer arithmetic will always increase the pointer to point at the next object in the array. The standard doesn't really speak of the representation of addresses at all, but in your fictional example it would sensibly increase the address by 12 bits, because that's the size of a char.
Fictional computers are quite meaningless to discuss even from a learning point-of-view. I'd advise to focus on real-world computers instead.
When you increment a pointer to a char, is the address equal to CHAR_BIT divided by the width of a memory cell on the machine?
On a "conventional" machine -- indeed on the vast majority of machines where C runs -- CHAR_BIT simply is the width of a memory cell on the machine, so the answer to the question is vacuously "yes" (since CHAR_BIT / CHAR_BIT is 1).
A machine with memory cells smaller than CHAR_BIT would be very, very strange -- arguably incompatible with C's definition.
C's definition says that:
sizeof(char) is exactly 1.
CHAR_BIT, the number of bits in a char, is at least 8. That is, as far as C is concerned, a byte may not be smaller than 8 bits. (It may be larger, and this is a surprise to many people, but it does not concern us here.)
There is a strong suggestion (if not an explicit requirement) that char (or "byte") is the machine's "minimum addressable unit" or some such.
So for a machine that can address 4 bits at a time, we would have to pick unnatural values for sizeof(char) and CHAR_BIT (which would otherwise probably want to be 2 and 4, respectively), and we would have to ignore the suggestion that type char is the machine's minimum addressable unit.
C does not mandate the internal representation (the bit pattern) of a pointer. The closest a portable C program can get to doing anything with the internal representation of a pointer value is to print it out using %p -- and that's explicitly defined to be implementation-defined.
So I think the only way to implement C on a "4 bit" machine would involve having the code
char a[10];
char *p = a;
p++;
generate instructions which actually incremented the address behind p by 2.
It would then be an interesting question whether %p should print the actual, raw pointer value, or the value divided by 2.
It would also be lots of fun to watch the ensuing fireworks as too-clever programmers on such a machine used type punning techniques to get their hands on the internal value of pointers so that they could increment them by actually 1 -- not the 2 that "proper" additions of 1 would always generate -- such that they could amaze their friends by accessing the odd nybble of a byte, or confound the regulars on SO by asking questions about it. "I just incremented a char pointer by 1. Why is %p showing a value that's 2 greater?"
Seems like the confusion in this question comes from the fact that the word "byte" in the C standard doesn't have the typical definition (which is 8 bits). Specifically, the word "byte" in the C standard means a collection of bits, where the number of bits is specified by the implementation-defined constant CHAR_BIT. Furthermore, a "byte" as defined by the C standard is the smallest addressable object that a C program can access.
This leaves open the question as to whether there is a one-to-one correspondence between the C definition of "addressable", and the hardware's definition of "addressable". In other words, is it possible that the hardware can address objects that are smaller than a "byte"? If (as in the OP) a "byte" occupies 3 addresses, then that implies that "byte" accesses have an alignment restriction. Which is to say that 3 and 6 are valid "byte" addresses, but 4 and 5 are not. This is prohibited by section 6.2.8 which discusses the alignment of objects.
Which means that the architecture proposed by the OP is not supported by the C specification. In particular, an implementation may not have pointers that point to 4-bit objects when CHAR_BIT is equal to 12.
Here are the relevant sections from the C standard:
§3.6 The definition of "byte" as used in the standard
[A byte is an] addressable unit of data storage large enough to hold
any member of the basic character set of the execution environment.
NOTE 1 It is possible to express the address of each individual byte
of an object uniquely.
NOTE 2 A byte is composed of a contiguous sequence of bits, the number
of which is implementation-defined. The least significant bit is
called the low-order bit; the most significant bit is called the
high-order bit.
§5.2.4.2.1 describes CHAR_BIT as the
number of bits for smallest object that is not a bit-field (byte)
§6.2.6.1 restricts all objects that are larger than a char to be a multiple of CHAR_BIT bits:
[...]
Except for bit-fields, objects are composed of contiguous sequences of
one or more bytes, the number, order, and encoding of which are either
explicitly specified or implementation-defined.
[...] Values stored in non-bit-field objects of any other object type
consist of n × CHAR_BIT bits, where n is the size of an object of that
type, in bytes.
§6.2.8 restricts the alignment of objects
Complete object types have alignment requirements which place
restrictions on the addresses at which objects of that type may be
allocated. An alignment is an implementation-defined integer value
representing the number of bytes between successive addresses at which
a given object can be allocated.
Valid alignments include only those values returned by an _Alignof
expression for fundamental types, plus an additional
implementation-defined set of values, which may be empty. Every
valid alignment value shall be a nonnegative integral power of two.
§6.5.3.4 specifies the sizeof a char, and hence a "byte"
When sizeof is applied to an operand that has type char, unsigned
char, or signed char, (or a qualified version thereof) the result is
1.
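The quoted guarantees can be written down as compile-time checks. This is a sketch that assumes a C11 compiler (for _Static_assert); int_object_bits is a hypothetical helper name.

```c
#include <limits.h>
#include <stddef.h>

/* Both conditions are guaranteed by the standard, so these assertions
   should never fire on a conforming implementation. */
_Static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
_Static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");

/* Number of bits in the object representation of an int:
   n x CHAR_BIT, where n is the size of the type in bytes. */
static size_t int_object_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}
```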
The following code fragment demonstrates an invariant of C pointer arithmetic -- no matter what CHAR_BIT is, no matter what the hardware least addressable unit is, and no matter what the actual bit representation of pointers is,
#include <assert.h>
int main(void)
{
T x[2]; // for any object type T whatsoever
assert(&x[1] - &x[0] == 1); // must be true
}
And since sizeof(char) == 1 by definition, this also means that
#include <assert.h>
int main(void)
{
T x[2]; // again for any object type T whatsoever
char *p = (char *)&x[0];
char *q = (char *)&x[1];
assert(q - p == sizeof(T)); // must be true
}
However, if you convert to integers before performing the subtraction, the invariant evaporates:
#include <assert.h>
#include <inttypes.h>
int main(void)
{
T x[2];
uintptr_t p = (uintptr_t)&x[0];
uintptr_t q = (uintptr_t)&x[1];
assert(q - p == sizeof(T)); // implementation-defined whether true
}
because the transformation performed by converting a pointer to an integer of the same size, or vice versa, is implementation-defined. I think it's required to be bijective, but I could be wrong about that, and it is definitely not required to preserve any of the above invariants.

Using bit operations to "turn off" binary digits of a pointer

I was able to use bit operations to "turn off" binary digits of a number.
Ex:
x = x & ~(1<<0)
x = x & ~(1<<1)
(and repeat until desired number of digits starting from the right are changed to 0)
I would like to apply this technique to a pointer's address.
Unfortunately, the & operator cannot be used with pointers. Using the same lines of code as above, where x is a pointer, the compiler says "invalid operands to binary & (have int and int)."
I tried to typecast the pointers as ints, but that doesn't work as I assume the ints are too small (and I just realized I'm not allowed to cast).
(note: though this is part of a homework problem, I've already reasoned out why I need to turn off some digits after a good couple hours, so I'm fine in that regard. I'm simply trying to see if I can get a clever technique to do what I want to do here).
Restrictions: I cannot use loops, conditionals, any special functions, constants greater than 255, division, mod.
(edit: added restrictions to the bottom)
Use uintptr_t from <stdint.h>. You should always use unsigned types for bit twiddling, and (u)intptr_t is specifically chosen to be able to hold a pointer's value.
Note, however, that adjusting a pointer manually and dereferencing it is undefined behaviour, so watch your step. You must recover the exact original value of the pointer (or another valid pointer) before dereferencing it.
Edit : from your comment I understand that you don't plan on dereferencing the twiddled pointer at all, so no undefined behaviour for you. Here is how you can check if your pointers share the same 64-byte block :
uintptr_t p1 = (uintptr_t)yourPointer1;
uintptr_t p2 = (uintptr_t)yourPointer2;
uintptr_t mask = ~(uintptr_t)63u; // Shave off 6 low-order bits
return (p1 & mask) == (p2 & mask);
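For completeness, the same fragment as a self-contained function. This sketch assumes that uintptr_t is available and that the implementation converts pointers to integers as flat byte addresses, which the standard does not promise; the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if a and b fall in the same 64-byte block.
   ASSUMPTION: pointer-to-integer conversion yields a flat byte address. */
static int same_64_byte_block(const void *a, const void *b)
{
    uintptr_t mask = ~(uintptr_t)63u;   /* clears the 6 low-order bits */
    return ((uintptr_t)a & mask) == ((uintptr_t)b & mask);
}

/* Small demonstration: find a 64-byte boundary inside a buffer, then
   compare offsets 0 and 63 (same block) and 0 and 64 (different). */
static int demo_same_block(void)
{
    static unsigned char buf[192];
    uintptr_t start = (uintptr_t)buf;
    size_t pad = (size_t)((64u - (start & 63u)) & 63u);
    unsigned char *base = buf + pad;
    return same_64_byte_block(base, base + 63)
        && !same_64_byte_block(base, base + 64);
}
```

Neither function dereferences a pointer derived from a modified integer; only the integer values are compared.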
The C standard library includes the (optional) type intptr_t, for which there is a guarantee that "any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer".
Of course, if you perform bitwise operations on the integer, then converting the result back to a pointer and dereferencing it is undefined behaviour.
Edit:
How unfortunate haha. I need a function to show two pointers are in
the same 64-byte block of memory. This holds true so long as every
digit but the least significant 6 digits of their binary
representations are equal. By making sure the last 6 digits are all
the same (ex: 0), I can return true if both pointers are equal. Well,
at least I hope so.
You should be able to check if they're in the same 64-byte block of memory by something like this:
if ((char *)high_pointer - (char *)low_pointer < 64) {
// do stuff
}
Edit2: This is likely to be undefined behaviour as pointed out by chris.
Original post:
You're probably looking for intptr_t or uintptr_t. The standard says you can cast to and from these types to pointers and have the value equal to the original.
However, despite it being a standard type, it is optional so some library implementations may choose not to implement it. Some architectures might not even represent pointers as integers so such a type wouldn't make sense.
It is still better than casting to and from an int or a long since it is guaranteed to work on implementations that supply it. Otherwise, at least you'll know at compile time that your program will break on a certain implementation/architecture.
(Oh, and as other answers have stated, manually changing the pointer when cast to an integer type and dereferencing it is undefined behaviour)
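A minimal sketch of the one thing the standard does guarantee for uintptr_t, on implementations that provide it: an unmodified round trip through the integer type compares equal to the original pointer.

```c
#include <stdint.h>

/* Converts a valid void pointer to uintptr_t and back without touching
   the integer, then compares the result with the original. */
static int round_trip_ok(void)
{
    int x = 42;
    void *original = &x;
    uintptr_t as_integer = (uintptr_t)original;
    void *restored = (void *)as_integer;
    return restored == original;
}
```

Nothing comparable is promised once the integer has been modified in between.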

Initializing bit-fields

When you write
struct {
unsigned a:3, b:2;
} x = {10, 11};
is x.b guaranteed to be 3 by ANSI C (C89)? I have read and reread the standard, but can't seem to find exactly that case.
For example, "result that cannot be represented by the
resulting unsigned integer type is reduced modulo the number that is
one greater than the largest value that can be represented by the
resulting unsigned integer type." speaks about computation, not about initialization. And moreover, bit-field is not really a type.
Also, (when speaking about unsigned t:4) "contains values in the range [0,15]", but it doesn't necessarily mean that initializer must be reduced modulo 16 to be mapped to [0,15].
Struct initialization is described in painstaking detail, but I really can't seem to find exactly that behavior. (Of course compilers do exactly that. And IBM documentation says "when you assign a value that is out of range to a bit field, the low-order bit pattern is preserved and the appropriate bits are assigned.", but I'd like to know if ANSI C standardizes that.)
"ANSI C"/C89 has been obsolete for 25 years. Therefore, my answer cites the current C standard ISO 9899:2011, also known as C11.
Pretty much everything related to bit-fields in the C standard is poorly defined. Typically, you will not find anything explicitly addressing the behavior of bit fields, but their behavior is rather specified implicitly, "between the lines". This is why you should avoid using bit fields.
However, I believe that this specific case is well-defined: it should work like any other integer initialization.
The detailed struct initialization rules you mention (6.7.9) show how the literal 11 in the initializer list is related to the variable b. Nothing strange with that. What then applies is "simple assignment", the same thing that would happen as if you wrote x.b = 11;.
When doing any kind of assignment or initialization in C, the right operand is converted to the type of the left operand. This is specified by C11 6.5.16:
In simple assignment (=), the value of the right operand is converted
to the type of the assignment expression and replaces the value stored
in the object designated by the left operand.
In your case, the literal 11 of type int is converted to a bit field of unsigned int:2.
Therefore, the rule you are looking for should be found in the chapter dealing with conversions (C11 6.3). What applies is what you already cited in your question, C11 6.3.1.3:
...if the new type is unsigned, the value is converted by repeatedly
adding or subtracting one more than the maximum value that can be
represented in the new type until the value is in the range of the new
type.
The maximum value of an unsigned int:2 is 3. One more than the maximum value is 3+1=4. The compiler should repeatedly subtract this from the value 11:
11 - (3+1) = 7 does not fit, subtract once more:
7 - (3+1) = 3 does fit, store value 3
But then of course, this is the very same thing as taking the 2 least significant bits of the decimal value 11 and storing them in the bit field.
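The arithmetic above can be tried out with a short sketch. It stores through assignment from a run-time value rather than a constant initializer, on the reading given above that both follow the same conversion rule; the function names are made up for illustration.

```c
struct fields {
    unsigned a : 3, b : 2;
};

/* Stores v in the 2-bit field b and reads it back; the stored value is
   v reduced modulo 4. */
static unsigned store_in_b(unsigned v)
{
    struct fields x = {0, 0};
    x.b = v;
    return x.b;
}

/* Stores v in the 3-bit field a and reads it back; the stored value is
   v reduced modulo 8. */
static unsigned store_in_a(unsigned v)
{
    struct fields x = {0, 0};
    x.a = v;
    return x.a;
}
```

With the values from the question, store_in_b(11) yields 3 and store_in_a(10) yields 2.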
WRT "speaks about computation, not about initialization", the C89 standard explicitly applies the rules of assignment and conversion to initialization. It also says:
A bit-field is interpreted as an integral type consisting of the specified number of bits.
Given those, while a compiler warning would clearly be in order, it seems that throwing away upper-order bits is guaranteed by the standard.

Is pointer tagging in C undefined according to the standard?

Some dynamically-typed languages use pointer tagging as a quick way to identify or narrow down the runtime type of the value being represented. A classic way to do this is to convert pointers to a suitably sized integer, and add a tag value over the least significant bits which are assumed to be zero for aligned objects. When the object needs to be accessed, the tag bits are masked away, the integer is converted to a pointer, and the pointer is dereferenced as normal.
This by itself is all in order, except it all hinges on one colossal assumption: that the aligned pointer will convert to an integer guaranteed to have zero bits in the right places.
Is it possible to guarantee this according to the letter of the standard?
Although standard section 6.3.2.3 (references are to the C11 draft) says that the result of a conversion from pointer to integer is implementation-defined, what I'm wondering is whether the pointer arithmetic rules in 6.5.2.1 and 6.5.6 effectively constrain the result of pointer->integer conversion to follow the same predictable arithmetic rules that many programs already assume. (6.3.2.3 note 67 seemingly suggests that this is the intended spirit of the standard anyway, not that that means much.)
I'm specifically thinking of the case where one might allocate a large array to act as a heap for the dynamic language, and therefore the pointers we're talking about are to elements of this array. I'm assuming that the start of the C-allocated array itself can be placed at an aligned position by some secondary means (by all means discuss this too though). Say we have an array of eight-byte "cons cells"; can we guarantee that the pointer to any given cell will convert to an integer with the lowest three bits free for a tag?
For instance:
typedef Cell ...; // such that sizeof(Cell) == 8
Cell heap[1024]; // such that ((uintptr_t)&heap[0]) & 7 == 0
((char *)&heap[11]) - ((char *)&heap[10]); // == 8
(Cell *)(((char *)&heap[10]) + 8); // == &heap[11]
&(&heap[10])[0]; // == &heap[10]
0[heap]; // == heap[0]
// So...
&((char *)0)[(uintptr_t)&heap[10]]; // == &heap[10] ?
&((char *)0)[(uintptr_t)&heap[10] + 8]; // == &heap[11] ?
// ...implies?
(Cell *)((uintptr_t)&heap[10] + 8); // == &heap[11] ?
(If I understand correctly, if an implementation provides uintptr_t then the undefined behaviour hinted at in 6.3.2.3 paragraph 6 is irrelevant, right?)
If all of these hold, then I would assume that it means that you can in fact rely on the low bits of any converted pointer to an element of an aligned Cell array to be free for tagging. Do they && does it?
(As far as I'm aware this question is hypothetical since the normal assumption holds for common platforms anyway, and if you found one where it didn't, you probably wouldn't want to look to the C standard for guidance rather than the platform docs; but that's beside the point.)
This by itself is all in order, except it all hinges on one colossal
assumption: that the aligned pointer will convert to an integer
guaranteed to have zero bits in the right places.
Is it possible to guarantee this according to the letter of the
standard?
It's possible for an implementation to guarantee this. The result of converting a pointer to an integer is implementation-defined, and an implementation can define it any way it likes, as long as it meets the standard's requirements.
The standard absolutely does not guarantee this in general.
A concrete example: I've worked on a Cray T90 system, which had a C compiler running under a UNIX-like operating system. In the hardware, an address is a 64-bit word containing the address of a 64-bit word; there were no hardware byte addresses. Byte pointers (void*, char*) were implemented in software by storing a 3-bit offset in the otherwise unused high-order 3 bits of a 64-bit word pointer.
All pointer-to-pointer, pointer-to-integer, and integer-to-pointer conversions simply copied the representation.
Which means that a pointer to an 8-byte aligned object, when converted to an integer, could have any bit pattern in its low-order 3 bits.
Nothing in the standard forbids this.
The bottom line: A scheme like the one you describe, that plays games with pointer representations, can work if you make certain assumptions about how the current system represents pointers -- as long as those assumptions happen to be valid for the current system.
But no such assumptions can be 100% reliable, because the standard says nothing about how pointers are represented (other than that they're of a fixed size for each pointer type, and that the representation can be viewed as an array of unsigned char).
(The standard doesn't even guarantee that all pointers are the same size.)
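A sketch of a tagging scheme that states its assumption in code instead of relying on it silently. It assumes uintptr_t exists and checks at run time whether the low three bits of a converted pointer really are zero on the current implementation. All names are hypothetical, and _Alignas requires C11.

```c
#include <stdint.h>

#define TAG_MASK ((uintptr_t)7u)

/* ASSUMPTION being tested, not guaranteed by the standard: an 8-byte
   aligned pointer converts to an integer whose low 3 bits are zero. */
static int low_bits_are_free(const void *p)
{
    return ((uintptr_t)p & TAG_MASK) == 0;
}

/* Stores a 3-bit tag in the low bits of the converted pointer. */
static uintptr_t tag_pointer(const void *p, unsigned tag)
{
    return (uintptr_t)p | ((uintptr_t)tag & TAG_MASK);
}

/* Masks the tag away and converts back to a pointer. */
static void *untag_pointer(uintptr_t tagged)
{
    return (void *)(tagged & ~TAG_MASK);
}

/* Extracts the 3-bit tag. */
static unsigned tag_of(uintptr_t tagged)
{
    return (unsigned)(tagged & TAG_MASK);
}

/* Returns 1 if tagging round-trips, 0 if it does not, and -1 if the
   low-bit assumption does not hold on this implementation. */
static int demo_tagging(void)
{
    static _Alignas(8) unsigned char cell[8];   /* stand-in "cons cell" */
    if (!low_bits_are_free(cell))
        return -1;
    uintptr_t t = tag_pointer(cell, 5u);
    return untag_pointer(t) == (void *)cell && tag_of(t) == 5u;
}
```

On an implementation like the Cray T90 described above, low_bits_are_free could return 0, and the scheme would have to be disabled or redesigned.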
You're right about the relevant parts of the standard. For reference:
An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
Since the conversions are implementation defined (except when the integer type is too small, in which case it's undefined), there's nothing the standard is going to tell you about this behaviour. If your implementation makes the guarantees you want, you're set. Otherwise, too bad.
I guess the answer to your explicit question:
Is it possible to guarantee this according to the letter of the standard?
Is "yes", since the standard punts on this behaviour and says the implementation has to define it. Arguably, "no" is just as good an answer for the same reason.

Is using memcmp on array of int strictly conforming?

Is the following program a strictly conforming program in C? I am interested in c90 and c99 but c11 answers are also acceptable.
#include <stdio.h>
#include <string.h>
struct S { int array[2]; };
int main () {
struct S a = { { 1, 2 } };
struct S b;
b = a;
if (memcmp(b.array, a.array, sizeof(b.array)) == 0) {
puts("ok");
}
return 0;
}
In comments to my answer in a different question, Eric Postpischil insists that the program output will change depending on the platform, primarily due to the possibility of uninitialized padding bits. I thought the struct assignment would overwrite all bits in b to be the same as in a. But, C99 does not seem to offer such a guarantee. From Section 6.5.16.1 p2:
In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
What is meant by "converted" and "replaces" in the context of compound types?
Finally, consider the same program, except that the definitions of a and b are made global. Would that program be a strictly conforming program?
Edit: Just wanted to summarize some of the discussion material here, and not add my own answer, since I don't really have one of my own creation.
The program is not strictly conforming. Since the assignment is by value and not by representation, b.array may or may not contain bits set differently from a.array.
a doesn't need to be converted since it is the same type as b, but the replacement is by value, and done member by member.
Even if the definitions in a and b are made global, post assignment, b.array may or may not contain bits set differently from a.array. (There was little discussion about the padding bytes in b, but the posted question was not about structure comparison. c99 lacks a mention of how padding is initialized in static storage, but c11 explicitly states it is zero initialized.)
On a side note, there is agreement that the memcmp is well defined if b was initialized with memcpy from a.
My thanks to all involved in the discussion.
In C99 §6.2.6
§6.2.6.1 General
1 The representations of all types are unspecified except as stated in this subclause.
[...]
4 [..] Two values (other than NaNs) with the same object representation compare equal, but values that compare equal may have different object representations.
6 When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.42)
42) Thus, for example, structure assignment need not copy any padding bits.
43) It is possible for objects x and y with the same effective type T to have the same value when they are accessed as objects of type T, but to have different values in other contexts. In particular, if == is defined for type T, then x == y does not imply that memcmp(&x, &y, sizeof (T)) == 0. Furthermore, x == y does not necessarily imply that x and y have the same value; other operations on values of type T may distinguish between them.
§6.2.6.2 Integer Types
[...]
2 For signed integer types, the bits of the object representation shall be divided into three groups: value bits, padding bits, and the sign bit. There need not be any padding bits;[...]
[...]
5 The values of any padding bits are unspecified.[...]
In J.1 Unspecified Behavior
The value of padding bytes when storing values in structures or unions (6.2.6.1).
[...]
The values of any padding bits in integer representations (6.2.6.2).
Therefore there may be bits in the representation of a and b that differ while not affecting the value. This is the same conclusion as the other answer, but I thought that these quotes from the standard would be good additional context.
If you do a memcpy then the memcmp would always return 0 and the program would be strictly conforming. The memcpy duplicates the object representation of a into b.
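A sketch contrasting the two comparisons discussed here: memcmp after memcpy compares object representations that were copied byte for byte, while a member-by-member comparison looks only at values and is therefore unaffected by padding. The function names are illustrative.

```c
#include <string.h>

struct S { int array[2]; };

/* Copies the object representation of a into b, then compares the
   representations. */
static int copies_compare_equal(void)
{
    struct S a = { { 1, 2 } };
    struct S b;
    memcpy(&b, &a, sizeof b);
    return memcmp(&b, &a, sizeof b) == 0;
}

/* Value comparison: padding bits and bytes play no part in the result. */
static int values_equal(const struct S *x, const struct S *y)
{
    return x->array[0] == y->array[0] && x->array[1] == y->array[1];
}

/* After plain assignment, the values are equal even if the
   representations are not required to be. */
static int demo_assign_values_equal(void)
{
    struct S a = { { 1, 2 } };
    struct S b;
    b = a;
    return values_equal(&a, &b);
}
```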
My opinion is that it is strictly conforming. According to 4.5 that Eric Postpischil mentioned:
A strictly conforming program shall use only those features of the
language and library specified in this International Standard. It
shall not produce output dependent on any unspecified, undefined, or
implementation-defined behavior, and shall not exceed any minimum
implementation limit.
The behavior in question is the behavior of memcmp, and this is well-defined, without any unspecified, undefined or implementation-defined aspects. It works on the raw bits of the representation, without knowing anything about the values, padding bits or trap representations. Thus the result (but not the functionality) of memcmp in this specific case depends on the implementation of the values stored within these bytes.
Footnote 43) in 6.2.6.1:
It is possible for objects x and y with the same effective type T to
have the same value when they are accessed as objects of type T, but
to have different values in other contexts. In particular, if == is
defined for type T, then x == y does not imply that memcmp(&x, &y,
sizeof (T)) == 0. Furthermore, x == y does not necessarily imply that
x and y have the same value; other operations on values of type T may
distinguish between them.
EDIT:
Thinking it a bit further, I'm not so sure about the strictly conforming anymore because of this:
It shall not produce output dependent on any unspecified [...]
Clearly the result of memcmp depends on the unspecified values in the representation, so the program falls under this clause, even though the behavior of memcmp itself is well defined. The clause doesn't say anything about how many steps may lie between the unspecified behavior and the output.
So it is not strictly conforming.
EDIT 2:
I'm not so sure that it will become strictly conforming when memcpy is used to copy the struct. According to Annex J, the unspecified behavior happens when a is initialized:
struct S a = { { 1, 2 } };
Even if we assume that the padding bits won't change and memcmp always returns 0, it still uses the padding bits to obtain its result. And it relies on the assumption that they won't change, but there is no guarantee in the standard about this.
We should differentiate between padding bytes in structs, used for alignment, and padding bits in specific native types like int. While we can safely assume that the padding bytes won't change (but only because there is no real reason for them to), the same does not apply to the padding bits. The standard mentions a parity flag as an example of a padding bit. This may be a software function of the implementation, but it may as well be a hardware function. Thus there may be other hardware flags used for the padding bits, including one that changes on read accesses for whatever reason.
We will have difficulty finding such an exotic machine and implementation, but I see nothing that forbids this. Correct me if I'm wrong.
