Consider this code in block scope:
struct foo { unsigned char a; unsigned char b; } x, y;
x.a = 0;
y = x;
C [N1570] 6.3.2.1 2 says “If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.”
Although a member of x has been assigned a value, no assignment to x has been performed, and its address has not been taken. Thus, it appears 6.3.2.1 2 tells us the behavior of x in y = x is undefined.
However, if we had assigned a value to every member of x, it would seem unreasonable to consider x to be uninitialized for the purposes of 6.3.2.1 2.
(1) Is there anything in the standard which, strictly speaking, causes 6.3.2.1 2 not to apply to (make undefined) the code above?
(2) Supposing we were modifying the standard or determining a reasonable modification to 6.3.2.1 2, are there reasons to prefer one of the following over the others? (a) 6.3.2.1 2 does not apply to structures. (b) If at least one member of a structure has been assigned a value, the structure is not uninitialized for purposes of 6.3.2.1 2. (c) If all named [1] members of a structure have been assigned a value, the structure is not uninitialized for purposes of 6.3.2.1 2.
Footnote
[1] Structures may have unnamed members, so it is not always possible to assign a value to every member of a structure. (Unnamed members have indeterminate value even if the structure is initialized, per 6.7.9 9.)
My opinion is that it is undefined behaviour simply because it is not explicitly defined by the standard. From 4 Conformance §2 (emphasis mine):
...Undefined behavior is otherwise
indicated in this International Standard by the words ‘‘undefined behavior’’ or by the
omission of any explicit definition of behavior.
After many readings of the N1570 draft, I cannot find any explicit definition of behaviour for using a partially initialized struct. On one hand, 6.3.2.1 §2 says:
...If
the lvalue designates an object of automatic storage duration that could have been
declared with the register storage class (never had its address taken), and that object
is uninitialized (not declared with an initializer and no assignment to it has been
performed prior to use), the behavior is undefined
so here x is automatic, has never been initialized (only one of its members has), and admittedly its address is never taken, so we could think that it is explicitly UB.
On the other hand, 6.2.6.1 §6 says:
... The value of a structure or union object is never a trap representation, even though the value of a member of the structure or union object may be a trap representation.
As 6.2.6.1 §5 has just defined a trap representation:
Certain object representations need not represent a value of the object type. If the stored
value of an object has such a representation and is read by an lvalue expression that does
not have character type, the behavior is undefined. If such a representation is produced
by a side effect that modifies all or any part of the object by an lvalue expression that
does not have character type, the behavior is undefined.50) Such a representation is called
a trap representation.
we could think that it is always legal to take the value of a struct because it cannot be a trap representation
In addition, it is not clear to me whether setting the value of a member of a struct actually leaves the struct in an uninitialized state.
For all those reasons, I think that the standard does not clearly define what the behaviour should be, and simply for that reason it is undefined behaviour.
That being said, I am pretty sure that any common compiler will accept it and will give y the current representation of x: the value 0 for member a, and for member b an indeterminate value with the same representation as the current x.b.
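For code that has to copy such a structure today, one way to sidestep the question is to give x an initializer, so that it is no longer "uninitialized" in the sense of 6.3.2.1 §2. The following is only a minimal sketch: it assumes that zero is an acceptable value for the members that are not assigned later, and the function name copy_partially_assigned is made up for illustration.

```c
#include <assert.h>

struct foo { unsigned char a; unsigned char b; };

/* Hypothetical helper: x is declared with an initializer,
   so the quoted sentence of 6.3.2.1 §2 does not apply to it. */
struct foo copy_partially_assigned(void)
{
    struct foo x = {0};   /* every member starts at zero */
    struct foo y;

    x.a = 7;              /* only one member is assigned afterwards */
    y = x;                /* x was declared with an initializer */
    assert(y.a == 7 && y.b == 0);
    return y;
}
```

This does not answer whether the original code is defined; it only avoids depending on the answer.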
Firstly, let's note that the quoted part of 6.3.2.1/2, the so-called "Itanium clause" is the only clause under which this code might have a problem. In other words, if this clause were not present, the code is fine. Structs may not have trap representations, so y = x; is otherwise OK even if x is entirely uninitialized. The resolution of DR 451 clarifies that indeterminate values may be propagated by assignment, without causing UB.
Back to the Itanium clause here. As you point out, the Standard does not clearly specify whether x.a = 0; negates the precondition "x is uninitialized".
IMO, this means we should turn to the rationale for the Itanium clause to determine the intent. The purpose of the wording of the standard document, in general, is to implement an intent; generally speaking, I don't agree with being dogmatic about minute detail of the standard: taking shades of meaning out of the wording that were not intended by those who created the wording.
This Q/A gives a good explanation of the rationale. The potential problem is that x might be stored in a register with the NaT bit set, and then y = x will cause a hardware exception due to reading a register that has that bit set.
So the question is: On IA64 does x.a = 0; clear the NaT bit? I don't know and I guess we would need someone familiar with that platform to give a conclusive answer here.
Naively, I imagine that if x is in a register then, in general, x.a = 0; will need to read the old value, and apply a mask to clear the bits for a, thereby triggering the exception if x was NaT. However, x.a = 0; cannot trigger UB, so that logic must be incorrect. Perhaps IA64 compilers never store a struct in a register, or perhaps they clear the NaT bit on declaration of one, or perhaps there's a hardware instruction to implement x.a = 0; on a previously-NaT register, I don't know.
Copying a partially-written structure falls in the category of actions which quality implementations will process in consistent fashion absent a good reason to do otherwise, specialized implementations might process differently because they have a good reason to do so, and poor-quality-but-conforming implementations may use as an excuse to behave nonsensically.
Note that copying uninitialized values of an automatic-duration or malloc-created character array would fall in a similar category of actions, except that implementations that would trap on such an action (e.g. to help programmers identify and track down potential information leaks) would not be allowed to describe themselves as "conforming".
An implementation which is specialized to diagnose accidental information leaks might sensibly trap efforts to copy a partially-written structure. On an implementation where using an uninitialized value of some type could result in strange behavior, copying a structure with an uninitialized member of that type and then attempting to use that member of the copy might sensibly do likewise.
The Standard doesn't particularly say whether a partially-written structure counts as having been written or not, because people seeking to produce quality implementations shouldn't care. Quality implementations specialized for detecting potential information leakage should squawk at any attempt to copy uninitialized data, without regard for when the Standard would or would not allow such behavior (provided that they describe themselves as non-conforming). Quality general-purpose implementations designed to support a wide variety of programs should allow partially-initialized structures to be copied in cases where programs don't look at the uninitialized portions outside the context of whole-structure copying (such treatment is useful and generally costs nothing in non-contrived cases). The Standard could be construed as granting poor-quality-but-conforming implementations the right to treat copying of partially-written structures as an excuse to behave nonsensically, but such implementations could use almost anything as such an excuse. Quality implementations won't do anything unusual when copying structures unless they document a good reason for doing so.
The C Standard specifies that structure types cannot have trap representations, although members of structs may. The primary circumstance in which that guarantee would be useful would be in cases involving partially-written structures. Further, a prohibition on copying structures before one had written all members, even ones the recipient of the copy would never use, would require programmers to write needlessly-inefficient code and serve no useful purpose. Imposing such a requirement in the name of "optimization" would be downright dumb, and I know of no evidence that the authors of the Standard intended to do so.
Unfortunately, the authors of the Standard use the same terminology to describe two situations:
Some implementations define the behavior of some action X in all cases, while some only define it for some; other parts of the Standard define the action in a few select cases. The authors want to say that implementations need not behave like the ones that define the behavior in all cases, without revoking guarantees made elsewhere in the Standard
Although other parts of the Standard would define the behavior of action X in some cases, guaranteeing the behavior in all such cases could be expensive, and implementations are not required to guarantee them even in cases where other parts of the Standard would define them.
Before the Standard was written, some implementations would zero-initialize all automatic variables. Thus, those implementations would guarantee the behavior of reading uninitialized values, even of types with trap representations. The authors of the Standard wished to make clear that they did not want to require all implementations do likewise. Further, some objects may define the behavior of all bit patterns when stored in memory, but not when stored in registers. Such treatment would generally be limited to scalar types, however, rather than structures.
From a practical perspective, defining the behavior of copying a structure as copying the state (defined or indeterminate) of all fields would not cost any more than allowing compilers to behave in arbitrary fashion when copying partially-written structures. Unfortunately, some compiler writers erroneously believe that "cleverness" and "stupidity" are antonyms, and thus behave as though the authors of the Standard wished to invite compilers to assume that programs will never receive any input that would cause structures to be copied after having been partially written.
On cppreference.com, in the section Implicit conversions, in the subsection "Lvalue conversion", it is noted that
[i]f the lvalue designates an object of automatic storage duration whose address was never taken and if that object was uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined. [emphasis mine]
From that, I understand that the "act of taking an address" of an object at some point in time may influence in some way whether the undefined behavior happens or not later when this object "is used". If I'm right, then it seems at least unusual.
Am I right? If so, how is that possible? If not, what am I missing?
cppreference.com is deriving this from a rule in the C standard. C 2018 6.3.2.1 2 says:
If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.
So, the reason that taking an address matters is fundamentally because “the C standard says so” rather than because taking the address “does something” in the model of computing.
The reason this rule was added to the C standard was to support some behaviors Hewlett-Packard (HP) desired for its Itanium processor. In that processor, each of certain registers has an associated bit that indicates the register is “uninitialized.” So HP was able to make programs detect and trap in some instances where an object had not been initialized. (This detection does not extend to memory; the bit was only associated with a processor register.)
By saying the behavior is undefined if an uninitialized object is used, the C standard allows a trap to occur but also allows that a trap might not occur. So it allowed HP’s behavior of trapping, it allowed it when HP’s software did not detect the issue and so did not trap, and it allowed other vendors to ignore this and provide whatever value happened to be in a register, as well as other behaviors that might arise from optimization by the compiler.
As for predicating the undefined behavior based on automatic storage duration and not taking the address, I suspect this was a bit of a kludge. It provides a criterion that worked for the parties involved: HP was able to design their compiler to use the “uninitialized” feature with their registers, but the rule does not carve out a great deal of object use as undefined behavior. For example, somebody might want to write an algorithm that processes large parts of many arrays en masse, ignoring that a few values “along the edges” of defined regions might be uninitialized. The idea there would be that, in some situations, it is more efficient to do a block of operations and then, at the end, carve away the ones you do not care about. Thus, for these situations, programmers want the code to work with “indeterminate values”: the code will execute, will reach the end of the operations, and will have valid values in the results it cares about, and there will not have been any traps or other undefined behavior arising from the values they did not care about. So limiting the undefined behavior of uninitialized objects to automatic objects whose address is not taken may have provided a boundary that worked for all parties concerned.
If an object had never had its address taken, it could potentially be optimized away. In such cases, attempting to read an uninitialized variable need not yield the same value on multiple reads.
By taking the address of an object, that guarantees that storage is set aside for it which can subsequently be read from. Then the value read will at least be consistent (though not necessarily predictable) assuming it is not a trap representation.
While reading this, I saw a case of UB that I don't understand; I hope you can clarify:
size_t f(int x)
{
    size_t a;
    if(x) // either x nonzero or UB
        a = 42;
    return a;
}
I guess the UB is due to a not having an initialized value, but isn't that defined behavior? Meaning, f(0) will return the value held by variable a, whatever it is (I consider this to be something like rand()). Must we know what value the code snippet returns for the code to have well-defined behavior?
Meaning, f(0) will return the value held by variable a, whatever it is...
Well, in your case,
a is automatic local variable
it can have trap representation
it does not have its address taken.
So, yes, this, by definition causes undefined behavior.
Quoting C11, chapter §6.3.2.1
[...] If
the lvalue designates an object of automatic storage duration that could have been
declared with the register storage class (never had its address taken), and that object
is uninitialized (not declared with an initializer and no assignment to it has been
performed prior to use), the behavior is undefined.
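A minimal sketch of how the function from the question can be made well defined for every argument: give a an initializer, so the quoted sentence no longer applies. The default value 0 and the names f_defined and f_defined_demo are assumptions made for illustration; the question does not say what f(0) is meant to return.

```c
#include <assert.h>
#include <stddef.h>

size_t f_defined(int x)
{
    size_t a = 0;   /* assumed default; the original leaves a unset when x == 0 */
    if (x)
        a = 42;
    return a;
}

/* Usage: both calls now have a specified result. */
void f_defined_demo(void)
{
    assert(f_defined(0) == 0);
    assert(f_defined(1) == 42);
}
```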
Related to "why undefined behaviour is undefined", see this post.
There's a very nice answer related to trap representation and undefined behaviour, check it out.
Finally, a fine line between UB and usage of indeterminate values.
Supplemental to @SouravGhosh's answer, it is important to understand that having undefined behavior is a property of certain combinations of language constructs and of certain runtime evaluations a program may perform, as specified by the standard. It is not a function of an analysis of what a compiler or program might do; in fact, it is more the opposite: a license to compilers and programs, releasing them from any particular constraint.
Therefore, although the standard is fairly logical and consistent about declaring UB, it is not very useful to approach the question from the direction of questioning why a particular construct has UB or why a particular evaluation may or does exhibit UB. There are reasons for the standard specifying what it does, but the primary answer to why a thing has UB is always "because the standard says so."
Undefined Behavior is a license for an implementation to process code in whatever way the author judges to be most suitable for the intended purpose. Some implementations included logic to trap in cases where an automatic variable was read without having been written first, even if the types otherwise had no trap representations; the authors of the Standard were almost certainly aware of such behavior and judged it useful. The Standard specifies only one situation where things may trap, but only in defined fashion (conversion from a larger integer type to a smaller one); in all other cases where things may trap, the authors of the Standard simply left the behavior Undefined rather than trying to go into any detail about how particular traps work, whether they are recoverable, etc.
Additionally, automatic variables are often mapped to registers that are larger than the variables in question, and even types which don't have trap representations may behave oddly in such cases. Consider, for example:
#include <stdint.h>

volatile uint16_t v;
uint32_t x(uint32_t a, uint32_t b)
{
    uint16_t temp;      /* left uninitialized when b == 0 */
    if (b) temp=v;
    return temp;
}
If b is non-zero, then temp will get loaded with v, and the act of loading v will cause temp to hold some value 0-65535. If b is zero, however, the compiler can't load temp with v (because of the volatile qualifier). If temp had been assigned to a 32-bit register (on some platforms, it might logically be assigned the same one used for a), the function may behave as though temp held a value which is larger than 65535. The simplest way for the Standard to allow for such a possibility is to say that returning temp in the above situation would be Undefined Behavior. Not because it would be expecting that implementations would do anything particularly wonky in cases where the caller ends up ignoring the return value (if the caller was going to use the return value, the caller presumably wouldn't have passed b==0) but because leaving things to implementers' judgment is easier than trying to formulate perfect one-size-fits-all rules for such things.
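If the caller really does ignore the result when b is zero, the function can be kept within defined behavior at the cost of one store. This is a sketch only; the fallback value 0 and the names x_defined and x_defined_demo are assumptions, not part of the original example.

```c
#include <assert.h>
#include <stdint.h>

volatile uint16_t v;

uint32_t x_defined(uint32_t a, uint32_t b)
{
    uint16_t temp = 0;  /* assumed fallback, so temp always holds a 16-bit value */
    (void)a;            /* unused, kept to match the original signature */
    if (b) temp = v;    /* the volatile read still happens only when b != 0 */
    return temp;
}

/* Usage: the result is always in the range 0-65535. */
void x_defined_demo(void)
{
    v = 1234;
    assert(x_defined(0, 1) == 1234);
    assert(x_defined(0, 0) == 0);
}
```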
Modern C implementers no longer treat Undefined Behavior as an invitation to exercise judgment, but rather as an invitation to assume no judgment is required. Consequently, they may behave in ways that can disrupt program execution even if the value of the uninitialized value is used for no purpose except to pass it through code that doesn't know if it's meaningful, to code that ultimately ignores it.
Various esteemed, high rep users on SO keep insisting that reading a variable with indeterminate value "is always UB". So where exactly is this mentioned in the C standard?
It is very clear that an indeterminate value could either be an unspecified value or a trap representation:
3.19.2
indeterminate value
either an unspecified value or a trap representation
3.19.3
unspecified value
valid value of the relevant type where this International Standard imposes no
requirements on which value is chosen in any instance
NOTE An unspecified value cannot be a trap representation.
3.19.4
trap representation
an object representation that need not represent a value of the object type
It is also clear that reading a trap representation invokes undefined behavior, 6.2.6.1:
Certain object representations need not represent a value of the object type. If the stored
value of an object has such a representation and is read by an lvalue expression that does
not have character type, the behavior is undefined. If such a representation is produced
by a side effect that modifies all or any part of the object by an lvalue expression that
does not have character type, the behavior is undefined.50) Such a representation is called
a trap representation.
However, an indeterminate value does not necessarily contain a trap representation. In fact, trap representations are very rare for systems using two's complement.
Where in the C standard does it actually say that reading an indeterminate value invokes undefined behavior?
I was reading the non-normative Annex J of C11 and found that this is indeed listed as one case of UB:
The value of an object with automatic storage duration is used while it is
indeterminate (6.2.4, 6.7.9, 6.8).
However, the listed sections are irrelevant. 6.2.4 only states rules regarding lifetime and when a variable's value becomes indeterminate. Similarly, 6.7.9 is about initialization and states how a variable's value becomes indeterminate. 6.8 seems mostly irrelevant. None of the sections contains any normative text saying that accessing an indeterminate value can lead to UB. Is this a defect in Annex J?
There is however some relevant, normative text in 6.3.2.1 regarding lvalues:
If the lvalue designates an
object of automatic storage duration that could have been declared with the register
storage class (never had its address taken), and that object is uninitialized (not declared
with an initializer and no assignment to it has been performed prior to use), the behavior
is undefined.
But that is a special case, which only applies to variables of automatic storage duration that never had their address taken. I have always thought that this section of 6.3.2.1 is the only case of UB regarding indeterminate values (that are not trap representations). But people keep insisting that "it is always UB". So where exactly is this mentioned?
As far as I know, there is nothing in the standard that says that using an indeterminate value is always undefined behavior.
The cases that are spelled out as invoking undefined behavior are:
If the value happens to be a trap representation.
If the object holding the indeterminate value has automatic storage duration and has never had its address taken.
If the value is a pointer to an object whose lifetime has ended.
As an example, the C standard specifies that the type unsigned char has no padding bits and therefore none of its values can ever be a trap representation.
Portable implementations of functions such as memcpy take advantage of this fact to perform a copy of any value, including indeterminate values. Those values could potentially be trap representations when used as values of a type that contains padding bits, but they are simply unspecified when used as values of unsigned char.
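As an illustration of that point, a byte-wise copy written purely in terms of unsigned char might look like the sketch below. It is not how any particular library implements memcpy; the names byte_copy and byte_copy_demo are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Copies n bytes from src to dst through unsigned char lvalues only,
   so no byte that is read can be a trap representation. */
void *byte_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Usage: copy the object representation of an int. */
void byte_copy_demo(void)
{
    int from = 0x1234, to = 0;

    byte_copy(&to, &from, sizeof from);
    assert(to == from);
}
```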
I believe that it is erroneous to assume that if something could invoke undefined behavior then it does invoke undefined behavior when the program has no safe way of checking. Consider the following example:
int read(int* array, int n, int i)
{
    if (0 <= i)
        if (i < n)
            return array[i];
    return 0;
}
In this case, the read function has no safe way of checking whether array really is of (at least) length n. Clearly, if the compiler considered these possible UB operations as definite UB, it would be nearly impossible to write any pointer code.
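A small usage sketch makes the division of responsibility concrete: the function can only check i against the n it was given, while the caller is the one who knows that n matches the real array. The names read_checked and read_checked_demo are hypothetical; the body is equivalent to the function above.

```c
#include <assert.h>

int read_checked(const int *array, int n, int i)
{
    if (0 <= i && i < n)
        return array[i];
    return 0;               /* out-of-range index: fall back to 0 */
}

/* Usage: the caller passes the true length, so every access is in bounds. */
void read_checked_demo(void)
{
    int data[3] = { 10, 20, 30 };

    assert(read_checked(data, 3, 1) == 20);
    assert(read_checked(data, 3, 3) == 0);
    assert(read_checked(data, 3, -1) == 0);
}
```

Had the caller passed n = 4 for this array, read_checked(data, 4, 3) would be an out-of-bounds access that the function itself cannot detect.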
More generally, if the compiler cannot prove that something is UB, it has to assume that it isn't UB, otherwise it risks breaking conforming programs.
The only case where the possibility is treated like a certainty, is the case of objects of automatic storage. I think it's reasonable to assume that the reason for that is because those cases can be statically rejected, since all the information the compiler needs can be obtained through local flow analysis.
On the other hand, declaring it as UB for non-automatic storage objects would not give the compiler any useful information in terms of optimizations or portability (in the general case). Thus, the standard probably doesn't mention those cases because it wouldn't change anything in realistic implementations anyway.
To allow the best blend of optimization opportunities and useful semantics, types which have no trap representations should have Indeterminate Values subdivided into three kinds:
1. The first read will yield any value that could result from an unspecified
   bit pattern; subsequent reads would be guaranteed to yield the same value.
   This would be similar to "Unspecified value", except that the Standard
   doesn't generally distinguish between types which do and don't have trap
   representations, and in cases where the Standard calls for "Unspecified
   Value" it requires that an implementation ensure the value is not a trap
   representation; in the general case, that would require that an
   implementation include code to guard against certain bit patterns.
2. Each read may independently yield any value that could result from an
   unspecified bit pattern.
3. The value read, and the result of most computations performed upon it,
   may behave non-deterministically as though the read had yielded any
   possible value.
Unfortunately, the Standard doesn't make such distinctions, and there is some
disagreement about what it calls for. I would suggest that #2 should be the
default, but it should be possible for code to indicate all places where code
needs to force the compiler to pick a concrete value, and indicate that a
compiler may use #3-style semantics everywhere else. For example, if code for
a collection of distinct 16-bit values stored as:
struct COLLECTION { size_t count; uint16_t values[65536], locations[65536]; };
maintains the invariant that for each i < count, locations[values[i]]==i, it
should be possible to initialize such a structure merely by setting "count"
to zero, even if the storage had previously been used as some other type.
If casts are specified as always yielding concrete values, code which wants
to see if something is in the collection could use:
uint32_t index = (uint32_t)(collection->locations[value]);
if (index < collection->count && collection->values[index]==value)
... value was found
It would be acceptable to have the above code arbitrarily yield any number for "index" each time it reads an item from the array, but it would be essential that both uses of "index" in the second line use the same value.
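The structure described above is commonly known as a sparse set. A minimal sketch of the three operations follows; the function names are made up for illustration. In the usage shown in the comments the storage is assumed to have static duration (and is therefore zero-initialized), so this sketch never actually reads an indeterminate value, even though the invariant itself does not depend on the initial contents of the arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct COLLECTION { size_t count; uint16_t values[65536], locations[65536]; };

/* "Initializing" or emptying the set only resets count. */
void collection_clear(struct COLLECTION *c)
{
    c->count = 0;
}

bool collection_contains(const struct COLLECTION *c, uint16_t value)
{
    uint32_t index = (uint32_t)(c->locations[value]);  /* read once, reuse below */
    return index < c->count && c->values[index] == value;
}

void collection_add(struct COLLECTION *c, uint16_t value)
{
    if (collection_contains(c, value))
        return;                       /* keeps the stored values distinct */
    assert(c->count < 65536);         /* holds because values are distinct */
    c->values[c->count] = value;
    c->locations[value] = (uint16_t)c->count;
    c->count++;
}
```

After collection_clear, a lookup fails on the index < count test no matter what locations[value] happens to contain, which is why no other member needs to be reset.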
Unfortunately, some compiler writers seem to think compilers should treat all indeterminate values as #3, while some algorithms require #1 and some require #2, and there's no real way to distinguish the varying requirements.
3.19.2 permits an indeterminate value to be a trap representation, and both reading and writing a trap representation (through an lvalue that does not have character type) are undefined behaviour.
Your platform may give you guarantees (e.g. that integer types never have trap representations) but that is not required by the Standard, and if you rely on that, your code loses some portability. That's a valid choice, but shouldn't be made in ignorance.
More systems have trap representations for floating-point types than for integer types, but C programs may be run on processors that track register validity - see (Why) is using an uninitialized variable undefined behavior in C?. This degree of latitude is the principal reason for C's wide adoption across many hardware architectures.
This is a followup to Can a char array be used with any data type?
I know about dynamic memory and common implementations of malloc, references can be found on wikipedia. I also know that the pointer returned by malloc can be cast to whatever the programmer wants, without even a warning because the standard states in 6.3.2.3 Pointers §1
A pointer to void may be converted to or from a pointer to any incomplete or object
type. A pointer to any incomplete or object type may be converted to a pointer to void
and back again; the result shall compare equal to the original pointer.
The question is assuming I have a freestanding environment without malloc and free, how can I build in conformant C an implementation of those two functions?
If I take some freedom regarding the standard, it is easy:
start with a large character array
use a reasonably large alignment (8 should be enough for many architectures)
implement an algorithm that returns addresses from that array, at that alignment, keeping track of what has been allocated - nice examples can be found in malloc implementation?
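The three steps above can be sketched as a simple bump allocator. This is an illustration under stated assumptions, not a full malloc: it never frees, the pool size POOL_SIZE and the name arena_alloc are made up, and it uses C11's max_align_t instead of a hard-coded alignment of 8. Whether the returned storage may then be accessed through non-character types is exactly what the rest of this question asks, and the sketch does not settle it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_SIZE 4096u   /* assumed pool size, chosen for illustration */

static _Alignas(max_align_t) unsigned char pool[POOL_SIZE];
static size_t pool_used;  /* bytes handed out so far */

/* Returns a suitably aligned block from the pool, or NULL if it does not fit. */
void *arena_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    size_t rounded;

    if (size == 0 || size > POOL_SIZE)
        return NULL;
    rounded = (size + align - 1) / align * align;   /* round up to alignment */
    if (rounded > POOL_SIZE - pool_used)
        return NULL;

    void *p = &pool[pool_used];
    pool_used += rounded;
    assert((uintptr_t)p % align == 0);
    return p;
}
```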
The problem is that the effective type of the storage behind the pointers returned by that implementation will still be char.
And the standard says in the same section, §7:
A pointer to an object or incomplete type may be converted to a pointer to a different
object or incomplete type. If the resulting pointer is not correctly aligned for the
pointed-to type, the behavior is undefined. Otherwise, when converted back again, the
result shall compare equal to the original pointer.
That does not seem to allow me to pretend that what was declared as simple characters can magically contain another type, and even different types in different parts of this array or at different moments in the same part. Said differently, dereferencing such pointers seems to be undefined behaviour under a strict interpretation of the standard. That is why common idioms use memcpy instead of aliasing when you get a byte representation of an object in a string buffer, for example when you read it from a network stream.
So how can I build a conformant implementation of malloc in pure C?
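The memcpy idiom mentioned here can be sketched as follows. The names load_u32 and load_u32_demo are hypothetical, and the sketch assumes the bytes in the buffer are already in the host's byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t from an arbitrary (possibly unaligned) position in a
   byte buffer without ever dereferencing a uint32_t pointer into it. */
uint32_t load_u32(const unsigned char *buf)
{
    uint32_t value;

    memcpy(&value, buf, sizeof value);
    return value;
}

/* Usage: store a value at an odd offset, then read it back. */
void load_u32_demo(void)
{
    unsigned char buffer[8] = { 0 };
    uint32_t in = 0xDEADBEEFu;

    memcpy(buffer + 1, &in, sizeof in);
    assert(load_u32(buffer + 1) == in);
}
```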
This answer is only an interpretation of the standard, because I could not find an explicit answer in C99 n1256 draft nor in C11 n1570.
The rationale comes from the C++ standard (C++14 draft n4296).
3.8 Object lifetime [basic.life] says (emphasis mine):
§ 1 The lifetime of an object of type T begins when:
storage with the proper alignment and size for type T is obtained, and
if the object has non-vacuous initialization, its initialization is complete.
The lifetime of an object of type T ends when:
if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
the storage which the object occupies is reused or released.
and
§ 3 The properties ascribed to objects throughout this International Standard apply for a given object only
during its lifetime.
I know that C and C++ are different languages, but they are related, and the above is only here to explain the following interpretation
The relevant part in C standard is 7.20.3 Memory management functions.
... The pointer returned if the allocation
succeeds is suitably aligned so that it may be assigned to a pointer to any type of object
and then used to access such an object or an array of such objects in the space allocated
(until the space is explicitly deallocated). The lifetime of an allocated object extends
from the allocation until the deallocation. Each such allocation shall yield a pointer to an
object disjoint from any other object. The pointer returned points to the start (lowest byte
address) of the allocated space...
My interpretation is that, provided you have a memory zone with correct size and alignment (for example a part of a large character array, though an array of any other type could be used here), you can pretend that it is a pointer to an uninitialized object or array of another type (say T) and convert a char or void pointer to the first byte of the zone to a pointer of the new type (T). But in order not to violate the strict aliasing rule, this zone must no longer be accessed through any previous value or pointer of the initial type - if the initial type was a character type, it will still be allowed for reading, but writing could lead to a trap representation. As this object is not initialized, it can contain a trap representation, and reading it before its initialization is undefined behaviour. This T object and its associated pointer will be valid until you decide to use the memory zone for any other usage, and the pointer to T becomes dangling at that time.
TL/DR: The strict aliasing rule only mandates that a memory zone can only contain an object of one effective type at one single moment. But you are allowed to re-use the memory zone for an object of a different type provided:
the size and alignment are compatible
you initialize the new object with a correct value before using it
you no longer access the initial object
Because that way you simply use the memory zone as allocated memory.
Per the C standard, the lifetime of the initial object will not have ended (static objects last until the end of the program, and automatic ones until the end of their declaring scope), but you can no longer access it because of the strict aliasing rule.
The authors of the C Standard put far more effort into specifying behaviors which weren't obviously desirable than those that were, since they expected that sensible compiler writers would support useful behaviors whether or not the Standard mandated them, and since obtuse compiler writers could produce implementations that were fully compliant but completely useless(*).
It was possible to write reliable and efficient malloc() equivalents on many platforms prior to the advent of C89, and I see no reason to believe that the authors intended that people writing C89 compilers for a platform which had been able to handle malloc() equivalents previously would not make those implementations just as capable as their predecessors. Unfortunately, the language which was popular in the 1990s (which was a combined superset of C89 and its predecessors) has been replaced by a poor-quality dialect which omits features that the authors of C89 would have taken for granted and expected others to do likewise.
Even beyond the question of how one acquires memory, a larger issue is that
malloc() promises that newly-allocated memory will, at worst, hold
Indeterminate Value; because structure types have no trap representations,
reading such storage using a pointer of structure type will have defined
behavior. If the memory was previously written using some other type,
however, a structure-type read would have Undefined Behavior unless either
the free() or malloc() physically erases all of the storage in question,
thus negating the performance benefit of having malloc() rather than just
calloc().
(*)Provided that there exists at least one set of source files that the implementation processes in compliant fashion without UB, an implementation may require arbitrary (perhaps impossibly large) amounts of stack space when given any other set of source files, and behave in arbitrary fashion if that space is unavailable.
Given an array of type foo_t[n] and a set of n threads, where each of the n threads reads and modifies a different element of the array, do I need to explicitly synchronize modifications of the array or can I assume that concurrently modifying members of the array is well-defined behavior? Does it matter how large foo_t is / what alignment it has?
What I try to do is well-defined behavior.
See ISO/IEC 9899:2011 §5.1.2.4.27:
NOTE 13 Compiler transformations that introduce assignments to a potentially shared memory location that would not be modified by the abstract machine are generally precluded by this standard, since such an assignment might overwrite another assignment by a different thread in cases in which an abstract machine execution would not have encountered a data race. This includes implementations of data member assignment that overwrite adjacent members in separate memory locations. We also generally preclude reordering of atomic loads in cases in which the atomics in question may alias, since this may violate the "visible sequence" rules.
Note that this language was introduced with C11 to make optimizations that cause bugs like this illegal. Pre-C11 compilers may not abide by this rule.
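A minimal sketch of the pattern from the question, in which each thread touches only its own array element and nothing but thread creation and joining provides synchronization. It uses POSIX threads, which is an assumption (the question does not name a threading API), and foo_t, N_WORKERS, ITERATIONS and run_workers are stand-ins chosen for illustration. Note that the guarantee is per memory location: adjacent non-zero-width bit-fields share one memory location, so the same reasoning would not cover threads updating neighbouring bit-fields.

```c
#include <assert.h>
#include <pthread.h>

#define N_WORKERS  4
#define ITERATIONS 100000L

typedef struct { char tag; long counter; } foo_t;   /* stand-in element type */

static foo_t items[N_WORKERS];

/* Each worker reads and modifies only the element it was handed. */
static void *worker(void *arg)
{
    foo_t *mine = arg;

    for (long i = 0; i < ITERATIONS; i++)
        mine->counter++;
    mine->tag = 'x';
    return NULL;
}

/* Returns 0 on success, -1 if a thread could not be created. */
int run_workers(void)
{
    pthread_t threads[N_WORKERS];
    int started = 0;

    for (int i = 0; i < N_WORKERS; i++) {
        items[i].tag = 0;
        items[i].counter = 0;
    }
    while (started < N_WORKERS &&
           pthread_create(&threads[started], NULL, worker, &items[started]) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);   /* joining orders the reads below */
    if (started < N_WORKERS)
        return -1;

    for (int i = 0; i < N_WORKERS; i++)
        assert(items[i].counter == ITERATIONS && items[i].tag == 'x');
    return 0;
}
```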