The code that invokes undefined behavior (in this example, division by zero) will never get executed, is the program still undefined behavior?
int main(void)
{
int i;
if(0)
{
i = 1/0;
}
return 0;
}
I think it still is undefined behavior, but I can't find any evidence in the standard to support or deny me.
So, any ideas?
Let's look at how the C standard defines the terms "behavior" and "undefined behavior".
References are to the N1570 draft of the ISO C 2011 standard; I'm not aware of any relevant differences in any of the three published ISO C standards (1990, 1999, and 2011).
Section 3.4:
behavior
external appearance or action
Ok, that's a bit vague, but I'd argue that a given statement has no "appearance", and certainly no "action", unless it's actually executed.
Section 3.4.3:
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
It says "upon use" of such a construct. The word "use" is not defined by the standard, so we fall back to the common English meaning. A construct is not "used" if it's never executed.
There's a note under that definition:
NOTE Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).
So a compiler is permitted to reject your program at compile time if its behavior is undefined. But my interpretation of that is that it can do so only if it can prove that every execution of the program will encounter undefined behavior. Which implies, I think, that this:
if (rand() % 2 == 0) {
i = i / 0;
}
which certainly can have undefined behavior, cannot be rejected at compile time.
As a practical matter, programs have to be able to perform runtime tests to guard against invoking undefined behavior, and the standard has to permit them to do so.
Your example was:
if (0) {
i = 1/0;
}
which never executes the division by 0. A very common idiom is:
int x, y;
/* set values for x and y */
if (y != 0) {
x = x / y;
}
The division certainly has undefined behavior if y == 0, but it's never executed if y == 0. The behavior is well defined, and for the same reason that your example is well defined: because the potential undefined behavior can never actually happen.
(Unless INT_MIN < -INT_MAX && x == INT_MIN && y == -1 (yes, integer division can overflow), but that's a separate issue.)
In a comment (since deleted), somebody pointed out that the compiler may evaluate constant expressions at compile time. Which is true, but not relevant in this case, because in the context of
i = 1/0;
1/0 is not a constant expression.
A constant-expression is a syntactic category that reduces to conditional-expression (which excludes assignments and comma expressions). The production constant-expression appears in the grammar only in contexts that actually require a constant expression, such as case labels. So if you write:
switch (...) {
case 1/0:
...
}
then 1/0 is a constant expression -- and one that violates the constraint in 6.6p4: "Each constant expression shall evaluate to a constant that is in the range of representable
values for its type.", so a diagnostic is required. But the right hand side of an assignment does not require a constant-expression, merely a conditional-expression, so the constraints on constant expressions don't apply. A compiler can evaluate any expression that it's able to at compile time, but only if the behavior is the same as if it were evaluated during execution (or, in the context of if (0), not evaluated during execution().
(Something that looks exactly like a constant-expression is not necessarily a constant-expression, just as, in x + y * z, the sequence x + y is not an additive-expression because of the context in which it appears.)
Which means the footnote in N1570 section 6.6 that I was going to cite:
Thus, in the following initialization,
static int i = 2 || 1 / 0;
the expression is a valid integer constant expression with value one.
isn't actually relevant to this question.
Finally, there are a few things that are defined to cause undefined behavior that aren't about what happens during execution. Annex J, section 2 of the C standard (again, see the N1570 draft) lists things that cause undefined behavior, gathered from the rest of the standard. Some examples (I don't claim this is an exhaustive list) are:
A nonempty source file does not end in a new-line character which is not immediately preceded by a backslash character or ends in a partial
preprocessing token or comment
Token concatenation produces a character sequence matching the syntax of a universal character name
A character not in the basic source character set is encountered in a source file, except in an identifier, a character constant, a string
literal, a header name, a comment, or a preprocessing token that is
never converted to a token
An identifier, comment, string literal, character constant, or header name contains an invalid multibyte character or does not begin
and end in the initial shift state
The same identifier has both internal and external linkage in the same translation unit
These particular cases are things that a compiler could detect. I think their behavior is undefined because the committee didn't want to, or couldn't, impose the same behavior on all implementations, and defining a range of permitted behaviors just wasn't worth the effort. They don't really fall into the category of "code that will never be executed", but I mention them here for completeness.
This article discusses this question in section 2.6:
int main(void){
guard();
5 / 0;
}
The authors consider that the program is defined when guard() does not terminate. They also find themselves distinguishing notions of “statically undefined” and “dynamically undefined”, e.g.:
The intention behind the standard11 appears to be that, in general, situations are made statically undefined if it is not easy to generate code for them. Only when code can be generated, then the situation can be undefined dynamically.
11) Private correspondence with committee member.
I would recommend looking at the entire article. Taken together, it paints a consistent picture.
The fact that the authors of the article had to discuss the question with a committee member confirms that the standard is currently fuzzy on the answer to your question.
In this case the undefined behavior is the result of executing the code. So if the code is not executed, there is no undefined behavior.
Non executed code could invoke undefined behavior if the undefined behavior was the result of solely the declaration of the code (e.g. if some case of variable shadowing was undefined).
I'd go with the last paragraph of this answer: https://stackoverflow.com/a/18384176/694576
... UB is a runtime issue, not a compiletime issue ...
So, no, there is no UB invoked.
Only when the standard makes breaking changes and your code suddenly is no longer "never gets executed". But I don't see any logical way in which this can cause 'undefined behaviour'. Its not causing anything.
On the subject of undefined behaviour it is often hard to separate the formal aspects from the practical ones. This is the definition of undefined behaviour in the 1989 standard (I don't have a more recent version at hand, but I don't expect this to have changed substantially):
1 undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of
erroneous data, for which this International Standard imposes no requirements
2 NOTE Possible undefined behavior ranges from ignoring the situation completely
with unpredictable results, to behaving during translation or program execution
in a documented manner characteristic of the environment (with or without the
issuance of a diagnostic message), to terminating a translation or
execution (with the issuance of a diagnostic message).
From a formal point of view I'd say your program does invoke undefined behaviour, which means that the standard places no requirement whatsoever on what it will do when run, just because it contains division by zero.
On the other hand, from a practical point of view I'd be surprised to find a compiler that didn't behave as you intuitively expect.
The standard says, as I remember right, it's allowed to do anything from the moment, a rule got broken. Maybe there are some special events with kind of global flavour (but I never heard or read about something like that)... So I would say: No this can't be UB, because as long the behavior is well defined 0 is allways false, so the rule can't get broken on runtime.
I think it still is undefined behavior, but I can't find any evidence in the standard to support or deny me.
I think the program does not invoke undefined behavior.
Defect Report #109 addresses a similar question and says:
Furthermore, if every possible execution of a given program would result in undefined behavior, the given program is not strictly conforming.
A conforming implementation must not fail to translate a strictly conforming program simply because some possible execution of that program would result in undefined behavior. Because foo might never be called, the example given must be successfully translated by a conforming implementation.
It depends on how the expression "undefined behavior" is defined, and whether "undefined behavior" of a statement is the same as "undefined behavior" for a program.
This program looks like C, so a deeper analysis of what the C standard used by the compiler (as some answers did) is appropriate.
In absence of a specified standard, the correct answer is "it depends". In some languages, compilers after the first error try to guess what the programmer might mean and still generate some code, according to the compilers guess. In other, more pure languages, once somerthing is undefined, the undefinedness propagate to the whole program.
Other languages have a concept of "bounded errors". For some limited kinds of errors, these languages define how much damage an error can produce. In particular languages with implied garbage collection frequently make a difference whether an error invalidates the typing system or does not.
Related
What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.
What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.
What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.
What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.
The code that invokes undefined behavior (in this example, division by zero) will never get executed, is the program still undefined behavior?
int main(void)
{
int i;
if(0)
{
i = 1/0;
}
return 0;
}
I think it still is undefined behavior, but I can't find any evidence in the standard to support or deny me.
So, any ideas?
Let's look at how the C standard defines the terms "behavior" and "undefined behavior".
References are to the N1570 draft of the ISO C 2011 standard; I'm not aware of any relevant differences in any of the three published ISO C standards (1990, 1999, and 2011).
Section 3.4:
behavior
external appearance or action
Ok, that's a bit vague, but I'd argue that a given statement has no "appearance", and certainly no "action", unless it's actually executed.
Section 3.4.3:
undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data,
for which this International Standard imposes no requirements
It says "upon use" of such a construct. The word "use" is not defined by the standard, so we fall back to the common English meaning. A construct is not "used" if it's never executed.
There's a note under that definition:
NOTE Possible undefined behavior ranges from ignoring the situation
completely with unpredictable results, to behaving during translation
or program execution in a documented manner characteristic of the
environment (with or without the issuance of a diagnostic message), to
terminating a translation or execution (with the issuance of a
diagnostic message).
So a compiler is permitted to reject your program at compile time if its behavior is undefined. But my interpretation of that is that it can do so only if it can prove that every execution of the program will encounter undefined behavior. Which implies, I think, that this:
if (rand() % 2 == 0) {
i = i / 0;
}
which certainly can have undefined behavior, cannot be rejected at compile time.
As a practical matter, programs have to be able to perform runtime tests to guard against invoking undefined behavior, and the standard has to permit them to do so.
Your example was:
if (0) {
i = 1/0;
}
which never executes the division by 0. A very common idiom is:
int x, y;
/* set values for x and y */
if (y != 0) {
x = x / y;
}
The division certainly has undefined behavior if y == 0, but it's never executed if y == 0. The behavior is well defined, and for the same reason that your example is well defined: because the potential undefined behavior can never actually happen.
(Unless INT_MIN < -INT_MAX && x == INT_MIN && y == -1 (yes, integer division can overflow), but that's a separate issue.)
In a comment (since deleted), somebody pointed out that the compiler may evaluate constant expressions at compile time. Which is true, but not relevant in this case, because in the context of
i = 1/0;
1/0 is not a constant expression.
A constant-expression is a syntactic category that reduces to conditional-expression (which excludes assignments and comma expressions). The production constant-expression appears in the grammar only in contexts that actually require a constant expression, such as case labels. So if you write:
switch (...) {
case 1/0:
...
}
then 1/0 is a constant expression -- and one that violates the constraint in 6.6p4: "Each constant expression shall evaluate to a constant that is in the range of representable
values for its type.", so a diagnostic is required. But the right hand side of an assignment does not require a constant-expression, merely a conditional-expression, so the constraints on constant expressions don't apply. A compiler can evaluate any expression that it's able to at compile time, but only if the behavior is the same as if it were evaluated during execution (or, in the context of if (0), not evaluated during execution().
(Something that looks exactly like a constant-expression is not necessarily a constant-expression, just as, in x + y * z, the sequence x + y is not an additive-expression because of the context in which it appears.)
Which means the footnote in N1570 section 6.6 that I was going to cite:
Thus, in the following initialization,
static int i = 2 || 1 / 0;
the expression is a valid integer constant expression with value one.
isn't actually relevant to this question.
Finally, there are a few things that are defined to cause undefined behavior that aren't about what happens during execution. Annex J, section 2 of the C standard (again, see the N1570 draft) lists things that cause undefined behavior, gathered from the rest of the standard. Some examples (I don't claim this is an exhaustive list) are:
A nonempty source file does not end in a new-line character which is not immediately preceded by a backslash character or ends in a partial
preprocessing token or comment
Token concatenation produces a character sequence matching the syntax of a universal character name
A character not in the basic source character set is encountered in a source file, except in an identifier, a character constant, a string
literal, a header name, a comment, or a preprocessing token that is
never converted to a token
An identifier, comment, string literal, character constant, or header name contains an invalid multibyte character or does not begin
and end in the initial shift state
The same identifier has both internal and external linkage in the same translation unit
These particular cases are things that a compiler could detect. I think their behavior is undefined because the committee didn't want to, or couldn't, impose the same behavior on all implementations, and defining a range of permitted behaviors just wasn't worth the effort. They don't really fall into the category of "code that will never be executed", but I mention them here for completeness.
This article discusses this question in section 2.6:
int main(void){
guard();
5 / 0;
}
The authors consider that the program is defined when guard() does not terminate. They also find themselves distinguishing notions of “statically undefined” and “dynamically undefined”, e.g.:
The intention behind the standard11 appears to be that, in general, situations are made statically undefined if it is not easy to generate code for them. Only when code can be generated, then the situation can be undefined dynamically.
11) Private correspondence with committee member.
I would recommend looking at the entire article. Taken together, it paints a consistent picture.
The fact that the authors of the article had to discuss the question with a committee member confirms that the standard is currently fuzzy on the answer to your question.
In this case the undefined behavior is the result of executing the code. So if the code is not executed, there is no undefined behavior.
Non executed code could invoke undefined behavior if the undefined behavior was the result of solely the declaration of the code (e.g. if some case of variable shadowing was undefined).
I'd go with the last paragraph of this answer: https://stackoverflow.com/a/18384176/694576
... UB is a runtime issue, not a compiletime issue ...
So, no, there is no UB invoked.
Only when the standard makes breaking changes and your code suddenly is no longer "never gets executed". But I don't see any logical way in which this can cause 'undefined behaviour'. Its not causing anything.
On the subject of undefined behaviour it is often hard to separate the formal aspects from the practical ones. This is the definition of undefined behaviour in the 1989 standard (I don't have a more recent version at hand, but I don't expect this to have changed substantially):
1 undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of
erroneous data, for which this International Standard imposes no requirements
2 NOTE Possible undefined behavior ranges from ignoring the situation completely
with unpredictable results, to behaving during translation or program execution
in a documented manner characteristic of the environment (with or without the
issuance of a diagnostic message), to terminating a translation or
execution (with the issuance of a diagnostic message).
From a formal point of view I'd say your program does invoke undefined behaviour, which means that the standard places no requirement whatsoever on what it will do when run, just because it contains division by zero.
On the other hand, from a practical point of view I'd be surprised to find a compiler that didn't behave as you intuitively expect.
The standard says, as I remember right, it's allowed to do anything from the moment, a rule got broken. Maybe there are some special events with kind of global flavour (but I never heard or read about something like that)... So I would say: No this can't be UB, because as long the behavior is well defined 0 is allways false, so the rule can't get broken on runtime.
I think it still is undefined behavior, but I can't find any evidence in the standard to support or deny me.
I think the program does not invoke undefined behavior.
Defect Report #109 addresses a similar question and says:
Furthermore, if every possible execution of a given program would result in undefined behavior, the given program is not strictly conforming.
A conforming implementation must not fail to translate a strictly conforming program simply because some possible execution of that program would result in undefined behavior. Because foo might never be called, the example given must be successfully translated by a conforming implementation.
It depends on how the expression "undefined behavior" is defined, and whether "undefined behavior" of a statement is the same as "undefined behavior" for a program.
This program looks like C, so a deeper analysis of what the C standard used by the compiler (as some answers did) is appropriate.
In absence of a specified standard, the correct answer is "it depends". In some languages, compilers after the first error try to guess what the programmer might mean and still generate some code, according to the compilers guess. In other, more pure languages, once somerthing is undefined, the undefinedness propagate to the whole program.
Other languages have a concept of "bounded errors". For some limited kinds of errors, these languages define how much damage an error can produce. In particular languages with implied garbage collection frequently make a difference whether an error invalidates the typing system or does not.