What is the rationale for "semantics violation does not require diagnostics"? - c

Follow-up question for: If "shall / shall not" requirement is violated, then does it matter in which section (e.g. Semantics, Constraints) such requirement is located?.
ISO/IEC 9899:202x (E) working draft— December 11, 2020 N2596, 5.1.1.3 Diagnostics, 1:
A conforming implementation shall produce at least one diagnostic message (identified in an
implementation-defined manner) if a preprocessing translation unit or translation unit contains a
violation of any syntax rule or constraint, even if the behavior is also explicitly specified as undefined or implementation-defined. Diagnostic messages need not be produced in other circumstances.
Consequence: semantics violation does not require diagnostics.
Question: what is the (possible) rationale for "semantics violation does not require diagnostics"?

A possible rationale is given by Rice's theorem : non-trivial semantic properties of programs are undecidable
For example, division by zero is a semantics violation; and you cannot decide, by static analysis alone of the C source code, that it won't happen...
A standard cannot require total detection of such undefined behavior, even if of course some tools (e.g. Frama-C) are sometimes capable of detecting them.
See also the halting problem. You should not expect a C compiler to solve it!

The C99 rationale v5.10 gives this explanation:
5.1.1.3 Diagnostics
By mandating some form of diagnostic message for any program containing a syntax error or
constraint violation, the Standard performs two important services. First, it gives teeth to the
concept of erroneous program, since a conforming implementation must distinguish such a program from a valid one. Second, it severely constrains the nature of extensions permissible to
a conforming implementation.
The Standard says nothing about the nature of the diagnostic message, which could simply be
“syntax error”, with no hint of where the error occurs. (An implementation must, of course,
describe what translator output constitutes a diagnostic message, so that the user can recognize it as such.) The C89 Committee ultimately decided that any diagnostic activity beyond this level is
an issue of quality of implementation, and that market forces would encourage more useful
diagnostics. Nevertheless, the C89 Committee felt that at least some significant class of errors
must be diagnosed, and the class specified should be recognizable by all translators.

This happens because the grammar of the C language is context-sensitive and for all the languages that are defined with context-free or more complex grammars on the Chomsky hierarchy one must do a tradeoff between the semantics of the language and its power.
C designers chose to allow much power for the language and this is why the problem of undecidability is omnipresent in C.
There are languages like Coq that try to cut out the undecidable situations and they restrict the semantics of the recursive functions (they allow only sigma(primitive) recursivity).

The question of whether an implementation provides any useful diagnostics in any particular situation is a Quality of Implementation issue outside the Standard's jurisdiction. If an implementation were to unconditionally output "Warning: this program does not output any useful diagnostics" or even "Warning: water is wet", such output would fully satisfy all of the Standard's requirements with regard to diagnostics even if the implementation didn't output any other diagnostics.
Further, the authors of the Standard characterized as "Undefined Behavior" many actions which they expected would be processed in a meaningful and useful fashion by many if not most implementations. According to the published Rationale document, Undefined Behavior among other things "identifies areas of conforming language extension", since implementations are allowed to specify how they will behave in cases that are not defined by the Standard.
Having implementations issue warnings about constructs which were non-portable, but which they would process in a useful fashion would have been annoying.
Prior to the Standard, some implementations would usefully accept constructs like:
struct foo {
int *p;
char pad [4-sizeof (int*)];
int q,r;
};
for all sizes of pointer up to four bytes (8-byte pointers weren't a thing back then), rather than squawking if pointers were exactly four bytes, but some people on the Committee were opposed to the idea of accepting declarations for zero-sized arrays. Thus, a compromise was reached where compilers would squawk about such things, programmers would ignore the useless warnings, and the useful constructs would remain usable on implementations that supported them.
While there was a vague attempt to distinguish between constructs that should produce warnings that programmers could ignore, versus constructs that might be used so much that warnings would be annoying, the fact that issuance of useful diagnostics was a Quality of Implementation issue outside the Standard's jurisdiction meant there was no real need to worry too much about such distinctions.

Related

using scanf("%[^\n]") and saving it to int type. (I get different output on different platforms.) [duplicate]

What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.

Structs in c initialization question, not so random values without initialization [duplicate]

What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.

Do a malloc prior to using strcpy? [duplicate]

What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.

Why this code doesn't crash during execution [duplicate]

What is undefined behavior (UB) in C and C++? What about unspecified behavior and implementation-defined behavior? What is the difference between them?
Undefined behavior is one of those aspects of the C and C++ language that can be surprising to programmers coming from other languages (other languages try to hide it better). Basically, it is possible to write C++ programs that do not behave in a predictable way, even though many C++ compilers will not report any errors in the program!
Let's look at a classic example:
#include <iostream>
int main()
{
char* p = "hello!\n"; // yes I know, deprecated conversion
p[0] = 'y';
p[5] = 'w';
std::cout << p;
}
The variable p points to the string literal "hello!\n", and the two assignments below try to modify that string literal. What does this program do? According to section 2.14.5 paragraph 11 of the C++ standard, it invokes undefined behavior:
The effect of attempting to modify a string literal is undefined.
I can hear people screaming "But wait, I can compile this no problem and get the output yellow" or "What do you mean undefined, string literals are stored in read-only memory, so the first assignment attempt results in a core dump". This is exactly the problem with undefined behavior. Basically, the standard allows anything to happen once you invoke undefined behavior (even nasal demons). If there is a "correct" behavior according to your mental model of the language, that model is simply wrong; The C++ standard has the only vote, period.
Other examples of undefined behavior include accessing an array beyond its bounds, dereferencing the null pointer, accessing objects after their lifetime ended or writing allegedly clever expressions like i++ + ++i.
Section 1.9 of the C++ standard also mentions undefined behavior's two less dangerous brothers, unspecified behavior and implementation-defined behavior:
The semantic descriptions in this International Standard define a parameterized nondeterministic abstract machine.
Certain aspects and operations of the abstract machine are described in this International Standard as implementation-defined (for example, sizeof(int)). These constitute the parameters of the abstract machine. Each implementation shall include documentation describing its characteristics and behavior in these respects.
Certain other aspects and operations of the abstract machine are described in this International Standard as unspecified (for example, order of evaluation of arguments to a function). Where possible, this International Standard defines a set of allowable behaviors. These define the nondeterministic aspects of the abstract machine.
Certain other operations are described in this International Standard as undefined (for example, the effect of dereferencing the null pointer). [ Note: this International Standard imposes no requirements on the behavior of programs that contain undefined behavior. —end note ]
Specifically, section 1.3.24 states:
Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
What can you do to avoid running into undefined behavior? Basically, you have to read good C++ books by authors who know what they're talking about. Avoid internet tutorials. Avoid bullschildt.
Well, this is basically a straight copy-paste from the standard
3.4.1 1 implementation-defined behavior unspecified behavior where
each implementation documents how the
choice is made
2 EXAMPLE An example of
implementation-defined behavior is the
propagation of the high-order bit when
a signed integer is shifted right.
3.4.3 1 undefined behavior behavior, upon use of a nonportable or erroneous
program construct or of erroneous
data, for which this International
Standard imposes no requirements
2
NOTE Possible undefined behavior
ranges from ignoring the situation
completely with unpredictable results,
to behaving during translation or
program execution in a documented
manner characteristic of the
environment (with or without the
issuance of a diagnostic message), to
terminating a translation or execution
(with the issuance of a diagnostic
message).
3 EXAMPLE An example of
undefined behavior is the behavior on
integer overflow.
3.4.4 1 unspecified behavior use of an unspecified value, or other behavior
where this International Standard
provides two or more possibilities and
imposes no further requirements on
which is chosen in any instance
2
EXAMPLE An example of unspecified
behavior is the order in which the
arguments to a function are evaluated.
Maybe simpler wording could be easier to understand than the rigorous definition of the standards.
implementation-defined behavior:
The language says that we have data-types. The compiler vendors specify what sizes shall they use, and provide a documentation of what they did.
undefined behavior:
You are doing something wrong. For example, you have a very large value in an int that doesn't fit in char. How do you put that value in char? actually there is no way! Anything could happen, but the most sensible thing would be to take the first byte of that int and put it in char. It is just wrong to do that to assign the first byte, but thats what happens under the hood.
unspecified behavior:
Which of these two functions is executed first?
void fun(int n, int m);
int fun1() {
std::cout << "fun1";
return 1;
}
int fun2() {
std::cout << "fun2";
return 2;
}
//...
fun(fun1(), fun2()); // which one is executed first?
The language doesn't specify the evaluation, left to right or right to left! So an unspecified behavior may or mayn't result in an undefined behavior, but certainly your program should not produce an unspecified behavior.
#eSKay I think your question is worth editing the answer to clarify more :)
for fun(fun1(), fun2()); isn't the behaviour "implementation defined"? The compiler has to choose one or the other course, after all?
The difference between implementation-defined and unspecified, is that the compiler is supposed to pick a behavior in the first case but it doesn't have to in the second case. For example, an implementation must have one and only one definition of sizeof(int). So, it can't say that sizeof(int) is 4 for some portion of the program and 8 for others. Unlike unspecified behavior, where the compiler can say: "OK I am gonna evaluate these arguments left-to-right and the next function's arguments are evaluated right-to-left." It can happen in the same program, that's why it is called unspecified. In fact, C++ could have been made easier if some of the unspecified behaviors were specified. Take a look here at Dr. Stroustrup's answer for that:
It is claimed that the difference between what can be produced giving the compiler this freedom and requiring "ordinary left-to-right evaluation" can be significant. I'm unconvinced, but with innumerable compilers "out there" taking advantage of the freedom and some people passionately defending that freedom, a change would be difficult and could take decades to penetrate to the distant corners of the C and C++ worlds. I am disappointed that not all compilers warn against code such as ++i+i++. Similarly, the order of evaluation of arguments is unspecified.
IMO far too many "things" are left undefined, unspecified, that's easy to say and even to give examples of, but hard to fix. It should also be noted that it is not all that difficult to avoid most of the problems and produce portable code.
From the official C Rationale Document
The terms unspecified behavior, undefined behavior, and implementation-defined behavior are used to categorize the result of writing programs whose properties the Standard does not, or cannot, completely describe. The goal of adopting this categorization is to allow a certain variety among implementations which permits quality of implementation to be an active force in the marketplace as well as to allow certain popular extensions, without removing the cachet of conformance to the Standard. Appendix F to the Standard catalogs those behaviors which fall into one of these three categories.
Unspecified behavior gives the implementor some latitude in translating programs. This latitude does not extend as far as failing to translate the program.
Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior.
Implementation-defined behavior gives an implementor the freedom to choose the appropriate approach, but requires that this choice be explained to the user. Behaviors designated as implementation-defined are generally those in which a user could make meaningful coding decisions based on the implementation definition. Implementors should bear in mind this criterion when deciding how extensive an implementation definition ought to be. As with unspecified behavior, simply failing to translate the source containing the implementation-defined behavior is not an adequate response.
Undefined Behavior vs. Unspecified Behavior has a short description of it.
Their final summary:
To sum up, unspecified behavior is usually something you shouldn't
worry about, unless your software is required to be portable.
Conversely, undefined behavior is always undesirable and should never
occur.
Implementation defined-
Implementors wish,should be well documented,standard gives choices but sure to compile
Unspecified -
Same as implementation-defined but not documented
Undefined-
Anything might happen,take care of it.
Historically, both Implementation-Defined Behavior and Undefined Behavior represented situations in which the authors of the Standard expected that people writing quality implementations would use judgment to decide what behavioral guarantees, if any, would be useful for programs in the intended application field running on the intended targets. The needs of high-end number-crunching code are quite different from those of low-level systems code, and both UB and IDB give compiler writers flexibility to meet those different needs. Neither category mandates that implementations behave in a way that's useful for any particular purpose, or even for any purpose whatsoever. Quality implementations that claim to be suitable for a particular purpose, however, should behave in a manner befitting such purpose whether the Standard requires it or not.
The only difference between Implementation-Defined Behavior and Undefined Behavior is that the former requires that implementations define and document a consistent behavior even in cases where nothing the implementation could possibly do would be useful. The dividing line between them is not whether it would generally be useful for implementations to define behaviors (compiler writers should define useful behaviors when practical whether the Standard requires them to or not) but whether there might be implementations where defining a behavior would be simultaneously costly and useless. A judgment that such implementations might exist does not in any way, shape, or form, imply any judgment about the usefulness of supporting a defined behavior on other platforms.
Unfortunately, since the mid 1990s compiler writers have started to interpret the lack of behavioral mandates as an judgment that behavioral guarantees aren't worth the cost even in application fields where they're vital, and even on systems where they cost practically nothing. Instead of treating UB as an invitation to exercise reasonable judgment, compiler writers have started treating it as an excuse not to do so.
For example, given the following code:
int scaled_velocity(int v, unsigned char pow)
{
if (v > 250)
v = 250;
if (v < -250)
v = -250;
return v << pow;
}
a two's-complement implementation would not have to expend any effort
whatsoever to treat the expression v << pow as a two's-complement shift
without regard for whether v was positive or negative.
The preferred philosophy among some of today's compiler writers, however, would suggest that because v can only be negative if the program is going to engage in Undefined Behavior, there's no reason to have the program clip the negative range of v. Even though left-shifting of negative values used to be supported on every single compiler of significance, and a large amount of existing code relies upon that behavior, modern philosophy would interpret the fact that the Standard says that left-shifting negative values is UB as implying that compiler writers should feel free to ignore that.
C++ standard n3337 § 1.3.10
implementation-defined behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation and that each implementation documents
Sometimes C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen and described by particular implementation (version of library). So user can still know exactly how will program behave even though Standard doesn't describe this.
C++ standard n3337 § 1.3.24
undefined behavior
behavior for which this International Standard imposes no requirements
[ Note: Undefined behavior may be expected when this International
Standard omits any explicit definition of behavior or when a program
uses an erroneous construct or erroneous data. Permissible undefined
behavior ranges from ignoring the situation completely with
unpredictable results, to behaving during translation or program
execution in a documented manner characteristic of the environment
(with or without the issuance of a diagnostic message), to terminating
a translation or execution (with the issuance of a diagnostic
message). Many erroneous program constructs do not engender undefined
behavior; they are required to be diagnosed. — end note ]
When the program encounters construct that is not defined according to C++ Standard it is allowed to do whatever it wants to do ( maybe send an email to me or maybe send an email to you or maybe ignore the code completely).
C++ standard n3337 § 1.3.25
unspecified behavior
behavior, for a well-formed program construct and correct data, that
depends on the implementation [ Note: The implementation is not
required to document which behavior occurs. The range of possible
behaviors is usually delineated by this International Standard. — end
note ]
C++ Standard doesn't impose particular behavior on some constructs but says instead that a particular, well defined behavior has to be chosen ( bot not necessary described) by particular implementation (version of library). So in the case when no description has been provided it can be difficult to the user to know exactly how will program behave.
Undefined behavior is ugly -- as in, "The good, the bad, and the ugly".
Good: a program that compiles and works, for the right reasons.
Bad: a program that has an error, of a kind that the compiler can detect and complain about.
Ugly: a program that has an error, that the compiler cannot detect and warn about, meaning that the program compiles, and may seem to work correctly some of the time, but also fails bizarrely some of the time. That's what undefined behavior is.
Some program languages and other formal systems try hard to limit the "gulf of undefinedness" -- that is, they try to arrange things so that most or all programs are either "good" or "bad", and that very few are "ugly". It's a characteristic feature of C, however, that its "gulf of undefinedness" is quite wide.

Issues covered by rule 3.1 of misra c 2004 "Implementation-defined behavior documented"

In this rule you have to go to ISO/IEC 9899:1990 Appendix G and study each case of Implementation defined behavior to document them.
It's a difficult task to determine what are the manual checks to do in the code.
Is there some kind of list of manual checks to do because of this rule?
MISRA-C is primarily concerned with avoiding unpredictable behavior in the C language, those “traps and pitfalls” (such as undefined and unspecified behavior) all C developers should be aware of that a compiler will not always warn you about. This includes implementation-defined behavior, where the C standard specifies the behavior of certain constructs after compilation can vary. These tend to be less critical from a safety point of view, provided the compiler documentation describes its intended behavior as required by the standard.
That is, for each specific compiler the behavior is well-defined, but the concern is to assure the developers have verified this, including documenting language extensions, known bugs in the compiler (and build chain) and workarounds.
Although it is possible to manually check C code fully for MISRA􏰀-C compliancy, it is not recommended. The guidelines were developed with static analysis tools in mind. Not all guidelines can be fully checked by tools, but the better MISRA-C tools (be careful in your evaluations, there are not many “good” ones), will at least assist where it can identify automatically where code relies on implementation-specific behavior. This includes all the checks required in Rule 3.1., where implementation-defined behavior cannot be completely checked by a tool, then a manual review will be required.
Also, if you are starting a new MISRA-C project, I highly recommend referring to MISRA-C:2012, even if you are required to be MISRA-C:2004 compliant. Having MISRA-C:2012 around helps, because it has clarified many of the guidelines, including additional rationale, explanations and examples. The standard (which can be obtained at misra-c.com ) lists the C90 and C99 implementation-defined behaviors that are considered to have the potential to cause unintended behavior. This may or may not overlap with guidelines that address implementation-defined behaviors that MISRA-C is specifically concerned about.
First of all, the standard definition of implementation-defined behavior is: specific behavior which the compiler must document. So you can always refer to the compiler documentation whenever there is a need to document how a certain implementation-defined behavior is implemented.
What's left to you to do then is to document where the code relies on implementation-defined behavior. This is preferably done in source code comments.
Spontaneously, here are the most important things which you need to look for in the code. The list is not including those cases that are already covered by other MISRA rules (for example signedness of char).
The size of all the integer types. The size of int being most important, as it determines which type that is given to integer literals, C "boolean" expressions, implicitly promoted integers etc etc.
Obscure integer formats that aren't standard two's complement.
Any reliance on endianess.
The enum type format.
The floating point format.
Pointer to integer conversions, in case they are obscure on the given system.
Behavior of function inlining and the register keyword, if these are used.
Alignment issues including struct padding. Reliance on the size of a struct/union.
#include paths, in case they are obscure. Particularly if they are absolute and not relative.
Bitwise operators mixed with signed types (in most cases this is a bug or design mistake).

Resources