In C99, the term arithmetic operation appears 16 times, but I don't see a definition for it.
The term arithmetic operator only appears twice in the text (again without definition) but it does appear in the Index:
arithmetic operators
additive, 6.5.6, G.5.2
bitwise, 6.5.10, 6.5.11, 6.5.12
increment and decrement, 6.5.2.4, 6.5.3.1
multiplicative 6.5.5, G.5.1
shift, 6.5.7
unary, 6.5.3.3
Then we have + - | &(binary) ++ -- *(binary) / % << >> ~ as arithmetic operators, if the Index is considered normative!
Perhaps we should identify arithmetic operation as being the use of an arithmetic operator. But F9.4.5 says that the sqrt() function is also an arithmetic operation, and refers to IEC 60559 (aka. IEEE754) for details. So there must be arithmetic operations that are not just the use of arithmetic operators.
Since we don't have a formal definition let's see if we can piece together a rationale interpretation of what an arithmetic operation should be. This will be speculative but I can not find any obvious defect reports or open issues that cover this.
I guess I would start with what are considered arithmetic types, which is covered in section 6.2.5 Types paragraph 18 says (emphasis mine going forward):
Integer and floating types are collectively called arithmetic types.
Each arithmetic type belongs to one type domain: the real type domain
comprises the real types, the complex type domain comprises the
complex types.
ok, so we know that an arithmetic operation has to operate on either an integer or a floating point type. So what is an operation? It seems like we have a good go at defining that from section 5.1.2.3 Program execution paragraph 2 which says:
Accessing a volatile object, modifying an object, modifying a file, or
calling a function that does any of those operations are all side
effects,11) which are changes in the state of the execution
environment. [...]
So modifying an object or call a function that does that, it is an operation. What is an object? Section 3.14 says:
region of data storage in the execution environment, the contents of
which can represent values
Although the standard seems to use the term operation more loosely to mean an evaluation, for example in section 7.12.1 Treatment of error conditions it says:
The behavior of each of the functions in is specified for all
representable values of its input arguments, except where stated
otherwise. Each function shall execute as if it were a single
operation without generating any externally visible exceptional
conditions.
and in section 6.5 Expressions paragraph 8 which says:
A floating expression may be contracted, that is, evaluated as though
it were an atomic operation [...]
So this would seem to imply that an evaluation is an operation.
So it would seem from these sections that pretty much all the arithmetic operators and any math function would fall under a common sense definition of arithmetic operation.
The most convincing bit I could find to be an implicit definition lies in 7.14 Signal Handling, paragraph 3, in the definition of the SIGFPE signal:
SIGFPE - an erroneous arithmetic operation, such as a zero divide or an operation resulting in overflow
One might then draw a conclusion that any operation that may cause SIGFPE to be raised can be considered an arithmetic operation; only arithmetic operations can result in the SIGFPE signal being raised.
That covers pretty much anything in <math.h> and the arithmetic operators, and <complex.h> if implemented. While a signal may not be raised for integral types, signed overflow and other "exceptional" conditions are allowed to generate trap representations, which means no other operations may be carried out reliably until a valid value is obtained — something that can only be done via assignment. In other words, the definition can apply equally to operations on an integral value.
As a result, pretty much any operation other than getting the size of an object/type, dereferencing a pointer, and taking the address of an object may be considered an arithmetic operation. Note that a[n] is *((a) + (n)), so even using an array can be considered an arithmetic operation.
An arithmetic operation involve manipulation of numbers. sqrt also manipulate numbers and that could be the reason that standard says it an arithmetic operation.
Related
Is negating the integer -2^(31) defined as undefined behavior in the C standard or simply -2^(31) again? Trying it the latter holds, but it would be interesting to know how the C standard specifies it.
The standard (n2176 draft) says explictely at 6.5 Expressions § 5:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not
mathematically defined or not in the range of representable values for its type), the behavior is
undefined.
We are exactly there: the result is not is the range of representable values for the type, so it is explicitely UB.
That being said, most implementation use 2'complement for negative values and process operations on signed types as the operation on the unsigned type values having the same representation. Which is perfectly defined.
So getting -2^(31) again can be expected on common implementations. But as the standard says that it is UB, it cannot be relied on.
While at least from a hand wave point of view I believe I know what an "arithmetic operator" is, I'm looking for a formal definition. I've examined the C17 standard document and I can't find such a definition, although it uses the term "arithmetic operator" in several places.
The closest I've been able to find is in the index of C17, where page numbers are provided for additive, bitwise, increment and decrement, multiplicative, shift, and unary under the common heading "arithmetic operators". I've looked online at various sources and the most common thing I've found only says that binary +, -, *, /, and % are the C arithmetic operators. Some also throw in ++ and --.
I'm pretty sure I'm simply missing something since I do find the standard quite daunting. However, I also find the various online sources somewhat dubious since they often seem to differ.
Thanks!
Update: Since some readers objected to my references to both C and C++ in the same posting, I've removed the references to C++ in the modified version above and will do an entirely separate posting for it later if I can first get the issue resolved for C.
The C standard does not explicitly define the term arithmetic operator, though it defines what an arithmetic operand is. If you read carefully, nothing in C is defined by using the term arithmetic operator, it exists only as a grouping in the index and in a title of one section. The term arithmetic operator by itself does not appear in any paragraph.
From the index, we indeed can get a list
arithmetic operators
additive, 6.2.6.2, 6.5.6, G.5.2
bitwise, 6.2.6.2, 6.5.3.3, 6.5.10, 6.5.11, 6.5.12
increment and decrement, 6.5.2.4, 6.5.3.1
multiplicative, 6.2.6.2, 6.5.5, G.5.1
shift, 6.2.6.2, 6.5.7
unary, 6.5.3.3
From this we could formulate that the arithmetic operators are those that require the operands to be arithmetic operands, i.e. of an arithmetic type (except in special cases such as pointer addition, subtraction), i.e.
additive + and -
bitwise &, | and ^
increment and decrement ++ and --
multiplicative *, / and %
shift << and >>
unary -, ~ and +. It is debatable whether ! is an arithmetic operator or not, even though it is listed in section 6.5.3.3.
Another notable thing about these operators are that the operands might undergo usual arithmetic conversions.
Arithmatic operators are operators used to perform mathematical operations like addition, substraction, multiplication and division. As simple as that.
ex: a+b = c
I'll quote from N1570, but the C11 standard has similar wording:
The fpclassify macro classifies its argument value as NaN, infinite, normal,
subnormal, zero, or into another implementation-defined category. First, an argument
represented in a format wider than its semantic type is converted to its semantic type.
Then classification is based on the type of the argument.
(my emphasis)
And a footnote:
Since an expression can be evaluated with more range and precision than its type has, it is important to
know the type that classification is based on. For example, a normal long double value might
become subnormal when converted to double, and zero when converted to float.
What does it mean for the argument to be "converted to its semantic type". There is no definition of "semantic type" anywhere evident.
My understanding is that that any excess precision is removed, as if storing the expression's value to a variable of float, double or long double, resulting in a value of the precision the programmer expected. In which case, using fpclassify() and friends on an lvalue would result in no conversion necessary for a non-optimising compiler. Am I correct, or are these functions much less useful than advertised to be?
(This question arises from comments to a Code Review answer)
The semantic type is simply the type of the expression as described elsewhere in the C standard, disregarding the fact that the value is permitted to be represented with excess precision and range. Equivalently, the semantic type is the type of the expression if clause 5.2.4.2.2 paragraph 9 (which says that floating-point values may be evaluated with excess range and precision) were not in the standard.
Converting an argument to its semantic type means discarding the excess precision and range (by rounding the value to the semantic type using whatever rounding rule is in effect for the operation).
Regarding your hypothesis that applying fpclassify to an lvalue does not require any conversion (because the value stored in an object designated by an lvalue must have already been converted to its semantic type when it was assigned), I am not sure that holds formally. Certainly when the object’s value is updated by assignment, 5.2.4.2.2 9 requires that excess range and precision be removed. But consider alternate ways of modifying the value, such as the postfix increment operator. Does that count as an assignment? Its specification in 6.5.2.4 2 says to see the discussion of compound assignment for information on its conversions and effects. That is a bit vague. One would have to consider all possible ways of modifying an object and evaluate what the C standard says about them.
In the following C snippet that checks if the first two bits of a 16-bit sequence are set:
bool is_pointer(unsigned short int sequence) {
return (sequence >> 14) == 3;
}
CLion's Clang-Tidy is giving me a "Use of a signed integer operand with a binary bitwise operator" warning, and I can't understand why. Is unsigned short not unsigned enough?
The code for this warning checks if either operand to the bitwise operator is signed. It is not sequence causing the warning, but 14, and you can alleviate the problem by making 14 unsigned by appending a u to the end.
(sequence >> 14u)
This warning is bad. As Roland's answer describes, CLion is fixing this.
There is a check in clang-tidy that is called hicpp-signed-bitwise. This check follows the wording of the HIC++ standard. That standard is freely available and says:
5.6.1. Do not use bitwise operators with signed operands
Use of signed operands with bitwise operators is in some cases subject to undefined or implementation defined behavior. Therefore, bitwise operators should only be used with operands of unsigned integral types.
The authors of the HIC++ coding standard misinterpreted the intention of the C and C++ standards and either accidentally or intentionally focused on the type of the operands instead of the value of the operands.
The check in clang-tidy implements exactly this wording, in order to conform to that standard. That check is not intended to be generally useful, its only purpose is to help the poor souls whose programs have to conform to that one stupid rule from the HIC++ standard.
The crucial point is that by definition integer literals without any suffix are of type int, and that type is defined as being a signed type. HIC++ now wrongly concludes that positive integer literals might be negative and thus could invoke undefined behavior.
For comparison, the C11 standard says:
6.5.7 Bitwise shift operators
If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
This wording is carefully chosen and emphasises that the value of the right operand is important, not its type. It also covers the case of a too large value, while the HIC++ standard simply forgot that case. Therefore, saying 1u << 1000u is ok in HIC++, while 1 << 3 isn't.
The best strategy is to explicitly disable this single check. There are several bug reports for CLion mentioning this, and it is getting fixed there.
Update 2019-12-16: I asked Perforce what the motivation behind this exact wording was and whether the wording was intentional. Here is their response:
Our C++ team who were involved in creating the HIC++ standard have taken a look at the Stack Overflow question you mentioned.
In short, referring to the object type in the HIC++ rule instead of the value is an intentional choice to allow easier automated checking of the code. The type of an object is always known, while the value is not.
HIC++ rules in general aim to be "decidable". Enforcing against the type ensures that a decidable check is always possible, ie. directly where the operator is used or where a signed type is converted to unsigned.
The rationale explicitly refers to "possible" undefined behavior, therefore a sensible implementation can exclude:
constants unless there is definitely an issue and,
unsigned types that are promoted to signed types.
The best operation is therefore for CLion to limit the checking to non-constant types before promotion.
I think the integer promotion causes here the warning. Operands smaller than an int are widened to integer for the arithmetic expression, which is signed. So your code is effectively return ( (int)sequence >> 14)==3; which leds to the warning. Try return ( (unsigned)sequence >> 14)==3; or return (sequence & 0xC000)==0xC000;.
Edited to include proper standard reference thanks to Carl Norum.
The C standard states
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
Are there compiler switches that guarantee certain behaviors on integer overflow? I'd like to avoid nasal demons. In particular, I'd like to force the compiler to wrap on overflow.
For the sake of uniqueness, let's take the standard to be C99 and the compiler to be gcc. But I would be interested in answers for other compilers (icc, cl) and other standards (C1x, C89). In fact, just to annoy the C/C++ crowd, I'd even appreciate answers for C++0x, C++03, and C++98.
Note: International standard ISO/IEC 10967-1 may be relevant here, but as far as I could tell it was mentioned only in the informative annex.
Take a look at -ftrapv and -fwrapv:
-ftrapv
This option generates traps for signed overflow on addition, subtraction, multiplication operations.
-fwrapv
This option instructs the compiler to assume that signed arithmetic overflow of addition, subtraction and multiplication wraps around using twos-complement representation. This flag enables some optimizations and disables other. This option is enabled by default for the Java front-end, as required by the Java language specification.
For your C99 answer, I think 6.5 Expressions, paragraph 5 is what you're looking for:
If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.
That means if you get an overflow, you're out of luck - no behaviour of any kind guaranteed. Unsigned types are a special case, and never overflow (6.2.5 Types, paragraph 9):
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
C++ has the same statements, worded a bit differently:
5 Expressions, paragraph 4:
If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined. [Note: most existing implementations of C++ ignore integer overflows. Treatment of division by zero, forming a remainder using a zero divisor, and all floating point exceptions vary among machines, and is usually adjustable by a library function. —endnote]
3.9.1 Fundamental types, paragraph 4:
Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.
In C99 the general behavior is desribed in 6.5/5
If an exceptional condition occurs
during the evaluation of an expression
(that is, if the result is not
mathematically defined or not in the
range of representable values for its
type), the behavior is undefined.
The behavior of unsigned types is described in 6.2.5/9, which basically states that operations on unsigned types never lead to exceptional condition
A computation involving unsigned
operands can never overflow, because a
result that cannot be represented by
the resulting unsigned integer type is
reduced modulo the number that is one
greater than the largest value that
can be represented by the resulting
type.
GCC compiler has a special option -ftrapv, which is intended to catch run-time overflow of signed integer operations.
For completeness, I'd like to add that Clang now has "checked arithmetic builtins" as a language extension. Here is an example using checked unsigned multiplication:
unsigned x, y, result;
...
if (__builtin_umul_overflow(x, y, &result)) {
/* overflow occured */
...
}
...
http://clang.llvm.org/docs/LanguageExtensions.html#checked-arithmetic-builtins
6.2.5 paragraph 9 is what you're looking for:
The range of nonnegative values of a signed integer type is a subrange of the
corresponding unsigned integer type, and the representation of the same value in each
type is the same.31) A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
The previous postings all commented on the C99 standard, but in fact this guarantee was already available earlier.
The 5th paragraph of Section 6.1.2.5 Types
of the C89 standard states
A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned
integer type is reduced modulo the number that is one greater than
the largest value that can be represented by the resulting unsigned integer type.
Note that this allows C programmers to replace all unsigned divisions by some constant to be replaced by a multiplication with the inverse element of the ring formed by C's modulo 2^N interval arithmetic.
And this can be done without any "correction" as it would be necessary by approximating the division with a fixed-point multiplication with the reciprocal value.
Instead, the Extended Euclidian Algorithm can be used to find the inverse Element and use it as the multiplier. (Of course, for the sake of staying portable, bitwise AND operations should also be applied in order to ensure the results have the same bit widths.)
It may be worthwhile to comment that most C compilers already implement this as an optimization. However, such optimizations are not guaranteed, and therefore it might still be interesting for programmers to perform such optimizations manually in situations where speed matters, but the capabilities of the C optimizer are either unknown or particularly weak.
And as a final remark, the reason for why trying to do so at all: The machine-level instructions for multiplication are typically much faster than those for division, especially on high-performance CPUs.
I'm not sure if there are any compiler switches you can use to enforce uniform behavior for overflows in C/C++. Another option is to use the SafeInt<T> template. It's a cross platform C++ template that provides definitive overflow / underflow checks for all types of integer operations.
http://safeint.codeplex.com/