Iterating an array backward in C using an unsigned index

Is the following code safe for iterating over an array backward?
for (size_t s = array_size - 1; s != -1; s--)
    array[s] = <do something>;
Note that I'm comparing s, which is unsigned, against -1.
Is there a better way?

This code is surprisingly tricky. If my reading of the C standard is correct, then your code is safe if size_t is at least as big as int. This is normally the case because size_t is usually implemented as something like unsigned long int.
In this case -1 is converted to size_t (the type of s). -1 can't be represented by an unsigned type, so we apply modulo arithmetic to bring it in range. This gives us SIZE_MAX (the largest possible value of type size_t). Similarly, decrementing s when it is 0 is done modulo SIZE_MAX+1, which also results in SIZE_MAX. Therefore your loop ends exactly where you want it to end, after processing the s = 0 case.
On the other hand, if size_t were something like unsigned short (and int bigger than short), then int could represent all possible size_t values and s would be converted to int. In other words, the comparison would be done as (int)SIZE_MAX != -1, which is always true, so the loop would never terminate, thus breaking your code. But I've never seen a system where this could happen.
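On a typical platform you can watch both conversions at work (a minimal demo of the modulo behavior described above; it assumes only that size_t has at least the rank of int):
#include <stdint.h>
#include <stdio.h>

int main(void) {
    size_t s = 0;
    s--;                              /* decrementing 0 wraps modulo SIZE_MAX + 1 */
    printf("%d\n", s == (size_t)-1);  /* prints 1: -1 converts to SIZE_MAX */
    printf("%d\n", s == SIZE_MAX);    /* prints 1 */
    return 0;
}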
You can avoid any potential problems by using SIZE_MAX (which is provided by <stdint.h>) instead of -1:
for (size_t s = array_size - 1; s != SIZE_MAX; s--)
    ...
But my favorite solution is this:
for (size_t s = array_size; s--; )
    ...
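For example, applied to the loop from the question (a minimal sketch; fill_backward and the int element type are hypothetical names), this form also does the right thing when array_size is 0, since the condition fails immediately:
#include <stddef.h>

/* Visits elements array_size-1 ... 0; does nothing when array_size == 0. */
void fill_backward(int *array, size_t array_size) {
    for (size_t s = array_size; s--; )
        array[s] = (int)s;
}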

Be careful with the s != -1 test: the -1 is converted to size_t, so on common platforms the condition does become false once s wraps around to SIZE_MAX (see the previous answer). But the idiom is subtle, and if the comparison were ever done in a wider signed type it would never fire; s would then march on past 0 and your program would probably segfault from an out-of-bounds access. A clearer solution is to start at the array size and subtract one everywhere you use the index:
for (size_t s = array_size; s > 0; s--)
    array[s-1] = <do something>;
Or you can fold the decrement into the for loop's condition:
for (size_t s = array_size; s--; )
    array[s] = <do something>;
This tests s against 0 and then decrements it before entering the loop body, so the body sees values from array_size - 1 down to 0.

IMO, use a large enough signed type for the iteration variable. It is easier for humans to read.

Related

behavior when for-loop variable overflow and compiler optimization

While reading http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html about undefined behavior in c, I get a question on this example.
for (i = 0; i <= N; ++i) { ... }
In this loop, the compiler can assume that the loop will iterate
exactly N+1 times if "i" is undefined on overflow, which allows a
broad range of loop optimizations to kick in. On the other hand, if
the variable is defined to wrap around on overflow, then the compiler
must assume that the loop is possibly infinite (which happens if N is
INT_MAX) - which then disables these important loop optimizations.
This particularly affects 64-bit platforms since so much code uses
"int" as induction variables.
This example shows that the C compiler can take advantage of undefined behavior to assume that the loop executes exactly N+1 times. But I don't understand why this assumption is valid.
I can understand that if the variable is defined to wrap around on overflow and N is INT_MAX, then the for loop will be infinite, because i will go from 0 to INT_MAX, overflow to INT_MIN, climb back to INT_MAX, restart from INT_MIN, and so on. So the compiler cannot make this assumption about the iteration count and cannot optimize on this point.
But what about when i is undefined on overflow? In this case, i loops normally from 0 to INT_MAX, then i will be assigned INT_MAX+1, which would overflow to an undefined value such as between 0 and INT_MAX. If so, the condition i <= INT_MAX still holds, so shouldn't the for loop continue and also be infinite?
… then i will be assigned INT_MAX+1, which would overflow to an undefined value such as between 0 and INT_MAX.
No, that is not correct. That is written as if the rule were:
If ++i overflows, then i will be given some int value, although it is not specified which one.
However, the rule is:
If ++i overflows, the entire behavior of the program is undefined by the C standard.
That is, if ++i overflows, the C standard allows any of these things to happen:
i stays at INT_MAX.
i changes to INT_MIN.
i changes to zero.
i changes to 37.
The processor generates a trap, and the operating system terminates your process.
Some other variable changes value.
Program control jumps out of the loop, as if it had ended normally.
Anything.
Now consider this assumption used in optimization by the compiler:
… the compiler can assume that the loop will iterate exactly N+1 times…
If ++i can only set i to some int value, then the loop will not terminate, as you conclude. On the other hand, if the compiler generates code that assumes the loop will iterate exactly N+1 times, then something else will happen in the case when ++i overflows. Exactly what happens depends on the contents of the loop and what the compiler does with them. But it does not matter what: Generating this code is allowed by the C standard because whatever happens when ++i overflows is allowed by the C standard.
Let's consider an actual case:
#include <limits.h>
#include <stdio.h>

unsigned long long test_int(unsigned long long L, int N) {
    for (int i = 0; i <= N; ++i) {
        L++;
    }
    return L;
}

unsigned long long test_unsigned(unsigned long long L, unsigned N) {
    for (unsigned i = 0; i <= N; ++i) {
        L++;
    }
    return L;
}

int main() {
    fprintf(stderr, "int: %llu\n", test_int(0, INT_MAX));
    fprintf(stderr, "unsigned: %llu\n", test_unsigned(0, UINT_MAX));
    return 0;
}
The point of the blog article is the possible behavior of the compiler for the above code:
for test_int() the compiler can determine that: for argument values from INT_MIN to -1, the function should return L unchanged; for values between 0 and INT_MAX-1, the return value should be L + N + 1; and for INT_MAX the behavior is undefined, so returning L + N + 1 is OK too. Hence the code can be simplified as
unsigned long long test_int(unsigned long long L, int N) {
    if (N >= 0)
        L += N + 1;
    return L;
}
for test_unsigned(), the same analysis yields: for argument values below UINT_MAX, the return value is L + N + 1 and for UINT_MAX there is an infinite loop:
unsigned long long test_unsigned(unsigned long long L, unsigned N) {
    if (N != UINT_MAX)
        return L + N + 1;
    for (;;);
}
As can be seen on https://godbolt.org/z/abafdE8P4 both gcc and clang perform this optimisation for test_int, taking advantage of undefined behavior on overflow but generate iterative code for test_unsigned.
Signed integer overflow invokes undefined behaviour. A programmer cannot assume that a portable program will behave in any particular way.
On the other hand, a program compiled for a particular platform with a particular version of the compiler, and using the same versions of the libraries, will behave in a deterministic way. But you do not know whether the behaviour will remain the same if any of those (compiler, compiler version, etc.) change.
So your assumptions can be valid for a particular build and execution environment, but are invalid in general.

Don't fully understand custom-written 'memcpy' function in C

So I was browsing the Quake engine source code earlier today and stumbled upon some hand-written utility functions. One of them was 'Q_memcpy':
void Q_memcpy (void *dest, void *src, int count)
{
    int i;

    if (( ( (long)dest | (long)src | count) & 3) == 0 )
    {
        count >>= 2;
        for (i = 0; i < count; i++)
            ((int *)dest)[i] = ((int *)src)[i];
    }
    else
        for (i = 0; i < count; i++)
            ((byte *)dest)[i] = ((byte *)src)[i];
}
I understand the whole premise of the function, but I don't quite understand the reason for the bitwise OR between the source and destination addresses. So the sum of my questions is as follows:
Why does 'count' get used in the same bitwise arithmetic?
Why are the last two bits of the result checked against zero?
What purpose does this whole check serve?
I'm sure it's something obvious but please excuse my ignorance because I haven't really delved into the more low level side of things when it comes to programming. I just find it interesting and want to learn more.
It is finding out whether the source and destination pointers are int-aligned, and whether the count is an exact multiple of the int size.
If those three things are all true, the least-significant 2 bits of all of them will be 0 (assuming pointers and int are 4 bytes). So the algorithm ORs the three values together and isolates the least-significant 2 bits.
If the result is zero, it copies int by int; otherwise it copies char by char.
If the test fails, a more sophisticated algorithm would copy some of the leading and trailing bytes char by char and the intermediate bytes int by int.
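A minimal sketch of that head/tail-splitting idea (not Quake's code; like Q_memcpy it assumes a 4-byte int and that copying through int is acceptable on the target, and the pointer-to-integer casts are the same trick the original plays with long):
#include <stddef.h>

void q_memcpy2(void *dest, const void *src, size_t count) {
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Copy leading bytes until both pointers are 4-byte aligned.
       If their misalignments differ, this just copies everything bytewise. */
    while (count && (((size_t)d | (size_t)s) & 3)) {
        *d++ = *s++;
        count--;
    }
    /* Copy the aligned middle part int by int. */
    for (; count >= 4; count -= 4, d += 4, s += 4)
        *(int *)d = *(const int *)s;
    /* Copy any trailing bytes. */
    while (count--)
        *d++ = *s++;
}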
The bitwise ORing and ANDing with 3 is to check whether the source, destination and count are all divisible by 4. If they are, the operation can work with 4-byte words (this code assumes int is 4 bytes); otherwise the operation is performed bytewise.
It first tests whether all 3 arguments are divisible by 4. If, and only if, they all are, it proceeds with copying 4 bytes at a time.
I.e., decoded, this would be
if ((long)src % 4 == 0 && (long)dest % 4 == 0 && count % 4 == 0)
{
    count = count / 4;
    for (i = 0; i < count; i++)
        ((int *)dest)[i] = ((int *)src)[i];
}
Maybe they tested their compiler and it generated bad code even for the straightforward test, and therefore they decided to write it in this convoluted way. In any case, x | y | z guarantees that bit n is set in the result if it is set in any of x, y or z. Therefore, if (x | y | z) & 3 results in 0, none of the numbers had either of the 2 lowest bits set, and all are therefore divisible by 4.
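A quick demonstration of that property (the address values are hypothetical, and long is used for the casts just as Q_memcpy does):
#include <stdio.h>

int main(void) {
    long dest = 0x1000, src = 0x2004;  /* both divisible by 4 */
    long count = 16;
    printf("%ld\n", (dest | src | count) & 3);  /* prints 0: all divisible by 4 */
    src = 0x2005;                               /* now misaligned by 1 byte */
    printf("%ld\n", (dest | src | count) & 3);  /* prints 1: not divisible */
    return 0;
}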
Of course it would be rather silly to use now - the standard library memcpy in recent library implementations is almost certainly better than this.
Therefore, on recent compilers you can optimize all calls to Q_memcpy by switching them to memcpy. GCC can generate things like 64-bit or SIMD moves for memcpy, depending on the size of the area to be copied.
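In other words, a modern drop-in replacement (a sketch that keeps Q_memcpy's original signature) is just a thin wrapper:
#include <string.h>

void Q_memcpy(void *dest, void *src, int count) {
    /* The library memcpy picks the widest safe moves for the platform. */
    memcpy(dest, src, (size_t)count);
}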

Proper way to count down with unsigned

I am reading the Carnegie Mellon slides on computer systems for my quiz. On slide 49:
Counting Down with Unsigned
Proper way to use unsigned as loop index
unsigned i;
for (i = cnt-2; i < cnt; i--)
    a[i] += a[i+1];
Even better
size_t i;
for (i = cnt-2; i < cnt; i--)
    a[i] += a[i+1];
I don't get why it's not going to be an infinite loop. I am decrementing i, and since it is unsigned it should always be less than cnt. Please explain.
The best option for down counting loops I have found so far is to use
for(unsigned i=N; i-->0; ) { }
This invokes the loop body with i=N-1 ... 0. This works the same way for both signed and unsigned data types and does not rely on any overflows.
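A quick check of that claim (a minimal demo; the starting value 5 is arbitrary):
#include <stdio.h>

int main(void) {
    /* Both loops print 4 3 2 1 0; the shape is identical for either type. */
    for (unsigned u = 5; u-- > 0; )
        printf("%u ", u);
    printf("\n");
    for (int i = 5; i-- > 0; )
        printf("%d ", i);
    printf("\n");
    return 0;
}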
This loop is simply relying on the fact that i will be decremented past 0, which makes it the maximum unsigned value. That breaks the loop, because now i < cnt is false.
Per Overflowing of Unsigned Int:
unsigned numbers can't overflow, but instead wrap around using the
properties of modulo.
Both the C and C++ standard guarantee this uint wrapping behavior, but it's undefined for signed integers.
The goal of the loops is to loop from cnt-2 down to 0. It achieves the effect of writing i >= 0.
The previous slide correctly talks about why a loop condition of i >= 0 doesn't work. Unsigned numbers are always greater than or equal to 0, so such a condition would be vacuously true. A loop condition of i < cnt ends up looping until i goes past 0 and wraps around. When you decrement an unsigned 0 it becomes UINT_MAX (2^32 - 1 for a 32-bit integer). When that happens, i < cnt is guaranteed to be false, and the loop terminates.
I would not write loops like this. It is technically correct but very poor style. Good code is not just correct, it is readable, so others can easily figure out what it's doing.
It's taking advantage of what happens when you decrement unsigned integer 0. Here's a simple example.
unsigned cnt = 2;
for (int i = 0; i < 5; i++) {
    printf("%u\n", cnt);
    cnt--;
}
That produces...
2
1
0
4294967295
4294967294
Unsigned integer 0 - 1 becomes UINT_MAX. So instead of looking for -1, you watch for when your counter becomes bigger than its initial state.
Simplifying the example a bit, here's how you can count down to 0 from 5 (exclusive).
unsigned i;
unsigned cnt = 5;
for (i = cnt-1; i < cnt; i--) {
    printf("%u\n", i);
}
That prints:
4
3
2
1
0
On the final iteration i = UINT_MAX which is guaranteed to be larger than cnt so i < cnt is false.
size_t is "better" because it's unsigned and it's as big as the biggest thing in C, so you don't have to ensure that cnt is the same type as i.
This appears to be an alternative expression of the established idiom for implementing the same thing
for (unsigned i = N; i != -1; --i)
    ...;
They simply replaced the more readable condition of i != -1 with a slightly more cryptic i < cnt. When 0 is decremented in the unsigned domain it actually wraps around to the UINT_MAX value, which compares equal to -1 (in the unsigned domain) and which is greater than or equal to cnt. So, either i != -1 or i < cnt works as a condition for continuing iterations.
Why would they do it that way specifically? Apparently because they start from cnt - 2 and the value of cnt can be smaller than 2, in which case their condition does indeed work properly (and i != -1 doesn't). Aside from such situations there's no reason to involve cnt into the termination condition. One might say that an even better idea would be to pre-check the value of cnt and then use the i != -1 idiom
if (cnt >= 2)
    for (unsigned i = cnt - 2; i != -1; --i)
        ...;
Note, BTW, that as long as the starting value of i is known to be non-negative, the implementation based on the i != -1 condition works regardless of whether i is signed or unsigned.
I think you are confusing the int and unsigned int data types. These two are different. With a 2-byte int, the range is -32,768 to 32,767, whereas with a 2-byte unsigned int, the range is 0 to 65,535.
In the example above, the variable i is of type unsigned int. It decrements down to i = 0; the next decrement wraps it around to the maximum value, the condition i < cnt becomes false, and the for loop ends.

How to compare C pointers?

Recently, I wrote some code to compare pointers like this:
if(p1+len < p2)
however, some staff said that I should write it like this:
if(p2-p1 > len)
to be safe.
Here, p1 and p2 are char * pointers and len is an integer.
I have no idea about that. Is that right?
EDIT1: of course, p1 and p2 point to the same memory object at the beginning.
EDIT2: just one minute ago, I found the bug behind this question in my code (about 3K lines): len is so big that p1+len can't be stored in the 4 bytes of a pointer, so p1+len < p2 is true. But it shouldn't be, in fact, so I think we should compare pointers like this in some situations:
if(p2 < p1 || (uint32_t)p2-p1 > (uint32_t)len)
In general, you can only safely compare pointers if they're both pointing to parts of the same memory object (or one position past the end of the object). When p1, p1 + len, and p2 all conform to this rule, both of your if-tests are equivalent, so you needn't worry. On the other hand, if only p1 and p2 are known to conform to this rule, and p1 + len might be too far past the end, only if(p2-p1 > len) is safe. (But I can't imagine that's the case for you. I assume that p1 points to the beginning of some memory-block, and p1 + len points to the position after the end of it, right?)
What they may have been thinking of is integer arithmetic: if it's possible that i1 + i2 will overflow, but you know that i3 - i1 will not, then i1 + i2 < i3 could either wrap around (if they're unsigned integers) or trigger undefined behavior (if they're signed integers) or both (if your system happens to perform wraparound for signed-integer overflow), whereas i3 - i1 > i2 will not have that problem.
Edited to add: In a comment, you write "len is a value from buff, so it may be anything". In that case, they are quite right, and p2 - p1 > len is safer, since p1 + len may not be valid.
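To illustrate the integer-arithmetic point above (a small demo; it assumes 32-bit unsigned int, and the values are arbitrary):
#include <stdio.h>

int main(void) {
    unsigned i1 = 4000000000u, i2 = 1000000000u, i3 = 4100000000u;
    /* i1 + i2 wraps modulo 2^32 to 705032704, so this test is wrongly true: */
    printf("%d\n", i1 + i2 < i3);  /* prints 1 */
    /* i3 - i1 = 100000000 does not wrap, so this gives the intended answer: */
    printf("%d\n", i3 - i1 > i2);  /* prints 0 */
    return 0;
}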
"Undefined behavior" applies here. You cannot compare two pointers unless they both point to the same object or to the first element after the end of that object. Here is an example:
void func(int len)
{
    char array[10];
    char *p = &array[0], *q = &array[10];
    if (p + len <= q)
        puts("OK");
}
You might think about the function like this:
// if (p + len <= q)
// if (array + 0 + len <= array + 10)
// if (0 + len <= 10)
// if (len <= 10)
void func(int len)
{
    if (len <= 10)
        puts("OK");
}
However, the compiler knows that p + len <= q is true for all valid values of len (a pointer may only point into the array or one past its end), so it might optimize the function to this:
void func(int len)
{
    puts("OK");
}
Much faster! But not what you intended.
Yes, there are compilers that exist in the wild that do this.
Conclusion
This is the only safe version: subtract the pointers and compare the result, don't compare the pointers.
if (len <= q - p)
Technically, p1 and p2 must be pointers into the same array. If they are not in the same array, the behaviour is undefined.
For the addition version, the type of len can be any integer type.
For the difference version, the result of the subtraction is ptrdiff_t, but any integer type will be converted appropriately.
Within those constraints, you can write the code either way; neither is more correct. In part, it depends on what problem you're solving. If the question is 'are these two elements of the array more than len elements apart', then subtraction is appropriate. If the question is 'is p2 the same element as p1[len] (aka p1 + len)', then the addition is appropriate.
In practice, on many machines with a uniform address space, you can get away with subtracting pointers to disparate arrays, but you might get some funny effects. For example, if the pointers are pointers to some structure type, but not parts of the same array, then the difference between the pointers treated as byte addresses may not be a multiple of the structure size. This may lead to peculiar problems. If they're pointers into the same array, there won't be a problem like that — that's why the restriction is in place.
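Within the same array, by contrast, the subtraction is well defined and the division by the element size is exact (a minimal demo; the struct layout is hypothetical):
#include <stdio.h>
#include <stddef.h>

struct item { char name[12]; int value; };

int main(void) {
    struct item items[10];
    struct item *a = &items[2], *b = &items[7];
    ptrdiff_t diff = b - a;  /* an element count: the byte distance
                                divided exactly by sizeof(struct item) */
    printf("%td\n", diff);   /* prints 5 */
    return 0;
}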
The existing answers show why if (p2-p1 > len) is better than if (p1+len < p2), but there's still a gotcha with it -- if p2 happens to point BEFORE p1 in the buffer and len is an unsigned type (such as size_t), then p2-p1 will be negative, but will be converted to a large unsigned value for comparison with the unsigned len, so the result will probably be true, which may not be what you want.
So you might actually need something like if (p1 <= p2 && p2 - p1 > len) for full safety.
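A small demonstration of that gotcha (buf and the offsets are hypothetical; it assumes ptrdiff_t and size_t have the same width, as on common platforms):
#include <stdio.h>
#include <stddef.h>

int main(void) {
    char buf[100];
    char *p1 = buf + 50, *p2 = buf + 10;  /* p2 points BEFORE p1 */
    size_t len = 100;
    /* p2 - p1 is -40, but it converts to a huge unsigned value when
       compared against the unsigned len, so this test is wrongly true: */
    printf("%d\n", p2 - p1 > len);              /* prints 1 */
    /* The guarded form gives the intended answer: */
    printf("%d\n", p1 <= p2 && p2 - p1 > len);  /* prints 0 */
    return 0;
}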
As Dietrich already said, comparing unrelated pointers is dangerous, and could be considered as undefined behavior.
Given that two pointers are within the range 0 to 2GB (on a 32-bit Windows system), subtracting the 2 pointers will give you a value between -2^31 and +2^31. This is exactly the domain of a signed 32-bit integer. So in this case it does seem to make sense to subtract two pointers because the result will always be within the domain you would expect.
However, if the LargeAddressAware flag is enabled in your executable (this is Windows-specific; I don't know about Unix), then your application will have an address space of 3GB (when run on 32-bit Windows booted with the /3GB option) or even 4GB (when run on a 64-bit Windows system).
If you then start to subtract two pointers, the result could be outside the domain of a 32-bit integer, and your comparison will fail.
I think this is one of the reasons why the address space was originally divided into 2 equal parts of 2GB, and the LargeAddressAware flag is still optional. However, my impression is that current software (your own software and the DLLs you're using) is quite safe in this regard (nobody subtracts pointers anymore, right?), and my own application has the LargeAddressAware flag turned on by default.
Neither variant is safe if an attacker controls your inputs
The expression p1 + len < p2 compiles down to something like p1 + sizeof(*p1)*len < p2, and the scaling with the size of the pointed-to type can overflow your pointer:
int *p1 = (int*)0xc0ffeec0ffee0000;
int *p2 = (int*)0xc0ffeec0ffee0400;
size_t len = 0x4000000000000000;  /* too large to fit in an int */

if (p1 + len < p2) {
    printf("pwnd!\n");
}
When len is multiplied by the size of int, it wraps around to 0, so the condition is evaluated as if(p1 + 0 < p2). This is obviously true, and the following code is executed with a much too high length value.
OK, so what about p2-p1 < len? Same thing, overflow kills you:
char *p1 = (char*)0xa123456789012345;
char *p2 = (char*)0x0123456789012345;
int len = 1;

if (p2 - p1 < len) {
    printf("pwnd!\n");
}
In this case, the difference between the pointers overflows the signed result type: the mathematical difference, -0xa000000000000000, does not fit in a 64-bit ptrdiff_t, which is undefined behavior. In practice the result can come out negative; it then compares smaller than len, and the following code is executed with a much too low len value (or a much too large pointer difference).
The only approach that I know is safe in the presence of attacker-controlled values, is to use unsigned arithmetic:
if (p1 < p2 &&
    ((uintptr_t)p2 - (uintptr_t)p1) / sizeof(*p1) < (uintptr_t)len
   ) {
    printf("safe\n");
}
The p1 < p2 guarantees that the difference cannot be negative. The second clause performs the comparison p2 - p1 < len while forcing unsigned arithmetic in a non-UB way: (uintptr_t)p2 - (uintptr_t)p1 gives exactly the count of bytes between the bigger p2 and the smaller p1, no matter the values involved, and dividing by sizeof(*p1) turns that into an element count to match len.
Of course, you don't want to see such comparisons in your code unless you know that you need to defend against determined attackers. Unfortunately, it's the only way to be safe, and if you rely on either form given in the question, you open yourself up to attacks.

For loop condition to stop at 0 when using unsigned integers?

I have a loop that has to go from N to 0 (inclusive). My i variable is of type size_t, which is unsigned. I am currently using the following code:
for (size_t i = N; i != (size_t) -1; --i) {
    ...
}
Is that correct? Is there a better way to handle the condition?
Thanks,
Vincent.
Yes, it's correct and it is a very common approach. I wouldn't consider changing it.
Arithmetic on unsigned integer types is guaranteed to use modulo 2^N arithmetic (where N is the number of value bits in the type) and behaviour on overflow is well defined. The result is converted into the range 0 to 2^N - 1 by adding or subtracting multiples of 2^N (i.e. modulo 2^N arithmetic).
-1 converted to an unsigned integer type (of which size_t is one) converts to 2^N - 1. -- also uses modulo 2^N arithmetic for unsigned types so an unsigned type with value 0 will be decremented to 2^N - 1. Your loop termination condition is correct.
Just because for has a convenient place to put a test at the beginning of each iteration doesn't mean you have to use it. To handle N down to 0 inclusive, the test should be at the end, at least if you care about handling the maximum value of the type. Don't let the convenience lure you into putting the test in the wrong place.
for (size_t i = N;; --i) {
    ...
    if (i == 0) break;
}
A do-while loop would also work but then you'd additionally give up i being scoped to the loop.
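To see why the placement of the test matters at the extreme, compare with top-tested idioms that start from N + 1: when N is SIZE_MAX, the + 1 wraps to 0 and the loop never runs at all (a minimal demo):
#include <stdint.h>
#include <stdio.h>

int main(void) {
    size_t N = SIZE_MAX;
    int body_ran = 0;
    /* i starts at N + 1 == 0, so i-- > 0 is false immediately: */
    for (size_t i = N + 1; i-- > 0; ) {
        body_ran = 1;
        break;  /* one iteration would be enough to prove the point */
    }
    printf("body ran: %d\n", body_ran);  /* prints 0 */
    return 0;
}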
You can use this:
for (size_t i = n + 1; i-- > 0;)
{
}
Hope that helps.
Personally, I would just use a different loop construct, but to each their own:
size_t i = N;
do {
    ...
} while (i --> 0);
(you could just use (i--) as the loop condition, but one should never pass up a chance to use the --> "operator").
for ( size_t i = N ; i <= N ; i-- ) { .... }
This would do it because size_t is an unsigned type; suppose it is 32 bits wide. When the variable i has the value 0, you still want your loop to execute the condition test. If you perform i--, the computer does
00000000000000000000000000000000
- 00000000000000000000000000000001
which wraps around, giving a value of 11111111...1. For a signed two's-complement integer this value would be negative, but the type of i is unsigned, so the computer interprets 111111...1 as a very large positive value.
So you have a few options:
1) Do as above and make the loop terminate when overflow occurs.
2) Make the loop run from i = 0 to i <= N but use (N-i) instead of i everywhere in your loop. For example, myArray[i] would become myArray[N-i] (off by one depending on what the value of N actually represents).
3) Make the condition of your for loop exploit the precedence of the unary -- operator. As another user posted,
for ( size_t i = N + 1 ; i-- > 0 ; ) { ... }
This will set i to N+1, then check whether the condition N+1 > 0 holds. It does, but i-- has a side effect, so the value of i is decremented to N before the body runs. Keep going until you get to i = 1: the condition is tested, 1 > 0 is true, the side effect occurs, then i = 0 and the body executes. On the next test, 0 > 0 is false, so the loop ends after processing i = 0.
You can use a second variable as the loop counter to make the range of iteration clear to a future reviewer.
for (size_t j = 0, i = N; j <= N; ++j, --i) {
    // code here ignores j and uses i, which runs from N to 0
    ...
}
for (i=N; i+1; i--)
This works because after the i = 0 iteration the decrement wraps i around to the maximum value, and i + 1 then wraps to 0, which is false.
Since an unsigned integer rolls over to its maximum value when decremented from zero, you can try the following, provided N is less than that maximum value (someone please correct me if this is UB):
for ( size_t i = N; i <= N; i-- ) { /* ... */ }
