Why is glibc's __random_r assigning variables it immediately overwrites? - c

I was looking for the source for glibc's rand() function, which an answer here links to.
Following the link, I'm puzzled about the code for the __random_r() TYPE_0 branch:
int32_t val = state[0];
val = ((state[0] * 1103515245) + 12345) & 0x7fffffff;
state[0] = val;
*result = val;
What is the point of the val variable, getting assigned and then immediately overwritten? The random_data struct that holds state is nothing unusual.
As one would expect, compiling with -O2 on godbolt gives the same code if you just eliminate val. Is there a known reason for this pattern?
UPDATE: It seems this was an aberration in the version linked from that answer; I've updated the links there to the 2.28 version. It might have been something done temporarily to aid debugging, by making the contents of state[0] easier to see in a local watchlist?

Wow, that is indeed some unbelievable garbage code.
There is no justification for this.
And not only is the initialization of val not needed: state[0] is an int32_t, so the multiplication by 1103515245 can overflow, and signed integer overflow is undefined behaviour in C, on any platform with 32-bit ints (i.e. basically every one). And GCC is the compiler most often used to compile glibc.
As noted by HostileFork, the code in the more recent 2.28 reads:
int32_t val = ((state[0] * 1103515245U) + 12345U) & 0x7fffffff;
state[0] = val;
*result = val;
With this, not only is the useless initialization removed, but the U suffix makes the multiplication happen with unsigned integers, avoiding undefined behaviour. The & 0x7fffffff ensures that the resulting value fits into an int32_t and is positive.
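For reference, the TYPE_0 step is a plain linear congruential generator. A minimal sketch of the cleaned-up 2.28 logic (a standalone illustration, not the actual glibc source):

```c
#include <stdint.h>

/* Sketch of the TYPE_0 LCG step: state[0] is promoted to unsigned for the
   multiply, so the wraparound is well-defined, and the final mask keeps
   the result nonnegative and within int32_t range. */
static int32_t type0_step(int32_t *state) {
    int32_t val = ((state[0] * 1103515245U) + 12345U) & 0x7fffffff;
    state[0] = val;
    return val;
}
```

With a seed of 1, the first step produces 1 * 1103515245 + 12345 = 1103527590, which already fits under the 0x7fffffff mask.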

Related

Why initializing a value by itself is not throwing any error in C?

I came across this issue whilst practicing assignment in C. When I initialize a variable with its own name (an identifier that doesn't even exist yet), no error is thrown.
int x = x;
As far as I know, the associativity of the assignment operator is right to left, so this code should throw an error: I'm initializing a variable with an rvalue that doesn't even exist. Instead, it assigns some kind of garbage value. Why is this happening?
Initializing a variable with its own (indeterminate) value is undefined behavior under the C standard. After adding the compiler option -Winit-self, the warning appears.
GCC 11.2 does diagnose this if you use -Wuninitialized -Winit-self.
I suspect int x = x; may have been used as an idiom1 for “I know what I am doing; do not warn me about x being uninitialized,” and so it was excluded from -Wuninitialized, but a separate warning was provided with -Winit-self.
Note that while the behavior of int x = x; is not defined by the C standard (if it is inside a function and the address of x is not taken), neither does it violate any constraints of the C standard. This means a compiler is not required to diagnose the problem. Choosing to issue a warning message is a matter of choices and quality of the C implementation rather than rules of the C standard.
Apple Clang 11.0 does not warn for int x = x; even with -Wuninitialized -Winit-self. I suspect this is a bug (a deviation from what the authors would have wanted, if not from the rules of the C standard), but perhaps there was some reason for it.
Consider code such as:
int FirstIteration = 1;
int x;
for (int i = 0; i < N; ++i)
{
if (FirstIteration)
x = 4;
x = foo(x);
FirstIteration = 0;
}
A compiler might observe that x = 4; is inside an if, and therefore reason that x might not be initialized in foo(x). The compiler might be designed to issue a warning message in such cases and be unable to reason that the use of FirstIteration guarantees that x is always initialized before being used in foo(x). Taking int x = x; as an assertion from the author that they have deliberately designed the code this way gives them a way to suppress the spurious warning.
Footnote
1 There are several idioms which are used to tell a compiler the author is deliberately using some construction which is often an error. For example, if (x = 3) … is completely defined C code that sets x to 3 and always evaluates as true for the if, but it is usually a mistake as x == 3 was intended. The code if ((x = 3)) … is identical in C semantics, but the presence of the extra parentheses is an idiom used to say “I used assignment deliberately,” and so a compiler may warn for if (x = 3) … but not for if ((x = 3)) ….
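The footnote's double-parenthesis idiom, in compilable form (a toy example, not code from the question):

```c
/* The extra parentheses around the assignment signal that `=` (not `==`)
   is deliberate, so compilers that warn for `if (x = 3)` stay quiet
   for `if ((x = 3))`. */
static int demo(void) {
    int x;
    if ((x = 3))    /* assigns 3 to x, then tests it: always true here */
        return x;
    return 0;
}
```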

What does *(int32_t *)(a + 4) = b mean?

I have decompiled an .so file (from an ARM lib in an Android app) using retdec and among the code I could find instructions like this:
int32_t a = `some value`;
int32_t b = `another value`;
*(int32_t *)(a + 4) = b;
Since compiling this produces a warning, and running it with any value produces a segmentation fault, I'm not sure what it really does.
Working from the inside out:
a + 4
Takes the value of a, and adds 4 to it, following the usual arithmetic conversions if applicable. This expression has at least the rank of int32_t.
Next:
(int32_t *)(a + 4)
Means that you take this new integer value, and interpret it as a pointer to an int32_t. This expression has type int32_t *.
One step further out, you're dereferencing it with the * operator:
*(int32_t *)(a + 4)
This gives an lvalue (like a typical variable) of type int32_t at the address a + 4 (the validity of such an address is implementation-dependent).
Finally, you assign the value in b to this location:
*(int32_t *)(a + 4) = b;
All together, this means that you store the value of the int32_t b, taken as an int32_t, into the memory location 4 past the value of a.
Unless a + 4 happens to point to a valid memory location to store an int32_t (as it presumably would have in its original context), this will likely result in the program misbehaving. At best, the behaviour is implementation-defined. At worst, it's undefined.
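To see the expression operate on a valid address, here is a hypothetical reconstruction where a carries the address of an array element (assuming uintptr_t round-trips pointer values, as it does on mainstream platforms):

```c
#include <stdint.h>

/* `a` is really a pointer value carried in an integer; adding 4 steps
   one 4-byte slot forward before the cast turns it back into a pointer. */
static void store_after(uintptr_t a, int32_t b) {
    *(int32_t *)(a + 4) = b;   /* the decompiled statement, verbatim */
}
```

With `int32_t arr[2]`, calling `store_after((uintptr_t)arr, 42)` is then the same as `arr[1] = 42`.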
The problem is that the decompiler cannot know the types of variables. It can only know that there is some stuff in registers and some stuff on the stack of a certain size, used in a certain way, so it figures that all 32-bit entities are int32_t, even though on ARM they could just as well be pointers, or even zero-extended chars moved around in registers.
In this case a seems not to be an integer, but a pointer to an element in an array, or perhaps a pointer to a structure and the code was something like
int *a = something;
int b = calculate_something();
a[1] = b;
Or perhaps
struct foo *a = something;
int b = calculate_something();
a->second_member = b;
We wouldn't know. So the best the decompiler can come up with is
int32_t a = something;
int32_t b = calculate_something();
*(int32_t *)(a + 4) = b;
i.e. "oops, the value in a + sizeof (int) now should be used as a pointer, and b be assigned to that location."
As for compiling it again: don't even dream of compiling it for any platform other than the one the code originated from.
It means that de-compilation of machine code does not yield the original source code back! Let's take, for example, the code snippet below.
int a[5];
int b;
void somefunc(void)
{
a[1] = b;
}
It compiles to something like this:
somefunc:
ldr r2, =b # Load the address of b
ldr r3, =a # Load the address of a
ldr r2, [r2] # Load the value in b
str r2, [r3, #4] # Store the value of b to a[1], i.e. 4 bytes past a
bx lr # return
Now, if someone were to try to de-compile it line by line into C code, without knowing about the array and any other context, it would turn out something like the snippet you posted.
str r2, [r3, #4] => *(int32_t *)(r3 + 4) = r2
There are probably also many other snippets of C code that could compile to the exact same assembly sequence. Which is why decompiling is far from an 'exact science'!
*(int32_t *)(a + 4) = b;
In simple terms, this means get the value of a+4 and treat it as an address at which a variable of type int32_t resides. At that address store the value of b.
Decompiling can't always produce an exact result, because code like this is bound to crash unless you have a reserved memory location at a + 4 for an int32_t.
Also, I assume the .so was compiled from code written specifically for a 32-bit architecture, which is why it says type int32_t. Making a guess, it "may" work if you supply gcc with the -m32 flag, which asks it to compile the code for a 32-bit architecture.
The ARM cpu is a load-store architecture. It has a form of store as follows,
str rN, [rP, #4]
This takes the value of register rP (a pointer), adds four to it, and issues a store to that memory address of the value in register rN. Your decompiler seems rudimentary (see the note below) and has translated this as,
int32_t a = `some value`; /* sets up pointer register `rP` */
int32_t b = `another value`; /* Initializes value `rN` */
*(int32_t *)(a + 4) = b; /* the instruction `str rN, [rP, #4]` */
If you look at the decompiling wiki, it notes that compiling to binary loses information. A goal of the decompiler is that if you compile the result unaltered, it should give the same binary.
As the code is trying to replicate the machine code exactly, there is no way it will ever be portable.
Part of the issue with the tool is,
I have decompiled an .so file (from an ARM lib in an Android app)
Shared libraries are compiled as position-independent code so that they can be shared by multiple processes. It is possible that the registers used are non-standard, which doesn't allow the decompiler to match the regular EABI register use found in a main executable.
I looked briefly and the tool didn't seem to have a '-shared-library' decompile option. I suspect you are decompiling a thunk of some sort, i.e. a PLT or GOT entry; see ARM Dynamic linking. Here is a question on shared libraries for the ARM; if the decompiler had a -shared-library option, it would probably need an OS (and version) qualifier.

How can I work around GCC optimization-miss bug 90271?

GCC versions released before May 2019 (and maybe later) fail to optimize this piece of code:
// Replace the k'th byte within an int
int replace_byte(int v1 ,char v2, size_t k)
{
memcpy( (void*) (((char*)&v1)+k) , &v2 , sizeof(v2) );
return v1;
}
as can be seen here (GodBolt): clang optimizes this code properly; GCC and MSVC do not. This is GCC bug 90271, which will at some point be fixed. But it won't be fixed for GCC versions that are out today, and I want to write this code today...
So: Is there a workaround which will make GCC generate the same code as clang for this function, or at least - code of comparable performance, keeping data in registers and not resorting to pointers and the stack?
Notes:
I marked this as C, since the code snippet is in C. I assume a workaround, if one exists, can also be implemented in C.
I'm interested in optimizing both the non-inlined function and the inlined version.
This question is related to this one, but only regards GCC and the specific approach in the piece of code here; and is in C rather than C++.
This makes the non-inlined version a little longer, but the inlined version is optimized for all three compilers:
int replace_bytes(int v1 ,char v2, size_t k)
{
return (v1 & ~(0xFFu << k * 8)) | ((unsigned)(unsigned char)v2 << k * 8);
}
The cast of v2 to an unsigned char before the shift is necessary if char is a signed type. In that case, without the cast, v2 would be sign-extended to an int, setting unwanted bits to 1 in the result. (The unsigned literals also keep the shifts out of the sign bit for k == 3.)
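A quick self-check of the shift approach, rewritten here with all-unsigned intermediates so that even the k == 3 case stays fully defined (a sketch under that rewrite, not the answer's exact code):

```c
#include <stddef.h>
#include <stdint.h>

/* Replace byte k (0 = least significant) of v1's value with v2.
   All shifting happens on uint32_t, so no signed shift can overflow. */
static int replace_bytes_u(int v1, char v2, size_t k) {
    uint32_t mask = 0xFFu << k * 8;
    uint32_t byte = (uint32_t)(unsigned char)v2 << k * 8;
    return (int)(((uint32_t)v1 & ~mask) | byte);
}
```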

Use of uninitialised value of size 8 Valgrind

I'm getting
==56903== Use of uninitialised value of size 8
==56903== at 0x1000361D1: checkMSB (in ./UnittestCSA)
==56903== by 0x10003732A: S_derive_k1_k2 (in ./UnittestCSA)
Code is as follows:
int32_t checkMSB(uint8_t *pKey){
int8_t msb = 0;
int32_t ret = 0;
msb = 1 << (8 - 1);
/* Perform bitwise AND with msb and num */
if(pKey[0] & msb){
ret = 1;
} else {
ret = 0;
}
return ret;
}
Not sure what is causing the issue.
If this
#define BITS (sizeof(int8_t) * 8)
is changed to this
#define BITS (sizeof(int) * 8)
it doesn't complain. I have included the <stdint.h> header file.
UPDATE
uint8_t localK1[BLOCKSIZE];
for(index = 0; index < inputLen; index++){
localK1[index] = pInputText[index];
}
result = checkMSB(localK1);
Your checkMSB function declares only two local variables and one function parameter. The variables both have initializers, and the parameter (a pointer) will receive a value as a result of the function call, supposing that a correct prototype for it is in scope at the point of the call. Thus, none of these is used uninitialized.
The only other data that are used (not counting constants), are those pointed to by the argument, pKey. Of those, your code uses pKey[0]. That it is Valgrind reporting the issue supports the conclusion that that's the data it is complaining about: valgrind's default memcheck service watches dynamically allocated memory, and that's the only thing involved that might be dynamically allocated.
That the error disappears when you change the definition of BITS could be explained by the expression pKey[0] & msb being optimized away when BITS evaluates to a value larger than 8.
As far as your update, which purports to show that the function's argument in fact points to initialized data, I'm inclined to think that you're looking in the wrong place, or else in the right place but at the wrong code. That is, there probably is either a different call to checkMSB that causes Valgrind to complain, or else the binary being tested was built from a different version of the code. I'm not prepared to believe that everything you've presented in the question is true, or at least not that it fits together the way you seem to be saying it does.
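Incidentally, the check itself can sidestep the int8_t subtlety (1 << 7 does not fit in int8_t, so the msb assignment involves an implementation-defined conversion). A minimal equivalent, written here as a hypothetical check_msb rather than the poster's code:

```c
#include <stdint.h>

/* Same MSB test with a plain unsigned mask: 0x80u is bit 7 of a byte,
   and the bitwise AND happens entirely in unsigned arithmetic. */
static int32_t check_msb(const uint8_t *pKey) {
    return (pKey[0] & 0x80u) ? 1 : 0;
}
```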

How to divide two 64-bit numbers in Linux Kernel?

Some code that rounds up the division to demonstrate (C-syntax):
#define SINT64 long long int
#define SINT32 long int
SINT64 divRound(SINT64 dividend, SINT64 divisor)
{
SINT32 quotient1 = dividend / divisor;
SINT32 modResult = dividend % divisor;
SINT32 multResult = modResult * 2;
SINT32 quotient2 = multResult / divisor;
SINT64 result = quotient1 + quotient2;
return ( result );
}
Now, if this were User-space we probably wouldn't even notice that our compiler is generating code for those operators (e.g. divdi3() for division). Chances are we link with libgcc without even knowing it. The problem is that Kernel-space is different (e.g. no libgcc). What to do?
Crawl Google for a while, notice that pretty much everyone addresses the unsigned variant:
#define UINT64 unsigned long long int
#define UINT32 unsigned long int
UINT64 divRound(UINT64 dividend, UINT64 divisor)
{
UINT32 quotient1 = dividend / divisor;
UINT32 modResult = dividend % divisor;
UINT32 multResult = modResult * 2;
UINT32 quotient2 = multResult / divisor;
UINT64 result = quotient1 + quotient2;
return ( result );
}
I know how to fix this one: Override udivdi3() and umoddi3() with do_div() from asm/div64.h. Done right? Wrong. Signed is not the same as unsigned, sdivdi3() does not simply call udivdi3(), they are separate functions for a reason.
Have you solved this problem? Do you know of a library that will help me do this? I'm really stuck, so whatever you might see here that I'm just not seeing right now would be really helpful.
Thanks,
Chad
This functionality was introduced in linux/lib/div64.c as early as kernel v2.6.22.
Here's my really naive solution. Your mileage may vary.
Keep a sign bit, which is sign(dividend) ^ sign(divisor). (Or *, or /, if you're storing your sign as 1 and -1, as opposed to false and true. Basically, negative if either one is negative, positive if none or both are negative.)
Then, call the unsigned division function on the absolute values of both. Then tack the sign back onto the result.
P.S. That is actually how __divdi3 is implemented in libgcc2.c (from GCC 4.2.3, the version that's installed on my Ubuntu system). I just checked. :-)
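A sketch of that sign-handling wrapper. Plain C division stands in for the unsigned 64-bit divide primitive (in the kernel that would be do_div() or __udivdi3, which this sketch does not reproduce):

```c
#include <stdint.h>

/* Stand-in for the kernel's unsigned 64-bit divide primitive. */
static uint64_t udiv64(uint64_t n, uint64_t d) { return n / d; }

/* Signed divide built on the unsigned one: record the result's sign,
   divide the absolute values, then restore the sign (truncates toward
   zero, like C's / operator). */
static int64_t sdiv64(int64_t n, int64_t d) {
    int negative = (n < 0) != (d < 0);              /* sign(n) ^ sign(d) */
    uint64_t un = n < 0 ? -(uint64_t)n : (uint64_t)n;
    uint64_t ud = d < 0 ? -(uint64_t)d : (uint64_t)d;
    uint64_t q = udiv64(un, ud);
    return negative ? -(int64_t)q : (int64_t)q;
}
```

Note that the unsigned negation -(uint64_t)n is well-defined even for INT64_MIN, which a naive labs()-style negation is not.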
I don't think Chris' answer works in this case (at least, I can't find a way to make it work), because do_div() actually changes the dividend in place. Taking the absolute value implies a temporary variable, whose value will change the way I require but can't be passed back out of my __divdi3() override.
I don't see a way around the parameter-by-value signature of __divdi3() at this point except to mimic the technique used by do_div().
It might seem like I'm bending over backwards here and should just come up with an algorithm to do the 64-bit/32-bit division I actually need. The added complication here though is that I have a bunch of numerical code using the '/' operator and would need to go through that code and replace every '/' with my function calls.
I'm getting desperate enough to do just that though.
Thanks for any follow-up,
Chad
ldiv ?
Edit: I reread the title, so you might want to ignore this. Or not, depending on whether there is an appropriate non-library version.
