When running the test suite of the igraph package with MemorySanitizer (clang 13.0), I am getting false positives in some (but not all) cases involving very long vararg lists, such as here:
https://github.com/igraph/igraph/blob/70e9e32748144a9387cb5d661487fe07b4bce271/tests/unit/global_transitivity.c#L57
Reducing the number of arguments passed to the function eliminates the error.
The program logic here is basically identical to the short program at the end of this post. However, that program cannot reproduce the issue with MemorySanitizer, no matter how many arguments I pass to the function. I get a stack overflow before any MemorySanitizer error is triggered. Thus, the problem must have to do with multiple nested calls, even though only one of those involves varargs.
Question: Has anyone seen similar false positives with MemorySanitizer? If yes, is this a bug in MemorySanitizer, or is there some option that controls how deep it looks into the stack to mark values as "initialized"? Is there a way to eliminate the problem without shortening the argument list?
Example program:
#include <stdio.h>
#include <stdarg.h>

/* Fills up 'arr' with int values passed to the function, up to and including the first -1 */
void f(int arr[], ...) {
    va_list ap;
    int i = 0;

    va_start(ap, arr);
    while (1) {
        int num = va_arg(ap, int);
        arr[i++] = num;
        if (num == -1) {
            break;
        }
    }
    va_end(ap);
}

int main() {
    int arr[200]; /* 'arr' is uninitialized here */
    int i=0;

    /* Initialize the first 101 elements of 'arr' using f(). */
    f(arr, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
      20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
      37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
      54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
      71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
      88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100,
      -1);

    /* Print the initialized elements to verify whether
       MemorySanitizer reports an error (a false positive). */
    while (arr[i] != -1) {
        printf("%d\n", arr[i]);
        i++;
    }
    return 0;
}
For completeness, here is the MemorySanitizer output for the code I linked, although there aren't any hints I can see here.
==2908012==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f088a65d160 in igraph_vector_int_isininterval /home/szhorvat/Repos/igraph/src/core/vector.pmt:1834:27
#1 0x7f088a745a7b in igraph_create /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:74:23
#2 0x7f088a746454 in igraph_small /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:149:5
#3 0x49ce86 in main /home/szhorvat/Repos/igraph/tests/unit/global_transitivity.c:57:5
#4 0x7f088a1c50b2 in __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:308:16
#5 0x41d63d in _start (/home/szhorvat/Repos/igraph/build/tests/test_global_transitivity+0x41d63d)
Uninitialized value was stored to memory at
#0 0x44be92 in __interceptor_realloc (/home/szhorvat/Repos/igraph/build/tests/test_global_transitivity+0x44be92)
#1 0x7f088a65692f in igraph_vector_int_reserve /home/szhorvat/Repos/igraph/src/core/vector.pmt:471:11
#2 0x7f088a65692f in igraph_vector_int_push_back /home/szhorvat/Repos/igraph/src/core/vector.pmt:577:9
#3 0x7f088a746393 in igraph_small /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:145:9
#4 0x49ce86 in main /home/szhorvat/Repos/igraph/tests/unit/global_transitivity.c:57:5
#5 0x7f088a1c50b2 in __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:308:16
Uninitialized value was stored to memory at
#0 0x7f088a656ac4 in igraph_vector_int_push_back /home/szhorvat/Repos/igraph/src/core/vector.pmt:580:15
#1 0x7f088a746393 in igraph_small /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:145:9
#2 0x49ce86 in main /home/szhorvat/Repos/igraph/tests/unit/global_transitivity.c:57:5
#3 0x7f088a1c50b2 in __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:308:16
Uninitialized value was created
<empty stack>
SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/szhorvat/Repos/igraph/src/core/vector.pmt:1834:27 in igraph_vector_int_isininterval
Exiting
#include <stdio.h>
typedef struct{
    int len;
    int vec[16];
}tvector;
int main(){
    int elem, res;
    tvector v;
    v.len = 16;
    v.vec[16] = {3, 15, 19, 19, 23, 32, 38, 53, 123, 321, 543, 1000, 1123, 6578, 6660, 7999};
I don't know what's wrong with it; the compiler reports an error on line 12, at the first bracket {
I've tried other ways around it, but they only made things worse, and the research I did didn't help either.
Thanks.
You're attempting to use an initializer list in an assignment, which isn't allowed. You also can't assign directly to an array, which is what you think you're doing, but you're actually assigning to a single array element (and one past the end at that).
What you can do is initialize the struct at the time it is declared:
tvector v = { 16, {3, 15, 19, 19, 23, 32, 38, 53, 123, 321, 543, 1000, 1123, 6578, 6660, 7999} };
You can also use designated initializers:
tvector v = { .len = 16, .vec = {3, 15, 19, 19, 23, 32, 38, 53, 123, 321, 543, 1000, 1123, 6578, 6660, 7999} };
With this form you don't have to worry about the order of the initializers.
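If the values are only known after the declaration, two options remain in standard C (C99 and later): assign the whole struct from a compound literal, or copy into the array member with memcpy. A minimal sketch, reusing the tvector type from the question; the helper names fill_by_literal and fill_by_memcpy are made up for illustration:

```c
#include <string.h>

typedef struct {
    int len;
    int vec[16];
} tvector;

/* Assign a whole struct at once from a C99 compound literal.
   Elements of vec that are not listed are set to zero. */
void fill_by_literal(tvector *v)
{
    *v = (tvector){ .len = 3, .vec = {3, 15, 19} };
}

/* Copy values into the array member; arrays cannot be assigned with '='. */
void fill_by_memcpy(tvector *v)
{
    static const int values[16] = {3, 15, 19, 19, 23, 32, 38, 53,
                                   123, 321, 543, 1000, 1123, 6578, 6660, 7999};
    v->len = 16;
    memcpy(v->vec, values, sizeof values);
}
```

Either helper can be called on an already declared tvector, e.g. fill_by_memcpy(&v); after tvector v;.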
I am writing a fast "8 bit reverse" routine for an AVR project with an ATmega2560 processor.
I'm using
GNU C (WinAVR 20100110) version 4.3.3 (avr) / compiled by GNU C version 3.4.5 (mingw-vista special r3), GMP version 4.2.3, MPFR version 2.4.1.
First I created a global lookup-table of reversed bytes (size: 0x100):
uint8_t BitReverseTable[]
__attribute__((__progmem__, aligned(0x100))) = {
0x00,0x80,0x40,0xC0,0x20,0xA0,0x60,0xE0,
0x10,0x90,0x50,0xD0,0x30,0xB0,0x70,0xF0,
[...]
0x1F,0x9F,0x5F,0xDF,0x3F,0xBF,0x7F,0xFF
};
This works as expected. This is the macro I intend to use, which should cost only 5 cycles:
#define BITREVERSE(x) (__extension__({ \
register uint8_t b=(uint8_t)x; \
__asm__ __volatile__ ( \
"ldi r31, hi8(table)" "\n\t" \
"mov r30, ioRegister" "\n\t" \
"lpm ioRegister, z" "\n\t" \
:[ioRegister] "+r" (b) \
:[table] "g" (BitReverseTable) \
:"r30", "r31" \
); \
}))
The code I use to test whether it compiles (it does not):
int main() /// Test for bitreverse
{
BITREVERSE(25);
return 0;
}
That's the error I get from the compiler:
c:/winavr-20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/as.exe -mmcu=atmega2560 -o bitreverse.o C:\Users\xxx\AppData\Local\Temp/ccCefE75.s
C:\Users\xxx\AppData\Local\Temp/ccCefE75.s: Assembler messages:
C:\Users\xxx\AppData\Local\Temp/ccCefE75.s:349: Error: constant value required
C:\Users\xxx\AppData\Local\Temp/ccCefE75.s:350: Error: constant value required
I guess the problem is here:
:[table] "g" (BitReverseTable) \
From my point of view, BitReverseTable is the memory position of the array, which is fixed and known at compile time. Therefore it is constant.
Maybe I need to cast BitReverseTable into something (I tried everything I could think of). Maybe I need another constraint ("g" was my last test). I'm sure I tried everything possible and impossible.
I also coded a pure assembler version, which works fine, but since it is a proper function rather than inline assembly, it adds another 6 cycles (for call and ret).
Any advice or suggestions are very welcome!
Full source of bitreverse.c on pastebin.
Verbose compiler output also on pastebin
The following does seem to work on avr-gcc (GCC) 4.8.2, but it does have a distinct hacky aftertaste to me.
Edited to fix the issues pointed out by the OP (Thomas) in the comments:
The high byte of the Z register is r31 (I had r30 and r31 swapped)
Newer AVRs like the ATmega2560 also support lpm r,Z (older AVRs only support lpm r0,Z)
Thanks for the fixes, Thomas! I do have an ATmega2560 board, but I prefer Teensies (in part because of the native USB), so I only compile-tested the code, didn't run it to verify. I should have mentioned that; apologies.
const unsigned char reverse_bits_table[256] __attribute__((progmem, aligned (256))) = {
0, 128, 64, 192, 32, 160, 96, 224, 16, 144, 80, 208, 48, 176, 112, 240,
8, 136, 72, 200, 40, 168, 104, 232, 24, 152, 88, 216, 56, 184, 120, 248,
4, 132, 68, 196, 36, 164, 100, 228, 20, 148, 84, 212, 52, 180, 116, 244,
12, 140, 76, 204, 44, 172, 108, 236, 28, 156, 92, 220, 60, 188, 124, 252,
2, 130, 66, 194, 34, 162, 98, 226, 18, 146, 82, 210, 50, 178, 114, 242,
10, 138, 74, 202, 42, 170, 106, 234, 26, 154, 90, 218, 58, 186, 122, 250,
6, 134, 70, 198, 38, 166, 102, 230, 22, 150, 86, 214, 54, 182, 118, 246,
14, 142, 78, 206, 46, 174, 110, 238, 30, 158, 94, 222, 62, 190, 126, 254,
1, 129, 65, 193, 33, 161, 97, 225, 17, 145, 81, 209, 49, 177, 113, 241,
9, 137, 73, 201, 41, 169, 105, 233, 25, 153, 89, 217, 57, 185, 121, 249,
5, 133, 69, 197, 37, 165, 101, 229, 21, 149, 85, 213, 53, 181, 117, 245,
13, 141, 77, 205, 45, 173, 109, 237, 29, 157, 93, 221, 61, 189, 125, 253,
3, 131, 67, 195, 35, 163, 99, 227, 19, 147, 83, 211, 51, 179, 115, 243,
11, 139, 75, 203, 43, 171, 107, 235, 27, 155, 91, 219, 59, 187, 123, 251,
7, 135, 71, 199, 39, 167, 103, 231, 23, 151, 87, 215, 55, 183, 119, 247,
15, 143, 79, 207, 47, 175, 111, 239, 31, 159, 95, 223, 63, 191, 127, 255,
};
#define USING_REVERSE_BITS \
register unsigned char r31 asm("r31"); \
asm volatile ( "ldi r31,hi8(reverse_bits_table)\n\t" : [r31] "=d" (r31) )
#define REVERSE_BITS(v) \
({ register unsigned char r30 asm("r30") = v; \
register unsigned char ret; \
asm volatile ( "lpm %[ret],Z\n\t" : [ret] "=r" (ret) : [r30] "d" (r30), [r31] "d" (r31) ); \
ret; })
unsigned char reverse_bits(const unsigned char value)
{
    USING_REVERSE_BITS;
    return REVERSE_BITS(value);
}

void reverse_bits_in(unsigned char *string, unsigned char length)
{
    USING_REVERSE_BITS;
    while (length-->0) {
        *string = REVERSE_BITS(*string);
        string++;
    }
}
For older AVRs that only support lpm r0,Z, use
#define REVERSE_BITS(v) \
({ register unsigned char r30 asm("r30") = v; \
register unsigned char ret asm("r0"); \
asm volatile ( "lpm %[ret],Z\n\t" : [ret] "=t" (ret) : [r30] "d" (r30), [r31] "d" (r31) ); \
ret; })
The idea is that we use a local reg var r31, to keep the high byte of the Z register pair. The USING_REVERSE_BITS; macro defines it in the current scope, using inline assembly for two purposes: to avoid an unnecessary load of the low part of the table address into a register, and to make sure GCC knows we have stored a value into it (because it is an output operand) without having any way of knowing what the value should be, thus hopefully retaining it throughout the scope.
The REVERSE_BITS() macro yields the result, telling the compiler it needs the argument in register r30, and the table address high byte set by USING_REVERSE_BITS; in r31.
Sounds a bit complicated, but that's just because I don't know how to explain it better. It really is quite simple.
Compiling the above with avr-gcc-4.8.2 -O2 -fomit-frame-pointer -mmcu=atmega2560 -S yields the assembly source. (I do recommend using -O2 -fomit-frame-pointer.)
Omitting comments and the normal directives:
.text
reverse_bits:
ldi r31,hi8(reverse_bits_table)
mov r30,r24
lpm r24,Z
ret
reverse_bits_in:
mov r26,r24
mov r27,r25
ldi r31,hi8(reverse_bits_table)
ldi r24,lo8(-1)
add r24,r22
tst r22
breq .L2
.L8:
ld r30,X
lpm r30,Z
st X+,r30
subi r24,1
brcc .L8
.L2:
ret
.section .progmem.data,"a",@progbits
.p2align 8
reverse_bits_table:
.byte 0
.byte -128
; Rest of data omitted for brevity
In case you are wondering, on ATmega2560 GCC puts the first 8-bit parameter and the 8-bit function result both in register r24.
The first function is optimal, as far as I can tell. (On older AVRs that only support lpm r0,Z, you get an added move to copy the result from r0 to r24.)
For the second function, the setup part might not be exactly optimal (for one, you could do the tst r22 breq .L2 first thing to speed up the zero-length-array check), but I'm not sure if I could write a faster/shorter one myself; it's certainly acceptable to me.
The loop in the second function looks optimal to me. The way it uses r30 I found strange and scary at first, but then I realized it makes perfect sense -- fewer registers used, and there is no harm in reusing r30 this way (even if it is low part of Z register too), because it will be loaded with a new value from string at the start of the next iteration.
Note that in my previous edit, I mentioned that swapping the order of the function parameters yielded better code, but with Thomas's additions, that is no longer the case. The registers change, that's it.
If you are sure you always supply a larger-than-zero length, using
void reverse_bits_in(unsigned char *string, unsigned char length)
{
    USING_REVERSE_BITS;
    do {
        *string = REVERSE_BITS(*string);
        string++;
    } while (--length);
}
yields
reverse_bits_in:
mov r26,r24 ; 1 cycle
mov r27,r25 ; 1 cycle
ldi r31,hi8(reverse_bits_table) ; 2 cycles
.L4:
ld r30,X ; 2 cycles
lpm r30,Z ; 3 cycles
st X+,r30 ; 2 cycles
subi r22,lo8(-(-1)) ; 1 cycle
brne .L4 ; 2 cycles
ret ; 4 cycles
which starts to look downright impressive to me: ten cycles per byte, four cycles for setup, and three cycles cleanup (brne takes just one cycle if no jump). The cycle counts I listed off the top of my head, so there are likely small errors in 'em (a cycle here or there). r26:r27 is X, and the first pointer parameter to the function is supplied in r24:r25, with length in r22.
The reverse_bits_table is in the correct section, and correctly aligned. (.p2align 8 does align to 256 bytes; it specifies an alignment where the low 8 bits are zero.)
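To see why the 256-byte alignment matters: when the table starts at an address whose low byte is zero, the byte to be reversed can be placed directly in r30 and used as the low half of Z, with no addition needed. The sketch below models that address arithmetic in portable C, so it can be checked on a host machine, and adds a shift-based reference for the table contents. The names z_address and reverse8_reference are made up for illustration and are not part of the AVR code.

```c
#include <stdint.h>

/* Portable reference: reverse the bit order of one byte using shifts.
   Each table entry should equal reverse8_reference(index). */
uint8_t reverse8_reference(uint8_t v)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (v & 1u));  /* move lowest bit of v into r */
        v >>= 1;
    }
    return r;
}

/* Model of the Z pointer: r31 holds the high byte of the table address,
   r30 holds the value itself. This only equals table_base + value when
   the low byte of table_base is zero, i.e. the table is 256-byte aligned. */
uint16_t z_address(uint16_t table_base, uint8_t value)
{
    uint8_t r31 = (uint8_t)(table_base >> 8);
    uint8_t r30 = value;
    return (uint16_t)(((uint16_t)r31 << 8) | r30);
}
```

For example, reverse8_reference(25) gives 152, which matches entry 25 of the table listed earlier.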
Although GCC is notorious for superfluous register moves, I really like the code it generates above. Sure, there is always room for finessing; for the important code sequences I recommend trying different variants, even changing the order of function parameters (or declaring loop variables in local scopes), and so on, then compile using -S to see the generated code. The AVR instruction timings are simple, so it is pretty easy to compare code sequences, to see if one is clearly better. I like to remove the directives and comments first; it makes it easier to read the assembly.
The reason for the hacky aftertaste is that the GCC documentation explicitly says that "Defining such a register variable does not reserve the register; it remains available for other uses in places where flow control determines the variable's value is not live", and I just don't trust that this means the same to the GCC developers as it means to me. Even if it did right now, it might not in the future; there is no standard GCC developers ought to adhere to here, since this is a GCC-specific feature.
On the other hand, I do only rely on documented GCC behaviour, above, and although "hacky", it does generate efficient assembly from straightforward C code.
Personally, I would recommend recompiling the above test code, and looking at the generated assembly (perhaps use sed to strip out the comments and labels, and compare to a known good version?), whenever you update avr-gcc.
Questions?
Suppose a is defined as int a[100]. Typing print a makes gdb display it as an array automatically: 1, 2, 3, 4... However, if a is passed to a function as a parameter, gdb treats it as a plain int pointer, and print a displays (int *)0x7fffffffdaa0. What should I do if I want to view a as an array?
See here. In short you should do:
p *array@len
*(T (*)[N])p where T is the type, N is the number of elements and p is the pointer.
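The cast works in gdb for the same reason it works in C: inside the function the parameter is only a pointer, because the array decays when it is passed, so the length has to be supplied by hand. A small sketch of the same reinterpretation in C itself; the function name last_of_100 is hypothetical, and the length 100 is an assumption the caller must guarantee:

```c
#include <stddef.h>

/* 'a' arrives as a plain int*, even if the caller passed an int[100]. */
int last_of_100(int *a)
{
    /* Same shape as gdb's *(T (*)[N])p with T = int and N = 100. */
    int (*arr)[100] = (int (*)[100])a;

    /* Through the pointer-to-array type the element count is known again. */
    size_t n = sizeof *arr / sizeof (*arr)[0];

    return (*arr)[n - 1];
}
```

In gdb the matching command would be p *(int (*)[100])a.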
Use the x command.
(gdb) x/100w a
How to view or print any number of bytes from any array in any printf-style format using the gdb debugger
As @Ivaylo Strandjev says here, the general syntax is:
print *my_array@len
# OR the shorter version:
p *my_array@len
Example to print the first 10 bytes from my_array:
print *my_array@10
[Recommended!] Custom print formatting: if the output of the commands above looks like garbage because gdb tries to interpret the values as chars, you can force a different format like this:
print/x *my_array@10 = hex
print/d *my_array@10 = signed integer
print/u *my_array@10 = unsigned integer
print/<format> *my_array@10 = print using the gdb format letter <format> (x, d, u, o, t, c, ...)
Here are some real examples from my debugger to print 16 bytes from a uint8_t array named byteArray. Notice how ugly the first one is, with just p *byteArray@16:
(gdb) p *byteArray@16
$4 = "\000\001\002\003\004\005\006\a\370\371\372\373\374\375\376\377"
(gdb) print/x *byteArray@16
$5 = {0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff}
(gdb) print/d *byteArray@16
$6 = {0, 1, 2, 3, 4, 5, 6, 7, -8, -7, -6, -5, -4, -3, -2, -1}
(gdb) print/u *byteArray@16
$7 = {0, 1, 2, 3, 4, 5, 6, 7, 248, 249, 250, 251, 252, 253, 254, 255}
In my case, the best version, with the representation I actually want to see, is the last one, where I print the array as unsigned integers using print/u, since it is a uint8_t array after all:
(gdb) print/u *byteArray@16
$7 = {0, 1, 2, 3, 4, 5, 6, 7, 248, 249, 250, 251, 252, 253, 254, 255}
(int[100])*pointer worked for me, thanks to a suggestion in the comments by @Ruslan.