When running the test suite of the igraph package with MemorySanitizer (clang 13.0), I am getting false positives in some (but not all) cases involving very long vararg lists, such as here:
https://github.com/igraph/igraph/blob/70e9e32748144a9387cb5d661487fe07b4bce271/tests/unit/global_transitivity.c#L57
Reducing the number of arguments passed to the function eliminates the error.
The program logic here is basically identical to the short program at the end of this post. However, that program cannot reproduce the issue with MemorySanitizer, no matter how many arguments I pass to the function. I get a stack overflow before any MemorySanitizer error is triggered. Thus, the problem must have to do with multiple nested calls, even though only one of those involves varargs.
Question: Has anyone seen similar false positives with MemorySanitizer? If yes, is this a bug in MemorySanitizer, or is there some option that controls how deep it looks into the stack to mark values as "initialized"? Is there a way to eliminate the problem without shortening the argument list?
Example program:
#include <stdio.h>
#include <stdarg.h>
/* Fills up 'arr' with int values passed to the function, up to and including the first -1 */
void f(int arr[], ...) {
va_list ap;
int i = 0;
va_start(ap, arr);
while (1) {
int num = va_arg(ap, int);
arr[i++] = num;
if (num == -1) {
break;
}
}
va_end(ap);
}
int main() {
int arr[200]; /* 'arr' is uninitialized here */
int i=0;
/* Initialize the first 101 elements of 'arr' using f(). */
f(arr, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53,
54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70,
71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100,
-1);
/* Print the initialized elements to verify whether
MemorySanitizer reports an error (a false positive). */
while (arr[i] != -1) {
printf("%d\n", arr[i]);
i++;
}
return 0;
}
For completeness, here is the MemorySanitizer output for the code I linked, although there aren't any hints I can see here.
==2908012==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f088a65d160 in igraph_vector_int_isininterval /home/szhorvat/Repos/igraph/src/core/vector.pmt:1834:27
#1 0x7f088a745a7b in igraph_create /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:74:23
#2 0x7f088a746454 in igraph_small /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:149:5
#3 0x49ce86 in main /home/szhorvat/Repos/igraph/tests/unit/global_transitivity.c:57:5
#4 0x7f088a1c50b2 in __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:308:16
#5 0x41d63d in _start (/home/szhorvat/Repos/igraph/build/tests/test_global_transitivity+0x41d63d)
Uninitialized value was stored to memory at
#0 0x44be92 in __interceptor_realloc (/home/szhorvat/Repos/igraph/build/tests/test_global_transitivity+0x44be92)
#1 0x7f088a65692f in igraph_vector_int_reserve /home/szhorvat/Repos/igraph/src/core/vector.pmt:471:11
#2 0x7f088a65692f in igraph_vector_int_push_back /home/szhorvat/Repos/igraph/src/core/vector.pmt:577:9
#3 0x7f088a746393 in igraph_small /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:145:9
#4 0x49ce86 in main /home/szhorvat/Repos/igraph/tests/unit/global_transitivity.c:57:5
#5 0x7f088a1c50b2 in __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:308:16
Uninitialized value was stored to memory at
#0 0x7f088a656ac4 in igraph_vector_int_push_back /home/szhorvat/Repos/igraph/src/core/vector.pmt:580:15
#1 0x7f088a746393 in igraph_small /home/szhorvat/Repos/igraph/src/constructors/basic_constructors.c:145:9
#2 0x49ce86 in main /home/szhorvat/Repos/igraph/tests/unit/global_transitivity.c:57:5
#3 0x7f088a1c50b2 in __libc_start_main /build/glibc-sMfBJT/glibc-2.31/csu/../csu/libc-start.c:308:16
Uninitialized value was created
<empty stack>
SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/szhorvat/Repos/igraph/src/core/vector.pmt:1834:27 in igraph_vector_int_isininterval
Exiting
#include <stdio.h>
typedef struct{
int len;
int vec[16];
}tvector;
int main(){
int elem, res;
tvector v;
v.len = 16;
v.vec[16] = {3, 15, 19, 19, 23, 32, 38, 53, 123, 321, 543, 1000, 1123, 6578, 6660, 7999};
I don't know what's wrong with it; I get the error on line 12, at the first bracket {
I've tried other ways around it, but they only made it worse, and the research I did didn't help.
Thanks.
You're attempting to use an initializer list in an assignment which isn't allowed. You also can't assign directly to an array, which is what you think you're doing but you're actually assigning to a single array element (and one past the end at that).
What you can do is initialize the struct at the time it is declared:
tvector v = { 16, {3, 15, 19, 19, 23, 32, 38, 53, 123, 321, 543, 1000, 1123, 6578, 6660, 7999} };
You can also use designated initializers:
tvector v = { .len = 16, .vec = {3, 15, 19, 19, 23, 32, 38, 53, 123, 321, 543, 1000, 1123, 6578, 6660, 7999} };
You don't have to worry about the order of initialization (here).
I want to read some data from a file in order to xor this with another sequence.
The content of the file is
00112233445566778899aabbccddeeff
The sequence this should be xored with is
000102030405060708090a0b0c0d0e0f
The result should be:
00102030405060708090a0b0c0d0e0f0
The reason I get a different result is that Rust reads the content as ASCII, like this:
buffer: [48, 48, 49, 49, 50, 50, 51, 51, 52, 52, 53, 53, 54, 54, 55, 55]
buffer: [56, 56, 57, 57, 97, 97, 98, 98, 99, 99, 100, 100, 101, 101, 102, 102]
Is there a way to read the content directly into an array of hex values, or how would one convert it?
You can use hex::decode (from the hex crate) to convert the hex string into bytes, and then use the ^ operator to XOR the bytes and get your result.
This does the job: it reads 16-byte blocks until the end of the file, converts each block to a &str, and then converts that into a vector of chars:
use std::io::Read;
use std::str;

let mut buffer = [0u8; 16];
while let Ok(n) = file.read(&mut buffer) {
    if n == 0 {
        break;
    }
    // Only the first n bytes were filled by this read.
    let s = match str::from_utf8(&buffer[..n]) {
        Ok(s) => s,
        Err(e) => panic!("Invalid UTF-8 sequence: {}", e),
    };
    let char_vec: Vec<char> = s.chars().collect();
    println!("Chars{:?}", char_vec);
}
I am new to C and by reading online I understand that with sizeof() I can have the memory that is allocated to it in bytes, and if I divide it by an element inside it or the data type, I can have the number of elements of an array.
I am trying to use this logic with a 2d multidimensional array and I'm having problems with the inner arrays.
Here is a code sample:
#include <stdio.h>
#define ARRAYLEN(arr) (sizeof(arr) / sizeof(arr[0]))
int main(void) {
int input[][20] = {
{90, 1349, 430, 198, 677, 1869, 1692, 1098, 761, 677, 1004 ,0},
{163, 642 ,2445, 1032, 2738 ,1591 ,3950 ,1600 ,651, 0},
{1730 ,3067 ,1956, 723 ,1307 ,417 ,2838 ,1486 ,3114 ,3698 ,1881 ,0},
{2337, 5131 ,1527 ,5042 ,953, 0},
{80, 389, 413 ,209 ,219, 100 ,191, 419, 181 ,473 ,271 ,0},
{22 ,3900 ,4057, 439 ,2642, 1447 ,3553, 2244, 3328, 3924, 1486, 400, 2394 ,0},
{2870, 621 ,3779, 3508, 3729, 2985, 1083, 1384, 3782 ,2606, 637, 0},
{1400, 108 ,472 ,1411, 10, 453, 1631, 1331, 0},
{808 ,1584, 2545, 2294, 1983, 842 ,447, 807 ,3711, 1067, 490, 0},
{435 ,14 ,261, 395, 340, 340, 25, 114, 178 ,52 ,232 ,19, 54, 0},
{6181 ,2026, 4061, 7796 ,5192 ,958, 4190, 965 ,2642, 5082, 2579, 1872 ,0},
{2030, 106, 579, 36, 1147 ,111 ,1393 ,459, 209, 1847, 1171, 415, 725, 1245, 0}
};
printf("%d", ARRAYLEN(input));
printf(" ");
printf("%d", ARRAYLEN(input[0]));
printf(" ");
printf("%d", sizeof(input[0]) / sizeof(int));
return 0;
}
The first printf() returns 12 which is right, but the second printf() (and third) return 20 which is the memory I've allocated to it but not the number of elements that each one has, which is what I am looking for in order to use on a for loop.
Can someone explain how can I do this? Or what I am doing wrong?
I can't find an answer/explanation anywhere.
Thanks in advance
Your program has potential undefined behavior on architectures where sizeof(size_t) != sizeof(int). Either use %zu in your printf format strings or cast the arguments as (int).
You should also parenthesize the ARRAYLEN macro argument more carefully.
Here is a modified version:
#include <stdio.h>
#define ARRAYLEN(arr) (sizeof(arr) / sizeof((arr)[0]))
int main(void) {
int input[][20] = {
{90, 1349, 430, 198, 677, 1869, 1692, 1098, 761, 677, 1004 ,0},
{163, 642 ,2445, 1032, 2738 ,1591 ,3950 ,1600 ,651, 0},
{1730 ,3067 ,1956, 723 ,1307 ,417 ,2838 ,1486 ,3114 ,3698 ,1881 ,0},
{2337, 5131 ,1527 ,5042 ,953, 0},
{80, 389, 413 ,209 ,219, 100 ,191, 419, 181 ,473 ,271 ,0},
{22 ,3900 ,4057, 439 ,2642, 1447 ,3553, 2244, 3328, 3924, 1486, 400, 2394 ,0},
{2870, 621 ,3779, 3508, 3729, 2985, 1083, 1384, 3782 ,2606, 637, 0},
{1400, 108 ,472 ,1411, 10, 453, 1631, 1331, 0},
{808 ,1584, 2545, 2294, 1983, 842 ,447, 807 ,3711, 1067, 490, 0},
{435 ,14 ,261, 395, 340, 340, 25, 114, 178 ,52 ,232 ,19, 54, 0},
{6181 ,2026, 4061, 7796 ,5192 ,958, 4190, 965 ,2642, 5082, 2579, 1872 ,0},
{2030, 106, 579, 36, 1147 ,111 ,1393 ,459, 209, 1847, 1171, 415, 725, 1245, 0}
};
printf("%d %d %d\n",
(int)ARRAYLEN(input),
(int)ARRAYLEN(input[0]),
(int)(sizeof(input[0]) / sizeof(int)));
return 0;
}
The output is 12 20 20, as expected:
12 is the number of elements in the array input: 12 rows of 20 int.
20 is the number of elements in the array input[0], as per the definition.
20 is again the number of elements in the array input[0] as int is the type of its elements.
The fact that you have fewer elements in the initializer for some or all of the sub-arrays does not change their size: each sub-array has 20 elements, as specified in the definition int input[][20], and the remaining elements are initialized to 0. Only the number of sub-arrays is determined by the compiler from the initializer.
You have an array of 12 rows of 20 elements each. The number of rows is defined by the number of initializing rows you give. The number of elements is determined by the number 20.
The size given by (sizeof(input) / sizeof(input[0])) is the number of rows.
The size given by (sizeof(input[0]) / sizeof(input[0][0])) is the number of elements in a row.
The size given by (sizeof(input) / sizeof(input[0][0])) is the total number of integers, i.e. rows x columns.
The following code prints the array:
for (size_t i = 0; i < ARRAYLEN(input); i++) {
    for (size_t j = 0; j < ARRAYLEN(input[0]); j++) {
        printf("%d ", input[i][j]);
    }
    printf("\n");
}
Your question can be reduced to:
int array[10] = {1,2,3};
ARRAYLEN(array) == 10
If you explicitly specify a size for your array (int x[N] instead of int x[]), then the array will always have the size you specified, regardless of the number of initializers inside the curly braces.
Elements without initializers are initialized with zeroes.
In other words, there is no difference between
int array[10] = {1,2,3};
and
int array[10] = {1,2,3,0,0,0,0,0,0,0};
Also, as others have noted, your macro should be defined as
#define ARRAYLEN(arr) (sizeof(arr) / sizeof((arr)[0]))
to avoid problems with operator precedence if the macro argument happens to be an expression rather than a simple array name.
The value you get from this macro should be printed with %zu, since it's of type size_t, not int.
I am writing a fast "8 bit reverse" routine for an AVR project with an ATmega2560 processor.
I'm using
GNU C (WinAVR 20100110) version 4.3.3 (avr) / compiled by GNU C version 3.4.5 (mingw-vista special r3), GMP version 4.2.3, MPFR version 2.4.1.
First I created a global lookup-table of reversed bytes (size: 0x100):
uint8_t BitReverseTable[]
__attribute__((__progmem__, aligned(0x100))) = {
0x00,0x80,0x40,0xC0,0x20,0xA0,0x60,0xE0,
0x10,0x90,0x50,0xD0,0x30,0xB0,0x70,0xF0,
[...]
0x1F,0x9F,0x5F,0xDF,0x3F,0xBF,0x7F,0xFF
};
This works as expected. This is the macro I intend to use, which should cost me only 5 cycles:
#define BITREVERSE(x) (__extension__({ \
register uint8_t b=(uint8_t)x; \
__asm__ __volatile__ ( \
"ldi r31, hi8(table)" "\n\t" \
"mov r30, ioRegister" "\n\t" \
"lpm ioRegister, z" "\n\t" \
:[ioRegister] "+r" (b) \
:[table] "g" (BitReverseTable) \
:"r30", "r31" \
); \
}))
The code to get it compiled (or not).
int main() /// Test for bitreverse
{
BITREVERSE(25);
return 0;
}
That's the error I get from the compiler:
c:/winavr-20100110/bin/../lib/gcc/avr/4.3.3/../../../../avr/bin/as.exe -mmcu=atmega2560 -o bitreverse.o C:\Users\xxx\AppData\Local\Temp/ccCefE75.s
C:\Users\xxx\AppData\Local\Temp/ccCefE75.s: Assembler messages:
C:\Users\xxx\AppData\Local\Temp/ccCefE75.s:349: Error: constant value required
C:\Users\xxx\AppData\Local\Temp/ccCefE75.s:350: Error: constant value required
I guess the problem is here:
:[table] "g" (BitReverseTable) \
From my point of view, BitReverseTable is the memory position of the array, which is fixed and known at compile time. Therefore it is constant.
Maybe I need to cast BitReverseTable into something (I tried everything I could think of). Maybe I need another constraint ("g" was my last test). I'm sure I tried everything possible and impossible.
I coded an assembler version, which works fine, but instead of being an inline assembly code, this is a proper function which adds another 6 cycles (for call and ret).
Any advice or suggestions are very welcome!
Full source of bitreverse.c on pastebin.
Verbose compiler output also on pastebin
The following does seem to work on avr-gcc (GCC) 4.8.2, but it does have a distinct hacky aftertaste to me.
Edited to fix the issues pointed out by the OP (Thomas) in the comments:
The high byte of Z register is r31 (I had r30 and r31 swapped)
Newer AVRs like the ATmega2560 also support lpm r,Z (older AVRs only support lpm r0,Z)
Thanks for the fixes, Thomas! I do have an ATmega2560 board, but I prefer Teensies (in part because of the native USB), so I only compile-tested the code, didn't run it to verify. I should have mentioned that; apologies.
const unsigned char reverse_bits_table[256] __attribute__((progmem, aligned (256))) = {
0, 128, 64, 192, 32, 160, 96, 224, 16, 144, 80, 208, 48, 176, 112, 240,
8, 136, 72, 200, 40, 168, 104, 232, 24, 152, 88, 216, 56, 184, 120, 248,
4, 132, 68, 196, 36, 164, 100, 228, 20, 148, 84, 212, 52, 180, 116, 244,
12, 140, 76, 204, 44, 172, 108, 236, 28, 156, 92, 220, 60, 188, 124, 252,
2, 130, 66, 194, 34, 162, 98, 226, 18, 146, 82, 210, 50, 178, 114, 242,
10, 138, 74, 202, 42, 170, 106, 234, 26, 154, 90, 218, 58, 186, 122, 250,
6, 134, 70, 198, 38, 166, 102, 230, 22, 150, 86, 214, 54, 182, 118, 246,
14, 142, 78, 206, 46, 174, 110, 238, 30, 158, 94, 222, 62, 190, 126, 254,
1, 129, 65, 193, 33, 161, 97, 225, 17, 145, 81, 209, 49, 177, 113, 241,
9, 137, 73, 201, 41, 169, 105, 233, 25, 153, 89, 217, 57, 185, 121, 249,
5, 133, 69, 197, 37, 165, 101, 229, 21, 149, 85, 213, 53, 181, 117, 245,
13, 141, 77, 205, 45, 173, 109, 237, 29, 157, 93, 221, 61, 189, 125, 253,
3, 131, 67, 195, 35, 163, 99, 227, 19, 147, 83, 211, 51, 179, 115, 243,
11, 139, 75, 203, 43, 171, 107, 235, 27, 155, 91, 219, 59, 187, 123, 251,
7, 135, 71, 199, 39, 167, 103, 231, 23, 151, 87, 215, 55, 183, 119, 247,
15, 143, 79, 207, 47, 175, 111, 239, 31, 159, 95, 223, 63, 191, 127, 255,
};
#define USING_REVERSE_BITS \
register unsigned char r31 asm("r31"); \
asm volatile ( "ldi r31,hi8(reverse_bits_table)\n\t" : [r31] "=d" (r31) )
#define REVERSE_BITS(v) \
({ register unsigned char r30 asm("r30") = v; \
register unsigned char ret; \
asm volatile ( "lpm %[ret],Z\n\t" : [ret] "=r" (ret) : [r30] "d" (r30), [r31] "d" (r31) ); \
ret; })
unsigned char reverse_bits(const unsigned char value)
{
USING_REVERSE_BITS;
return REVERSE_BITS(value);
}
void reverse_bits_in(unsigned char *string, unsigned char length)
{
USING_REVERSE_BITS;
while (length-->0) {
*string = REVERSE_BITS(*string);
string++;
}
}
For older AVRs that only support lpm r0,Z, use
#define REVERSE_BITS(v) \
({ register unsigned char r30 asm("r30") = v; \
register unsigned char ret asm("r0"); \
asm volatile ( "lpm %[ret],Z\n\t" : [ret] "=t" (ret) : [r30] "d" (r30), [r31] "d" (r31) ); \
ret; })
The idea is that we use a local reg var r31, to keep the high byte of the Z register pair. The USING_REVERSE_BITS; macro defines it in the current scope, using inline assembly for two purposes: to avoid an unnecessary load of the low part of the table address into a register, and to make sure GCC knows we have stored a value into it (because it is an output operand) without having any way of knowing what the value should be, thus hopefully retaining it throughout the scope.
The REVERSE_BITS() macro yields the result, telling the compiler it needs the argument in register r30, and the table address high byte set by USING_REVERSE_BITS; in r31.
Sounds a bit complicated, but that's just because I don't know how to explain it better. It really is quite simple.
Compiling the above with avr-gcc-4.8.2 -O2 -fomit-frame-pointer -mmcu=atmega2560 -S yields the assembly source. (I do recommend using -O2 -fomit-frame-pointer.)
Omitting comments and the normal directives:
.text
reverse_bits:
ldi r31,hi8(reverse_bits_table)
mov r30,r24
lpm r24,Z
ret
reverse_bits_in:
mov r26,r24
mov r27,r25
ldi r31,hi8(reverse_bits_table)
ldi r24,lo8(-1)
add r24,r22
tst r22
breq .L2
.L8:
ld r30,X
lpm r30,Z
st X+,r30
subi r24,1
brcc .L8
.L2:
ret
.section .progmem.data,"a",#progbits
.p2align 8
reverse_bits_table:
.byte 0
.byte -128
; Rest of data omitted for brevity
In case you are wondering, on ATmega2560 GCC puts the first 8-bit parameter and the 8-bit function result both in register r24.
The first function is optimal, as far as I can tell. (On older AVRs that only support lpm r0,Z, you get an added move to copy the result from r0 to r24.)
For the second function, the setup part might not be exactly optimal (for one, you could do the tst r22 breq .L2 first thing to speed up the zero-length-array check), but I'm not sure if I could write a faster/shorter one myself; it's certainly acceptable to me.
The loop in the second function looks optimal to me. The way it uses r30 I found strange and scary at first, but then I realized it makes perfect sense -- fewer registers used, and there is no harm in reusing r30 this way (even if it is low part of Z register too), because it will be loaded with a new value from string at the start of the next iteration.
Note that in my previous edit, I mentioned that swapping the order of the function parameters yielded better code, but with Thomas's additions, that is no longer the case. The registers change, that's it.
If you are sure you always supply a larger-than-zero length, using
void reverse_bits_in(unsigned char *string, unsigned char length)
{
USING_REVERSE_BITS;
do {
*string = REVERSE_BITS(*string);
string++;
} while (--length);
}
yields
reverse_bits_in:
mov r26,r24 ; 1 cycle
mov r27,r25 ; 1 cycle
ldi r31,hi8(reverse_bits_table) ; 2 cycles
.L4:
ld r30,X ; 2 cycles
lpm r30,Z ; 3 cycles
st X+,r30 ; 2 cycles
subi r22,lo8(-(-1)) ; 1 cycle
brne .L4 ; 2 cycles
ret ; 4 cycles
which starts to look downright impressive to me: ten cycles per byte, four cycles for setup, and three cycles cleanup (brne takes just one cycle if no jump). The cycle counts I listed off the top of my head, so there are likely small errors in 'em (a cycle here or there). r26:r27 is X, and the first pointer parameter to the function is supplied in r24:r25, with length in r22.
The reverse_bits_table is in the correct section, and correctly aligned. (.p2align 8 does align to 256 bytes; it specifies an alignment where the low 8 bits are zero.)
Although GCC is notorious for superfluous register moves, I really like the code it generates above. Sure, there is always room for finessing; for the important code sequences I recommend trying different variants, even changing the order of function parameters (or declaring loop variables in local scopes), and so on, then compile using -S to see the generated code. The AVR instruction timings are simple, so it is pretty easy to compare code sequences, to see if one is clearly better. I like to remove the directives and comments first; it makes it easier to read the assembly.
The reason for the hacky aftertaste is that the GCC documentation explicitly says that "Defining such a register variable does not reserve the register; it remains available for other uses in places where flow control determines the variable's value is not live", and I just don't trust that this means the same to the GCC developers as it means to me. Even if it did right now, it might not in the future; there is no standard GCC developers ought to adhere to here, since this is a GCC-specific feature.
On the other hand, I do only rely on documented GCC behaviour, above, and although "hacky", it does generate efficient assembly from straightforward C code.
Personally, I would recommend recompiling the above test code, and looking at the generated assembly (perhaps use sed to strip out the comments and labels, and compare to a known good version?), whenever you update avr-gcc.
Questions?