So I was browsing the Quake engine source code earlier today and stumbled upon some hand-written utility functions. One of them was 'Q_memcpy':
void Q_memcpy (void *dest, void *src, int count)
{
    int i;

    if (( ( (long)dest | (long)src | count) & 3) == 0 )
    {
        count>>=2;
        for (i=0 ; i<count ; i++)
            ((int *)dest)[i] = ((int *)src)[i];
    }
    else
        for (i=0 ; i<count ; i++)
            ((byte *)dest)[i] = ((byte *)src)[i];
}
I understand the whole premise of the function but I don't quite understand the reason for the bitwise OR between the source and destination address. My questions are as follows:
Why does 'count' get used in the same bitwise arithmetic?
Why are the last two bits of that result checked?
What purpose does this whole check serve?
I'm sure it's something obvious, but please excuse my ignorance: I haven't really delved into the lower-level side of programming. I just find it interesting and want to learn more.
It is finding out whether the source and destination pointers are int-aligned, and whether the count is an exact multiple of the int size.
If those three things are all true, the least-significant 2 bits of all of them will be 0 (assuming pointers and int are 4 bytes). So the algorithm ORs the three values together and isolates the least-significant 2 bits.
In this case, it copies int by int. Otherwise it copies char by char.
If the test fails, a more sophisticated algorithm would copy some of the leading and trailing bytes char by char and the intermediate bytes int by int.
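For illustration, here is a minimal sketch of that idea (not the Quake code; the function name and the 4-byte word assumption are mine). It only switches to word copies when source and destination share the same misalignment:

#include <stddef.h>
#include <stdint.h>

void copy_aligned_middle(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Word copies are only possible if both pointers can be aligned together. */
    if (((uintptr_t)d & 3) == ((uintptr_t)s & 3)) {
        /* Leading bytes: advance until dest (and therefore src) is 4-byte aligned. */
        while (count > 0 && ((uintptr_t)d & 3) != 0) {
            *d++ = *s++;
            count--;
        }
        /* Middle: copy 4 bytes at a time. */
        while (count >= 4) {
            *(uint32_t *)d = *(const uint32_t *)s;
            d += 4;
            s += 4;
            count -= 4;
        }
    }
    /* Trailing bytes, or the whole buffer if the alignments never match. */
    while (count-- > 0)
        *d++ = *s++;
}

Like the Quake version, this plays a bit loose with strict aliasing; in real code the library memcpy is the better choice anyway.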
The bitwise ORing and ANDing with 3 is to check whether the source, destination and count are all divisible by 4. If they are, the copy can work with 4-byte words (this code assumes int is 4 bytes). Otherwise the copy is performed bytewise.
It first tests if all 3 arguments are divisible by 4. If - and only if - they all are, it proceeds with copying 4 bytes at a time.
I.e. decoded, this would be
if ((long) src % 4 == 0 && (long) dest % 4 == 0 && count % 4 == 0)
{
    count = count / 4;
    for (i = 0; i < count; i++)
        ((int *)dest)[i] = ((int *)src)[i];
}
I am not sure whether they tested their compiler and it generated bad code even for such a test, and that is why they decided to write it in such a convoluted way. In any case, x | y | z guarantees that a bit n is set in the result if it is set in any of x, y or z. Therefore, if (x | y | z) & 3 results in 0, none of the numbers had either of the 2 lowest bits set, and therefore all are divisible by 4.
Of course it would be rather silly to use now - the standard library memcpy in recent library implementations is almost certainly better than this.
Therefore, on recent compilers you can optimize all calls to Q_memcpy by switching them to memcpy. GCC could generate things like 64-bit or SIMD moves with memcpy depending on the size of area to be copied.
I am analysing an Internet guide where I found code like this. Can somebody explain to me the usage of the ~ and & operators?
Thanks in advance
uint8_t tx_fifo_put(tx_dataType data)
{
    /* Check if FIFO is full */
    if((tx_put_itr - tx_get_itr) & ~(TXFIFOSIZE-1))
    {
        /* FIFO full - return TXFAIL */
        return (TXFAIL);
    }
    /* Put data into FIFO */
    TX_FIFO[tx_put_itr & (TXFIFOSIZE - 1)] = data;
    /* Increment iterator */
    tx_put_itr++;
    return(TXSUCCESS);
}
What the code does is replace more human-readable code with an obfuscated equivalent.
As a commenter wrote before me, TX_FIFO[tx_put_itr & (TXFIFOSIZE - 1)] = data; wraps the index around the buffer. Also, as was mentioned in the comments, the code requires the size to be a power of two.
I do not know why it is done this way; to me TX_FIFO[tx_put_itr % TXFIFOSIZE] = data does the same thing but is more readable. Also, a person expects predicate checks to come before the data access. At least that is my nature.
The (w - r) & ~(size - 1) part is a way to check for (1) w < r and, (2) as an edge case, w being equal to FIFOSIZE and r being zero. Semantically it should mean: "if the write pointer points to the boundary, and the read pointer points to the start of the buffer, we suggest that, for our data structure, the next write could be an overflow."
Let us see some code, numbers and their binary representation.
Let s = 8 - 1, which in binary is 00000111; negated it is 11111000.
Let w = 0 and r = 1.
In binary, w = 00000000 and r = 00000001.
w - r = 11111111; bitwise AND that with ~(8 - 1) and you get some value other than zero.
Continuing the logic for the w < r case, any negative difference will set some bits in the mask above, so this definitely gives true in the OP's if.
The w = r case cannot contribute any bits to the boolean test.
And the last case:
let s = 8,
let w = 8,
let r = 0.
w - r = 00001000
~(8 - 1) = 11111000
(w - r) & ~(8 - 1) = 00001000
All other cases where w > r (with a difference smaller than 8) give zero.
Update
To my great grief, user #UkropUkraine has deleted all comments and his answer. There was some discussion there about the fact that one can use (w - r) >= mask in place of (w - r) & mask.
Here I present some code, and an explanation that this is not an optimization, or just syntax, or whatever else came to the mind of the person who wrote the OP's code. It is intentional code. And it fails to do its purpose: to run as a FIFO or circular queue, or whatever that part of the code was meant to do.
First, take an example of usage, the part where the Ukrop user had difficulties. The w pointer can be less than the r pointer, and the result of w - r will be negative.
The common usage is to add a byte to the buffer and wrap the write pointer as soon as it reaches the end. Imagine a situation where the w pointer has already wrapped.
#include <stdio.h>

int main()
{
    unsigned char w = 0, r = 1;
    int result;

    result = (w - r) & 0xffffffff;
    printf("%d\n", result);
    return 0;
}
-1
I do not know what the common boolean result type is on microcontrollers. For a common x86 C machine, it is int. So I expect the result of (w - r) & ~(TXFIFOSIZE - 1) to be an int, and in this situation it is negative. You cannot just rewrite the above with >=, >, or == as was stated in the comments and the other answer here.
More than that, the code fails its own semantics. It is meant to be a FIFO, or something like it. But in the above situation the read pointer still has some sensible data to read, and reading it could be done, because the write pointer, even though it has wrapped, has not yet overwritten the read portion of the buffer. Yet the code reports the FIFO as full.
I thought about read/write going in different directions, but that does not change anything. The code the OP gave fails to do what one would expect.
Maybe I am missing some insight here, as the Ukrop user and the OP point out that they know the code's semantics; the OP just did not get the ~ and & usage. Well, this is an answer: the ~ and & combination is used to test for a negative value and for the edge cases.
The two operators:
& is a bitwise and operator
~ is a bitwise complement operator
Now for the posted code it's important to notice that TXFIFOSIZE must have a value which is a power of 2, i.e. values like 2, 4, 8, 16, 32, ...
When that is true, the code:
TX_FIFO[tx_put_itr & (TXFIFOSIZE - 1)] = data;
is equivalent to:
TX_FIFO[tx_put_itr % TXFIFOSIZE] = data;
Notice that tx_put_itr is incremented in such a way that it will take values higher than TXFIFOSIZE. So in order to get a valid array index, the code must take the remainder of tx_put_itr with respect to TXFIFOSIZE.
So how does it work? Why are the above lines equivalent?
Let's take a value as example.
Assume TXFIFOSIZE is 8 (2 to the power of 3)
So TXFIFOSIZE-1 is 7
7 is bitwise 00....00111
And when you do:
SOME_NUMBER & 00....00111
You keep the 3 least significant bits of SOME_NUMBER
And that is exactly the remainder when dividing by 8
So let's look at
if((tx_put_itr - tx_get_itr) & ~(TXFIFOSIZE-1))
It is equivalent to
if((tx_put_itr - tx_get_itr) >= TXFIFOSIZE)
So it checks for "FIFO full"
Again using an example it works like this:
Assume TXFIFOSIZE is 8 (2 to the power of 3)
So TXFIFOSIZE-1 is 7
7 is bitwise 00....00111
~7 is bitwise 11....11000
And when you do:
SOME_NUMBER & 11....11000
You clear the 3 least significant bits of SOME_NUMBER and keep the rest unchanged
So if the result is non-zero it means that the difference between
tx_put_itr and tx_get_itr is 8 (or more).
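As a quick illustration of both expressions, here is a minimal sketch of my own (illustrative names, assuming the size is a power of two and the indices are free-running unsigned counters):

#include <stdint.h>
#include <stdio.h>

#define FIFOSIZE 8u                         /* must be a power of two */

int main(void)
{
    uint8_t put = 13, get = 6;              /* free-running indices */
    uint8_t used = (uint8_t)(put - get);    /* 7 elements in the FIFO here */

    /* Index into the buffer: put % FIFOSIZE == put & (FIFOSIZE - 1). */
    printf("slot = %u\n", put & (FIFOSIZE - 1));    /* prints 5 */

    /* Full check: non-zero exactly when the difference is FIFOSIZE or more. */
    if (used & ~(FIFOSIZE - 1u))
        printf("full\n");
    else
        printf("%u elements used\n", (unsigned)used);   /* prints 7 */

    return 0;
}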
I'm making a function that takes a value using scanf_s and converts that into a binary value. The function works perfectly... until I put in a really high value.
I'm also doing this on VS 2019 in x64 in C
And in case it matters, I'm using
main(int argc, char* argv[])
for the main function.
Since I'm not sure what on earth is happening, here's the whole code I guess.
BinaryGet()
{
    // Declaring lots of stuff
    int x, y, z, d, b, c;
    int counter = 0;
    int doubler = 1;
    int getb;
    int binarray[2000] = { 0 };
    // I only have to change things to 1 now, aren't I smart?
    int binappend[2000] = { 0 };

    // Get number
    printf("Gimme a number\n");
    scanf_s("%d", &getb);

    // Because why not
    printf("\n");

    // Get the amount of binary places to be used (how many times getb divides by 2)
    x = getb;
    while (x > 1)
    {
        d = x;
        counter += 1;
        // Tried x /= 2, gave me infinity loop ;(
        x = d / 2;
    }

    // Fill the array with binary values (i.e. 1, 2, 4, 8, 16, 32, etc)
    for (b = 1; b <= counter; b++)
    {
        binarray[b] = doubler * 2;
        doubler *= 2;
    }

    // Compare the value of getb to binary values, subtract and repeat until getb = 0
    c = getb;
    for (y = counter; c >= 1; y--)
    {
        // Printing c at each subtraction
        printf("\n%d\n", c);
        // If the value of c (a temp variable) compares right to the binary value, subtract that binary value
        // and put a 1 in that spot in binappend, the 1 and 0 list
        if (c >= binarray[y])
        {
            c -= binarray[y];
            binappend[y] += 1;
        }
        // Prevents buffer under? runs
        if (y <= 0)
        {
            break;
        }
    }

    // Print the result
    for (z = 0; z <= counter; z++)
    {
        printf("%d", binappend[z]);
    }
}
The problem is that when I put in the value 999999999999999999 (18 digits) it just prints 0 once and ends the function. The value of the digits doesn't matter though, 18 ones will have the same result.
However, when I put in 17 digits, it gives me this:
99999999999999999
// This is the input value after each subtraction
1569325055
495583231
495583231
227147775
92930047
25821183
25821183
9043967
655359
655359
655359
655359
131071
131071
131071
65535
32767
16383
8191
4095
2047
1023
511
255
127
63
31
15
7
3
1
// This is the binary
1111111111111111100100011011101
The binary value it gives me is 31 digits. I thought that it was weird that at 32, a convenient number, it gimps out, so I put in the value of the 32nd binary place minus 1 (2,147,483,647) and it worked. But adding 1 to that gives me 0.
Changing the type of array (unsigned int and long) didn't change this. Neither did changing the value in the brackets of the arrays. I tried searching to see if it's a limit of scanf_s, but found nothing.
I know for sure (I think) it's not the arrays, but probably something dumb I'm doing with the function. Can anyone help please? I'll give you a long-distance high five.
The problem is indeed related to the power-of-two size of the number you've noticed, but it's in this call:
scanf_s("%d", &getb);
The %d argument means it is reading into a signed integer, which on your platform is probably 32 bits, and since it's signed it means it can go up to 2³¹-1 in the positive direction.
The conversion specifiers used by scanf() and related functions can accept larger data types, though. For example %ld will accept a long int, and %lld will accept a long long int. Check the data type sizes for your platform, because a long int and an int might actually be the same size (32 bits), e.g. on Windows.
So if you use %lld instead, you should be able to read larger numbers, up to the range of a long long int, but make sure you change the target (getb) to match! Also if you're not interested in negative numbers, let the type system help you out and use an unsigned type: %llu for an unsigned long long.
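For example, a minimal sketch of reading a larger, non-negative value (plain scanf is used here; scanf_s takes the same conversion specifiers for numeric fields):

#include <stdio.h>

int main(void)
{
    unsigned long long getb;

    printf("Gimme a number\n");
    if (scanf("%llu", &getb) != 1) {      /* check that the conversion succeeded */
        fprintf(stderr, "that was not a number\n");
        return 1;
    }
    printf("read %llu\n", getb);
    return 0;
}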
Some details:
If scanf or its friends fail, the value in getb is indeterminate, i.e. effectively uninitialised, and reading from it is undefined behaviour (UB). UB is an extremely common source of bugs in C, and you want to avoid it. Make sure your code only reads from getb if scanf tells you it worked.
In fact, in general it is not possible to avoid UB with scanf unless you're in complete control of the input (eg. you wrote it out previously with some other, bug free, software). While you can check the return value of scanf and related functions (it will return the number of fields it converts), its behaviour is undefined if, say, a field is too large to fit into the data type you have for it.
There's a lot more detail on scanf etc. here.
To avoid problems with not knowing what size an int is, or whether a long int differs from one platform to another, there is also the header stdint.h, which defines integer types of a specific width, e.g. int64_t. The companion header inttypes.h provides macros for use with scanf() like SCNd64. These are available from C99 onwards, but note that Windows' support of C99 in its compilers is incomplete and may not include this.
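A minimal sketch of that approach (assuming the headers are available on your toolchain):

#include <inttypes.h>   /* SCNd64 / PRId64; pulls in <stdint.h> for int64_t */
#include <stdio.h>

int main(void)
{
    int64_t value;

    if (scanf("%" SCNd64, &value) == 1)
        printf("read %" PRId64 "\n", value);
    return 0;
}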
Don't be so hard on yourself, you're not dumb, C is a hard language to master and doesn't follow modern idioms that have developed since it was first designed.
Is the following code, safe to iterate an array backward?
for (size_t s = array_size - 1; s != -1; s--)
array[s] = <do something>;
Note that I'm comparing s, which is unsigned, against -1;
Is there a better way?
This code is surprisingly tricky. If my reading of the C standard is correct, then your code is safe if size_t is at least as big as int. This is normally the case because size_t is usually implemented as something like unsigned long int.
In this case -1 is converted to size_t (the type of s). -1 can't be represented by an unsigned type, so we apply modulo arithmetic to bring it in range. This gives us SIZE_MAX (the largest possible value of type size_t). Similarly, decrementing s when it is 0 is done modulo SIZE_MAX+1, which also results in SIZE_MAX. Therefore your loop ends exactly where you want it to end, after processing the s = 0 case.
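You can check that conversion with a tiny test of my own:

#include <stdint.h>   /* SIZE_MAX */
#include <stdio.h>

int main(void)
{
    size_t s = -1;                    /* converted modulo SIZE_MAX + 1 */
    printf("%d\n", s == SIZE_MAX);    /* prints 1 */
    return 0;
}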
On the other hand, if size_t were something like unsigned short (and int bigger than short), then int could represent all possible size_t values and s would be converted to int. In other words, the comparison would be done as (int)SIZE_MAX != -1, which would always return false, thus breaking your code. But I've never seen a system where this could happen.
You can avoid any potential problems by using SIZE_MAX (which is provided by <stdint.h>) instead of -1:
for (size_t s = array_size - 1; s != SIZE_MAX; s--)
...
But my favorite solution is this:
for (size_t s = array_size; s--; )
...
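For instance, a complete sketch of that idiom (the array contents are just illustrative):

#include <stdio.h>

int main(void)
{
    int array[] = { 10, 20, 30, 40 };
    size_t array_size = sizeof array / sizeof array[0];

    for (size_t s = array_size; s--; )            /* visits 3, 2, 1, 0 */
        printf("array[%zu] = %d\n", s, array[s]);

    return 0;
}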
Well, s will never be -1, so your ending condition will never happen. s will go from 0 to SIZE_MAX, at which point your program will probably segfault from a memory access error. The better solution would be to start at the max size, and subtract one from everywhere you use it:
for (size_t s = array_size; s > 0; s--)
array[s-1] = <do something>;
Or you can combine this functionality into the for loop's syntax:
for (size_t s = array_size; s--;)
array[s] = <do something>;
This subtracts one before entering the loop body, but it checks for s == 0 before subtracting 1.
IMO, use a signed type of sufficient size for such iterations. It is easier for humans to read.
My project is to scan an address space (which in my case is 0x00000000 - 0xffffffff, or 0 to 2^32 - 1) for a pattern and return in an array the locations in memory where the pattern was found (could be found multiple times).
Since the address space is 32 bits, i is a double and max is pow(2,32) (also a double).
I want to keep the original value of i intact so that I can use that to report the location of where the pattern was found (since actually finding the pattern requires moving forward several bytes past i), so I want temp, declared as char *, to copy the value of i. Then, later in my program, I will dereference temp.
double i, max = pow(2, 32);
char *temp;

for (i = 0; i < max; i++)
{
    temp = (char *) i;
    //some code involving *temp
}
The issue I'm running into is a double can't be cast as a char *. An int can be; however, since the address space is 32 bits (not 16) I need a double, which is exactly large enough to represent 2^32.
Is there anything I can do about this?
In C, double and float are not represented the way you think they are; this code demonstrates that:
#include <stdio.h>

typedef union _DI
{
    double d;
    int i;
} DI;

int main()
{
    DI di;
    di.d = 3.00;
    printf("%d\n", di.i);
    return 0;
}
You will not see an output of 3 in this case.
In general, even if you could read another process's memory, your strategy is not going to work on any modern operating system because of virtual memory (the address space that one process "sees" doesn't necessarily (in fact, it usually doesn't) represent the physical memory on the system).
Never use a floating point variable to store an integer. Floating point variables make approximate computations. It would happen to work in this case, because the integers are small enough, but to know that, you need intimate knowledge of how floating point works on a particular machine/compiler and what range of integers you'll be using. Plus it's harder to write the program, and the program would be slower.
C defines an integer type that's large enough to store a pointer: uintptr_t. You can cast a pointer to uintptr_t and back. On a 32-bit machine, uintptr_t will be a 32-bit type, so it's only able to store values up to 2^32 - 1. To express a loop that covers the whole range of the type including the first and last value, you can't use an ordinary for loop with a variable that's incremented, because the ending condition requires a value of the loop index that's out of range. If you naively write
uintptr_t i;
for (i = 0; i <= UINTPTR_MAX; i++) {
    unsigned char *temp = (unsigned char *)i;
    // ...
}
then you get an infinite loop, because after the iteration with i equal to UINTPTR_MAX, running i++ wraps the value of i to 0. The fact that the loop is infinite can also be seen in a simpler logical way: the condition i <= UINTPTR_MAX is always true since all values of the type are less or equal to the maximum.
You can fix this by putting the test near the end of the loop, before incrementing the variable.
i = 0;
do {
    unsigned char *temp = (unsigned char *)i;
    // ...
    if (i == UINTPTR_MAX) break;
    i++;
} while (1);
Note that exploring 4GB in this way will be extremely slow, if you can even do it. You'll get a segmentation fault whenever you try to access an address that isn't mapped. You can handle the segfault with a signal handler, but that's tricky and slow. What you're attempting may or may not be what your teacher expects, but it doesn't make any practical sense.
To explore a process's memory on Linux, read /proc/self/maps to discover its memory mappings. See my answer on Unix.SE for some sample code in Python.
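If you want to stay in C, a minimal Linux-only sketch that just prints the current process's mappings might look like this (illustrative only, not a full pattern scanner):

#include <stdio.h>

int main(void)
{
    FILE *maps = fopen("/proc/self/maps", "r");
    char line[512];

    if (maps == NULL)
        return 1;
    /* Each line: start-end perms offset dev inode pathname */
    while (fgets(line, sizeof line, maps))
        fputs(line, stdout);
    fclose(maps);
    return 0;
}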
Note also that if you're looking for a pattern, you need to take the length of the whole pattern into account, a byte-by-byte lookup doesn't do the whole job.
Ahh, a school assignment. OK then.
uint32_t i;
for ( i = 0; i < 0xFFFFFFFF; i++ )
{
    char *x = (char *)i;
    // Do magic here.
}
// Also, the above code skips on 0xFFFFFFFF itself, so magic that one address here.
// But if your pattern is longer than 1 byte, then it's not necessary
// (in fact, use something less than 0xFFFFFFFF in the above loop then)
The cast of a double to a pointer is a constraint violation - hence the error.
A floating type shall not be converted to any pointer type. C11dr §6.5.4 4
To scan the entire 32-bit address space, use a do loop with an integer type capable of the [0 ... 0xFFFFFFFF] range.
uint32_t address = 0;
do {
    char *p = (char *) address;
    foo(p);
} while (address++ < 0xFFFFFFFF);
Recently, I wrote some code to compare pointers like this:
if(p1+len < p2)
however, some staff said that I should write it like this:
if(p2-p1 > len)
to be safe.
Here, p1 and p2 are char * pointers and len is an integer.
I have no idea about that. Is that right?
EDIT1: Of course, p1 and p2 point to the same memory object at the beginning.
EDIT2: Just a minute ago, I found the bug behind this question in my code (about 3K lines): len is so big that p1+len can't be stored in the 4 bytes of a pointer, so p1+len < p2 is true. But in fact it shouldn't be, so I think we should compare pointers like this in some situations:
if(p2 < p1 || (uint32_t)p2-p1 > (uint32_t)len)
In general, you can only safely compare pointers if they're both pointing to parts of the same memory object (or one position past the end of the object). When p1, p1 + len, and p2 all conform to this rule, both of your if-tests are equivalent, so you needn't worry. On the other hand, if only p1 and p2 are known to conform to this rule, and p1 + len might be too far past the end, only if(p2-p1 > len) is safe. (But I can't imagine that's the case for you. I assume that p1 points to the beginning of some memory-block, and p1 + len points to the position after the end of it, right?)
What they may have been thinking of is integer arithmetic: if it's possible that i1 + i2 will overflow, but you know that i3 - i1 will not, then i1 + i2 < i3 could either wrap around (if they're unsigned integers) or trigger undefined behavior (if they're signed integers) or both (if your system happens to perform wraparound for signed-integer overflow), whereas i3 - i1 > i2 will not have that problem.
Edited to add: In a comment, you write "len is a value from buff, so it may be anything". In that case, they are quite right, and p2 - p1 > len is safer, since p1 + len may not be valid.
"Undefined behavior" applies here. You cannot compare two pointers unless they both point to the same object or to the first element after the end of that object. Here is an example:
void func(int len)
{
    char array[10];
    char *p = &array[0], *q = &array[10];
    if (p + len <= q)
        puts("OK");
}
You might think about the function like this:
// if (p + len <= q)
// if (array + 0 + len <= array + 10)
// if (0 + len <= 10)
// if (len <= 10)
void func(int len)
{
    if (len <= 10)
        puts("OK");
}
However, the compiler knows that p + len <= q is true for all valid values of len (anything else would already be undefined behaviour), so it might optimize the function to this:
void func(int len)
{
    puts("OK");
}
Much faster! But not what you intended.
Yes, there are compilers that exist in the wild that do this.
Conclusion
This is the only safe version: subtract the pointers and compare the result, don't compare the pointers.
if (len <= q - p)
Technically, p1 and p2 must be pointers into the same array. If they are not in the same array, the behaviour is undefined.
For the addition version, the type of len can be any integer type.
For the difference version, the result of the subtraction is ptrdiff_t, but any integer type will be converted appropriately.
Within those constraints, you can write the code either way; neither is more correct. In part, it depends on what problem you're solving. If the question is 'are these two elements of the array more than len elements apart', then subtraction is appropriate. If the question is 'is p2 the same element as p1[len] (aka p1 + len)', then the addition is appropriate.
In practice, on many machines with a uniform address space, you can get away with subtracting pointers to disparate arrays, but you might get some funny effects. For example, if the pointers are pointers to some structure type, but not parts of the same array, then the difference between the pointers treated as byte addresses may not be a multiple of the structure size. This may lead to peculiar problems. If they're pointers into the same array, there won't be a problem like that — that's why the restriction is in place.
The existing answers show why if (p2-p1 > len) is better than if (p1+len < p2), but there's still a gotcha with it -- if p2 happens to point BEFORE p1 in the buffer and len is an unsigned type (such as size_t), then p2-p1 will be negative, but will be converted to a large unsigned value for comparison with the unsigned len, so the result will probably be true, which may not be what you want.
So you might actually need something like if (p1 <= p2 && p2 - p1 > len) for full safety.
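A small sketch of that gotcha and of the combined test (the values are illustrative):

#include <stdio.h>
#include <stddef.h>

int main(void)
{
    char buf[100];
    char *p1 = buf + 50;
    char *p2 = buf + 10;                 /* p2 points BEFORE p1 */
    size_t len = 20;

    /* p2 - p1 is -40, but converting it to the unsigned type of len turns it
     * into a huge positive value, so the naive test looks true. */
    if ((size_t)(p2 - p1) > len)
        printf("naive test: \"more than len apart\" (misleading)\n");

    /* The combined test rejects the case where p2 is before p1. */
    if (p1 <= p2 && (size_t)(p2 - p1) > len)
        printf("combined test: more than len apart\n");
    else
        printf("combined test: not more than len apart\n");

    return 0;
}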
As Dietrich already said, comparing unrelated pointers is dangerous, and could be considered as undefined behavior.
Given that two pointers are within the range 0 to 2GB (on a 32-bit Windows system), subtracting the 2 pointers will give you a value between -2^31 and +2^31. This is exactly the domain of a signed 32-bit integer. So in this case it does seem to make sense to subtract two pointers because the result will always be within the domain you would expect.
However, if the LargeAddressAware flag is enabled in your executable (this is Windows-specific, don't know about Unix), then your application will have an address space of 3GB (when run in 32-bit Windows with the /3G flag) or even 4GB (when run on a 64-bit Windows system).
If you then start to subtract two pointers, the result could be outside the domain of a 32-bit integer, and your comparison will fail.
I think this is one of the reasons why the address space was originally divided in 2 equal parts of 2GB, and the LargeAddressAware flag is still optional. However, my impression is that current software (your own software and the DLL's you're using) seem to be quite safe (nobody subtracts pointers anymore, isn't it?) and my own application has the LargeAddressAware flag turned on by default.
Neither variant is safe if an attacker controls your inputs
The expression p1 + len < p2 compiles down to something like p1 + sizeof(*p1)*len < p2, and the scaling with the size of the pointed-to type can overflow your pointer:
int *p1 = (int*)0xc0ffeec0ffee0000;
int *p2 = (int*)0xc0ffeec0ffee0400;
size_t len = 0x4000000000000000;
if(p1 + len < p2) {
    printf("pwnd!\n");
}
When len is multiplied by the size of int, it overflows to 0 so the condition is evaluated as if(p1 + 0 < p2). This is obviously true, and the following code is executed with a much too high length value.
Ok, so what about p2-p1 < len. Same thing, overflow kills you:
char *p1 = (char*)0x0123456789012345;
char *p2 = (char*)0xa123456789012345;
int len = 1;
if(p2 - p1 < len) {
    printf("pwnd!\n");
}
In this case, the difference between the pointers is evaluated as p2-p1 = 0xa000000000000000, which is interpreted as a negative signed value. As such, it compares smaller than len, and the following code is executed with a much too low len value (or a much too large pointer difference).
The only approach that I know is safe in the presence of attacker-controlled values, is to use unsigned arithmetic:
if(p1 < p2 &&
((uintptr_t)p2 - (uintptr_t)p1)/sizeof(*p1) < (uintptr_t)len
) {
printf("safe\n");
}
The p1 < p2 guarantees that p2 - p1 cannot yield a genuinely negative value. The second clause performs the actions of p2 - p1 < len while forcing use of unsigned arithmetic in a non-UB way. I.e. (uintptr_t)p2 - (uintptr_t)p1 gives exactly the count of bytes between the bigger p2 and the smaller p1, no matter the values involved.
Of course, you don't want to see such comparisons in your code unless you know that you need to defend against determined attackers. Unfortunately, it's the only way to be safe, and if you rely on either form given in the question, you open yourself up to attacks.