Why does this program print 23 after casting? - c

int main() {
uint32_t x;
uint32_t* p = (uint32_t*)malloc(sizeof(uint32_t));
uint32_t array[9] = {42, 5, 23, 82, 127, 21, 324, 3, 8};
*p = *(uint32_t*) ((char*) array + 8);
printf("l: %d\n", *p);
return 0;
}
Why does *p print the value of the second index of array after *p = *(uint32_t*) ((char*) array + 8); ?

Perhaps the question revolves around operator precedence and associativity. This expression ...
*(uint32_t*) ((char*) array + 8)
... is equivalent to ...
*((uint32_t*) (((char*) array) + 8))
... NOT ...
*((uint32_t*) ((char*) (array + 8))) // not equivalent
Cast operators have very high precedence, higher than binary + in particular. Pointer arithmetic works in units of the pointed-to type, and the type of ((char *) array) is, of course, char *. Its pointed-to type is char. Therefore, (((char*) array) + 8) evaluates to a pointer to the char 8 bytes past the beginning of the array.
Your implementation has 8-bit chars, which is exceedingly common (but not universal). Type uint32_t has exactly 32 bits, and so comprises four 8-bit chars. It follows that (((char*) array) + 8) points to the first char of the third uint32_t in array, as the 8 bytes skipped are exactly the bytes of the first two uint32_ts.
When that pointer is converted back to type uint32_t *, it points to the whole third uint32_t of the array, and the value of that element, 23, is read back by dereferencing the pointer with the unary * operator.

sizeof(uint32_t) is 4
sizeof(char) is 1 => 8 * sizeof(char) == 8
8 / sizeof(uint32_t) = 2
then ((char *)array) + 8 == array + (8 / sizeof(uint32_t))
p = array + (8 / sizeof(uint32_t)) => p = array + 2
then *p == array[2] == 23

Related

Why does *p0 have a different value every time I compile? Here is the code for the problem in C.

#include <stdio.h>
int main()
{
//type casting in pointers
int a = 500; //value is assigned
int *p; //pointer p
p = &a; //stores the address in the pointer
printf("p=%d\n*p=%d", p, *p);
printf("\np+1=%d\n*(p+1)=%d", p + 1, *(p + 1));
char *p0;
p0 = (char *)p;
printf("\n\np0=%d\n*p0=%d", p0, *p0);
return 0;
}
I was exploring pointers in the C language and ran into a problem with the value found at the address held by
the char pointer after I converted it from an integer pointer.
Please tell me how it works and explain.
To print a pointer use %p and cast the argument to (void*).
Like
printf("p=%p\n*p=%d", (void*)p, *p);
Reading p + 1, i.e. doing *(p + 1), is undefined behavior because p + 1 doesn't point to an int. So don't do that!
In a comment OP asks:
p=6487564 *p=500 p+1=6487568 *(p+1)=0 p0=6487564 *p0=-12
these are the output i am getting why *p0 is -12 pz explain
The decimal value 500 is the same as the hexadecimal value 0x000001F4. On a little endian machine (with 32 bit int) this is stored like:
p -> F4 01 00 00
Then you assign p0 the value of p so you have
p -> F4 01 00 00
^
|
p0
so p0 points to 0xF4 (assuming 8 bit char).
On a machine with signed chars, the hex value 0xF4 is the decimal value -12 (i.e. signed 8 bit 2's complement representation).
Conclusion: on a little endian machine with signed 8 bit chars, the printed value will be -12.
If you change
char *p0;
p0 = (char *)p;
to
unsigned char *p0;
p0 = (unsigned char *)p;
then it will print 244. That may be easier to understand because 500 is 256 + 244 (or in hex: 0x1F4 = 0x100 + 0xF4).

Can someone explain this bitwise C code?

I don't know what's going on with this:
#define _PACK32(str, x) \
{ \
*(x) = ((int) *((str) + 3) ) \
| ((int) *((str) + 2) << 8) \
| ((int) *((str) + 1) << 16) \
| ((int) *((str) + 0) << 24); \
}
str is an integer and x is an integer pointer.
Well, as mentioned, str is not an integer. It's a pointer, as it is being dereferenced with the * operator.
*((str) + 3) is equivalent to str[3]: it reads the element located 3 * sizeof(str[0]) bytes past str, so the result depends on the type of str. The same goes for the other dereferences.
So what's going on? Well, it takes the least significant 8 bits of str[0], str[1], str[2], and str[3], and assembles them into one 32-bit integer.
For instance, let W, X, Y, Z, A be arbitrary bits. Then,
*(str + 3) = WWWWWWWWWWWWWWWWWWWWWWWWXXXXXXXX
*(str + 2) = WWWWWWWWWWWWWWWWWWWWWWWWYYYYYYYY
*(str + 1) = WWWWWWWWWWWWWWWWWWWWWWWWZZZZZZZZ
*(str + 0) = WWWWWWWWWWWWWWWWWWWWWWWWAAAAAAAA
The last 3 are shifted, 8, 16, and 24, respectively, thus,
*(str + 3) = WWWWWWWWWWWWWWWWWWWWWWWWXXXXXXXX
*(str + 2) = WWWWWWWWWWWWWWWWYYYYYYYY00000000
*(str + 1) = WWWWWWWWZZZZZZZZ0000000000000000
*(str + 0) = AAAAAAAA000000000000000000000000
Note that the least significant bits of the last 3 are replaced with 0 during the shift.
Finally, they are ORed together, and the result is assigned to *x:
*x = AAAAAAAAZZZZZZZZYYYYYYYYXXXXXXXX
Edit: ORing is not as straightforward as it might seem, because the W bits could be anything. The result above is only clean when the W bits are all zero, which is the case when each element holds a non-negative value that fits in 8 bits.
Looks like str is a pointer to an array of 4 bytes, and x is a pointer to a 32-bit value. Since str[0] ends up in the highest 8 bits, str would actually point to the first byte (the most significant byte) of a big-endian 32-bit number, and this macro reads that number and stores it in the variable pointed to by x.
Correctly written as an inline function, this should look something like:
void pack32(void const*p, unsigned* x) {
unsigned char const* str = p; // read the input as raw unsigned bytes
*x = str[0]; // most significant byte first
*x = *x<<8 | str[1];
*x = *x<<8 | str[2];
*x = *x<<8 | str[3];
}
You should use unsigned types when you do bit shifting; otherwise the result can overflow. It perhaps also makes the idea clearer: the (supposedly) 8 bits of each byte are placed at different bit positions of the target *x.

Converting address of an array to other data types

int main()
{
char arr[5][7][6];
char (*p)[5][7][6] = &arr;
printf("%d\n", (&arr + 1) - &arr);
printf("%d\n", (char *)(&arr + 1) - (char *)&arr);
printf("%d\n", (unsigned)(arr + 1) - (unsigned)arr);
printf("%d\n", (unsigned)(p + 1) - (unsigned)p);
return 0;
}
When I run the above code I get following output:
1
210
42
210
Why is the output not 1 in every case?
Well, if I wanted to split hairs: in the first place, the code invokes undefined behavior throughout the printf() statements. The difference of two pointers has type ptrdiff_t, and the correct conversion specifier for that is %td, not %d.
The rest is only speculation. Let's suppose your system is reasonable, and numerically the pointer value &arr is always the same, whatever type it is converted to.
Now, (&arr + 1) - &arr is 1, of course, according to the rules of pointer arithmetic. (The actual difference between the two pointers is 210 * sizeof(char) bytes, but this is not school maths but pointer arithmetic, which is why the result is given in units of sizeof(T), where T is the base type of the pointer.)
Then (char *)(&arr + 1) - (char *)&arr casts the pointers to char *, and since the size of char is 1, this will print the difference in bytes; you're effectively tricking/abusing pointer arithmetic here.
Furthermore, printf("%d\n", (unsigned)(arr + 1) - (unsigned)arr) subtracts the numeric values of two pointers of type char (*)[7][6]. That's what arr decays into. Of course, 7 * 6 = 42, so arr + 1 and arr are 42 bytes apart.
p, however, is not a pointer to the first element of the array; it is a pointer to the array itself. Its type is char (*)[5][7][6]. If you print the difference using that type, but you stop the compiler from doing the division by fooling it into treating the pointers as plain unsigned values, then you get 5 * 7 * 6, which is 210.
Note that &arr is the address of the complete 3-dimensional char array, whereas arr points to the first element, which is a 2-dimensional char array. Something like the diagram below:
0xbf8ce2c6
+------------------+ ◄-- arr = 0xbf8ce2c6
| 0xbf8ce2f0 |
| +------------------+ ◄-- arr + 1 = 0xbf8ce2f0
| | 0xbf8ce31a | |
| | +------------------+ ◄-- arr + 2 = 0xbf8ce31a
| | 0xbf8ce344 | | |
| | | +------------------+ ◄-- arr + 3 = 0xbf8ce344
| | 0xbf8ce36e | | | |
| | | | +------------------+ ◄-- arr + 4 = 0xbf8ce36e
| | | | | | | | | |
+---|---|---|--|---+ | | | | Each are 7*6, 2-Dimensional
| | | | | | | | Consists Of 42 bytes
+---|---|--|-------+ | | |
| | | | | |
+---|--|-----------+ | |
| | | |
+--|---------------+ |
| |
+------------------+
The diagram shows:
1. How a 3-dimensional array can be interpreted as a series of 2-dimensional arrays
2. Here (arr + i) points to a 2-D array
3. Notice the difference between them: (arr + i + 1) - (arr + i) = 0x2a = 42 bytes, for each i in [0, 4]
The type of &arr is char (*)[5][7][6], that is, the address of a 3-D char array of dimensions [5][7][6].
Value-wise, the difference between &arr and &arr + 1 is 5 * 7 * 6 * sizeof(char) = 210,
because the size of char[5][7][6] is 5 * 7 * 6 * sizeof(char).
In your code, &arr points to the 3-D array and &arr + 1 points to the next 3-D array (which doesn't exist in our code).
Check this working code at codepad:
#include <stdio.h>
int main()
{
char arr[5][7][6];
printf(" &arr : %p\n", (void *)&arr);
printf(" &arr+1: %p\n", (void *)(&arr + 1));
return 0;
}
Output:
&arr : 0xbf5dd7de
&arr+1: 0xbf5dd8b0
The difference between the address values of &arr + 1 and &arr is 0xbf5dd8b0 - 0xbf5dd7de = 0xd2 = 210.
In your second printf:
printf("%d\n", (char *)(&arr + 1) - (char *)&arr);
You cast addresses of type char (*)[5][7][6] to plain char *, and because sizeof(char[5][7][6]) is 210, the two addresses are 210 bytes apart (remember sizeof(char) == 1). This is why it outputs 210.
Now, as I said at the start, arr is the address of the first element, which is a two-dimensional array of chars. The type of arr (after decay) is char (*)[7][6], and one element (a two-dimensional array) has size 7 * 6 * sizeof(char) = 42.
(Note: you can think of a 3-D array as a 1-D array in which each element is a 2-D array.)
In your third printf:
printf("%d\n", (unsigned)(arr + 1) - (unsigned)arr);
You cast to an unsigned value (not to an address/pointer type). The difference between arr + 1 and arr is 42 * sizeof(char) = 42 (which equals the size of char[7][6]). So the printf statement outputs 42.
Note: you should read "sizeof (int) == sizeof (void*)?", because you are casting an address to an integer value, and this conversion is not fully defined. (My explanation is with respect to your output and the output I have given.)
For further clarification, check the working code below at codepad:
#include <stdio.h>
int main()
{
char arr[5][7][6];
printf(" arr : %p\n", (void *)arr);
printf(" arr+1: %p\n", (void *)(arr + 1));
return 0;
}
Output is:
arr : 0xbf48367e
arr+1: 0xbf4836a8
Taking the difference between the address values of arr + 1 and arr gives 0xbf4836a8 - 0xbf48367e = 0x2a = 42.
Last printf:
printf("%d\n", (unsigned)(p + 1) - (unsigned)p);
This just takes the difference between &arr + 1 and &arr, which is 210 (similar to the second printf), because p is a pointer to the 3-D char array (p = &arr), and you are casting it to a value type (not a pointer type).
Additionally (I am adding this for understanding; I think readers will find it helpful):
Let's look at one more difference between arr and &arr, using the sizeof operator, which will help you understand the concept more deeply. For this, first read: sizeof Operator
When you apply the sizeof operator to an array identifier, the result is the size of the entire array rather than
the size of the pointer represented by the array identifier.
Check this working code at codepad:
#include <stdio.h>
int main()
{
char arr[5][7][6];
printf(" Sizeof(&arr) : %zu and value &arr: %p\n", sizeof(&arr), (void *)&arr);
printf(" Sizeof(arr) : %zu and value arr : %p\n", sizeof(arr), (void *)arr);
printf(" Sizeof(arr[0]): %zu and value a[0]: %p\n", sizeof(arr[0]), (void *)arr[0]);
return 0;
}
Its output:
Sizeof(&arr) : 4 and value &arr: 0xbf4d9eda
Sizeof(arr) : 210 and value arr : 0xbf4d9eda
Sizeof(arr[0]): 42 and value a[0]: 0xbf4d9eda
Here &arr is just an address; on this system an address is four bytes, and this one is the address of the complete 3-dimensional char array.
arr is the name of the 3-dimensional array, and the sizeof operator gives the total size of the array, which is 210 = 5 * 7 * 6 * sizeof(char).
As shown in my diagram, arr points to the first element, which is a 2-dimensional array. Because arr = (arr + 0), applying the * dereference operator to (arr + 0) gives the value at that address, so *(arr + 0) = arr[0].
Notice that sizeof(arr[0]) gives 42 = 7 * 6 * sizeof(char). This shows that, conceptually, a 3-dimensional array is nothing but an array of 2-dimensional arrays.
Because I wrote several times above that "size of char[5][7][6] is 5 * 7 * 6 * sizeof(char)", I am adding an interesting piece of code below, at codepad:
#include <stdio.h>
int main(){
printf(" Char : %zu \n", sizeof(char));
printf(" Char[6] : %zu \n", sizeof(char[6]));
printf(" Char[7][6] : %zu \n", sizeof(char[7][6]));
printf(" Char[5][7][6]: %zu \n", sizeof(char[5][7][6]));
return 0;
}
Output:
Char : 1
Char[6] : 6
Char[7][6] : 42
Char[5][7][6]: 210
In (&arr + 1) - &arr:
&arr is the address of an array of 5 arrays of 7 arrays of 6 char. Adding one produces the address of where the next array of 5 arrays of 7 arrays of 6 char would be, if we had an array of those objects instead of just one. Subtracting the original address, &arr, produces the difference between the two addresses. Per the C standard, this difference is expressed as the number of elements between the two addresses, where the element type is the type of object being pointed to. Since that type is an array of 5 arrays of 7 arrays of 6 char, the distance between the two addresses is one element. In other words, the distance from &arr to (&arr + 1) is one array of 5 arrays of 7 arrays of 6 char.
In (char *)(&arr + 1) - (char *)&arr:
&arr + 1 is again a pointer to where the next array of 5 arrays of 7 arrays of 6 char would be. When it is converted to a char *, the result is a pointer to what would be the first byte of that next array. Similarly, (char *)&arr is a pointer to the first byte of the first array. Then subtracting the two pointers yields the difference between them in elements. Since these pointers are pointers to char, the difference is produced as a number of char. So the difference is the number of bytes in an array of 5 arrays of 7 arrays of 6 char, which is 5•7•6 char, or 210 char.
In (unsigned)(arr + 1) - (unsigned)arr:
Since arr is not used with & (or sizeof or other exceptional cases), it is automatically converted from an array of 5 arrays of 7 arrays of 6 char to a pointer to the first element. Thus, it is a pointer to an array of 7 arrays of 6 char. Then arr + 1 is a pointer to the next array of 7 arrays of 6 char. When this address is converted to unsigned, the result, in the C implementation you are using, is in effect the memory address of the object. (This is common but is not guaranteed by the C standard and certainly breaks when addresses are 64 bits but unsigned is 32 bits.) Similarly, (unsigned)arr is the address of the first object. When the addresses are subtracted, the result is the distance between them in bytes. So the difference is the number of bytes in an array of 7 arrays of 6 char, which is 7•6 bytes, or 42 bytes. Note the key difference in this case: &arr is a pointer to an array of 5 arrays of 7 arrays of 6 char, but arr is a pointer to an array of 7 arrays of 6 char.
In (unsigned)(p + 1) - (unsigned)p:
p is a pointer to an array of 5 arrays of 7 arrays of 6 char. Then p+1 is a pointer to where the next array would be. Converting to unsigned acts as described above. Subtracting yields the difference in bytes, so it is the size of an array of 5 arrays of 7 arrays of 6 char, so it is again 210 bytes.
As an aside:
The type of (&arr + 1) - &arr is ptrdiff_t and should be printed with %td.
The type of (char *)(&arr + 1) - (char *)&arr is ptrdiff_t and should be printed with %td.
The type of (unsigned)(arr + 1) - (unsigned)arr is unsigned int and should be printed with %u.
The type of (unsigned)(p + 1) - (unsigned)p is unsigned int and should be printed with %u.

I cannot understand this behavior of struct pointers and XOR

I'm working with struct pointers for the first time, and I can't seem to make sense of what's happening here. My test applies the basic property of xor that says x ^ y ^ y = x, but not in C?
The below code is in my main program, and accurately restores all of the letters of "test" (which I proceed to print on screen, but I've omitted a lot of junk so as to keep this question short(er)). The struct "aes" refers to this definition:
typedef uint32_t word;
struct aes {
word iv[4];
word key[8];
word state[4];
word schedule[56];
};
As the context might suggest, the encapsulating project is an AES implementation (I'm trying to speed up my current one by trying new techniques).
In my testing, make_string and make_state work reliably, even in the functions in question, but for reference's sake:
void make_string (word in[], char out[]) {
for (int i = 0; i < 4; i++) {
out[(i * 4) + 0] = (char) (in[i] >> 24);
out[(i * 4) + 1] = (char) (in[i] >> 16);
out[(i * 4) + 2] = (char) (in[i] >> 8);
out[(i * 4) + 3] = (char) (in[i] );
}
}
void make_state(word out[], char in[]) {
for (int i = 0; i < 4; i++) {
out[i] = (word) (in[(i * 4) + 0] << 24) ^
(word) (in[(i * 4) + 1] << 16) ^
(word) (in[(i * 4) + 2] << 8) ^
(word) (in[(i * 4) + 3] );
}
}
Anyway, here is the block that DOES work. It's this functionality that I'm trying to modularize by stowing it away in a function:
char test[16] = {
'a', 'b', 'c', 'd',
'e', 'f', 'g', 'h',
'i', 'j', 'k', 'l',
'm', 'n', 'o', 'p'
};
struct aes cipher;
struct aes * work;
work = &cipher;
make_state(work->state, test);
work->state[0] ^= 0xbc6378cd;
work->state[0] ^= 0xbc6378cd;
make_string(work->state, test);
And while this code works, doing the same thing by passing it to a function does not:
void encipher_block (struct aes * work, char in[]) {
make_state(work->state, in);
work->state[0] ^= 0xff00cd00;
make_string(work->state, in);
}
void decipher_block (struct aes * work, char in[]) {
make_state(work->state, in);
work->state[0] ^= 0xff00cd00;
make_string(work->state, in);
}
Yet, by removing the make_state and make_string calls in both encipher and decipher, it works as expected!
make_state(work->state, test);
encipher_block(&cipher, test);
decipher_block(&cipher, test);
make_string(work->state, test);
So to clarify, I do not have a problem! I just want to understand this behavior.
Change char to unsigned char. char may be signed, and likely is on your system, which causes problems when converting to other integer types and when shifting.
In the expression (char) (in[i] >> 24) in make_string, an unsigned 32-bit integer is converted to a signed 8-bit integer (in your C implementation). This expression may convert values to a char that are not representable in a char, notably the values from 128 to 255. According to C 2011 6.3.1.3 3, the result is implementation-defined or an implementation-defined signal is raised.
In the expression (word) (in[(i * 4) + 3] ) in make_state, in[…] is a char, which is a signed 8-bit integer (in your C implementation). This char is converted to an int, per the usual integer promotions defined in C 2011 6.3.1.1 2. If the char is negative, then the resulting int is negative. Then, when it is converted to a word, which is unsigned, the effect is that the sign bit is replicated in the high 24 bits. For example, if the char has value -112 (0x90), the result will be 0xffffff90, but you want 0x00000090.
Change char to unsigned char throughout this code.
Additionally, in make_state, in[(i * 4) + 0] should be cast to word before the left shift. This is because it will start as an unsigned char, which is promoted to int before the shift. If it has some value with the high bit set, such as 0x80, then shifting it left 24 bits produces a value that cannot be represented in an int, such as 0x80000000. Per C 2011 6.5.7 4, the behavior is then undefined.
This will not be a problem in most C implementations; two’s complement is commonly used for signed integers, and the result will wrap as desired. Additionally, I expect this is a model situation that the compiler developers design for, since it is a very common code structure. However, to improve portability, casting to word will avoid the possibility of overflow.
The make_state() function overwrites the array passed in the first argument. If you put the encipher_block() and decipher_block() bodies inline, you get this:
/* encipher_block inline */
make_state(work->state, in);
work->state[0] ^= 0xff00cd00;
make_string(work->state, in);
/* decipher_block inline */
make_state(work->state, in); /* <-- Here's the problem */
work->state[0] ^= 0xff00cd00;
make_string(work->state, in);

How to align a pointer in C

Is there a way to align a pointer in C? Suppose I'm writing data to an array stack (so the pointer goes downward) and I want the next data I write to be 4-aligned so the data is written at a memory location which is a multiple of 4, how would I do that?
I have
uint8_t ary[1024];
ary = ary+1024;
ary -= /* ... */
Now suppose that ary points at location 0x05. I want it to point to 0x04.
Now I could just do
ary -= (ary % 4);
but C doesn't allow modulo on pointers. Is there any solution that is architecture independent?
Arrays are NOT pointers, despite anything you may have read in misguided answers here (meaning this question in particular or Stack Overflow in general — or anywhere else).
You cannot alter the value represented by the name of an array as shown.
What is confusing, perhaps, is that if ary is a function parameter, it will appear that you can adjust the array:
void function(uint8_t ary[1024])
{
ary += 213; // No problem because ary is a uint8_t pointer, not an array
...
}
Arrays as parameters to functions are different from arrays defined either outside a function or inside a function.
You can do:
uint8_t ary[1024];
uint8_t *stack = ary + 510;
uintptr_t addr = (uintptr_t)stack;
if (addr % 8 != 0)
addr += 8 - addr % 8;
stack = (uint8_t *)addr;
This ensures that the value in stack is aligned on an 8-byte boundary, rounded up. Your question asks for rounding down to a 4-byte boundary, so the code changes to:
if (addr % 4 != 0)
addr -= addr % 4;
stack = (uint8_t *)addr;
Yes, you can do that with bit masks too. Either:
addr = (addr + (8 - 1)) & -8; // Round up to 8-byte boundary
or:
addr &= -4; // Round down to a 4-byte boundary
This only works correctly if the alignment is a power of two, not for arbitrary values. The code with modulus operations will work correctly for any (positive) modulus.
See also: How to allocate aligned memory using only the standard library.
Demo code
Gnzlbg commented:
The code for a power of two breaks if I try to align e.g. uintptr_t(2) up to a 1 byte boundary (both are powers of 2: 2^1 and 2^0). The result is 1 but should be 2 since 2 is already aligned to a 1 byte boundary.
This code demonstrates that the alignment code is OK — as long as you interpret the comments just above correctly (now clarified by the 'either or' words separating the bit masking operations; I got caught when first checking the code).
The alignment functions could be written more compactly, especially without the assertions, but the compiler will optimize to produce the same code from what is written and what could be written. Some of the assertions could be made more stringent, too. And maybe the test function should print out the base address of the stack before doing anything else.
The code could, and maybe should, check that there won't be numeric overflow or underflow with the arithmetic. This would more likely be a problem if you aligned addresses to a multi-megabyte boundary; while you keep to alignments under 1 KiB, you're unlikely to find a problem if you're not attempting to go out of bounds of the arrays you have access to. (Strictly, even if you do multi-megabyte alignments, you won't run into trouble if the result will be within the range of memory allocated to the array you're manipulating.)
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
/*
** Because the test code works with pointers to functions, the inline
** function qualifier is moot. In 'real' code using the functions, the
** inline might be useful.
*/
/* Align upwards - arithmetic mode (hence _a) */
static inline uint8_t *align_upwards_a(uint8_t *stack, uintptr_t align)
{
assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
assert(stack != 0);
uintptr_t addr = (uintptr_t)stack;
if (addr % align != 0)
addr += align - addr % align;
assert(addr >= (uintptr_t)stack);
return (uint8_t *)addr;
}
/* Align upwards - bit mask mode (hence _b) */
static inline uint8_t *align_upwards_b(uint8_t *stack, uintptr_t align)
{
assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
assert(stack != 0);
uintptr_t addr = (uintptr_t)stack;
addr = (addr + (align - 1)) & -align; // Round up to align-byte boundary
assert(addr >= (uintptr_t)stack);
return (uint8_t *)addr;
}
/* Align downwards - arithmetic mode (hence _a) */
static inline uint8_t *align_downwards_a(uint8_t *stack, uintptr_t align)
{
assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
assert(stack != 0);
uintptr_t addr = (uintptr_t)stack;
addr -= addr % align;
assert(addr <= (uintptr_t)stack);
return (uint8_t *)addr;
}
/* Align downwards - bit mask mode (hence _b) */
static inline uint8_t *align_downwards_b(uint8_t *stack, uintptr_t align)
{
assert(align > 0 && (align & (align - 1)) == 0); /* Power of 2 */
assert(stack != 0);
uintptr_t addr = (uintptr_t)stack;
addr &= -align; // Round down to align-byte boundary
assert(addr <= (uintptr_t)stack);
return (uint8_t *)addr;
}
static inline int inc_mod(int x, int n)
{
assert(x >= 0 && x < n);
if (++x >= n)
x = 0;
return x;
}
typedef uint8_t *(*Aligner)(uint8_t *addr, uintptr_t align);
static void test_aligners(const char *tag, Aligner align_a, Aligner align_b)
{
const int align[] = { 64, 32, 16, 8, 4, 2, 1 };
enum { NUM_ALIGN = sizeof(align) / sizeof(align[0]) };
uint8_t stack[1024];
uint8_t *sp = stack + sizeof(stack);
int dec = 1;
int a_idx = 0;
printf("%s\n", tag);
while (sp > stack)
{
sp -= dec++;
uint8_t *sp_a = (*align_a)(sp, align[a_idx]);
uint8_t *sp_b = (*align_b)(sp, align[a_idx]);
printf("old %p, adj %.2d, A %p, B %p\n",
(void *)sp, align[a_idx], (void *)sp_a, (void *)sp_b);
assert(sp_a == sp_b);
sp = sp_a;
a_idx = inc_mod(a_idx, NUM_ALIGN);
}
putchar('\n');
}
int main(void)
{
test_aligners("Align upwards", align_upwards_a, align_upwards_b);
test_aligners("Align downwards", align_downwards_a, align_downwards_b);
return 0;
}
Sample output (partially truncated):
Align upwards
old 0x7fff5ebcf4af, adj 64, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4be, adj 32, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bd, adj 16, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bc, adj 08, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4bb, adj 04, A 0x7fff5ebcf4bc, B 0x7fff5ebcf4bc
old 0x7fff5ebcf4b6, adj 02, A 0x7fff5ebcf4b6, B 0x7fff5ebcf4b6
old 0x7fff5ebcf4af, adj 01, A 0x7fff5ebcf4af, B 0x7fff5ebcf4af
old 0x7fff5ebcf4a7, adj 64, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b7, adj 32, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b6, adj 16, A 0x7fff5ebcf4c0, B 0x7fff5ebcf4c0
old 0x7fff5ebcf4b5, adj 08, A 0x7fff5ebcf4b8, B 0x7fff5ebcf4b8
old 0x7fff5ebcf4ac, adj 04, A 0x7fff5ebcf4ac, B 0x7fff5ebcf4ac
old 0x7fff5ebcf49f, adj 02, A 0x7fff5ebcf4a0, B 0x7fff5ebcf4a0
old 0x7fff5ebcf492, adj 01, A 0x7fff5ebcf492, B 0x7fff5ebcf492
…
old 0x7fff5ebcf0fb, adj 08, A 0x7fff5ebcf100, B 0x7fff5ebcf100
old 0x7fff5ebcf0ca, adj 04, A 0x7fff5ebcf0cc, B 0x7fff5ebcf0cc
old 0x7fff5ebcf095, adj 02, A 0x7fff5ebcf096, B 0x7fff5ebcf096
Align downwards
old 0x7fff5ebcf4af, adj 64, A 0x7fff5ebcf480, B 0x7fff5ebcf480
old 0x7fff5ebcf47e, adj 32, A 0x7fff5ebcf460, B 0x7fff5ebcf460
old 0x7fff5ebcf45d, adj 16, A 0x7fff5ebcf450, B 0x7fff5ebcf450
old 0x7fff5ebcf44c, adj 08, A 0x7fff5ebcf448, B 0x7fff5ebcf448
old 0x7fff5ebcf443, adj 04, A 0x7fff5ebcf440, B 0x7fff5ebcf440
old 0x7fff5ebcf43a, adj 02, A 0x7fff5ebcf43a, B 0x7fff5ebcf43a
old 0x7fff5ebcf433, adj 01, A 0x7fff5ebcf433, B 0x7fff5ebcf433
old 0x7fff5ebcf42b, adj 64, A 0x7fff5ebcf400, B 0x7fff5ebcf400
old 0x7fff5ebcf3f7, adj 32, A 0x7fff5ebcf3e0, B 0x7fff5ebcf3e0
old 0x7fff5ebcf3d6, adj 16, A 0x7fff5ebcf3d0, B 0x7fff5ebcf3d0
old 0x7fff5ebcf3c5, adj 08, A 0x7fff5ebcf3c0, B 0x7fff5ebcf3c0
old 0x7fff5ebcf3b4, adj 04, A 0x7fff5ebcf3b4, B 0x7fff5ebcf3b4
old 0x7fff5ebcf3a7, adj 02, A 0x7fff5ebcf3a6, B 0x7fff5ebcf3a6
old 0x7fff5ebcf398, adj 01, A 0x7fff5ebcf398, B 0x7fff5ebcf398
…
old 0x7fff5ebcf0f7, adj 01, A 0x7fff5ebcf0f7, B 0x7fff5ebcf0f7
old 0x7fff5ebcf0d3, adj 64, A 0x7fff5ebcf0c0, B 0x7fff5ebcf0c0
old 0x7fff5ebcf09b, adj 32, A 0x7fff5ebcf080, B 0x7fff5ebcf080
DO NOT USE MODULO!!! IT IS REALLY SLOW!!! Hands down, the fastest way to align a pointer is to use 2's complement math. You need to invert the bits, add one, and use a mask to keep only the 2 (for 32-bit) or 3 (for 64-bit) least significant bits. The result is an offset that you then add to the pointer value to align it. Works great for 32 and 64-bit numbers. For 16-bit alignment, just mask the pointer with 0x1 and add that value. The algorithm works identically in any language, but as you can see, Embedded C++ is vastly superior to C in every way, shape, and form.
#include <cstdint>
/** Returns the number to add to align the given pointer to a 8, 16, 32, or 64-bit
boundary.
@author Cale McCollough.
@param ptr The address to align.
@return The offset to add to the ptr to align it. */
template<typename T>
inline uintptr_t MemoryAlignOffset (const void* ptr) {
return ((~reinterpret_cast<uintptr_t> (ptr)) + 1) & (sizeof (T) - 1);
}
/** Word aligns the given byte pointer up in addresses.
@author Cale McCollough.
@param ptr Pointer to align.
@return Next word aligned up pointer. */
template<typename T>
inline T* MemoryAlign (T* ptr) {
uintptr_t offset = MemoryAlignOffset<uintptr_t> (ptr);
char* aligned_ptr = reinterpret_cast<char*> (ptr) + offset;
return reinterpret_cast<T*> (aligned_ptr);
}
For a detailed write-up and proofs, please see https://github.com/kabuki-starship/kabuki-toolkit/wiki/Fastest-Method-to-Align-Pointers. If you would like to see proof of why you should never use modulo, I invented the world's fastest integer-to-string algorithm. The benchmark in the paper shows you the effect of optimizing away just one modulo instruction. Please see https://github.com/kabuki-starship/kabuki-toolkit/wiki/Engineering-a-Faster-Integer-to-String-Algorithm.
For some reason I can't use modulo or bitwise operations. In this case:
void *alignAddress = (void*)((((intptr_t)address + align - 1) / align) * align) ;
For C++:
template <int align, typename T>
constexpr T padding(T value)
{
return ((value + align - 1) / align) * align;
}
...
char* alignAddress = reinterpret_cast<char*>(padding<8>(reinterpret_cast<uintptr_t>(address)));
I'm editing this answer because:
I had a bug in my original code (I forgot a typecast to intptr_t), and
I'm replying to Jonathan Leffler's criticism in order to clarify my intent.
The code below is not meant to imply you can change the value of an array (foo). But you can get an aligned pointer into that array, and this example illustrates one way to do it.
#define alignmentBytes ( 1 << 2 ) // == 4, but enforces the idea that alignmentBytes should be a power of two
#define alignmentBytesMinusOne ( alignmentBytes - 1 )
uint8_t foo[ 1024 + alignmentBytesMinusOne ];
uint8_t *fooAligned;
fooAligned = (uint8_t *)((intptr_t)( foo + alignmentBytesMinusOne ) & ~alignmentBytesMinusOne);
Based on tricks learned elsewhere and one from reading @par's answer, apparently all I needed for my special case, which is for a 32-bit-like machine, is ((size - 1) | 3) + 1. It acts like this, and I thought it might be useful for others:
for (size_t size = 0; size < 20; ++size) printf("%zu\n", ((size - 1) | 3) + 1);
0
4
4
4
4
8
8
8
8
12
12
12
12
16
16
16
16
20
20
20
I'm using it to align pointers in C :
#include <inttypes.h>
static inline void * please_align(void * ptr){
char * res __attribute__((aligned(128))) ;
res = (char *)ptr + (128 - (uintptr_t) ptr) % 128;
return res ;
}

Resources