Confirming the cache alignment of an array - C

I have an array of structs which I wish to store cache aligned. I have read about the __CACHELINE_ALIGNED__ and __cacheline_aligned macros. Now I wish to confirm that an array of structs defined with this attribute is truly cache aligned.
struct test{
int val1;
int val2;
int val3;
} __CACHELINE_ALIGNED__;
struct test tArr[2];
Of course, if I print the size of tArr[0] I get 12, so I came up with the following test:
printf("address of first %p, second %p\n", (void *)&tArr[0], (void *)&tArr[1]);
And I get pointers located 12 bytes apart. Does that mean the structs are not cache aligned? How can I verify that they really are cache aligned?
thanks.

There is nothing called __CACHELINE_ALIGNED__ in standard C. Where did you read about it? With your code you are just defining a global variable called __CACHELINE_ALIGNED__.
If you want your structure to be cache line aligned, you have to specify the alignment in the way your compiler supports. For instance, assuming a 64-byte cache line and GCC, you would write:
struct test{
int val1;
int val2;
int val3;
} __attribute__ ((aligned (64)));
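One way to confirm the result is to check the address and the element stride directly. The following is a minimal sketch, assuming GCC or Clang and a 64-byte cache line; the constant CACHE_LINE and the helper names are hypothetical and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64 /* assumed line size; check your CPU */

struct test {
    int val1;
    int val2;
    int val3;
} __attribute__((aligned(CACHE_LINE)));

/* The attribute also pads the struct, so array elements are one line apart. */
static_assert(sizeof(struct test) == CACHE_LINE,
              "struct is not padded to a cache line");

struct test tArr[2];

/* Returns 1 if p sits on a CACHE_LINE boundary, 0 otherwise. */
int is_cache_aligned(const void *p)
{
    return ((uintptr_t)p % CACHE_LINE) == 0;
}

/* Distance in bytes between two consecutive array elements. */
size_t element_stride(void)
{
    return (size_t)((const char *)&tArr[1] - (const char *)&tArr[0]);
}
```

With the attribute applied, printing &tArr[0] and &tArr[1] should show addresses 64 bytes apart rather than 12.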

Related

Why does this structure code in C output differently? [duplicate]

I've read this about structure padding in C:
http://bytes.com/topic/c/answers/543879-what-structure-padding
and wrote this code after reading the article. As I understand it, it should print the size of 'struct pad' as 16 bytes and the size of 'struct pad2' as 12.
I compiled this code with gcc at different optimization levels, yet the sizeof() operator reports 16 bytes for both.
Why is that?
This information matters to me because of PS3 machines, where byte boundaries and making full use of DMA transfers are important:
#include <stdio.h>
#include <stdlib.h>
struct pad
{
char c1; // 1 byte
short s1; // 2 byte
short s2; // 2 byte
char c2; // 1 byte
long l1; // 4 byte
char c3; // 1 byte
};
struct pad2
{
long l1;
short s1;
short s2;
char c1;
char c2;
char c3;
};
int main(void)
{
struct pad P1;
printf("%d\n", sizeof(P1));
struct pad P2;
printf("%d\n", sizeof(P2));
return EXIT_SUCCESS;
}
There are two tricks that can be used to overcome this problem.
First, use the directive #pragma pack(push, 1) and then #pragma pack(pop). Example:
#pragma pack(push, 1)
struct tight{
short element_1;
int *element_2;
};
#pragma pack(pop)
Second, to check at compile time that the sizes of two structs are the same, use this trick:
char voidstr[(sizeof(struct1)==sizeof(struct2)) ? 1 : -1]; // compile error (negative array size) if the sizes differ
Your structures each include a long, which your platform apparently requires to be on a four-byte boundary. The structure must be at least as aligned as its most aligned member, so it has to be 4-byte aligned, and a structure's size has to be a multiple of its alignment in case it goes into an array.
Extra padding is required to make the long aligned, and so the smallest multiple of 4 that can hold all the members is 16.
Two pieces of advice:
You can compute the offset of a field l1 by
printf("Offset of field %s is %zu\n", "l1", offsetof(struct pad, l1));
To get the offsetof macro you will need to #include <stddef.h> (thanks caf!).
If you want to pack data as densely as possible, use unsigned char[4] instead of long and unsigned char[2] instead of short, and do the arithmetic to convert.
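The second piece of advice can be sketched as follows. The struct and the helper names store_u32_le and load_u32_le are hypothetical, and little-endian byte order is an arbitrary choice made for the example; a struct of only unsigned char members is 7 bytes here on mainstream ABIs, though the standard does not strictly forbid trailing padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Every member has alignment 1, so no padding is needed between them. */
struct dense {
    unsigned char c1;
    unsigned char s1[2]; /* stands in for a short */
    unsigned char l1[4]; /* stands in for a 4-byte long */
};

/* Store a 32-bit value into 4 bytes, least significant byte first. */
void store_u32_le(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v & 0xFFu);
    out[1] = (unsigned char)((v >> 8) & 0xFFu);
    out[2] = (unsigned char)((v >> 16) & 0xFFu);
    out[3] = (unsigned char)((v >> 24) & 0xFFu);
}

/* Rebuild the 32-bit value from the 4 stored bytes. */
uint32_t load_u32_le(const unsigned char in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

The cost is that every access goes through the conversion helpers, so this trades speed for density.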
EDIT: sizeof(struct pad2) is 12. Your code has a bug; structure P2 is declared with type struct pad. Try this:
#define xx(T) printf("sizeof(" #T ") == %zu\n", sizeof(T))
xx(struct pad);
xx(struct pad2);
P.S. I should definitely stop trying to answer SO questions after midnight.
On PS3, don't guess. Use __attribute__((aligned (16))), or similar. Not only does it guarantee that the start of the structure will be aligned on a proper boundary (if global or static), it also pads the structure to a multiple of your specified alignment.
Your code isn't showing what you think it is, because both P1 and P2 are defined as instances of struct pad. struct pad2 isn't ever used.
If I change the definition of P2 so that it is struct pad2, gcc does indeed decide to make it size 12.
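A sketch of the corrected test is shown below. It assumes a platform where short is 2 bytes and the long in the question is 4 bytes; to keep the result the same on platforms where long is 8 bytes, the example swaps in int16_t and int32_t, which is a substitution and not what the question used. The function name print_sizes is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct pad {
    char    c1;
    int16_t s1;
    int16_t s2;
    char    c2;
    int32_t l1; /* needs 4-byte alignment, so padding precedes it */
    char    c3;
};

struct pad2 {
    int32_t l1; /* largest member first leaves fewer gaps */
    int16_t s1;
    int16_t s2;
    char    c1;
    char    c2;
    char    c3;
};

void print_sizes(void)
{
    struct pad  P1;
    struct pad2 P2; /* the question declared this one as struct pad */
    printf("%zu\n", sizeof(P1));
    printf("%zu\n", sizeof(P2));
}
```

Note the %zu conversion: sizeof yields a size_t, so %d is not the matching format.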
struct pad P1;
printf("%d\n", sizeof(P1));
struct pad P2;
printf("%d\n", sizeof(P2));
P1 and P2 have the same type, "struct pad". Maybe you want to use "struct pad2" for P2.
Most CPUs expect built-in data types (int, float, char, double) to be stored in memory at their natural boundary, that is, at an address that is a multiple of their size. Structure padding is done so that data can be accessed faster.
For example, if an int is declared, it should be placed at an address that is a multiple of 4, as the size of int is 4 bytes.
Similarly, a double resides at an address that is a multiple of 8.
If memory is properly aligned, the CPU can access it more efficiently.
For the following examples, let us assume:
sizeof(int) = 4 bytes
sizeof(float) = 4 bytes
sizeof(char) = 1 byte
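As a minimal sketch of the natural-alignment rule under those assumptions, the following compares the offset the compiler actually chose with the offset the rule predicts. The struct char_then_int and the helpers align_up and print_layout are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct char_then_int {
    char c; /* offset 0 */
    int  i; /* pushed up to the next multiple of _Alignof(int) */
};

/* Round offset up to the next multiple of align (align must be nonzero). */
size_t align_up(size_t offset, size_t align)
{
    return (offset + align - 1) / align * align;
}

/* Print the alignment, member offset and total size chosen by the compiler. */
void print_layout(void)
{
    printf("alignof(int) = %zu\n", (size_t)_Alignof(int));
    printf("offsetof(i)  = %zu\n", offsetof(struct char_then_int, i));
    printf("sizeof       = %zu\n", sizeof(struct char_then_int));
}
```

With a 4-byte int, three padding bytes follow c, so i lands at offset 4 and the struct occupies 8 bytes.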

How to place global variables adjacent to each other (without using a linker script)?

I want to place 3 variables adjacent to each other, 32 bytes apart. This is for debugging a suspicious behavior.
For this line of C code (sparc, bare-metal, defining a global variable outside a function.)
int __attribute__ ((aligned (32))) xx0, layer_complete, xx1;
With just this code, the variables xx0, xx1 and layer_complete are aligned to 32 bytes, but some other variables are placed just after layer_complete. I want only one variable to be placed in each 32-byte range. (Having said that, I have an idea of using a union, but I'm curious whether I can do it without one.)
ADDED: I tried this with a union (to make some space after layer_complete)
union ttt {
int layer_complete;
int a[8]; // to make it 32 bytes
} __attribute__((aligned(32))) lc_union;
#define layer_complete lc_union.layer_complete
Inspecting program.map, I can see that layer_complete is 32-byte aligned and the following 28 bytes are not used (of course).
This should do the trick:
typedef struct
{
int _Alignas(32) xx0;
int _Alignas(32) layer_complete;
int _Alignas(32) xx1;
} thing;
...
thing t;
If for some reason you are using a very old version of gcc (pre-C11, 4.x or older), then you can also use the non-standard __attribute__ ((aligned (32))).
Since this is just for debugging purposes, you could make macros such as #define xx0 t.xx0 to make the struct compatible with what you already got.
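The layout and the compatibility macros can be checked at compile time. This is a minimal sketch assuming a C11 compiler; the helper is_aligned32 is a hypothetical name, and the macros are defined after the offsetof checks because they rewrite every later occurrence of the member names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    _Alignas(32) int xx0;
    _Alignas(32) int layer_complete;
    _Alignas(32) int xx1;
} thing;

/* Each member owns its own 32-byte slot. */
static_assert(offsetof(thing, xx0) == 0, "xx0 slot");
static_assert(offsetof(thing, layer_complete) == 32, "layer_complete slot");
static_assert(offsetof(thing, xx1) == 64, "xx1 slot");
static_assert(sizeof(thing) == 96, "three 32-byte slots");

thing t;

/* Compatibility macros: from here on, every xx0 token (even the one in
   a hand-written t.xx0) is rewritten, so use the bare names only. */
#define xx0 t.xx0
#define layer_complete t.layer_complete
#define xx1 t.xx1

/* Returns 1 if p sits on a 32-byte boundary, 0 otherwise. */
int is_aligned32(const void *p)
{
    return ((uintptr_t)p % 32) == 0;
}
```

Existing code that reads or writes layer_complete keeps compiling unchanged, while the 28 bytes after each variable stay unused.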
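With something like this (the buffer itself has to be 32-byte aligned, since a plain char array carries no such guarantee):
_Alignas(32) char memVar96[96];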
void *ptr1 = memVar96;
void *ptr2 = &memVar96[32];
void *ptr3 = &memVar96[64];
#define var1 (*(MyVarType1*)ptr1)
#define var2 (*(MyVarType2*)ptr2)
#define var3 (*(MyVarType3*)ptr3)

Default structure alignment for 32 bit processor word

I'm working with a struct compiled for a 32-bit ARM processor.
typedef struct structure {
short a;
char b;
double c;
int d;
char e;
}structure_t;
If I use nothing, __attribute__ ((aligned (8))) or __attribute__ ((aligned (4))), I get the same results in terms of structure size and element offsets. The total size is 24, so I think it is always aligning to 8 (the offsets in all three cases are a=0, b=2, c=8, d=16, e=20).
Why is 8 the default alignment chosen by the compiler? Shouldn't it be 4, since this is a processor with a 32-bit word?
Thanks in advance mates.
The aligned attribute only specifies a minimum alignment, not an exact one. From gcc documentation:
The aligned attribute can only increase the alignment; but you can decrease it by specifying packed as well.
And the natural alignment of double is 8 on your platform, so that is what is used.
So to get what you want you need to combine the aligned and packed attributes. With the following code, c has offset 4 (tested using offsetof).
typedef struct structure {
short a;
char b;
__attribute__((aligned(4))) __attribute__((packed)) double c;
int d;
char e;
} structure_t;
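The effect can be checked with offsetof. This is a minimal sketch assuming GCC or Clang; the attributes are written in the postfix form used in GCC's documentation, and the names structure_packed_t and offset_of_c are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct structure_packed {
    short a; /* offset 0 */
    char b;  /* offset 2 */
    /* packed drops the natural alignment of double, aligned(4) then
       raises it back to 4, so c follows b at the next multiple of 4. */
    double c __attribute__((packed, aligned(4)));
    int d;
    char e;
} structure_packed_t;

/* Offset of the double member inside the struct. */
size_t offset_of_c(void)
{
    return offsetof(structure_packed_t, c);
}
```

Keep in mind that a double at a 4-byte boundary may be slower to access, or even fault, on targets that require 8-byte alignment for it.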

C struct size alignment

I want the size of a C struct to be multiple of 16 bytes (16B/32B/48B/..).
It does not matter which size it gets to; it only needs to be multiple of 16 bytes.
How can I force the compiler to do that?
For Microsoft Visual C++ (#pragma pack only caps member alignment and does not pad the struct, so use __declspec(align) instead):
__declspec(align(16)) struct _some_struct
{
...
};
For GCC:
struct _some_struct { ... } __attribute__ ((aligned (16)));
Example:
#include <stdio.h>
struct test_t {
int x;
int y;
} __attribute__((aligned(16)));
int main()
{
printf("%zu\n", sizeof(struct test_t));
return 0;
}
Compiled with gcc -o main main.c, this outputs 16. Other compilers that accept the GCC attribute syntax, such as Clang, behave the same way.
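Where a C11 compiler is available, the standard _Alignas specifier on a member gives the same effect without a compiler extension. This is a minimal sketch; the helper names test_size and test_align are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct test_t {
    _Alignas(16) int x; /* raises the alignment of the whole struct to 16 */
    int y;
};

/* A struct's size is always a multiple of its alignment. */
static_assert(sizeof(struct test_t) % 16 == 0,
              "size must be a multiple of 16");

size_t test_size(void)
{
    return sizeof(struct test_t);
}

size_t test_align(void)
{
    return _Alignof(struct test_t);
}
```

The static_assert turns the size requirement into a build failure if someone later changes the struct in a way that breaks it.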
The size of a C struct depends on its members: their types and how many of them there are. There is no standard way to force the compiler to make a struct's size a multiple of some value. Some compilers provide a pragma that sets the alignment boundary, but that is really a different thing, although some may also offer a setting or pragma for the size.
However, if you insist on this, one method would be to allocate the struct dynamically and round the allocation size up to the next multiple of 16 bytes.
So if you had a struct like this.
struct _simpleStruct {
int iValueA;
int iValueB;
};
Then you could do something like the following.
{
struct _simpleStruct *pStruct = 0;
pStruct = malloc ((sizeof(*pStruct) + 15) / 16 * 16); // round the size up to a multiple of 16
// use the pStruct for whatever
free(pStruct);
}
What this would do is to push the size up to the next 16 byte size so far as you were concerned. However what the memory allocator does may or may not be to give you a block that is actually that size. The block of memory may actually be larger than your request.
If you are going to do something special with this, for instance, let's say you are going to write this struct to a file and you want to know the block size, then you would have to do the same calculation used in the malloc() call rather than using the sizeof() operator to calculate the size of the struct.
So the next thing would be to write your own sizeof() replacement using a macro such as:
#define SIZEOF16(x) ((sizeof(x) + 15) / 16 * 16)
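Two rounding formulas are common for this, and they differ only at exact multiples of 16. The sketch below puts them side by side; the function names next_block16 and round_up16 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Always moves to the following 16-byte block, so 16 becomes 32. */
size_t next_block16(size_t n)
{
    return (n / 16 + 1) * 16;
}

/* Rounds up to a multiple of 16 and leaves exact multiples unchanged. */
size_t round_up16(size_t n)
{
    return (n + 15) / 16 * 16;
}
```

For a struct whose size is already a multiple of 16, the first form wastes a whole extra block, which is why the second is usually what is wanted for file or DMA block sizes.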
As far as I know there is no dependable method for pulling the size of an allocated block from a pointer. Normally a pointer will have a memory allocation block that is used by the memory heap management functions that will contain various memory management information such as the allocated block size which may actually be larger than the requested amount of memory. However the format for this block and where it is located relative to the actual memory address provided will depend on the C compiler's run time.
This depends entirely on the compiler and other tools, since alignment is not specified that deeply in the ISO C standard (it specifies that alignment may happen at the compiler's behest but does not go into detail as to how to enforce it).
You'll need to look into the implementation-specific stuff for your compiler toolchain. It may provide a #pragma pack (or align or some other thing) that you can add to your structure definition.
It may also provide this as a language extension. For example, gcc allows you to add attributes to a definition, one of which controls alignment:
struct mystruct { int val[7]; } __attribute__ ((aligned (16)));
You could perhaps wrap your actual struct in a union whose second member pads it out to a multiple of 16 bytes (a wrapping struct would add the padding on top of the payload instead of rounding it up):
struct payload {
int a; /*Your actual fields. */
float b;
char c;
double d;
};
union payload_padded {
struct payload p;
char padding[16 * ((sizeof (struct payload) + 15) / 16)];
};
Then you can work with the padded type:
union payload_padded a;
a.p.d = 43.3;
Of course, you can make use of the fact that every member of a union starts 0 bytes from where the union starts, and treat a pointer to union payload_padded as if it's a pointer to a struct payload (because it is):
double d_plus_2(const struct payload *p)
{
return p->d + 2;
}
/* ... */
union payload_padded b;
const double dp2 = d_plus_2((struct payload *) &b);
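Whichever wrapper is used, a static_assert on the final size catches arithmetic slips at compile time. This is a minimal sketch using a union wrapper; the names PADDED16 and payload_box are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct payload {
    int a;
    float b;
    char c;
    double d;
};

/* Size of T rounded up to a multiple of 16. */
#define PADDED16(T) (16 * ((sizeof(T) + 15) / 16))

/* The byte array overlays the payload and sets the total size. */
union payload_box {
    struct payload p;
    unsigned char raw[PADDED16(struct payload)];
};

static_assert(sizeof(union payload_box) % 16 == 0,
              "wrapper size must be a multiple of 16");

double d_plus_2(const struct payload *p)
{
    return p->d + 2;
}
```

Passing &box.p instead of casting the wrapper pointer keeps the call type-safe while reaching the same address.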

