Is it a good practice to group all the variables of same type together while declaring in local scope within a function? If yes, why? Does it solve memory alignment issues?
I think it mattered with the VAX C compiler I used 20 years ago, but not with any modern compiler. It is not safe to assume that local variables will be in any particular order, certainly not safe to assume they will be in the order you declared them. I have definitely seen the MSVC compiler reorder them.
Grouping variables of the same type does help when they are fields of a struct, because the ordering of fields of a struct is guaranteed to match the order of declaration.
It depends on the compiler; i.e. the compiler will layout memory as it sees fit. So other than being good style, it has no effect (at least in any modern compilers I've used).
In general it will not help for local variables. There are optimization rules the compiler can apply, and additional "pragma" directives that can be used to manipulate the alignment.
It will not solve alignment issues, because there shouldn't be any: the compiler will lay out your local variables correctly aligned.
The only issue that grouping like-aligned types might have is to reduce use of the stack, but compilers are free to reorder the layout of variables on the stack anyway (or even reuse locations for different local variables at different times, or to keep locals in registers and not ever have them on the stack), so you're generally not buying anything for an optimized compile.
If you're going to be 'type punning' items on the stack, you'll need to use the same methods for alignment safety that you'd use for data off the stack - maybe more, since memory allocated by malloc() or new is guaranteed to be appropriately aligned for any type - that guarantee is not made for storage allocated to automatic variables.
'Type punning' is when you circumvent the type system, such as by accessing the bytes in a char array as an int by casting a char* to an int*:
char data[4];
fill_data( data, sizeof(data));
int x = *(int*) data;
Since the alignment requirement of a char[] might be different from that of an int, the above access of data through an int* might not be 'alignment safe'. However, since malloc() is specified to return pointers suitably aligned for any type, the following should not have any alignment problems:
int x;
char* pData = malloc(4);
if (!pData) exit(-1);
fill_data(pData, 4);
x = *(int*) pData;
However, note that sizeof(int) might not be 4 and int types might be little- or big-endian, so there are still portability issues with the above code - just not alignment issues. There are other ways of performing type punning, including accessing data through different members of a union, but those have their own portability issues - notably, reading a member other than the one last written reinterprets the stored bytes, so the result depends on the representation (and in C89 it was unspecified).
Padding and alignment issues only matter for structs, not local variables, because the compiler can put local variables in whatever order it wants. As for why it matters in structs -
Many C compilers will align struct members by inserting padding bytes between them. For example, if you have a struct S { int a; char b; int c; char d; int e; }, and the target hardware requires that ints be aligned on 4-byte boundaries, then the compiler will insert three bytes of padding between b and c and between d and e, wasting 6 bytes of memory per instance. On the other hand, if the members were in the order a, c, e, b, d, then it will insert two bytes of padding at the end (so that the size of S as a whole is a multiple of 4, so the members will be properly aligned when in arrays), wasting only 2 bytes per instance. The rules are very much platform- and compiler-specific; some compilers may rearrange members to avoid padding, and some have extensions to control the padding and alignment rules in case you need binary compatibility. In general, you should only care about alignment if you're either reading/writing structs directly and depending on them having the same layout (which is usually a bad idea), or you expect to have lots of instances and memory is at a premium.
Related
struct Foo {
    int a;
    char b;
};
Will it be guaranteed in this case that b will have an offset of sizeof(int) in the struct? Will it be guaranteed that members will be packed together as long as all alignment requirements are being met, no padding required (Not taking into account the padding at the end to align the structures size to the largest member)?
I am asking this because I would like to know if simply fwrite()ing a struct to a save file can cause problems if the layout of a struct is not consistent across platforms, because then each save file would be specific to the platform on which it was created.
There are no guarantees. If a compiler wishes to insert unnecessary padding between structure members, it can do so. (For example, it might have determined that a particular member could be handled much more efficiently were it eight-byte aligned, even though it doesn't require alignment at all.)
Portable code intended for interoperability should not fwrite structs. Aside from the potential of differences in padding, not all platforms use the same endianness, nor the same floating point representation (if that's relevant).
Strictly speaking, the C standard makes no guarantees regarding padding inside of a struct, other than no padding at the beginning.
That being said, most implementations you're likely to come across tend to perform padding in a consistent manner (see The Lost Art of Structure Packing). In your particular example, however, there's a potential for inconsistency, as an int is not necessarily the same size on all platforms.
If you used fixed width types such as int32_t and int8_t instead of int and char then you're probably OK. Adding a static_assert for the size of the struct can help enforce this.
You do however need to worry about endianness, converting each field to a known byte order before saving and back to the host byte order after reading back.
We know that there is padding in some structures in C. Please consider the following 2:
struct node1 {
    int a;
    int b;
    char c;
};
struct node2 {
    int a;
    char c;
    int b;
};
Assuming sizeof(int) = alignof(int) = 4 bytes:
sizeof(node1) = sizeof(node2) = 12, due to padding.
What is the performance difference between the two? (if any, w.r.t. the compiler or the architecture of the system, especially with GCC)
These are bad examples - in this case it doesn't matter, since the amount of padding will be the same in either case. There will not be any performance differences.
The compiler will always strive to fill up the trailing padding at the end of a struct - otherwise arrays of structs wouldn't be feasible, since the first member must always be aligned. If not for the trailing padding in struct_array[0], the first member of struct_array[1] would end up misaligned.
The order would matter if we were to do this though:
struct node3 {
    int a;
    char b;
    int c;
    char d;
};
Assuming 4 byte int and 4 byte alignment, then b occupies 1+3 bytes here, and d an additional 1+3 bytes. This could have been written better if the two char members were placed adjacently, in which case the total amount of padding would just have been 2 bytes.
I would not be surprised if the interviewer's opinion was based on the old argument of backward compatibility when extending the struct in the future. Additional fields (char, smallint) may benefit from the space occupied by the trailing padding, without the risk of affecting the memory offset of the existing fields.
In most cases, it's a moot point, for two reasons:
Starting the extensions on a new alignment boundary (as would happen to node2) may not be memory-optimal, but it might well prevent the new fields from accidentally being overwritten by the padding of a 'legacy' struct.
When compatibility is that much of an issue (e.g. when persisting or transferring data), then it makes more sense to serialize/deserialize (even if binary is a requirement) than to depend on a binary format that varies per architecture, per compiler, even per compiler option.
OK, I might be completely off the mark here since this is a bit out of my league. If so, please correct me. But this is how I see it:
First of all, why do we need padding and alignment at all? It's just wasted bytes, isn't it? Well, turns out that processors like it. That is, if you issue an instruction to the CPU that operates on a 32-bit integer, the CPU will demand that this integer resides at a memory address which is divisible by 4. A 64-bit integer will need to reside at an address divisible by 8. And so on. This is done to make the CPU design simpler and faster.
If you violate this requirement (aka "unaligned memory access"), many CPUs will raise an exception. x86 is actually an oddity because it will still perform the operation - but it can take considerably longer, because it fetches the value from memory in two accesses rather than one and then does bitwise magic to stitch the value together from these separate accesses.
So this is the reason why compilers add padding to structs - so that all the members would be properly aligned and the CPU could access them quickly (or at all). Well, that's assuming the struct itself is located at a proper memory address. But it will also take care of that as long as you stick to standard operations for allocating the memory.
But it is possible to explicitly tell the compiler that you want a different alignment too. For example, if you want to use your struct to read in a bunch of data from a tightly packed file, you could explicitly set the padding to 1. In that case the compiler will also have to emit extra instructions to compensate for potential misalignment.
TL;DR - wrong alignment makes everything slower (or under certain conditions can crash your program entirely).
However this doesn't answer the question "where to better put the padding?" Padding is needed, yes, but where? Well, it doesn't make much difference directly, however by rearranging your members carefully you can reduce the size of the entire struct. And less memory used usually means a faster program. Especially if you create large arrays of these structs, using less memory will mean less memory accesses and more efficient use of CPU cache.
In your example however I don't think there's any difference.
P.S. Why does your struct end with a padding? Because arrays. The compiler wants to make sure that if you allocate an array of these structs, they will all be properly aligned. Because array members don't have any padding between them.
What is the performance difference between the two?
The performance difference is "indeterminable". For most cases it won't make any difference.
For cases where it does make a difference; either version might be faster, depending on how the structure is used. For one example, if you have a large array of these structures and frequently select a structure in the array "randomly"; then if you only access a and b of the randomly selected structure the first version can be faster (because a and b are more likely to be in the same cache line), and if you only access a and c then the second version can be faster.
In an old program I serialized a data structure to bytes, by allocating an array of unsigned char, and then converted ints by:
*((int*)p) = value;
(where p is the unsigned char*, and value is the value to be stored).
This worked fine, except when compiled on Sparc where it triggered exceptions due to accessing memory with improper alignment. Which made perfect sense because the data elements had varying sizes so p quickly became unaligned, and triggered the error when used to store an int value, where the underlying Sparc instructions require alignment.
This was quickly fixed (by writing out the value to the char-array byte-by-byte). But I'm a bit concerned about this because I've used this construction in many programs over the years without issue. But clearly I'm violating some C rule (strict aliasing?) and whereas this case was easily discovered, maybe the violations can cause other types of undefined behavior that is more subtle due to optimizing compilers etc. I'm also a bit puzzled because I believe I've seen constructions like this in lot of C code over the years. I'm thinking of hardware drivers that describe the data-structure exchanged by the hardware as structs (using pack(1) of course), and writing those to h/w registers etc. So it seems to be a common technique.
So my question is: exactly what rule was violated by the above, and what would be the proper C way to realize the use case (i.e. serializing data to an array of unsigned char)? Of course custom serialization functions could be written for every structure to write it out byte by byte, but that sounds cumbersome and not very efficient.
Finally, can ill effects (outside of alignment problems etc.) in general be expected through violation of this aliasing rule?
Yes, your code violates strict aliasing rule. In C, only char* and its signed and unsigned counterparts are assumed to alias other types.
So, the proper way to do such raw serialization is to create an array of ints, and then treat it as an unsigned char buffer.
int arr[] = { 1, 2, 3, 4, 5 };
unsigned char* rawData = (unsigned char*)arr;
You can memcpy, fwrite, or do other serialization of rawData, and it is absolutely valid.
Deserialization code may look like this:
int* arr = (int*)calloc(5, sizeof(int));
memcpy(arr, rawData, 5 * sizeof(int));
Sure, you should care of endianness, padding and other issues to implement reliable serialization.
It is compiler and platform specific how a struct is represented (laid out) in memory and whether or not the start address of a struct is aligned to a 1, 2, 4, 8, ... byte boundary. Therefore, you should not make any assumptions about the layout of your struct's members.
On platforms where your member types require specific alignment, padding bytes are added to the struct, which is why sizeof(struct Foo) can be greater than the sum of its data members' sizes.
Now, if you fwrite() or memcpy() a struct from one instance to another, on the same machine with the same compiler and settings (e.g. in the same program of yours), you will write both the data content and the padding bytes, added by the compiler. As long as you handle the whole struct, you can successfully round trip (as long as there are no pointer members inside the struct, at least).
What you cannot assume is that you can cast smaller types (e.g. unsigned char) to "larger types" (e.g. unsigned int) and memcpy between those in that direction, because unsigned int might require proper alignment on that target platform. Usually, if you do that wrong, you see bus errors or the like.
malloc() in the most general case is the generic way to get heap memory for any type of data, be it a byte array or some struct, independent of its alignment requirements. There is no system where you cannot do struct Foo *ps = malloc(sizeof(struct Foo)). On platforms where alignment is vital, malloc will not return unaligned addresses, as that would break any code trying to allocate memory for a struct. As malloc() is not psychic, it will also return "struct compatible aligned" pointers when you use it to allocate byte arrays.
Any form of "ad hoc" serialization like writing the whole struct is only a promising approach as long as you need not exchange the serialized data with other machines or other applications (or future versions of the same application where someone might have tinkered with compiler settings, related to alignment).
If you're looking for a portable, more reliable and robust solution, you should consider using one of the mainstream serialization packages, such as Google protocol buffers.
This is an excerpt from a book about data alignment of primitive types in memory.
Microsoft Windows imposes a stronger alignment requirement—any primitive object of K bytes, for K = 2, 4, or 8, must have an address that is a multiple of K. In particular, it requires that the address of a double or a long long be a multiple of 8. This requirement enhances the memory performance at the expense of some wasted space. The Linux convention, where 8-byte values are aligned on 4-byte boundaries, was probably good for the i386, back when memory was scarce and memory interfaces were only 4 bytes wide. With modern processors, Microsoft's alignment is a better design decision. Data type long double, for which gcc generates IA32 code allocating 12 bytes (even though the actual data type requires only 10 bytes), has a 4-byte alignment requirement with both Windows and Linux.
Questions are:
What imposes data alignment, OS or compiler?
Can I change it or it is fixed?
Generally speaking, it's the compiler that imposes the alignment. Whenever you declare a primitive type (e.g. double), the compiler will automatically align it on the stack (typically to 8 bytes for a double).
Furthermore, memory allocations are also generally aligned to the largest primitive type so that you can safely do this:
double *ptr = (double*)malloc(size);
without having to worry about alignment.
Therefore, generally speaking, if you're programming with good habits, you won't have to worry about alignment. One way to get something misaligned is to do something like this:
char *ch_ptr = (char*)malloc(size);
double *d_ptr = (double*)(ch_ptr + 1);
There are some exceptions to this: When you start getting into SSE and vectorization, things get a bit messy because malloc no longer guarantees 16-byte alignment.
To override the alignment of something, MSVC has the __declspec(align(#)) modifier, which can be used to increase the alignment of something. The documentation says explicitly that you cannot decrease the alignment of a primitive type with this modifier.
EDIT :
I found the documentation stating the alignment of malloc() on GCC:
The address of a block returned by malloc or realloc in the GNU system
is always a multiple of eight (or sixteen on 64-bit systems).
Source: http://www.gnu.org/s/hello/manual/libc/Aligned-Memory-Blocks.html
So yes, GCC now aligns to at least 8 bytes.
The x86 CPUs have pretty lax alignment requirements. Most data can be stored and accessed at unaligned locations, possibly at the expense of degraded performance. Things become more complex when you start developing multiprocessor software, as alignment becomes important for atomicity and the observed order of events (I'm writing this from memory, so it may not be entirely correct).
Compilers can often be directed to align variables differently from the default. There are compiler options for that, and special compiler-specific keywords (e.g. #pragma pack and others).
The well-established OS APIs can't be changed, neither by the application programmer (the OS is already compiled), nor by the OS developers (unless, of course, they are OK with breaking compatibility).
So, you can change some things, but not everything.
I don't know where Microsoft got its information from, but the results with gcc (4.6.1, target x86_64-linux-gnu, standard mode, no flags except -Wall) are quite different:
#include <stdio.h>

struct lll {
    long l;
    long long ll;
};

struct lld {
    long l;
    long double ld;
};

struct lll lll1, lll2[2];
struct lld lld1, lld2[2];

int main(void)
{
    printf("lll1=%u, lll2=%u\n"
           , (unsigned) sizeof lll1
           , (unsigned) sizeof lll2
           );
    printf("lld=%u, lld2=%u\n"
           , (unsigned) sizeof lld1
           , (unsigned) sizeof lld2
           );
    return 0;
}
Results:
./a.out
lll1=16, lll2=32
lld=32, lld2=64
This might be FUD (from the company that actually managed to put unaligned ints into the MBR ...). But it could also be a result of the author not being informed too well.
To answer the question: it is the hardware that imposes the alignment restrictions. The compiler only needs to implement them.
This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
Can the Size of Pointers Vary Depending on what’s Pointed To?
Are there are any platforms where pointers to different types have different sizes?
Is it possible that the size of a pointer to a float in c differs from a pointer to int? Having tried it out, I get the same result for all kinds of pointers.
#include <stdio.h>
#include <stdlib.h>

int main()
{
    printf("sizeof(int*): %zu\n", sizeof(int*));
    printf("sizeof(float*): %zu\n", sizeof(float*));
    printf("sizeof(void*): %zu\n", sizeof(void*));
    return 0;
}
Which outputs here (OSX 10.6 64bit)
sizeof(int*): 8
sizeof(float*): 8
sizeof(void*): 8
Can I assume that pointers of different types have the same size (on one arch of course)?
Pointers are not always the same size on the same arch.
You can read more on the concept of "near", "far" and "huge" pointers, just as an example of a case where pointer sizes differ...
http://en.wikipedia.org/wiki/Intel_Memory_Model#Pointer_sizes
In days of old, using e.g. Borland C compilers on the DOS platform, there were a total of (I think) 5 memory models, which could even be mixed to some extent. Essentially, you had a choice of small or large pointers to data, and small or large pointers to code, and a "tiny" model where code and data shared a common address space of (if I remember correctly) 64K.
It was possible to specify "huge" pointers within a program that was otherwise built in the "tiny" model. So in the worst case it was possible to have different sized pointers to the same data type in the same program!
I think the standard doesn't even forbid this, so theoretically an obscure C compiler could do this even today. But there are doubtless experts who will be able to confirm or correct this.
Pointers to data must always be convertible to void*, so nowadays they would generally be realized as types of the same width.
This statement is not true for function pointers, which may have a different width. For that reason, in C99 casting a function pointer to void* is undefined behavior.
As I understand it there is nothing in the C standard which guarantees that pointers to different types must be the same size, so in theory an int * and a float * on the same platform could be different sizes without breaking any rules.
There is a requirement that char * and void * have the same representation and alignment requirements, and there are various other similar requirements for different subsets of pointer types but there's nothing that encompasses everything.
In practise you're unlikely to run into any implementation that uses different sized pointers unless you head into some fairly obscure places.
Yes. It's uncommon, but this would certainly happen on systems that are not byte-addressable, e.g. a 16-bit system with 64 Kword = 128 KB of memory. On such systems, you can still have 16-bit int pointers. But a char pointer to an 8-bit char would need an extra bit to indicate the high/low byte within the word, and thus you'd have 17-bit (in practice, 32-bit) char pointers.
This might sound exotic, but many DSPs spend 99.x% of the time executing specialized numerical code. A sound DSP can be a bit simpler if all it has to deal with is 16-bit data, leaving the occasional 8-bit math to be emulated by the compiler.
I was going to write a reply saying that C99 has various pointer conversion requirements that more or less ensure that pointers to data have to be all the same size. However, on reading them carefully, I realised that C99 is specifically designed to allow pointers to be of different sizes for different types.
For instance on an architecture where the integers are 4 bytes and must be 4 byte aligned an int pointer could be two bits smaller than a char or void pointer. Provided the cast actually does the shift in both directions, you're fine with C99. It helpfully says that the result of casting a char pointer to an incorrectly aligned int pointer is undefined.
See the C99 standard. Section 6.3.2.3
Yes, the size of a pointer is platform dependent. More specifically, the size of a pointer depends on the target processor architecture and the "bit-ness" you compile for.
As a rule of thumb, on a 64bit machine a pointer is usually 64bits, on a 32bit machine usually 32 bits. There are exceptions however.
Since a pointer is just a memory address, it's always the same size regardless of what the memory it points to contains. So a pointer to a float, a char or an int are all the same size.
Can I assume that pointers of different types have the same size (on one arch of course)?
For the platforms with flat memory model (== all popular/modern platforms) pointer size would be the same.
For the platforms with segmented memory model, for efficiency, often there are platform-specific pointer types of different sizes. (E.g. far pointers in the DOS, since 8086 CPU used segmented memory model.) But this is platform specific and non-standard.
You probably should keep in mind that in C++ the size of an ordinary pointer might differ from the size of a pointer to a member function. Pointers to member functions have to carry extra information in order to work properly with polymorphism. This is probably the only exception I'm aware of that is still relevant (since I doubt the segmented memory model will ever make it back).
There are platforms where function pointers are a different size than other pointers.
I've never seen more variation than this. All other pointers must be at most sizeof(void*) since the standard requires that they can be cast to void* without loss of information.
A pointer is a memory address - and hence should be the same size on a specific machine: on a 32-bit machine 4 bytes, on a 64-bit machine 8 bytes.
Hence irrespective of the datatype of the thing that the pointer is pointing to, the size of a pointer on a specific machine would be the same (since the space required to store a memory address would be the same.)
Assumption: I'm talking about near pointers to data values, the kind you declared in your question.