The code I am writing needs to be fully standard compliant. The standard does not promise any alignment options stronger than that of max_align_t. I want to try to align to a cache line, but I understand that to be undefined behavior if the implementation does not support alignments of that strength.
Is there any way around this? Any ways to check at preprocessing what extended alignments are available? Or are there any ways to ask for an alignment, and just not get it, rather than have undefined behavior, if the alignment is not available?
aligned_alloc works for allocated memory. However, I am also interested in statically stored memory.
EDIT:
To illustrate my problem, here are the statements from the C11 standard I have problems with:
6.2.8
Alignments are represented as values of the type size_t. Valid alignments include only those values returned by an _Alignof expression for fundamental types, plus an additional implementation-defined set of values, which may be empty. Every valid alignment value shall be a nonnegative integral power of two.
So any given power of two is not necessarily a valid alignment, and I can't count on 64 being less than or equal to _Alignof(max_align_t), so 64 may not be a valid alignment. If it is not a valid alignment, here is my undefined behavior issue:
6.7.5 Alignment specifier
The constant expression shall be an integer constant expression. It shall evaluate to a valid fundamental alignment, or to a valid extended alignment supported by the implementation in the context in which it appears, or to zero.
No alignment that your compiler chooses on its own should be wider than that of max_align_t, but that is all there is to it. Nothing prohibits asking for a wider alignment.
So to ensure that a specific field of a struct lies on the boundary you want, you just have to use _Alignas. Everything is well-defined as long as the value you ask for is a power of 2 and that particular alignment is supported by your compiler. If it isn't, your compiler must complain.
This is exactly one of the reasons _Alignas has been added in C11.
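As a minimal sketch of the above, assuming a 64-byte cache line and an implementation that supports 64 as an extended alignment, statically stored data could be aligned as follows. The names CACHE_LINE, struct counters, g_counters and counters_are_cache_aligned are hypothetical. If the implementation does not support the alignment, the specifier violates a constraint and a diagnostic is required.

```c
#include <assert.h>    /* static_assert */
#include <stdalign.h>  /* alignas, alignof */
#include <stdint.h>    /* uintptr_t */

/* Assumption: 64 bytes is the cache-line size on the target. */
#define CACHE_LINE 64

struct counters {
    alignas(CACHE_LINE) unsigned long hits;    /* starts its own cache line */
    alignas(CACHE_LINE) unsigned long misses;  /* starts the next one */
};

/* Static storage duration: no aligned_alloc involved. */
static struct counters g_counters;

/* The struct is at least as strictly aligned as its strictest member. */
static_assert(alignof(struct counters) >= CACHE_LINE,
              "struct counters is not cache-line aligned");

/* Run-time cross-check. The pointer-to-integer mapping is
   implementation-defined, so this is a sanity check for flat address
   spaces, not a portable proof. */
int counters_are_cache_aligned(void)
{
    return (uintptr_t)&g_counters.hits % CACHE_LINE == 0
        && (uintptr_t)&g_counters.misses % CACHE_LINE == 0;
}
```

The static assertion is evaluated at translation time, so an implementation that silently ignored the request would be caught before the program ever runs.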
I think you should look at the functions valloc() or memalign().
Another very useful call:
int pagesize = sysconf(_SC_PAGESIZE);
If you need to align a buffer to the page size, you can do:
char *buf = ...; // pointer into unaligned memory with at least pagesize bytes of slack
buf -= (uintptr_t)buf & (uintptr_t)(pagesize - 1); // round down to a page boundary (uintptr_t is from <stdint.h>)
buf += pagesize; // step up to the next page boundary (a full page if buf was already aligned)
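The same rounding idea can be written as a small helper. This is only a sketch: align_up is a hypothetical name, it assumes a flat address space in which converting a pointer to uintptr_t yields the address, and unlike the snippet above it leaves an already aligned pointer unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds p up to the next multiple of alignment, which must be a power
   of two. Returns p itself if it is already aligned. The caller must
   make sure the buffer extends at least alignment - 1 bytes past p. */
unsigned char *align_up(unsigned char *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Assumption: the integer value of the pointer is its address. */
    uintptr_t addr = (uintptr_t)p;
    size_t rem = (size_t)(addr & (uintptr_t)(alignment - 1));
    size_t pad = rem ? alignment - rem : 0;

    /* Advance the original pointer instead of converting the integer
       back, so the result is still derived from p. */
    return p + pad;
}
```

With a page size obtained from sysconf(_SC_PAGESIZE), the call would be align_up(buf, (size_t)pagesize).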
Reading https://en.cppreference.com/w/c/language/bit_field, are the following conclusions correct?
Adjacent bit-fields have no padding in between (this seems to be different in 6.7.2.1 of the C standard).
The placement of a bit-field within the storage-unit is implementation-defined.
The position of the bits inside a bit-field is implementation-defined.
(For C++ see also: Characteristics of bit-Fields in C++.)
As a preliminary, there is no language "C/C++" such as is referenced by the question title. C and C++ are distinct languages sharing a common subset. In particular, C is not a subset of C++.
With regard to C, all the specifics the current language spec (C17 at this moment) provides about bitfield layout are in paragraphs 6.7.2.1/11-12.
are the following conclusions correct?
Adjacent bit-fields have no padding in between (this seems to be different in 6.7.2.1 of the C standard).
Bit fields are not laid out directly within a structure. The C implementation lays out "addressable storage units" for them within the structure, and lays out bitfields within those. The sizes and alignment requirements of the ASUs are unspecified.
The spec does say that if there is sufficient space in the ASU to which one bitfield is assigned, then an immediately-following bitfield is packed into adjacent bits of the same ASU. This means that there are no padding bits between those bitfields. However, if there is not sufficient space, then it is implementation-defined whether the immediately-following bitfield spans two ASUs or whether all its bits are assigned to a separate one, leaving unused (padding) bits in the first. Additionally, a zero-width bitfield can be used to force the bitfield following it to be assigned to a new ASU, possibly requiring padding bits in a previous one.
Moreover, the spec has nothing to say about whether there are padding bytes between ASUs. ASUs are not required to be uniform in size or to have the same alignment requirements as each other, so it is plausible that padding bytes would sometimes be required between them even in an implementation that is not intentionally perverse in this regard.
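A rough way to observe these rules on a given implementation is to compare two structs that differ only by a zero-width bit-field. The struct and function names are hypothetical, and the sizes are not fixed by the standard, so no particular values should be expected.

```c
#include <stddef.h>

struct packed {
    unsigned int a : 3;
    unsigned int b : 5;   /* enough room left: packed next to a */
};

struct split {
    unsigned int a : 3;
    unsigned int   : 0;   /* zero-width: nothing more goes into a's unit */
    unsigned int b : 5;   /* therefore placed in a new unit */
};

/* Both results depend on the size and alignment of the storage units
   the implementation chooses, which the standard leaves open. */
size_t packed_size(void) { return sizeof(struct packed); }
size_t split_size(void)  { return sizeof(struct split); }
```

Printing both values with the compiler and options of interest shows what that implementation does. It says nothing about any other one.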
The placement of a bit-field within the storage-unit is implementation-defined.
The spec explicitly says that the order of bitfields within an ASU is implementation defined. That's in the right-to-left vs left-to-right sense. "Order" is not exactly the same thing as "placement", but I guess that's what you mean.
The position of the bits inside a bit-field is implementation-defined.
Not really. This is a question of representation, not layout, and the relevant paragraphs of C17 are 6.2.6.1/3-4:
Values stored in unsigned bit-fields and objects of type unsigned char shall be represented using a pure binary notation.
[...] Values stored in bit-fields consist of m bits, where m is the
size specified for the bit-field. The object representation is the set
of m bits the bit-field comprises in the addressable storage unit
holding it.
Footnote 49 clarifies the meaning of "pure binary notation" if you need that. All other details of bitfield representation are unspecified or undefined, not implementation-defined, which means that you cannot rely on them being documented.
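What the standard does pin down is the value behavior of an unsigned bit-field: it acts as an unsigned integer of the declared width, so stored values are reduced modulo 2 to the power of that width. A small sketch, with struct flags and store_mode as hypothetical names:

```c
struct flags {
    unsigned int mode : 3;   /* can hold 0..7 */
};

/* Stores value in the 3-bit field and reads it back. */
unsigned int store_mode(unsigned int value)
{
    struct flags f;
    f.mode = value;   /* converted to a 3-bit unsigned type: value % 8 */
    return f.mode;
}
```

So store_mode(9) yields 1 on every conforming implementation, while where those three bits sit inside the storage unit remains out of the program's view.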
Differences in C++ include, but are not necessarily limited to:
C++ officially sanctions more declared types for bit fields than does C.
C++ defines a mechanism for declaring bitfields that contain padding bits, but C does not.
Bitfield allocation and alignment are implementation defined in C++ (vs unspecified in C).
The relevant section of the C++ spec is [class-bit], 11.4.10 in the current draft spec.
Is there a way to subtract one pointer from another in C11 and have the result be always defined?
The standard says the behavior is undefined if the result is not representable as type ptrdiff_t.
I am open to a solution relying on static assertions that are expected to pass on a reasonable implementation in a modern general purpose 32 or 64 bit environment. I would like to avoid solutions that rely on any sort of runtime checks.
If the pointed to type has size greater than 1, I can static assert size_t and ptrdiff_t to have the same number of nonpadding bits. This partial solution relies on two things I am not sure about, so any feedback on this would provide a partial answer:
It can be expected that ptrdiff_t has at most one fewer value bit than size_t in a reasonable implementation in a modern general purpose 32 or 64 bit environment.
I am correct in my understanding of the standard, in that the difference between two pointers to objects of size greater than 1 is defined, even when the same difference would be undefined if the pointers were cast to character pointers. This understanding seems inconsistent with footnote 106 in the committee draft, but it is my understanding that footnotes are not normative.
According to the Standard
You can only subtract pointers if both pointers point to the same object, which includes the "one-past-the-end" pointer.
Subtracting uintptr_t or intptr_t is not necessarily meaningful, because, again, according to the standard, there is no particular way that the conversion from pointer to integer has to be defined. In particular,
Consider far pointers in a segmented memory model, where there may be more than one way to represent a given address (segment + offset, for example, on x86).
Consider pointers with bits that are ignored by processor. (For example, the Motorola 68000 processor, which has 32-bit pointers but the top 8 bits are ignored.)
So, unfortunately, there is no way to do this portably, according to the standard.
Remember: size_t bounds the maximum size of an object. It is not the size of your address space. It is entirely legal for size_t to have less range than uintptr_t and friends. The same goes for ptrdiff_t: it is entirely legal for ptrdiff_t to have less range than uintptr_t. Imagine, for example, a segmented memory model where you cannot allocate anything larger than a segment. In that case, size_t and ptrdiff_t might be able to represent the size of a segment but not the size of your address space.
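The static-assertion idea from the question can be sketched as follows. It rests on assumptions the standard does not guarantee: that no array exceeds SIZE_MAX bytes, and that a representable result is all that matters for elements larger than one byte. element_count is a hypothetical helper, and long merely stands in for any element type larger than one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ptrdiff_t may lose at most one value bit relative to size_t. */
static_assert(PTRDIFF_MAX >= SIZE_MAX / 2,
              "ptrdiff_t is too narrow compared to size_t");

/* With elements of at least 2 bytes, an array of at most SIZE_MAX bytes
   has at most SIZE_MAX / 2 elements, which then fits in ptrdiff_t. */
static_assert(sizeof(long) >= 2,
              "element type must be larger than one byte");

/* Precondition: first and last point into, or one past the end of, the
   same array, and first <= last. */
size_t element_count(const long *first, const long *last)
{
    assert(first <= last);
    return (size_t)(last - first);
}
```

The run-time assert only documents the precondition and disappears under NDEBUG. The two static assertions carry the actual argument.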
According to Practice
On the computers which you use (modern 32-bit and 64-bit computers), a uintptr_t will just contain the pointer address. Subtract away. This is implementation-defined but not undefined behavior.
Do not subtract the original pointers without casting unless they point to the same object, or to the address past that object. Compilers can and will make aliasing assumptions when you use pointer arithmetic. Not only is your program "technically" wrong, but there is a long history of compilers producing bad code here.
There is a bit of an argument going on right now about what, exactly, it means for a pointer to point to the same object, but this argument was unresolved last time I checked.
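For the practical route, a sketch of subtracting the converted addresses could look like this. address_distance is a hypothetical name, and the result is only meaningful on implementations where uintptr_t holds a flat address, which is an assumption and not something the standard promises.

```c
#include <stdint.h>

/* Absolute distance in bytes between two addresses, computed on the
   integer values of the pointers. Unsigned arithmetic cannot overflow,
   so the subtraction itself is always defined. */
uintptr_t address_distance(const void *a, const void *b)
{
    uintptr_t ua = (uintptr_t)a;   /* implementation-defined conversion */
    uintptr_t ub = (uintptr_t)b;
    return ua > ub ? ua - ub : ub - ua;
}
```

Taking the absolute value sidesteps the question of which signed type could hold the result.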
This might be a very basic question that has already been asked, but I was not quite sure whether the answer in "Casting an int pointer to a char ptr and vice versa" applies to my case.
So essentially I have something as follows:
void* head = sbrk(1024); //allocate 1024 bytes in heap
*((int*)(head+size)) = value; //value and size are ints with values between 1 and 1023
If the above does not work for an arbitrary value of size, what are the restrictions on the value of size? Does it have to be divisible by 4?
First of all, you can't do pointer arithmetic on void pointers. That code should not even compile.
For the sake of discussion, let us assume that you have a char pointer instead. Then formally, such a cast followed by an access is undefined behavior. In the real world, however, your code will work if you manually ensure alignment: the address where you write must be an aligned memory position, or there are no guarantees that the code will work.
EDIT with relevant quotes from the ISO 9899:2011 standard why pointer arithmetic on a void pointer is undefined behavior:
6.3.2.2 void
The (nonexistent) value of a void expression (an expression that has
type void) shall not be used in any way, and implicit or explicit
conversions (except to void) shall not be applied to such an
expression.
.
6.5.6 Additive operators
/--/
For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to a complete object type and the other
shall have integer type. (Incrementing is equivalent to adding 1.)
.
4 Conformance
If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a
constraint or runtime-constraint is violated, the behavior is
undefined. Undefined behavior is otherwise indicated in this
International Standard by the words ‘‘undefined behavior’’ or by the
omission of any explicit definition of behavior. There is no
difference in emphasis among these three; they all describe ‘‘behavior
that is undefined’’.
Whether code violating normative text in the standard "should compile" or not can certainly be debated, but I don't think that discussion is of benefit to the OP. Simply don't write code relying on undefined behavior, ever.
Use memcpy():
memcpy((char*)head + size, &value, sizeof(value));
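Wrapped into helpers, the memcpy approach might look like the sketch below. store_int and load_int are hypothetical names, and the caller is assumed to guarantee that offset + sizeof(int) stays within the buffer.

```c
#include <stddef.h>
#include <string.h>

/* Writes value at base + offset, whatever the alignment of that address. */
void store_int(void *base, size_t offset, int value)
{
    memcpy((unsigned char *)base + offset, &value, sizeof value);
}

/* Reads an int back from base + offset, again without any alignment
   requirement on the address. */
int load_int(const void *base, size_t offset)
{
    int value;
    memcpy(&value, (const unsigned char *)base + offset, sizeof value);
    return value;
}
```

Because the bytes are copied through character pointers, size no longer has to be a multiple of anything.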
On many systems, in this circumstance, it is required that size be a multiple of four (subject to additional conditions detailed below, including that the size of int be four bytes on your system). On systems that do not require this, it is usually preferred.
First, the type of head is void *, and the C standard does not define what happens when you do pointer arithmetic with void *.
Some compilers, notably GCC and its heirs, will treat this arithmetic as if the type were char *. I will proceed on this basis.
Second, I am not aware of a guarantee that sbrk returns an address with any particular alignment.
Let us suppose that sbrk does return a well-aligned address, and that your C implementation does the plain thing to evaluate * (int *) (head + size) = value, which is to issue a store instruction to write the value of value (converted to an int) to the address head + size.
Then your question becomes: What does my computing platform do with an int store to this address?
As long as head + size is an address suitably aligned for int on your platform, the store will execute as expected. On most platforms, four-byte integers prefer four-byte alignment, and eight-byte integers prefer eight-byte alignment. As long as head is aligned to a multiple of this preference and size is a multiple of this preference, then the store will execute normally.
Otherwise, what happens depends on your platform. On some platforms, the hardware executes the store but may do it more slowly than normal store instructions, because it breaks it into two separate writes to memory. (This also means that other processes sharing the same memory might be able to read memory while one part of the value has been stored but the other part has not. Again, this depends on the characteristics of your computing platform.)
On some platforms, the hardware signals an exception that interrupts program execution and transfers control to the operating system. Some operating systems fix up misaligned stores by analyzing the failing instruction and executing alternate instructions that perform the intended store (or the operating system relays the exception to special code in your program, possibly in automatically included libraries, that do this fix-up work). On these platforms, misaligned stores will be very slow; they can hugely degrade the performance of a program.
On some platforms, the hardware signals an exception, and the operating system does not fix up the misaligned store. Instead, the operating system either terminates your process or sends it a signal about the problem, which often results in your process terminating. (Other possibilities include triggering a debugger or entering special code you have included in your program to handle signals.)
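Whether a given address is suitably aligned for int can be checked before storing through an int pointer. The sketch below uses the hypothetical name is_aligned_for_int and assumes, as above, that the integer value of a pointer is its address.

```c
#include <stdalign.h>  /* alignof */
#include <stdint.h>    /* uintptr_t */

/* Returns nonzero if p satisfies the alignment requirement of int.
   Assumption: converting a pointer to uintptr_t yields its address. */
int is_aligned_for_int(const void *p)
{
    return (uintptr_t)p % alignof(int) == 0;
}
```

For the code in the question, the check would be applied to (char *)head + size before the store.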
I am working with Turbo C on Windows, where char takes one byte. My problem is with the union below.
#include <stdio.h>
union a
{
unsigned char c:2;
}b;
int main(void)
{
printf("%d",(int)sizeof(b)); // or even sizeof(union a)
return 0;
}
This program prints 2, whereas the union should take only 1 byte. Why is that?
For a struct it works fine and gives 1 byte, but this union behaves unexpectedly.
One more thing: how do I access these bit-fields?
scanf("%d",&b.c); //even scanf("%x",b.c);
does not work because bits have no address. So we have to use another variable, as below:
int x;
scanf("%d",&x);
b.c=x;
Can't we avoid that? Is there any other way?
Compilers are allowed to add padding to structs and unions. I admit it is a little surprising that yours rounds the union up to two bytes when you can get a one-byte struct, but it is perfectly allowed.
In answer to your second question: no, it's not avoidable. Bit fields are a struct packing optimization, and the performance and convenience penalty to pay is that bit field members are not individually addressable.
Turbo C targets the 8086 microprocessor, which has a two-byte word boundary. Atomic reads and writes are typically tied to the CPU's architecture, so the compiler adds some slack bytes to align your data structure.
Using #pragma pack(1) may disable this, but I'm not sure whether it works in Turbo C.
I'm not sure where you find the requirement that the union must be precisely the minimum size. An object must be at least as big as its members, but that is a lower bound only.
You can't take the address of a bit-field; what would its type be? It can't be int*. scanf("%d") will write sizeof(int) * CHAR_BIT bits through the int* you pass in. That is more than 2 bits, and you don't have that space.
There is a paragraph in the standard stating that there shall be no padding before the first member of a struct, but it does not say so explicitly about unions. The difference in size could arise because the compiler wants to align the union on 2-byte boundaries, whereas the struct, which cannot be padded before its first member, gets one-byte alignment. Also note that a union could have more members of different types, which could widen the required alignment of your union. The compiler could have reasons to give unions at least 2-byte alignment, for example to simplify code that has to handle a union's required alignment.
Anyway, there is no requirement that your union be exactly one byte. It just has to have room for all its members.
Here is what the C standard has to say about your second question:
The operand of the unary & operator shall be either a function
designator or an lvalue that designates an object that is not a
bit-field and is not declared with the register storage-class
specifier.
So your best bet is to go through an int, as you already do. You can put braces around the code so that the temporary variable stays local:
void func(void) { struct bits f; { int x; scanf("%d", &x); f.bitfield = x; } /* ... */ }
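A slightly fuller sketch of the same idea adds a range check before the assignment. It uses sscanf rather than scanf so that it can be exercised without standard input. parse_bitfield is a hypothetical name, and the 2-bit width mirrors the question.

```c
#include <stdio.h>

struct bits {
    unsigned int bitfield : 2;   /* holds 0..3 */
};

/* Parses a decimal integer from text and stores it in out->bitfield.
   Returns 1 on success. Returns 0 and leaves *out untouched if text is
   not a number or the number does not fit into 2 bits. */
int parse_bitfield(const char *text, struct bits *out)
{
    int x;   /* addressable temporary: &out->bitfield would not compile */

    if (sscanf(text, "%d", &x) != 1 || x < 0 || x > 3)
        return 0;
    out->bitfield = (unsigned int)x;
    return 1;
}
```

Without the range check, an out-of-range input would be silently reduced modulo 4 on assignment.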
There is a lot of misinformation in the answers so I will clarify. It could be for one of 2 reasons (I am not familiar with the compiler).
The bitfield storage unit is 2 bytes.
Alignment is forced to word (2 byte) boundary.
I doubt it is the first case as it is a common extension to take the bitfield storage unit as the size of the declared "base" type. In this case the type is char which always has a size of 1.
[In standard C you can only declare bitfields of type int or unsigned int, and the "storage unit" in which bitfields are grouped is fixed (usually the same size as an int). Even a single-bit bitfield will use one storage unit.]
In the 2nd case it is common for C compilers to implement #pragma pack to allow control of alignment. I suspect the default packing is 2 in which case a pad byte will be added at the end of the union. The way to avoid this is to use:
#pragma pack(1)
You should also use #pragma pack() afterward to set back to the default (or even better use the push and pop arguments if supported by your compiler).
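As an illustration of the push and pop form, here is a sketch for compilers that implement it (GCC, Clang and MSVC do; whether Turbo C accepts it is not established here). The pragma is a compiler extension, not standard C, and the struct names are hypothetical.

```c
#include <stddef.h>

#pragma pack(push, 1)          /* save the current packing, then pack to 1 */
struct packed_record {
    char tag;
    int  value;                /* no padding inserted before this member */
};
#pragma pack(pop)              /* restore the packing that was in effect */

/* Same members under the default packing, for comparison. */
struct default_record {
    char tag;
    int  value;
};
```

On such compilers sizeof(struct packed_record) equals sizeof(char) + sizeof(int), while the default layout usually pads value to its natural alignment. Accessing value in the packed struct can then involve a misaligned access.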
To all the repliers who said that you must put up with what the compiler does, this is contrary to the spirit of C. You should be able to use bitfields to map to any size or bit order in situations where you have no control over it such as a file format or hardware mapping.
Of course this is highly non-portable, since implementations differ in byte order, in the order in which bits are added to a bitfield storage unit (from the top or the bottom), in storage unit size, in default alignment, and so on.
As to your 2nd question, I can't see the problem, though I never use scanf as it is problematic.
In addition to the fact that "there may also be unnamed padding at the end of a structure or union", the compiler is permitted to place a bitfield in "any addressable storage unit large enough to hold a bit-field". (Both quotes are from the C90 standard; there is similar, but different, wording in the C99 standard.)
Also note that the standard says that a "bit-field shall have a type that is a qualified or unqualified version of int, unsigned int, or signed int", so having a bit-field in a char type is non-standard.
Because the behavior of bitfields is so dependent on unspecified compiler implementation details (there are several other non-portable issues with bit-fields that I have not mentioned), using them is almost always a bad idea. In particular, they are a bad idea when you are trying to model bit-fields in a file format, network protocol, or hardware register.
More information from another SO answer:
In general you should avoid bitfields
and use other manifest constants
(enums or whatever) with explicit bit
masking and shifting to access the
'sub-fields' in a field.
Here's one reason why bitfields should
be avoided - they aren't very portable
between compilers even for the same
platform. from the C99 standard
(there's similar wording in the C90
standard):
An implementation may allocate any
addressable storage unit large enough
to hold a bitfield. If enough space
remains, a bit-field that immediately
follows another bit-field in a
structure shall be packed into
adjacent bits of the same unit. If
insufficient space remains, whether a
bit-field that does not fit is put
into the next unit or overlaps
adjacent units is
implementation-defined. The order of
allocation of bit-fields within a unit
(high-order to low-order or low-order
to high-order) is
implementation-defined. The alignment
of the addressable storage unit is
unspecified.
You cannot guarantee whether a bit
field will 'span' an int boundary or
not and you can't specify whether a
bitfield starts at the low-end of the
int or the high end of the int (this
is independent of whether the
processor is big-endian or
little-endian).
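The explicit masking and shifting recommended above can be sketched as follows, for a hypothetical 3-bit "mode" field occupying bits 4 to 6 of a 32-bit register value. The field position and all names are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical layout: a 3-bit mode field in bits 4..6. */
#define MODE_SHIFT 4u
#define MODE_MASK  (UINT32_C(0x7) << MODE_SHIFT)

/* Returns reg with the mode field replaced by mode (only the low 3 bits
   of mode are used); all other bits of reg are preserved. */
uint32_t reg_set_mode(uint32_t reg, uint32_t mode)
{
    return (reg & ~MODE_MASK) | ((mode << MODE_SHIFT) & MODE_MASK);
}

/* Extracts the mode field from reg. */
uint32_t reg_get_mode(uint32_t reg)
{
    return (reg & MODE_MASK) >> MODE_SHIFT;
}
```

Here the position of the field within the value is fixed by the source code rather than by the compiler, which is the portability gain over a bit-field. Byte order still has to be handled separately when the value is written to a file or the network.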