I saw a code snippet in a good answer to "Is it possible to insert three numbers into a 2-byte variable?"
For example, I want to store a date, which contains day, month, and year:
days -> 31, months -> 12, years -> 99.
I want to store 31, 12, and 99 in one variable, and use the shift operators << and >> to manipulate it.
// Quoted: the C code from that answer
union mydate_struct {
    struct {
        uint16_t day   : 5;  // 0 - 31
        uint16_t month : 4;  // 0 - 12
        uint16_t year  : 7;  // 0 - 127
    };
    uint16_t date_field;
};
Now, my question is: why wrap the struct in a union? What are the special benefits, besides memory-related concerns?
PS: I know the typical use of a union to control memory size.
Using just the struct seems more direct and simple:
typedef struct {
    uint16_t day   : 5;  // 0 - 31
    uint16_t month : 4;  // 0 - 12
    uint16_t year  : 7;  // 0 - 127
} mydate_struct;
Update 1:
Some conclusions about the benefits of wrapping the struct in a union:
It can initialize the year, month, and day simultaneously:
The advantage of using the union is that, given union mydate_struct u;,
you can write u.date_field = 0x3456; and initialize the year, month,
and day fields simultaneously. What that does is defined by the
implementation, and different implementations could define it
differently. There's a modest chance that the year will be 0x56, the
month 0x08, and the day 0x06 (aka 86-08-06 — century not clearly
defined); there's also a modest chance that the year will be 0x1A, the
month 0x02, and the day 0x1A (aka 26-02-26 — century still not clearly
defined). People have forgotten Y2K already. (comment by @Jonathan Leffler)
You can read/write the whole number at once. (comment by @StenSoft)
A union means that every member in it uses the same memory, so you can use either the first or the second member (which can be completely different things). In your case, it's either the whole struct or the uint16_t date_field.
In the context of the linked question, the writer intended to use it to convert a two-byte struct to a two-byte integer and vice versa: assign something to the struct and read the int value from the same memory. But this is not allowed in C++ and may not work (for a multitude of reasons). It's not possible to arbitrarily switch between which member is used.
A union shares its memory among its member variables, so the size of a union is the size of its largest member. That is why the struct is wrapped in a union together with the variable uint16_t date_field;
This way the user can access the same 16 bits of memory either through the struct or through date_field.
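To make these points concrete, here is a minimal sketch (my own, assuming a C11 compiler for the anonymous struct): one store to date_field clears all three fields at once, while the bit-field members stay individually addressable. Note that the bit layout, and hence the printed value, is implementation-defined.
#include <stdint.h>
#include <stdio.h>

union mydate_struct {
    struct {                 // anonymous struct: standard since C11
        uint16_t day   : 5;  // 0 - 31
        uint16_t month : 4;  // 0 - 12
        uint16_t year  : 7;  // 0 - 127
    };
    uint16_t date_field;
};

int main(void)
{
    union mydate_struct d;

    d.date_field = 0;        // zero every field with one store
    d.day = 31;
    d.month = 12;
    d.year = 99;

    // How the bit-fields map onto date_field is implementation-defined,
    // so the printed value depends on your compiler.
    printf("raw date_field: 0x%04X\n", (unsigned)d.date_field);
    return 0;
}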
Related
I'm obtaining data from an accelerometer and trying to log it to a file. However, I'm a little perplexed by the output I'm getting. I'm only logging one sample to ensure the data is written correctly.
I've created the following struct to group data:
struct AccData
{
    int16_t x;
    int16_t y;
    int16_t z;
    unsigned int time;
};
The above should amount to 10 bytes in total.
I'm writing to stdout and getting the following data from the sensor:
I (15866) Accelerometer: Measurement: X25 Y252 Z48 Time: 10
The data that's stored to the sd card looks like so:
1900FC00300000000A00
Splitting those up gives us:
1900 FC00 3000 00000A00
This is where I'm starting to get confused. The first 3 fields only make sense if I reverse the order of the bytes, such that:
X:    1900 -> 0019 = 25
Y:    FC00 -> 00FC = 252
Z:    3000 -> 0030 = 48
Time: 00000A00 -> 000A0000 = 655,360
First, this may be due to my limited C knowledge, but is it normal for the output to be swapped like above?
Additionally, I can't get the time to make sense at all. It almost looks like only 3 bytes are being allocated for the unsigned integer, which would give the correct result if you didn't reverse it.
Like @Someprogrammerdude pointed out in the comments, this had to do with endianness and the fact that my struct was being padded, resulting in the struct being 12 bytes instead of 10.
Accounting for the padding, the data now looks like so (x, y, z + 2 padding bytes, time):
1900 FC00 3000 0000 0A000000
Reading the above as little-endian makes it all make sense: x = 25, y = 252, z = 48, time = 10.
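A quick way to confirm the padding on your own toolchain is a sketch like the following (it assumes a typical target with a 4-byte unsigned int):
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct AccData
{
    int16_t x;
    int16_t y;
    int16_t z;
    unsigned int time;   // 4-byte alignment forces 2 bytes of padding after z
};

int main(void)
{
    // On a typical 32-/64-bit target this prints 12 rather than 10,
    // and shows time starting at offset 8 rather than 6.
    printf("sizeof(struct AccData) = %zu\n", sizeof(struct AccData));
    printf("offsetof(time)         = %zu\n", offsetof(struct AccData, time));
    return 0;
}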
I've inherited some code that uses the elm-chan FatFs. It creates directories and files with either missing or wrong date-time stamps. I've traced the problem to what appears to be an incorrect cast.
I changed the cast and things seem to work, but I would like confirmation from more experienced programmers that the original cast was incorrectly constructed (or that this is a compiler issue).
Below is the original code. In the curTime struct, Year is a uint16_t; the other members are uint8_t. They need to be packed into a 32-bit word (DWORD) following the DOS date/time bitmap format.
DWORD get_fattime(void)
{
    DWORD tmr;
    /* Pack date and time into a DWORD variable */
    Calendar curTime = sdInterface->sdGetRTCTime();
    tmr = ((DWORD)(curTime.Year - 1980) << 25)
        | ((DWORD)(curTime.Month) << 21)
        | ((DWORD)(curTime.DayOfMonth) << 16)
        | (curTime.Hours << 11)
        | (curTime.Minutes << 5)
        | (curTime.Seconds >> 1); // modified to truncate into two-second intervals - jn
    return tmr;
}
Below is the modified code: I explicitly cast Hours, Minutes, and Seconds to DWORD. It is puzzling why the original author would cast Month and DayOfMonth but not the other uint8_t members.
DWORD get_fattime(void)
{
    DWORD tmr;
    /* Pack date and time into a DWORD variable */
    Calendar curTime = sdInterface->sdGetRTCTime();
    tmr = ((DWORD)(curTime.Year - 1980) << 25)
        | ((DWORD)(curTime.Month) << 21)
        | ((DWORD)(curTime.DayOfMonth) << 16)
        | ((DWORD)(curTime.Hours) << 11)
        | ((DWORD)(curTime.Minutes) << 5)
        | ((DWORD)(curTime.Seconds) >> 1); // modified to truncate into two-second intervals - jn
    return tmr;
}
The code seems to work; I would like a sanity check from more experienced programmers.
I am updating this post to provide the requested information. First is the Calendar struct behind curTime:
//*****************************************************************************
//
//! \brief Used in the RTC_C_initCalendar() function as the CalendarTime
//! parameter.
//
//*****************************************************************************
typedef struct Calendar {
    //! Seconds of minute between 0-59
    uint8_t Seconds;
    //! Minutes of hour between 0-59
    uint8_t Minutes;
    //! Hour of day between 0-23
    uint8_t Hours;
    //! Day of week between 0-6
    uint8_t DayOfWeek;
    //! Day of month between 1-31
    uint8_t DayOfMonth;
    //! Month between 1-12
    uint8_t Month;
    //! Year between 0-4095
    uint16_t Year;
} Calendar;
The image below shows the directory of the SD card running the original code (without the DWORD cast on the Hours, Minutes, and Seconds uint8_t members). Notice the missing date-time stamp on two of the files. The directories are also missing their date-time stamps.
And the last two images below show the results of the code with the DWORD cast applied to the uint8_t members. Both directories and files now have date-time stamps.
From the comments I have received so far, I am leaning towards this being a compiler error. The code was developed on an earlier version of the compiler; this is a new compiler, CCS v9.
[This is not really an answer, but it's too elaborate for a comment.]
Please try this program:
#include <stdio.h>
#include <stdint.h>

typedef uint32_t DWORD; // assumption: DWORD is a 32-bit unsigned type

int main(void)
{
    uint8_t b = 9;
    DWORD w1 = b << 6;
    DWORD w2 = (DWORD)b << 6;
    printf("%d %d\n", (int)w1, (int)w2);
}
The expected output of this program is
576 576
Shifting left by 6 bits is equivalent to multiplying by 64, and 9 × 64 = 576, so this makes sense.
If, on the other hand, you get the output
64 576
I believe this indicates a bug in your compiler. 64 is what you get if you take 9 and shift it left by 6 bits within an 8-bit field, meaning that you lose a bit off the left.
That (almost) makes sense, and it's what the answer posted by #jaz_n is getting at. However, your compiler is not supposed to shift anything left within an 8-bit field. When you write
x << 6
where x is of type uint8_t, the first thing the compiler is supposed to do is promote x to a full-width integer type, then shift it left by 6 bits in a field that's as wide as type int on your machine (typically either 16 or 32 bits).
This explains why the three "extra" casts to DWORD should not be necessary. The fact that you had to add them suggests that your compiler may have a problem. If your compiler generates code that prints 64 576 for my test, this is additional evidence that your compiler is wrong.
Minutes and Hours are 8-bit entities (uint8_t). In the first code snippet, the compiler did exactly what it was supposed to: shift Minutes and Hours to the left some number of bits. Those bits were lost and the residual bits were OR'd with the 32-bit DWORD. The bitmask failed because the most significant bits of Hours and Minutes were thrown away. In the second code snippet, Minutes and Hours are first cast to a 32-bit entity, so the shifted bits are preserved and the bitmask succeeds.
Seconds is shifted to the right with the intent of increasing the granularity of the field (i.e., 2 seconds per count), so technically you did not need to cast Seconds to a 32-bit entity.
C compilers have options to compile in different modes (16-bit, 32-bit, 64-bit). From the code you provided above, it looks like DWORD stands for a 32-bit type.
The original code may work fine if it is compiled in 16-bit mode, because calculations then use a 16-bit length by default without a specific cast.
Please also keep in mind that when you migrate your code from one platform to another, or change the compile option from 16-bit to 32-bit or 64-bit, some of your code may not work properly.
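As a sanity check (not from the question), a small decode sketch can verify a packed value against the DOS date/time layout used above; it assumes DWORD is a 32-bit unsigned type:
#include <stdint.h>
#include <stdio.h>

typedef uint32_t DWORD;  // assumption: matches the project's DWORD

// Bits 31-25: year - 1980, 24-21: month, 20-16: day,
// 15-11: hours, 10-5: minutes, 4-0: seconds / 2.
static void print_fattime(DWORD t)
{
    printf("%04u-%02u-%02u %02u:%02u:%02u\n",
           (unsigned)((t >> 25) & 0x7F) + 1980u,
           (unsigned)((t >> 21) & 0x0F),
           (unsigned)((t >> 16) & 0x1F),
           (unsigned)((t >> 11) & 0x1F),
           (unsigned)((t >> 5)  & 0x3F),
           (unsigned)((t & 0x1F) * 2));
}

int main(void)
{
    // 2019-06-15 12:34:56 packed by hand for the test
    DWORD t = ((DWORD)(2019 - 1980) << 25) | ((DWORD)6 << 21) | ((DWORD)15 << 16)
            | ((DWORD)12 << 11) | ((DWORD)34 << 5) | ((DWORD)(56 / 2));
    print_fattime(t);  // expect: 2019-06-15 12:34:56
    return 0;
}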
I recently saw this post about endianness macros in C and I can't really wrap my head around the first answer.
Code supporting arbitrary byte orders, ready to be put into a file
called order32.h:
#ifndef ORDER32_H
#define ORDER32_H

#include <limits.h>
#include <stdint.h>

#if CHAR_BIT != 8
#error "unsupported char size"
#endif

enum
{
    O32_LITTLE_ENDIAN = 0x03020100ul,
    O32_BIG_ENDIAN    = 0x00010203ul,
    O32_PDP_ENDIAN    = 0x01000302ul
};

static const union { unsigned char bytes[4]; uint32_t value; } o32_host_order =
    { { 0, 1, 2, 3 } };

#define O32_HOST_ORDER (o32_host_order.value)

#endif
You would check for little endian systems via
O32_HOST_ORDER == O32_LITTLE_ENDIAN
I do understand endianness in general. This is how I understand the code:
Create examples of little, middle, and big endianness.
Compare the test case to those examples and decide which type the host machine is.
What I don't understand are the following aspects:
Why is a union needed to store the test case? Isn't uint32_t guaranteed to be able to hold 32 bits/4 bytes as needed? And what does the initializer { { 0, 1, 2, 3 } } mean? It assigns a value to the union, but why the strange markup with two braces?
Why the check for CHAR_BIT? One comment mentions that it would be more useful to check UINT8_MAX? Why is char even used here, when it's not guaranteed to be 8 bits wide? Why not just use uint8_t? I found this link to Google-Devs github. They don't rely on this check... Could someone please elaborate?
Why is a union needed to store the test case?
The entire point of the test is to alias the array with the magic value the array will create.
Isn't uint32_t guaranteed to be able to hold 32 bits/4 bytes as needed?
Well, more or less. It will, but other than being 32 bits there are no guarantees. It would fail only on some really fringe architecture you will never encounter.
And what does the assignment { { 0, 1, 2, 3 } } mean? It assigns the value to the union, but why the strange markup with two braces?
The inner brace is for the array.
Why the check for CHAR_BIT?
Because that's the actual guarantee. If that doesn't blow up, everything will work.
One comment mentions that it would be more useful to check UINT8_MAX? Why is char even used here, when it's not guaranteed to be 8 bits wide?
Because in fact it always is, these days.
Why not just use uint8_t? I found this link to Google-Devs github. They don't rely on this check... Could someone please elaborate?
Lots of other choices would work also.
The initialization has two sets of braces because the inner braces initialize the bytes array. So bytes[0] is 0, bytes[1] is 1, etc.
The union allows a uint32_t to lie on the same bytes as the char array and be interpreted in whatever the machine's endianness is. So if the machine is little endian, 0 is in the low order byte and 3 is in the high order byte of value. Conversely, if the machine is big endian, 0 is in the high order byte and 3 is in the low order byte of value.
{{0, 1, 2, 3}} is the initializer for the union, which will result in bytes component being filled with [0, 1, 2, 3].
Now, since the bytes array and the uint32_t occupy the same space, you can read the same value as a native 32-bit integer. The value of that integer shows you how the array was shuffled - which really means which endian system are you using.
There are only 3 popular possibilities here - O32_LITTLE_ENDIAN, O32_BIG_ENDIAN, and O32_PDP_ENDIAN.
As for char vs. uint8_t: I don't know. I think it makes more sense to just use uint8_t with no checks.
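As an aside (not part of the quoted answer), a runtime check that sidesteps the union entirely can be sketched with memcpy:
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static int is_little_endian(void)
{
    uint32_t value = 1;
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);  // copy the object representation
    return bytes[0] == 1;                 // low-order byte first?
}

int main(void)
{
    printf("little endian: %s\n", is_little_endian() ? "yes" : "no");
    return 0;
}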
I have a 24 bit register that comprises a number of fields. For example, the 3 upper bits are "mode", the bottom 10 bits are "data rate divisor", etc. Now, I can just work out what has to go into this 24 bits and code it as a single hex number 0xNNNNNN. However, that is fairly unreadable to anyone trying to maintain it.
The question is, if I define each subfield separately what's the best way of coding it all together?
The classic way is to use the << left shift operator on constant values and combine all values with either + or |. For example:
*register_address = (SYNC_MODE << 21) | ... | DEFAULT_RATE;
Solution 1
The "standard" approach for this problem is to use a struct with bitfield members. Something like this:
typedef struct {
    int divisor         : 10;
    unsigned int field1 : 9;
    char field2         : 2;
    unsigned char mode  : 3;
} fields;
The number after each field name specifies the number of bits used by that member. In the example above, the field divisor uses 10 bits and can store values between -512 and 511 (signed integer), while mode can store unsigned values on 3 bits: between 0 and 7.
The range of values for each field follows the usual signed/unsigned rules, but the field width is limited to the specified number of bits. Of course, a char can still hold up to 8 bits, a short up to 16, and so on. The coercion rules are the usual ones for the types of the fields, taking their size into account (i.e., storing -5 in mode will convert it to unsigned, and the actual value will probably be 3).
There are several issues you need to pay attention to (some of them are also mentioned in the Notes section of the documentation page about bit-fields):
the total number of bits declared in the structure must be 24 (the size of your register);
because your structure uses 3 bytes, some elements of an array of such structures may behave strangely, because they span the allocation unit size (which is usually 4 or 8 bytes, depending on the hardware);
the order of the bit-fields in the allocation unit is not guaranteed by the standard; depending on the architecture, in the final 3-byte pack, the field mode may contain either the most significant 3 bits or the least significant 3 bits; you can sort this out easily, though.
You probably need to handle the values you store in a fields structure all at once. For that you can embed the structure in a union:
typedef union {
    fields f;
    unsigned int a;
} reg;

reg x;

/* Access individual fields */
x.f.mode = 2;
x.f.divisor = 42;

/* Get the entire register */
printf("%06X\n", x.a);
Solution 2
An alternative way to do (kind of) the same thing is to use macros to extract the fields and to compose the entire register:
#define MAKE_REG(mode, field2, field1, divisor) \
    ((((mode)   & 0x07)   << 21) | \
     (((field2) & 0x03)   << 19) | \
     (((field1) & 0x01FF) << 10) | \
      ((divisor) & 0x03FF))

#define GET_MODE(reg)    (((reg) & 0xE00000) >> 21)
#define GET_FIELD2(reg)  (((reg) & 0x180000) >> 19)
#define GET_FIELD1(reg)  (((reg) & 0x07FC00) >> 10)
#define GET_DIVISOR(reg) ((reg) & 0x0003FF)
The first macro assembles the mode, field2, field1, and divisor values into a 3-byte (24-bit) integer. The other set of macros extracts the values of the individual fields. All of them assume the processed numbers are unsigned.
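For instance, a hypothetical usage of these macros (the field values are arbitrary):
#include <stdio.h>

/* MAKE_REG, GET_MODE, GET_DIVISOR as defined above. */
#define MAKE_REG(mode, field2, field1, divisor) \
    ((((mode)   & 0x07)   << 21) | \
     (((field2) & 0x03)   << 19) | \
     (((field1) & 0x01FF) << 10) | \
      ((divisor) & 0x03FF))
#define GET_MODE(reg)    (((reg) & 0xE00000) >> 21)
#define GET_DIVISOR(reg) ((reg) & 0x0003FF)

int main(void)
{
    unsigned int r = MAKE_REG(5, 2, 300, 750);

    printf("reg     = 0x%06X\n", r);           /* 0xB4B2EE */
    printf("mode    = %u\n", GET_MODE(r));     /* 5 */
    printf("divisor = %u\n", GET_DIVISOR(r));  /* 750 */
    return 0;
}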
Pros and cons
The struct (embedded in a union) solution:
(+) it allows the compiler to do some checking of the values you want to put into the fields (and issue warnings); it also does the correct conversions between signed and unsigned;
The macro solution:
(+) it is not sensitive to memory alignment issues; you put the bits exactly where you want them;
(-) it doesn't check the range of the values you put in the fields;
(-) handling signed values is a little bit trickier using macros; the macros suggested here work only for unsigned values; more shifting is required in order to handle signed values, as sketched below.
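For completeness, here is a sketch of extracting a signed field with macros (it assumes two's complement; the helper name is my own):
#include <stdio.h>

#define GET_DIVISOR(reg) ((reg) & 0x0003FF)

/* Sign-extend the 10-bit divisor: raw values 512..1023 map to -512..-1. */
static int get_divisor_signed(unsigned int reg)
{
    int v = (int)GET_DIVISOR(reg);
    if (v & 0x200)      /* sign bit of the 10-bit field set? */
        v -= 0x400;     /* subtract 2^10 to sign-extend */
    return v;
}

int main(void)
{
    printf("%d\n", get_divisor_signed(0x3FFu)); /* -1   */
    printf("%d\n", get_divisor_signed(0x200u)); /* -512 */
    printf("%d\n", get_divisor_signed(0x1FFu)); /* 511  */
    return 0;
}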
The following code is multi-threaded and runs for thread id = 0 and id = 1 simultaneously.
typedef struct
{
    unsigned char pixels[4];
} FourPixels;

int main(void)
{
    FourPixels spixels[2];   // one element per thread (id = 0 or 1)

    // copy into spixels from the source array gpixels
    spixels[id] = gpixels[id];

    // example: remove blue component
    spixels[id].pixels[0] &= 0xFC;
    spixels[id].pixels[1] &= 0xFC;
    spixels[id].pixels[2] &= 0xFC;
    spixels[id].pixels[3] &= 0xFC;
}
We see that thread id = 0 fetches 4 chars, and thread id = 1 fetches another set of 4 chars.
I want to know how the structures spixels[0] and spixels[1] are laid out in memory. Is it something like this?
spixels[0]                              spixels[1]
pixel[0] pixel[1] pixel[2] pixel[3]     pixel[0] pixel[1] pixel[2] pixel[3]
2000     2001     2002     2003         2004     2005     2006     2007
The question is: are spixels[0] and spixels[1] guaranteed to be placed contiguously, as shown above?
Yes, they will be laid out contiguously as you say. Now, probably someone will come and say that it is not guaranteed on all platforms, because the alignment of the struct could be more than its size, so you could have a gap between the two struct "bodies" due to implicit padding after the first one. But no matter, because the alignment on any sane compiler and platform will be just 1 byte (as in char).
If I were writing code that relied on this, I'd add a compile-time assertion that the size of two of those structs should be exactly 8 bytes, and then I'd be 100% confident.
Edit: here's an example of how a compile-time check might work:
struct check {
    char floor[sizeof(FourPixels[2]) - 8];
    char ceiling[8 - sizeof(FourPixels[2])];
};
The idea is that if the size is not 8, one of the arrays will have negative size. If it is 8, they'll both have zero size. Note that this relies on a compiler extension (GCC supports zero-length arrays, for example), so you may want to look for a better way. I'm more of a C++ person, and we have fancier tricks for this (in C++11 it's built in: static_assert(); C11 similarly provides _Static_assert).
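With C11 the same check can be written directly, as a sketch:
#include <assert.h>   /* static_assert (C11) */

typedef struct
{
    unsigned char pixels[4];
} FourPixels;

/* Fails to compile if two elements do not occupy exactly 8 bytes. */
static_assert(sizeof(FourPixels[2]) == 8,
              "FourPixels[2] must be exactly 8 bytes");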
An array is guaranteed by the standard to be contiguous. It's also guaranteed that the first entry will be at a lower address in memory, the next at a higher one, and so on.
In the case of the structure's pixels array, pixels[1] will always come directly after pixels[0]. The same goes for the following entries.
Yes, arrays are placed in contiguous memory locations.
This is what allows pointer arithmetic.
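A tiny illustration (my own sketch, reusing FourPixels from the question):
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct
{
    unsigned char pixels[4];
} FourPixels;

int main(void)
{
    FourPixels a[2];
    ptrdiff_t stride = (char *)&a[1] - (char *)&a[0];

    /* Adjacent elements are exactly sizeof(FourPixels) bytes apart,
       which is what makes a + 1 and a[1] well defined. */
    assert(stride == (ptrdiff_t)sizeof(FourPixels));
    printf("element stride: %zu bytes\n", sizeof(FourPixels));
    return 0;
}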