C cast: Is this expression coded correctly? [closed] - c

Closed. This question needs debugging details. It is not currently accepting answers.
Closed 3 years ago.
I've inherited some code that uses the ELM-ChaN FatFs library. It creates directories and files with either missing or wrong date-time stamps. I've traced the problem to what appears to be an incorrect cast.
I changed the cast and things seem to work, but I would like confirmation from more experienced programmers that the original cast was incorrectly constructed (or that this is a compiler issue).
Below is the original code. The curTime struct declares "Year" as uint16_t; the other members are uint8_t. They need to be packed into a 32-bit word (DWORD) following the DOS date/time bitmap format.
DWORD get_fattime(void)
{
DWORD tmr;
/* Pack date and time into a DWORD variable */
Calendar curTime = sdInterface->sdGetRTCTime();
tmr = ((DWORD)(curTime.Year-1980)<<25)
| ((DWORD)(curTime.Month) << 21)
| ((DWORD)(curTime.DayOfMonth) << 16)
| (curTime.Hours << 11)
| (curTime.Minutes << 5)
| (curTime.Seconds >> 1); //modified to truncate into two second intervals - jn
return tmr;
}
Below is the modified code: I explicitly cast Hours, Minutes and Seconds to the DWORD type. It is puzzling why the original author would cast Month and DayOfMonth but not the other uint8_t members.
DWORD get_fattime(void)
{
DWORD tmr;
/* Pack date and time into a DWORD variable */
Calendar curTime = sdInterface->sdGetRTCTime();
tmr = ((DWORD)(curTime.Year-1980)<<25)
| ((DWORD)(curTime.Month) << 21)
| ((DWORD)(curTime.DayOfMonth) << 16)
| ((DWORD)(curTime.Hours) << 11)
| ((DWORD)(curTime.Minutes) << 5)
| ((DWORD)(curTime.Seconds) >> 1); //modified to truncate into two second intervals - jn
return tmr;
}
The code seems to work, but I would like a sanity check from more experienced programmers.
I am updating this post to provide the requested information: first is the struct Calendar curTime.
//
//! \brief Used in the RTC_C_initCalendar() function as the CalendarTime
//! parameter.
//
//*****************************************************************************
typedef struct Calendar {
//! Seconds of minute between 0-59
uint8_t Seconds;
//! Minutes of hour between 0-59
uint8_t Minutes;
//! Hour of day between 0-23
uint8_t Hours;
//! Day of week between 0-6
uint8_t DayOfWeek;
//! Day of month between 1-31
uint8_t DayOfMonth;
//! Month between 1-12
uint8_t Month;
//! Year between 0-4095
uint16_t Year;
} Calendar;
The image below shows the directory of the SD card written by the original code (without the DWORD cast on the Hours, Minutes and Seconds uint8_t members). Notice the missing date-time stamp on two of the files. The directories are also missing their date-time stamps.
The last two images below show the results of the code with the DWORD cast applied to the uint8_t members. Both directories and files now have date-time stamps.
From the comments I have received so far, I am leaning towards this being a compiler error. The code was developed on an earlier version of the compiler; the new compiler is CCS v9.

[This is not really an answer, but it's too elaborate for a comment.]
Please try this program:
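#include <stdio.h>
#include <stdint.h>
typedef uint32_t DWORD; /* stand-in for your project's 32-bit DWORD; drop it if DWORD is already defined */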
int main()
{
uint8_t b = 9;
DWORD w1 = b << 6;
DWORD w2 = (DWORD)b << 6;
printf("%d %d\n", (int)w1, (int)w2);
}
The expected output of this program is
576 576
Shifting left by 6 bits is equivalent to multiplying by 64, and 9 × 64 = 576, so this makes sense.
If, on the other hand, you get the output
64 576
I believe this indicates a bug in your compiler. 64 is what you get if you take 9 and shift it left by 6 bits within an 8-bit field, meaning that you lose a bit off the left.
That (almost) makes sense, and it's what the answer posted by @jaz_n is getting at. However, your compiler is not supposed to shift anything left within an 8-bit field. When you write
x << 6
where x is type uint8_t, the first thing the compiler is supposed to do is promote x to type int, then shift it left by 6 bits in a field that is as wide as type int on your machine (typically 16 or 32 bits).
This explains why the three "extra" casts to DWORD should not be necessary. The fact that you had to add them suggests that your compiler may have a problem. If your compiler generates code that prints 64 576 for my test, this is additional evidence that your compiler is wrong.
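One caveat to the promotion argument, offered as a guess rather than a diagnosis: if the target's int is only 16 bits wide (the question does not say), `Hours << 11` reaches bit 15 of that int for any hour from 16 onward. Strictly speaking that is signed overflow, but in practice the result is usually a negative int, and converting it to a 32-bit DWORD extends the sign into the upper 16 bits, which are the date fields. The sketch below imitates that on a desktop compiler. The function names are hypothetical, and the conversion to int16_t assumes the usual wrap-around behaviour, which is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Imitates `hours << 11` evaluated in a 16-bit int and then converted to a
   32-bit DWORD. Assumption: converting an out-of-range value to int16_t
   wraps around, as it does on mainstream compilers. */
static uint32_t shift_hours_16bit_int(uint8_t hours)
{
    assert(hours <= 23);
    uint16_t raw = (uint16_t)((uint16_t)hours << 11); /* the 16 bits the shift leaves */
    int16_t as_int = (int16_t)raw;                    /* reinterpreted as a signed 16-bit int */
    return (uint32_t)(int32_t)as_int;                 /* sign-extended into 32 bits */
}

/* The same shift with the operand widened first, as in the modified code. */
static uint32_t shift_hours_with_cast(uint8_t hours)
{
    assert(hours <= 23);
    return (uint32_t)hours << 11;
}
```

Under this model 09:00 gives 0x00004800 either way, while 17:00 gives 0xFFFF8800 without the cast and 0x00008800 with it. That would be consistent with only some files losing their stamps, but it remains an assumption about the target.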

Minutes and Hours are 8-bit entities (uint8_t). In the first code snippet, the compiler did exactly what it was supposed to: shift Minutes and Hours to the left some number of bits. Those bits were lost and the residual bits were OR'd with the 32-bit DWORD. The bitmask failed because the most significant bits of Hours and Minutes were thrown away. In the second code snippet, Minutes and Hours are first cast into a 32-bit entity, so the shifted bits are preserved and the bitmask succeeds.
Seconds is shifted to the right with the intent to increase the granularity of the field (i.e. 2 Seconds per count) so technically you did not need to cast Seconds into a 32b entity.

C compilers can target different modes (16-bit, 32-bit, 64-bit). From the code you provided, it looks like DWORD is a 32-bit type.
The original code may work fine if it is compiled in 16-bit mode, because calculations are carried out in 16 bits by default when there is no explicit cast.
Please also keep in mind that when you migrate code from one platform to another, or change the compile option from 16-bit to 32-bit or 64-bit, some of your code may stop working properly.

Related

Bitshifting vs array indexing, which is more appropriate for usart interfaces on 32bit MCUs

I have an embedded project with a USART HAL. This USART can only transmit or receive 8 or 16 bits at a time (depending on the USART register I choose, i.e. single/double in/out). Since it's a 32-bit MCU, I figured I might as well pass around 32-bit fields as (from what I have been led to understand) this is a more efficient use of bits for the MPU. The same would apply to a 64-bit MPU, i.e. pass around 64-bit integers. Perhaps that is misguided advice, or advice taken out of context.
With that in mind, I have packed the 8 bits into a 32-bit field via bit-shifting. I do this for both tx and rx on the usart.
The code for the 8-bit only register is as follows (the 16-bit register just has half the amount of rounds for bit-shifting):
int zg_usartTxdataWrite(USART_data* MPI_buffer,
USART_frameconf* MPI_config,
USART_error* MPI_error)
{
MPI_error = NULL;
if(MPI_config != NULL){
zg_usartFrameConfWrite(MPI_config);
}
HPI_usart_data.txdata = MPI_buffer->txdata;
for (int i = 0; i < USART_TXDATA_LOOP; i++){
if((USART_STATUS_TXC & usart->STATUS) > 0){
usart->TXDATAX = (i == 0 ? (HPI_usart_data.txdata & USART_TXDATA_DATABITS) : (HPI_usart_data.txdata >> SINGLE_BYTE_SHIFT) & USART_TXDATA_DATABITS);
}
usart->IFC |= USART_STATUS_TXC;
}
return 0;
}
EDIT: RE-ENTERING LOGIC OF ABOVE CODE WITH ADDED DEFINES FOR CLARITY OF TERNARY OPERATOR IMPLICIT PROMOTION PROBLEM DISCUSSED IN COMMENTS SECTION
(the HPI_usart and USART_data structs are the same just different levels, I have since removed the HPI_usart layer, but for the sake of this example I will leave it in)
#define USART_TXDATA_LOOP 4
#define SINGLE_BYTE_SHIFT 8
typedef struct HPI_USART_DATA{
...
uint32_t txdata;
...
}HPI_usart;
HPI_usart HPI_usart_data = {'\0'};
const uint8_t USART_TXDATA_DATABITS = 0xFF;
int zg_usartTxdataWrite(USART_data* MPI_buffer,
USART_frameconf* MPI_config,
USART_error* MPI_error)
{
MPI_error = NULL;
if(MPI_config != NULL){
zg_usartFrameConfWrite(MPI_config);
}
HPI_usart_data.txdata = MPI_buffer->txdata;
for (int i = 0; i < USART_TXDATA_LOOP; i++){
if((USART_STATUS_TXC & usart->STATUS) > 0){
usart->TXDATAX = (i == 0 ? (HPI_usart_data.txdata & USART_TXDATA_DATABITS) : (HPI_usart_data.txdata >> SINGLE_BYTE_SHIFT) & USART_TXDATA_DATABITS);
}
usart->IFC |= USART_STATUS_TXC;
}
return 0;
}
However, I now realize that this is potentially causing more issues than it solves because I am essentially internally encoding these bits which then have to be decoded almost immediately when they are passed through to/from different data layers. I feel like it's a clever and sexy solution, but I'm now trying to solve a problem that I shouldn't have created in the first place. Like how to extract variable bit fields when there is an offset i.e. in gps nmea sentences where the first 8 bits might be one relevant field and then the rest are 32bit fields. So it ends up being like this:
32-bit array member 0:
bits 24-31 bits 16-23 bits 8-15 bits 0-7
| 8-bit Value | 32-bit Value A, bits 24-31 | 32-bit Value A, bits 16-23 | 32-bit Value A, bits 8-15 |
32-bit array member 1:
bits 24-31 bits 16-23 bits 8-15 bits 0-7
| 32-bit Value A, bits 0-7 | 32-bit Value B, bits 24-31 | 32-bit Value B, bits 16-23 | 32-bit Value B, bits 8-15 |
32-bit array member 2:
bits 24-31 16-23 8-15 ...
| 32-bit Value B, bits 0-7 | etc... | .... | .... |
The above example requires manual decoding, which is fine I guess, but it's different for every nmea sentence and just feels more manual than programmatic.
My question is this: bitshifting vs array indexing, which is more appropriate?
Should I just have assigned each incoming/outgoing value to a 32-bit array member and then just index that way? I feel like that is the solution since it would not only make it easier to traverse the data on other layers, but I would be able to eliminate all this bit-shifting logic and then the only difference between an rx or tx function would be the direction the data is going.
It does mean a small rewrite of the interface and the resulting gps module layer, but that feels like less work and also a cheap lesson early on in my project.
Also any thoughts and general experience on this would be great.
Since it's a 32-bit MCU, I figured I might as well pass around 32-bit fields
That's not really the programmer's call to make. Put the 8 or 16 bit variable in a struct. Let the compiler add padding if needed. Alternatively you can use uint_fast8_t and uint_fast16_t.
My question is this: bitshifting vs array indexing, which is more appropriate?
Array indexing is for accessing arrays. If you have an array, use it. If not, then don't.
While it is possible to chew through larger chunks of data byte by byte, such code must be written much more carefully, to prevent running into various subtle type conversion and pointer aliasing bugs.
In general, bit shifting is preferred when accessing data up to the CPU's word size, 32 bits in this case. It is fast and also portable, so you don't have to take endianness into account. It is the preferred method of serialization/de-serialization of integers.
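As a minimal sketch of that last point, the helpers below split a 32-bit value into four bytes and reassemble it using shifts only. The function names and the most-significant-byte-first order are choices made for this illustration, not something taken from the question's HAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 32-bit value as four bytes, most significant byte first.
   The byte order on the wire is fixed by the shifts, not by the CPU. */
static void u32_to_bytes(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Rebuild the 32-bit value; each byte is widened before it is shifted. */
static uint32_t u32_from_bytes(const uint8_t in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because the byte positions come from the shift amounts, the same code gives the same byte stream on little-endian and big-endian hosts.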

Why wrap a struct with a union?

I saw a code snippet from a good answer for Is it possible to insert three numbers into 2 bytes variable?
For example, I want to store date which contain days, months, years.
days -> 31, months -> 12, years -> 99.
I want to store 31, 12, 99 in one variable, and will use shift operators << and >> to manipulate it.
//Quoted: the C code from that answer
union mydate_struct {
struct {
uint16_t day : 5; // 0 - 31
uint16_t month : 4; // 0 - 12
uint16_t year : 7; // 0 - 127
};
uint16_t date_field;
};
Now, my question is: why wrap the struct with a union? What are the special benefits besides memory-related concerns?
PS: I know some typical usage to make sure memory size with union.
Because if it is just to use struct, it seems more direct and simple to use:
typedef struct {
uint16_t day : 5; // 0 - 31
uint16_t month : 4; // 0 - 12
uint16_t year : 7; // 0 - 127
} mydate_struct;
Update1:
Some conclusions about the benefits of wrapping with a union here:
Can initialize the year, month and day simultaneously
The advantage of using the union is that given union mydate_struct u;
you can write u.date_field = 0x3456; and initialize the year, month
and day fields simultaneously. It is defined by the implementation
what that does, and different implementations could define it
differently. There's a modest chance that the year will be 0x56, the
month 0x08, and the day 0x06 (aka 86-08-06 — century not clearly
defined); there's also a modest chance that the year will be 0x1A, the
month 0x02, and the day 0x1A (aka 26-02-26 — century still not clearly
defined). People have forgotten Y2K already. ----comment of @Jonathan Leffler
You can read/write the whole number at once. (----comment of @StenSoft)
A union means that every member in it uses the same memory, so you can use the first or the second member (which can be completely different things). In your case, it's either the whole struct or the uint16_t date_field.
In the context of the linked question, the writer intended to use it to convert a two-byte struct to a two-byte integer and vice versa: assign something to the struct and read the int value from the same memory. But this is not allowed in C++ and may not work (for a multitude of reasons). It's not possible to arbitrarily switch between which member is used.
A union shares its memory among its member variables, so the size of a union is the size of its biggest member. That is the reason the struct is wrapped in a union together with the variable uint16_t date_field;
the user can use the same 16 bits of memory either as the struct or as the variable date_field to hold the data.
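For comparison, the shift-and-mask approach the question mentions can produce the same 16-bit value without relying on how a compiler lays out bit-fields. This is only a sketch: the layout (day in bits 0-4, month in bits 5-8, year in bits 9-15) and the function names are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Layout assumed for this sketch:
   bits 0-4 day, bits 5-8 month, bits 9-15 year. */
static uint16_t pack_date(unsigned day, unsigned month, unsigned year)
{
    assert(day <= 31 && month <= 15 && year <= 127);
    return (uint16_t)(day | (month << 5) | (year << 9));
}

static unsigned date_day(uint16_t d)   { return d & 0x1Fu; }
static unsigned date_month(uint16_t d) { return (d >> 5) & 0x0Fu; }
static unsigned date_year(uint16_t d)  { return (d >> 9) & 0x7Fu; }
```

With this layout, day 22, month 2, year 26 packs to 0x3456 on every compiler, whereas the value read through the union depends on the implementation's bit-field ordering.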

Bit field extract with struct in c

I use these two methods to get bit field information from registers. The location of the bit field that I need to extract is given by the Intel manual, as in the code below. But the two methods give different results.
I cannot find any problem in either method. But as I understand it, the maximum_power field should not be 0 as in the first method (it is a value that Intel has already defined in the register).
Method 1:
typedef struct rapl_parameters_msr_t {
uint64_t thermal_spec_power : 15;
uint64_t : 1;
uint64_t minimum_power : 15;
uint64_t : 1;
uint64_t maximum_power : 15;
uint64_t : 1;
uint64_t maximum_limit_time_window : 6;
uint64_t : 10;
} rapl_parameters_msr_t;
uint64_t msr;
read_msr(cpu, 0x614, &msr);
rapl_parameters_msr_t domain_msr = *(rapl_parameters_msr_t *)&msr;
printf("%ld\n", domain_msr.thermal_spec_power); //print: 280
printf("%ld\n", domain_msr.minimum_power); //print: 192
printf("%ld\n", domain_msr.maximum_power); //print: 0
printf("%ld\n", domain_msr.maximum_limit_time_window); //print: 16
Method 2:
uint64_t
extractBitField(uint64_t inField, uint64_t width, uint64_t offset)
{
uint64_t bitMask;
uint64_t outField;
if ((offset+width) == 32)
{
bitMask = (0xFFFFFFFF<<offset);
}
else
{ /* Keep just the field that needs to be extracted */
bitMask = (0xFFFFFFFF<<offset) ^ (0xFFFFFFFF<<(offset+width));
}
/*Move to the right most field to be calculated*/
outField = (inField & bitMask) >> offset;
return outField;
}
uint64_t flags;
read_msr(cpu, 0x614, &flags);
printf("thermal power: %d\n", extractBitField(flags,15,0)); //print: 280
printf("minimum power: %d\n", extractBitField(flags,15,16));//print: 192
printf("maximum power: %d\n", extractBitField(flags,15,32));//print: 0
printf("time window: %d\n", extractBitField(flags,6,48)); //print: 0
Do you have any insights where the problem would be?
Update:
Sorry for the confusing part. I changed all types to uint64_t, and method 2 now gets 0 for both maximum power and time window.
If the compiler can produce a wrong result for method 1, I still doubt how much I can trust the result of method 2.
The following is the bit represented documentation from Intel Manual:
Thermal Spec Power (bits 14:0)
Minimum Power (bits 30:16)
Maximum Power (bits 46:32)
Maximum Time Window (bits 53:48)
Thanks to David, this is the right version for 64-bit extraction.
uint64_t
extractBitField(uint64_t inField, uint64_t width, uint64_t offset)
{
uint64_t bitMask;
uint64_t outField;
if ((offset+width) == 64)
{
bitMask = (0xFFFFFFFFFFFFFFFF<<offset);
}
else
{ /* Keep just the field that needs to be extracted */
bitMask = (0xFFFFFFFFFFFFFFFF<<offset) ^ (0xFFFFFFFFFFFFFFFF<<(offset+width));
}
/*Move to the right most field to be calculated*/
outField = (inField & bitMask) >> offset;
return outField;
}
uint64_t flags;
read_msr(cpu, 0x614, &flags);
printf("thermal power: %llu\n", (unsigned long long)extractBitField(flags,15,0)); //print: 280
printf("minimum power: %llu\n", (unsigned long long)extractBitField(flags,15,16));//print: 192
printf("maximum power: %llu\n", (unsigned long long)extractBitField(flags,15,32));//print: 0
printf("time window: %llu\n", (unsigned long long)extractBitField(flags,6,48)); //print: 16
The ordering of bits in C bitfields is implementation defined, so be careful if you plan on using them-- the order you think you're getting may not be what you actually are. Check your compiler's documentation to see how it handles this.
Also, your second function takes in a uint32 while your first example is using a 64 bit struct, so your types aren't matching up. Can you correct that and update your results?
edit: Additionally, you have the time windows defined as six bits in the first example and 15 in the second.
C99:6.7.2.1p10 An implementation may allocate any addressable storage
unit large enough to hold a bit-field. If enough space remains, a
bit-field that immediately follows another bit-field in a structure
shall be packed into adjacent bits of the same unit. If insufficient
space remains, whether a bit-field that does not fit is put into the
next unit or overlaps adjacent units is implementation-defined. The
order of allocation of bit-fields within a unit (high-order to
low-order or low-order to high-order) is implementation-defined. The
alignment of the addressable storage unit is unspecified.
You have tried two ways to do the same thing, and I wouldn't trust either of them.
First, bit fields. Don't use them! The ordering of bit fields is unreliable, the behaviour of anything other than unsigned int is unreliable, the distribution of bit fields across struct members is unreliable. All these things can be fixed, but it just isn't worth it.
Second, shift and mask. This is the right way to do it but the code is wrong. You have a 32-bit mask (0xffffffff) shifted by 32 and 48 bits. Not a good idea at all.
So, what you need to do is write a simple reliable function that is an implementation of the signature given.
extractBitField(uint64_t inField, uint64_t width, uint64_t offset)
This is a good place to start. Write the function in a test program and unit test it until you are 100% certain it works exactly right. Step through with the debugger, check out all the shift combinations. Be absolutely sure you have it right.
When the test program works properly then transfer the function to the real program and watch it work first time.
I guess I could code that function for you but I don't think I will. You really need to go through the exercise so you learn how it works and why.
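For readers who want something to compare their own attempt against, here is one possible shape for such a function. It is a sketch under stated assumptions, not the answerer's code: the name extract_bits is hypothetical, width is taken to be 1 to 64, and offset plus width must not exceed 64. It shifts the value down first and then masks, so no shift count ever reaches the width of the type.

```c
#include <assert.h>
#include <stdint.h>

/* Return `width` bits of `value`, starting at bit `offset` (bit 0 = LSB).
   The full-width case is handled separately because shifting a 64-bit
   value by 64 is undefined in C. */
static uint64_t extract_bits(uint64_t value, unsigned width, unsigned offset)
{
    assert(width >= 1 && width <= 64);
    assert(offset < 64 && offset + width <= 64);
    uint64_t mask = (width == 64) ? UINT64_MAX
                                  : ((UINT64_C(1) << width) - 1u);
    return (value >> offset) & mask;
}
```

A quick unit test can build a register image by hand, for example 280 in bits 14:0, 192 in bits 30:16 and 16 in bits 53:48, and check that each field comes back unchanged.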

logic operators & bit separation calculation in C (PIC programming)

I am programming a PIC18F94K20 to work in conjunction with an MCP7941X I2C RTCC chip and a 24AA128 I2C CMOS Serial EEPROM device. Currently I have code which successfully initialises the seconds/days/etc. values of the RTCC and starts the timer, toggling an LED upon the turnover of every second.
I am attempting to augment the code to read back the correct data for these values, however I am running into trouble when I try to account for the various 'extra' bits in the values. The memory map may help elucidate my problem somewhat:
Taking, for example, the hours column, or the 02h address. Bit 6 is set as 1 to toggle 12 hour time, adding 01000000 to the hours bit. I can read back the entire contents of the byte at this address, but I want to employ an if statement to detect whether 12 or 24 hour time is in place, and adjust accordingly. I'm not worried about the 10-hour bits, as I can calculate that easily enough with a BCD conversion loop (I think).
I earlier used the bitwise OR operator in C to augment the original hours data to 24. I initialised the hours in this particular case to 0x11, and set the 12 hour control bit which is 0x64. When setting the time:
WriteI2C(0x11|0x64);
which as you can see uses the bitwise OR.
When reading back the hours, how can I incorporate operators into my code to separate the superfluous bits from the actual time bits? I tried doing something like this:
current_seconds = ReadI2C();
current_seconds = ST & current_seconds;
but that completely ruins everything. It compiles, but the device gets 'stuck' on this sequence.
How do I separate the ST / AMPM / VBATEN bits from the actual data I need, and what would a good method be of implementing for loops for the various circumstances they present (e.g. reading back 12 hour time if bit 6 = 0 and 24 hour time if bit6 = 1, and so on).
I'm a bit of a C novice and this is my first foray into electronics so I really appreciate any help. Thanks.
To remove (zero) a bit, you can AND the value with a mask having all other bits set, i.e., the complement of the bits that you wish to zero, e.g.:
value_without_bit_6 = value & ~(1<<6);
To isolate a bit within an integer, you can AND the value with a mask having only those bits set. For checking flags this is all you need to do, e.g.,
if (value & (1<<6)) {
// bit 6 is set
} else {
// bit 6 is not set
}
To read the value of a small integer offset within a larger one, first isolate the bits, and then shift them right by the index of the lowest bit (to get the least significant bit into correct position), e.g.:
value_in_bits_4_and_5 = (value & ((1<<4)|(1<<5))) >> 4;
For more readable code, you should use constants or #defined macros to represent the various bit masks you need, e.g.:
#define BIT_VBAT_EN (1<<3)
if (value & BIT_VBAT_EN) {
// VBAT is enabled
}
Another way to do this is to use bitfields to define the organisation of bits, e.g.:
typedef union {
struct {
unsigned ones:4;
unsigned tens:3;
unsigned st:1;
} seconds;
uint8_t byte;
} seconds_register_t;
seconds_register_t sr;
sr.byte = READ_ADDRESS(0x00);
unsigned int seconds = sr.seconds.ones + sr.seconds.tens * 10;
A potential problem with bitfields is that the code generated by the compiler may be unpredictably large or inefficient, which is sometimes a concern with microcontrollers, but obviously it's nicer to read and write. (Another problem often cited is that the organisation of bit fields, e.g., endianness, is largely unspecified by the C standard and thus not guaranteed portable across compilers and platforms. However, it is my opinion that low-level development for microcontrollers tends to be inherently non-portable, so if you find the right bit layout I wouldn't consider using bitfields “wrong”, especially for hobbyist projects.)
Yet you can accomplish similarly readable syntax with macros; it's just the macro itself that is less readable:
#define GET_SECONDS(r) ( ((r) & 0x0F) + (((r) & 0x70) >> 4) * 10 )
uint8_t sr = READ_ADDRESS(0x00);
unsigned int seconds = GET_SECONDS(sr);
Regarding the bit masking itself, you are going to want to make a model of that memory map in your microcontroller. The simplest, crudest way to do that is to #define a number of bit masks, like this:
#define REG1_ST 0x80u
#define REG1_10_SECONDS 0x70u
#define REG1_SECONDS 0x0Fu
#define REG2_10_MINUTES 0x70u
...
And then when reading each byte, mask out the data you are interested in. For example:
bool st = (data & REG1_ST) != 0;
uint8_t ten_seconds = (data & REG1_10_SECONDS) >> 4;
uint8_t seconds = (data & REG1_SECONDS);
The important part is to minimize the amount of "magic numbers" in the source code.
Writing data:
reg1 = 0;
reg1 |= st ? REG1_ST : 0;
reg1 |= (ten_seconds << 4) & REG1_10_SECONDS;
reg1 |= seconds & REG1_SECONDS;
Please note that I left out the I2C communication of this.
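Applied to the hours register from the question, the same masking idea could look like the sketch below. The register layout is an assumption that should be checked against the memory map in the datasheet: bit 6 selects 12-hour mode (mask 0x40, i.e. decimal 64), bit 5 is the PM flag in 12-hour mode or the second tens bit in 24-hour mode, bit 4 is the low tens bit, and bits 3-0 hold the BCD ones digit. The macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HOUR_12H_MODE 0x40u /* bit 6: set = 12-hour format (assumed) */
#define HOUR_PM       0x20u /* bit 5: PM flag, 12-hour format only (assumed) */

/* Convert the raw hours register to an hour in the range 0-23. */
static uint8_t hours_to_24h(uint8_t reg)
{
    uint8_t ones = reg & 0x0Fu;
    assert(ones <= 9); /* BCD digit */
    if (reg & HOUR_12H_MODE) {
        /* 12-hour format: one tens bit, values 1..12 */
        uint8_t hour12 = (uint8_t)(((reg >> 4) & 0x01u) * 10u + ones);
        uint8_t base = (uint8_t)(hour12 % 12u); /* 12 maps to 0 */
        return (reg & HOUR_PM) ? (uint8_t)(base + 12u) : base;
    }
    /* 24-hour format: two tens bits, values 0..23 */
    return (uint8_t)(((reg >> 4) & 0x03u) * 10u + ones);
}
```

For example, a register value of 0x51 (12-hour mode, AM, BCD 11) decodes to 11, and 0x23 (24-hour mode, BCD 23) decodes to 23.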

What is the smallest number of bytes that can store a timestamp?

I want to create my own time stamp data structure in C.
DAY ( 0 - 30 ), HOUR ( 0 - 23 ), MINUTE ( 0 - 59 )
What is the smallest data structure possible?
Well, you could pack it all in an unsigned short (That's 2 bytes, 5 bits for Day, 5 bits for hour, 6 bits for minute)... and use some shifts and masking to get the values.
unsigned short timestamp = <some value>; // Bits: DDDDDHHHHHMMMMMM
int day = (timestamp >> 11) & 0x1F;
int hour = (timestamp >> 6) & 0x1F;
int min = (timestamp) & 0x3F;
unsigned short dup_timestamp = (unsigned short)((day << 11) | (hour << 6) | min);
or using macros
#define DAY(x) (((x) >> 11) & 0x1F)
#define HOUR(x) (((x) >> 6) & 0x1F)
#define MINUTE(x) ((x) & 0x3F)
#define TIMESTAMP(d, h, m) ((((d) & 0x1F) << 11) | (((h) & 0x1F) << 6) | ((m) & 0x3F))
(You didn't mention month/year in your current version of the question, so I've omitted them).
[Edit: use unsigned short - not signed short.]
Do you mean HOUR 0-23 and MINUTE 0-59? I've heard of leap seconds but not leap minutes or hours.
(log (* 31 60 24) 2)
=> 15.446
So you can fit these values in 16 bits, or 2 bytes. Whether this is a good idea or not is a completely different question.
Month: range 1 - 12 => 4 bits
Date: range 1 - 31 => 5 bits
Hour: range 0 - 24 => 5 bits
Minute: range 0 - 60 => 6 bits
Total: 20 bits
You can use a bitfield and use a compiler/platform specific pragma to keep it tight:
typedef struct packed_time_t {
unsigned int month : 4;
unsigned int date : 5;
unsigned int hour : 5;
unsigned int minute : 6;
} packed_time_t;
But do you really need this? Wouldn't the standard time functions be enough? Bitfields vary depending on architecture, padding and so on ... not a portable construct.
Why not just use the (4-byte?) output of the C time() function with NULL as an argument. It's just the Unix epoch time (i.e. the number of seconds since January 1st, 1970). Like Joe's answer, it gives you much more room to grow than any answer that tries to pack in months and days and years into bits. It's standard. Converting the time_t variable to an actual time is trivial in standard C (on Unix, at least) and most of the time, if you have a data structure intended to hold a 3 byte variable, it may be rounded up to 4 bytes anyway.
I know you're trying to optimize heavily for size, but 4 bytes is pretty damn small. Even if you truncate off the top byte, you still get 194 days of distinct times out of it.
You can get even more out of this by taking the time from time(NULL) and dividing it by 60 before storing it, truncating it to a minute and storing that. 3 bytes of that gives you, as shown above, 388 months, and for 2 bytes you can store 45 days.
I would go with the 4-byte version, simply because I don't see the difference between 2, 3 and 4 bytes as being at all significant or vital to any program running or not (unless it's a bootloader). It's simpler to get and simpler to handle, and will probably save you many headaches in the end.
EDIT: The code I posted didn't work. I've had 3 hours of sleep and I'll figure out how to do the bit-twiddling correctly eventually. Until then, you can implement this yourself.
Note: The original question has been edited, and the month is no longer necessary. The original calculations were below:
It's simply a matter of how much computation you want to do. The tightest way to pack it is if you can make your own type, and use the following math to convert from and to its corresponding integer:
Valid ranges are:
Month: 1-12 -> (0-11)+1
Day: 1-31 -> (0-30)+1
Hour: 0-24
Minute: 0-60
You can choose an order to store the values in (I'll keep it in the above order).
Month-1 Day-1 Hour Minute
(0-11) (0-30) (0-23) (0-59)
Do a bit of multiplication/division to convert the values using the following formula as a guide:
value = (((Month - 1) * 31 + (Day - 1)) * 24 + Hour) * 60 + Minute
So, you have the minimum value 0 and the maximum value ((11*31+30)*24+23)*60+59, which is 535,679. So you need 20 bits minimum to store this value as an unsigned integer (2^20-1 = 1,048,575; 2^19-1 = 524,287).
If you want to make things difficult but save a byte, you can use 3 bytes and manipulate them yourself. Or you can use an int (32-bit) and work with it normally using simple math operators.
BUT There's some room to play with there though, so let's see if we can make this easier:
Valid ranges are, again:
Month: 1-12 -> (0-11)+1 --- 4 bits (you don't even need the -1)
Day: 1-31 -> (0-30)+1 --- 5 bits (you again don't need the -1)
Hour: 0-24 --- 5 bits
Minute: 0-60 --- 6 bits
That's a total of 20 bits, and really easy to manipulate. So you don't gain anything by compacting any further than using simple bit-shifting, and you can store the value like this:
19 18 17 16 | 15 14 13 12 11 | 10  9  8  7  6 |  5  4  3  2  1  0
---Month--- | -----Day------ | -----Hour----- | -----Minute------
If you don't care about the month, the tightest you can get is:
value = ((Day - 1) * 24 + Hour) * 60 + Minute
leaving you with a range of 0 to 44,639 which can fit neatly in a 16-bit short.
There's some room to play with there though, so let's see if we can make this easier:
Valid ranges are, again:
Day: 1-31 -> (0-30)+1 --- 5 bits (you don't even need the -1)
Hour: 0-24 --- 5 bits
Minute: 0-60 --- 6 bits
That's a total of 16 bits, and again really easy to manipulate. So....store the value like this:
15 14 13 12 11 | 10  9  8  7  6 |  5  4  3  2  1  0
-----Day------ | -----Hour----- | -----Minute------
For the use case you describe, (minute resolution of times in a 31 day range) I'd just use a 16-bit minute counter. If you're serializing this data (to disk, network) then you can use some variable length integer encoding to save bytes for small values.
In general you can compute this answer as follows (where log2 is the base 2 logarithm, i.e. the number of bits):
If you want to use shifts and masks to get the data in and out, take log2() of the number of possible values for each field, round up (to get bits), add the results (to get total bits), divide by eight (total bytes, w. fractional bytes), and round up again (total bytes).
log2(60) + log2(60) + log2(24) + log2(31) + log2(12) = 6+6+5+5+4 = 26 bits = 4 bytes
If you want to get the fields in and out by multiplying & adding / dividing & modulo, multiply together the number of possible values for each field and take log2() of that, divide by eight, and round up.
log2(60*60*24*31*12) = 24.9379 bits = 4 bytes
You can save a tiny additional amount of space by combining non-isoformal fields (e.g. storing day of year rather than month and day of month) but it is seldom worth it.
log2(60*60*24*366) = 24.91444 bits = 4 bytes
-- MarkusQ "teach a man to fish"
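The two calculations above can be checked with a small helper that returns the rounded-up base-2 logarithm, i.e. the number of bits needed to tell `count` values apart. This is a sketch; the names bits_needed and bytes_needed are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest number of bits b such that 2^b >= count. */
static unsigned bits_needed(uint32_t count)
{
    assert(count >= 1);
    unsigned bits = 0;
    while ((UINT64_C(1) << bits) < count) {
        bits++;
    }
    return bits;
}

/* Round a bit count up to whole bytes. */
static unsigned bytes_needed(unsigned bits)
{
    return (bits + 7u) / 8u;
}
```

Summing bits_needed over the fields 60, 60, 24, 31 and 12 gives 6+6+5+5+4 = 26 bits for the shift-and-mask layout, while bits_needed(60*60*24*31*12) gives 25 bits for the multiply-and-add layout. Both round up to 4 bytes.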
just to offer an alternative:
if you only need minute-level resolution,
and you don't cross date boundaries (month/year)
and your messages are sequential with guaranteed delivery
then you can store the timestamp as an offset from the timestamp of the last message.
In this case, you only need enough bits to hold the maximum number of minutes between messages. For example, if you emit messages at most 255 minutes apart, then one byte will suffice.
Note, however, that the very first message may need to include an absolute timestamp in its payload, for synchronization.
[i'm not saying this is a good solution - it's fairly fragile and makes a lot of assumptions - just an alternative one]
60 Minutes/Hour means you'd need at least 6 bits to store the minute (since 59th minute == 111011b), while 24 Hours/Day means another 5 bits (23rd hour == 10111b). If you want to account for any of the (possibly) 366 Days/Year, you'd need 9 more bits (366th day (365 when day 1 == 0) == 101101101b). So if you wanted to store everything in a purely accessible format, you'd need 20 bits == 3 Bytes. Alternatively, adding a Month field would make the total possible Days value go from 366 to 31 -- down to 5 bits, with 4 more bits for the month. This would also give you 20 bits, or 3 bytes with 4 bits to spare.
Conversely, if you kept track of the date just by minutes from some start date, 3 bytes would give you a resolution of 16,777,215 minutes before you rolled over to 0 again -- that's about 279,620 hours, 11,650 days, and about 388 months, and that's using all 24 bits. That's probably a better way to go, if you don't care about seconds, and if you don't mind taking a little bit of execution time to interpret the hour, day and month. And this would be much easier to increment!
5 bits for the day plus
5 bits for the hour plus
6 bits for the minute equals an unsigned short. Any further packing would not reduce the storage space required and would increase code complexity and cpu usage.
Well, disregarding the superfluous HOUR 24 and MINUTE 60, we have 31 x 24 x 60 = 44,640 possible unique time values. 2^15 = 32,768 < 44,640 < 65,536 = 2^16 so we'll need at least 16 bits (2 bytes) to represent these values.
If we don't want to be doing modulo arithmetic to access the values each time, we need to be sure to store each in its own bit field. We need 5 bits to store the DAY, 5 bits to store the HOUR, and 6 bits to store the MINUTE, which still fits in 2 bytes:
struct day_hour_minute {
unsigned char DAY:5;
unsigned char HOUR:5;
unsigned char MINUTE:6;
};
Including the MONTH would increase our unique time values by a factor of 12, giving 535,680 unique values, which would require at least 20 bits to store (2^19 = 524,288 < 535,680 < 1,048,576 = 2^20), which requires at least 3 bytes.
Again, to avoid modulo arithmetic, we need a separate bit field for MONTH, which should only require 4 bits:
struct month_day_hour_minute {
unsigned char MONTH:4;
unsigned char DAY:5;
unsigned char HOUR:5;
unsigned char MINUTE:6;
unsigned char unused: 4;
};
In both of these examples, however, be aware that C implementations usually align and pad data structures (often to multiples of 4 or 8 bytes), so your structure may be padded beyond what is minimally necessary.
For example, on my machine,
#include <stdio.h>
struct day_hour_minute {
unsigned int DAY:5;
unsigned int HOUR:5;
unsigned int MINUTE:6;
};
struct month_day_hour_minute {
unsigned int MONTH:4;
unsigned int DAY:5;
unsigned int HOUR:5;
unsigned int MINUTE:6;
unsigned int unused: 4;
};
#define DI( i ) printf( #i " = %zu\n", (size_t)(i) )
int main(void) {
DI( sizeof(struct day_hour_minute) );
DI( sizeof(struct month_day_hour_minute) );
return 0;
}
prints:
sizeof(struct day_hour_minute) = 4
sizeof(struct month_day_hour_minute) = 4
To simplify this without loss of generality,
Day (0 - 30), Hour (0 - 23), Minute (0 - 59)
encoding = Day + (Hour + (Minute)*24)*31
Day = encoding %31
Hour = (encoding / 31) % 24
Minute = (encoding / 31) / 24
The maximum value of encoding is 44639 which is slightly less than 16 bits.
Edit: rampion said basically the same thing. And this gets you the minimal representation, which is less than the bitwise interleaving representation.
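The formulas above translate directly into a pair of functions. This is a minimal sketch using the same zero-based ranges (Day 0-30, Hour 0-23, Minute 0-59); the function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* encoding = Day + (Hour + Minute * 24) * 31, with zero-based fields. */
static uint16_t encode_dhm(unsigned day, unsigned hour, unsigned minute)
{
    assert(day <= 30 && hour <= 23 && minute <= 59);
    return (uint16_t)(day + (hour + minute * 24u) * 31u);
}

/* Reverse the encoding with division and modulo. */
static void decode_dhm(uint16_t code, unsigned *day, unsigned *hour, unsigned *minute)
{
    assert(day != NULL && hour != NULL && minute != NULL);
    *day = code % 31u;
    *hour = (code / 31u) % 24u;
    *minute = (code / 31u) / 24u;
}
```

The largest input (30, 23, 59) encodes to 44639, which fits in an unsigned 16-bit integer with room to spare.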
