Is there a benefit in using unsigned long for timeval members?

I noticed that some programmers use unsigned long for the tv_sec and tv_usec members of timeval (when they copy them or do arithmetic with them), even though both are defined as plain long.
It does make me wonder why they were defined that way, when time usually goes forward.

Using long int for those variables will work until the year 2038; after that, tv_sec will overflow on machines where long is 4 bytes.
POSIX specifies that timeval shall be defined as:
The <sys/time.h> header shall define the timeval structure that includes at least the following members:
time_t tv_sec Seconds.
suseconds_t tv_usec Microseconds.
Notice that the time_t type is used instead of long. time_t is a 32-bit representation on some systems and a 64-bit representation on others. To avoid the overflow, time_t will probably be changed to an unsigned 32-bit integer or to a 64-bit type.
That is why some are using unsigned long: it postpones the overflow until the year 2100+. You should use the time_t type instead, and then you won't need to think about how long your program is supposed to keep running in the future.
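For example, a minimal sketch of copying the members while keeping their declared types (using the POSIX gettimeofday call) might look like this:
#include <stdio.h>
#include <sys/time.h>

int main(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);          /* POSIX call to fill in the current time */
    time_t secs = tv.tv_sec;          /* keep the declared member types... */
    suseconds_t usecs = tv.tv_usec;   /* ...instead of casting to unsigned long */
    printf("%lld.%06ld\n", (long long)secs, (long)usecs);
    return 0;
}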

When Unix time was invented, negative times probably made sense. For instance, AT&T needed adequate timestamps for things that happened in the 1960s.
As for the microseconds: if you subtract two values, you can go into negative numbers with a signed type, and into 4+ billion with an unsigned one. Comparing with 0 seems more intuitive.

tv_sec has type time_t. tv_usec has type long, and needs to be signed because you will (on average 50% of the time) get negative results in tv_usec when subtracting timeval values to compute an interval of time, and you have to detect this and convert it to a borrow from the tv_sec field. The standard (POSIX) could have instead made the type unsigned and required you to detect wrapping in advance, but it didn't, probably because that would be harder to use and contradict existing practice.
There is also no reason, range-wise, for tv_usec to be unsigned, since the maximum range it really needs to be able to represent is -999999 to 1999998 (or several times that if you want to accumulate several additions/subtractions before renormalizing).
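To make that borrow concrete, here is a minimal sketch of such a subtraction (the function name timeval_sub is mine, not a standard API):
#include <sys/time.h>

/* Computes a - b, normalizing a negative tv_usec by borrowing from tv_sec. */
struct timeval timeval_sub(struct timeval a, struct timeval b)
{
    struct timeval d;
    d.tv_sec = a.tv_sec - b.tv_sec;
    d.tv_usec = a.tv_usec - b.tv_usec;
    if (d.tv_usec < 0) {      /* happens about half the time */
        d.tv_usec += 1000000; /* borrow one second's worth of microseconds */
        d.tv_sec -= 1;
    }
    return d;
}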

Related

Is it possible to create a custom sized variable type in c?

Good evening, and sorry in advance for my bad English; I'm French.
In C there are different variable types, for example int, long, ... each taking a number of bytes depending on the type. If I'm not wrong, the "largest" type is long long int (or just long long), which takes 8 bytes of memory (like long, which is weird, so if someone could explain that to me too, thanks).
So my first question is: can I create a custom variable type that takes, for example, 16 bytes, or am I forced to use strings if the number is too big for long long (or unsigned long long)?
You can create custom types of all sorts, and if you want an "integer" type that is 16 bytes wide, you could create a custom struct that pairs two long longs together. But then you'd have to implement all the arithmetic on that type manually. This was quite common in the past, when 16-bit (and even 32-bit) machines were the norm: you'd have "bigint" libraries to do, say, 64-bit integer math. That's less useful now that most machines are either 64-bit or support long long natively on 32-bit targets.
You used to see libraries with stuff like this quite often:
typedef struct BigInt {       /* note: the "_BigInt" tag seen in old code is technically a reserved identifier */
    unsigned long long high;  /* most significant 64 bits */
    unsigned long long low;   /* least significant 64 bits */
} BigInt;
// Arithmetic functions:
BigInt BigIntAdd(BigInt a, BigInt b);
// etc.
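As an illustration of what implementing the arithmetic manually involves, an add with carry detection might look like this (my own sketch, not taken from any particular library):
/* Carry out of the low word occurs exactly when the sum wraps
   below one of its operands (unsigned arithmetic is modular). */
BigInt BigIntAdd(BigInt a, BigInt b)
{
    BigInt r;
    r.low = a.low + b.low;                      /* may wrap around mod 2^64 */
    r.high = a.high + b.high + (r.low < a.low); /* add the carry, 0 or 1 */
    return r;
}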
These have faded away somewhat because the current typical CPU register width is 64 bits, which allows for an enormous range of values, and unless you're working with very specialized data, it's no longer common in normal programming tasks to need values outside that range. As #datenwolf is explicit and correct about in the comments below, if you find the need for such functionality in production code, seek out a reliable and debugged library for it. (Writing your own could be a fun exercise, though this sort of thing is likely to be a bug farm if you try to just whip it up as a quick step along the way to other work.) As Eric P indicates in the comments above, clang offers a native way of doing this without a third-party library.
(The weird ambiguities and equivalences in the widths of long and long long are mostly historical, and if you didn't evolve with the platforms, it's confusing and kind of unnecessary. See the comment on the question about this: the C standard defines minimum sizes for the integer types but doesn't say they have to be different from each other. Historically the types char, short, int, long and long long were often useful ways of distinguishing e.g. 8, 16, 32, and 64 bit sizes, but it's a bit of a mess now, and if you want a particular size, modern platforms provide types like uint32_t that guarantee the size, rather than the "classic" C types.)
Obviously you can. You should prefer not to use strings, because computations with those will be a lot more complicated and slower.
Also, you may not want to build on bytes, but on the second-largest datatype available on your compiler, because detecting overflow can be cumbersome if you're using the largest datatype.

Determine the fastest unsigned integer type for basic arithmetic calculations

I'm writing some code for calculating with arbitrarily large unsigned integers. This is just for fun and training; otherwise I'd use libgmp. My representation uses an array of unsigned integers, and for choosing the "base type", I use a typedef:
#include <limits.h>
#include <stdint.h>
typedef unsigned int hugeint_Uint;
typedef struct hugeint hugeint;
#define HUGEINT_ELEMENT_BITS (CHAR_BIT * sizeof(hugeint_Uint))
#define HUGEINT_INITIAL_ELEMENTS (256 / HUGEINT_ELEMENT_BITS)
struct hugeint
{
size_t s; // <- maximum number of elements
size_t n; // <- number of significant elements
hugeint_Uint e[]; // <- elements of the number starting with least significant
};
The code is working fine, so I only show the part relevant to my question here.
I would like to pick a better "base type" than unsigned int, so the calculations are the fastest possible on the target system (e.g. a 64-bit type when targeting x86_64, a 32-bit type when targeting i686, an 8-bit type when targeting avr_attiny, ...)
I thought that uint_fast8_t should do what I want. But I found out it doesn't, see e.g. here the relevant part of stdint.h from MinGW:
/* 7.18.1.3 Fastest minimum-width integer types
* Not actually guaranteed to be fastest for all purposes
* Here we use the exact-width types for 8 and 16-bit ints.
*/
typedef signed char int_fast8_t;
typedef unsigned char uint_fast8_t;
The comment is interesting: for which purpose would an unsigned char be faster than an unsigned int on win32? Well, the important thing is: uint_fast8_t will not do what I want.
So is there some good and portable way to find the fastest unsigned integer type?
It's not quite that black and white; processors may have different or specialized registers for certain operations, like the AVX registers on x86_64, may operate most efficiently on half-sized registers, or may not have registers at all. The choice of the "fastest integer type" thus depends heavily on the actual calculation you need to perform.
Having said that, C99 defines uintmax_t which is meant to represent the maximum width unsigned integer type, but beware, it could be 64 bit simply because the compiler is able to emulate 64-bit math.
If you target commodity processors, size_t usually provides a good approximation for the "bitness" of the underlying hardware because it is directly tied to the memory addressing capability of the machine, and as such is most likely to be the most optimal size for integer math.
In any case you're going to have to test your solution on all hardware that you're planning to support.
It's a good idea to start your code with the largest integer type the platform has, uintmax_t. As has already been pointed out, this is not necessarily the fastest, but it most probably is. There are exceptions where it is not, but as a default, it is probably your best bet.
Be very careful to build the size granularity into expressions that the compiler can resolve at compile time rather than at runtime, for speed.
It is most probably a good idea to define the base type as something like
#define LGINT_BASETYPE uintmax_t
#define LGINT_GRANUL sizeof(LGINT_BASETYPE)
This will allow you to change the base type in one single place and adapt to different platforms quickly. The result is code that is easily moved to a new platform, yet can still be adapted to the exceptional cases where the largest int type is not the most performant one (after you have proven that by measurement).
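For instance, a hypothetical helper built on those macros (the names LGINT_ELEMENT_BITS and lgint_elements_for_bits are mine, and the two defines are repeated so the sketch stands alone) keeps the granularity as a compile-time constant:
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define LGINT_BASETYPE uintmax_t
#define LGINT_GRANUL sizeof(LGINT_BASETYPE)
#define LGINT_ELEMENT_BITS (CHAR_BIT * LGINT_GRANUL)

/* Elements needed to hold a given number of bits, rounded up; the
   divisor is a compile-time constant the compiler can turn into a shift. */
static size_t lgint_elements_for_bits(size_t bits)
{
    return (bits + LGINT_ELEMENT_BITS - 1) / LGINT_ELEMENT_BITS;
}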
As always, it does not make a lot of sense to obsess over optimal performance when designing your code. Start with a reasonable balance of "designed for optimization" and "designed for maintainability"; you might easily find out that the choice of base type is not really the most CPU-eating part of your code. In my experience, I was nearly always in for some surprises when comparing my guesses on where CPU time is spent against my measurements. Don't fall into the premature-optimization trap.

Why does `gmtime` take a pointer?

According to the documentation, struct tm *gmtime(const time_t *timer) is supposed to convert the time_t pointed to by timer to a broken-down time.
Now is there a reason why they decided to make the function take a pointer to the time_t instead of passing the time_t directly?
As far as I can see, time_t is of arithmetic type, and it should therefore have been possible to pass it directly (also, I find it reasonable that it would have fit into a long). Also, there doesn't seem to be any specific handling of the NULL pointer (which could otherwise have motivated passing a pointer).
Is there something I'm missing? Something still relevant today?
From what I've seen, it's more of a historical quirk. When time.h was first introduced, and with it functions like time, it used a value that could not be returned (i.e., there was no long int yet). The standard defined an obscure time_t type that still leaves a lot of room for vendors to implement it in weird ways (it has to be an arithmetic type, but no ranges or maximum values are defined, for example) - C11 standard:
7.27.1 Components of time
[...]
3. The types declared are size_t (described in 7.19);
clock_t
and
time_t
which are real types capable of representing times;
4. The range and precision of times representable in clock_t and time_t are implementation-defined.
size_t is described in C11 as "the unsigned integer type of the result of the sizeof operator".
In light of this, your comment ("I find it reasonable that it would have fit into a long") is understandable, but it's incorrect, or at the very least inaccurate. POSIX, for example, requires time_t to be an integer or real-floating type. A long fits that description, but so would a long double, which doesn't fit in a long. A more accurate assumption would be that time_t is at least a 32-bit int (until 2038, at least), but that it is preferably a 64-bit type.
Anyway, back in those days, if a value couldn't be returned, the only alternative was to pass the memory to the function (which is a sensible thing to do).
This is why we have functions like
time_t time(time_t *t);
It doesn't really make sense to set the same value twice: once by returning it, and once using indirection, but the argument is there because originally, the function was defined as
time(time_t *t)
Note the lack of a return type. If time were added today, it'd either be defined as void time(time_t *) or, if the committee hadn't been drinking and had realized the absurdity of passing a pointer here, they'd define it as time_t time(void);
Looking at the C11 standard's description of the time function, the emphasis of the function's behaviour is on the return value. The pointer argument is mentioned briefly, but it certainly isn't given any significance:
7.27.2.4 The time function
1. Synopsis
#include <time.h>
time_t time(time_t *timer);
2. Description
The time function determines the current calendar time. The encoding of the value is unspecified.
3. Returns
The time function returns the implementation's best approximation to the current calendar time. The value (time_t)(-1) is returned if the calendar time is not available. If timer is not a null pointer, the return value is also assigned to the object it points to.
The main thing we can take from this is that, as far as the standard goes, the pointer is just a secondary way of getting at the return value of the function. Given that the return value indicates something went wrong ((time_t)(-1)), I'd argue that we should treat this function as though it was meant to be time_t time( void ).
But because the old implementation is still kicking around, and we've all gotten used to it, it's one of those things that should've been marked for deprecation; because it never really was, it will probably be part of the C language forever...
The only other reason why functions take a const time_t * like this is either historical or to maintain a consistent API across time.h, as far as I know.
TL;DR
Most time.h functions use pointers to the time_t type for historical, compatibility, and consistency reasons.
I knew I had read that stuff about the early days of the time function before; here's a related SO answer.
The reason is probably that in the old days a parameter could be no larger than an integer, so you couldn't pass a long and had to do that as a pointer. The definition of the function never changed.
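To make the calling convention concrete, here is the typical modern pattern: ignore time's argument, and use gmtime's pointer only because the API demands it (a minimal sketch):
#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t now = time(NULL);        /* pass NULL, use only the return value */
    struct tm *utc = gmtime(&now);  /* gmtime still insists on a pointer */
    if (utc != NULL)
        printf("%04d-%02d-%02d\n",
               utc->tm_year + 1900, utc->tm_mon + 1, utc->tm_mday);
    return 0;
}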

Software exposed Bit width in C

I have two questions:
Is there any method to specify or limit the bit widths used for integer variables in a C program?
Is there any way to monitor the actual bit usage for a variable in a C program? What I mean by bit usage is: in some programs, when a register is allocated for a variable, not all the bits of that register are used for calculations. So, when a program is executed, can we monitor how many bits in a register have actually been changed throughout the execution of the program?
You can use fixed-width (or guaranteed-at-least-this-many-bits) types in C as of the 1999 standard; see e.g. Wikipedia or any decent C reference. They are defined in the stdint.h header (cstdint in C++); the inttypes.h header (cinttypes in C++) includes stdint.h and adds conversion macros for printf and scanf.
You certainly can check, for each computation, what the values could be, and limit the variables accordingly. But unless you are seriously strapped for space, I'd just forget about this. In many cases, using "just large enough" data types wastes space (and computation time) by forcing casts of small values to the natural width for computation and back. Beware of premature optimization, and even more of optimizing the wrong code (measure whether the performance is sufficient, and if not, where modifications are worthwhile, before digging in to make code "better").
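A quick sketch of those types together with the matching format macros from inttypes.h:
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    uint8_t byte = 200;        /* exactly 8 bits, on platforms that provide it */
    int_least16_t w = -12345;  /* at least 16 bits, possibly more */
    uint_fast32_t f = 100000;  /* "fast" type of at least 32 bits */
    printf("%" PRIu8 " %" PRIdLEAST16 " %" PRIuFAST32 "\n", byte, w, f);
    return 0;
}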
You have limited control if you use <stdint.h>.
On most systems, it will provide:
int8_t and uint8_t for 8-bit integers.
int16_t and uint16_t for 16-bit integers.
int32_t and uint32_t for 32-bit integers.
int64_t and uint64_t for 64-bit integers.
You don't usually get other choices. However, you might be able to use a bit-field to get a more arbitrary size value:
typedef struct int24_t
{
signed int b24:24;
} int24_t;
This might occupy more than 24 bits (probably 32 bits), but arithmetic will end up being 24-bit. You're not constrained to a power of 2, or even a multiple of 2:
typedef struct int13_t
{
signed int b13:13;
} int13_t;
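To see the 13-bit arithmetic in action, here is a small self-contained sketch (the typedef is repeated so the snippet compiles on its own; note that storing an out-of-range value in a signed bit-field is implementation-defined, so the printed result may vary):
#include <stdio.h>

typedef struct int13_t
{
    signed int b13:13;  /* range -4096 .. 4095 on two's-complement targets */
} int13_t;

int main(void)
{
    int13_t x = { 4095 };  /* largest value the 13-bit field can hold */
    x.b13 += 1;            /* out-of-range result: implementation-defined,
                              typically wraps to -4096 */
    printf("%d\n", x.b13);
    return 0;
}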

What is the biggest useful value of time_t?

It seems remarkably difficult to find out what a valid range of values is for time_t.
On some platforms it is 32-bit, on most 64-bit, so it can easily enough be set to LONG_MAX. However, trying to then use that value doesn't really work properly. For instance, you can't pass it to localtime and turn it into a struct tm.
A quick test program that binary searches for the value tells me it is 67768036191676799. That corresponds to the end of the year 2147483647, so it makes sense as a value. But is this specified anywhere, and is there any reasonable, platform-independent value for a maximum usable time_t?
Indeed, the range and precision of time_t and clock_t are implementation-defined (C99 7.23.1).
This is one of those things where I would recommend not generating these values yourself, but relying on the implementation to create them for you, such as with mktime(), and using struct tm for directly manipulating the time. (time_t)-1 is the only value of time_t that is a "good" value you can use on your own.
I would specifically suggest that you not treat it as a 32 bit value of any type, as jgm suggests. You just never know if some strange embedded compiler will want to use a 16 bit time or 18 or who knows.
The safest way to use it is as a 32-bit signed, as long as you're happy with it not working 25 years from now.
Otherwise you're going to have to test the type yourself on whichever platforms you run on and act accordingly.
tm_year has type int, so if you will be converting to struct tm, the largest meaningful time_t is the value corresponding to year INT_MAX.
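For reference, a minimal sketch of the binary search the question describes (assuming a 64-bit time_t, and relying on gmtime returning NULL when it cannot convert the value):
#include <stdio.h>
#include <time.h>

int main(void)
{
    time_t lo = 0;
    time_t hi = (time_t)1 << 62;  /* assumes time_t is a 64-bit integer */
    while (lo < hi) {
        time_t mid = lo + (hi - lo + 1) / 2;
        if (gmtime(&mid) != NULL)
            lo = mid;             /* mid is still convertible */
        else
            hi = mid - 1;         /* past the largest usable value */
    }
    printf("largest convertible time_t: %lld\n", (long long)lo);
    return 0;
}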
