How should year-only dates in ISO 8601 be interpreted? - data-modeling

I'm looking at yet another date specification that relies on the ISO 8601 standard. I understand that ISO 8601 accepts dates whose value gives only the year, e.g. "2007". The question is: how are they interpreted? 1) Only the year is given and nothing is said about the day or month, either because they're unknown or because we want to refer to a year generically, as in "the next Year of the Dog in Chinese astrology"; or 2) January 1 is implicit?
Did they define a standard interpretation for this, or is everyone free to do as they want (i.e., spread incompatibility), or does everyone de facto do as they want?
As clarified in the comments, my main interest is in modelling schemas for data exchange and interoperability and hence it's important for me to know if/how these things are specified.

Related

What does it mean if there's a letter I in the middle of my datetime string?

I'm trying to understand datetime strings that look like this:
2019/04/18 0823:40I:45
2019/05/17 0024:23I:53
Most of it is clear, but I can't imagine what the I in the middle represents. Is this a standard datetime format I'm unfamiliar with?
Edit: These values came from a dataset provided by a US-based company, and some of the other data is english text.
Datetime strings sometimes contain letters like T or Z, but I have never actually seen an I before.
T is used as a literal to separate the date from the time, and Z means "zero hours offset".
It must be something similar to this, maybe "minutes" or "seconds" in another language.
If you don't want to have strings in your data format, you can use
// 'T' and 'Z' are quoted literals; setting UTC keeps the literal 'Z' accurate.
SimpleDateFormat format = new SimpleDateFormat(
    "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'", Locale.US);
format.setTimeZone(TimeZone.getTimeZone("UTC"));
(in Java, as you haven't specified the language you are using), or you can search for something similar in your own language.
Hope this helps!
It’s not a common standard. It is probably a home-grown format.
If the I is always there and always uppercase I, I suggest that you can ignore it provided that you know what the times mean.
I would guess that 0823:40I:45 likely means 08:23:40.450, but without any confirmation I wouldn’t be sure.
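To make that guess concrete, here is a minimal Python sketch that parses the strings under the assumed layout YYYY/MM/DD HHMM:SSI:FF, treating the last two digits as hundredths of a second. The layout, the pattern and the function name parse_guess are assumptions for illustration only; nothing confirms that this is what the data provider meant.

```python
import re
from datetime import datetime

# Assumed layout (a guess, not a documented format): YYYY/MM/DD HHMM:SSI:FF
PATTERN = re.compile(
    r"(\d{4})/(\d{2})/(\d{2}) (\d{2})(\d{2}):(\d{2})I:(\d{2})"
)


def parse_guess(text):
    """Parse the string, assuming FF is hundredths of a second."""
    match = PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unexpected layout: {text!r}")
    year, month, day, hour, minute, second, hundredths = map(int, match.groups())
    # datetime stores microseconds, so one hundredth is 10_000 microseconds.
    return datetime(year, month, day, hour, minute, second, hundredths * 10_000)


print(parse_guess("2019/04/18 0823:40I:45").isoformat(timespec="milliseconds"))
```

Rejecting anything that does not match the pattern exactly makes it easier to notice rows where the I is missing or replaced by another letter, which would be a hint that the guess is wrong.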
A far-fetched guess could be that the letter I refers to the military time zone also known as “India Time Zone”, because India is the NATO phonetic word for I (the zone has nothing to do with the country of India). It’s unlikely when your date-time strings come from the US, because India Time Zone is +09:00 and therefore far away from the USA. It also would make no sense to put the time zone before the fraction of a second (if that is what the last two digits are).
Link: Time Zone Abbreviations – Military Time Zone Names

snprintf: Are there any C Standard Proposals/plans to change the description of this func?

Are there any proposals (or plans) to change the last sentence of the description of the snprintf function in the C Standard, so that the ambiguity described in my answer to the question "Is snprintf() ALWAYS null terminating?" is resolved?
Or how (using which links) can I determine for myself whether there are any such proposals?
Is there any search engine that can show all the currently active proposals about the snprintf function?
The only link I currently know is this one - http://www.open-std.org/jtc1/sc22/wg14/ - and this is the first time I have given any thought to a proposal for a Standard.
From the WG14 page you can find a list of all C11 defect reports (DR).
I can only find one DR about snprintf, DR 428, though I haven't checked whether it covers the same issue.
As for how to propose a DR, I suppose you have to go through your national standard institute and contact your national part of WG14, which in the US would be INCITS PL22.11.

How to get seconds since unix epoch in Ada?

I feel very stupid, as I can't seem to get a plain Natural number representing the seconds since the Unix epoch (01/01/1970 00:00:00) in Ada. I've read Ada.Calendar and its subpackages up and down but can't find a sensible way of achieving that, even though Ada.Calendar.Clock itself is supposed to be exactly what I want...
I am at my wits' end. Any nudge in the right direction?
Using Ada.Calendar.Formatting, construct a Time representing the epoch.
Epoch : constant Time := Formatting.Time_Of(1970, 1, 1, 0.0);
Examine the difference between Ada.Calendar.Clock and Epoch.
Put(Natural(Clock - Epoch)'Img);
Check the result against this epoch display or the Unix command date +%s.
See Rationale for Ada 2005: §7.3 Times and dates and Rationale for Ada 2012: §6.6 General miscellanea for additional details.
According to the POSIX standard, Unix time does not account for leap seconds, while Ada.Calendar."-" handles them:
For the returned values, if Days = 0, then Seconds + Duration(Leap_Seconds) = Calendar."–" (Left, Right).
One option is to split Ada.Calendar.Time into pieces using Ada.Calendar.Formatting.Split and gather back using POSIX algorithm.
The best option seems to be to use Ada.Calendar.Arithmetic.Difference. It returns Days, Seconds and Leap_Seconds. You can then combine Days * 86_400 + Seconds to get Unix time, and Leap_Seconds will be explicitly thrown away as required by POSIX.
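The arithmetic itself can be illustrated outside Ada. The following is a minimal Python sketch, not Ada code: Python's datetime has no notion of leap seconds, so it only mirrors the Days * 86_400 + Seconds combination described above. The names EPOCH and unix_seconds are made up for this example.

```python
from datetime import datetime, timezone

# The Unix epoch as an aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def unix_seconds(moment):
    """Combine whole days and seconds within the day: days * 86_400 + seconds."""
    delta = moment - EPOCH
    # timedelta normalizes itself so that 0 <= delta.seconds < 86_400.
    return delta.days * 86_400 + delta.seconds


sample = datetime(2011, 11, 6, 8, 16, 22, tzinfo=timezone.utc)
print(unix_seconds(sample))
```

The result can be compared with the output of date +%s for the same instant.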
I have recently been solving this problem and posted a library into public domain.
In the GNAT implementation of Ada, there is a private package Ada.Calendar.Conversions which contains Ada <-> Unix conversions used by the children of Calendar.

How to represent end-of-time in a database?

I am wondering how to represent an end-of-time (positive infinity) value in the database.
When we were using a 32-bit time value, the obvious answer was the actual 32-bit end of time - something near the year 2038.
Now that we're using a 64-bit time value, we can't represent the 64-bit end of time in a DATETIME field, since 64-bit end of time is billions of years from now.
Since SQL Server and Oracle (our two supported platforms) both allow years up to 9999, I was thinking that we could just pick some "big" future date like 1/1/3000.
However, since customers and our QA department will both be looking at the DB values, I want it to be obvious and not appear like someone messed up their date arithmetic.
Do we just pick a date and stick to it?
Use the max collating date, which, depending on your DBMS, is likely going to be 9999-12-31. You want to do this because queries based on date ranges will quickly become miserably complex if you try to take a "purist" approach like using Null, as suggested by some commenters or using a forever flag, as suggested by Marc B.
When you use the max collating date to mean "forever" or "until further notice" in your date ranges, queries like the following become clear and simple:
Find me records that are in effect as of a given point in time.
... WHERE effective_date <= #PointInTime AND expiry_date >= #PointInTime
Find me records that are in effect over the following time range.
... WHERE effective_date <= #StartOfRange AND expiry_date >= #EndOfRange
Find me records that have overlapping date ranges.
... WHERE A.effective_date <= B.expiry_date AND B.effective_date <= A.expiry_date
Find me records that have no expiry.
... WHERE expiry_date = #MaxCollatingDate
Find me time periods where no record is in effect.
OK, so this one isn't simple, but it's simpler using max collating dates for the end point. See: this question for a good approach.
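As a rough illustration of the first and fourth queries, here is a small Python sketch using the standard sqlite3 module with an in-memory database. The table name price, its rows and the helper names are invented for the example; dates are stored as ISO 8601 text, which compares correctly as plain strings.

```python
import sqlite3

MAX_DATE = "9999-12-31"  # assumed sentinel meaning "no expiry"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price (item TEXT, effective_date TEXT, expiry_date TEXT)"
)
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?)",
    [
        ("old", "2010-01-01", "2010-12-31"),
        ("current", "2011-01-01", MAX_DATE),
    ],
)


def in_effect(point_in_time):
    """Records in effect as of a date; no NULL handling is needed."""
    rows = conn.execute(
        "SELECT item FROM price "
        "WHERE effective_date <= ? AND expiry_date >= ? ORDER BY item",
        (point_in_time, point_in_time),
    )
    return [item for (item,) in rows]


def never_expiring():
    """Records whose expiry date is the sentinel."""
    rows = conn.execute(
        "SELECT item FROM price WHERE expiry_date = ? ORDER BY item",
        (MAX_DATE,),
    )
    return [item for (item,) in rows]


print(in_effect("2010-06-15"), in_effect("2025-01-01"), never_expiring())
```

Both predicates are ordinary comparisons, which is the point of the sentinel: the open-ended row is found by the same WHERE clause as the closed one.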
Using this approach can create a bit of an issue for some users, who might find "9999-12-31" confusing in a report or on a screen. If this is going to be a problem for you, then drdwicox's suggestion of translating to a user-friendly value is good. However, I would suggest that the user interface layer, not the middle tier, is the place to do this, since the most sensible or palatable value may differ depending on whether you are talking about a report or a data entry form and whether the audience is internal or external. For example, in some places you might want a simple blank. In others you might want the word "forever". In others you may want an empty text box with a check box that says "Until Further Notice".
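A presentation-layer translation of this kind could look roughly like the Python sketch below. The function name display_expiry and the context values "form" and "report" are hypothetical, and 9999-12-31 is assumed to be the sentinel.

```python
from datetime import date

MAX_DATE = date(9999, 12, 31)  # assumed sentinel meaning "no expiry"


def display_expiry(expiry, context):
    """Translate the sentinel for display; real dates pass through unchanged."""
    if expiry != MAX_DATE:
        return expiry.isoformat()
    # Hypothetical contexts: a data entry form shows a blank, a report shows a word.
    return "" if context == "form" else "forever"


print(display_expiry(date(2012, 3, 1), "report"))
print(repr(display_expiry(MAX_DATE, "form")))
print(display_expiry(MAX_DATE, "report"))
```

Keeping the translation in one function per output channel means the stored value never changes, only how it is shown.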
In PostgreSQL, the end of time is 'infinity'. It also supports '-infinity'. The value 'infinity' is guaranteed to be later than all other timestamps.
create table infinite_time (
ts timestamp primary key
);
insert into infinite_time values
(current_timestamp),
('infinity');
select *
from infinite_time
order by ts;
2011-11-06 08:16:22.078
infinity
PostgreSQL has supported 'infinity' and '-infinity' since at least version 8.0.
You can mimic this behavior, in part at least, by using the maximum date your DBMS supports. But the maximum date might not be the best choice. PostgreSQL's maximum timestamp is some time in the year 294,276, which is sure to surprise some people. (I don't like to surprise users.)
2011-11-06 08:16:21.734
294276-01-01 00:00:00
infinity
A value like this is probably more useful: '9999-12-31 11:59:59.999'.
2011-11-06 08:16:21.734
9999-12-31 11:59:59.999
infinity
That's not quite the maximum value in the year 9999, but the digits align nicely. You can wrap that value in an infinity() function and in a CREATE DOMAIN statement. If you build or maintain your database structure from source code, you can use macro expansion to expand INFINITY to a suitable value.
We sometimes pick a date, then establish a policy that the date must never appear unfiltered. The most common place to enforce that policy is in the middle tier. We just filter the results to change the "magic" end-of-time date to something more palatable.
Representing the notion of "until eternity" or "until further notice" is an iffy proposition.
Relational theory proper says that there is no such thing as null, so you're obliged to have whatever table it is split in two: one part with the rows for which the end date/end time is known, and another for the rows for which the end time is not yet known.
But, like having a null, splitting the table in two will make a mess of your query writing too. Views can somewhat accommodate the read-only parts, but updates (or writing the INSTEAD OF triggers on your view) will be tough no matter what, and are likely to hurt performance as well.
Having the null represent "end time not yet known" will make updating a bit "easier", but the read queries get messy with all the CASE ... or COALESCE ... constructs you'll need.
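The extra work on the read side can be seen in a small Python sketch using the standard sqlite3 module. The table, rows and function names are invented for illustration; a NULL expiry_date stands for "end time not yet known".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE price (item TEXT, effective_date TEXT, expiry_date TEXT)"
)
conn.executemany(
    "INSERT INTO price VALUES (?, ?, ?)",
    [
        ("old", "2010-01-01", "2010-12-31"),
        ("current", "2011-01-01", None),  # NULL: end time not yet known
    ],
)


def in_effect_naive(point):
    """Plain comparison: rows with a NULL expiry silently drop out."""
    rows = conn.execute(
        "SELECT item FROM price WHERE effective_date <= ? AND expiry_date >= ?",
        (point, point),
    )
    return [item for (item,) in rows]


def in_effect(point):
    """COALESCE substitutes a far-future date for NULL in every such query."""
    rows = conn.execute(
        "SELECT item FROM price WHERE effective_date <= ? "
        "AND COALESCE(expiry_date, '9999-12-31') >= ?",
        (point, point),
    )
    return [item for (item,) in rows]


print(in_effect_naive("2025-01-01"), in_effect("2025-01-01"))
```

The naive version returns nothing for the open-ended row because a comparison with NULL is never true, so every range query has to remember the COALESCE.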
Using the theoretically correct solution mentioned by dportas gets messy in all those cases where you want to "extract" a DATE from a DATETIME. If the DATETIME value at hand is "the end of (representable) time (billions of years from now as you say)", then this is not just a simple case of invoking the DATE extractor function on that DATETIME value, because you'd also want that DATE extractor to produce the "end of representable DATEs" for your case.
Plus, you probably do not want to show "absent end of time" as being a value 9999-12-31 in your user interface. So if you use the "real value" of the end of time in your database, you're facing a bit of work seeing to it that that value won't appear in your UI anywhere.
Sorry for not being able to say that there's a way to stay out of all messes. The only choice you really have is which mess to end up in.
Don't make a date be "special". While it's unlikely your code would be around in 9999 or even in 2^63-1, look at all the fun that using '12/31/1999' caused just a few years ago.
If you need to signal an "endless" or "infinite" time, then add a boolean/bit field to signal that state.

Calculate time period using C

How do I calculate the time period between 2 dates in C (any library, etc.)?
The program should take two (local) dates as input and provide the duration period between them as output.
For example,
startDate = OCT-09-1976 and endDate = OCT-09-2008
should show a duration of 32 years.
startDate = OCT-09-1976 and endDate = DEC-09-2008
should show a duration of 32 years and 2 months.
Thanks.
Convert the dates into two struct tm structures with strptime, then into time_t values with mktime.
difftime gives you the difference between the two in seconds.
Convert that into months etc with the code here (in C++, but the only C++ is for the string formatting, easy to change)
EDIT: as a commenter observed, that avoids the month issue. There is (GPL'd) code for isodiff_from_secs that can be converted to do what you want, if you're happy with its assumption that months have 30 days. See Google codesearch and the description of the standard here.
Doing the fully correct solution, which takes account of the true months between the actual days, would be pretty complex. Is that required for your problem?
I did something very similar recently using Boost.Date_Time, and presenting the resulting function as C, but this of course requires using the C++ linker.
Actually, the example leaves a little to be desired - will the start and end dates always be on the same day of the month? If so, you can ignore the day number and end up with a trivial subtraction of the month and year numbers.
However if your dates can be anywhere in the month it might be a bit more tricky. Remember that not all months have the same number of days!
C difftime doesn't help you with month calculations, which is why I used Boost, though you may not have that option.
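The calendar-based subtraction can be sketched as follows. This is written in Python for brevity rather than C; the same arithmetic applies to the tm_year, tm_mon and tm_mday fields of struct tm. The function name years_months_between is made up, and the rule that an incomplete final month is not counted is one possible convention, not the only one.

```python
from datetime import date


def years_months_between(start, end):
    """Return (years, months) counting only whole calendar months."""
    if end < start:
        raise ValueError("end must not precede start")
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month has not been completed yet
    return divmod(months, 12)


print(years_months_between(date(1976, 10, 9), date(2008, 10, 9)))
print(years_months_between(date(1976, 10, 9), date(2008, 12, 9)))
```

Because it works on year, month and day numbers directly, it never needs to know how many days a given month has.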

Resources