Different timezones in a PostgreSQL application - database
We have a system that serves many distribution centers. A distribution center is a physical space anywhere in the country. One client may have more than one distribution center. Now that we are expanding to more places, we will have to face the problem of different timezones. The same client may also have centers in different timezones.
A lot of events can be created and saved (with date and hour) using our system in the client centers. The ideal behavior for different timezones in the same client is the following:
Considering an event that happens in timezone A, at noon. If a supervisor of another distribution center in timezone B goes check this event, he should see the date and hour of that event respecting the timezone where the event was originally created (including DST changes, if existing).
This is because what matters is whether the event was created at noon in the event's timezone. It does not matter to the supervisor that it was 2 PM in his own timezone when the event was created.
We use PostgreSQL as our database and I see that exists two different types to save timestamps. TIMESTAMP and TIMESTAMPTZ. All of our database uses only the type TIMESTAMP.
Another scenario that might (or might not) happen is the case where a distribution center moves geographically. This may result in a change of its timezone.
I have done some research and found that the more “correct” approach (at least is what it seems) is to save the timezone of every center in the DISTRIBUTION_CENTER table. Change all types from TIMESTAMP to TIMESTAMPTZ in our database and in every insert of an event that saves the timestamp, we should use the timezone of the center where the event was created to save the offset of the timezone of the event (the TIMESTAMPTZ only saves the offset of the timezone, not the timezone itself).
I’m not fully convinced that this is the right (or best) approach to deal with different timezones. As I have never implemented anything like this, I can’t say.
If I have to follow this approach, I will have to change all column types from TIMESTAMP to TIMESTAMPTZ in our database. All views that depend on those columns in some way will also have to be recreated, because we will be changing the column type. I will also have to change all queries that deal with those columns to apply the center timezone using AT TIME ZONE.
The database has now the America/Sao_Paulo timezone setted up and my fear is ending up doing something wrong when changing the column type to TIMESTAMPTZ. Can this change potentially break data consistency? Should I first change the column type, or first change the database timezone to UTC?
Is the solution that I described the best way of doing this?
Does that approach also deal correctly with the DST changes?
Extra information: our server uses Java (Jersey).
These issues have been covered many times on Stack Overflow & the sister site https://dba.stackexchange.com/. Please, next time, search thoroughly before posting. So, just a recap here.
First you must understand the difference between offset-from-UTC and time zone. An offset is merely a number of hours, minutes, and seconds displacement from UTC. A time zone is the history of past, present, and future changes in the offset used by the people of a particular region. So using a time zone is always preferable to a mere offset.
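To make the distinction concrete, here is a minimal Java sketch, assuming the java.time classes and the tz database bundled with the JVM. The class name OffsetVsZoneDemo and the helper offsetAt are made up for illustration. It asks a single time zone for its offset at two different moments and gets two different answers, because Brazil still observed DST in early 2018.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class OffsetVsZoneDemo {

    // Returns the offset-from-UTC that the given zone used at the given UTC moment.
    static ZoneOffset offsetAt(String zoneName, String utcMoment) {
        ZoneId zone = ZoneId.of(zoneName);
        Instant moment = Instant.parse(utcMoment);
        return zone.getRules().getOffset(moment);
    }

    public static void main(String[] args) {
        // Southern-hemisphere summer: DST was in effect in Sao Paulo.
        System.out.println(offsetAt("America/Sao_Paulo", "2018-01-15T12:00:00Z")); // -02:00
        // Southern-hemisphere winter: standard time.
        System.out.println(offsetAt("America/Sao_Paulo", "2018-07-15T12:00:00Z")); // -03:00
    }
}
```

One zone name, two offsets: storing only "-03:00" would lose the information needed to compute the correct wall-clock time for the January moment.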
exists two different types to save timestamps. TIMESTAMP and TIMESTAMPTZ
Not precisely. The actual types as defined by the SQL standard are TIMESTAMP WITH TIME ZONE and TIMESTAMP WITHOUT TIME ZONE. The other names are Postgres-specific synonyms. I suggest sticking with the standard names for clarity. Date-time handling is confusing enough without the ambiguity of reading/remembering the z on the end.
The SQL spec barely touches on the topic of date-time handling. So behavior varies widely between database implementations.
The way Postgres works is really quite simple.
For a column of type TIMESTAMP WITH TIME ZONE, any input passed with an offset-from-UTC or a time zone is automatically adjusted into UTC. After adjusting, the original value’s offset/zone info is discarded. The “with time zone” really means “with respect for the incoming data’s time zone” rather than “stored with time zone”. If you must know the original offset/zone, you must store that yourself in a separate column. I suggest doing so as text using the ISO 8601 standard formats for offset, or the proper name of the time zone. If an input lacks any indicator of zone/offset, Postgres assumes the session’s current default time zone (the TimeZone setting) and then adjusts to UTC. Relying on that is fragile, so you should never pass an input lacking zone/offset!
For a column of TIMESTAMP WITHOUT TIME ZONE, any zone/offset info with an input (if any) is ignored. Similarly, when retrieved this value has no zone/offset. This type has no concept of zone/offset. Do not use this type when your intention is to store a moment, a point on the timeline. This type is for a vague idea about potential moments along a range of about 26-27 hours, such as “Christmas Day starts after midnight on December 25, 2018”. Such a sentence has no real meaning until you append "in Japan", "in India", or "in France" (thereby creating a value in the other type, TIMESTAMP WITH TIME ZONE). This type is also used for future appointments more than several weeks out, when politicians may potentially change their region’s offset (which they surprisingly do often and with little forewarning).
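The behavior of the two types can be imitated in plain Java, which may help build intuition. This is only an analogy using java.time, not a call into Postgres; the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

final class TimestampTypesDemo {

    // Imitates TIMESTAMP WITH TIME ZONE: use the input's offset to adjust to UTC, then forget it.
    static OffsetDateTime asStoredWithTimeZone(String input) {
        return OffsetDateTime.parse(input).withOffsetSameInstant(ZoneOffset.UTC);
    }

    // Imitates TIMESTAMP WITHOUT TIME ZONE: keep the date and time-of-day, ignore the offset.
    static LocalDateTime asStoredWithoutTimeZone(String input) {
        return OffsetDateTime.parse(input).toLocalDateTime();
    }

    public static void main(String[] args) {
        String input = "2018-07-15T12:00:00-03:00";
        System.out.println(asStoredWithTimeZone(input));    // 2018-07-15T15:00Z
        System.out.println(asStoredWithoutTimeZone(input)); // 2018-07-15T12:00
    }
}
```

The first result is a definite moment; the second is only a date and a time-of-day, which could refer to many different moments depending on where you imagine it.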
BEWARE: Confusingly, some tools or drivers may apply the session’s current default time zone to values of either type. This includes pgAdmin. A terrible anti-feature in my opinion. Well-intentioned, but such a tool/driver sitting between you and Postgres should not be injecting its “opinion” about the data in transit. Doing so creates the illusion that the data retrieved carries that zone inside the database when precisely the opposite is true (it actually carries either UTC or no zone/offset). If your tool makes such an adjustment, it is likely controlled by a zone/offset setting in your Postgres session, as discussed here.
Best practice in date-time handling is to think, work, store, log, and exchange data using UTC. Think of other zones as mere variations on that theme. Adjust into time zones only when required by business logic or for presentation to users. Forget all about your own parochial time zone. Get a second clock on your desk set to UTC – seriously.
The database has now the America/Sao_Paulo timezone setted up
The default time zone of your server OS should be irrelevant. Never depend on such a default as a programmer as it is well out of your control, and is so easily changed.
In Java, the JVM has its own current default time zone, separate from that of the host OS. The JVM’s current default may be changed during runtime at any moment by any code in any thread of any app within that JVM. So never depend on the current default. Always specify your desired/expected time zone explicitly by passing the optional argument.
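As a rough illustration of passing the zone explicitly, the sketch below asks for "today" in two zones at the same moment. The helper todayIn and the fixed Clock are assumptions made for the example; the fixed clock is there only so the output is repeatable.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class ExplicitZoneDemo {

    // "Today" is only meaningful for a named zone, so the zone is a required argument.
    static LocalDate todayIn(ZoneId zone, Clock clock) {
        return LocalDate.now(clock.withZone(zone));
    }

    public static void main(String[] args) {
        // A fixed clock makes the result repeatable; production code could pass Clock.systemUTC().
        Clock fixed = Clock.fixed(Instant.parse("2018-01-15T01:00:00Z"), ZoneOffset.UTC);
        System.out.println(todayIn(ZoneId.of("America/Sao_Paulo"), fixed)); // 2018-01-14
        System.out.println(todayIn(ZoneId.of("Asia/Kolkata"), fixed));      // 2018-01-15
    }
}
```

Calling LocalDate.now() with no argument would silently pick whichever zone the JVM happens to have as its default at that instant.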
If a supervisor of another distribution center in timezone B goes check this event, he should see
As discussed above, on the database side you should be working in UTC. Adjusting into a time zone expected by the user is a user-interface task, not a database task. Just like with internationalization, where you would store some kind of key-lookup value in the database, to be localized to some human language on the user-interface side.
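For the scenario in the question, the presentation step might look like the following sketch. It assumes the event's moment comes back from the database as UTC and that the event's zone is known (for example, from the center's record). The class name, the display helper, and the output pattern are illustrative choices, not anything prescribed.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

final class PresentInEventZone {

    private static final DateTimeFormatter FORMAT = DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm");

    // Renders a UTC moment as wall-clock time in the zone where the event happened.
    static String display(Instant storedMoment, ZoneId eventZone) {
        return storedMoment.atZone(eventZone).format(FORMAT);
    }

    public static void main(String[] args) {
        Instant stored = Instant.parse("2018-07-15T15:00:00Z"); // as read from the database
        System.out.println(display(stored, ZoneId.of("America/Sao_Paulo"))); // 2018-07-15 12:00
    }
}
```

The supervisor's own zone never enters the calculation, so every viewer sees noon for an event created at noon in São Paulo.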
the TIMESTAMPTZ only saves the offset of the timezone, not the timezone itself
No, incorrect. As discussed above, the TIMESTAMP WITH TIME ZONE type discards the offset/zone info after adjusting into UTC for storage. No offset, no zone, just a UTC moment is stored in the column — basically, a count of microseconds since an epoch reference.
Change all types from TIMESTAMP to TIMESTAMPTZ in our database and in every insert of an event that saves the timestamp, we should use the timezone of the center where the event was created to save the offset of the timezone of the event
If you are saying that you already have recorded date-time values from various time zones into a column of TIMESTAMP WITHOUT TIME ZONE, then you have an awful mess. You cannot reliably clean it up, not with full certainty of accuracy, as you do not really know what zone/offset was originally intended for the inputs passed to the database. You can guess the original intent of the stored data, but you can never be sure.
Explain to your boss & stakeholders that this is not a mess of your making. Explain that whoever set up this database & app did the equivalent of storing money amounts in various currencies such as Yen, Canadian dollars, British pounds, and Euros without bothering to record which currency on each amount.
If you want to guess, you would need to know the name of time zones that were likely used.
In Java, use only the java.time classes built into Java 8 and later. The older date-time classes are a bloody awful mess, now legacy, supplanted by java.time as defined in JSR 310.
Identify your possible zones.
ZoneId zoneSaoPaulo = ZoneId.of( "America/Sao_Paulo" ) ;
ZoneId zoneLisbon = ZoneId.of( "Europe/Lisbon" ) ;
ZoneId zoneKolkata = ZoneId.of( "Asia/Kolkata" ) ;
Extract the date-time value as a LocalDateTime, the Java class for a date-time value lacking any concept of zone/offset. With JDBC 4.2 and later, you may directly exchange java.time objects with the database.
LocalDateTime ldt = myResultSet.getObject( … , LocalDateTime.class ) ;
Perhaps an enum would be an appropriate way to represent your distribution centers. This assumes the list need not change during runtime.
import java.time.ZoneId ;

public enum DistributionCenter {
    // List the constants to be constructed automatically when this class loads.
    SAOPAULO( ZoneId.of( "America/Sao_Paulo" ) ) ,
    LISBON( ZoneId.of( "Europe/Lisbon" ) ) ,
    KOLKATA( ZoneId.of( "Asia/Kolkata" ) ) ;  // The semicolon ends the list of constants.

    public final ZoneId zoneId ;  // Make public, or add a getter method to access private member.

    // Constructor taking the passed `ZoneId` and storing it in the member variable.
    DistributionCenter( ZoneId zoneId ) {
        this.zoneId = zoneId ;
    }
}
Apply the zone, to generate a ZonedDateTime object. Now we have an actual moment, a specific point on the timeline.
DistributionCenter dc = … ;
ZonedDateTime zdt = ldt.atZone( dc.zoneId ) ;
Adjust that value into a UTC value. Same moment, same point on the timeline, different wall-clock time. Do not proceed with your project until you understand that concept clearly.
The Instant class represents a moment on the timeline in UTC with a resolution of nanoseconds (up to nine (9) digits of a decimal fraction).
Instant instant = zdt.toInstant() ;
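The idea of one moment with several wall-clock readings can be sketched as follows. The helper wallClockIn is a made-up name; it takes a wall-clock reading in one zone and reports what clocks in another zone showed at that same moment.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;

final class SameMomentDemo {

    // Converts a wall-clock reading in one zone to the reading, at that same moment, in another zone.
    static LocalDateTime wallClockIn(LocalDateTime local, ZoneId from, ZoneId to) {
        return local.atZone(from).withZoneSameInstant(to).toLocalDateTime();
    }

    public static void main(String[] args) {
        LocalDateTime noon = LocalDateTime.parse("2018-07-15T12:00:00");
        ZoneId saoPaulo = ZoneId.of("America/Sao_Paulo");
        System.out.println(wallClockIn(noon, saoPaulo, ZoneId.of("UTC")));           // 2018-07-15T15:00
        System.out.println(wallClockIn(noon, saoPaulo, ZoneId.of("Europe/Lisbon"))); // 2018-07-15T16:00
        System.out.println(wallClockIn(noon, saoPaulo, ZoneId.of("Asia/Kolkata")));  // 2018-07-15T20:30
    }
}
```

All three lines describe the same point on the timeline; only the clock on the wall differs.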
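The JDBC 4.2 specification requires drivers to support OffsetDateTime for TIMESTAMP WITH TIME ZONE columns, but support for Instant and ZonedDateTime is optional, and your driver may reject them. So convert to an OffsetDateTime in UTC before handing the value over. I just want to drive home the point that we are ending up with a UTC value in Postgres storage. Plus, I keep the Instant around for easy debugging. Remember: UTC is The One True Time.
Now that we have determined an actual moment, we can store it in the database.
OffsetDateTime odt = instant.atOffset( ZoneOffset.UTC ) ;
myPreparedStatement.setObject( … , odt ) ;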
Note how none of this code depends on the current default time zone of your server host OS, your Postgres cluster, or your JVM.
I will have to change all column types from TIMESTAMP to TIMESTAMPTZ in our database
Yes. Data recording an actual moment, a piece of history, should never have been stored in TIMESTAMP WITHOUT TIME ZONE. Some naïve programmers/DBAs hope that using this data type may somehow exempt them from dealing with time zone issues. But actually this is a “pay now, or pay later” situation. Unfortunately, you are the one stuck paying for their poor choice.
You likely could do this same kind of work within a Postgres procedure. Postgres does have much better support for date-time work than most databases. However, nothing beats the java.time classes for date-time handling. And, personally I would rather debug and practice this particular chore within Java.
distribution center changes geographically
That is confusing and unwise. The business really should identify the new location as a new center, not the same. If you cannot convince management to do so, I would do so within your database and apps behind-the-scenes.
About java.time
The java.time framework is built into Java 8 and later. These classes supplant the troublesome old legacy date-time classes such as java.util.Date, Calendar, & SimpleDateFormat.
To learn more, see the Oracle Tutorial. And search Stack Overflow for many examples and explanations. Specification is JSR 310.
The Joda-Time project, now in maintenance mode, advises migration to the java.time classes.
You may exchange java.time objects directly with your database. Use a JDBC driver compliant with JDBC 4.2 or later. No need for strings, no need for java.sql.* classes. Hibernate 5 & JPA 2.2 support java.time.
Where to obtain the java.time classes?
Java SE 8, Java SE 9, Java SE 10, Java SE 11, and later - Part of the standard Java API with a bundled implementation.
Java 9 brought some minor features and fixes.
Java SE 6 and Java SE 7
Most of the java.time functionality is back-ported to Java 6 & 7 in ThreeTen-Backport.
Android
Later versions of Android (26+) bundle implementations of the java.time classes.
For earlier Android (<26), the process of API desugaring brings a subset of the java.time functionality not originally built into Android.
If the desugaring does not offer what you need, the ThreeTenABP project adapts ThreeTen-Backport (mentioned above) to Android. See How to use ThreeTenABP….
Read Basil's impressive answer to understand the concepts.
You should definitely switch to timestamp with time zone, which in reality is an absolute timestamp (in slight violation of the SQL standard's intention).
One thing you want to consider is whether the time zone of an event should change if the time zone of the center that recorded it changes. If not, you will either have to keep a history of time zones for a center or (better) store the time zone with each event as it is created.
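Storing the zone with each event could look roughly like this on the Java side. RecordedEvent and its members are hypothetical names; the assumption is that the moment lives in a timestamp with time zone column and the zone name in a separate text column.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

final class RecordedEvent {

    final Instant moment; // the UTC moment, as held in a timestamp with time zone column
    final ZoneId zone;    // the center's zone at creation time, e.g. stored as the text "America/Sao_Paulo"

    RecordedEvent(Instant moment, ZoneId zone) {
        this.moment = moment;
        this.zone = zone;
    }

    // Builds an event from the wall-clock time at the center when the event was created.
    static RecordedEvent createdAt(LocalDateTime wallClock, ZoneId centerZone) {
        return new RecordedEvent(wallClock.atZone(centerZone).toInstant(), centerZone);
    }

    // Recovers the original wall-clock time, regardless of where the center is located today.
    LocalDateTime wallClockAtCreation() {
        return moment.atZone(zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        RecordedEvent e = createdAt(LocalDateTime.parse("2018-07-15T12:00:00"), ZoneId.of("America/Sao_Paulo"));
        System.out.println(e.moment);                // 2018-07-15T15:00:00Z
        System.out.println(e.wallClockAtCreation()); // 2018-07-15T12:00
    }
}
```

Because the zone travels with the event, a later change to the center's zone does not alter how old events are displayed.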
Related
How to know a Salesforce table field is auto-calculated?
Salesforce provides CaseMilestone table. Each time I call the API to get a same object, I noticed that TimeRemainingInMins field has a different value. So I guessed this field is auto-calculated each time I call the API. Is there a way to know what fields in a table are auto-calculated ? Note : I am using python simple-salesforce library.
Case milestone is special because it's used as a countdown to a service level agreement (SLA) violation and drives some escalation rules. Depending on how the admin configured the clock, you may notice it stops for weekends or bank holidays, or maybe counts only Mon-Fri 9-17... Out of the box, another place that may have similar functionality is the OpportunityHistory table. I don't remember exactly, but it's used by SF for duration reporting: how long an oppty spent in each stage. That's standard. For custom fields that change every time you read them despite nothing actually changing the record (LastModifiedDate staying the same), your admin could have created formula fields based on "NOW()" or "TODAY()"; these would also recalculate every time you read them. You'd need some "describe" calls to get the field types and the formula itself.
Data Vault on Snowflake - what timestamp data type should be used for load_datetime attributes
In Data Vault, all objects have a load_datetime attribute which can be used to determine the relative order of insertion into the database, regardless of where in the world this took place. Which Snowflake data type is best suited for such columns and why? My own feeling is that timestamp_ntz would not work as it just records "wallclock" time. I would think that timestamp_ltz is the best choice as it stores only UTC. Also, timestamp_tz should maintain the correct relative order, but the local time information is irrelevant in this case so timestamp_ltz seems a cleaner choice. Have I missed anything?
I agree that using _ltz (UTC) is the best choice, especially if you have sources coming from different time zones. If all your sources are local to a single time zone then _tz would be fine, but why risk it right?
Storing timestamp with timezone in Postgres DB with a C extension?
To put it short, on input to my Postgres DB I have a timestamp in format "2014-12-10T12:00:14+07:00", and I would like to use 'timestampandtz' Postgres C extension (https://github.com/mweber26/timestampandtz), but I don't know how to approach the question of determining the time zone. Since the extension compares the input with the full timezone names in zones.c, the datatype won't know what to do with "+07:00" instead of "# Continent/City", which it expects on the input. The thing is, I need to pull out the city somehow from "+07:00", cause I do need Daylight Savings Time resolution. Also, I know that these timestamps are from only one country, so maybe determining the "# Continent/City" can be thought out based on this. Any POV on how to approach this challenge would be greatly appreciated, thanks!
I'm afraid that you are out of luck. There is no way to convert a time offset like +07:00 to a time zone like US/Eastern automatically. The reason is that the same time offset can belong to different time zones. For example, -05:00 currently could be America/Lima or US/Central, but these are different time zones – the former has no daylight savings time. So you will have to come up with a translation yourself, e.g. if you know what time zone all your data with a certain time offset belong to.
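The ambiguity is easy to see with a short Java sketch (Java rather than C, purely for illustration). It lists every zone known to the running JVM whose offset, at the moment given in the input, equals the input's offset. The class and method names are made up, and the exact list depends on the tz data bundled with the JVM.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneId;
import java.util.SortedSet;
import java.util.TreeSet;

final class ZonesForOffset {

    // Collects all zone IDs whose offset at the input's moment matches the input's offset.
    static SortedSet<String> candidates(String isoInput) {
        OffsetDateTime odt = OffsetDateTime.parse(isoInput);
        Instant moment = odt.toInstant();
        SortedSet<String> matches = new TreeSet<>();
        for (String id : ZoneId.getAvailableZoneIds()) {
            if (ZoneId.of(id).getRules().getOffset(moment).equals(odt.getOffset())) {
                matches.add(id);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Prints many candidates, among them Asia/Bangkok and Asia/Jakarta.
        System.out.println(candidates("2014-12-10T12:00:14+07:00"));
    }
}
```

If all the data is known to come from one country, a hand-written mapping from that country's offsets to a single zone name is the kind of translation the answer has in mind.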
Problems with incorrect timezones and locale-specific display of time
So, there is absolutely no reason why we should be having this problem in this day and age, but we do. Our database has datetime columns, and when the dates are pulled out from the database, they are retrieved as CDT (this time of year, CST in the winter). That time is then passed as CDT to the UI through JSON. This could not be more wrong. The time stored in the database is the time relative to the location specified in the data. So, if we have a trip going from Los Angeles at 6AM PDT to New York at 11PM EDT, then the start time will be retrieved from the database as 6AM CDT and the end time will be retrieved as 11PM CDT.
Requirements:
1. The UI needs to display in the time local to the data, 6AM and 11PM in the previous example.
2. The UI needs to indicate items in the near future, such as arrivals within the next 3 hours.
3. When the data is edited, it needs to be input in the same manner, the user enters 6AM in Los Angeles, and 11PM in New York.
4. The user also needs to be able to enter times relative to the current time, such as "H+5" for 5 hours from now.
5. The UI needs to sort based on time. This is more of a nice to have as the application they are used to using doesn't do this right either.
Our current solution is just burying our head in the sand and displaying it (untested for browsers in other timezones, such as those in our California office), which is actually surprisingly effective, even though it is not at all semantically correct.
1. 11PM CDT is read from the database
2. It gets displayed as 11PM, and is understood by the user to be EDT
3. The user edits it, puts in 10PM
4. It is parsed back as 10PM CDT and saved that way, exactly as we want it.
Where the current solution fails miserably is in times relative to the current time.
1. 11PM CDT is read from the database
2. It gets displayed as 11PM, and is understood by the user to be EDT
3. The user edits it, puts in "NOW" (assume the user's local clock is 9PM CDT)
4. It is parsed back as 9PM CDT and saved that way, but it should have been 10PM because it is 10PM in New York!
I'm looking for a way to handle these five cases that isn't totally hideous. I am open to a solution in any layer (architecture detailed above), but there are constraints because we share the database with another application. If there are additional tools/frameworks that would be useful and fit with what we already have, I am open to using them.
Database: SQL Server 2008
API: Rails with JSON responses
Frontend: JS + Moment + other stuff unrelated to dates
Any attempt to correct the data and/or schema is totally out of the question as we would break the other application. The addition of new views/table columns/tables/stored procedures is usually possible. The addition of indexes is NOT allowed. The status of any more exotic features is unknown. There are many tables/endpoints that are affected by this problem, so any brute-force solution is going to be incredibly tedious. Any solution only needs to work in the Continental US. Note that this is not a simple timezone conversion as the timezone we get back from the database is straight up wrong, so conversions of the timezone will also be wrong.
CakePHP - Created and Modified server time offset for save
My target market is based in a very different time zone from the one where the webserver is located. Therefore, the Created and Modified timestamps written by my save method are a lot less useful than they could be. Is there any way that I could define a global offset for my app for those two fields whenever they are saved, so that the time matches my target market's time zone? For example, deduct 5h from every Created record?
Store your datetimes as UTC and convert them to the appropriate user timezone when you display them, with CakeTime::convert. If you have user accounts, let the users pick their own timezones. If you don't, pick whichever timezone makes sense to you.
Put this in your Config/bootstrap.php: date_default_timezone_set('UTC'); //or whatever your timezone is It's just based on the server time and really has nothing to do with CakePHP - so just change the default timezone with PHP, and you should be good to go. 'created' and 'modified' will be based on the specified timezone.