Why does Temporal choose not to expose whether a ZonedDateTime is "in" daylight saving time? #3074

Closed
BurntSushi opened this issue Jan 15, 2025 · 22 comments
@BurntSushi

Basically, what it says on the tin. I have my own guesses about this, but I was wondering if the Temporal folks ever deliberated on adding such an API explicitly, and if so, what the thinking was behind providing it (or not providing it, as the case may be).

I could have sworn I stumbled across discussion about this in the past, but after searching the tracker here, I couldn't find anything.

(Shame on me if DST querying is actually part of Temporal's API and I just missed it.)

Why am I asking? A [feature request was made to Jiff](https://github.com/BurntSushi/jiff/issues/205) to add the ability to inspect whether a `Zoned` (Jiff's analog to Temporal's `ZonedDateTime`) is in daylight saving time or not. This functionality is already available in Jiff in a lower level API, and making it available in the higher level APIs would make access to it more convenient and cheaper in some cases. On the other hand, I'm hesitant to make such queries more accessible because I _suspect_ they are prone to result in buggy code written by the user. And it's conspicuously absent from Temporal's API, which perhaps suggests y'all came to a similar conclusion. But I'm unsure, hence why I'm asking.
@BurntSushi changed the title from 'Why does Temporal choose not to expose whether a ZonedDateTime is not "in" daylight saving time?' to 'Why does Temporal choose not to expose whether a ZonedDateTime is "in" daylight saving time?' on Jan 15, 2025
@ptomato
Collaborator

ptomato commented Jan 15, 2025

It's intentionally missing. TL;DR of reasons:

  • Being "in DST" is meaningless for most of the world which doesn't use DST
  • It doesn't take other UTC offset shifts into account
  • There are edge cases such as Ireland (shifts UTC offset by 1 hour twice a year, but the summer time is the "standard" time, not the "daylight" time) and Morocco (is in permanent DST except during Ramadan)
  • A common use of this flag is to decide whether to display a time zone name as "X Standard Time" or "X Daylight Time", but Intl.DateTimeFormat can do this for you
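To illustrate the last bullet, here is a minimal sketch that asks the standard `Intl.DateTimeFormat` API for the zone name through its `timeZoneName` option. The helper name `zoneNameAt` is made up for this example, and the exact strings returned depend on the locale data shipped with the runtime.

```javascript
// Hypothetical helper: returns the display name that Intl picks for a time
// zone at a given instant. The caller never touches an "is DST" flag.
function zoneNameAt(date, timeZone, style = "long", locale = "en-US") {
  const parts = new Intl.DateTimeFormat(locale, {
    timeZone,
    timeZoneName: style, // "long" or "short"
  }).formatToParts(date);
  return parts.find((part) => part.type === "timeZoneName").value;
}

const winter = new Date("2025-01-17T05:00:00Z");
const summer = new Date("2025-07-17T04:00:00Z");
console.log(zoneNameAt(winter, "America/New_York"));
console.log(zoneNameAt(summer, "America/New_York"));
console.log(zoneNameAt(summer, "America/New_York", "short"));
```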

I'm having trouble digging up any references in the notes where we discussed this, but here are some external references:

@BurntSushi
Author

Thanks! I think for points 1, 2 and 3, tzdb takes all of those into account and can report (correctly, as far as I know) whether a particular instant is "in DST" or not.

What I was thinking here was that a routine to query the DST status of an instant was ripe for misuse. I was curious if anyone had more concrete experience there. I'll take a look at your links later.

> A common use of this flag is to decide whether to display a time zone name as "X Standard Time" or "X Daylight Time", but Intl.DateTimeFormat can do this for you

The other use case I'm aware of is for integrating with other systems that might be a little more insistent on knowing the DST status of an instant. For example, for bridging between Jiff and Python.

In any case, thank you for the answer!

@ljharb
Member

ljharb commented Jan 15, 2025

DST status depends on location, though, not just timezone - for example, Arizona moves between two timezones to avoid having DST, so in order to know DST status you may need to know both an Instant and a location.

@BurntSushi
Author

> DST status depends on location, though, not just timezone - for example, Arizona moves between two timezones to avoid having DST, so in order to know DST status you may need to know both an Instant and a location.

AFAIK, US/Arizona in the IANA Time Zone Database models this correctly.

@ljharb
Member

ljharb commented Jan 16, 2025

Bad example then :-) I suppose if the tzdb always ensures that any location-based timezone changes have their own timezone then my specific comment doesn't apply.

@justingrant
Collaborator

I guess I don't understand the demand for this feature. How would it be used?

@gilmoreorless
Contributor

gilmoreorless commented Jan 16, 2025

> I think for points 1, 2 and 3, tzdb takes all of those into account and can report (correctly, as far as I know) whether a particular instant is "in DST" or not.

FWIW, tzdb actively discourages using its isdst flag, which is pretty much only there for backwards compatibility concerns. From https://data.iana.org/time-zones/code/theory.html:

> POSIX and ISO C define some APIs that are vestigial: they are not needed, and are relics of a too-simple model that does not suffice to handle many real-world timestamps. Although the tz code supports these vestigial APIs for backwards compatibility, they should be avoided in portable applications. The vestigial APIs are:
>
>   • [...snip...]
>   • The tm_isdst member is almost never needed and most of its uses should be discouraged in favor of the abovementioned APIs. It was intended as an index into the tzname variable, but as mentioned previously that usage is obsolete. Although it can still be used in arguments to mktime to disambiguate timestamps near a DST transition when the clock jumps back on platforms lacking tm_gmtoff, this disambiguation works only for proleptic TZ strings; it does not work in general for geographical timezones, such as when a location changes to a time zone with a lesser UT offset.

Edit: Coincidentally, this was backed up a few hours ago in a mailing list discussion:

> The clocks didn't change then: only tm_isdst changed, and since tm_isdst is obsolescent that wasn't enough to prompt a new TZDB release right away.

@BurntSushi
Author

@justingrant

> I guess I don't understand the demand for this feature. How would it be used?

I think @ptomato already answered this one with at least one example. Intl.DateTimeFormat uses it for formatting. I guess Temporal's own toLocaleString uses it too:

>> zdt = Temporal.ZonedDateTime.from("2025-01-17T00:00-05[America/New_York]")
>> zdt.toLocaleString()
"1/17/2025, 12:00:00 AM EST"
>> zdt = Temporal.ZonedDateTime.from("2025-07-17T00:00-04[America/New_York]")
>> zdt.toLocaleString()
"7/17/2025, 12:00:00 AM EDT"

I don't know if there are use cases outside of this. I suppose that's partially why I'm asking the question here. In the context of Jiff, which doesn't own formatting datetimes into the user's locale, that has to be the responsibility of some other code. If that code is using Jiff, then that code either needs to query the DST status of an instant on its own or it can use Jiff to do it. This information is available from tzdb, so I think it makes sense for Jiff to provide it. And Jiff does in a lower level API.

What I'm wondering is the appropriateness of putting this kind of information in a higher level API, and I wanted to ask you folks for some background on the decision to omit this API. If it's more like, "we just didn't have the use cases," then I think it would probably be fine for Jiff to expose it since I do have definite use cases (two of them, linked above). But if it's more like, "our experience with other datetime libraries that provided this functionality resulted in users using it for the wrong thing and committing bugs," then maybe it should just stay tucked away in a lower level API. Does that make sense?

@BurntSushi
Author

BurntSushi commented Jan 17, 2025

> > I think for points 1, 2 and 3, tzdb takes all of those into account and can report (correctly, as far as I know) whether a particular instant is "in DST" or not.
>
> FWIW, tzdb actively discourages using its isdst flag, which is pretty much only there for backwards compatibility concerns. From https://data.iana.org/time-zones/code/theory.html:

Hmm, I don't think so. Your link discusses the POSIX tm_isdst member on the struct tm type. It doesn't discourage the DST status in tzdb itself. Indeed, from the part you quoted:

> this disambiguation works only for proleptic TZ strings; it does not work in general for geographical timezones, such as when a location changes to a time zone with a lesser UT offset.

The tzdb does not suffer from these problems. It handles DST in time zones like Europe/Dublin just fine. And tzdb doesn't just work for proleptic TZ strings. It can define different rules for different periods of time.

I am somewhat confused here. Surely, the concept of querying DST status itself should not be in question. Temporal's own toLocaleString couldn't work without it. The question here for me is more about whether it should be a public API. In Temporal's case, if the only valid use case is locale datetime formatting and it isn't expected that anyone else should be able to implement their own locale datetime formatting in lieu of Intl.DateTimeFormat, then yeah, I don't think Temporal needs such a public API and it's probably best left out. But I'm not just asking about what Temporal ought to do, but more general guidance. :-)

@justingrant
Collaborator

justingrant commented Jan 17, 2025

> If it's more like, "we just didn't have the use cases," then I think it would probably be fine for Jiff to expose it since I do have definite use cases (two of them, linked above). But if it's more like, "our experience with other datetime libraries that provided this functionality resulted in users using it for the wrong thing and committing bugs," then maybe it should just stay tucked away in a lower level API.

I'm not aware of any other valid use cases besides formatting, and naively using a boolean flag will tend to cause problems in edge cases. For example, IIRC Ireland's "standard" time is in the summer and it changes to the non-standard time during the winter.


So if I were in your shoes I'd probably do what Temporal is doing: ensure that formatting libraries can localize time zone names correctly, but don't expose a high-level isDST flag API. Or at least don't expose it until you learn actual valid cases that require a high-level API.

Note that if users really want to roll their own isDST, there are a few workarounds. One way could be to check the offsets at the two solstices (Dec 21 and June 21) before the ZDT and the two solstices after it. If the winter and summer solstices all have the same offset, then you can pretty safely assume no DST. If they're different, and the two years' summer and winter offsets are identical, then it has DST. If there's a disparity between one summer and the other summer, or one winter and the other winter, then the time zone stopped using or started using DST.
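A rough sketch of the solstice comparison described above could look like this. The function names are hypothetical, and `offsetAt` is an assumed callback that returns the zone's UTC offset at a given instant; the one used here is a fake that imitates a New York-like zone so the example runs without any time zone data.

```javascript
// Hypothetical helpers. `offsetAt(date)` is supplied by the caller and must
// return the zone's UTC offset (in any consistent unit) at that instant.

// Collect the two solstice dates before `instant` and the two after it.
function surroundingSolstices(instant) {
  const year = instant.getUTCFullYear();
  const candidates = [];
  for (const y of [year - 1, year, year + 1]) {
    candidates.push({ season: "june", date: new Date(Date.UTC(y, 5, 21, 12)) });
    candidates.push({ season: "december", date: new Date(Date.UTC(y, 11, 21, 12)) });
  }
  const before = candidates.filter((c) => c.date < instant).slice(-2);
  const after = candidates.filter((c) => c.date >= instant).slice(0, 2);
  return [...before, ...after];
}

// Classify the zone around `instant` as "no-dst", "has-dst" or "changed".
function classifyDstUsage(instant, offsetAt) {
  const samples = surroundingSolstices(instant).map((s) => ({
    season: s.season,
    offset: offsetAt(s.date),
  }));
  const offsetsFor = (season) =>
    samples.filter((s) => s.season === season).map((s) => s.offset);
  const [firstJune, secondJune] = offsetsFor("june");
  const [firstDecember, secondDecember] = offsetsFor("december");

  if (samples.every((s) => s.offset === samples[0].offset)) return "no-dst";
  if (firstJune === secondJune && firstDecember === secondDecember) return "has-dst";
  return "changed"; // one June or one December differs from the other
}

// Fake offset source (minutes): -240 in June, -300 otherwise.
const fakeNewYork = (date) => (date.getUTCMonth() === 5 ? -240 : -300);
console.log(classifyDstUsage(new Date("2025-01-17T00:00:00Z"), fakeNewYork));
```

With Temporal, `offsetAt` could presumably build a `ZonedDateTime` for each solstice date and read its `offsetNanoseconds`. Note that this only says whether the zone appears to use DST around that date, not whether a particular instant is "in" it.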

There's likely another workaround using ZDT.p.getNextTransition and making a similar set of conclusions.

@ptomato
Collaborator

ptomato commented Jan 17, 2025

> Surely, the concept of querying DST status itself should not be in question. Temporal's own toLocaleString couldn't work without it.

I don't think I'd agree with this. toLocaleString printing the time zone doesn't use any sort of "in DST" algorithm internally. Rather, the time zone name is computed from rules in the TZDB. Example: America/Vancouver - since 1945, the human-readable abbreviation for America/Vancouver at -08:00 is PST, and at -07:00 is PDT. There's nothing intrinsically "in DST" about that, for example from 1942–1945 the abbreviation for America/Vancouver at -07:00 was PWT (and from August to September 1945, PPT), not PDT.

In other words the time zone abbreviation is not a function of whether the time zone is "in DST". It's conceptually a function that takes an Instant and a time zone ID, and returns a string. That string may have a "D" for "daylight" in it, or may not, even if the time zone could be said to be "in DST".

So yes, I think the concept of querying DST status should be in question. It's meaningless for most of the time zones in the world, and for others it yields results you don't expect. Being in DST is something that you and I may talk about colloquially in the present day, but from a worldwide or historical perspective we're the weird ones. It's not a useful piece of data to encode into a data model.

At the Stack Overflow link that I gave earlier, I listed the three use cases I'm aware of:

  • Formatting the time zone name — I think we all agree that using "in DST" to do this is buggy and there is already a better way
  • Porting code from another library that has an "in DST" function to Temporal — I think the thing to do here is to write your own function that emulates the other library's "in DST" logic exactly, and then try to eliminate it later
  • Interfacing with other systems that require you to supply an "in DST" bit — I think the thing to do here is hardcode a list of time zones and time periods for which an "in DST" heuristic works, and then use whatever heuristic works for your application

@BurntSushi
Author

> In other words the time zone abbreviation is not a function of whether the time zone is "in DST". It's conceptually a function that takes an Instant and a time zone ID, and returns a string. That string may have a "D" for "daylight" in it, or may not, even if the time zone could be said to be "in DST".

Hmmm. I think that's a very fine line to walk. Does that mean every single string in a localized datetime has to come from tzdb as if it were just some opaque thing representing the current time zone transition? I ask sincerely here because I'm not an expert on locale formatting or the relevant Unicode specs. But I do note that the icu crate exposes a concept of DST in its API. And hooking this up with Jiff requires asking whether the current instant is in DST. It's not clear to me how one might achieve this without querying DST status of an instant.

I think the use cases you cite are what I'm aware of as well. But I'm somewhat skeptical of your proposed solutions given that tzdb stores DST status in a way that seems to match the colloquial way that humans talk about DST.

I do think I am at least convinced not to expose DST status in a higher level API absent a more compelling motivation. And I think the argument for Temporal to do so is even weaker given that there is already dedicated locale formatting. So I'll close this issue as answered, but I'm happy to continue discussion if y'all want to. Thank you all for your input. :-)

@ptomato
Collaborator

ptomato commented Jan 17, 2025

> Does that mean every single string in a localized datetime has to come from tzdb as if it were just some opaque thing representing the current time zone transition?

IMO, yes, those localized strings should be treated as opaque — I can see how that might be counterintuitive though! 😄

@BurntSushi
Author

I don't think that actually works though. tzdb has very limited time zone strings. But full Unicode localization seems to indicate using different strings not found in tzdb. How does one know to use those strings without querying DST status or "interpreting" the strings in tzdb?

@justingrant
Collaborator

Not sure what you mean. Localization comes from CLDR not TZDB. It's CLDR and ICU that are in charge of figuring out what strings to show, using TZDB as input but not the only input. Could you clarify what you mean?

@sffc may want to share more details about how exactly DST affects localization in CLDR and ICU.

@BurntSushi
Author

Right that's what I'm saying... How does a localization algorithm choose which strings to use without querying DST status?

@ptomato
Collaborator

ptomato commented Jan 17, 2025

CLDR does have "standard" and "daylight" names in its data model: https://github.com/unicode-org/cldr/blob/f566e7d448546d80f5b81e7eac9793eec9cf6bf2/common/main/en.xml#L3909

Then, they have a bunch of heuristics to decide whether an instant is "standard" or "daylight". For example see the function in the following link, particularly the highlighted step: https://github.com/unicode-org/cldr/blob/f566e7d448546d80f5b81e7eac9793eec9cf6bf2/tools/cldr-code/src/main/java/org/unicode/cldr/util/TimezoneFormatter.java#L385

I'd argue that this is insufficient for purpose, and they'd be better off structuring their data like TZDB. However, that probably can't be changed, or isn't a high priority because the existing heuristic gets the names right for most modern dates.

But, for example, CLDR doesn't correctly format the Second World War time zone names that I mentioned above:

> new Date(1944, 5, 1).toLocaleString('en', {timeZone: 'America/Vancouver', timeZoneName: 'long'})
'6/1/1944, 12:00:00 AM GMT-07:00'

(should be "Pacific War Time")

Or Yukon's extra daylight saving hour in 1966:

> new Date(1966, 5, 1).toLocaleString('en', {timeZone: 'America/Dawson', timeZoneName: 'long'})
'5/31/1966, 10:00:00 PM GMT-09:00'

(the TZDB abbreviation given for this is "YDDT" so I assume the long name is supposed to be "Yukon Double Daylight Time" or something similar)

@sffc
Collaborator

sffc commented Jan 17, 2025

Localization of a time zone name involves looking up the correct name in a table keyed by what ICU4X calls the "zone variant". Currently CLDR exports only the Standard and Daylight zone variants, but as others have pointed out, this is a limitation in the CLDR model for what should generally also contain concepts such as War Time, Peace Time, and Ramadan Time. The TZDB docs themselves highlight War and Peace time zone variants:

https://web.cs.ucla.edu/~eggert/tz/tz-how-to.html

#Rule NAME FROM TO    -   IN  ON        AT   SAVE LETTER/S
Rule  US   1918 1919  -   Mar lastSun  2:00  1:00 D
Rule  US   1918 1919  -   Oct lastSun  2:00  0    S
Rule  US   1942 only  -   Feb 9        2:00  1:00 W # War
Rule  US   1945 only  -   Aug 14      23:00u 1:00 P # Peace
Rule  US   1945 only  -   Sep 30       2:00  0    S
Rule  US   1967 2006  -   Oct lastSun  2:00  0    S
Rule  US   1967 1973  -   Apr lastSun  2:00  1:00 D
Rule  US   1974 only  -   Jan 6        2:00  1:00 D
Rule  US   1975 only  -   Feb 23       2:00  1:00 D
Rule  US   1976 1986  -   Apr lastSun  2:00  1:00 D
Rule  US   1987 2006  -   Apr Sun>=1   2:00  1:00 D
Rule  US   2007 max   -   Mar Sun>=8   2:00  1:00 D
Rule  US   2007 max   -   Nov Sun>=1   2:00  0    S

So, a reasonable model would be for Temporal and other datetime APIs to return the "letter" of the current time zone. Doing so would likely require some work to ensure that TZDB applies letters consistently across time zones and, generally, increasing their data quality.
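As a rough illustration of where that "letter" lives, the sketch below pulls the SAVE and LETTER/S columns out of a Rule line like the ones in the excerpt. `parseRuleLine` is a made-up helper, not part of any tzdb tooling, and it only handles the plain ten-column layout shown above.

```javascript
// Field order follows the header comment in the excerpt:
// Rule NAME FROM TO - IN ON AT SAVE LETTER/S
function parseRuleLine(line) {
  const withoutComment = line.split("#")[0].trim();
  const fields = withoutComment.split(/\s+/);
  if (fields[0] !== "Rule" || fields.length < 10) return null;

  const [, name, from, to, , month, day, at, save, letter] = fields;
  const fromYear = Number(from);
  // "only" means the rule applies to FROM alone; "max" means no end year.
  const toYear = to === "only" ? fromYear : to === "max" ? Infinity : Number(to);
  // A LETTER/S of "-" stands for an empty variable part.
  return { name, fromYear, toYear, month, day, at, save, letter: letter === "-" ? "" : letter };
}

const war = parseRuleLine("Rule  US   1942 only  -   Feb 9        2:00  1:00 W # War");
console.log(war.letter, war.save);
```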


Another piece of data that ICU needs to localize the time zone is a mapping from the time zone to its metazone display name. For example, is America/Indiana/Indianapolis in Central Time or Eastern Time? TZDB also technically contains this information in the form of the FORMAT column:

# Zone	NAME		STDOFF	RULES	FORMAT	[UNTIL]
Zone America/Indiana/Indianapolis -5:44:38 - LMT 1883 Nov 18 18:00u
			-6:00	US	C%sT	1920
			-6:00 Indianapolis C%sT	1942
			-6:00	US	C%sT	1946
			-6:00 Indianapolis C%sT	1955 Apr 24  2:00
			-5:00	-	EST	1957 Sep 29  2:00
			-6:00	-	CST	1958 Apr 27  2:00
			-5:00	-	EST	1969
			-5:00	US	E%sT	1971
			-5:00	-	EST	2006
			-5:00	US	E%sT

So an effort to expose the LETTER might also want to expose the FORMAT. (Or, maybe, it should just expose the whole "formatted" string, which ICU can then localize further.)
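For what it's worth, combining the two columns is a small step. The sketch below substitutes a rule's LETTER into a FORMAT such as `C%sT`; `expandFormat` is a hypothetical name, and it deliberately ignores the other FORMAT forms tzdb allows (a slash-separated pair, or `%z`).

```javascript
// Substitute the rule's LETTER/S into a Zone FORMAT containing "%s".
// A letter of "-" stands for the empty string.
function expandFormat(format, letter) {
  const variablePart = letter === "-" ? "" : letter;
  return format.replace("%s", () => variablePart);
}

console.log(expandFormat("C%sT", "D")); // "CDT"
console.log(expandFormat("C%sT", "W")); // "CWT"
console.log(expandFormat("E%sT", "S")); // "EST"
console.log(expandFormat("EST", "D")); // "EST" (fixed abbreviation, no %s)
```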

@gilmoreorless
Contributor

> Your link discusses the POSIX tm_isdst member on the struct tm type. It doesn't discourage the DST status in tzdb itself.
> ...
> The tzdb does not suffer from these problems. It handles DST in time zones like Europe/Dublin just fine. And tzdb doesn't just work for proleptic TZ strings. It can define different rules for different periods of time.

Yes, tzdb handles Ireland just fine, but many consumers of tzdb didn't, because of hard-coded assumptions about DST based on a boolean flag. There's a distinction between the human-readable text input files for tzdb and the compiled binary output. For the input files, DST is more of a shortcut for defining repeating rules to generate timestamps, as shown by @sffc above.

But the tzdb binary data files only contain a boolean flag for DST to service the deprecated isdst flag quoted. From the perspective of the binary data, the only thing that matters is a list of UTC timestamps and their mapping to a time offset and a text abbreviation. Whether that abbreviation contains a "D" or an "S" is a separate matter (because many zones don't contain either).

My assertions about the DST flag come from years of seeing the tzdb's maintainer stating that a boolean flag was too simplistic for real-world scenarios. That's why it's now deprecated. A perfect example is the 2025a release that just came out a few days ago:

> Paraguay will stop changing its clocks after the spring-forward
> transition on 2024-10-06, so it is now permanently at -03.
> (Thanks to Heitor David Pinto and Even Scharning.)
> This affects timestamps starting 2025-03-22, as well as the
> obsolescent tm_isdst flags starting 2024-10-15.

  1. Paraguay changed from -04/-03 with DST to a permanent -03. But the law to enact that change wasn't finalised until October 14th, when Paraguay had already entered -03 as DST on October 6th.
  2. Therefore, Paraguay spent 9 days on "-03 with a DST flag" then switched to "permanent -03, no DST". No timestamps or clocks changed, only the notion of whether or not it was DST.
  3. Since the isdst flag is "obsolescent" (in the release notes), it wasn't deemed worthy of a database release until closer to when the timestamps will actually change.
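To make the Paraguay point concrete, here is a toy model of the compiled data as described above: a list of transitions, each pointing at a record holding an offset, a DST flag and an abbreviation. Everything in it is a placeholder; in particular the `at` values are made-up small integers, not real transition timestamps.

```javascript
// Placeholder data loosely modelled on the Paraguay description above.
const localTimeTypes = [
  { utcOffsetSeconds: -4 * 3600, isDst: false, abbreviation: "-04" },
  { utcOffsetSeconds: -3 * 3600, isDst: true, abbreviation: "-03" },
  { utcOffsetSeconds: -3 * 3600, isDst: false, abbreviation: "-03" },
];
const transitions = [
  { at: 100, typeIndex: 1 }, // spring forward: clocks actually change
  { at: 200, typeIndex: 2 }, // law takes effect: only the flag changes
];

// Return the record in effect at `timestamp` (transitions are sorted by `at`).
function localTimeTypeAt(timestamp) {
  let type = localTimeTypes[0];
  for (const transition of transitions) {
    if (transition.at > timestamp) break;
    type = localTimeTypes[transition.typeIndex];
  }
  return type;
}

const before = localTimeTypeAt(150);
const after = localTimeTypeAt(250);
console.log(before.utcOffsetSeconds === after.utcOffsetSeconds); // same wall clock
console.log(before.isDst === after.isDst); // but the flag differs
```

Code that only looks at offsets and abbreviations sees nothing happen at the second transition, which is the situation the release note describes.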

@sffc
Collaborator

sffc commented Jan 17, 2025

Also, for context, I think one of the reasons @BurntSushi likely opened this issue was because of comments I've made elsewhere about wanting to decouple ICU4X time zone localization from the TZDB.

@ptomato and others have correctly observed that toLocaleString does this automatically, but @BurntSushi correctly pointed out that this is only because toLocaleString queries the TZDB for this information. In order to be more modular, ICU4X has taken the position that metadata required for time zone formatting should be carried with the time zone object instead of being re-computed at formatting time. This would imply that the time zone variant, and ideally the metazone, should be queryable from a time zone object.

To work around the limitation, ICU4X is currently planning to ship a slimmed-down data payload, currently around 24 kB, derived from TZDB with the information required to calculate the zone variant and metazone from the time zone ID. However, if datetime libraries would export this information from their own copy of TZDB, we would not need to ship this extra payload.
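A minimal sketch of what "carrying the metadata with the time zone object" might mean for a formatter is shown below. This is not ICU4X's API: the object shape, the `pacific` key and the function names are all invented for illustration, and the fallback to a GMT offset string is only an assumption modelled on the `GMT-07:00` output seen earlier in the thread.

```javascript
// Hypothetical display-name table keyed by metazone, then by zone variant.
const displayNames = {
  pacific: { standard: "Pacific Standard Time", daylight: "Pacific Daylight Time" },
};

// Render an offset in seconds as "GMT+HH:MM" or "GMT-HH:MM".
function formatOffset(totalSeconds) {
  const sign = totalSeconds < 0 ? "-" : "+";
  const abs = Math.abs(totalSeconds);
  const hh = String(Math.floor(abs / 3600)).padStart(2, "0");
  const mm = String(Math.floor((abs % 3600) / 60)).padStart(2, "0");
  return `GMT${sign}${hh}:${mm}`;
}

// The formatter only reads fields carried by `zoneInfo`; it never consults
// time zone rules itself.
function localizedZoneName(zoneInfo) {
  const name = displayNames[zoneInfo.metazone]?.[zoneInfo.variant];
  return name ?? formatOffset(zoneInfo.utcOffsetSeconds);
}

console.log(localizedZoneName({ metazone: "pacific", variant: "daylight", utcOffsetSeconds: -25200 }));
console.log(localizedZoneName({ metazone: "pacific", variant: "war", utcOffsetSeconds: -25200 }));
```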

@BurntSushi
Author

BurntSushi commented Jan 17, 2025

> But the tzdb binary data files only contain a boolean flag for DST to service the deprecated isdst flag quoted.
>
> ...
>
> My assertions about the DST flag come from years of seeing the tzdb's maintainer stating that a boolean flag was too simplistic for real-world scenarios. That's why it's now deprecated.

Okay, so I think this is where I'm getting a little stuck on your comments here. From what I can see, the tzdb maintainer is saying that the POSIX tm_isdst member is obsolete and shouldn't generally be used for reasons that don't seem to apply to the boolean in tzdb itself. But I don't see anything about tm_isdst being deprecated (by POSIX) or the DST boolean in tzdb being deprecated either.

I did spend about 15 minutes searching the tzdb mailing list earlier today to see if I could find more commentary about this, but there was a lot of noise. What commentary I could find seemed specific to tm_isdst and not to the general concept of querying DST.

With that said, I do agree that the examples @ptomato cites above seem to suggest that querying DST status is insufficient for all cases. However, it reads more like "the tzdb binary format should expose more data, including DST status and not less."

And yes, as @sffc points out, the heart of my questioning here comes from a decoupling of datetime localization and datetime handling.

@gilmoreorless
Contributor

> I did spend about 15 minutes searching the tzdb mailing list earlier today to see if I could find more commentary about this, but there was a lot of noise.

No disagreement on that! 😆

From what I've gathered, POSIX originally modelled DST based on North American conventions (like so much else in computing). Unsurprisingly, those conventions were found to not handle all cases when expanded to the rest of the world, but they're still very much baked in to the spec. But I'm not a low-level C & POSIX expert, and it's entirely possible I've misinterpreted discussions in the past. Probably not worth me derailing this issue any further. 😉
