-
Notifications
You must be signed in to change notification settings - Fork 20
Implement zoneinfo parsing/compilation and add TZif structs #257
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
f0dd735
to
fe9d51b
Compare
It's time to finally tackle the zoneinfo and tzif sourcing issue. The work here lays the baseline for to handle sourcing. There's still a lot more to add, and maybe some bugs to uncover. But this is also major progress.
fe9d51b
to
12221b7
Compare
03303bd
to
3a9ffcc
Compare
3a9ffcc
to
a12f14e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding some general comments for context.
write!(file, "//@generated\n\n{generated}") | ||
} | ||
|
||
fn write_debug(&self, debug_path: &Path) -> io::Result<()> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there may be a better way to output the debug info rather than just one single JSON that will hopefully make the debugs a little more digestible
zoneinfo/src/lib.rs
Outdated
let tzif = TransitionData { | ||
transitions, | ||
single_line_zone, | ||
// TODO: Handle POSIX tz string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is already really big, and sort of dense. Not sure that POSIX tz string serialization should be added to it.
But if it's preferred, then I can add it.
/// | ||
/// Please note: implementation is very minimal | ||
#[derive(Debug)] | ||
pub struct TzifBlockV2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of TZif structs everywhere. I like the convenience of a related struct being in zoneinfo_compiler
, but I could be convinced that it should live elsewhere.
It might be nice to use structs from tzif
, but that may require some upstream changes.
}) | ||
} | ||
|
||
pub(crate) fn calculate_transitions_for_year( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is where a large bulk of the zoneinfo compilation work lives.
I tried my best to document it as much as possible because it's basically a rough reverse engineering of outzone
. So keeping general steps in the code seemed like a good idea for whenever we have to come back to it.
Just to note: I'm working through a couple bugs that I came across when double checking some more TZifs. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove entries from the transitions table of each example? It would be nice to have something smaller for tests, not just a big file of values that we don't really use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not entirely sure ... is there a different testing scheme that would be preferred.
We need to be able to test against the offset, the savings flag (is_dst
), and the time zone abbreviation for correctness, so every entry in the transitions table is used in the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not proposing not testing those things, I'm proposing removing all the entries that are double testing the same things by carefully selecting a smaller range of transitions that has all the things we need to test. This would imply modifying the test data generation code to be a bit more aware about the test cases instead of just dumping everything from /usr/share/zoneinfo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The best option to do this would be to manually generate the test data in a slim format, which would contain all the primary transitions in the data, and removes the extra transitions from a bloat format that is normally found in /usr/share/zoneinfo
.
I'm not exactly clear on how much that may save in the transitions though.
If you're referring to an even smaller subset of transitions from the slim format than it may be possible. Not exactly sure how at this point and it'd probably need to be worked through.
Just to note, while building this and testing this against these test cases, I've hit nearly every primary transition point. It's possible to correct a tricky transition, but in correcting it you've messed up the transition before and after and so on.
EDIT: Updating to slim removed ~6000 lines in 1b1ea48
pub transition_types: ZeroVec<'data, u8>, | ||
// NOTE: zoneinfo64 does a fun little bitmap str | ||
pub types: ZeroVec<'data, LocalTimeRecord>, | ||
pub posix: &'data str, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would propose parsing the posix string, instead of leaving it as a raw string. IIRC, zoneinfo64 has the Rules
field, which could map nicely into a common rules format for tzif and zoneinfo64, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is changed in #265. If it isn't then, it would be easy to add on that follow up. This was primarily a filler at the time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zoneinfo64 has the
Rules
field, which could map nicely into a common rules format for tzif and zoneinfo64, right?
It does have a rules format, but I haven't looked super deeply at it or its purpose. It seemed a little strange or edge case based. The posix struct would map to the final*
fields in the zone table entry.
EDIT: correction, they have consolidated the rules into a table. So, yes there is potential to map between the two formats if need be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very impressive work!
It's time to finally tackle the zoneinfo compilation and tzif sourcing issue. The work here lays the baseline for to handle sourcing. There's still a lot more to add, and maybe some bugs to uncover. But this is also major progress to supporting the ability to source tzifs.
Opening as a draft first for early feedback due to just how dense this actually is. So far, the work here allows us to parse zoneinfo files and compile sets of transition data from the parsed zoneinfo files. There is also related work in provider to setup the tzif structures.
The steps missing from this PR is: