#991 Support Relative Date Times #1006

Merged
20 commits merged into opensearch-project:main from #991_relative_datetime on Jan 15, 2025

Conversation

Contributor

@currantw currantw commented Jan 3, 2025

Signed-off-by: currantw [email protected]

Description

Adds support for relative date times via the new user-defined method relative_datetime.

Related Issues

Resolves #991.

Check List

  • Updated documentation (docs/ppl-lang/README.md)
  • Implemented unit tests
  • Implemented tests for combination with other commands
  • New added source code should include a copyright header
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

testValid("@w6", "2000-01-01T00:00");
testValid("@w7", "2000-01-02T00:00");

testInvalid("@INVALID", "The relative date time unit 'INVALID' is not supported.");
Contributor

You will need to test many more invalid cases (such as "@W8") to cover the implementation.

Member

@YANG-DB YANG-DB Jan 4, 2025

I would also like to see more complicated use cases such as
source=table | WHERE earliest=-5d@w1 AND latest=@w6 | ...
I recall that we will not be supporting this functionality in the first iteration. Can you please add this test and mark it as ignored until we add this capability?

Contributor Author

@acarbonetto I have added this test case, as well as a few more invalid ones. If you have any more in mind, please let me know.

@YANG-DB I have added ignored tests for earliest and latest in FlintSparkPPLBuiltInDateTimeFunctionITSuite. That being said, I think more complex integration tests would require mocking or overriding the current time in order to be stable. Let me know if you think this is something I should pursue.

Member

@LantaoJin LantaoJin Jan 10, 2025

I think you can leverage timestampdiff to test more complex cases in the ITs:
timestampdiff(HOUR, relative_timestamp("+1d"), relative_timestamp("+1h"))
timestampdiff(HOUR, relative_timestamp("-1h@w3"), relative_timestamp("@d"))
And other expressions like these; they should be stable.

Contributor Author

I've added some more complex IT tests along those lines - although I don't think timestampdiff(HOUR, relative_timestamp("-1h@w3"), relative_timestamp("@d")) would actually work, since the difference would depend on the current time and day of the week, right?
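The stability question raised here can be checked with plain java.time arithmetic. The sketch below is only a model: it assumes timestampdiff(HOUR, start, end) counts whole hours from start to end, uses LocalDateTime (so no time zone or DST effects), and the class and method names are hypothetical rather than taken from this pull request.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;

// Hypothetical model of the two expressions discussed above; not code from this PR.
final class TimestampDiffStability {

    // Assumed semantics of timestampdiff(HOUR, start, end): whole hours from start to end.
    static long hoursBetween(LocalDateTime start, LocalDateTime end) {
        return ChronoUnit.HOURS.between(start, end);
    }

    // Models timestampdiff(HOUR, relative_timestamp("+1d"), relative_timestamp("+1h")).
    static long offsetsOnly(LocalDateTime now) {
        return hoursBetween(now.plusDays(1), now.plusHours(1));
    }

    // Models timestampdiff(HOUR, relative_timestamp("-1h@w3"), relative_timestamp("@d")).
    static long withSnaps(LocalDateTime now) {
        LocalDateTime start = now.minusHours(1)
            .truncatedTo(ChronoUnit.DAYS)
            .with(TemporalAdjusters.previousOrSame(DayOfWeek.WEDNESDAY));
        LocalDateTime end = now.truncatedTo(ChronoUnit.DAYS);
        return hoursBetween(start, end);
    }

    public static void main(String[] args) {
        LocalDateTime monday = LocalDateTime.parse("2000-01-03T01:01:01.100");
        LocalDateTime wednesday = LocalDateTime.parse("2000-01-05T12:00:00");
        // Pure offsets cancel out the current time: prints "-23 -23".
        System.out.println(offsetsOnly(monday) + " " + offsetsOnly(wednesday));
        // Snapping to a weekday does not: prints "120 0".
        System.out.println(withSnaps(monday) + " " + withSnaps(wednesday));
    }
}
```

Under these assumptions, the offset-only expression yields the same value for any current time, while the weekday-snapped expression varies with the day of the week, which matches the concern voiced in the comment.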

Member

@YANG-DB YANG-DB left a comment

Please also add such time-based relative date queries to the example commands doc.

Add a new relative time query filter example to the PPL where clause documentation.

Contributor Author

currantw commented Jan 9, 2025

Please also add such time-based relative date queries to the example commands doc.

Add a new relative time query filter example to the PPL where clause documentation.

Done! Let me know if this is what you had in mind.

Comment on lines 397 to 433
// TODO #957: Support earliest
ignore("test EARLIEST") {
  var frame = sql(s"""
       | source = $testTable
       | | eval earliest_hour_before = earliest(now(), "-1h")
       | | eval earliest_now = earliest(now(), "now")
       | | eval earliest_hour_after = earliest(now(), "+1h")
       | | fields earliest_hour_before, earliest_now, earliest_hour_after
       | | head 1
       | """.stripMargin)
  assertSameRows(Seq(Row(true), Row(true), Row(false)), frame)
}

// TODO #957: Support latest
ignore("test LATEST") {
  var frame = sql(s"""
       | source = $testTable
       | | eval latest_hour_before = latest(now(), "-1h")
       | | eval latest_now = latest(now(), "now")
       | | eval latest_hour_after = latest(now(), "+1h")
       | | fields latest_hour_before, latest_now, latest_hour_after
       | | head 1
       | """.stripMargin)
  assertSameRows(Seq(Row(false), Row(true), Row(true)), frame)
}
Contributor Author

@acarbonetto Any idea on better tests for earliest and latest that don't require mocking the current time? Ultimately, earliest and latest are really only wrappers around the relative_timestamp function (earliest(field_name, "-1h@d") is equivalent to field_name >= relative_timestamp("-1h@d")), so it seems fine to me as long as we have pretty robust unit tests for relative_timestamp. Thoughts?
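The wrapper relationship described in this comment can be sketched as follows. This is an illustration only: the class name, the resolver parameter, and the fixedResolver helper are hypothetical, and the <= comparison for latest is an assumption inferred from the expected values in the ignored tests above.

```java
import java.time.LocalDateTime;
import java.util.function.Function;

// Hypothetical sketch of earliest/latest as thin wrappers; not code from this PR.
final class EarliestLatestSketch {

    // earliest(value, expr) is modelled as: value >= relative_timestamp(expr).
    static boolean earliest(LocalDateTime value, String expr,
                            Function<String, LocalDateTime> relativeTimestamp) {
        return !value.isBefore(relativeTimestamp.apply(expr));
    }

    // latest(value, expr) is assumed to be: value <= relative_timestamp(expr).
    static boolean latest(LocalDateTime value, String expr,
                          Function<String, LocalDateTime> relativeTimestamp) {
        return !value.isAfter(relativeTimestamp.apply(expr));
    }

    // Stand-in resolver that only knows the three expressions used in the ignored tests.
    static Function<String, LocalDateTime> fixedResolver(LocalDateTime now) {
        return expr -> {
            switch (expr) {
                case "-1h": return now.minusHours(1);
                case "now": return now;
                case "+1h": return now.plusHours(1);
                default: throw new IllegalArgumentException("Not handled in this sketch: " + expr);
            }
        };
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.parse("2000-01-03T01:01:01.100");
        Function<String, LocalDateTime> resolver = fixedResolver(now);
        // Prints "true true false", matching the expectations of the EARLIEST test.
        System.out.println(earliest(now, "-1h", resolver) + " "
            + earliest(now, "now", resolver) + " " + earliest(now, "+1h", resolver));
        // Prints "false true true", matching the expectations of the LATEST test.
        System.out.println(latest(now, "-1h", resolver) + " "
            + latest(now, "now", resolver) + " " + latest(now, "+1h", resolver));
    }
}
```

Because both wrappers reduce to a single comparison, most of the behavior worth testing lives in the resolver, which is the point the comment makes about relying on unit tests for relative_timestamp.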

Contributor

Tests are cheap, so it's okay to have duplicate tests.

Since IT tests mostly focus on testing the API and integration, I don't think it's overly valuable to test the backend logic there. Leave that to unit tests, where mocking is easily done.

If you need to mock the IT test backend, you're probably not doing testing correctly.
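As one illustration of pinning the current time in a unit test, the sketch below injects a fixed java.time.Clock. This is a different technique from the one named in this pull request's commit message, which mentions adding mockito-inline to mock the current datetime; the class and method names here are hypothetical.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical sketch: time-dependent logic that takes its clock as a parameter.
final class FixedClockSketch {

    // Reads "now" from the supplied clock instead of the system clock.
    static LocalDateTime hourAgo(Clock clock) {
        return LocalDateTime.now(clock).minusHours(1);
    }

    public static void main(String[] args) {
        // A fixed clock makes the result deterministic, so a test can assert on it.
        Clock fixed = Clock.fixed(Instant.parse("2000-01-03T01:01:01.100Z"), ZoneOffset.UTC);
        System.out.println(hourAgo(fixed)); // 2000-01-03T00:01:01.100
    }
}
```

Passing the clock explicitly keeps the production code free of test hooks, at the cost of threading one extra parameter through the call chain.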

Contributor Author

If you need to mock the IT test backend, you're probably not doing testing correctly.

IT tests have been updated, and don't rely on mocking. Let me know what you think!


@Test
public void testRelativeOffsetValue() {
testValid("+h", "2000-01-03T02:01:01.100");
Contributor

duplicate

Contributor Author

Thanks. Duplicate removed

testValid("+2wk", "2000-01-17T01:01:01.100");
testValid("-1h@W3", "1999-12-29T00:00:00");
testValid("@d", "2000-01-03T00:00");
testValid("now", "2000-01-03T01:01:01.100");
Contributor

But @now is invalid, as is +now.

Contributor Author

Thanks. Added these to testRelativeOffsetUnit and testRelativeSnap, respectively.

testValid("+qtr", "2000-04-03T01:01:01.100");
testValid("+qtrs", "2000-04-03T01:01:01.100");
testValid("+quarter", "2000-04-03T01:01:01.100");
testValid("+quarters", "2000-04-03T01:01:01.100");
Contributor

Are there limits on the numbers for any of the other units? For example, can I do +5q?

Contributor Author

Nope. Any integer should work. I've added another test case (-3d) to testRelativeOffsetValue to help make this more clear.
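To make the offset and snap behavior behind the test values quoted in this review easier to follow, here is a minimal sketch of a resolver for such strings. It is not the parser added in this pull request: the class name, the regular expression, the "Malformed" message, and the reduced unit list (h, d, wk, and the quarter aliases for offsets; d and w1 to w7 for snaps) are assumptions chosen to reproduce only the cases shown above, with w1 to w7 mapped to Monday through Sunday.

```java
import java.time.DayOfWeek;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.time.temporal.TemporalAdjusters;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the parser added in this pull request.
final class RelativeDateTimeSketch {

    // Optional offset ([+|-][count]<unit>) followed by an optional snap (@<unit>).
    private static final Pattern FORMAT =
        Pattern.compile("(?:([+-])(\\d+)?([a-zA-Z]+))?(?:@([a-zA-Z]+\\d*))?");

    static LocalDateTime resolve(String expression, LocalDateTime now) {
        if (expression.equalsIgnoreCase("now")) {
            return now;
        }
        Matcher m = FORMAT.matcher(expression);
        if (expression.isEmpty() || !m.matches()) {
            // The wording of this message is an assumption.
            throw new IllegalArgumentException("Malformed relative date time: " + expression);
        }
        LocalDateTime result = now;
        if (m.group(3) != null) {
            // A missing count means 1, so "+h" is one hour ahead.
            long count = m.group(2) == null ? 1 : Long.parseLong(m.group(2));
            result = offset(result, m.group(1).equals("-") ? -count : count, m.group(3));
        }
        if (m.group(4) != null) {
            result = snap(result, m.group(4));
        }
        return result;
    }

    private static LocalDateTime offset(LocalDateTime t, long count, String unit) {
        switch (unit.toLowerCase(Locale.ROOT)) {
            case "h": return t.plusHours(count);
            case "d": return t.plusDays(count);
            case "wk": return t.plusWeeks(count);
            case "qtr": case "qtrs": case "quarter": case "quarters":
                return t.plusMonths(3 * count);
            default: throw unsupported(unit);
        }
    }

    private static LocalDateTime snap(LocalDateTime t, String unit) {
        String u = unit.toLowerCase(Locale.ROOT);
        if (u.equals("d")) {
            return t.truncatedTo(ChronoUnit.DAYS);
        }
        if (u.matches("w[1-7]")) {
            // Snap back to midnight of the most recent given weekday (possibly today).
            DayOfWeek day = DayOfWeek.of(u.charAt(1) - '0');
            return t.truncatedTo(ChronoUnit.DAYS).with(TemporalAdjusters.previousOrSame(day));
        }
        throw unsupported(unit);
    }

    private static IllegalArgumentException unsupported(String unit) {
        return new IllegalArgumentException(
            "The relative date time unit '" + unit + "' is not supported.");
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.parse("2000-01-03T01:01:01.100");
        System.out.println(resolve("-1h@W3", now)); // 1999-12-29T00:00
        System.out.println(resolve("+2wk", now));   // 2000-01-17T01:01:01.100
    }
}
```

In this sketch the offset is applied before the snap, which is what makes "-1h@W3" land on the previous Wednesday at midnight, and "+now" and "@now" fail because "now" is not a unit.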

Signed-off-by: currantw <[email protected]>
…sponding unit tests in `SerializableTimeUdfTest`. Add `mockito-inline` to dependencies for `ppl-spark-integration` to allow mocking of current datetime.

…eturned, but output from `$CurrentTimestamp` is an `Instant`

@currantw currantw force-pushed the #991_relative_datetime branch from cad87e3 to 4db9de7 Compare January 14, 2025 00:22
@YANG-DB YANG-DB merged commit 4a40676 into opensearch-project:main Jan 15, 2025
4 checks passed
@acarbonetto acarbonetto deleted the #991_relative_datetime branch January 15, 2025 21:46
Development

Successfully merging this pull request may close these issues.

[FEATURE] Support Relative Date-time Strings
4 participants