Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: support to_char(date, timstamp format) #14536

Open
Tracked by #14661
xudong963 opened this issue Feb 7, 2025 · 11 comments
Open
Tracked by #14661

Feature: support to_char(date, timstamp format) #14536

xudong963 opened this issue Feb 7, 2025 · 11 comments
Assignees
Labels
enhancement New feature or request

Comments

@xudong963
Copy link
Member

It seems that we don't support

> select to_char('2023-09-04'::date, '%Y-%m-%dT%H:%M:%S%.3f');
Execution error: Cast error: Format error

I want to ensure if this is an unsupported feature or if the SQL semantics don't support this usage.

Thanks!

@xudong963
Copy link
Member Author

FYI, Postgres supports this, so I think it's a feature.

Schema (PostgreSQL v17)


SELECT TO_CHAR('2023-09-04'::date, 'YYYY-MM-DD"T"HH24:MI:SS.MS');
to_char
2023-09-04T00:00:00.000

View on DB Fiddle

@Omega359
Copy link
Contributor

Omega359 commented Feb 7, 2025

If you cast the date to a timestamp that would work, but the syntax you provided would not match a date and thus it'll throw the format error. Any of the following would work:

select to_char('2023-09-04'::date, '%Y-%m-%d');
select to_char('2023-09-04 00:00:00'::timestamp, '%Y-%m-%dT%H:%M:%S%.3f');
select to_char(arrow_cast('2023-09-04'::date, 'Timestamp(Second, None)'), '%Y-%m-%dT%H:%M:%S%.3f');

@alamb
Copy link
Contributor

alamb commented Feb 7, 2025

@xudong963 maybe we should make the error message better?

@xudong963
Copy link
Member Author

xudong963 commented Feb 10, 2025

@xudong963 maybe we should make the error message better?

Yes, this is a good point (if we don't plan to support the syntax)

@matthewmturner
Copy link
Contributor

I had the impression (although perhaps it is dated) that datafusion sought to be compatible with postgres to the extent reasonable. Assuming thats still the case is there a reason we wouldnt want to fix this?

@Omega359
Copy link
Contributor

Omega359 commented Feb 10, 2025

I had the impression (although perhaps it is dated) that datafusion sought to be compatible with postgres to the extent reasonable. Assuming thats still the case is there a reason we wouldnt want to fix this?

I'm sorry, I didn't mean to imply in any way that this is something that should not be looked into. I was just providing alternatives to get @xudong963 past this issue in the meantime.

The solution is likely to try casting the date to a timestamp then applying the format if any of the provided formats don't succeed at first.

@alamb
Copy link
Contributor

alamb commented Feb 11, 2025

I agree -- fixing this issue seems reasonable to me. Thanks @Omega359

@matthewmturner
Copy link
Contributor

@Omega359 totally understood and appreciate the alternatives :)

@xudong963 xudong963 changed the title Question: to_char(date, timstamp format) Feature: support to_char(date, timstamp format) Feb 14, 2025
@xudong963 xudong963 added the enhancement New feature or request label Feb 14, 2025
@Omega359
Copy link
Contributor

I took a look into this. DataFusion delegates the formatting off to arrow-cast which doesn't currently have support for trying multiple times with different types.

I see a few possibilities:

  1. DataFusion casts all dates to timestamps and send them through. Not as efficient for the typical use case
  2. DataFusion on receiving a format Err from arrow potentially casts the type (date -> timestamp, etc) and retries.
  3. We submit a PR to arrow-rs to attempt the casting of dates to timestamps in the display code in arrow-cast upon receiving a format failure.
  4. DataFusion skips arrow-cast for the displaying of date/times/timestamps/duration and handles it internally. I'm not a huge fan of this option.

I'm leaning towards option 3

@friendlymatthew
Copy link
Contributor

take

@friendlymatthew
Copy link
Contributor

friendlymatthew commented Feb 19, 2025

@Omega359 wrote: We submit a PR to arrow-rs to attempt the casting of dates to timestamps in the display code in arrow-cast upon receiving a format failure.

I also think this is the most sound.

The error is occurring here. We hand off the formatting task to chrono which attempts to format a NaiveDate with the supplied format string. If the format string contains time-related specifiers, the format attempt will panic. To format both date and time, one would use chrono's NaiveDateTime instead.


@alamb wrote: maybe we should make the error message better?

Consider a query with a malformed format string:

> select to_char('2023-09-04'::date, '%Y-%m-%k');
Execution error: Cast error: Format error

I think we can benefit from a more detailed error message. But fwiw, I think this is a chrono issue, as the write!() error upon format is very vague. I've opened an issue to address this: chronotope/chrono#1660. (The error message is literally: Error)

__

TL;DR: We can't format a Date with time-related specifiers in chrono. We should perform an intermediary cast to Timestamp in arrow. Formatting error messages are vague, and we could benefit from capturing more context in the error message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants