-
I agree, though my current mental model is that the querying interface sitting between PRQL and the DB could handle this, i.e. the PRQL language is limited to describing the data transformation, rather than how or when to evaluate it. That would let the interface be really smart; it could do things like those in the "Native" description here: #1672. This is also increasingly feasible with DuckDB, fast CDWs with memory caching, burstable compute, etc. My mood-affiliation is to have machines handle the non-declarative parts. That said, a user may want more direct control over the query than that design would offer, and the language could provide that. My vote would be to leave this open and see how things evolve with various interfaces and tools. WDYT?
-
Currently in the playground, a query is evaluated as soon as we write it, and the result is returned from DuckDB.
However, when running a query against huge tables, we may not want it to be evaluated until we have finished writing it.
In such cases, I feel it would be useful to be able to tell the compiler that the query is not yet complete.
For example, dplyr evaluates queries on data frames immediately, but queries against other backends (dbplyr, dtplyr, arrow, etc. [1]) are not executed until the query is finished with `collect()` (or `compute()`) [2]. Similarly, Polars (Python) has both DataFrame and LazyFrame.
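To make the eager vs. lazy distinction concrete, here is a rough sketch in Python using only the standard library's `sqlite3`. `LazyQuery`, its methods, and the `employees` table are hypothetical names made up for illustration; this is not how PRQL, dplyr, or Polars implement it. The builder only records the steps, and nothing is sent to the database until `collect()` is called.

```python
import sqlite3


class LazyQuery:
    """Hypothetical lazy query builder: steps are recorded, not executed."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table
        self.filters = []    # raw SQL predicates (illustration only, not injection-safe)
        self.columns = "*"

    def filter(self, predicate):
        # Record the predicate; the database is not touched here.
        self.filters.append(predicate)
        return self

    def select(self, *cols):
        # Record the projection; still no evaluation.
        self.columns = ", ".join(cols)
        return self

    def to_sql(self):
        # Build the SQL text from the recorded steps.
        sql = f"SELECT {self.columns} FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        return sql

    def collect(self):
        # The only place where the query is actually run.
        return self.conn.execute(self.to_sql()).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("ann", 120), ("bob", 80), ("cy", 150)],
)
conn.commit()

# Log every statement SQLite runs from here on.
executed = []
conn.set_trace_callback(executed.append)

query = LazyQuery(conn, "employees").filter("salary > 100").select("name")
statements_before_collect = len(executed)  # the query has only been described so far

rows = query.collect()  # evaluation happens here
```

In this sketch, the "query is not complete yet" signal is simply the absence of a `collect()` call; a playground could instead expose that as a toggle or a run button while keeping the language itself purely declarative.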
Footnotes
[1] https://dplyr.tidyverse.org/index.html#backends
[2] https://dplyr.tidyverse.org/reference/compute.html