ggsql: A Grammar of Graphics for SQL
- nicoritschel - 421 seconds ago
This is neat. I do wish there was a way for this to gracefully degrade in contexts without support for the grammar, though.
I devised an approach that is similar in spirit (inside SQL, very simplified compared with GoG) and does degrade, though it doesn't read as nicely: https://sqlnb.com/spec
- anentropic - 7782 seconds ago
Maybe I skim-read it too fast, but I did not find any clear description in the blog post or website docs of how this relates to SQL databases.
I was kind of guessing that it doesn't run in a database, and that it's a SQL-like syntax for a visualisation DSL handled by a front-end chart library.
That appears to be what is described in https://ggsql.org/get_started/anatomy.html
But then https://ggsql.org/faq.html has a section, "Can I use SQL queries inside the VISUALISE clause," which says, "Some parts of the syntax are passed on directly to the database".
The homepage says "ggsql interfaces directly with your database"
But it's not shown how that happens AFAICT
confused
- getnormality - 6128 seconds ago
I skimmed the article for an explanation of why this is needed, what problem it solves, and didn't find one I could follow. Is the point that we want to be able to ask for visualizations directly against tables in remote SQL databases, instead of having to first pull the data into R data frames so we can run ggplot on it? But why create a new SQL-like language? We already have a package, dbplyr, that translates between R and SQL. Wouldn't it be more direct to extend ggplot to support dbplyr tbl objects, and have ggplot generate the SQL?
Or is the idea that SQL is such a great language to write in that a lot of people will be thrilled to do their ggplots in this SQL-like language?
EDIT: OK, after looking at almost all of the documentation, I think I've finally figured it out. It's a standalone visualization app with a SQL-like API that currently has backends for DuckDB and SQLite and renders plots with Vega-Lite. They plan to support more backends and renderers in the future. As a commenter below said, it's supposed to help SQL specialists who don't know Python or R make visualizations.
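To make the architecture described in that EDIT concrete, here is a minimal sketch of the general idea: run a query against a SQL backend, then wrap the result rows in a Vega-Lite JSON spec that a separate renderer can draw. This is not ggsql's code or API. The helper name `query_to_vegalite`, the `cars` table, and its columns are hypothetical, and SQLite is used only because it ships with Python.

```python
import json
import sqlite3


def query_to_vegalite(conn, sql, x, y, mark="point"):
    """Run `sql` and embed the result rows in a Vega-Lite spec (hypothetical helper)."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    return {
        "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
        "data": {"values": rows},  # query result inlined as data
        "mark": mark,
        "encoding": {
            # Both axes are assumed to be numeric in this sketch.
            "x": {"field": x, "type": "quantitative"},
            "y": {"field": y, "type": "quantitative"},
        },
    }


# Illustrative in-memory table; names and values are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (hp REAL, mpg REAL)")
conn.executemany(
    "INSERT INTO cars VALUES (?, ?)",
    [(110, 21.0), (93, 22.8), (175, 18.7)],
)

spec = query_to_vegalite(conn, "SELECT hp, mpg FROM cars ORDER BY hp", x="hp", y="mpg")
print(json.dumps(spec, indent=2))
```

The printed JSON can be pasted into any Vega-Lite viewer. The database does the querying and the spec is only an intermediate representation, which is the split the comment describes.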
- jorin - 2166 seconds ago
Really cool project! Would love to see a standard established for representing visualizations in SQL! I built a whole dashboarding tool on top of the idea: https://taleshape.com/shaper/docs/getting-started/ But Shaper takes a more pragmatic approach and just uses built-in functionality to describe how to visualize the results. The most value I see with viz as SQL is that it's a great format for LLMs to specify what they want while making it easy to audit and reproduce. Just built a Slack bot on top of that concept last week: https://taleshape.com/blog/build-your-own-data-analytics-sla...
- kasperset - 6405 seconds ago
Will this ever integrate the rest of the ggplot2-dependent packages described here: https://exts.ggplot2.tidyverse.org/gallery/ in the near or distant future? Sorry if it's already mentioned somewhere.
- jiehong - 1771 seconds ago
The CLI only produces vega-lite[0] JSON graphics, right?
It would be nice if it included a rendering engine.
- efromvt - 6887 seconds ago
Love the layering approach - that solves a problem I’ve had with other SQL/visual hybrids as you move past the basic charts.
- jiehong - 2392 seconds ago
Outstanding!
This can replace a lot of Excel in the end.
It makes so much sense now that it exists!
- thomasp85 - 10491 seconds ago
The new visualisation tool from Posit. Combines SQL with the grammar of graphics, known from ggplot2, D3, and plotnine.
- gh5000 - 6256 seconds ago
Is it conceivable that this could become a DuckDB extension, such that it can be used from within the DuckDB CLI? That would be pretty slick.
- kasperset - 8832 seconds ago
Looks intriguing. Brings plotting to SQL instead of “transforming” SQL for plotting.
- data_ders - 5086 seconds ago
ok, this is definitely up my alley. color me nerd-sniped and forgive the onslaught of questions.
my questions are less about the syntax, which i'm largely familiar with knowing both SQL and ggplot.
i'm more interested in the backend architecture. Looking at the Cargo.toml [1], I was surprised to not see a visualization dependency like D3 or Vega. Is this intentional?
I'm certainly going to take this for a spin and I think this could be incredible for agentic analytics. I'm mostly curious right now what "deployment" looks like both currently and in a utopian future.
utopia is easier -- what if databases supported it directly?!? but even then I think I'd rather have databases spit out an intermediate representation (IR) that could be handed to a viz engine, similar to how vega works. or perhaps the SQL is the IR?!
another question that arises from the question of composability: how distinct would a ggplot IR be from a metrics layer spec? could i use ggsql to create an IR that I then use R's ggplot to render (or vice versa maybe?)
as for the deployment story today, I'll likely learn most by doing (with agents). My experiment will be to kick off an agent to do something like: extract this dataset to S3 using dlt [2], model it using dbt [3], then use ggsql to visualize.
p.s. @thomasp85, I was a big fan of tidygraph back in the day [4]. love how small our data world is.
[1]: https://github.com/posit-dev/ggsql/blob/main/Cargo.toml
[2]: https://github.com/dlt-hub/dlt
[3]: https://github.com/dbt-labs/dbt-fusion
[4]: https://stackoverflow.com/questions/46466351/how-to-hide-unc...
- radarsat1 - 8246 seconds ago
Wow, love this idea.
- breakfastduck - 1860 seconds ago
This is fantastic. Feels like something that should've been in there from the start!
- rvba - 2336 seconds ago
1) Does this allow exporting to Excel?
2) How can manual adjustments be made?
- hei-lima - 4473 seconds ago
Really cool!
- dartharva - 6320 seconds ago
Would be awesome if this were somehow coupled with Evidence.dev