databow: a Rust CLI to query any database with an ADBC driver

115 points by hckshr 3 days ago

Reviewing the issues and PRs there provides a clue what to expect as this project matures.

pjmlp 1 day ago

Ouch, CVE party.

Sorry, just trying to understand. Why would I use this over duckdb? duckdb already has plugins for a lot of databases. Is the syntax the advantage?

bunsenhoneydew 1 day ago

It’s going to have a very big hill to climb if it’s playing in a space where duckdb already has a hold. Duck has probably been my favourite technology find in the last few years. Awesome tech.
I’ll still check this out though.
- iconicBark 1 day ago
  
  I agree
f311a 1 day ago

ClickHouse also supports a lot of data sources and has a local mode where you just use a single binary with local-only access.
Coincidentally, I wrote an article today on how I use it for similar scenarios. It can fetch from S3, multiple databases at once, and so on.
And you get all the benefits of a database when you need to join or postprocess data from multiple sources.
https://rushter.com/blog/clickhouse-data-processing/
data_ders 1 day ago

I think the advantage is simplicity. Why connect first to duckdb and attach the db when you can query it directly with ADBC, which is guaranteed to be fast?
- f311a 1 day ago
  
  You don’t need to connect to duckdb, it’s just a process that you spawn.
  
  co0lster 11 hours ago
  
  You spawn an in-memory instance of duckdb and connect to it.
gigatexal 1 day ago

This is my question too
freakynit 1 day ago

It's more of a common database cli/shell which uses a well defined, and fast, ADBC protocol. You are basically freed from DuckDB's internal handling/runtime of query for various databases. Not to mention, this has vastly more supported databases.
With databow, the query still runs on the target database (unlike duckdb), but you get one consistent CLI across different databases: connection profiles, output formats, history, scripting, and import/export behavior.
This is genuinely useful for humans (for example, I regularly juggle 6-7 different databases: OLTP, OLAP, search and key-value mixed), and even more useful for AI coding agents, because they don't have to learn and juggle a different CLI and set of flags for every database.
- DangitBobby 15 hours ago
  
  FWIW duckdb's optimizer does actually push some of the query down into the target database such as selects and where clauses, which you definitely want in many cases.
co0lster 11 hours ago

duckdb relies on filter pushdowns, which makes the reason harder to spot, but you don’t use sqlite to connect to a database via the JDBC protocol, do you?
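
For anyone unsure what "filter pushdown" means here, a minimal sketch follows. It uses Python's built-in sqlite3 as a stand-in for a remote source; the `events` table and both helper functions are made up for illustration and are not how duckdb or databow implement anything. The point is only the difference in how many rows leave the source.

```python
import sqlite3

# Hypothetical stand-in for a remote database (table and data are made up).
remote = sqlite3.connect(":memory:")
remote.execute("CREATE TABLE events (id INTEGER, kind TEXT)")
remote.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i, "click" if i % 10 == 0 else "view") for i in range(1000)],
)

def without_pushdown(conn):
    # Pull every row out of the source, then filter on the client side.
    rows = conn.execute("SELECT id, kind FROM events").fetchall()
    return len(rows), [r for r in rows if r[1] == "click"]

def with_pushdown(conn):
    # Send the WHERE clause to the source so only matching rows come back.
    rows = conn.execute(
        "SELECT id, kind FROM events WHERE kind = ?", ("click",)
    ).fetchall()
    return len(rows), rows

transferred_all, clicks_a = without_pushdown(remote)
transferred_few, clicks_b = with_pushdown(remote)

# Same result either way, but far fewer rows transferred with pushdown.
print(transferred_all, transferred_few, sorted(clicks_a) == sorted(clicks_b))
# prints: 1000 100 True
```

A tool that sends the whole query to the target database gets this behaviour by construction, while an engine sitting in between has to decide which parts of the query it can hand down.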

wodenokoto 1 day ago

My biggest pain point with using different CLIs to connect to different databases is that they all do things like listing tables differently.

Another nice feature one would want from such a program is of course autocomplete.
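
To make the "listing tables" pain concrete, here is a rough sketch of the kind of normalization a common shell has to do. The shell shortcuts in the first dict are the usual ones for psql, mysql and sqlite3; the `list_tables` helper and the dialect names are hypothetical and say nothing about how databow actually does it. Only the sqlite path is exercised, since that ships with Python.

```python
import sqlite3

# Native shortcuts for listing tables in three common database shells.
NATIVE_LIST_TABLES = {
    "psql": r"\dt",
    "mysql": "SHOW TABLES;",
    "sqlite3": ".tables",
}

# Plain-SQL equivalents a wrapper could run on the user's behalf.
LIST_TABLES_SQL = {
    "postgresql": (
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = 'public' AND table_type = 'BASE TABLE' "
        "ORDER BY table_name"
    ),
    "mysql": (
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = DATABASE() AND table_type = 'BASE TABLE' "
        "ORDER BY table_name"
    ),
    "sqlite": (
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ),
}

def list_tables(conn, dialect):
    """Hypothetical helper: one call, whatever the backend."""
    cursor = conn.cursor()
    cursor.execute(LIST_TABLES_SQL[dialect])
    return [row[0] for row in cursor.fetchall()]

# Demo against an in-memory sqlite database with two made-up tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER)")
conn.execute("CREATE TABLE orders (id INTEGER)")
print(list_tables(conn, "sqlite"))  # prints: ['orders', 'users']
```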

data_ders 1 day ago

Yeah, for me standardization is the big win. Not just output formatting but also the CLI commands, plus a guarantee that they’re as fast as possible given that all the connectors use ADBC.
adonese 1 day ago

I find this puts my psql in a better place as a daily driver. Adding this to ~/.inputrc [1] lets you limit history search to the text you have already entered. E.g.,
select (up arrow)
will loop through your psql history for commands that started with, e.g., select. The challenging part is wide tables and/or tables with a lot of data. less is usually awkward for those, so I switched to pspg, which made it less awkward.
I also tried, with the help of AI, to write a plugin for Sublime that fits my flow. It worked well but I think I'm more used to psql.
[1]
~/.inputrc
$if psql
"\e[A": history-search-backward
"\e[B": history-search-forward
$endif
edit: formatting

ComputerGuru 1 day ago

ADBC: https://arrow.apache.org/docs/format/ADBC.html

Seems like a columnar version of ODBC, for OLAP instead of OLTP.

ifh-hn 1 day ago

This is excellent! I'm not a data engineer or SRE or whatever other commenters have mentioned. But part of my job is accessing data in various formats from various places, mostly offline. This is gonna be part of my toolset and I can pipe the output into other tools like nushell too.

aleda145 1 day ago

Cool! But as a data engineer I don't know when I would ever use this. Getting data into a centralized place so it can be joined and queried easily is like prio 1 for any data team.

I'm sure SREs will really love me doing expensive adhoc queries against production postgres /s

I've yet to work in enterprises big enough to have multi cloud data warehouses though, maybe it's more useful in that setting?

tonnydourado 1 day ago

As a consultant data engineer (ish), I think it has potential. You're right that any company doing data analytics is gonna be prioritizing a single source of truth and a unified platform, but each one will choose a different set of tools, which I'll have to learn, install, and even teach, for each new client. If I can use this to both explore AND implement stuff for clients regardless of their underlying database, that would be a pretty significant win.
- aleda145 1 day ago
  
  That's a great point! "Speed to insight" feels more important than ever
wodenokoto 1 day ago

Isn’t it useful for when you are getting things into a central DW?
E.g., you don’t need a million tools to connect to the million different application databases when inspecting sources as part of setting up pipelines.
- jnewton_dev 1 day ago
  
  Thanks for sharing — this clears up a misconception I've had for a while.