I've been curious about this project for a while...
If you squint a bit it's sort of like an Airflow that can run on AWS Step Functions.
Step Functions sort of gives you fully serverless orchestration, which feels like a thing that should exist. But the process for authoring them is very cumbersome - they are crying out for a nice language-level library, e.g. for Python, something that creates steps via decorator syntax.
And it looks like Metaflow basically provides that (as well as for other backends).
The main thing holding me back is the lack of ecosystem. A big chunk of what I want to run on an orchestrator is things like dbt and dlt jobs, both of which have strong integrations with Airflow and Dagster, whereas Metaflow feels like it's not really on the radar and not widely used.
Possibly I've got the wrong end of the stick a bit, because Metaflow also provides an Airflow backend - though in that case I sort of wonder why bother with Metaflow at all?
A while ago I saw a promising Clojure project, stepwise [0], which sounds pretty close to what you're describing. It not only lets you define steps in code, but also implements cool stuff like the ability to write conditions, error statuses and resources in much-less-verbose EDN instead of JSON. It also supports code reloading and offloading large payloads to S3.
Here's a nice article with code examples implementing a simple pipeline: https://www.quantisan.com/orchestrating-pizza-making-a-tutor....
[0]: https://github.com/Motiva-AI/stepwise
Metaflow was started to address the needs of ML/AI projects whereas Airflow and Dagster started in data engineering.
Consequently, a major part of Metaflow focuses on facilitating easy and efficient access to (large scale) compute - including dependency management - and local experimentation, which is out of scope for Airflow and Dagster.
Metaflow has basic support for dbt, and companies increasingly use it to power data engineering as AI is eating the world, but if you just need an orchestrator for ETL pipelines, Dagster is a great choice.
If you are curious to hear how companies navigate the Airflow vs. Metaflow question, see e.g. this recent talk by Flexport: https://youtu.be/e92eXfvaxU0
I went to the GitHub page. The descriptions of the service seem redundant with what cloud providers already offer today. I looked at the documentation and it lacks concrete examples of implementation flows.
Seems like something new to learn, an added layer on top of existing workflows, with no obvious benefit.
All the cloud providers have some hosted / custom version of an AI/ML deployment and training system. Good enough to use, janky enough that it probably won't meet all your needs if you're serious.
It's an old project from before the current AI buzz, and I rejected it when I looked at it a few years back, for similar reasons.
My opinion about Netflix OSS has been pretty low as well.
I don't know if it's a coincidence but we just released a major new feature in Metaflow a few days ago - composing flows with custom decorators: https://docs.metaflow.org/metaflow/composing-flows/introduct...
A big deal is that they get packaged automatically for remote execution. And you can attach them on the command line without touching code, which makes it easy to build pipelines with pluggable functionality - think e.g. switching an LLM provider on the fly.
If you haven't looked into Metaflow recently, configuration management is another big feature that was contributed by the team at Netflix: https://netflixtechblog.com/introducing-configurable-metaflo...
Many folks love the new native support for uv too: https://docs.metaflow.org/scaling/dependencies/uv
I'm happy to answer any questions here
Is it common to see Metaflow used alongside MLflow if a team wants to track experiment data?
Metaflow tracks all artifacts and lets you build dashboards with them, so there's no need to use MLflow per se. There are Metaflow integrations for Weights & Biases, CometML, etc., if you want pretty off-the-shelf dashboards.
As a fun historical sidebar, and an illustration that there are no new names in tech these days, Metaflow was also the name of the company that first introduced out-of-order speculative execution for CISC architectures using micro-ops. [1]
[1] https://en.wikipedia.org/wiki/Metaflow_Technologies