Autoloading existing Dagster definitions
This feature is in preview and under active development. It may change significantly or be removed entirely, and it is not considered ready for production use.
This guide covers using existing Dagster definitions with a dg-compatible project. To convert an existing project to use dg, see "Converting an existing project to use dg".
In projects that are started with dg, all definitions are typically kept in the defs/ directory. However, if you've converted an existing project to use dg, you may have definitions located in various other modules. This guide shows how to move these existing definitions into the defs directory so that they are loaded automatically.
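To build intuition for what "loaded automatically" means, here is a minimal standard-library sketch of package autoloading. It is not how dg is implemented, and the helper names iter_submodules and collect_objects are hypothetical. The idea it illustrates is that every module under a package is imported and its module-level objects of a given type are gathered, so nothing has to be listed by hand.

```python
import importlib
import pkgutil
from types import ModuleType


def iter_submodules(package: ModuleType):
    """Yield `package` itself, then every module beneath it, importing each one."""
    yield package
    prefix = package.__name__ + "."
    for info in pkgutil.walk_packages(package.__path__, prefix=prefix):
        yield importlib.import_module(info.name)


def collect_objects(package: ModuleType, wanted_type: type) -> dict:
    """Collect module-level objects of `wanted_type`, keyed by dotted path.

    Illustration only: a real loader would look for Dagster definition
    types and handle duplicates, errors, and configuration.
    """
    found = {}
    for module in iter_submodules(package):
        # Copy the items so later imports cannot mutate the dict mid-iteration.
        for name, value in list(vars(module).items()):
            if isinstance(value, wanted_type):
                found[f"{module.__name__}.{name}"] = value
    return found
```

In this sketch, an object imported into several modules is reported once per module, so a real loader would need to deduplicate.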
Example project structure
Let's walk through an example of migrating your existing definitions, with a project that has the following structure:
tree
.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── jobs.py
│   ├── definitions.py
│   ├── defs
│   │   └── __init__.py
│   └── elt
│       ├── __init__.py
│       ├── assets.py
│       └── jobs.py
└── pyproject.toml
5 directories, 11 files
At the top level, the root definitions.py file loads definitions from various modules:
import dagster as dg
import my_existing_project.defs
from my_existing_project.analytics import assets as analytics_assets
from my_existing_project.analytics.jobs import (
    regenerate_analytics_hourly_schedule,
    regenerate_analytics_job,
)
from my_existing_project.elt import assets as elt_assets
from my_existing_project.elt.jobs import sync_tables_daily_schedule, sync_tables_job

defs = dg.Definitions.merge(
    dg.Definitions(
        assets=dg.load_assets_from_modules([elt_assets, analytics_assets]),
        jobs=[sync_tables_job, regenerate_analytics_job],
        schedules=[sync_tables_daily_schedule, regenerate_analytics_hourly_schedule],
    ),
    dg.components.load_defs(my_existing_project.defs),
)
Each of these modules contains a variety of Dagster definitions, including assets, jobs, and schedules.
Let's migrate the elt module to a component.
Move definitions to defs
We'll start by moving the top-level elt module into defs/elt:
mv my_existing_project/elt my_existing_project/defs/elt
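For a scripted or cross-platform migration, the same move can be sketched in Python with pathlib and shutil. The helper name move_module_into_defs is hypothetical and not part of dg. The sketch moves the whole module directory, so it assumes that defs/<module> does not exist yet and refuses to overwrite it if it does.

```python
import shutil
from pathlib import Path


def move_module_into_defs(package_root: Path, module_name: str) -> Path:
    """Move `<package_root>/<module_name>` to `<package_root>/defs/<module_name>`.

    Hypothetical helper for illustration; returns the new location.
    """
    source = package_root / module_name
    target = package_root / "defs" / module_name
    if not source.is_dir():
        raise FileNotFoundError(f"no module directory at {source}")
    if target.exists():
        # Refuse to merge into or overwrite an existing defs/<module_name>.
        raise FileExistsError(f"{target} already exists")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(target))
    return target
```

For the example project, this would be called as move_module_into_defs(Path("my_existing_project"), "elt") from the repository root.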
Now that our definitions are in the defs directory, we can update the root definitions.py file to no longer explicitly load the elt module's Definitions:
import my_existing_project.defs
from my_existing_project.analytics import assets as analytics_assets
from my_existing_project.analytics.jobs import (
    regenerate_analytics_hourly_schedule,
    regenerate_analytics_job,
)

import dagster as dg
import dagster.components

defs = dg.Definitions.merge(
    dg.Definitions(
        assets=dg.load_assets_from_modules([analytics_assets]),
        jobs=[regenerate_analytics_job],
        schedules=[regenerate_analytics_hourly_schedule],
    ),
    dagster.components.load_defs(my_existing_project.defs),
)
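After a module has moved, any import that still points at its old location will fail. As a rough check, the following sketch scans a project's Python files for imports of the old dotted path using the standard ast module. The function name find_stale_imports is hypothetical, and the scan only catches absolute import statements, not relative imports or dynamic imports.

```python
import ast
from pathlib import Path


def find_stale_imports(project_root: Path, old_module: str) -> list:
    """Return (relative file path, line number) pairs that still import `old_module`."""
    stale = []
    for path in sorted(project_root.rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
                # Cover both `from pkg.elt import x` and `from pkg import elt`.
                names = [node.module]
                names += [f"{node.module}.{alias.name}" for alias in node.names]
            else:
                continue
            if any(n == old_module or n.startswith(old_module + ".") for n in names):
                stale.append((str(path.relative_to(project_root)), node.lineno))
    return stale
```

Run against the example project with old_module set to "my_existing_project.elt", an empty result suggests that definitions.py and the remaining modules no longer reference the old location.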
Our project structure now looks like this:
tree
.
├── README.md
├── my_existing_project
│   ├── __init__.py
│   ├── analytics
│   │   ├── __init__.py
│   │   ├── assets.py
│   │   └── jobs.py
│   ├── definitions.py
│   └── defs
│       ├── __init__.py
│       └── elt