APIs
run_step ¶
Run a pipeline step's tasks based on the availability of task files.
Tasks are run in order, and each task's input and output files are checked for existence when the task is reached in the loop, rather than once at the start of the step. A task can therefore create intermediate files, and those files are checked only when they become inputs to a later task.
If any task's required input files are missing, the step bails out: no further tasks will run.
models ¶
Pydantic models to represent the tasks within a step in a data pipeline.
Executable ¶
Bases: BaseModel
All tasks must have an associated function to make them executable.
AvailableTask ¶
Bases: Executable
A task is available when its input files exist and its outputs don't.
CompletedTask ¶
Bases: Executable
A task is completed when its output files exist, whether inputs exist or not.
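The two states described above can be illustrated with a small, dependency-free sketch. It uses plain `pathlib` checks rather than the package's Pydantic models, and the `classify` helper, the `"blocked"` label, and the handling of tasks with no outputs are all assumptions made for illustration, not part of the documented API.

```python
import tempfile
from pathlib import Path


def classify(inputs: list[Path], outputs: list[Path]) -> str:
    """Return 'completed', 'available', or 'blocked' (hypothetical labels)."""
    # Outputs are checked first: completion does not depend on the inputs.
    # Assumption: a task with no outputs is never treated as completed.
    if outputs and all(p.exists() for p in outputs):
        return "completed"
    # Inputs exist and (not all) outputs do: the task can be run.
    if all(p.exists() for p in inputs):
        return "available"
    # Neither completed nor runnable: some input is missing.
    return "blocked"


with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "in.txt", Path(tmp) / "out.txt"
    state_before = classify([src], [dst])  # neither file exists yet
    src.write_text("data")
    state_ready = classify([src], [dst])   # input exists, output does not
    dst.write_text("result")
    src.unlink()
    state_done = classify([src], [dst])    # output exists, input removed
```

Checking outputs before inputs mirrors the statement that a completed task stays completed even if its inputs have since been removed.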
Task ¶
Bases: Executable
A task has zero or more input files and zero or more output files.
TaskRef ¶
Bases: Executable
A TaskRef is dereferenced to a Task by looking up src/dst fields on a config.
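One way to picture the dereferencing step is sketched below. It is only an illustration under stated assumptions: the real classes are Pydantic models, whereas this sketch uses dataclasses, and the field names (`fn`, `src`, `dst`), the `deref` method, and the `Config` class with its `raw_csv`/`clean_csv` fields are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class Task:
    fn: Callable[[], object]
    src: list[Path] = field(default_factory=list)  # concrete input paths
    dst: list[Path] = field(default_factory=list)  # concrete output paths


@dataclass
class TaskRef:
    fn: Callable[[], object]
    src: list[str] = field(default_factory=list)  # names of config fields
    dst: list[str] = field(default_factory=list)  # names of config fields

    def deref(self, config: object) -> Task:
        """Resolve each field name to the path stored on the config."""
        return Task(
            fn=self.fn,
            src=[Path(getattr(config, name)) for name in self.src],
            dst=[Path(getattr(config, name)) for name in self.dst],
        )


@dataclass
class Config:  # hypothetical config holding the pipeline's file paths
    raw_csv: Path
    clean_csv: Path


def clean() -> None:
    """Placeholder task function."""


config = Config(raw_csv=Path("data/raw.csv"), clean_csv=Path("data/clean.csv"))
ref = TaskRef(fn=clean, src=["raw_csv"], dst=["clean_csv"])
task = ref.deref(config)
```

Keeping names rather than paths on the reference means the same task definitions can be reused against different configs.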
Step ¶
run ¶
Control flow using the Pydantic runtime file I/O checks.
run_step ¶
Run a pipeline step's tasks based on the availability of task files.
Tasks are run in order, and each task's input and output files are checked for existence when the task is reached in the loop, rather than once at the start of the step. A task can therefore create intermediate files, and those files are checked only when they become inputs to a later task.
If any task's required input files are missing, the step bails out: no further tasks will run.
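The lazy checking and bail-out behaviour described here can be sketched as a simple loop. This is not the package's implementation: `run_step_sketch`, the dataclass-based `Task`, and the returned log are hypothetical, and skipping tasks whose outputs already exist is an assumption inferred from the `CompletedTask` description.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable


@dataclass
class Task:
    fn: Callable[[], object]
    src: list[Path] = field(default_factory=list)
    dst: list[Path] = field(default_factory=list)


def run_step_sketch(tasks: list[Task]) -> list[str]:
    """Run tasks in order, checking files only when each task is reached."""
    log = []
    for i, task in enumerate(tasks):
        # Assumption: a task whose outputs all exist is skipped as completed.
        if task.dst and all(p.exists() for p in task.dst):
            log.append(f"task {i}: skipped (outputs exist)")
            continue
        # Checked now, so files written by earlier tasks are visible.
        if any(not p.exists() for p in task.src):
            log.append(f"task {i}: missing inputs, bailing out")
            break  # no further tasks run
        task.fn()
        log.append(f"task {i}: ran")
    return log


with tempfile.TemporaryDirectory() as tmp:
    raw, mid, out = (Path(tmp) / n for n in ("raw.txt", "mid.txt", "out.txt"))
    absent, final = Path(tmp) / "absent.txt", Path(tmp) / "final.txt"
    raw.write_text("1")
    tasks = [
        # mid.txt does not exist at the start; task 0 creates it for task 1.
        Task(fn=lambda: mid.write_text(raw.read_text() + "2"), src=[raw], dst=[mid]),
        Task(fn=lambda: out.write_text(mid.read_text() + "3"), src=[mid], dst=[out]),
        # absent.txt is never created, so the step stops here.
        Task(fn=lambda: final.write_text("x"), src=[absent], dst=[final]),
        Task(fn=lambda: None, src=[raw], dst=[Path(tmp) / "never.txt"]),
    ]
    log = run_step_sketch(tasks)
    result = out.read_text()
```

Had the checks been done up front, the second task would have been rejected because `mid.txt` did not exist before the first task ran.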