At run-time

graph LR
    R[(raw data)] -- raw schema --> P[data processor]
    P -- clean schema --> C[(clean data)]
    C -- clean schema --> F[featurizer]
    F -- feature schema --> X[(feature data)]
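The run-time flow above can be sketched in plain Python as a schema check at every boundary. This is a minimal illustration, not the original implementation: the column names, the dict-based schema format, and the `validate` helper are all hypothetical, and a real project would more likely use a dataframe validation library.

```python
# Hypothetical schemas: column name -> expected Python type.
RAW_SCHEMA = {"user_id": int, "age": str}
CLEAN_SCHEMA = {"user_id": int, "age": int}
FEATURE_SCHEMA = {"user_id": int, "age_scaled": float}


def validate(rows, schema, name):
    """Raise if any row does not match the schema; otherwise return rows."""
    for i, row in enumerate(rows):
        if set(row) != set(schema):
            raise ValueError(f"{name} schema: row {i} has columns {sorted(row)}")
        for col, typ in schema.items():
            if not isinstance(row[col], typ):
                raise TypeError(f"{name} schema: row {i}, column {col!r} is not {typ.__name__}")
    return rows


def data_processor(raw):
    """raw schema in, clean schema out."""
    validate(raw, RAW_SCHEMA, "raw")
    clean = [{"user_id": r["user_id"], "age": int(r["age"].strip())} for r in raw]
    return validate(clean, CLEAN_SCHEMA, "clean")


def featurizer(clean):
    """clean schema in, feature schema out."""
    validate(clean, CLEAN_SCHEMA, "clean")
    features = [{"user_id": r["user_id"], "age_scaled": r["age"] / 100.0} for r in clean]
    return validate(features, FEATURE_SCHEMA, "feature")
```

Because each function validates both its input and its output, a schema violation surfaces at the step that caused it rather than several steps downstream.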

At test-time

graph LR
    S(raw schema) --> R[(mock raw data)]
    R --> P[data processor]
    P -- clean schema --> O[output]

graph LR
    S(clean schema) --> R[(mock clean data)]
    R --> F[featurizer]
    F -- feature schema --> O[output]
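The two test-time flows can be sketched as follows: the schema itself generates mock input, and each test only asserts that the step's output conforms to the next schema. Everything here is an assumption made for illustration, including the schema format, the column names, and the `mock_rows` and `conforms` helpers.

```python
import random

# Hypothetical schemas: column name -> expected Python type.
RAW_SCHEMA = {"user_id": int, "age": str}
CLEAN_SCHEMA = {"user_id": int, "age": int}
FEATURE_SCHEMA = {"user_id": int, "age_scaled": float}


def mock_rows(schema, n=5, seed=0):
    """Generate n rows of random values that satisfy the given schema."""
    rng = random.Random(seed)
    makers = {
        int: lambda: rng.randint(0, 99),
        float: lambda: rng.random(),
        str: lambda: str(rng.randint(0, 99)),  # numeric strings, like unparsed raw input
    }
    return [{col: makers[typ]() for col, typ in schema.items()} for _ in range(n)]


def conforms(rows, schema):
    """True if every row has exactly the schema's columns with the right types."""
    return all(
        set(row) == set(schema)
        and all(isinstance(row[col], typ) for col, typ in schema.items())
        for row in rows
    )


def data_processor(raw):
    return [{"user_id": r["user_id"], "age": int(r["age"])} for r in raw]


def featurizer(clean):
    return [{"user_id": r["user_id"], "age_scaled": r["age"] / 100.0} for r in clean]


def test_data_processor():
    # mock raw data -> data processor -> output must match the clean schema
    assert conforms(data_processor(mock_rows(RAW_SCHEMA)), CLEAN_SCHEMA)


def test_featurizer():
    # mock clean data -> featurizer -> output must match the feature schema
    assert conforms(featurizer(mock_rows(CLEAN_SCHEMA)), FEATURE_SCHEMA)
```

Neither test touches real data, so each step can be tested in isolation and the featurizer test does not depend on the data processor being correct.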
graph LR
    L[load] --DF-Raw--> C[clean]
    C --DF-Clean--> F[featurize]
    F --DF-Training--> T[train_model]
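One way to make the edge labels in this pipeline explicit is to give each intermediate table its own type, so the function signatures mirror the diagram. The sketch below uses `typing.NewType` over plain lists of rows. The type names, the toy data, and the one-parameter model are hypothetical stand-ins for real dataframes and a real training step.

```python
from typing import NewType

# Hypothetical aliases matching the edge labels DF-Raw, DF-Clean, DF-Training.
DFRaw = NewType("DFRaw", list)
DFClean = NewType("DFClean", list)
DFTraining = NewType("DFTraining", list)


def load() -> DFRaw:
    # Toy raw data: values arrive as strings.
    return DFRaw([{"x": "1", "y": "2"}, {"x": "3", "y": "6"}])


def clean(raw: DFRaw) -> DFClean:
    # Parse strings into floats.
    return DFClean([{"x": float(r["x"]), "y": float(r["y"])} for r in raw])


def featurize(cleaned: DFClean) -> DFTraining:
    # Rename columns into a feature/target layout.
    return DFTraining([{"feature": r["x"], "target": r["y"]} for r in cleaned])


def train_model(training: DFTraining) -> float:
    # Least-squares slope through the origin: sum(x*y) / sum(x*x).
    num = sum(r["feature"] * r["target"] for r in training)
    den = sum(r["feature"] ** 2 for r in training)
    return num / den


slope = train_model(featurize(clean(load())))
```

`NewType` adds no run-time checks. It only lets a static type checker flag calls such as `train_model(load())` that skip a stage.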