Updated repo to use the SDK more consistently #67
I've refactored the training pipeline so that each step (train and evaluate) uses the SDK to track its run via the Release ID and to look up run metrics dynamically, rather than storing them in intermediary JSON files.
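For illustration, here is a minimal sketch of how the evaluate step could fetch the training metrics through the SDK instead of a JSON hand-off; the metric name ("mse") is an assumption for the example, not necessarily the repo's actual metric:

```python
from azureml.core import Run

run = Run.get_context()                          # the current evaluate step run
release_id = run.properties.get("release-id", "unknown")

# Pull the training step's metrics from the parent pipeline run's children
# instead of reading them from an intermediary JSON file.
train_metrics = {}
for child in run.parent.get_children():
    if child.id == run.id:
        continue                                 # skip the evaluate run itself
    metrics = child.get_metrics()
    if "mse" in metrics:                         # assumed metric name for the train step
        train_metrics = metrics
        break

print(f"Metrics for release {release_id}: {train_metrics}")
```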
This update also tags the Azure ML pipeline run with a "version" matching the Build ID, and each child run gets a "release-id" property matching the AzDO Release ID that triggered the pipeline run. This should improve traceability from model to run and back to build.
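A rough sketch of how a child run could be stamped with the release ID; the argument name and the way the ID reaches the step are assumptions, and the "version" tag on the pipeline run itself is shown in the submission sketch at the end of this description:

```python
import argparse
from azureml.core import Run

parser = argparse.ArgumentParser()
parser.add_argument("--release-id", default="local")   # assumed argument name
args = parser.parse_args()

run = Run.get_context()                                 # current child (step) run

# Stamp this child run with the AzDO Release ID that triggered the pipeline run
run.add_properties({"release-id": args.release_id})
```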
A few notes:
- This update also removes the separate "register_model_step" from the pipeline and instead registers the model (based on performance) in the evaluate step; the code for the register step is left in as a legacy snippet (see the registration sketch below).
- PipelineData and pipeline inputs/outputs are no longer required (which is a shame, because the JSON files were a decent example of how to use those features).
- Updated .env.example to match the variables needed for local testing.
- The pipeline assumes that the model name will always match the file name of the saved model.
- Pipelines are now submitted with the SDK rather than a REST call (see the submission sketch below).
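As referenced in the first note, here is a sketch of registering from the evaluate step; the metric name, threshold, and model/file name are placeholders rather than the repo's actual values:

```python
from azureml.core import Run

run = Run.get_context()
metrics = run.get_metrics()                          # metrics logged earlier in this step

# Register only if the model clears the (assumed) performance bar
if metrics.get("mse", float("inf")) < 0.5:
    run.register_model(
        model_name="regression_model",               # name assumed to match the file name
        model_path="outputs/regression_model.pkl",
        tags={"release-id": run.properties.get("release-id", "")},
    )
else:
    print("Model did not meet the performance bar; skipping registration.")
```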
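And a sketch of the SDK-based submission with the "version" tag described above; the workspace config, experiment name, step definitions, and compute target name are assumptions for illustration:

```python
import os
from azureml.core import Experiment, Workspace
from azureml.pipeline.core import Pipeline
from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

train_step = PythonScriptStep(
    name="train", script_name="train.py", compute_target="cpu-cluster"
)
evaluate_step = PythonScriptStep(
    name="evaluate", script_name="evaluate_model.py", compute_target="cpu-cluster"
)
evaluate_step.run_after(train_step)               # keep ordering without PipelineData

pipeline = Pipeline(workspace=ws, steps=[train_step, evaluate_step])

# Submit through the SDK and tag the pipeline run with the Build ID for traceability
experiment = Experiment(ws, "model-training")
pipeline_run = experiment.submit(
    pipeline, tags={"version": os.environ.get("BUILD_BUILDID", "local")}
)
pipeline_run.wait_for_completion(show_output=True)
```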