
Data Engineering and MLOps specialist: Streamlining EDW & Data Pipelines for ML & AI products.

Azure Machine Learning pipelines are workflows of executable steps that let users run a complete machine learning workflow end to end. Steps in an Azure ML pipeline can include data import, transformation, feature engineering, model training, model optimisation, and deployment.

Benefits of Pipeline:

  1. Multiple teams can own and iterate on individual steps, which increases collaboration.
  2. Because execution is divided into distinct steps, each step can be assigned its own compute target, and independent steps can run in parallel.
  3. Execution time can improve, because steps whose inputs have not changed can be reused instead of rerun.
  4. Costs can drop, because expensive compute is needed only for the steps that require it.
  5. You can run and scale steps individually on different compute targets.
  6. Modular code is easier to reuse across pipelines.
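
To make the reuse and modularity benefits concrete, here is a toy sketch in plain Python. It is not the Azure ML API: `Step`, `ToyPipeline`, and the caching rule are hypothetical names invented for illustration. Each step is a function, and a step reruns only when its input changes, which is roughly the idea behind step reuse in a pipeline.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Step:
    """A named unit of work (hypothetical; not an Azure ML class)."""
    name: str
    func: Callable[[Any], Any]


@dataclass
class ToyPipeline:
    """Runs steps in order, feeding each step's output into the next."""
    steps: List[Step]
    # Cache keyed on (step name, repr of input) so unchanged work is skipped.
    _cache: Dict[Tuple[str, str], Any] = field(default_factory=dict)
    executed: List[str] = field(default_factory=list)

    def run(self, data: Any) -> Any:
        for step in self.steps:
            key = (step.name, repr(data))
            if key not in self._cache:
                self._cache[key] = step.func(data)
                self.executed.append(step.name)  # record real executions only
            data = self._cache[key]
        return data


toy = ToyPipeline([
    Step("clean", lambda rows: [r for r in rows if r is not None]),
    Step("scale", lambda rows: [r * 2 for r in rows]),
])

first = toy.run([1, None, 3])
second = toy.run([1, None, 3])  # identical input: both steps come from the cache
```

After the second call, `toy.executed` still lists each step only once, because nothing upstream changed.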

Creating a pipeline in Azure

We can create a pipeline either with the Azure Machine Learning designer or with Python code.

Creating pipeline using Python

1. Loading the workspace configuration

from azureml.core import Workspace

ws = Workspace.from_config()

2. Creating a training cluster as the compute target that executes our pipeline step

from azureml.core.compute import ComputeTarget, AmlCompute

# Name for the cluster (an example value; choose your own)
cpu_cluster_name = 'cpu-cluster'

compute_config = AmlCompute.provisioning_configuration(
    vm_size='STANDARD_D2_V2', max_nodes=4)

cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)

cpu_cluster.wait_for_completion(show_output=True)
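
When the script is run more than once, it is common to look the cluster up first and provision it only if it is missing. Below is a minimal sketch of that get-or-create pattern, assuming the azureml-core (SDK v1) package is installed. The helper name `get_or_create_cluster` and the default name `cpu-cluster` are illustrative choices, not part of the original code.

```python
def get_or_create_cluster(ws, name="cpu-cluster", vm_size="STANDARD_D2_V2", max_nodes=4):
    """Return the named compute target, provisioning it only if it is missing."""
    # Imported lazily so the helper can be defined without the SDK installed.
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        # Looking up a target that does not exist raises ComputeTargetException.
        return ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        config = AmlCompute.provisioning_configuration(
            vm_size=vm_size, max_nodes=max_nodes)
        cluster = ComputeTarget.create(ws, name, config)
        cluster.wait_for_completion(show_output=True)
        return cluster
```

With a workspace loaded, `cpu_cluster = get_or_create_cluster(ws)` would then stand in for the create-and-wait calls.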

3. Defining an estimator, which provides the required configuration for a target ML framework:

from azureml.train.estimator import Estimator

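# source_directory is the folder that contains train.py ('.' is assumed here)
estimator = Estimator(source_directory='.', entry_script='train.py',
                      compute_target=cpu_cluster, conda_packages=['tensorflow'])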
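
The estimator only points at `train.py`; the script itself is not shown in this article. As a rough, hypothetical skeleton of what such an entry script might look like: the `--epochs` and `--learning-rate` arguments and the placeholder training loop are assumptions, and a real script would build and fit a TensorFlow model instead.

```python
# train.py (hypothetical skeleton, not the script used in this article)
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Toy training entry script")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    # parse_known_args ignores any extra arguments the launcher may append.
    args, _unknown = parser.parse_known_args(argv)
    return args


def train(epochs, learning_rate):
    """Placeholder loop: returns a fake loss that shrinks each epoch."""
    loss = 1.0
    for _ in range(epochs):
        loss *= (1.0 - learning_rate)
    return loss


def main(argv=None):
    args = parse_args(argv)
    loss = train(args.epochs, args.learning_rate)
    print(f"final loss: {loss:.4f}")
    return loss


if __name__ == "__main__":
    main()
```

Keeping argument parsing and training in separate functions makes the script easy to try locally, for example `python train.py --epochs 3`, before it is submitted to a cluster.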

4. Configuring the estimator step

from azureml.pipeline.steps import EstimatorStep

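# The entry script takes no command-line arguments here, so pass an empty list
step = EstimatorStep(name="CNN_Train", estimator=estimator,
                     estimator_entry_script_arguments=[],
                     compute_target=cpu_cluster)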

5. Defining a pipeline:

from azureml.pipeline.core import Pipeline

pipeline = Pipeline(ws, steps=[step])

A pipeline is defined simply as a series of steps linked to a workspace.

6. Validating the pipeline to check for configuration errors

pipeline.validate()

7. Once all steps are validated, we can submit the pipeline as an experiment to the workspace.

from azureml.core import Experiment

exp = Experiment(ws, "simple-pipeline")

run = exp.submit(pipeline)

run.wait_for_completion(show_output=True)

Creating pipeline using ML Designer

We have covered pipeline creation using Azure Machine Learning Designer in another article in detail.