Package source files into a recipe
This tutorial shows how to package the provided Retail Sales sample source files into an archive file. The archive can then be used to create a recipe in Adobe Experience Platform Data Science Workspace by following the recipe import workflow, either in the UI or using the API.
Concepts to understand:
- Recipes: A recipe is Adobe’s term for a model specification. It is a top-level container representing a specific machine learning or artificial intelligence algorithm (or ensemble of algorithms), together with the processing logic and configuration required to build and execute a trained model and so help solve specific business problems.
- Source files: Individual files in your project that contain the logic for a recipe.
Prerequisites
Recipe creation
Recipe creation starts with packaging source files to build an archive file. Source files define the machine learning logic and algorithms used to solve a specific problem, and are written in Python, R, PySpark, or Scala. Built archive files take the form of a Docker image. Once built, the packaged archive file is imported into Data Science Workspace to create a recipe, either in the UI or using the API.
Docker-based model authoring
A Docker image allows a developer to package up an application with all the parts it needs, such as libraries and other dependencies, and ship it out as one package.
The built Docker image is pushed to the Azure Container Registry using credentials supplied to you during the recipe creation workflow.
To obtain your Azure Container Registry credentials, log in to Adobe Experience Platform. In the left navigation column, navigate to Workflows. Select Import Recipe, then select Launch.
The Configure page opens. Provide an appropriate Recipe Name, for example, “Retail Sales recipe”, and optionally provide a description or documentation URL. Once complete, click Next.
Select the appropriate Runtime, then choose a Classification for Type. Your Azure Container Registry credentials are generated once complete.
- For Python recipes select the Python runtime.
- For R recipes select the R runtime.
- For PySpark recipes select the PySpark runtime. An artifact type auto-populates.
- For Scala recipes select the Spark runtime. An artifact type auto-populates.
Note the values for Docker host, username, and password. These are used to build and push your Docker image in the workflows outlined below.
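The Docker host, username, and password noted above correspond to standard Docker CLI operations. The following is a minimal sketch of a manual login, build, and push. It is not the content of the repository's login.sh and build.sh scripts, which this tutorial does not show. The variable names, the example registry host, and the image name `ml-example` are hypothetical placeholders. With `DRY_RUN=1` (the default here) the function only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with the values from the Import Recipe workflow.
# Note: the name DOCKER_HOST is avoided because the Docker CLI reserves that
# environment variable for the daemon address.
REGISTRY_HOST="${REGISTRY_HOST:-example.azurecr.io}"
REGISTRY_USER="${REGISTRY_USER:-example-user}"
IMAGE_NAME="${IMAGE_NAME:-ml-example}"   # hypothetical image name
VERSION_TAG="${VERSION_TAG:-0.1}"
DRY_RUN="${DRY_RUN:-1}"

# Compose the full image reference: host/name:tag.
image_ref() {
  echo "${REGISTRY_HOST}/${IMAGE_NAME}:${VERSION_TAG}"
}

# Print the planned commands, then run them only when DRY_RUN=0.
# REGISTRY_PASSWORD is expected in the environment, never hard-coded.
push_recipe_image() {
  ref="$(image_ref)"
  echo "+ docker login --username $REGISTRY_USER --password-stdin $REGISTRY_HOST"
  echo "+ docker build -t $ref ."
  echo "+ docker push $ref"
  [ "$DRY_RUN" = "0" ] || return 0
  printf '%s' "${REGISTRY_PASSWORD:-}" |
    docker login --username "$REGISTRY_USER" --password-stdin "$REGISTRY_HOST" &&
    docker build -t "$ref" . &&
    docker push "$ref"
}
```

Calling `push_recipe_image` from a directory containing a Dockerfile, with `DRY_RUN=0` and `REGISTRY_PASSWORD` set, would perform the three steps in order and stop at the first failure.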
Package the source files
Start by obtaining the sample codebase found in the Experience Platform Data Science Workspace Reference repository.
Build Python Docker image
If you have not done so, clone the GitHub repository onto your local system with the following command:
git clone https://github.com/adobe/experience-platform-dsw-reference.git
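Because each build section repeats the clone step, it can help to make the step safe to re-run. The sketch below clones the repository only when no checkout exists in the current directory. The function name `clone_if_missing` is hypothetical, and the check assumes the default directory name that `git clone` derives from the URL.

```shell
#!/bin/sh
REPO_URL="https://github.com/adobe/experience-platform-dsw-reference.git"
REPO_DIR="experience-platform-dsw-reference"

# Clone only when the current directory has no existing checkout.
clone_if_missing() {
  if [ -d "$REPO_DIR/.git" ]; then
    echo "Repository already cloned: $REPO_DIR"
  else
    git clone "$REPO_URL" "$REPO_DIR"
  fi
}
```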
Navigate to the directory experience-platform-dsw-reference/recipes/python/retail. Here, you will find the scripts login.sh and build.sh, which are used to log in to Docker and to build the Python Docker image. If you have your Docker credentials ready, enter the following commands in order:
# for logging in to Docker
./login.sh
# for building Docker image
./build.sh
Note that when executing the login script, you need to provide the Docker host, username, and password. When building, you are required to provide the Docker host and a version tag for the build.
Once the build script is complete, you are given a Docker source file URL in your console output. For this specific example, it will look something like:
# URL format:
{DOCKER_HOST}/ml-retailsales-python:{VERSION_TAG}
Copy this URL and move on to the next steps.
Build R Docker image
If you have not done so, clone the GitHub repository onto your local system with the following command:
git clone https://github.com/adobe/experience-platform-dsw-reference.git
Navigate to the directory experience-platform-dsw-reference/recipes/R/Retail - GradientBoosting inside your cloned repository. Here, you will find the scripts login.sh and build.sh, which you will use to log in to Docker and to build the R Docker image. If you have your Docker credentials ready, enter the following commands in order:
# for logging in to Docker
./login.sh
# for building Docker image
./build.sh
Note that when executing the login script, you need to provide the Docker host, username, and password. When building, you are required to provide the Docker host and a version tag for the build.
Once the build script is complete, you are given a Docker source file URL in your console output. For this specific example, it will look something like:
# URL format:
{DOCKER_HOST}/ml-retail-r:{VERSION_TAG}
Copy this URL and move on to the next steps.
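The R recipe directory name contains spaces, so the path must be quoted when it is passed to `cd`. As a minimal sketch, the hypothetical helper `enter_recipe_dir` below changes into a recipe directory and confirms that both scripts named in this tutorial are present before you run them. The path is relative to the directory where the repository was cloned.

```shell
#!/bin/sh
# Quoting keeps "Retail - GradientBoosting" together as one path component.
R_RECIPE_DIR="experience-platform-dsw-reference/recipes/R/Retail - GradientBoosting"

# Change into a recipe directory and check that login.sh and build.sh exist.
enter_recipe_dir() {
  cd "$1" || return 1
  for script in login.sh build.sh; do
    if [ ! -f "$script" ]; then
      echo "missing $script in $1" >&2
      return 1
    fi
  done
}
```

For example, `enter_recipe_dir "$R_RECIPE_DIR"` returns a non-zero status if the directory or either script is missing.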
Build PySpark Docker image
Start by cloning the GitHub repository onto your local system with the following command:
git clone https://github.com/adobe/experience-platform-dsw-reference.git
Navigate to the directory experience-platform-dsw-reference/recipes/pyspark/retail. The scripts login.sh and build.sh are located here and are used to log in to Docker and to build the Docker image. If you have your Docker credentials ready, enter the following commands in order:
# for logging in to Docker
./login.sh
# for building Docker image
./build.sh
Note that when executing the login script, you need to provide the Docker host, username, and password. When building, you are required to provide the Docker host and a version tag for the build.
Once the build script is complete, you are given a Docker source file URL in your console output. For this specific example, it will look something like:
# URL format:
{DOCKER_HOST}/ml-retailsales-pyspark:{VERSION_TAG}
Copy this URL and move on to the next steps.
Build Scala Docker image
Start by cloning the GitHub repository onto your local system by running the following command in a terminal:
git clone https://github.com/adobe/experience-platform-dsw-reference.git
Next, navigate to the directory experience-platform-dsw-reference/recipes/scala, where you can find the scripts login.sh and build.sh. These scripts are used to log in to Docker and build the Docker image. If you have your Docker credentials ready, enter the following commands in the terminal in order:
# for logging in to Docker
./login.sh
# for building Docker image
./build.sh
If you encounter an error when running the login.sh script, try using the command bash login.sh.
When executing the login script, you need to provide the Docker host, username, and password. When building, you are required to provide the Docker host and a version tag for the build.
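The suggestion to use `bash login.sh` can be sketched as a small wrapper. This assumes the failure is caused by the script lacking execute permission, which the tutorial does not state explicitly. The function name `run_script` is hypothetical, and the script is expected to be in the current directory.

```shell
#!/bin/sh
# Run a script from the current directory. Execute it directly when it has
# the execute bit set; otherwise hand it to bash as an argument.
run_script() {
  script="$1"
  shift
  if [ -x "$script" ]; then
    "./$script" "$@"
  else
    bash "$script" "$@"
  fi
}
```

Alternatively, `chmod +x login.sh build.sh` sets the execute bit so that `./login.sh` and `./build.sh` can be invoked directly.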
Once the build script is complete, you are given a Docker source file URL in your console output. For this specific example, it will look something like:
# URL format:
{DOCKER_HOST}/ml-retailsales-spark:{VERSION_TAG}
Copy this URL and move on to the next steps.
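The four URL formats shown in this tutorial differ only in the image name. As an illustrative sketch, the hypothetical helpers below rebuild the expected URL from a recipe type, Docker host, and version tag, which can be useful for checking the value copied from the console output. The image names are taken from the formats above; the function names and the short type keywords are assumptions.

```shell
#!/bin/sh
# Map a recipe type to the image name used in this tutorial's URL formats.
image_name_for() {
  case "$1" in
    python)  echo "ml-retailsales-python" ;;
    r)       echo "ml-retail-r" ;;
    pyspark) echo "ml-retailsales-pyspark" ;;
    scala)   echo "ml-retailsales-spark" ;;
    *)       echo "unknown recipe type: $1" >&2; return 1 ;;
  esac
}

# Usage: recipe_image_url <type> <docker_host> <version_tag>
recipe_image_url() {
  name="$(image_name_for "$1")" || return 1
  echo "$2/$name:$3"
}
```

For example, `recipe_image_url pyspark "$REGISTRY_HOST" 1.0` prints the host followed by `/ml-retailsales-pyspark:1.0`, where `REGISTRY_HOST` is a placeholder for your Docker host value.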
Next steps
This tutorial covered packaging source files into a recipe, the prerequisite step for importing a recipe into Data Science Workspace. You should now have a Docker image in Azure Container Registry along with the corresponding image URL. You are now ready to begin the tutorial on importing a packaged recipe into Data Science Workspace. Select one of the tutorial links below to get started: