Encrypted data ingestion
Adobe Experience Platform allows you to ingest encrypted files through cloud storage batch sources. With encrypted data ingestion, you can leverage asymmetric encryption mechanisms to securely transfer batch data into Experience Platform. Currently, the supported asymmetric encryption mechanisms are PGP and GPG.
The encrypted data ingestion process is as follows:
- Create an encryption key pair using Experience Platform APIs. The encryption key pair consists of a private key and a public key. Once created, you can copy or download the public key, alongside its corresponding public key ID and expiry time. During this process, the private key is stored by Experience Platform in a secure vault. NOTE: The public key in the response is Base64-encoded and must be decoded prior to use.
- Use the public key to encrypt the data file that you want to ingest.
- Place your encrypted file in your cloud storage.
- Once the encrypted file is ready, create a source connection and a dataflow for your cloud storage source. During the flow creation step, you must provide an `encryption` parameter and include your public key ID.
- Experience Platform retrieves the private key from the secure vault to decrypt the data at the time of ingestion.
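As noted in the steps above, the public key returned by Experience Platform is Base64-encoded and must be decoded before use. The following is a minimal Python sketch of that decoding step; the key string shown is a stand-in for the `publicKey` value from the API response.

```python
import base64

# Stand-in for the Base64-encoded "publicKey" value returned by the API.
# A real value decodes to an ASCII-armored PGP public key block.
encoded_key = base64.b64encode(
    b"-----BEGIN PGP PUBLIC KEY BLOCK-----\n...\n-----END PGP PUBLIC KEY BLOCK-----\n"
).decode("ascii")

# Decode the key before use, for example before importing it into your PGP tooling.
armored_key = base64.b64decode(encoded_key)

with open("platform_public_key.asc", "wb") as f:
    f.write(armored_key)

print(armored_key.decode("ascii").splitlines()[0])
# → -----BEGIN PGP PUBLIC KEY BLOCK-----
```

After decoding, the key can be imported into your PGP/GPG tooling (for example, `gpg --import platform_public_key.asc`) and used to encrypt the files you intend to upload.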
This document provides steps on how to generate an encryption key pair to encrypt your data, and how to ingest that encrypted data into Experience Platform using cloud storage sources.
Getting started
This tutorial requires you to have a working understanding of the following components of Adobe Experience Platform:
- Sources: Experience Platform allows data to be ingested from various sources while providing you with the ability to structure, label, and enhance incoming data using Platform services.
- Cloud storage sources: Create a dataflow to bring batch data from your cloud storage source to Experience Platform.
- Sandboxes: Experience Platform provides virtual sandboxes which partition a single Platform instance into separate virtual environments to help develop and evolve digital experience applications.
Using Platform APIs
For information on how to successfully make calls to Platform APIs, see the guide on getting started with Platform APIs.
Supported file extensions for encrypted files
The supported file extensions for encrypted files are as follows:
- .csv
- .tsv
- .json
- .parquet
- .csv.gpg
- .tsv.gpg
- .json.gpg
- .parquet.gpg
- .csv.pgp
- .tsv.pgp
- .json.pgp
- .parquet.pgp
- .gpg
- .pgp
Create encryption key pair
The first step in ingesting encrypted data to Experience Platform is to create your encryption key pair by making a POST request to the `/encryption/keys` endpoint of the Connectors API.
API format
```http
POST /data/foundation/connectors/encryption/keys
```
Request
The following request generates an encryption key pair using the PGP encryption algorithm.
```shell
curl -X POST \
  'https://platform.adobe.io/data/foundation/connectors/encryption/keys' \
  -H 'Authorization: Bearer {{ACCESS_TOKEN}}' \
  -H 'x-api-key: {{API_KEY}}' \
  -H 'x-gw-ims-org-id: {{ORG_ID}}' \
  -H 'x-sandbox-name: {{SANDBOX_NAME}}' \
  -H 'Content-Type: application/json' \
  -d '{
      "encryptionAlgorithm": "PGP",
      "params": {
          "passPhrase": "{{PASSPHRASE}}"
      }
  }'
```
| Parameter | Description |
| --- | --- |
| `encryptionAlgorithm` | The type of encryption algorithm to use. The supported encryption types are `PGP` and `GPG`. |
| `params.passPhrase` | The passphrase provides an additional layer of protection for your encryption keys. |
Response
A successful response returns your Base64-encoded public key, public key ID, and the expiry time of your keys. The expiry time is automatically set to 180 days after the date of key generation and is currently not configurable.
```json
{
    "publicKey": "{PUBLIC_KEY}",
    "publicKeyId": "{PUBLIC_KEY_ID}",
    "expiryTime": "1684843168"
}
```
| Property | Description |
| --- | --- |
| `publicKey` | The Base64-encoded public key. Use this key, once decoded, to encrypt the files that you want to ingest. |
| `publicKeyId` | The ID that corresponds to your public key. Provide this ID when creating a dataflow for your encrypted data. |
| `expiryTime` | The expiry time of your encryption key pair, expressed in Unix epoch time. |
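The `expiryTime` is expressed in Unix epoch seconds; it can be converted to a readable date with standard tooling. For example, in Python, using the value from the sample response above:

```python
from datetime import datetime, timezone

# "expiryTime" from the sample response, in Unix epoch seconds
expiry_time = 1684843168

# Convert to a human-readable UTC timestamp
expiry_date = datetime.fromtimestamp(expiry_time, tz=timezone.utc)
print(expiry_date.isoformat())  # → 2023-05-23T11:59:28+00:00
```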
Create customer managed key pair
You can optionally create a sign verification key pair to sign and ingest your encrypted data.
During this stage, you must generate your own private and public key combination and then use your private key to sign your encrypted data. Next, you must encode your public key in Base64 and share it with Experience Platform so that Platform can verify your signature.
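The Base64 encoding step can be done with any standard tool. The following is a minimal Python sketch, assuming you have already exported your own armored signing public key (for example, with `gpg --armor --export`); the file name and key contents below are hypothetical stand-ins.

```python
import base64
from pathlib import Path

# Hypothetical armored public key, as exported from your own keyring
Path("signing_public_key.asc").write_text(
    "-----BEGIN PGP PUBLIC KEY BLOCK-----\n...\n-----END PGP PUBLIC KEY BLOCK-----\n"
)

# Base64-encode the key before sharing it with Experience Platform
encoded_public_key = base64.b64encode(
    Path("signing_public_key.asc").read_bytes()
).decode("ascii")

print(encoded_public_key[:24])
```

The resulting `encoded_public_key` string is what you send as the `publicKey` value in the request below.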
Share your public key to Experience Platform
To share your public key, make a POST request to the `/customer-keys` endpoint while providing your encryption algorithm and your Base64-encoded public key.
API format
```http
POST /data/foundation/connectors/customer-keys
```
Request
```shell
curl -X POST \
  'https://platform.adobe.io/data/foundation/connectors/customer-keys' \
  -H 'Authorization: Bearer {{ACCESS_TOKEN}}' \
  -H 'x-api-key: {{API_KEY}}' \
  -H 'x-gw-ims-org-id: {{ORG_ID}}' \
  -H 'x-sandbox-name: {{SANDBOX_NAME}}' \
  -H 'Content-Type: application/json' \
  -d '{
      "encryptionAlgorithm": "PGP",
      "publicKey": "{BASE64_ENCODED_PUBLIC_KEY}"
  }'
```
| Parameter | Description |
| --- | --- |
| `encryptionAlgorithm` | The type of encryption algorithm that you are using. The supported encryption types are `PGP` and `GPG`. |
| `publicKey` | The public key that corresponds to your customer managed keys used for signing your encrypted data. This key must be Base64-encoded. |
Response

A successful response returns the public key ID that corresponds to your customer managed key.

```json
{
    "publicKeyId": "{PUBLIC_KEY_ID}"
}
```

| Property | Description |
| --- | --- |
| `publicKeyId` | This public key ID is returned in response to sharing your customer managed key with Experience Platform. You can provide this public key ID as the sign verification key ID when creating a dataflow for signed and encrypted data. |
Connect your cloud storage source to Experience Platform using the Flow Service API
Once you have retrieved your encryption key pair, you can now proceed and create a source connection for your cloud storage source and bring your encrypted data to Platform.
First, you must create a base connection to authenticate your source against Platform. To create a base connection and authenticate your source, select the source you would like to use from the list below:
After creating a base connection, you must then follow the steps outlined in the tutorial for creating a source connection for a cloud storage source in order to create a source connection, a target connection, and a mapping.
Create a dataflow for encrypted data
To create a dataflow, make a POST request to the `/flows` endpoint of the Flow Service API. To ingest encrypted data, you must add an `encryption` section to the `transformations` property and include the `publicKeyId` that was created in an earlier step.
API format
```http
POST /flows
```
Request
The following request creates a dataflow to ingest encrypted data for a cloud storage source.
```shell
curl -X POST \
  'https://platform.adobe.io/data/foundation/flowservice/flows' \
  -H 'Authorization: Bearer {{ACCESS_TOKEN}}' \
  -H 'x-api-key: {{API_KEY}}' \
  -H 'x-gw-ims-org-id: {{ORG_ID}}' \
  -H 'x-sandbox-name: {{SANDBOX_NAME}}' \
  -H 'Content-Type: application/json' \
  -d '{
      "name": "ACME encrypted data dataflow",
      "flowSpec": {
          "id": "{FLOW_SPEC_ID}",
          "version": "1.0"
      },
      "sourceConnectionIds": [
          "{SOURCE_CONNECTION_ID}"
      ],
      "targetConnectionIds": [
          "{TARGET_CONNECTION_ID}"
      ],
      "transformations": [
          {
              "name": "Mapping",
              "params": {
                  "mappingId": "{MAPPING_ID}"
              }
          },
          {
              "name": "Encryption",
              "params": {
                  "publicKeyId": "{PUBLIC_KEY_ID}"
              }
          }
      ],
      "scheduleParams": {
          "startTime": "{START_TIME}",
          "frequency": "once"
      }
  }'
```
| Property | Description |
| --- | --- |
| `flowSpec.id` | The flow spec ID that corresponds with cloud storage sources. |
| `sourceConnectionIds` | The source connection ID. This ID represents the transfer of data from source to Platform. |
| `targetConnectionIds` | The target connection ID. This ID represents where the data lands once it is brought over to Platform. |
| `transformations[x].params.mappingId` | The mapping ID. |
| `transformations.name` | When ingesting encrypted files, you must provide `Encryption` as an additional transformations parameter for your dataflow. |
| `transformations[x].params.publicKeyId` | The public key ID that you created. This ID is one half of the encryption key pair used to encrypt your cloud storage data. |
| `scheduleParams.startTime` | The start time for the dataflow in epoch time. |
| `scheduleParams.frequency` | The frequency at which the dataflow will collect data. Acceptable values include: `once`, `minute`, `hour`, `day`, or `week`. |
| `scheduleParams.interval` | The interval designates the period between two consecutive flow runs. The interval's value should be a non-zero integer. Interval is not required when frequency is set to `once` and should be greater than or equal to 15 for other frequency values. |
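Since `scheduleParams.startTime` expects epoch time, you may need to convert a calendar date before building the request. A small Python sketch (the date shown is an arbitrary example):

```python
from datetime import datetime, timezone

# Arbitrary example: schedule the dataflow for 2024-01-15 09:00 UTC
start_time = int(datetime(2024, 1, 15, 9, 0, tzinfo=timezone.utc).timestamp())
print(start_time)  # → 1705309200
```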
If you are ingesting signed and encrypted data, you must also include the sign verification key ID in the `Encryption` transformation parameters:

```json
"transformations": [
    {
        "name": "Encryption",
        "params": {
            "publicKeyId": "{PUBLIC_KEY_ID}",
            "signVerificationKeyId": "{SIGN_VERIFICATION_KEY_ID}"
        }
    }
]
```

| Property | Description |
| --- | --- |
| `params.signVerificationKeyId` | The sign verification key ID is the same as the public key ID that was retrieved after sharing your Base64-encoded public key with Experience Platform. |
Response
A successful response returns the ID (`id`) of the newly created dataflow for your encrypted data.
```json
{
    "id": "dbc5c132-bc2a-4625-85c1-32bc2a262558",
    "etag": "\"8e000533-0000-0200-0000-5f3c40fd0000\""
}
```
Restrictions on recursive ingestion
Encrypted data ingestion does not support ingestion of recursive or multi-level folders in sources. All encrypted files must be contained in a single folder. Wildcards with multiple folders in a single source path are also not supported.
The following is an example of a supported folder structure, where the source path is `/ACME-customers/*.csv.gpg`.
In this scenario, the files in bold are ingested into Experience Platform.

- ACME-customers
  - **File1.csv.gpg**
  - File2.json.gpg
  - **File3.csv.gpg**
  - File4.json
  - **File5.csv.gpg**
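The wildcard in the source path behaves like a standard glob pattern. As an illustration (not Platform code), Python's `fnmatch` reproduces which of the files above a `*.csv.gpg` source path selects:

```python
import fnmatch

# Files from the supported folder structure example
files = ["File1.csv.gpg", "File2.json.gpg", "File3.csv.gpg", "File4.json", "File5.csv.gpg"]

# Files selected by a source path ending in *.csv.gpg
matched = fnmatch.filter(files, "*.csv.gpg")
print(matched)  # → ['File1.csv.gpg', 'File3.csv.gpg', 'File5.csv.gpg']
```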
The following is an example of an unsupported folder structure, where the source path is `/ACME-customers/*`.

In this scenario, the flow run will fail and return an error message indicating that data cannot be copied from the source.

- ACME-customers
  - File1.csv.gpg
  - File2.json.gpg
  - Subfolder1
    - File3.csv.gpg
    - File4.json.gpg
    - File5.csv.gpg
- ACME-loyalty
  - File6.csv.gpg
Next steps
By following this tutorial, you have created an encryption key pair for your cloud storage data and a dataflow to ingest your encrypted data using the Flow Service API. For status updates on your dataflow's completion, errors, and metrics, read the guide on monitoring your dataflow using the Flow Service API.