Day: June 8, 2023

Extracting Web Analytics Data using Python & Adobe Analytics 2.0 APIs

The Adobe Analytics 2.0 APIs provide a powerful way to directly interact with Adobe’s servers, enabling you to perform various actions programmatically that were previously only available through the user interface. In this blog post, we will explore how to leverage Python and the 2.0 API to extract web analytics data from Adobe Analytics.

Getting Started: OAuth Server-to-Server Credentials

The Service Account (JWT) credentials have been deprecated in favor of the OAuth Server-to-Server credentials, so this guide focuses on the latter. To obtain the API key (client ID) and client secret, refer to the official Adobe Developer documentation on the setup.

Authenticating and Accessing Adobe Analytics 2.0 API with Python

import requests
from authlib.integrations.requests_client import OAuth2Session

# Configure the Adobe Analytics API credentials
client_id = 'YOUR CLIENT ID'
client_secret = 'YOUR CLIENT SECRET'
token_endpoint = 'https://ims-na1.adobelogin.com/ims/token'
company_name = 'TGT_COMPANY'

# Create an OAuth2Session object with the client credentials
oauth = OAuth2Session(client_id, client_secret, scope='openid AdobeID additional_info.projectedProductContext')

# Fetch the access token from Adobe IMS
token = oauth.fetch_token(token_endpoint)

## Test a simple GET query
api_url = r'https://analytics.adobe.io/api/{}/annotations?locale=en_US&limit=10&page=0&sortProperty=id'.format(company_name)

headers = {
    'Authorization': 'Bearer ' + token['access_token'],
    'x-api-key': client_id
}
response = requests.get(api_url, headers=headers)

# Print the parsed JSON on success, otherwise the error details
if response.status_code == 200:
    data = response.json()
    print(data)
else:
    print('Error:', response.status_code, response.text)

The preceding Python script demonstrates the authentication process with Authlib and a first API request. The same headers can be reused for consecutive requests.
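Because every later request needs the same two headers, one option is to build them once in a small helper. The sketch below is only illustrative: `build_headers` is a hypothetical name, not part of Authlib or the Adobe API, and it assumes the token is a dict-like object with an `access_token` key, as in the script above.

```python
def build_headers(token, client_id):
    """Return the headers sent with each Adobe Analytics 2.0 API request.

    Assumes `token` is the dict-like object returned by fetch_token().
    """
    return {
        'Authorization': 'Bearer ' + token['access_token'],
        'x-api-key': client_id,
    }


# Example with placeholder values (not real credentials)
example_headers = build_headers({'access_token': 'abc123'}, 'YOUR CLIENT ID')
print(example_headers['Authorization'])
```

The resulting dictionary can be passed as `headers=` to both `requests.get` and `requests.post`.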

Using the Reporting API

The /reports endpoint is the primary endpoint for reporting requests that retrieve web analytics metrics. Because it uses the same API as the Analytics Workspace UI, it offers extensive configuration options. A report request must provide a date range, one or more metrics, and at least one dimension, all defined in a specific JSON structure. The sample code below shows that structure. For more information on creating the JSON structure, see the Adobe Analytics docs.

# Set up the values used in the JSON structure.
# Get Visits and Page Views from Jun 1st to Jun 5th, grouped by day.
# Note: the range below ends at midnight at the start of Jun 5th.

RSID = 'Report Suite ID'
START_DATE = '2023-06-01'
END_DATE = '2023-06-05'
MIDNIGHT = 'T00:00:00.000'
DATE_RANGE = START_DATE + MIDNIGHT + '/' + END_DATE + MIDNIGHT

DIM = 'variables/daterangeday'
METS = ['metrics/visits','metrics/pageviews']
METS_OBJ = [{'id':x} for x in METS]

query_json = {
    "rsid": RSID,
    "globalFilters": [
        {
            "type": "dateRange",
            "dateRange": DATE_RANGE
        }
    ],
    "metricContainer": {
        "metrics": METS_OBJ
    },
    "dimension": DIM,
    "settings": {
        "dimensionSort": "asc"
    }
}

api_url = r'https://analytics.adobe.io/api/{}/reports'.format(company_name)

response = requests.post(url=api_url, headers=headers, json=query_json)

# Process the response
if response.status_code == 200:
    data = response.json()
else:
    print('Error:', response.status_code, response.text)
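To request other date ranges, dimensions, or metrics, the request body above could be wrapped in a function. The following is a minimal sketch, not an official helper: `build_report_request` is a hypothetical name, and it assumes you want the last day included, so it ends the range at midnight after the `end` date.

```python
from datetime import date, timedelta


def build_report_request(rsid, start, end, dimension, metrics):
    """Build a /reports request body covering start..end inclusive."""
    # Assumption: ending the range at midnight after the last day
    # includes that whole day in the report.
    end_exclusive = end + timedelta(days=1)
    date_range = '{}T00:00:00.000/{}T00:00:00.000'.format(
        start.isoformat(), end_exclusive.isoformat())
    return {
        'rsid': rsid,
        'globalFilters': [{'type': 'dateRange', 'dateRange': date_range}],
        'metricContainer': {'metrics': [{'id': m} for m in metrics]},
        'dimension': dimension,
        'settings': {'dimensionSort': 'asc'},
    }


# Same report as above, but with Jun 5th fully included
example_query = build_report_request(
    'Report Suite ID',
    date(2023, 6, 1),
    date(2023, 6, 5),
    'variables/daterangeday',
    ['metrics/visits', 'metrics/pageviews'],
)
print(example_query['globalFilters'][0]['dateRange'])
```

The returned dictionary can be passed as `json=` to `requests.post`, in the same way as `query_json`.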

Formatting the Response Output

Upon receiving the response from the /reports endpoint, you can format the output in a tabular structure for better readability and analysis.

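import pandas as pd

# One row per dimension item, with the metric values held in a list
df = pd.DataFrame(response.json()['rows'])
df.columns = [DIM + '_key', DIM, 'data']  # rename columns
# Expand the list of metric values into one column per metric
dfa = pd.DataFrame(df['data'].to_list())
dfa.columns = METS
# Combine the dimension columns with the metric columns
output = pd.concat([df.iloc[:, :-1], dfa], axis='columns')
output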
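To make the reshaping step easier to follow, here is a rough equivalent in plain Python. It is a sketch under assumptions: `flatten_rows` is a hypothetical helper, each row is assumed to carry the keys `itemId`, `value`, and `data` (matching the three columns renamed above), and the sample values are made up for illustration rather than real report output.

```python
def flatten_rows(rows, dimension, metrics):
    """Turn report rows into flat records with one key per metric."""
    records = []
    for row in rows:
        record = {
            dimension + '_key': row['itemId'],
            dimension: row['value'],
        }
        # Pair each metric name with its value, in request order
        record.update(zip(metrics, row['data']))
        records.append(record)
    return records


# Made-up sample rows, shaped like the assumed response structure
sample_rows = [
    {'itemId': '1230501', 'value': 'Jun 1, 2023', 'data': [120.0, 340.0]},
    {'itemId': '1230502', 'value': 'Jun 2, 2023', 'data': [98.0, 275.0]},
]

records = flatten_rows(
    sample_rows,
    'variables/daterangeday',
    ['metrics/visits', 'metrics/pageviews'],
)
for record in records:
    print(record)
```

Flat records like these can be handed to `pd.DataFrame(records)` or written to a database table row by row.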

Conclusion

Python and the Adobe Analytics 2.0 APIs together provide an automated solution for extracting web analytics data. With OAuth Server-to-Server credentials and a few API requests, we can automate data retrieval, generate custom reports, store results in databases, and gain valuable insights.

This post has also been published on Medium.