
Demand and Price Forecasting Using Amazon DeepAR Algorithm

In consumer electronics, forecasting demand and selling price is crucial for predicting the inventory of each product across each store. In this blog post we discuss, through a specific use case, how the DeepAR algorithm can be used to solve such industry problems.

Use Case

Predicting mobile phone demand and selling price is difficult for retailers because the shelf life of a handset is only around 16 to 18 months. From the date of launch, the product price follows a declining trend due to market dynamics. Between the 12th and 18th month, retailers tend to clear remaining stock from their inventories to make space for new model releases, and between the 14th and 18th month they typically offer heavy discounts to clear off outdated inventory. If we can forecast which specific models will be sold, with the right timing and price, we can optimize purchasing, shipping, inventory and pricing costs.

From the use case we observed that mobile device prices follow a general declining trend after the day of launch. During the product life cycle they may also be influenced by other related time series: the sales quantity of each transaction, a big purchase order leading to a better negotiated price, and seasonality factors such as holidays and sales events, all of which can result in sharp price fluctuations. DeepAR is well suited to this problem because it can capture trend, seasonality, fluctuations, the cold start problem and the influence of other models' time series.

Now we will see how the DeepAR algorithm can be used to solve this kind of use case. For this post I am using simulated data of mobile phone sales across different retailers. Below is the sample data, where sku_id is a unique ID for each combination of model, storage and colour.

[Figure: sample of the simulated mobile phone sales data]

Amazon DeepAR Algorithm

DeepAR is a supervised learning algorithm for time series forecasting that uses recurrent neural networks (RNNs) to produce both point and probabilistic forecasts. Unlike classical techniques that train a separate model for each time series, DeepAR trains a single model over all the time series and learns the similarities between them.

Implementation

Launch a SageMaker notebook instance to load and preprocess the data and build the model.

Step 1 - Data Loading and Preprocessing

After cleaning, preprocessing and feature engineering, we need to prepare the data in the format required by the DeepAR algorithm. First, import all the required libraries.

%matplotlib inline
import sys
from dateutil.parser import parse
import json
from random import shuffle
import random
import datetime
import os
import boto3
import s3fs
import sagemaker
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from Lib import utils #local helper module providing write_dicts_to_file and copy_to_s3
#Create the connection to S3 to access the training and testing data
s3 = boto3.client('s3')
#get_execution_role gives the IAM role attached to the notebook instance
role = sagemaker.get_execution_role()
#Session and region are used later when creating the estimator
sagemaker_session = sagemaker.Session()
region = sagemaker_session.boto_region_name
#Initializing bucket properties
bucket = '' #provide the bucket name
prefix = '' #provide the folder name inside the bucket
input_data = "s3://{}/{}/data".format(bucket, prefix)
output_data = "s3://{}/{}/output".format(bucket, prefix)
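
The rest of the notebook works on a DataFrame df holding the simulated transactions. A minimal sketch to load it, assuming the data sits in a single CSV under the data prefix (the file name and exact column order here are illustrative):

#Minimal sketch (hypothetical file name): load the simulated sales data into df.
#s3fs lets pandas read s3:// paths directly.
df = pd.read_csv("{}/mobile_sales.csv".format(input_data))
#Expected columns per the sample data above:
#timestamp, sku_id, model, storage_vol, colour, customer, country, price, quantity
df.head()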

All string features need to be encoded as integers starting from 0. Hence the "sku_id", "model", "storage_vol", "colour", "customer" and "country" features should be transformed.

transform_json={
"model":{"model1":0,"model2":1,"model3":2,"model4":3,"model5":4,"model6":5},
"storage_vol":{"64 gb":3,"256 gb":1,"512 gb":2,"128 gb":0},
"colour":{"black":0,"blue":1,"coral":2,"gold":3,"green":4,"silver":5,"space grey":6},
"customer":{"retailers1":0,"retailers2":1,"retailers3":2,"retailers4":3},
"country":{"country1":0,"country2":1,"country3":2}
}

df_cat = df.copy() #work on a copy so the raw data stays unchanged

#Convert all string column values to lower case
str_cols = ['model', 'storage_vol', 'colour', 'customer', 'country']
for col in str_cols:
    df_cat[col] = df_cat[col].str.lower()

#Replacing the strings with their corresponding encoded values
df_cat.replace(transform_json, inplace=True)

#Create a mapping table between sku_id and the categorical data
dict_mapping = df_cat[['sku_id', 'model', 'storage_vol', 'colour']].drop_duplicates()
dict_mapping = dict_mapping.sort_values('sku_id')
dict_mapping.head()

The next step is to assign the features into three arrays, "target", "cat" and "dynamic_feat", in JSON format. You may notice the terminology differs from Amazon Forecast datasets; these are equivalent to the "target time series", "metadata" and "related time series", respectively. Below is a sample of the training data, where each line represents one time series, i.e. one SKU's historical record.

[Figure: sample training records in JSON Lines format, one time series per line]
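
For concreteness, an illustrative (made-up) record in that format would look like the line below; the five "cat" values are the encoded model, storage, colour, country and customer group, and missing target prices are written as the string "NaN":

{"start": "2018-09-20 00:00:00", "target": [1050.0, 1049.5, "NaN", 1020.0], "cat": [0, 3, 6, 1, 2], "dynamic_feat": [[12, 8, 0, 5]]}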

Preparing the target time series data in the above format

#Selecting the required columns
df1 = df_cat[['timestamp', 'sku_id', 'price', 'quantity','country', 'customer']]

# time series columns naming: sku_id + country + customer_group (Merging the columns as our predictions will be in this dimension)
df1['group_id'] = df1.sku_id.astype(str) + '_' + df1.country.astype(str) + '_' + df1.customer.astype(str)

#Converting all the timeseries in to respective columns with date index
columns = df1['group_id'].unique()
columns.sort()
df1['timestamp'] = pd.to_datetime(df1['timestamp'], errors='coerce',dayfirst=True)
df1 = df1.set_index('timestamp')
df_target = pd.DataFrame(columns = columns)
df_target['timestamp'] = pd.date_range(start='2018-09-20', end='2020-01-31', freq = 'D')
df_target = df_target.set_index('timestamp')
df_target = df_target.asfreq(freq='D') #in df_target each column is nothing but one combination time series

#Updating all the time series with target price
num_columns = len(columns)
for i in range(num_columns):
    columns_split = df_target.columns[i].replace('_', ' ').split(' ')
    temp = df1.loc[(df1['sku_id'] == int(columns_split[0])) 
               & (df1['country'] == int(columns_split[1])) 
               & (df1['customer'] == int(columns_split[2]))].resample('D').mean() #taking the mean price if we have multiple records for the same day
    df_target.iloc[:, i] = temp['price']

#Adding all the time series data to target_price list
target_price = []
num_ts = df_target.shape[1]
for i in range(num_ts):   
    target_price.append(np.trim_zeros(df_target.iloc[:,i], trim='f'))

Data transformation to create dynamic feature data

#Data transformation to create related time series data
df_dynamic = pd.DataFrame(columns = columns)
df_dynamic['timestamp'] = pd.date_range(start='2018-09-20', end='2020-01-31', freq = 'D')
df_dynamic = df_dynamic.set_index('timestamp')
df_dynamic = df_dynamic.asfreq(freq='D')

num_columns = len(columns)
for i in range(num_columns):
    columns_split = df_dynamic.columns[i].replace('_', ' ').split(' ')
    temp = df1.loc[(df1['sku_id'] == int(columns_split[0])) 
               & (df1['country'] == int(columns_split[1])) 
               & (df1['customer'] == int(columns_split[2]))].resample('D').sum() #taking the sum of quantity if we have multiple records for the same day
    df_dynamic.iloc[:, i] = temp['quantity']

df_dynamic = df_dynamic.fillna(0) #Filling all missing values with 0

dynamic_quantity = []
num_ts = df_dynamic.shape[1]

for i in range(num_ts):   
    dynamic_quantity.append(df_dynamic.iloc[:,i])   #appending all the related time series data to dynamic_quantity list

Missing data in "target" and "dynamic_feat" need to be handled explicitly. DeepAR allows missing values in "target", but they have to be encoded as the string "NaN" for the model to train properly. "dynamic_feat" cannot contain missing values, so those gaps need to be imputed manually based on business logic. With this, we have completed the data preparation, which is the heavy-lifting part.

Now split the prepared data into two datasets, training and testing, and upload them to an S3 bucket.

freq = '1D'
prediction_length = 30 #forecast days
context_length = 30

start_dataset = pd.Timestamp("2018-09-20", freq=freq)
end_training = pd.Timestamp("2020-01-31", freq=freq)
 
# cat structure : [model, storage, colour, country, customer_group]
# use dict_mapping, find the cat data for specific sku_id 
target_cat = {}

for i in range(len(target_price)):
    column_name = target_price[i].name.replace('_', ' ').split(' ')
    cat_name = dict_mapping.loc[dict_mapping['sku_id'] == int(column_name[0])][['model', 'storage_vol', 'colour']].to_numpy()
    target_cat[i] = []
    target_cat[i].append(int(cat_name[0][0]))
    target_cat[i].append(int(cat_name[0][1]))
    target_cat[i].append(int(cat_name[0][2]))
    target_cat[i].append(int(column_name[1]))
    target_cat[i].append(int(column_name[2]))

FREQ = 'D'
training_data = [
    {
        "start": str(start_dataset),
        "target": [i for i in ts[start_dataset:end_training - pd.Timedelta(1, unit=FREQ)].tolist()], # We use -1, because pandas indexing includes the upper bound
        "cat": target_cat[index],
        "dynamic_feat": [[j for j in dynamic_quantity[index][start_dataset:end_training - pd.Timedelta(1, unit=FREQ)].tolist()]]
    }
    for index, ts in enumerate(target_price)
]
print(len(training_data))

for i in range(len(training_data)):
    training_data[i]['target'] = [x if np.isfinite(x) else "NaN" for x in training_data[i]['target']]
 
#Creating test dataset with back test window as 4
FREQ = 'D'
num_test_windows = 4

test_data = [
    {
        "start": str(start_dataset),
        "target": [i for i in ts[start_dataset:end_training + pd.Timedelta(k * prediction_length, unit=FREQ)].tolist()],
        "cat": target_cat[index], 
        "dynamic_feat": [[j for j in dynamic_quantity[index][start_dataset:end_training + pd.Timedelta(k * prediction_length, unit=FREQ)].tolist()]]
    }
    for k in range(1, num_test_windows +1) 
    for index, ts in enumerate(target_price)
]
print(len(test_data))
 
for i in range(len(test_data)):
    test_data[i]['target'] = [x if np.isfinite(x) else "NaN" for x in test_data[i]['target']]

%%time
#Convert training and testing data to JSON Lines files
utils.write_dicts_to_file("train.json", training_data)
utils.write_dicts_to_file("test.json", test_data)
 
#Transfer the data to S3
utils.copy_to_s3("train.json", input_data + "/train/train.json")
utils.copy_to_s3("test.json", input_data + "/test/test.json")

Step 2 - Training the Model

We then use the class sagemaker.estimator.Estimator to initialize an estimator instance. In the estimator configuration you select the built-in model image, in this case the DeepAR Docker image, and you can set multiple compute instances to speed up the training process. GPU instances can also be used for training a DeepAR model. Different hyperparameters can be set to tune the model accuracy.

image_name = sagemaker.amazon.amazon_estimator.get_image_uri(region, "forecasting-deepar", "latest")

estimator = sagemaker.estimator.Estimator(
    sagemaker_session=sagemaker_session,
    image_name=image_name,
    role=role,
    train_instance_count=1,
    train_instance_type='ml.c4.2xlarge',
    base_job_name='deep-ar-testing-price-prediction',
    output_path=output_data
)
 
hyperparameters = {
    "time_freq": freq,
    "epochs": "400",
    "early_stopping_patience": "40", #stop the job if loss hasn't improved in 40 epochs
    "mini_batch_size": "64",
    "learning_rate": "5E-4",
    "likelihood" : "gaussian",
    "context_length": str(context_length),
    "prediction_length": str(prediction_length)
}

estimator.set_hyperparameters(**hyperparameters)

Then call the estimator.fit() method to start the training job. We selected a single ml.c4.2xlarge instance for training. The training log below shows the model accuracy metrics.

data_channels = {
    "train": "{}/train/".format(input_data),
    "test": "{}/test/".format(input_data)
}

estimator.fit(inputs=data_channels, wait=True)

[Figure: training job log showing the final accuracy metrics]

Step 3 - Model Deployment

After a model is trained, its artifacts are automatically saved to an S3 bucket. To retrieve real-time inference results, we need to set up and launch an endpoint that hosts the model. The code below creates the endpoint.

predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type='ml.m4.xlarge')

The endpoint can be integrated with web apps to get predictions. You can clone the end-to-end code from the GitHub URL below: https://github.com/Citrus-Consulting-Services/DNA-AWS-DeepAR-algorithm-implementation-python.git
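
The inference call itself is not shown here; below is a minimal sketch of how the endpoint could be queried, using the JSON request format documented for SageMaker DeepAR. The price history, categorical codes and quantiles are illustrative values only.

import json
import boto3

runtime = boto3.client("sagemaker-runtime")

history = [1050.0, 1049.5, 1020.0]         #toy price history for one sku/country/customer series
past_quantity = [12, 8, 5]                 #matching sales quantities
future_quantity = [0] * prediction_length  #dynamic_feat must also cover the forecast horizon

request = {
    "instances": [{
        "start": "2018-09-20 00:00:00",
        "target": history,
        "cat": [0, 3, 6, 1, 2],            #[model, storage, colour, country, customer]
        "dynamic_feat": [past_quantity + future_quantity]
    }],
    "configuration": {
        "num_samples": 100,
        "output_types": ["quantiles"],
        "quantiles": ["0.1", "0.5", "0.9"]
    }
}

response = runtime.invoke_endpoint(
    EndpointName=predictor.endpoint,       #endpoint name created by estimator.deploy()
    ContentType="application/json",
    Body=json.dumps(request)
)
forecast = json.loads(response["Body"].read())
print(forecast["predictions"][0]["quantiles"]["0.5"])  #median price forecast over the horizon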

DeepAR Advantages

  • DeepAR can learn the trends and seasonality of a group of similar time series, because it trains a single auto-regressive recurrent network model across all of them.
  • DeepAR produces probabilistic forecasts instead of point forecasts. The result is a probability distribution with prediction intervals, not just a single best realization, which enables optimal decision making under uncertainty in real-life scenarios.
  • DeepAR has cold start forecasting capability. A cold-start scenario occurs when you want to generate a forecast for a product that has just launched in the market and has little or no historical data.

Ashok is a Senior Consultant at Citrus Consulting Services based in Dubai, UAE. He is responsible for consulting, implementation and delivery of AI/ML projects across the Middle East and Africa. Ashok has executed multiple ML projects in the region, streamlining the end-to-end ML process from data collection to model deployment. His expertise spans AI/ML techniques such as time series forecasting, classification, regression, clustering, NLP, computer vision and transfer learning. He is a certified AWS Machine Learning Specialist with hands-on experience in all AWS AI/ML services as well as other cloud providers.
