Predicting Sales Forecast for FMCG Retail
The main objective of this blog post is to forecast the demand for all grocery items across all retail stores in the UAE using Amazon Forecast.
Most retail stores carry more than 10,000 items each. Creating a separate model for every item in every store is computationally expensive and time-consuming, which delays the predictions. Some items have no historical data, so we also need to solve the cold start problem.
Dataset
We created a simulated dataset that gives a sample overview of what the data looks like and which features we used.
In the above dataset we have different Product_ID and Store_id values. We need to predict the demand for each Product_ID in each Store_id. Our target variable is “Sold Quantity”.
Amazon Forecast addresses these problems: it creates a single model for all items across all stores, generates the predictions from it, and also takes care of the cold start problem.
Amazon Forecast
Amazon Forecast is a fully managed deep learning service for time series forecasting. We provide time series data to Amazon Forecast and it returns predictions for future periods. It works across multiple domains such as retail, finance, and healthcare. It also handles multiple time series at once, so we can forecast many products at the same time.
In this blog post we discuss how Amazon Forecast helps solve real-world problems.
How to Use Amazon Forecast
We can use Amazon Forecast in three different ways:
- AWS Console
- AWS CLI
- AWS Python SDK
Regardless of whether you use the Amazon Forecast console or the AWS Python SDK, you first need to prepare the input data.
Amazon Forecast expects three types of datasets.
Target time series dataset (Required)
It includes the field that you want to generate the forecast for. Below is the sample target time series dataset for our data.
Product_ID, Store_id : Two forecast dimensions
Sold Quantity: Target Variable
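As a rough illustration of this layout, the sketch below serializes a few made-up sales rows into the CSV shape that a target time series file typically takes. The records, the helper name to_target_csv, and the mapping of Product_ID, Store_id and Sold Quantity onto columns called item_id, store_id and demand are assumptions for illustration, not taken from our actual data. The rows are written without a header, so the column order has to match whatever schema is declared for the dataset.

```python
import csv
import io
from datetime import date

# Hypothetical daily sales records: (date, Product_ID, Store_id, Sold Quantity).
sales = [
    (date(2021, 1, 1), "P001", "S01", 12),
    (date(2021, 1, 1), "P002", "S01", 5),
    (date(2021, 1, 2), "P001", "S01", 9),
]


def to_target_csv(rows):
    """Serialize rows as timestamp,item_id,store_id,demand with no header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for day, product_id, store_id, quantity in rows:
        # Dates are written as yyyy-MM-dd; the quantity is stored as a float.
        writer.writerow([day.strftime("%Y-%m-%d"), product_id, store_id, float(quantity)])
    return buffer.getvalue()


target_csv = to_target_csv(sales)
print(target_csv)
```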
Item Metadata dataset (Optional)
It contains metadata for the items that are present in the target time series data. Below is the sample item metadata dataset for our data.
Related Time Series dataset (Optional)
It includes features that are related to the target variable. In our case, this is the price. Below is the sample related time series dataset.
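The three dataset types could be described to Amazon Forecast with schemas along the following lines. This is only a sketch: the column names (item_id, store_id, demand, price), the item metadata attributes (category, brand) and the helper column_names are assumed here for illustration. The check at the end reflects the idea that the related time series should carry the same timestamp, item and dimension columns as the target series.

```python
# Schema dictionaries in the {"Attributes": [...]} shape used by the Forecast API.
# Attribute order is assumed to match the column order of the CSV files.
TARGET_SCHEMA = {"Attributes": [
    {"AttributeName": "timestamp", "AttributeType": "timestamp"},
    {"AttributeName": "item_id", "AttributeType": "string"},
    {"AttributeName": "store_id", "AttributeType": "string"},
    {"AttributeName": "demand", "AttributeType": "float"},
]}

RELATED_SCHEMA = {"Attributes": [
    {"AttributeName": "timestamp", "AttributeType": "timestamp"},
    {"AttributeName": "item_id", "AttributeType": "string"},
    {"AttributeName": "store_id", "AttributeType": "string"},
    {"AttributeName": "price", "AttributeType": "float"},
]}

# Hypothetical metadata columns; the real ones depend on the item master data.
ITEM_METADATA_SCHEMA = {"Attributes": [
    {"AttributeName": "item_id", "AttributeType": "string"},
    {"AttributeName": "category", "AttributeType": "string"},
    {"AttributeName": "brand", "AttributeType": "string"},
]}


def column_names(schema):
    """Return the attribute names of a schema in declaration order."""
    return [attribute["AttributeName"] for attribute in schema["Attributes"]]


# Columns that identify one time series point in the target dataset.
key_columns = {"timestamp", "item_id", "store_id"}
missing_in_related = key_columns - set(column_names(RELATED_SCHEMA))
print("Key columns missing from related schema:", sorted(missing_in_related))
```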
Once the input data is prepared, upload the datasets (Target Time Series, Related Time Series, and Item Metadata) to an S3 bucket. Then create an IAM role that gives Amazon Forecast permission to access the data in the Amazon S3 bucket. Follow the link to create an IAM role for Amazon Forecast.
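For readers who prefer to script this step, a minimal sketch with boto3-style clients might look as follows. The function names, the role name, the bucket and the object keys are hypothetical, and attaching the AWS managed AmazonS3ReadOnlyAccess policy is one possible choice here, not necessarily the setup described in the linked guide. The clients are passed in as arguments so the functions can be exercised without touching AWS.

```python
import json

# Trust policy that lets the Amazon Forecast service assume the role.
FORECAST_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "forecast.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

# AWS managed policy granting read access to S3 (an assumed, fairly broad choice).
S3_READ_POLICY_ARN = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"


def upload_datasets(s3_client, bucket, files):
    """Upload {local_path: object_key} to the bucket and return the s3:// URIs."""
    uris = {}
    for local_path, key in files.items():
        s3_client.upload_file(local_path, bucket, key)
        uris[key] = f"s3://{bucket}/{key}"
    return uris


def create_forecast_role(iam_client, role_name):
    """Create a role that Amazon Forecast can assume and return its ARN."""
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(FORECAST_TRUST_POLICY),
    )
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=S3_READ_POLICY_ARN)
    return response["Role"]["Arn"]
```

With boto3 installed and AWS credentials configured, these could be called as upload_datasets(boto3.client("s3"), "my-bucket", {"target.csv": "fmcg/target.csv"}) and create_forecast_role(boto3.client("iam"), "ForecastS3AccessRole"), where the bucket, key and role name are placeholders.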
Using Console
We now have the input data in the format required by Amazon Forecast. The next steps are:
- Create a dataset group, then create a dataset of each type and upload the respective target, related, and metadata files.
- Once the datasets are uploaded, create the predictor by choosing the respective hyperparameters. After the predictor is created, create the forecast to get the predictions.
You can follow the steps in detail by following the link.
Using AWS Python SDK
Launch a SageMaker notebook instance and use it to call the Amazon Forecast service through the AWS Python SDK (boto3). To access Amazon Forecast from SageMaker, we need to grant Amazon Forecast access to the SageMaker role that we have already created.
The overall process of using Amazon Forecast is as follows:
- Create a dataset group. This isolates the models and the data they are trained on from those in other dataset groups.
- Create a dataset. Forecast has three types of dataset: Target Time Series, Related Time Series, and Item Metadata. The Target Time Series is required; the others are optional, depending on the algorithm.
- Import the data. This moves the information from S3 into a storage volume where it can be used for training and validation.
- Create a predictor from the dataset group. Here you can choose any algorithm from the list that Amazon Forecast provides, and you can either supply your own hyperparameters or use hyperparameter optimization (HPO) to pick the best-performing ones.
- Once the predictor is created, deploy it to generate the forecast.
- Query the forecast for each incoming request to get its predictions.
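The steps above can be sketched with the boto3 Forecast client roughly as follows. This is a simplified outline under several assumptions, not the code from the repository: it registers only the target time series, uses daily data, fixes the algorithm to DeepAR+ without HPO, and treats store_id as a forecast dimension. The function names, resource names and the 14-day horizon are illustrative. The clients are passed in as arguments.

```python
import time

# ARN of the built-in DeepAR+ algorithm.
DEEP_AR_PLUS_ARN = "arn:aws:forecast:::algorithm/Deep_AR_Plus"


def wait_until_active(describe, arn_key, arn, poll_seconds=60):
    """Poll a describe_* call until the resource is ACTIVE; raise if it failed."""
    while True:
        status = describe(**{arn_key: arn})["Status"]
        if status == "ACTIVE":
            return
        if status.endswith("FAILED"):
            raise RuntimeError(f"{arn} ended in status {status}")
        time.sleep(poll_seconds)


def train_and_forecast(forecast, name, schema, s3_path, role_arn, horizon=14):
    """Run the workflow for the target time series and return the forecast ARN."""
    # 1. Dataset group and 2. dataset.
    group_arn = forecast.create_dataset_group(
        DatasetGroupName=f"{name}_group", Domain="RETAIL")["DatasetGroupArn"]
    dataset_arn = forecast.create_dataset(
        DatasetName=f"{name}_target", Domain="RETAIL",
        DatasetType="TARGET_TIME_SERIES", DataFrequency="D",
        Schema=schema)["DatasetArn"]
    forecast.update_dataset_group(DatasetGroupArn=group_arn, DatasetArns=[dataset_arn])

    # 3. Import the CSV from S3 using the IAM role created earlier.
    import_arn = forecast.create_dataset_import_job(
        DatasetImportJobName=f"{name}_import", DatasetArn=dataset_arn,
        DataSource={"S3Config": {"Path": s3_path, "RoleArn": role_arn}},
        TimestampFormat="yyyy-MM-dd")["DatasetImportJobArn"]
    wait_until_active(forecast.describe_dataset_import_job, "DatasetImportJobArn", import_arn)

    # 4. Train the predictor on the dataset group.
    predictor_arn = forecast.create_predictor(
        PredictorName=f"{name}_predictor", AlgorithmArn=DEEP_AR_PLUS_ARN,
        ForecastHorizon=horizon, PerformAutoML=False, PerformHPO=False,
        InputDataConfig={"DatasetGroupArn": group_arn},
        FeaturizationConfig={"ForecastFrequency": "D",
                             "ForecastDimensions": ["store_id"]})["PredictorArn"]
    wait_until_active(forecast.describe_predictor, "PredictorArn", predictor_arn)

    # 5. Generate the forecast from the trained predictor.
    forecast_arn = forecast.create_forecast(
        ForecastName=f"{name}_forecast", PredictorArn=predictor_arn)["ForecastArn"]
    wait_until_active(forecast.describe_forecast, "ForecastArn", forecast_arn)
    return forecast_arn


def query_item(forecast_query, forecast_arn, item_id, store_id):
    """6. Query the forecast for one product in one store."""
    response = forecast_query.query_forecast(
        ForecastArn=forecast_arn,
        Filters={"item_id": item_id, "store_id": store_id})
    return response["Forecast"]["Predictions"]
```

In a notebook, the two clients would typically come from boto3.client("forecast") and boto3.client("forecastquery"). Training and forecast creation can take a long time, which is why each step polls for the ACTIVE status before moving on.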
The code is uploaded to GitHub. You can clone the code from the GitHub URL.
Advantages of Amazon Forecast
- Amazon Forecast can learn from your data and automatically choose the best algorithm for you.
- It provides various built-in methods for missing value imputation.
- The DeepAR+ algorithm, which is based on training an autoregressive recurrent network model, can learn the trends and seasonality of a group of similar time series.
- It gives both probabilistic and point forecasts.
- It addresses the cold start problem.
- End-to-end management: it automates the entire forecasting workflow, from data upload to data processing, model training, dataset updates, and forecasting.
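To make the imputation and probabilistic forecast points more concrete, here is a hedged sketch of how they might surface in the SDK. The featurization settings below fill gaps in an assumed target column named demand, and FORECAST_TYPES lists the quantiles that could be requested when creating a forecast. The column name, the chosen fill values, the quantiles and the helper summarize are illustrative assumptions, and the sample predictions are made up to show the shape of a query result.

```python
# Could be passed as FeaturizationConfig when creating a predictor.
# Leading gaps are left unfilled; gaps inside and after the series count as zero sales.
FEATURIZATION_CONFIG = {
    "ForecastFrequency": "D",
    "ForecastDimensions": ["store_id"],
    "Featurizations": [{
        "AttributeName": "demand",
        "FeaturizationPipeline": [{
            "FeaturizationMethodName": "filling",
            "FeaturizationMethodParameters": {
                "frontfill": "none",
                "middlefill": "zero",
                "backfill": "zero",
            },
        }],
    }],
}

# Could be passed as ForecastTypes when creating a forecast:
# three quantiles for the probabilistic view plus the mean as a point forecast.
FORECAST_TYPES = ["0.10", "0.50", "0.90", "mean"]


def summarize(predictions):
    """Turn query predictions into (timestamp, p10, p50, p90) rows."""
    rows = []
    for low, mid, high in zip(predictions["p10"], predictions["p50"], predictions["p90"]):
        rows.append((mid["Timestamp"], low["Value"], mid["Value"], high["Value"]))
    return rows


# Made-up predictions in the shape returned by a forecast query.
sample_predictions = {
    "p10": [{"Timestamp": "2021-02-01T00:00:00", "Value": 4.0}],
    "p50": [{"Timestamp": "2021-02-01T00:00:00", "Value": 7.5}],
    "p90": [{"Timestamp": "2021-02-01T00:00:00", "Value": 12.0}],
}
print(summarize(sample_predictions))
```

Reading the p10 and p90 values together gives a range for planning stock levels, while p50 or the mean can serve as the single point estimate.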