Sales Forecast Project Using Google's BigQuery and Python

This repository contains research on one-year sales forecasting using Google's BigQuery and Data Studio. SQL, Python and some JS were used to complete the given tasks.

The eu_sales folder contains SQL queries for a dataset of clothing sales in Europe and Asia. The task was to build a model that forecasts sales with high accuracy, using the ARIMA model from BigQuery. All model-creation queries can be found in eu_sales/models_fitting. The other queries either predict sales for a given period or explore dependencies in the dataset that can help improve the forecast score. There is code for a general sales model, models for sales in different cities and for online sales, a model for a specific type of goods, and a model trained without outliers. Forecasts are produced either daily or weekly; weekly forecasts aggregate the daily predictions by week using different mean functions.
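As a rough illustration of what a model-fitting query and a forecast query can look like, here is a minimal Python sketch that only builds the SQL text. The model, table and column names are hypothetical, and the sketch assumes the current BigQuery ML model type name ARIMA_PLUS; the queries in eu_sales/models_fitting may use different options.

```python
def build_arima_queries(model_name, source_table, timestamp_col, value_col, horizon):
    """Return (create_model_sql, forecast_sql) as plain strings.

    All identifiers passed in are hypothetical placeholders; nothing is
    sent to BigQuery here.
    """
    # Fit a univariate time series model on (timestamp, value) pairs.
    create_sql = (
        f"CREATE OR REPLACE MODEL `{model_name}`\n"
        "OPTIONS(\n"
        "  model_type = 'ARIMA_PLUS',\n"
        f"  time_series_timestamp_col = '{timestamp_col}',\n"
        f"  time_series_data_col = '{value_col}'\n"
        ") AS\n"
        f"SELECT {timestamp_col}, {value_col}\n"
        f"FROM `{source_table}`"
    )
    # Forecast `horizon` future points with an 80% prediction interval.
    forecast_sql = (
        "SELECT forecast_timestamp, forecast_value\n"
        f"FROM ML.FORECAST(MODEL `{model_name}`,\n"
        f"                 STRUCT({horizon} AS horizon, 0.8 AS confidence_level))"
    )
    return create_sql, forecast_sql


create_sql, forecast_sql = build_arima_queries(
    model_name="my_dataset.sales_model",      # hypothetical
    source_table="my_dataset.daily_sales",    # hypothetical
    timestamp_col="order_date",               # hypothetical
    value_col="total_sales",                  # hypothetical
    horizon=30,
)
print(create_sql)
print(forecast_sql)
```

The resulting strings can be pasted into the BigQuery console or passed to a client library.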

The us_sales folder contains SQL queries for a dataset of gift sales in the USA. The task is similar to the previous one: build a forecast model with ARIMA. All model-creation queries are in us_sales/models_fitting. The other queries are used to predict sales and to get a better understanding of the data used for this task. There are four models: a general model trained on data from 2016 to 2018, a general model trained strictly on 2018, a model for forecasting the sales of the most efficient associate, and a model for predicting earrings sales. Forecasts can be produced daily or weekly; weekly forecasts aggregate the daily predictions by week using different mean functions.
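The weekly aggregation described above can be sketched in Python as follows. This is only an illustration: grouping by ISO week and the function name aggregate_weekly are assumptions, and the repository does this step in SQL.

```python
import statistics
from collections import defaultdict
from datetime import date, timedelta


def aggregate_weekly(daily_forecast, mean_fn=statistics.mean):
    """Group (date, value) pairs by ISO (year, week) and apply mean_fn per week."""
    buckets = defaultdict(list)
    for day, value in daily_forecast:
        iso = day.isocalendar()
        buckets[(iso[0], iso[1])].append(value)
    return {week: mean_fn(values) for week, values in sorted(buckets.items())}


# Two weeks of made-up daily predictions starting on Monday 2018-01-01.
start = date(2018, 1, 1)
daily = [(start + timedelta(days=i), 100.0 if i < 7 else 200.0) for i in range(14)]

print(aggregate_weekly(daily))                             # arithmetic mean
print(aggregate_weekly(daily, statistics.harmonic_mean))   # harmonic mean
```

Passing the mean as a function makes it easy to compare how the choice of mean changes the weekly figure.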

The custom_functions folder contains JS user-defined functions (UDFs) for BigQuery that compute harmonic and quadratic means.
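For reference, the two means can be written in a few lines. The sketch below is in Python rather than JS and is not the repository's UDF code; the function names are chosen here for illustration.

```python
import math


def harmonic_mean(values):
    """n divided by the sum of reciprocals; defined here for positive values only."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs a non-empty list of positive values")
    return len(values) / sum(1.0 / v for v in values)


def quadratic_mean(values):
    """Root mean square: square root of the average of squared values."""
    if not values:
        raise ValueError("quadratic mean needs a non-empty list")
    return math.sqrt(sum(v * v for v in values) / len(values))


sample = [1.0, 2.0, 4.0]
print(harmonic_mean(sample), quadratic_mean(sample))
```

For positive values the harmonic mean is never larger than the arithmetic mean and the quadratic mean is never smaller, so the choice of mean shifts the weekly aggregate down or up.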

Model_info provides functions for retrieving ARIMA coefficients and features. shannon_entropy.sql is a SQL script that computes the Shannon entropy of the data. It was used to assess how difficult the data is to forecast.
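A minimal Python sketch of a Shannon entropy calculation is shown below. It assumes the values are first discretized into equal-width bins; how shannon_entropy.sql discretizes the data is not stated here, so the binning and the bins parameter are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(values, bins=10):
    """Entropy in bits of the equal-width histogram of `values`."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return 0.0  # a constant series carries no uncertainty
    width = (hi - lo) / bins
    # Map each value to a bin index; the maximum falls into the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(shannon_entropy([0, 1, 2, 3], bins=4))
```

Higher entropy means the values are spread more evenly across bins, which generally makes a series harder to predict.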

The notebooks folder contains Jupyter Notebooks with additional research on time series stationarity. You can find code for computing Sample Entropy there. The ADF test was also run in these notebooks and indicated that the given data is stationary. A test for seasonality was performed there too. It showed that the data follows a similar pattern every half year.
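Sample Entropy can be sketched in plain Python as below. This is not the notebook code: the defaults m=2 and r=0.2 (as a fraction of the standard deviation) are common conventions assumed here, and the quadratic-time loop is only suitable for short series.

```python
import math


def sample_entropy(series, m=2, r=0.2):
    """SampEn = -ln(A / B), where B counts pairs of length-m templates within
    tolerance and A counts pairs of length-(m + 1) templates within tolerance."""
    n = len(series)
    if n <= m + 1:
        raise ValueError("series is too short for the chosen m")
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    tol = r * std

    def count_matches(length):
        # Use n - m templates for both lengths so the counts are comparable.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance between the two templates.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= tol:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # no matches: entropy is undefined, reported as infinite
    return -math.log(a / b)


# A perfectly periodic series is fully predictable, so its Sample Entropy is zero.
print(sample_entropy([1.0, 2.0] * 20))
```

Values near zero indicate a regular, repetitive series; larger values indicate less regularity.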

Reports visualize the results of queries from this repository and of the research performed on the data. They include model comparisons, calculations of time series statistics, and MAPE scores of the created models.
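MAPE (mean absolute percentage error) is the average of |actual - forecast| / |actual|, expressed as a percentage. A minimal Python sketch with made-up numbers, not taken from the reports:

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    if not actual or len(actual) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Illustrative values only.
print(mape([100.0, 200.0], [110.0, 180.0]))
```

Because each error is divided by the actual value, days with very small sales can dominate the score.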

