Data_Engineering_Certificate

Project 8: ETL on Monthly Retail Trade Survey data

Abstract

This report walks through the ETL (Extract, Transform, Load) process: extracting data from various sources, transforming it to fit the target system, and loading the transformed data into that system. It also discusses the benefits of ETL for businesses, such as improved data management, better decision-making, and more effective collaboration between departments. The report then focuses on the Monthly Retail Trade Survey (MRTS), a widely used dataset that collects monthly sales and inventory data from US retail businesses. The data exploration and preparation steps, carried out with Excel macros, Python libraries, and MySQL, are described in detail. Next, the report explains the installation script used to automate the process and the creation of a database for the MRTS data. Finally, it analyzes several retail categories using different value representations, such as percentage change and rolling time windows, and concludes with a graph analysis.
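As a rough illustration of the two value representations named above, the sketch below computes a month-over-month percentage change and a trailing rolling mean in plain Python. The function names `pct_change` and `rolling_mean` and the sales figures are assumptions made up for this example; they are not taken from the project or from the MRTS data. The project's notebook may well use a library instead (pandas, for instance, provides `Series.pct_change` and `Series.rolling(...).mean()`).

```python
def pct_change(values):
    """Percentage change of each value relative to the previous one."""
    changes = [None]  # the first period has no predecessor
    for prev, curr in zip(values, values[1:]):
        changes.append((curr - prev) / prev * 100)
    return changes


def rolling_mean(values, window):
    """Mean over a trailing window; None until the window is full."""
    means = []
    for i in range(len(values)):
        if i + 1 < window:
            means.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            means.append(sum(chunk) / window)
    return means


# Made-up monthly sales figures, for illustration only (not MRTS data).
sales = [100.0, 110.0, 99.0, 120.0]

print(pct_change(sales))
print(rolling_mean(sales, window=2))
```

Percentage change highlights short-term movement between consecutive months, while a rolling window smooths out month-to-month noise so that longer trends are easier to see.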

Go to the project

Jupyter Notebook

Raw Data (Excel file)

Full repository


🔙 Back to portfolio