Exploratory Data Analysis (EDA) and cohort analysis using a dataset from Google BigQuery.
By: Muhammad Rafie Darmawan.
The data used is TheLook, a fictitious e-commerce clothing dataset developed by the Looker team. It contains information about customers, products, orders, logistics, and web events.
To boost profit and to better understand the users and their activity, we need to answer the following questions:
- From January 2019 through August 2022, how many unique customers placed orders? How many orders did they place, and what is the potential sales value? Break the results down by order status and month.
- Over the same period, how many completed orders did customers place? How many unique customers placed a completed order? What is the average order value? Break the results down by month.
- During August 2022, many customers refunded their orders. Which customers requested a refund that month? List them by first name, last name, email, and user_id.
- Find the 5 most profitable and the 5 least profitable products of all time. What kinds of products generate the highest and the lowest revenue over time?
- Assuming the analytical request arrived on 15 August 2022, what is the month-to-date (MTD) total profit for each product category? Break the results down by month and category.
- What is the monthly growth of each product? Break the results down by month and product, sorted in descending order.
- What does monthly customer retention look like across 2022? Use cohort analysis to track customer retention in the following months of 2022.
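
To make the cohort question concrete, here is a minimal sketch of how monthly retention can be computed. It is only an illustration in plain Python, not the queries used in this analysis: the sample rows, the `month_index` helper, and the `cohort_retention` function are hypothetical, and the data is made up rather than taken from TheLook. Each customer is assigned to the cohort of their first order month, and retention is the share of that cohort that orders again N months later.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows (user_id, order_date); not real TheLook data.
orders = [
    (1, date(2022, 1, 5)),
    (1, date(2022, 2, 10)),
    (1, date(2022, 3, 2)),
    (2, date(2022, 1, 20)),
    (2, date(2022, 3, 15)),
    (3, date(2022, 2, 1)),
    (3, date(2022, 3, 9)),
]


def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month - 1


def cohort_retention(rows):
    """Return {cohort_month: {months_since_first_order: retention_rate}}."""
    # The month of a user's first order defines that user's cohort.
    first_month = {}
    for user_id, order_date in rows:
        m = month_index(order_date)
        if user_id not in first_month or m < first_month[user_id]:
            first_month[user_id] = m

    # Collect the distinct users active at each month offset within a cohort.
    active = defaultdict(lambda: defaultdict(set))
    for user_id, order_date in rows:
        cohort = first_month[user_id]
        offset = month_index(order_date) - cohort
        active[cohort][offset].add(user_id)

    # Divide each offset's active users by the cohort size (offset 0).
    result = {}
    for cohort, offsets in active.items():
        cohort_size = len(offsets[0])
        label = f"{cohort // 12}-{cohort % 12 + 1:02d}"
        result[label] = {
            offset: len(users) / cohort_size
            for offset, users in sorted(offsets.items())
        }
    return result


retention = cohort_retention(orders)
for cohort, rates in sorted(retention.items()):
    print(cohort, rates)
```

In this made-up sample, two customers first order in January 2022. One of them orders again in February and both order in March, so the January cohort's retention is 50% after one month and 100% after two.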

