Project Name: 

  • Superstore Analysis

Tools used:

  • Tableau and R.

Project Duration:

  • 3 Months

Table of Contents:

1) Introduction
  • Data Source
2) Exploratory Analysis
  • State Vs Profit
  • Categories and Sales
  • Sum of Sales in Different States
  • Shipment Date vs Category
3) Visualization
  • Subcategory and Profit
  • Sales vs Profit
  • Sales vs Quantity
  • Profit vs Discount
  • Sales vs Category
  • Profit vs Category
4) Analysis and Discussions

Project Overview

In this data visualization project, we focus on analyzing the sales data of Super Store, a renowned retail outlet in the United States. The dataset includes comprehensive information about sales, profit, and regional statistics for a diverse selection of over 1,000 products. Our objective is to uncover significant insights from the data through visualization and analysis, using the tools Tableau and RStudio for efficient data processing.

By employing various visualization techniques such as line charts, bar graphs, and interactive dashboards, we aim to reveal sales trends, product performance, and opportunities for Super Store to grow in a competitive market. Our primary goals include identifying profitable product categories, assessing the impact of discounts on profitability, and understanding sales patterns across different states and product segments. Through this data-driven approach, we intend to provide stakeholders with actionable insights that will optimize sales strategies, enhance decision-making, and maximize profits for Super Store.

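Goal: To find various supermarket statistics such as the most profitable product categories, the impact of discounts on profitability, and sales patterns across states and product segments.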

Data Source:

The research is based on Kaggle's Superstore dataset, which records fields such as profit and loss, order ID, shipment mode, delivery status, online transactions, and store descriptions. The dataset consists of 9,994 rows and 21 variables, covering 24 states and cities. To visualize the data effectively, we used a range of techniques, including maps, bar charts, line plots, scatterplots, circular plots, violin plots, and mosaic plots. After thorough cleaning, we focused on sales, profit, segment, shipment mode, product descriptions, and related fields for different US cities to derive insights.

Exploratory Analysis

Before beginning the exploratory analysis, we cleaned the data to ensure the dataset's integrity, addressing missing values and eliminating duplicate records.
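The cleaning step can be sketched in base R as follows. This is a minimal illustration, not the exact procedure we ran: the toy data frame, its values, and the names `deduped` and `clean_data` are hypothetical, and the column names assume the dotted style (`Order.ID`) that `read.csv` produces.

```r
# Hypothetical miniature of the Superstore table; the real data has 21 columns.
store_data <- data.frame(
  Order.ID = c("A-1", "A-2", "A-2", "A-3", "A-4"),
  Sales    = c(120.5, 89.0, 89.0, NA, 42.3),
  Profit   = c(30.1, -5.2, -5.2, 12.0, 8.8),
  stringsAsFactors = FALSE
)

# Drop exact duplicate rows, then drop rows with any missing value.
deduped <- store_data[!duplicated(store_data), ]
clean_data <- deduped[complete.cases(deduped), ]

print(nrow(store_data))  # rows before cleaning
print(nrow(clean_data))  # rows after cleaning
```

Dropping incomplete rows is only one option; imputing missing values may be preferable when many rows are affected.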

The initial exploratory visualizations focused on understanding the interactions between various variables. We started by examining the relationship between states and profit. The bar graph displayed multiple states in the United States, each represented with varying profit rates, providing an overview of the state-wise profit distribution.

State Vs Profit

The exploratory bar graph reveals that Indiana has the highest profit, while Ohio has the lowest profit among the states in the United States.
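A ranking like this can be cross-checked outside Tableau by summing profit per state. The sketch below uses base R on made-up rows with placeholder state names; only the column names `State` and `Profit` are assumed to match the dataset.

```r
# Toy rows with made-up values and placeholder state names.
store_data <- data.frame(
  State  = c("Alpha", "Beta", "Alpha", "Gamma", "Beta", "Gamma"),
  Profit = c(50, -20, 30, 10, -15, 5)
)

# Sum profit within each state, then order from most to least profitable.
state_profit <- aggregate(Profit ~ State, data = store_data, FUN = sum)
state_profit <- state_profit[order(-state_profit$Profit), ]

print(state_profit)
```

The first and last rows of `state_profit` correspond to the tallest and shortest bars in a state-wise profit chart.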

Categories and Sales

The Tree Map visualization displays the relationship between sales and product sub-categories, enabling us to assess demand and purchase frequency per product. "Phones" emerge as the top-selling sub-category, while "Fasteners" have the lowest sales.

Sum of Sales in Different States

The Cartogram visualization provides an intuitive representation of aggregate sales in different states, highlighting sales disparities between regions. The geographic map groups states together, displaying the total sales in each state over time. A graduated color scale, running from peach for low sales to red for high sales, portrays the distribution. While color is not the best attribute for direct comparison, it offers a sense of scale and aids in understanding sales patterns. Annotations are also available to help the audience locate specific sales figures.

Shipment Date vs Category

The line chart illustrates the relationship between shipment dates and product categories, with data points connected to depict the rate of change from the previous year. Office supplies record the highest sales in December (Q4) and decline in January (Q1). Similarly, furniture and technology sales peak in December and decrease in January. This trend may be attributed to increased discounts and promotions available towards the end of the year, contributing to higher sales during that period. The line chart's connection of data points facilitates the visualization of seasonal sales patterns across different product categories.
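The December-to-January comparison behind the line chart can be tabulated directly. The following base R sketch uses invented rows and assumes the shipment date is already parsed into a `Date` column named `Ship.Date`; in the raw file it would first need converting with `as.Date()` and the appropriate format string.

```r
# Invented rows; Ship.Date, Category and Sales mirror the dataset's column names.
store_data <- data.frame(
  Ship.Date = as.Date(c("2017-11-03", "2017-12-10", "2017-12-21",
                        "2018-01-05", "2017-12-15", "2018-01-20")),
  Category  = c("Furniture", "Furniture", "Technology",
                "Technology", "Office Supplies", "Office Supplies"),
  Sales     = c(200, 450, 700, 150, 300, 90)
)

# Label each shipment by year-month so December and January can be compared.
store_data$Month <- format(store_data$Ship.Date, "%Y-%m")

# Rows are categories, columns are months, cells are total sales.
monthly_sales <- xtabs(Sales ~ Category + Month, data = store_data)
print(monthly_sales)
```

Reading across a row of `monthly_sales` gives the same series that one line of the chart traces for a category.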

Visualization

The dataset covers numerous states, as depicted in our previous visualizations. Our exploration involved examining various factors, including sales, profit, discount, and shipment methods. After analyzing sales by state, we expanded our investigation to explore sales across other categories. Each team member pursued different variables through exploratory analysis, providing diverse perspectives before regrouping to synthesize our findings. This approach allowed us to delve deeper into different aspects of the data and made the visualization process more comprehensive.

Subcategory and Profit

This Tableau-generated visualization enables comprehensive comparisons between sales and profit across various segment groups. The line graph showcases quarterly sales differences, while the bar chart illustrates profit trends in furniture, office supplies, and technology subcategory groups over time. Additionally, the grouped bar chart displays combined sales rates of these subcategories based on profit levels, using a color scheme to encode ordered groups and show their interconnectedness by variable. The visual allows viewers to delve into sales and profit dynamics, making insightful and in-depth analyses across the subcategory groups.

Sales vs Profit

The visualization compares sales and profit, focusing on the impact of the shipment mode. Using ggplot in RStudio, the scatterplot shows one data point per record, colored by shipment mode, revealing the relationship between sales and profit. Standard Class accounts for more of the profits and more of the losses than the other modes, though its range of profit is not significantly higher. This view of how sales and profit relate under each shipment mode can support decision-making and strategy for the Superstore.

library(ggplot2)  # provides ggplot(), aes() and the geom_* layers used below
ggplot(data = store_data, aes(x = Sales, y = Profit, color = Ship.Mode)) + geom_point()

Sales vs Quantity

The visualization showcases the impact of the shipment mode on sales concerning the quantity of products sold. Using RStudio's ggplot, the bar graph displays the quantity of products on the x-axis and their corresponding sales on the y-axis, with each bar color-coded based on the shipment mode. Notably, the standard class of shipment mode has driven the highest sales among all modes.

ggplot(data = store_data, aes(x = Quantity, y = Sales, fill = Ship.Mode)) + geom_bar(stat = "identity")

Profit vs Discount

The visualization reveals a clear correlation between discounts and profitability for different product segments. As discounts increase, profitability decreases across the segments. While products with no discounts exhibit a wide range of profits, a higher range of discounts is associated with more losses and less profit.

ggplot() + geom_bar(data = store_data, aes(x = Discount, y = Profit, fill = Ship.Mode), stat = "identity")
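The discount-profit relationship can also be checked numerically rather than read off the bars. The base R sketch below uses invented figures; `profit_by_discount` and `discount_profit_cor` are illustrative names, and a Pearson correlation is just one simple way to summarize the trend.

```r
# Invented figures; Discount and Profit mirror the dataset's column names.
store_data <- data.frame(
  Discount = c(0, 0, 0.2, 0.2, 0.5, 0.5, 0.8),
  Profit   = c(40, 25, 10, 5, -15, -30, -60)
)

# Average profit at each discount level.
profit_by_discount <- aggregate(Profit ~ Discount, data = store_data, FUN = mean)
print(profit_by_discount)

# Pearson correlation; a negative value means profit falls as discount rises.
discount_profit_cor <- cor(store_data$Discount, store_data$Profit)
print(discount_profit_cor)
```

On the real data, a clearly negative correlation would support the pattern described above, while a value near zero would suggest the effect is concentrated in a few heavily discounted orders.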

Sales vs Category

The visualization highlights the sales distribution across different product categories. Technology emerges as the top-selling category, followed by Furniture and Office Supplies. Notably, the West and East regions account for the majority of sales.

ggplot() + geom_bar(data = store_data, aes(x = Category, y = Sales, fill = Region), stat = "identity")

Profit vs Category

The visualization indicates that the Furniture category experiences more losses compared to the Technology and Office Supplies categories. Additionally, the profitability in the Furniture category varies across a wide range, mirroring the sales pattern from low to high.

ggplot() + geom_bar(data = store_data, aes(x = Category, y = Profit, fill = Region), stat = "identity")

Analysis and Discussions:

  • Sales to Profit Ratio: The graphs reveal a consistent Sales to Profit ratio across all categories, irrespective of their grouping.
  • Same-day Shipping and Discounts: Offering further discounts for same-day shipping can potentially boost sales and profits. Discounts should be based on sales volume to avoid significant losses due to needless discounts with low sales.
  • Focus on Binders and Machines: The Binders and Machines sub-categories are struggling and require additional attention.
  • Central Region's Office Supplies and Furniture: The Office Supplies and Furniture sectors in the Central Region show room for improvement.
  • Choropleth Map - Top Sales States: The Choropleth map highlights California, New York, Texas, and Washington as states with the highest number of sales, enabling targeted advertising for improved profits.
  • Mosaic Plot - Category Sales per Region: The Mosaic plot conveys the same information as the Choropleth map but presents it differently, aiding decision-making.
  • Bar Chart - Popular Shipping Method: The bar chart indicates that customers prioritize affordability over quick delivery, with Standard Class shipping being the most commonly used method.
  • Relationship between Discounts and Sales: Higher discounts correspond to increased sales, suggesting that discounted products can lead to profitable outcomes.
  • Frequency of Quantity Bought: Understanding customer buying patterns helps optimize warehouse stock and revenue vs. cost, as customers tend to buy in quantities of three or five.
  • Tree Map - Subcategory Sales: The Tree map illustrates sales distribution across subcategories, assisting companies in preparing for future sales and product offerings.
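
Two of the points above, the profit-to-sales ratio per category and the frequency of quantities bought, reduce to short tabulations. The base R sketch below runs on invented rows; `totals`, `Profit.Ratio`, and `quantity_freq` are illustrative names, not objects from our project files.

```r
# Invented rows; column names mirror the dataset's.
store_data <- data.frame(
  Category = c("Furniture", "Furniture", "Technology",
               "Technology", "Office Supplies"),
  Sales    = c(100, 300, 500, 500, 200),
  Profit   = c(10, 30, 150, 50, 40),
  Quantity = c(3, 5, 3, 2, 3)
)

# Profit as a share of sales within each category.
totals <- aggregate(cbind(Sales, Profit) ~ Category, data = store_data, FUN = sum)
totals$Profit.Ratio <- totals$Profit / totals$Sales
print(totals)

# How often each order quantity occurs, most frequent first.
quantity_freq <- sort(table(store_data$Quantity), decreasing = TRUE)
print(quantity_freq)
```

Comparing `Profit.Ratio` across rows shows whether the ratio really is consistent between categories, and the head of `quantity_freq` shows which order sizes dominate.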

My Team:

  • Gagan Doddanna
  • Kathie Matta
  • Megha Kaavali Mahadevappa
  • Krishna Asoka Kumar Sajitha