Session 5 - Data Visualization

Introduction to Data Analytics for Beginners

Welcome to one of the most important parts of your analytics journey: Data Visualization. Think of it like this – numbers contain the story, but visuals help reveal and communicate that story. Without visualization, your insights can remain hidden in rows and columns. But with the right visuals, patterns come to life, decisions get easier, and your message becomes clear. There are two key reasons why data visualization should be part of every analyst’s toolkit:

  1. To Discover Patterns:
    Visuals help you instantly see trends, spikes, dips, and outliers that would be difficult to catch in tables.

  2. To Communicate Across Teams:
    Your stakeholders—from marketing to finance—may not be data experts. Charts let you speak a common visual language that influences and inspires action.

We will continue using the Amazon Sales Dataset as our example. You should have downloaded it and imported it into Google Sheets in the first session; if you have not, follow the link above to download it.

 

Data Visualization

First Things First – Choosing the Right Chart

Just because a chart looks fancy doesn’t mean it’s effective. Use visuals to clarify, not complicate.

Chart Type | Best For | Avoid When…
Line Chart | Time series trends | You’re not working with chronological data
Column/Bar Chart | Comparing values across categories | Too many categories crowd the chart
Histogram | Viewing distribution (e.g., rating ranges) | Categories aren’t numeric or continuous
Scatter Plot | Checking correlation between 2 numeric fields | Data is categorical or unrelated
Pie Chart | Showing part-to-whole ratios (rarely useful) | Values are too close together
Geo Chart | Mapping values by location | No geo data is available

 

Step-by-Step: Visualizing Product Ratings

1. Start With a Business Question

Before jumping into any chart, always ground yourself in the business context. In our case, we’re trying to answer this question:

Which product categories receive the highest and lowest customer ratings?

This isn’t just a visual exercise. You’re looking to guide decisions—maybe about inventory prioritization, quality improvement, or marketing focus.

 

2. Examine Your Raw Data

Your dataset contains a Category column, but here’s the catch:

Each value looks like this:

 
Home & Kitchen | Appliances | Electric Grinders

That’s not a clean category – it’s a hierarchy: main category → subcategory → specific item.

If you try to visualize using this raw column, you’ll end up with hundreds of ultra-specific combinations. The result?

  • Charts become overcrowded

  • Patterns become hard to read

  • Stakeholders get confused

 

3. Clean It Up: Extract the Main Category

To bring clarity, we simplify. The insight we want is at the main category level.

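We’ll use the SPLIT() function in Google Sheets to separate the values at the pipe symbol (|) and keep only the first part, i.e., the top-level category such as Home & Kitchen, Electronics, or Toys & Games. (Excel has no SPLIT() function; recent versions offer TEXTSPLIT() and TEXTBEFORE() for the same job.)

Formula:

=TRIM(INDEX(SPLIT(A2, "|"), 1, 1))

SPLIT() on its own spreads every part across adjacent columns. INDEX(…, 1, 1) keeps only the first part, and TRIM() removes any space left next to the pipe.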

Why this matters:

  • It gives us consistent, top-level categories

  • We reduce complexity from 200+ specific groups → 8–10 meaningful chunks

  • Visualizations become cleaner, clearer, more actionable

Best Practice: When visualizing, zoom out before you zoom in. Start broad to reveal patterns. If needed, drill down later.
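
If you ever need the same cleanup outside a spreadsheet, the extraction step can be sketched in a few lines of Python. This is only an illustration: the function name main_category is our own, and it assumes the pipe-separated format shown earlier.

```python
def main_category(raw: str, sep: str = "|") -> str:
    """Return the top-level category from a 'main | sub | item' string."""
    # Split once at the first separator and keep the left-hand part,
    # then strip the spaces that surround the pipe in the raw value.
    return raw.split(sep, 1)[0].strip()


raw_value = "Home & Kitchen | Appliances | Electric Grinders"
print(main_category(raw_value))  # Home & Kitchen
```

A value with no pipe at all is returned unchanged (apart from surrounding spaces), so already-clean categories pass through safely.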

 

4. Create a Pivot Table to Summarize the Data

Now that we have a column of main categories, we want to answer:

What’s the average rating for each main category?

To do that, use a Pivot Table:

  • Go to Insert → Pivot Table

  • Use your cleaned sheet as input

  • Choose to place the pivot table in a new sheet

Then:

  • Rows → Main Category

  • Values → Average Rating
    (Make sure it’s set to AVERAGE, not SUM)

You’ll now have a table that summarizes average customer rating for each top-level category.
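
To see what the pivot table is doing behind the scenes, here is a minimal Python sketch of the same "group by category, then average" logic. The rows are invented for illustration and are not taken from the Amazon dataset; average_by_category is a hypothetical helper name.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (main_category, rating) rows, invented for illustration.
rows = [
    ("Electronics", 4.2),
    ("Electronics", 3.8),
    ("Home & Kitchen", 4.5),
    ("Home & Kitchen", 4.1),
    ("Toys & Games", 3.9),
]


def average_by_category(rows):
    """Group ratings by category and average each group (AVERAGE, not SUM)."""
    groups = defaultdict(list)
    for category, rating in rows:
        groups[category].append(rating)
    # One output row per category, like the Rows field of a pivot table.
    return {cat: round(mean(vals), 2) for cat, vals in groups.items()}


summary = average_by_category(rows)
for category, avg in summary.items():
    print(f"{category}: {avg}")
```

Swapping mean for sum would reproduce the SUM mistake the step warns about: categories with more products would look "better" simply because they have more rows.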

 

5. Build Your First Chart

Select your pivot table results and go to:

Insert → Chart

Google Sheets will suggest a chart type. If it recommends a column chart, go ahead – but evaluate if a bar chart might offer better readability (especially if category names are long).

We’ll walk through these options in the next section.

 

Let’s Visualize

📈 Line Chart

A line chart connects individual data points using a line. It’s best used when your data has a natural order – especially when you’re tracking changes over time (days, months, years, etc.).

Example:
Tracking monthly sales or customer support tickets over a quarter.

Tip:
Avoid using line charts for categorical comparisons. If your data isn’t time-based, a column chart is likely a better choice.
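
A line chart only makes sense when the points are in chronological order, so the data usually needs to be totaled per period and sorted first. The sketch below shows one way to do that in Python; the orders are made up for illustration and monthly_totals is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (order_date, amount) rows, invented for illustration.
orders = [
    (date(2024, 1, 5), 120.0),
    (date(2024, 2, 14), 80.0),
    (date(2024, 1, 20), 60.0),
    (date(2024, 3, 2), 150.0),
]


def monthly_totals(orders):
    """Sum amounts per (year, month) and return them in chronological order."""
    totals = defaultdict(float)
    for day, amount in orders:
        totals[(day.year, day.month)] += amount
    # Sorting the (year, month) keys gives the left-to-right order of the line.
    return sorted(totals.items())


for (year, month), total in monthly_totals(orders):
    print(f"{year}-{month:02d}: {total:.2f}")
```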

📊 Column Chart vs. Bar Chart

  • Column Chart: Vertical bars. Can become cramped with long category names.

  • Bar Chart: Horizontal layout often reads better, especially when labels are lengthy.

Rule of Thumb: If you’re tilting labels or truncating text, consider switching from column to bar.

Sorting for Clarity: Sort by average rating (descending). This improves storytelling—your audience sees the “best” first.
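
The sorting step is the same idea as this short Python sketch, which ranks categories by average rating from highest to lowest. The averages are hypothetical numbers used only to show the ordering.

```python
# Hypothetical category averages, invented for illustration.
averages = {"Toys & Games": 3.9, "Electronics": 4.0, "Home & Kitchen": 4.3}

# Sort (category, average) pairs by average rating, highest first.
ranked = sorted(averages.items(), key=lambda pair: pair[1], reverse=True)

for category, avg in ranked:
    print(f"{category:<15} {avg:.1f}")
```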

 

◑ Pie Chart

Pie charts are often used to show how a whole is divided into parts – for example, what percentage of total sales came from each region. But they’re easily misused.

Limitations:
When there are too many slices or the differences between them are small, a pie chart becomes difficult to read. It’s also hard to compare angles accurately.

Alternatives:
Use a bar chart when you want precision and easy comparison. Use a pie chart only when you want to emphasize that the data forms parts of a whole.
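
A pie chart is really just a picture of percentage shares, so it can help to compute those shares first and check whether they are different enough to tell apart. Here is a minimal sketch in Python; the regional sales figures are invented, and shares is a hypothetical helper name.

```python
# Hypothetical sales by region, invented for illustration.
sales_by_region = {"North": 50, "South": 30, "West": 20}


def shares(values):
    """Return each part as a percentage of the whole, rounded to 1 decimal."""
    total = sum(values.values())
    return {name: round(100 * value / total, 1) for name, value in values.items()}


for region, pct in shares(sales_by_region).items():
    print(f"{region}: {pct}%")
```

If the resulting percentages are within a point or two of each other, a sorted bar chart will usually show the differences more clearly than slices.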

📉 Histogram

  • Shows distribution of a variable (e.g., ratings).

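  • Good for spotting skewness: left-skewed (long tail on the left) = mostly high ratings with a few low ones; right-skewed (long tail on the right) = mostly low ratings with a few high ones.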

 

🔗 Scatter Plot

  • Use to assess relationships. Try plotting Discount % vs. Rating.

  • In our case: The scatter plot reveals no clear trend—discount and rating are not correlated.

Interpretation Tip: If dots are randomly scattered, there’s no correlation. If they line up along an upward or downward slope, the variables are correlated, which does not by itself prove that one causes the other.

 

🌍 Geo Chart

  • Paste in a “Country” column to simulate geographic data.

  • Use a Geo Chart to see how average ratings vary by country.

  • Color gradient shows sentiment: red (low) → green (high).

Optional: Use Geo Markers for added context—color + size = sentiment + frequency.

 

When Not to Visualize

Avoid visuals that:

  • Repeat the same insight shown earlier

  • Make minor variations look dramatic

  • Confuse more than they clarify

Golden Rule: Every visual should answer a question, support a decision, or reveal a pattern.

 

Understanding Distribution of Ratings

A histogram helps us understand how data is distributed across different ranges or “buckets.” In this case, we’re looking at the distribution of average product ratings across our dataset.

Instead of comparing categories, we now shift our focus to the shape of the data. Are most ratings clustered around 4 and 5? Are there a lot of poor reviews? Or is the distribution balanced?

 

Why It Matters?

Understanding the shape of your data is critical for drawing reliable conclusions:

  • A normal (bell-curve) distribution suggests consistency and reliability.

  • A right-skewed distribution (long tail on right) may indicate widespread dissatisfaction.

  • A left-skewed distribution (long tail on left) often reflects generally high satisfaction with occasional poor ratings.

This is the kind of nuance that a table won’t reveal — but a histogram can, instantly.

 

Step-by-Step: Creating the Histogram of Ratings

  1. Go to your cleaned ratings data — specifically, the column with average ratings per product or per category.

  2. Select the column containing those ratings.

  3. Insert → Chart
    Google Sheets may automatically suggest a histogram. If not:

    • In the Chart Editor, switch the chart type manually to Histogram.
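
Under the hood, a histogram simply counts how many values fall into each bucket. The following Python sketch shows that counting step with a bucket width of 0.5; the ratings are made up, and both the width and the helper name bucket_counts are our own assumptions, not what Google Sheets uses internally.

```python
from collections import Counter

# Hypothetical product ratings, invented for illustration.
ratings = [4.1, 4.3, 4.2, 3.9, 4.4, 4.0, 4.6, 3.4, 4.2, 4.9, 2.5, 4.3]


def bucket_counts(values, width=0.5):
    """Count how many values fall into each [start, start + width) bucket."""
    counts = Counter()
    for value in values:
        start = int(value // width) * width  # left edge of the bucket
        counts[round(start, 2)] += 1
    return dict(sorted(counts.items()))


width = 0.5
for start, count in bucket_counts(ratings, width).items():
    # One text "bar" per bucket: its range, then one # per rating.
    print(f"{start:.1f}-{start + width:.1f} | {'#' * count}")
```

Changing the bucket width changes the shape you see, so it is worth trying a couple of widths before drawing conclusions.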

 

How to Read This Chart?

Once your histogram appears, take a moment to observe its shape.

Let’s say your histogram shows:

  • A peak between 4.0–4.4 average rating

  • Very few products with ratings below 3.5 or above 4.8

This shape is close to a normal distribution: a balanced “bell curve.” Most products are rated well, with fewer being exceptional or disappointing.

 

Skewness Explained

  • Right-skewed: If the chart has a long tail on the right, with many low ratings (2s and 3s), it may signal deeper problems — poor quality, delivery issues, or unmet expectations.

  • Left-skewed: A long tail on the left with many 5-star ratings and few 2s/3s may reflect over-performance — or possibly rating inflation.

A flat histogram with no clear peak might indicate too much noise – maybe inconsistent quality or a dataset that needs more segmentation (e.g., by region or product type).
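
If you want a number to go with the visual impression, skewness can be estimated with the moment coefficient: the average cubed deviation divided by the variance raised to the power 1.5. The sketch below is one common formulation, not the only one, and the sample ratings are invented. A negative result points to a long tail on the left, a positive result to a long tail on the right.

```python
from statistics import mean


def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    mu = mean(values)
    m2 = mean([(v - mu) ** 2 for v in values])  # variance
    m3 = mean([(v - mu) ** 3 for v in values])  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


# Hypothetical ratings, invented for illustration.
mostly_high = [5, 5, 5, 4, 5, 4, 5, 2]  # a few low ratings: tail on the left
mostly_low = [1, 1, 2, 1, 2, 1, 1, 5]   # a few high ratings: tail on the right

print(skewness(mostly_high) < 0)  # True (left-skewed)
print(skewness(mostly_low) > 0)   # True (right-skewed)
```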

 

Business Insight

Let’s say your histogram shows a cluster around 4.2 with a few products down at 2.5. This tells you:

  • Most customers are satisfied — that’s great!

  • But a few products may be dragging down brand perception — worth investigating.

You could now decide to:

  • Deep-dive into the lowest-rated products

  • Read customer feedback to understand pain points

  • Recommend targeted fixes or strategic changes

 

Exploring the Relationship Between Discount and Ratings

Sometimes the questions that matter most in business aren’t about totals or averages — they’re about relationships. One such question is:

“Do higher discounts lead to better ratings?”

This is where a scatter plot becomes useful. A scatter plot lets you visualize how two continuous variables move in relation to each other. In our case:

  • X-axis: Discount percentage

  • Y-axis: Product rating

 

Step-by-Step: Creating the Scatter Plot

  1. Select the two columns:

    • One with discount_percent

    • One with rating
      These are the two variables whose relationship we want to explore.

  2. Insert chart → Scatter plot:
    Google Sheets may automatically suggest a scatter chart. If not, you can select it manually from the chart type list.

 

What You’ll See

The result is a cloud of data points scattered across the chart. If there’s a clear upward or downward trend (a visible slope), it may indicate correlation.

In our case:

  • The points are widely scattered, with no clear upward or downward pattern.

  • Ratings vary across both low and high discount ranges.

  • Some products with 0% discount still received high ratings.

  • Others with 70%+ discount also showed mixed reviews.

 

What This Means (Real-World Insight)

There’s no strong correlation between discount and rating. In fact, based on the scatter plot, the relationship is close to non-existent.

This suggests that:

  • Customers aren’t simply rewarding discounts with better ratings.

  • Product quality, user experience, or expectations may matter more than price cuts.
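
A scatter plot gives a visual impression; a correlation coefficient puts a number on it. As a rough sketch, the Pearson coefficient r can be computed as below. Values near 0 suggest no linear relationship, while values near +1 or -1 suggest a strong one. The discount and rating values are invented for illustration, and pearson_r is our own helper name.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-movement of x and y around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical values, invented for illustration.
discount_percent = [0, 10, 25, 40, 55, 70, 80]
rating = [4.5, 3.8, 4.2, 3.6, 4.4, 3.9, 4.1]

r = pearson_r(discount_percent, rating)
print(f"r = {r:.2f}")
```

Keep in mind that r only measures straight-line relationships, and that even a strong correlation would not prove that discounts cause better ratings.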

Analyst Takeaway

This kind of finding is subtle but powerful. Being able to say:

“We see no clear relationship between discounting and rating performance in this data”
…helps shape more strategic business decisions. (A scatter plot shows what the data looks like; it is not a formal statistical test.)

Instead of blindly running promotions, the team might:

  • Investigate product quality issues.

  • Focus on improving descriptions or delivery experience.

  • Use discounts more selectively — not as a blanket fix.

 

Visualizing Ratings by Country Using Geo Charts

Why Use Geographic Visualizations?

Up to this point, we have been working primarily with product-related insights — looking at categories, subcategories, ratings, and discounts. However, in a real-world business context, one of the most valuable dimensions for analysis is geographic location. Understanding how customer satisfaction varies by country can help businesses adapt their strategy for specific markets.

Let’s suppose we now want to answer the following question:

“Are customers from certain countries more satisfied with our products than others?”

To explore this, we’ll use a Geo Chart — a map-based visualization that allows us to compare values (like ratings or revenue) across geographic regions.

 

Step-by-Step: Creating a Geo Chart in Google Sheets

1. Ensure you have geographic data

Before we begin, we need a column in our dataset that contains country names. In this example, we have manually added a Country column to our dataset, matching each customer rating to a country. In a real-world dataset, this may come from the shipping destination, user profile, or billing address.

Note: The data here is hypothetical, and intended for learning purposes only.

You can download this country data from the Lumen GitHub repository.

 

2. Select your data

Select the two columns:

  • Column A: Country

  • Column B: Rating

This combination tells Google Sheets that we want to map average ratings per country.

 

3. Insert a Geo Chart

With the two columns selected:

  • Click on Insert → Chart

  • In the Chart Editor, change the chart type to Geo Chart

 

If your data is valid (country names are spelled correctly and consistently), Google Sheets will automatically generate a world map highlighting each country based on its corresponding value (average rating).

 

How to Interpret the Geo Chart

The map is shaded using a color gradient:

  • Countries with higher average ratings will appear in dark green.

  • Countries with average or mid-level ratings appear in gray or light green.

  • Countries with low average ratings will appear in red or orange tones.

 

This allows us to instantly spot regional differences. In our sample chart, we observe:

  • High average ratings in countries like:

    • 🇲🇽 Mexico (4.8)

    • 🇸🇦 Saudi Arabia (5.0)

    • 🇵🇰 Pakistan (4.6)

  • Low average ratings in:

    • 🇨🇦 Canada (3.2)

    • 🇨🇴 Colombia (3.0)

    • 🇩🇪 Germany (2.0)

 

What Does This Tell Us?

This visualization is more than a visual flourish. It can guide real business decisions. Here’s how:

  • Identify problem areas
    A country with a low average rating (e.g., Germany with 2.0) might indicate issues with product quality, delivery logistics, or customer expectations in that region.

  • Tailor your marketing or customer support strategy
    If certain countries consistently give high ratings, that could indicate a good product-market fit. These could be your brand advocates. On the other hand, low-rating countries might require deeper investigation — perhaps product descriptions need localization, or packaging needs to match cultural expectations.

  • Explore customer behavior trends by geography
    Certain countries might have a tendency to give lower or higher ratings regardless of the product — this could be due to cultural differences in review habits.

 

Switching to a Marker-Based Geo Chart

Google Sheets also offers an alternate view of the same data, known as the Geo Chart with Markers.

To switch:

  • Click on the geo chart

  • Open the Chart Editor

  • Change the chart type to Geo chart with markers

 
What changes?

  • Each country is now represented by a circular marker on the map.

  • The color of the marker represents the average rating (similar to before).

  • The size of the marker represents the number of data points or observations used to compute that rating.

For example, Germany may have a small red circle, indicating few data points but a low average rating.
Mexico, on the other hand, may have a large green circle, indicating many reviews and high satisfaction.
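
The two things the markers encode, average rating (color) and number of observations (size), can be computed together. Here is a minimal Python sketch, assuming a simple list of (country, rating) rows; the rows are invented and country_summary is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical (country, rating) rows, invented for illustration.
rows = [
    ("Mexico", 5.0), ("Mexico", 4.6), ("Mexico", 4.8),
    ("Germany", 2.0),
    ("Canada", 3.0), ("Canada", 3.4),
]


def country_summary(rows):
    """Return {country: (average_rating, review_count)}."""
    totals = defaultdict(lambda: [0.0, 0])  # running [sum, count] per country
    for country, rating in rows:
        totals[country][0] += rating
        totals[country][1] += 1
    return {c: (round(s / n, 2), n) for c, (s, n) in totals.items()}


for country, (avg, count) in country_summary(rows).items():
    print(f"{country}: average {avg} from {count} review(s)")
```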

 

Analyst Tip: Use Maps with Caution

While geo charts are visually powerful, they should always be supported by context. A country with a single review (positive or negative) can distort your perception if not interpreted carefully.

 

Best practices include:

  • Always show or mention sample sizes, especially if you’re presenting insights to decision-makers.

  • Combine geo charts with pivot tables or summary stats to check if you’re over-relying on outliers.

  • Watch out for country names that may not be recognized by Google Sheets (e.g., spelling variations or abbreviations like “USA” vs “United States”).
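
Two of these checks are easy to automate before you build the map: making country names consistent and flagging countries with too few reviews. The sketch below is only an illustration. The alias table is a tiny hypothetical example, and the threshold of 5 reviews is an arbitrary assumption you should adjust for your own data.

```python
# Hypothetical alias table: lowercase variant -> the spelling you standardize on.
ALIASES = {
    "usa": "United States",
    "us": "United States",
    "uk": "United Kingdom",
}


def normalize_country(name):
    """Collapse extra spaces and map known variants to one spelling."""
    cleaned = " ".join(name.split())
    return ALIASES.get(cleaned.lower(), cleaned)


def flag_small_samples(review_counts, min_reviews=5):
    """Return countries whose review count is below the chosen threshold."""
    return sorted(c for c, n in review_counts.items() if n < min_reviews)


print(normalize_country("  USA "))                          # United States
print(flag_small_samples({"Germany": 1, "Mexico": 40}))     # ['Germany']
```

Countries returned by flag_small_samples are the ones to caveat (or leave out) when presenting the map to decision-makers.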

 

Additional Real-World Retail Scenarios

Scenario | Suggested Chart
Comparing sales across product lines | Column or Bar Chart
Evaluating customer satisfaction by country | Geo Chart
Monitoring daily traffic on a website | Line Chart
Understanding distribution of transaction sizes | Histogram
Correlating price and customer rating | Scatter Plot

What’s Next?

In the next session, we’ll go one step further: How to turn these visuals into a compelling data story.
