Week 1: Web & Web Analytics
Jan-2026
Throughout this course, we’ll analyse ShopSocial, a hypothetical social e-commerce platform where:
This Week’s Focus: Using web analytics to understand ShopSocial user behaviour
By the end of this week, you will be able to:
Assessment link: These concepts form the foundation for your coursework on analysing digital platform data.
Wooclap Question
Go to wooclap.com and enter code: LJHCNE
Which “web era” describes how YOU mostly use the internet today?
A. Reading content others created (news, Wikipedia)
B. Creating and sharing content (social media, reviews)
C. Using AI assistants and smart recommendations
D. Interacting with decentralised apps (crypto, NFTs)
Let’s see the results and discuss what this tells us about web evolution…


Web (2000) vs Web (2026); Generated by Nano Banana Pro






Key insight: Each web generation creates new data types and analytics opportunities.
Wooclap Question
Go to wooclap.com and enter code: LJHCNE
What percentage of e-commerce website visitors do you think make a purchase?
A. 1-3%
B. 3-10%
C. 10-20%
D. 20-35%
🎯 Prediction Reveal #1
Answer: Typically 1-3% (average is about 2.5-3%)
This is why understanding user behaviour is so critical!
Definition
Web analytics is the collection, measurement, analysis, and reporting of website data to understand and optimise web usage.
ShopSocial Question: How can we use analytics to increase purchases and engagement?
| Era | Focus | Data Sources | Key Challenges |
|---|---|---|---|
| Web 1.0 | Basic metrics (hits, visits) | Server logs | Limited interaction data |
| Web 2.0 | User engagement | Cookies, JS tags, social APIs | Data volume, privacy concerns |
| Web 3.0 | Cross-platform, decentralised | Blockchain, IoT, AI platforms | Complexity, integration |
| Web 4.0 | Intent and context | Multi-modal sensors, LLMs | Ethics, real-time processing |
ShopSocial operates in Web 2.0/3.0: We need to track user behaviour whilst respecting privacy.
Wooclap Question
Go to wooclap.com and enter code: LJHCNE
How many tracking cookies do you think are placed when you visit a typical news website?
A. 5-20
B. 20-50
C. 50-100
D. 100-200+
🎯 Prediction Reveal #2
Answer: Often 100-200+ cookies!
Let’s investigate this together…
Reflection
Think about the last time you browsed online. Did you notice ads related to your recent searches? Ever wonder how that happens?
Scenario

ShopSocial needs: Session cookies (cart), persistent cookies (login), but should avoid unnecessary third-party tracking.
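The difference between the two cookie types ShopSocial needs comes down to whether an expiry is set. The sketch below uses Python's standard `http.cookies` module to build both kinds of `Set-Cookie` value. The cookie names (`cart_id`, `login_token`), their values, and the 30-day lifetime are hypothetical choices for illustration, not an actual ShopSocial configuration.

```python
from http.cookies import SimpleCookie

THIRTY_DAYS = 60 * 60 * 24 * 30  # lifetime in seconds (assumed for illustration)

cookies = SimpleCookie()

# Session cookie: no Max-Age or Expires, so the browser drops it when it closes.
cookies["cart_id"] = "abc123"            # hypothetical name and value
cookies["cart_id"]["path"] = "/"
cookies["cart_id"]["httponly"] = True

# Persistent cookie: Max-Age keeps it across browser restarts.
cookies["login_token"] = "tok456"        # hypothetical name and value
cookies["login_token"]["max-age"] = THIRTY_DAYS
cookies["login_token"]["secure"] = True
cookies["login_token"]["httponly"] = True
cookies["login_token"]["samesite"] = "Lax"


def is_persistent(morsel):
    """A cookie is persistent if it carries an explicit lifetime."""
    return bool(morsel["max-age"] or morsel["expires"])


for name, morsel in cookies.items():
    kind = "persistent" if is_persistent(morsel) else "session"
    print(f"{kind:>10}: Set-Cookie: {morsel.OutputString()}")
```

Both cookies here are first-party: they are set by the site the user is visiting. Third-party tracking cookies are set by a different domain embedded in the page.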


Log types:


JavaScript enables rich, client-side tracking of user interactions:
Activity (3 minutes)
Even without cookies, websites can identify you through “browser fingerprinting”.
Try it yourself:
Discussion
If your fingerprint is unique among millions, does blocking cookies even matter?
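To make the idea concrete, here is a minimal sketch of how a fingerprint could be derived: combine several browser attributes and hash them into one identifier. The attribute names and values are hypothetical, and real fingerprinting scripts collect many more signals (canvas rendering, installed plugins, and so on) in the browser itself.

```python
import hashlib
import json


def fingerprint(attributes):
    """Hash a dict of browser attributes into a short, stable identifier."""
    # sort_keys makes the result independent of the order attributes were collected in
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical attribute sets for two browsers
browser_a = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "Europe/London",
    "language": "en-GB",
    "fonts": ["Arial", "Helvetica"],
}
browser_b = dict(browser_a, screen="1366x768")  # differs only in screen size

print(fingerprint(browser_a))
print(fingerprint(browser_b))
```

No cookie is stored at any point. The same browser produces the same identifier on every visit, while a single differing attribute produces a different one.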
Wooclap Question
Go to wooclap.com and enter code: LJHCNE
Do you use an ad blocker?
A. Yes, always
B. Yes, but I whitelist some sites
C. No
D. What’s an ad blocker?
🎯 Prediction Reveal #3
Current ad blocker usage: ~40% of internet users globally
This is why privacy-first analytics matters for businesses!
The Big Picture
To analyse the web, we need to understand how web pages are built.
Later in this course, you’ll learn to:
Think of it this way:
| Technology | Role | Analogy |
|---|---|---|
| HTML | Structure & Content | The skeleton and organs |
| CSS | Styling & Layout | The skin and clothes |
| JavaScript | Interactivity | The muscles and brain |

HTML = HyperText Markup Language
Content sits between an opening <tag> and closing </tag>
Reading this: “This is an HTML document with a title ‘ShopSocial - Best Deals’, a main heading, and a paragraph.”
| Tag | Purpose | Example | What You Might Extract |
|---|---|---|---|
| <h1> to <h6> | Headings | <h1>Product Name</h1> | Product titles |
| <p> | Paragraph | <p>Great quality!</p> | Descriptions, reviews |
| <a> | Link | <a href="url">Click</a> | URLs, link text |
| <img> | Image | <img src="photo.jpg"> | Image URLs |
| <div> | Container | <div class="price">£99</div> | Grouped content |
| <span> | Inline container | <span>4.5 stars</span> | Small pieces of text |
| <table> | Table | <table>...</table> | Structured data |
| <ul>, <li> | Lists | <ul><li>Item 1</li></ul> | List items |
Key insight: Most data you want to scrape is wrapped in these tags!
Tags can have attributes that provide additional information:
| Attribute | Purpose | Why It Matters for Scraping |
|---|---|---|
| href | Link destination | Extract URLs to follow |
| src | Image/script source | Get image URLs |
| class | CSS styling group | Find elements by class name |
| id | Unique identifier | Find specific elements |
| alt | Image description | Extract image descriptions |
class and id are crucial for web scraping! They help us locate specific data.
Activity (5 minutes)
Let’s explore how a real website is structured!
Instructions:
Which tag wraps the item (<h1>, <span>, <div>?)
Does it have a class or id attribute?
Share: What patterns do you notice? Are similar items in similar tags?
Imagine this is the HTML for a ShopSocial product:
<div class="product-card" id="product-12345">
  <h2 class="product-title">Wireless Headphones</h2>
  <img src="images/headphones.jpg" alt="Black wireless headphones">
  <p class="product-description">Premium sound quality with 20hr battery</p>
  <div class="price-container">
    <span class="original-price">£79.99</span>
    <span class="sale-price">£49.99</span>
  </div>
  <div class="rating">
    <span class="stars">★★★★☆</span>
    <span class="review-count">(127 reviews)</span>
  </div>
  <a href="/product/12345" class="buy-button">Add to Cart</a>
</div>
Question: If you wanted to scrape all sale prices, what would you look for?
Answer: Elements with class="sale-price"
Using the ShopSocial HTML from the previous slide, how would you find:
| Data Needed | Tag to Look For | Class/ID to Use |
|---|---|---|
| Product name | | |
| Sale price | | |
| Number of reviews | | |
| Product image URL | | |
| Link to product page | | |
Answers:
| Data Needed | Tag to Look For | Class/ID to Use |
|---|---|---|
| Product name | <h2> | class="product-title" |
| Sale price | <span> | class="sale-price" |
| Number of reviews | <span> | class="review-count" |
| Product image URL | <img> (src attribute) | Inside class="product-card" |
| Link to product page | <a> (href attribute) | class="buy-button" |
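The answers above can be checked in code. The sketch below pulls all five fields out of the ShopSocial product card using only Python's standard `html.parser`, so it runs without installing anything. The class `ProductCardParser` and the output field names are made up for this illustration.

```python
from html.parser import HTMLParser

PRODUCT_HTML = """
<div class="product-card" id="product-12345">
  <h2 class="product-title">Wireless Headphones</h2>
  <img src="images/headphones.jpg" alt="Black wireless headphones">
  <p class="product-description">Premium sound quality with 20hr battery</p>
  <div class="price-container">
    <span class="original-price">£79.99</span>
    <span class="sale-price">£49.99</span>
  </div>
  <div class="rating">
    <span class="stars">★★★★☆</span>
    <span class="review-count">(127 reviews)</span>
  </div>
  <a href="/product/12345" class="buy-button">Add to Cart</a>
</div>
"""


class ProductCardParser(HTMLParser):
    """Collect the five fields from the worksheet table (illustrative helper)."""

    # class attribute -> name of the field whose text we want
    TEXT_CLASSES = {
        "product-title": "name",
        "sale-price": "sale_price",
        "review-count": "review_count",
    }

    def __init__(self):
        super().__init__()
        self.data = {}
        self._current_field = None  # field waiting for its text, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        css_class = attrs.get("class", "")
        if tag == "img":
            self.data["image_url"] = attrs.get("src")      # value lives in an attribute
        elif tag == "a" and css_class == "buy-button":
            self.data["product_link"] = attrs.get("href")  # value lives in an attribute
        elif css_class in self.TEXT_CLASSES:
            self._current_field = self.TEXT_CLASSES[css_class]

    def handle_data(self, data):
        if self._current_field and data.strip():
            self.data[self._current_field] = data.strip()
            self._current_field = None

    def handle_endtag(self, tag):
        self._current_field = None


parser = ProductCardParser()
parser.feed(PRODUCT_HTML)
product = parser.data
print(product)
```

Note the two kinds of extraction: the name, sale price, and review count are text between tags, whereas the image URL and product link are attribute values (`src` and `href`).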
URL = Uniform Resource Locator (the address of a web page)
https://www.shopsocial.com/products/headphones?sort=price&page=2
└──┬───┘└───────┬────────┘└────────┬─────────┘└───────┬────────┘
protocol     domain              path            parameters
| Component | Example | Purpose |
|---|---|---|
| Protocol | https:// | How to connect (secure) |
| Domain | www.shopsocial.com | Which server |
| Path | /products/headphones | Which page |
| Parameters | ?sort=price&page=2 | Extra options (filtering, pagination) |
For scraping: Understanding URL parameters helps you navigate pagination and filters automatically!
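As a sketch of what that looks like in practice, Python's standard `urllib.parse` module can split the example URL into the components above and rebuild it with a different page number. The helper name `page_url` is an assumption for illustration.

```python
from urllib.parse import urlparse, parse_qs, urlencode, urlunparse

url = "https://www.shopsocial.com/products/headphones?sort=price&page=2"

parts = urlparse(url)
print(parts.scheme)   # protocol
print(parts.netloc)   # domain
print(parts.path)     # path
params = parse_qs(parts.query)
print(params)         # parameters as a dict of lists


def page_url(base_url, page):
    """Return base_url with its 'page' parameter set to the given page number."""
    parts = urlparse(base_url)
    query = parse_qs(parts.query)
    query["page"] = [str(page)]
    # doseq=True writes each list value as its own key=value pair
    return urlunparse(parts._replace(query=urlencode(query, doseq=True)))


# URLs for pages 1 to 3 of the same sorted listing
for n in range(1, 4):
    print(page_url(url, n))
```

A scraper can loop over page numbers this way instead of clicking "next" by hand.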
Remember web beacons and JavaScript tracking?
<!DOCTYPE html>
<html>
<head>
  <title>ShopSocial</title>
  <!-- Analytics JavaScript goes in the head -->
  <script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXX"></script>
  <script>
    window.dataLayer = window.dataLayer || [];
    function gtag(){dataLayer.push(arguments);}
    gtag('js', new Date());
    gtag('config', 'G-XXXXX');
  </script>
</head>
<body>
  <h1>Welcome to ShopSocial</h1>
  <!-- Tracking pixel (beacon) often at the bottom -->
  <img src="https://tracking.com/pixel.gif?page=home" width="1" height="1">
</body>
</html>
Activity (3 minutes)
Let’s look at the complete HTML of a real page!
Instructions:
<script (How many JavaScript files?)
pixel or beacon (Any tracking pixels?)
google (Any Google services?)
Note: Real websites have hundreds of lines of HTML - don’t be overwhelmed!
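The same three searches can be automated. The sketch below runs them against the example ShopSocial page from the previous slide, using Python's standard `html.parser`. The class name `TrackerAudit` is made up for this illustration, and treating any 1x1 image as a tracking pixel is a simplifying assumption.

```python
from html.parser import HTMLParser

PAGE_HTML = """<!DOCTYPE html>
<html>
<head>
  <title>ShopSocial</title>
  <script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXX"></script>
  <script>
    window.dataLayer = window.dataLayer || [];
    function gtag(){dataLayer.push(arguments);}
    gtag('js', new Date());
    gtag('config', 'G-XXXXX');
  </script>
</head>
<body>
  <h1>Welcome to ShopSocial</h1>
  <img src="https://tracking.com/pixel.gif?page=home" width="1" height="1">
</body>
</html>"""


class TrackerAudit(HTMLParser):
    """Count <script> tags and collect 1x1 images (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.script_tags = 0
        self.external_scripts = []  # src URLs of scripts loaded from files
        self.pixels = []            # src URLs of 1x1 images

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            self.script_tags += 1
            if attrs.get("src"):
                self.external_scripts.append(attrs["src"])
        elif tag == "img" and attrs.get("width") == "1" and attrs.get("height") == "1":
            self.pixels.append(attrs.get("src"))


audit = TrackerAudit()
audit.feed(PAGE_HTML)
audit.close()

google_scripts = [src for src in audit.external_scripts if "google" in src]

print("script tags:", audit.script_tags)
print("tracking pixels:", audit.pixels)
print("Google scripts:", google_scripts)
```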
In this week’s lab session, you’ll learn to write Python code like this:
from bs4 import BeautifulSoup
import requests

# Get the webpage
page = requests.get('https://shopsocial.com/products')
soup = BeautifulSoup(page.content, 'html.parser')

# Find all product titles (using what we learned today!)
titles = soup.find_all('h2', class_='product-title')

# Find all sale prices
prices = soup.find_all('span', class_='sale-price')

# Extract the text
for title, price in zip(titles, prices):
    print(f"{title.text}: {price.text}")
See the connection? The class_='product-title' argument directly uses what we learned about HTML classes! (BeautifulSoup spells the keyword class_ because class is a reserved word in Python.)
HTML Structure:
Content sits between an opening <tag> and closing </tag>
class and id attributes help identify elements
Important Tags for Scraping:
<h1> - <h6>: Headings
<p>: Paragraphs
<a>: Links (check href)
<img>: Images (check src)
<div>, <span>: Containers
URLs:
Developer Tools:
Coming up: You’ll use this knowledge to build web scrapers in Python!
Definition
Clickstream data tracks the sequence of clicks (page views) a user makes whilst navigating through a website.
| Term | Definition |
|---|---|
| Hit | Any request for a file from the web server |
| Page view | A request to load a single page |
| Session | A group of user interactions within a time frame (typically 30 mins) |
| Unique visitor | A distinct individual visiting the site (identified by cookie/IP) |
| Bounce | A session with only a single page view |
| Conversion | Completion of a desired action (purchase, sign-up) |
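The session definition can be turned into a simple rule: sort one visitor's page views by time and start a new session whenever the gap since the previous view exceeds the timeout. The sketch below assumes an inactivity-based 30-minute timeout, which is one common interpretation of the time frame; the timestamps are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold


def split_into_sessions(timestamps, timeout=SESSION_TIMEOUT):
    """Group one visitor's page-view times into sessions by inactivity gap."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= timeout:
            sessions[-1].append(ts)   # still within the current session
        else:
            sessions.append([ts])     # gap too long (or first view): new session
    return sessions


# Hypothetical page views from a single visitor on one day
views = [
    datetime(2026, 1, 12, 10, 0),
    datetime(2026, 1, 12, 10, 10),
    datetime(2026, 1, 12, 10, 35),
    datetime(2026, 1, 12, 11, 30),   # 55 minutes after the previous view
]

sessions = split_into_sessions(views)
print(len(sessions), "sessions with", [len(s) for s in sessions], "page views")
```

Here one unique visitor generates four page views across two sessions, which is why the three terms give different counts for the same traffic.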
Important
Bounce Rate (for a page): \[ \text{Bounce Rate} = \frac{\text{Single-page sessions starting on this page}}{\text{Total sessions starting on this page}} \]
Exit Rate (for a page): \[ \text{Exit Rate} = \frac{\text{Sessions ending on this page}}{\text{Total sessions that included this page}} \]
Conversion Rate: \[ \text{Conversion Rate} = \frac{\text{Number of conversions}}{\text{Total sessions (or visitors)}} \]
Average Page Depth: \[ \text{Average Page Depth} = \frac{\text{Total page views}}{\text{Total sessions}} \]
Bounce Rate:
When high bounce is acceptable:
Exit Rate:
Key difference:
| Visitor 1 | Visitor 2 | Visitor 3 | Visitor 4 | Visitor 5 |
|---|---|---|---|---|
| Home | Home | Home | Home | Products |
| Products | About | Products | | |
| Product 1 | Products | Basket | | |
| Basket | Home | Checkout | | |
| Checkout | | | | |
| Purchase confirmed | | | | |
Answers:
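As a sketch, the metric formulas can be applied directly to the five visitor paths in the table. The code assumes each visitor makes exactly one session and that reaching "Purchase confirmed" counts as the conversion; both are assumptions made for this illustration.

```python
# Page sequences read column by column from the visitor table
sessions = {
    "Visitor 1": ["Home", "Products", "Product 1", "Basket", "Checkout", "Purchase confirmed"],
    "Visitor 2": ["Home", "About", "Products", "Home"],
    "Visitor 3": ["Home", "Products", "Basket", "Checkout"],
    "Visitor 4": ["Home"],
    "Visitor 5": ["Products"],
}
paths = list(sessions.values())


def bounce_rate(paths, page):
    """Single-page sessions starting on `page` / all sessions starting on `page`."""
    started = [p for p in paths if p[0] == page]
    if not started:
        return 0.0
    return sum(len(p) == 1 for p in started) / len(started)


def exit_rate(paths, page):
    """Sessions ending on `page` / all sessions that included `page`."""
    included = [p for p in paths if page in p]
    if not included:
        return 0.0
    return sum(p[-1] == page for p in included) / len(included)


conversion_rate = sum("Purchase confirmed" in p for p in paths) / len(paths)
average_page_depth = sum(len(p) for p in paths) / len(paths)

print("Bounce rate (Home):    ", bounce_rate(paths, "Home"))      # 0.25
print("Bounce rate (Products):", bounce_rate(paths, "Products"))  # 1.0
print("Exit rate (Home):      ", exit_rate(paths, "Home"))        # 0.5
print("Exit rate (Products):  ", exit_rate(paths, "Products"))    # 0.25
print("Conversion rate:       ", conversion_rate)                 # 0.2
print("Average page depth:    ", average_page_depth)              # 3.2
```

The Home page shows the key difference between the two rates: only Visitor 4 bounces from it (1 of 4 sessions starting there), but both Visitor 2 and Visitor 4 end their sessions on it (2 of 4 sessions that included it).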
Definition: Tracking and analysing the steps users take towards a specific goal or conversion.
| Component | Description |
|---|---|
| Entry point | Where users begin (e.g., landing page, homepage) |
| Intermediate steps | Key actions towards the goal (e.g., view product, add to cart) |
| Conversion | The final goal (e.g., purchase, sign-up) |
| Drop-off points | Stages where users exit without converting |
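A short sketch shows how drop-off points are found once the number of users reaching each step is known. The step names follow the examples in the table, but the user counts are hypothetical figures invented for illustration.

```python
# (step name, users reaching the step); counts are hypothetical
funnel = [
    ("Landing page", 1000),
    ("View product", 400),
    ("Add to cart", 120),
    ("Purchase", 30),
]


def funnel_report(steps):
    """Return (from_step, to_step, drop_off_rate) for each consecutive pair."""
    report = []
    for (name, users), (next_name, next_users) in zip(steps, steps[1:]):
        drop_off = 1 - next_users / users  # share of users lost at this transition
        report.append((name, next_name, round(drop_off, 3)))
    return report


report = funnel_report(funnel)
for name, next_name, drop_off in report:
    print(f"{name} -> {next_name}: {drop_off:.0%} drop-off")

# Overall conversion: users completing the final step / users at the entry point
overall_conversion = funnel[-1][1] / funnel[0][1]
print(f"Overall conversion: {overall_conversion:.1%}")
```

The transition with the largest drop-off rate is usually the first place to investigate, since small improvements there affect every later step.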

Key insights:
The customer journey extends beyond a single session, encompassing all touchpoints with your brand:
| Stage | Description | ShopSocial Example |
|---|---|---|
| Awareness | Customer discovers your brand | Social media ad, Google search |
| Consideration | Researches and compares options | Browse products, read reviews |
| Decision | Makes a purchase | Checkout and payment |
| Retention | Post-purchase engagement | Order tracking, support |
| Advocacy | Recommends to others | Reviews, social shares, referrals |
What heatmaps show:
Insights from heatmaps:

How it works:
Provides data on:



Web Evolution:
Tracking Technologies:
Web Basics:
class and id attributes help locate data
Core Metrics:
Measurement Tools:
Next week: Search Engines and Web Graph
Course assessment reminder:
Thank you!
Dr. Zexun Chen
📧 Zexun.Chen@ed.ac.uk
