Mind the Data Gap: Rush Hour Reality

Part 5: Rush Hour Reality

Over the last year, I have been designing a system for ingesting data from National Rail’s “Darwin Push Port Feed”.
This feed contains live information about train movements, scheduling, and service alterations across the UK rail network.

In previous posts, we have discussed how services can be grouped into routes, analysed stations for the most and least delayed, and developed interactive visualisations to show delays across the network.
Today we’re going to focus on peak trains, their capacity, and their average loading — a metric to quantify how full a given service is.

Loading is a relatively new metric in the Darwin system, first introduced in 2017 [1] (yes, the railways move slowly…). Over the last six months, we’ve only been able to capture reliable data for two operators: Chiltern and Southeastern.

Given data for only these two operators, let’s dive in.


Loading Distribution

To conduct the analysis, we need to pull all services over a given period and compute the average loading for each of them.
This gives us the following loading distribution (0 is empty, 100 is full including standing):

Loading Range   Records   Percentage
0–20%           106,285   56%
20–40%          51,212    27%
40–60%          17,588    9%
60–80%          8,141     4%
80–100%         6,485     3.4%   ← Severe overcrowding

Perhaps unsurprisingly, trains are often very empty!
Just 7% of services are running at full seating capacity.
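
The bucketing behind the table above can be sketched in a few lines of Python. This is an illustration rather than the actual pipeline code: it assumes each service arrives as a list of loading readings between 0 and 100, and the names `readings_by_service`, `band_for`, and `loading_distribution` are made up for the example.

```python
from collections import Counter
from statistics import mean

# The five 20-point bands used in the table.
BANDS = ["0–20%", "20–40%", "40–60%", "60–80%", "80–100%"]


def band_for(loading):
    """Map a 0-100 loading value to its band (exactly 100 lands in the top band)."""
    index = min(int(loading // 20), len(BANDS) - 1)
    return BANDS[index]


def loading_distribution(readings_by_service):
    """Average each service's readings, then count services per band.

    readings_by_service is a hypothetical dict of service ID -> list of
    loading readings (0-100). Returns {band: (count, percentage)}.
    """
    counts = Counter(
        band_for(mean(readings))
        for readings in readings_by_service.values()
        if readings  # skip services with no loading data at all
    )
    total = sum(counts.values())
    if total == 0:
        return {band: (0, 0.0) for band in BANDS}
    return {band: (counts[band], 100 * counts[band] / total) for band in BANDS}
```

Averaging per service before bucketing means a long journey with many readings counts once, the same as a short one.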


Average Loading by Time of Day

When we group these by time of day (across a sample at the start of October), average loading roughly doubles during the morning and evening peaks, from around 18–20% in the middle of the day to 33–37%:

Time Period     Average Loading   Notes
4–5am           10%
5–6am           18%
6–7am           24%               Early commuters
7–8am           35–37%            Morning peak starts
8–9am           34–36%            Morning peak
9–10am          21%
10am–2pm        18–20%
2–3pm           21%
3–4pm           31–32%            School run
4–5pm           33–34%            Evening peak starts
5–6pm           29%               Evening peak
7–8pm           20%
8pm–midnight    13–17%

This pattern doesn’t continue at the weekends though, where instead we see:

  • No morning peak: 7–8am only 14% (vs 36% weekday)
  • Midday peak instead: 10am–12pm reaches 23–25%
  • Flattened evening: 5–7pm only 24% (vs 33% weekday)
  • Later activity: 8pm–midnight stays higher (16–19% vs 13–17%)
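
For anyone wanting to reproduce the weekday and weekend split, here is a minimal sketch of the grouping. It assumes each observation is a `(departure_time, loading)` pair, which is a simplification of what Darwin actually provides, and `average_loading_by_hour` is a name invented for this example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def average_loading_by_hour(observations):
    """Return {(day_type, hour): average loading}.

    observations is a hypothetical iterable of (departure_time, loading)
    pairs, with departure_time as a datetime and loading on a 0-100 scale.
    """
    buckets = defaultdict(list)
    for departure, loading in observations:
        # datetime.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
        day_type = "weekend" if departure.weekday() >= 5 else "weekday"
        buckets[(day_type, departure.hour)].append(loading)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Made-up sample: two weekday readings and one Saturday reading in the 7-8am hour.
sample = [
    (datetime(2024, 10, 1, 7, 15), 30),
    (datetime(2024, 10, 1, 7, 45), 40),
    (datetime(2024, 10, 5, 7, 30), 14),
]
print(average_loading_by_hour(sample))
```

Keying on the hour of departure is one choice among several; keying on the time a train calls at a particular station would shift some services into a different bucket.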

Fleet Composition

Beyond time of day, two other factors affect service loading: the frequency of a service and the length of its trains.
We won’t be touching on frequency today, but let’s explore service length and its effect on loading.

Let’s start with a simple analysis on the length of services across the two operators:

Southeastern Fleet

2–4 cars:  ██████████ 14.1%
5–7 cars:  ████████  9.1%
8 cars:    ████████████████████████████████ 48.9%  ← BACKBONE
10 cars:   █████████████ 21.0%
12 cars:   ████ 6.6%

Chiltern Fleet

2 cars:    ████████████████ 24.9%
3 cars:    ███████████████ 23.9%
4 cars:    █████████████████████████ 39.1%  ← BACKBONE
5–8 cars:  ██████ 11.2%

Both operators run commuter services into London, but the charts above show how different their rolling stock is: around three-quarters of Southeastern services (76.5%) run with 8 or more cars, while almost 88% of Chiltern services run with 4 or fewer.
Southeastern’s trains are far better suited to busy commuter routes.
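
The fleet breakdowns above boil down to counting services per length band. Below is a rough sketch, assuming we already have the number of cars for each service; the function names and the bar scale of 1.5% per block are illustrative choices, not the exact settings used for the charts.

```python
from collections import Counter

# Length bands matching the Southeastern chart: (label, min cars, max cars), inclusive.
SOUTHEASTERN_BANDS = [
    ("2–4 cars", 2, 4),
    ("5–7 cars", 5, 7),
    ("8 cars", 8, 8),
    ("10 cars", 10, 10),
    ("12 cars", 12, 12),
]


def fleet_composition(car_counts, bands):
    """Percentage of services in each length band.

    car_counts is a list with one entry (number of cars) per service.
    Lengths outside every band still count towards the total, so the
    percentages can sum to slightly less than 100.
    """
    counts = Counter()
    for cars in car_counts:
        for label, low, high in bands:
            if low <= cars <= high:
                counts[label] += 1
                break
    total = len(car_counts)
    if total == 0:
        return {label: 0.0 for label, _, _ in bands}
    return {label: round(100 * counts[label] / total, 1) for label, _, _ in bands}


def render_chart(shares, percent_per_block=1.5):
    """Draw one text bar per band, using an assumed scale of 1.5% per block."""
    lines = []
    for label, pct in shares.items():
        bar = "█" * round(pct / percent_per_block)
        lines.append(f"{label + ':':<11}{bar} {pct}%")
    return "\n".join(lines)


# Made-up sample of ten services, just to exercise the functions.
shares = fleet_composition([4, 4, 8, 8, 8, 8, 10, 12, 2, 5], SOUTHEASTERN_BANDS)
print(render_chart(shares))
```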

But do the shorter services of the Chiltern lines affect its overcrowding at peak times?
If we group our services by length, then compute the average loading factor, we can get a feel for whether length is affecting overcrowding of peak services.


Peak-Time Overcrowding by Train Length

During peak times, we get the following table:

TOC   Operator       Train Length   # Services   Avg Loading   Min–Max   Severity
SE    Southeastern   5 cars         14           80.9%         65–98%    🔥🔥🔥 CRITICAL
SE    Southeastern   10 cars        38           73.7%         61–93%    🔥🔥 SEVERE
SE    Southeastern   12 cars        1            71.5%         71–71%    🔥🔥 SEVERE
SE    Southeastern   8 cars         8            69.7%         60–81%    🔥🔥 SEVERE
SE    Southeastern   3 cars         3            70.1%         60–78%    🔥🔥 SEVERE
SE    Southeastern   6 cars         5            65.0%         62–68%    🔥 HIGH
CH    Chiltern       4 cars         10           67.2%         60–88%    🔥 HIGH
CH    Chiltern       3 cars         6            68.8%         62–76%    🔥 HIGH
CH    Chiltern       5 cars         2            65.6%         62–69%    🔥 HIGH
CH    Chiltern       2 cars         1            61.3%         61–61%    🔥 HIGH

Fourteen of the Southeastern services in this table are 5-car trains. They average 80.9% loading, ranging from 65% to 98%, and make up about 20% of the operator’s overcrowded peak services (14 of 69).
Rebalancing train lengths during peak times could cut this congestion considerably.

The same could be said for the 10-car services, which form the largest group in the table (38 services) and average 73.7% loading at peak.

Looking closer at the data for Chiltern, we see a different picture: the overcrowding sits on short trains. The 4-car backbone accounts for 10 of the 19 Chiltern services in the table and reaches the highest loading (88%), while the 3-car trains have the highest average (68.8%). Lengthening these services could noticeably improve passenger comfort at peak.
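
A table like the one above can be produced by grouping peak services by operator and train length. The sketch below assumes each peak service has already been reduced to an `(operator, cars, avg_loading)` tuple, and it assumes a 60% cut-off for "overcrowded", inferred only from the fact that every minimum in the table is 60% or higher. The severity labels are left out because their thresholds aren't stated here.

```python
from collections import defaultdict
from statistics import mean

# Assumed cut-off: every Min value in the table above is at least 60%.
OVERCROWDED_THRESHOLD = 60.0


def overcrowding_by_length(services, threshold=OVERCROWDED_THRESHOLD):
    """Summarise overcrowded peak services per (operator, train length).

    services is a hypothetical iterable of (operator, cars, avg_loading)
    tuples, one per peak-time service. Rows come back sorted by average
    loading, highest first.
    """
    groups = defaultdict(list)
    for operator, cars, loading in services:
        if loading >= threshold:
            groups[(operator, cars)].append(loading)
    rows = [
        {
            "operator": operator,
            "cars": cars,
            "services": len(loadings),
            "avg": round(mean(loadings), 1),
            "min": min(loadings),
            "max": max(loadings),
        }
        for (operator, cars), loadings in groups.items()
    ]
    return sorted(rows, key=lambda row: row["avg"], reverse=True)


# Made-up sample: the 50% service falls below the threshold and is dropped.
sample_services = [
    ("SE", 5, 65.0),
    ("SE", 5, 98.0),
    ("SE", 10, 70.0),
    ("CH", 4, 60.0),
    ("CH", 4, 50.0),
]
for row in overcrowding_by_length(sample_services):
    print(row)
```
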

You can play with this data further in the visualisations below.


Interactive Visualisations

Loading Distribution

Explore the distribution of train loading across all services, showing how often trains fall into different capacity ranges:

Hourly Loading Pattern

See how train loading varies throughout the day, with clear peaks during morning and evening rush hours:

Train Length vs Loading

Explore the relationship between train length and loading levels, showing how different train configurations affect capacity during peak times:


Reference

[1] Celebrating 20 years of Darwin – The railway’s single source of truth