The problems with single game xG as a proxy for attacking quality

Cast Iron Tactics
Jul 6, 2020

West Ham’s 2–2 draw with Newcastle United yesterday is a good example of how solely using contextless xG to draw conclusions on a team’s attacking quality can be misleading for a number of reasons:

1. Difference in values between data providers

Understat have the xG for this game as Newcastle 1.49–2.03 West Ham, whereas Infogol have it pegged as Newcastle 1.32–2.33 West Ham.

Confusingly, Infogol’s numbers vary slightly between their app and their website.

Either way, you have two different data providers watching the same match and the same shots, and ascribing different values to them. There’s a ~0.5 xG swing that takes West Ham’s margin over Newcastle from roughly half an expected goal (0.54) to a whole one (1.01).

(It seems the main discrepancies are on Souček’s goal (0.14 on Understat vs 0.22 on Infogol), Jarrod Bowen’s toe poke from Aaron Cresswell’s cross/shot in the first half (0.51 vs 0.54), Antonio’s hooked volley from Rice’s overhit cross at the start of the game (0.08 vs 0.10) and Shelvey’s equaliser (0.40 on Understat vs 0.29 on Infogol). Those little differences add up.)
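To make that swing concrete, here is a minimal Python sketch that recomputes West Ham’s margin from the two sets of totals quoted above. The dictionary layout and the `west_ham_margin` helper are illustrative names only, not anything either provider publishes.

```python
# xG totals quoted in this piece, keyed by provider and then by team.
xg = {
    "Understat": {"Newcastle": 1.49, "West Ham": 2.03},
    "Infogol": {"Newcastle": 1.32, "West Ham": 2.33},
}


def west_ham_margin(totals):
    """West Ham's xG minus Newcastle's xG for one provider (hypothetical helper)."""
    return round(totals["West Ham"] - totals["Newcastle"], 2)


# Margin according to each provider, then the swing between the two.
margins = {provider: west_ham_margin(totals) for provider, totals in xg.items()}
swing = round(margins["Infogol"] - margins["Understat"], 2)

for provider, margin in margins.items():
    print(f"{provider}: West Ham +{margin:.2f} xG")
print(f"Swing between providers: {swing:.2f} xG")
```

The same match comes out as a 0.54 xG margin on one set of numbers and a 1.01 xG margin on the other, a swing of 0.47.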

2. Multiple shots in the same move

Another major issue that you may have spotted in the Infogol screenshot above is that both of these providers have included the value of Declan Rice’s back-post header in the build-up to Souček’s shot. Here’s the goal:

This is where it gets messy. At its core, xG is a measure of the probability of a shot being scored based on the historical likelihood of similar shots being scored. But that causes a problem here, because Souček’s goal is entirely dependent on Rice’s header not going in — if it doesn’t rebound off the bar, then the ball doesn’t drop in the 6-yard box to him. They are two interdependent events and it’s not an accurate representation of the team’s attacking prowess to include both of them in the xG total.

I’m not sure how best to account for this. I remember seeing a suggestion that, in situations like this, you could take the values of all the shots in the build-up to a goal and average them. Doing that here (using Infogol as an example), you’d add Rice’s header (0.32) to Souček’s shot (0.22) and divide by two to get 0.27. Replacing the original 0.54 with that average takes 0.27 off Infogol’s figure, so West Ham’s total drops by ~12% to 2.06, which paints a slightly different picture.
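The averaging idea can be sketched in a few lines of Python, using the Infogol values quoted in this piece. For comparison, the sketch also includes a second convention that is not the one described above and is purely an assumption for illustration: treating the two shots as a single chance worth the probability that at least one of them goes in, which fits the fact that Souček only gets his shot if Rice’s header doesn’t score. All function and variable names are hypothetical.

```python
# Infogol values quoted in this piece.
rice_header = 0.32
soucek_shot = 0.22
infogol_west_ham_total = 2.33

# What the providers do: count both shots in full.
raw_sequence = rice_header + soucek_shot

# Option A (the suggestion above): average the shots in the move.
averaged = raw_sequence / 2

# Option B (an assumption, not from this piece): value the move as the
# probability that at least one of the two shots is scored.
at_least_one = 1 - (1 - rice_header) * (1 - soucek_shot)


def adjusted_total(total, raw_value, sequence_value):
    """Swap the raw value of the move for a single sequence value."""
    return round(total - (raw_value - sequence_value), 2)


total_averaged = adjusted_total(infogol_west_ham_total, raw_sequence, averaged)
total_at_least_one = adjusted_total(infogol_west_ham_total, raw_sequence, at_least_one)
drop_pct = round(100 * (infogol_west_ham_total - total_averaged) / infogol_west_ham_total, 1)

print(f"Averaged: {total_averaged:.2f} (a {drop_pct}% drop)")
print(f"At least one scores: {total_at_least_one:.2f}")
```

Averaging gives 2.06, the figure quoted above. The at-least-one-scores version is less punitive, leaving West Ham on 2.26.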

Basic xG values are generally based on shot location, distance, whether it’s a header or shot with the foot, which foot the shot is taken by, whether it’s a volley or not, but don’t factor in the position of opposition players or of the goalkeeper. That can cause some slightly curious valuations.

Take the two shots in the build-up to our second goal yesterday: I know Souček’s goal was a volley on his weaker foot (0.22), but is a player heading a pacy, inswinging corner at the far post, from a fairly acute angle and with a defender marking him (0.32), really likely to score nearly one and a half times as often as a relatively unmarked volley on the edge of the 6-yard box? Jonjo Shelvey’s one-on-one left-footed finish to equalise was rated 0.29 by Infogol.

Is that really an equivalent chance to Rice’s 0.32 header? (Understat has Shelvey’s finish as 0.40 and Rice’s header as 0.36, so still has them pegged as quite similar quality chances).

3. Drawing conclusions on attacking quality without context

It’s quite easy to look at the xG totals and the shot maps from this game and come away from it thinking West Ham were the better side and created the better chances. But that fails to factor in how those chances came about.

Look at the opening goal. The eventual location of Michail Antonio’s shot is a brilliant one and the build-up play in the final third was good too, but did West Ham truly create this shot?

Jarrod Bowen’s ball is into an excellent area, but the delivery itself is quite tame and, in normal circumstances, quite easy to clear. It only results in a goal because Lascelles temporarily forgets how human legs work. I suppose you could make the claim that Antonio’s presence puts pressure on Lascelles and forces the mistake, so it isn’t exactly an unforced error, but I have a hard time ascribing the creation of this goal to West Ham’s attacking quality.

Compare it to the Almirón equaliser:

Sure, there are lapses in concentration from Rice and Ogbonna as Newcastle work the ball into the box, but they’re not in the same category as the Lascelles error. This goal is a more accurate reflection of Newcastle’s attacking quality in the game than the Antonio goal is of West Ham’s ability as an attacking force.

And if you take out the 0.47 from that Antonio chance, which was essentially a fluke, and also sort out the mess of the two shots leading to our second goal, the xG totals for the two teams come out much closer.
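As a rough check on that claim, the sketch below applies both adjustments to Infogol’s totals. It assumes the 0.47 for Antonio’s chance can be subtracted from Infogol’s West Ham figure (the provider for that value isn’t stated) and reuses the averaging adjustment for the Rice and Souček shots. The variable names are illustrative.

```python
# Infogol totals quoted in this piece.
infogol = {"Newcastle": 1.32, "West Ham": 2.33}

# Assumption: the 0.47 for Antonio's chance is treated as an Infogol value.
antonio_chance = 0.47

# Averaging adjustment for the Rice header (0.32) and Souček shot (0.22):
# the two shots count as their mean instead of their sum.
rice_header, soucek_shot = 0.32, 0.22
rebound_adjustment = (rice_header + soucek_shot) - (rice_header + soucek_shot) / 2

west_ham_adjusted = round(infogol["West Ham"] - antonio_chance - rebound_adjustment, 2)
gap = round(west_ham_adjusted - infogol["Newcastle"], 2)

print(f"West Ham adjusted: {west_ham_adjusted:.2f}")
print(f"Newcastle: {infogol['Newcastle']:.2f}")
print(f"Gap: {gap:.2f}")
```

On these assumptions West Ham’s total falls to 1.59 against Newcastle’s 1.32, a gap of 0.27 rather than just over a full expected goal.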

Obviously the Antonio shot actually happened, so you have to count it, but the circumstances leading to the shot are almost entirely outside West Ham’s influence. Drawing conclusions on the attacking quality of a team in a single game using xG can easily lead you astray.

I know that none of this is particularly new or revelatory, but this match felt like a good example of the phenomenon. I also know that Infogol/Understat aren’t especially reputable sources, but they’re the only ones that are free and publicly available (FBref, who use StatsBomb’s xG values, haven’t updated their data for the most recent round of fixtures yet).
