Expect the (Un)Expected: A Beginner’s Guide to Advanced Football Metrics
What exactly are expected goals? Why does PPDA sound like something I may have been mis-sold? Why, like any meat, should stats not be consumed raw? All these questions — as well as many more — answered in this, your introductory guide to football analytics. This piece will break down, explain and hopefully show the value of certain metrics within football analytics.
Expected Goals (xG)
Sky Sports and Match of the Day have taken kindly to the Expected Goals metric, frequently displaying it during and after matches. Fundamentally, it acts as a comparative tool for measuring the performance of teams and players. It seeks to quantify each shot by assigning it a probability of being scored, an average if you like. A variety of factors feed into the analysis:
· Distance from goal
· Angle of shot
· Shot clarity: Number of defenders between the ball and goal
· Goalkeeper’s position
· Height of ball at time of shot
· Pressure from opponents
Crucially, and I mean crucially, providers of xG vary in their measurement criteria; for instance, Opta do not consider the impact of height on goalscoring probability, whereas StatsBomb do. Thousands and thousands of football shots have been analysed through these models, which now allows an xG value to be assigned to each shot.
Penalties are the simplest example of xG to understand. On average, 79% of penalties are scored at elite level. Since xG expresses goalscoring probabilities as decimals rather than percentages, this reads as 0.79. These values can then be accumulated for a team or individual across a season. Many misunderstand xG as a concept, though; teams or individuals will very rarely come close to meeting their xG values exactly, which is down to the variability and randomness of football. Ultimately this is all looked at probabilistically: elite footballers are good enough to score shots from range, from outside the line of the posts and with defenders in the way. Likewise, elite footballers are also human beings, capable of missing chances due to pressure from opponents, the intense speed of professional football and simple lapses in decision-making.
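The accumulation described above can be sketched in a few lines of Python. The 0.79 penalty value is the figure quoted above; the other shot values are invented purely for illustration:

```python
# Toy sketch: accumulating xG across a side's shots in a match.
# 0.79 is the penalty figure from the text; the rest are made up.
shots = [0.79, 0.05, 0.12, 0.30]  # xG value of each shot taken

total_xg = round(sum(shots), 2)
print(total_xg)  # 1.26
```

The same summation, run over a whole season of shots, is how a player or team ends up with a season-long xG figure.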
Albion have started the season well from an xG perspective, with the underlying numbers suggesting the quality of chances they are creating is closer to that of a top-half side.
For example, Anthony Knockaert’s match-winner at Selhurst Park in the 18/19 season came in at just 0.03xG according to InfoGol. A 3% chance of scoring from there seems fair when you weigh the difficulty of the shot against distance, angle and shot clarity.
That said, xG is not necessarily a tool to assess whether a result was ‘fair’ or not; a team who creates higher xG than their opponent would, from a probabilistic perspective, be considered more likely to win the game, as the quality (and likely quantity) of the shooting opportunities they are generating is much better. Football is not played in a lab, nor on paper, and I write this as an analyst.
Take Tomer Hemed as an example, a player who any Brighton fan would consider to be a seriously elite penalty taker; according to Transfermarkt, he scored 11 of the 13 penalties he took as an Albion player, a conversion rate of 85% from the spot. That sits six percentage points above the elite average, certainly a fair quantification of Hemed’s penalty-taking ability.
Football is one of the lowest-scoring sports in the world, meaning each goal carries enormous weight. Referring back to the Crystal Palace versus Brighton game in March 2019, Crystal Palace generated 1.73xG to Brighton’s 0.15. Brighton won the game 2–1, thanks to the aforementioned Knockaert worldie.
Did Brighton therefore not deserve the win? To look at it another way, if a team outscores their xG they are more clinical than the ‘average’ side, whilst a team who underscores is more wasteful; you make your own luck in football, so sides who are clinical deserve the results they get, as do the sides who fail to convert. All the xG shows is that a result like that is unlikely to be repeated; no side scores 30-yard shots every week, and likewise no team will miss all their tap-ins. This is all just retrospective analysis.
Brighton recorded 2.83xG against Liverpool at the Amex last season, which at that stage of the season was the second highest xG any side had recorded against the Champions elect. Brighton were wasteful though, only scoring once and missing four of the five big chances they created.
To appreciate xG best, take it all with a pinch of salt and try to avoid using it in isolation; it operates best as a supplementary metric.
Expected Assists (xA)
The sibling of xG, Expected Assists (xA) analyses a team or individual’s creative abilities. All this does is quantify the value of a chance created. Take Leandro Trossard’s assist for Neal Maupay at Newcastle; the cross in behind the defensive line set the Frenchman up for a one-touch finish on the edge of the six-yard box, in a very central location.
InfoGol have this at 0.51xG, meaning Maupay would probabilistically be expected to score 51 out of 100 times he takes that specific shot. Maupay gets the xG for the shot, and Trossard has the same number added to his xA, because shots can be, and are, assisted. Both expected metrics should typically reflect what you see with your eyes; most football fans have watched and played enough football to have a relatively comprehensive understanding of the chances of scoring from a certain position, even if they cannot necessarily quantify it. Ultimately, most providers accept that their xG models are not perfect, but they do not need to be.
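The crediting described above, where the shooter gets xG and the assister gets the same value as xA, can be sketched as below. The Trossard/Maupay value is the InfoGol figure from the text; the second, unassisted shot is invented for illustration:

```python
# Sketch: crediting a shot's xG to the shooter and the same value
# as xA to the assisting passer (if there is one).
from collections import defaultdict

xg = defaultdict(float)  # per-player accumulated xG
xa = defaultdict(float)  # per-player accumulated xA

# (shooter, assister or None, shot xG value)
shot_events = [
    ("Maupay", "Trossard", 0.51),  # figure from the text
    ("March", None, 0.07),         # invented unassisted shot: xG only
]

for shooter, assister, value in shot_events:
    xg[shooter] += value
    if assister is not None:
        xa[assister] += value

print(xg["Maupay"], xa["Trossard"])  # 0.51 0.51
```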
Progressive Metrics
Here’s where things get icky. Fundamentally, football is an invasion sport. Each team has their own half, within which their goal lies. Their aim is to get the ball into the opposition half and ultimately the opponent’s goal, whilst preventing the opposition doing the exact same to them. Under Potter, Brighton have become much more expansive in possession than they were under Hughton, which has increased their need for quality progressors.
This is what progressive metrics quantify, though again these will vary from provider to provider. For instance, StatsBomb consider a progressive pass to be one which moves the ball at least 10 yards closer to the opposition goal than any point in the last 6 passes, or a pass which is completed into the opponent’s box. They exclude passes from the defending 40% of the field.
Hopefully this explains why it is important not to become over-reliant on metrics, as they often measure incredibly specific things; coaches and fans alike will rightly highlight that passes within the defensive 40% of the field, as well as passes which do not progress the ball 10+ yards towards the opposition goal, are still valuable. Perhaps more useful is looking at the yardage numbers for each player; very American Football of us, we know.
Progressive passing yards are simply the total number of yards a player moves the ball towards the opposition goal, each match, through passes. A pass from the 18-yard-box line to the 6-yard-box line gains a player 12 yards. Similarly, progressive carry yards measure in the exact same way, just when a player is dribbling the ball. In the same way, if a player dribbles from the 18-yard-box line to the penalty spot, they gain 6 progressive carry yards.
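The yardage arithmetic above is simple enough to sketch directly. Positions here are expressed as yards from the opposition goal line, a hypothetical coordinate choice for illustration:

```python
# Progressive yards: how far a pass or carry moves the ball towards the
# opposition goal, measured along the length of the pitch.

def progressive_yards(start_dist_to_goal, end_dist_to_goal):
    """Yards gained towards the opposition goal; 0 if the ball goes backwards."""
    return max(0, start_dist_to_goal - end_dist_to_goal)

# Pass from the 18-yard-box line to the 6-yard-box line, as in the example above
print(progressive_yards(18, 6))   # 12
# Carry from the 18-yard-box line to the penalty spot (12 yards out)
print(progressive_yards(18, 12))  # 6
```

Summing these values per player, per match, gives the progressive passing and carrying totals discussed above.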
Solly March has been the primary progressor for Brighton so far this season:
PPDA (Passes allowed Per Defensive Action)
No, it isn’t something that you’ve been mis-sold. Passes per defensive action, abbreviated to PPDA, seeks to quantify the pressing rate of a team. For this metric, lower means a side is pressing more frequently, as there are fewer opposition passes, on average, between their defensive actions. Brighton’s PPDA over recent seasons (league ranking in brackets):
· 2017/18: 15.57 (20th)
· 2018/19: 13.63 (14th)
· 2019/20: 10.36 (8th)
· 2020/21 (so far): 8.69 (4th)
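As a rough sketch of how the metric is computed: divide the passes the opposition completed by the number of defensive actions made against them. The match numbers below are invented, and real providers typically only count events in a defined area of the pitch (for example, outside a team's own defensive third), which this toy version ignores:

```python
# PPDA sketch: opposition passes allowed per defensive action
# (tackles, interceptions, challenges, fouls).

def ppda(opposition_passes, defensive_actions):
    """Lower = more aggressive press: fewer passes allowed per defensive action."""
    return round(opposition_passes / defensive_actions, 2)

# Invented match: 400 opposition passes against 40 defensive actions
print(ppda(400, 40))  # 10.0
```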
Another metric which should synergise with the eye test; it is clear that Brighton pressed more under Potter than Hughton, and the trend seemingly continues this season, though the sample size is still too small to draw any concrete conclusions from.
This metric may be of more use than pressures alone, though these can also provide insight. PPDA is naturally comparative, because it contextualises for the amount of time a team does or does not have the ball. Pressures, however, are simply a raw number. Brighton’s high possession approach would limit the frequency of opposition possessions, which thus limits the number of opportunities to press. The solution for negating the influence of possession will be covered later in the piece.
Neal Maupay has been a kingpin in Brighton’s front-foot pressing approach.
Interestingly, in the 9 games post-lockdown last season, Brighton somewhat reverted to the style they showed under Hughton, becoming the least frequent pressing side in the division, though they did take 12 points from their final 9 fixtures.
Brighton’s evolving pressing output will no doubt be supplemented by the addition of Adam Lallana. Despite limited minutes over the past few seasons, he still recorded a pitch-wide pressing output well above Brighton’s average for the 19/20 season.
Shot-Creating Actions (SCA)
Key passes, shot assists, chances created. Take your pick. All those terms describe a pass which directly precedes a shot. These are generally decent for ascertaining which players are good at creating chances for teammates through final passes, though this approach is porous.
Take Jose Izquierdo: in the 2017/18 season he recorded 9 dribbles which led to a shot, yet under the above measurement criteria none of them would be considered a chance created, simply because he did not play the ball to anyone else.
Seems unfair, right? Players can, and do, create chances for themselves, as well as through means other than final passes. StatsBomb propose a more nuanced solution in shot-creating actions, which look at the final two actions before a shot occurs, allowing players to be credited for creating chances in the following ways:
· Live passes (in open play)
· Dead passes (from set pieces; corners, free-kicks, throw-ins)
· Shots (which then lead to another shot)
· Fouls won
Winning fouls is another underrated method of chance creation; Brighton fans are particularly appreciative of Aaron Connolly and Tariq Lamptey’s ability in this department.
Additionally, shot-creating actions look at the action before the shot-creating one, which distributes assists in the same way as in hockey. On top of shot-creating actions, there’s goal-creating actions, measured in the exact same way, just exclusively for shots which are scored. Whilst chance creation was not a massively flawed metric before this, a more nuanced approach — which considers the ability to create through methods aside from passing — is certainly more useful for coaches, analysts and fans. Furthermore, the value of acknowledging the assist of the assist — the pre-assist, or second assist, if you like — means that players who operate in build-up phases can gain statistical recognition.
For instance, Ben White fed Tariq Lamptey for the dribble that led to the penalty at Newcastle, and found Maupay with the vertical ball from which he assisted Connolly for the third goal on Tyneside. In key pass terms, Ben White is not considered part of either goal involvement, but from a goal-creating actions perspective he is rightly recognised.
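The two-action lookback can be sketched as below. The event list is a simplified reconstruction of the third Newcastle goal described above, and the counting logic is a generic illustration, not any provider's actual implementation:

```python
# Sketch of shot-creating actions: credit the two actions
# immediately preceding each shot.
from collections import Counter

# (player, action) in chronological order, ending in a shot
events = [
    ("White", "pass"),    # the vertical ball
    ("Maupay", "pass"),   # the assist
    ("Connolly", "shot"),
]

sca = Counter()
for i, (player, action) in enumerate(events):
    if action == "shot":
        # the two actions before the shot each earn one SCA
        for prev_player, _ in events[max(0, i - 2):i]:
            sca[prev_player] += 1

print(sca["White"], sca["Maupay"])  # 1 1
```

If the shot had been scored, the same two credits would count as goal-creating actions instead.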
Statistical Filters
You’ll likely have heard the saying that “numbers don’t lie”. That’s true, they never lie, but they don’t always tell the full truth. The big reason for this is that numbers are so easy to take out of context; see exhibit A, the tweet below:
Now, as exceptional a forward as Glenn Murray is, I think it is fair to say that Karim Benzema slightly edges him in terms of quality — it’s marginal, though. So many factors affect performance: team tactics, minutes played, quality of opposition. That’s just to name a few. As such, a few filters have been applied to statistics to try and even the hypothetical playing field, purely to make comparison more realistic and worthwhile:
· Per 90 minutes
· Per 100 touches
· Possession adjusted
Let’s break these down then. They’re all relatively simple to understand, with per 90 minutes (often written as p90) the most common of the three. This takes a player’s total output over a given time period, likely a season, and equates it to what they average per 90 minutes, the standard length of an association football match. See this as what a player averages per game. Be careful here, because decimals start to get involved, so be conscious that rounding, or a lack of it, can skew numbers.
Sample sizes are important here, too. Last season, for example, Florin Andone averaged 1.23 goals per 90 in a Brighton shirt; he played 73 minutes and scored once. There is no universally agreed minimum sample size, but size matters here: the bigger the better. Anything above 500 minutes is usually considered relatively acceptable, as this equates to over five full matches, which on paper is a big enough scope to see the true abilities of a player.
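The per 90 calculation itself is trivial; a sketch using the Andone figures above shows both the arithmetic and why tiny samples mislead:

```python
# Per 90: total output scaled to a standard 90-minute match.

def per90(total, minutes):
    return round(total / minutes * 90, 2)

# Andone, last season: 1 goal in 73 minutes
print(per90(1, 73))  # 1.23 -- an elite-looking rate from a tiny sample
```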
Please, please, please never look at appearances, look at minutes played. A player coming off the bench in second-half injury time and one who plays the full match both get credited with the same number of appearances, but the reality is very different to that.
Per 90 is typically used from a recruitment standpoint, particularly beneficial for comparing players across different leagues with unidentical numbers of teams; for example, in the Championship teams play 46 games, giving players a possible 4,140 minutes of football. In the Premier League, that figure falls to 3,420 minutes. A practical example of this:
Pascal Gross recorded 116 shot-creating actions (SCA) in the 2017/18 season, 39 more than the 77 he managed in 2018/19. Those raw numbers imply a decline in creative output, but Gross was injured for a large portion of the 18/19 season and recorded over 1,000 fewer minutes played. Applying the per 90 filter, Gross was actually better as a shot-creator in 2018/19, recording 3.72 SCA per 90 minutes, slightly ahead of the 3.54 he managed in 2017/18.
Now for possession adjusted. Probably the most complex of the three, this seeks to analyse player performance in the context of how much their team does or does not have the ball. Typically abbreviated to PAdj, and often coupled with the per 90 filter, this compensates for the impact of possession on performance. Teams who have low possession, like Brighton under Hughton, give their players fewer opportunities to record in-possession actions, like dribbles, passes and shots. Teams who have high possession, like Brighton under Potter, give their players fewer opportunities to record out-of-possession actions, like pressures, tackles and interceptions.
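A deliberately simple version of the idea can be sketched by scaling a defensive stat against a 50/50 possession baseline. Real providers use more sophisticated models than this linear scaling (some apply sigmoid weighting, for instance), so treat it purely as an illustration of the principle:

```python
# Toy possession adjustment: scale an out-of-possession stat by how often
# the opposition actually had the ball, relative to a 50/50 baseline.

def padj(defensive_stat, opposition_possession_pct):
    """Boost stats for players whose teams rarely let the opposition have the ball."""
    return round(defensive_stat * (50 / opposition_possession_pct), 2)

# A centre-back with 1.0 tackles per 90 whose team's opponents
# only average 45% possession gets a modest upward adjustment:
print(padj(1.0, 45))  # 1.11
```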
Last season, if you take non-adjusted stats, Lewis Dunk ranked within the bottom 25% of Premier League centre-backs for tackles per 90 minutes. If you adjust this for possession, considering that Brighton averaged the 7th most possession in the league, he climbs above more than 40% of Premier League centre-backs. For in-possession stats, adjustment is typically based on a player’s output per 100 touches of the ball.
Keeping Lewis Dunk as the example, he ranked within the top 25% of Premier League centre-backs for passes into the final third and progressive yards — which hopefully you’re now an expert on — per non-possession adjusted 90 minutes. But, since Brighton had more of the ball than most teams, Dunk benefited from having more touches of the ball and thus more opportunity to pass to the final third and gain yards. When adjusting for possession, Dunk drops to the 70th and 61st percentiles for those respective metrics; this output is still very good, by the way.
Hopefully this piece helped you understand the minefield that is football analytics. If you have any questions or would like to follow the Twitter page, please feel free to shout us @AlbionAnalytics.
Patreon: Those who want to support the page and unlock exclusive content can do so, over at: https://www.patreon.com/albionanalytics