Friday 7 September 2018

If Carlsberg did FPL spreadsheets....

They would probably be like mine.

Some spreadsheets simply track each team's fixtures in gameweek order.  Some incorporate fixture difficulty ratings, with the aim of identifying favourable/unfavourable sequences to assist with managers' transfer decisions.  Although, as Richard Kenny @InfernoSix showed, FDR is a very blunt instrument when it comes to assessing fixture difficulty.  And some spreadsheets even assign predicted FPL scores to players based on a wide variety of algorithms, including RMT.  Mine do many of these things and more, but probably better.


With a week of the first international break to go, I am inviting FPL managers to sign up to receive weekly first draft versions of my FPL Cheat Sheets to test-fly during the next four-gameweek period between the first and second international breaks.  The first wave of test pilots will only need access to Excel, but I am planning to replicate my tables on Google Sheets soon.

Managers with active wildcards might be especially interested.  Albeit they'd need to understand that the data used by my team ratings formula is currently split fifty-fifty between results from last season and this.  That said, even whilst not yet firing on all cylinders, my spreadsheets are currently outperforming other pay-as-you-go predictive algorithms I've compared them to.

How I Got Here

I first came across the concept of season tickers during my first proper FPL preseason.  I immediately liked the concept, but as my second season progressed I grew dissatisfied with what I perceived to be the arbitrary nature of team rankings, and distrustful of how often they'd be updated to reflect new realities.  Having finished 17,394th in my second season, I believed there was plenty of scope for improvement, and so I set out to learn what I could do with my desktop Excel program, which I'd never had any use for previously.

I went into my third season determined to fly solo and rely solely on my own formulations.  With no formal background in maths whatsoever, I intuited a method of calculating the offensive and defensive strengths of each team relative to all the others.  As my Excel know-how grew I had several Eureka! moments that led to my second blog (December 2015) proclaiming my first prototype of a season ticker spreadsheet based on what I would later come to understand as a crude form of xG, though I'd never heard of that concept at that point.   My spreadsheets were constantly being refined throughout that season, but I nonetheless achieved a new personal best finish of 12,599th.

Unfortunately for my average overall rank though, I cut corners the following season by substituting my homegrown variety of xG with the much less labour intensive, and more readily available, Shots On Target data.  This was considered by many FPL managers at that time to be the most important metric of all for predicting attacking returns.  See One Stat To Rule Them All for example.  My complacency and laziness proved disastrous, and I did well to finish inside the top one hundred and fifty thousand (148,327) having been still outside the top two million going into Gameweek 12.

When I first met David Wardale @DavvaWavva during the summer of 2017 to be interviewed for his excellent book about fantasy football - Wasting Your Wildcard - I was feeling very bullish about my fifth season prospects.  This wasn't the same blind optimism most players of FPL experience before the first bunting of red flags or first quiver of red arrows.  No, my optimism was based entirely on my discovery of a reliable source of xG data.  I'd done tonnes of research into the many different models used to calculate xG, and had settled on a source I trusted to take my spreadsheets to the next level.

In the course of that research into xG models, I'd been very pleased to learn that my homegrown method was almost identical to that now used by many professional sports bettors.  This was also when I learned of a concept called Poisson distribution, which came to revolutionise my understanding of most likely correct scores based on my spreadsheet calculations, but more on that fishy sounding concept later.


Where I Am


When deciding where to 'invest' their money, professional sports bettors use websites like FiveThirtyEight and ClubElo.  These were reckoned to be amongst the very best last season by alex b @fussbALEXperte who, apart from writing excellent articles about football and psychology, also measures the quality of football predictions.  After taking a close look at the methods and principles used by FiveThirtyEight and ClubElo I was pleased to see that they are essentially the same as mine.  Coincidentally, the latter was also credited by the aforementioned Richard Kenny as offering a more reliable barometer of teams' attacking and defending strengths than FDR.


As far as I am aware though, these sites focus on upcoming matches only, which makes perfect sense given that most of the bettors they cater for have little interest in betting on the result of league matches several weeks away.  After all, there are a multitude of variables that can change teams' future prospects in the meantime, e.g., injuries, morale, suspensions, transfers, etc.

Catering for FPL managers is a different ball game, however, as they must plan ahead to be successful.  They only have two wildcards per season and one free transfer per week, so my spreadsheets are just as focused on long-term projections about teams' forthcoming fixture runs as they are on more immediate short-term predictions.


What I Do

After each gameweek I carefully enter the expected goals scored and conceded values reported by my preferred source, for all of the matches in the latest round of fixtures, into pre-prepared cells on my spreadsheets.

These are then systematically adjusted and weighted to give every team individual values for strength in attack and defence, for both home and away.


The rationale for distinguishing between home and away form is simply that many teams approach away games differently to when playing at home, often changing their formations and personnel in the process.
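
For readers curious how per-match xG might be turned into separate home and away ratings, here is a minimal Python sketch.  It is not the formula behind these spreadsheets, whose adjustments and weightings are not spelled out in this post; it simply divides each team's average xG for and against by the league average at the same venue.  The match data and the name team_strengths are invented for illustration.

```python
from collections import defaultdict

# Hypothetical xG records: (home_team, away_team, home_xg, away_xg).
MATCHES = [
    ("Arsenal", "Burnley", 2.1, 0.6),
    ("Burnley", "Chelsea", 0.9, 1.8),
    ("Chelsea", "Arsenal", 1.5, 1.2),
]


def team_strengths(matches):
    """Rate each team's attack and defence, home and away, against the league average.

    1.0 is league average. A higher attack rating is better; a lower
    defence rating is better (less xG conceded than average).
    """
    avg_home_xg = sum(m[2] for m in matches) / len(matches)
    avg_away_xg = sum(m[3] for m in matches) / len(matches)

    # sums[team][venue] = [xG for, xG against, games played]
    sums = defaultdict(lambda: {"home": [0.0, 0.0, 0], "away": [0.0, 0.0, 0]})
    for home, away, home_xg, away_xg in matches:
        for team, venue, xg_for, xg_against in (
            (home, "home", home_xg, away_xg),
            (away, "away", away_xg, home_xg),
        ):
            record = sums[team][venue]
            record[0] += xg_for
            record[1] += xg_against
            record[2] += 1

    ratings = {}
    for team, venues in sums.items():
        ratings[team] = {}
        for venue, (xg_for, xg_against, games) in venues.items():
            if games == 0:
                continue  # no matches at this venue yet
            # Home attacks are compared with the average home xG, and home
            # defences with the average away xG (and vice versa when away).
            avg_for = avg_home_xg if venue == "home" else avg_away_xg
            avg_against = avg_away_xg if venue == "home" else avg_home_xg
            ratings[team][venue + "_attack"] = xg_for / games / avg_for
            ratings[team][venue + "_defence"] = xg_against / games / avg_against
    return ratings


ratings = team_strengths(MATCHES)
print(ratings["Arsenal"])
```

A real version would need far more matches per team, plus some form of weighting towards recent games, before the ratings meant anything.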


The sophisticated part comes next, when I model what future adjusted and weighted spreadsheets will look like if, and admittedly it is a big if, my spreadsheets' current predictions correlate precisely with actual outcomes.  Clearly, this is never going to happen, but I do find these dynamic xG projections to be more reliable than static ones that assume the status quo will remain relevant.


The analogy I use is modern weather forecasts, and how they are based on computer simulations that evolve the state of the atmosphere forward in time using an understanding of physics and fluid dynamics.  They attempt to predict what the weather will be in the future, not what it is now.


Extrapolating from the values my spreadsheet assigns to every team allows me to do many things, including sorting teams by the predicted number of expected goals that will be scored and/or conceded over any number of future gameweeks for any range of gameweeks desired.  It also enables me to produce weekly correct score predictions.
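
As a rough illustration of that sorting step, the Python sketch below totals projected expected goals for each team over a chosen range of gameweeks and ranks them.  Everything in it is an assumption made for the example: the team names, the ratings, the league-average figure, and the simple rule of multiplying a team's attack rating by its opponent's defence rating.  For brevity it also ignores the home and away split.

```python
# Hypothetical ratings, where 1.0 is league average.
LEAGUE_AVG_XG = 1.3  # assumed average xG per team per match
ATTACK = {"A": 1.4, "B": 0.8, "C": 1.0}
DEFENCE = {"A": 0.7, "B": 1.2, "C": 1.0}  # xG conceded relative to average

# FIXTURES[gameweek] = list of (home, away) pairings (invented).
FIXTURES = {
    5: [("A", "B")],
    6: [("C", "A")],
    7: [("B", "C")],
}


def projected_xg(team, opponent):
    """Expected goals for `team` against `opponent` under the simple rule above."""
    return LEAGUE_AVG_XG * ATTACK[team] * DEFENCE[opponent]


def ticker(first_gw, last_gw):
    """Total projected xG per team over the gameweek range, highest first."""
    totals = {team: 0.0 for team in ATTACK}
    for gw in range(first_gw, last_gw + 1):
        for home, away in FIXTURES.get(gw, []):
            totals[home] += projected_xg(home, away)
            totals[away] += projected_xg(away, home)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


for team, total in ticker(5, 6):
    print(team, round(total, 2))
```

Swapping the roles of attack and defence in projected_xg would give the matching table for expected goals conceded.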

An important distinction to make here is that what my spreadsheets predict is 'expected goals', not actual goals.  As can be observed in the match stats shown during Match Of The Day post-match interviews with club managers, these often don't correspond with each other.

Before coming to the vexed question of what the point of generating theoretical goals is if they can differ markedly from goals scored in reality, I should add that over the course of a season there's usually very little to separate the total number of goals scored in the Premier League from those expected by the xG model I use.  Last season, for instance, total expected goals for all teams only exceeded actual goals by 46, which works out as an average of under 0.1 of a goal per match per team.
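
The per-match figure is easy to check.  The short Python snippet below assumes the standard 380-match Premier League season (20 teams playing 38 matches each) and takes the surplus of 46 from the paragraph above.

```python
MATCHES_PER_SEASON = 20 * 38 // 2  # 380: each match involves two of the 20 teams
XG_SURPLUS = 46                    # expected goals minus actual goals, from the text

# Two teams play in every match, so there are 760 "team-matches" in a season.
per_match_per_team = XG_SURPLUS / (MATCHES_PER_SEASON * 2)
print(round(per_match_per_team, 3))  # 0.061
```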

Why xG?

Have you ever watched a game of football and seen the team with the biggest chances to score goals lose?  The answer is almost certainly yes, so there's your answer.  Before I started playing FPL, unfortunate events like calamitous errors by individual players, poor refereeing decisions, and unlucky deflections, were all just grist to the opinion mill.  It was only when I started recording match results onto my spreadsheets, and saw the distorting effect such simplistic data had on my team rating equations, that I came to realise all goals ought not to be accorded equal significance.

After 7 matches last season, Crystal Palace had yet to score any points or goals.  By the standard measure of actual goals they were destined to drop down into the Championship.  Clearly, they were a team who couldn't score goals, and on that basis, wouldn't score the goals needed to avoid relegation.  The Expected Goals model, however, told a different story.  One in which they were cast as merely the unlucky victim of variance who had experienced the unfairest results up until that point.  And the villain of the piece was our old friend 'standard deviation'.

Even before I came across xG on my fantasy football travels, my homegrown variety had me start Leicester's title-winning season with Vardy and Albrighton in my FPL squad (alas not Mahrez), at a time when they were very unfashionable picks.  My DIY xG spreadsheet had identified Leicester as a vastly underestimated team judging from performances during their 'great escape' the season before.  I remember well my increasing exasperation with radio and television pundits alike midway through that season as they all declared relegation for the Foxes a foregone conclusion.

These, and countless other examples besides, have led me to confidently conclude that xG gives a more accurate picture of the attacking and defending strengths and weaknesses of teams than do the Goals For and Goals Against columns in a league table.

Predictably enough though, there are still large swathes of the FPL community yet to embrace this revolutionary way of understanding football matches.  My Twitter feed is still littered with uninformed commentary and misguided sarcasm, from some of the most followed Twitter accounts, at the expense of the xG approach.

To my mind though, these snipers and swipers are akin to 'flat earthers' denying the earth is round.  I expect Tony Bloom and Matthew Benham wouldn't have any sympathy for these conspiracy theorists either, given that they were able to buy football clubs (Brighton & Hove Albion, and FC Midtjylland and Brentford, respectively) with the proceeds from their xG-based sports betting operations.

What's The Plan?

I've been refining and perfecting my spreadsheets for several years now, and they've helped me to sub-twenty thousand finishes in three out of the last four seasons, the most recent being a decent 4,733rd last season.

What I've enjoyed most about letting my xG spreadsheets govern my decisions is that they often promote going against the grain of template teams, against the flow of groupthink, and against the tide of crowd wisdom.  And yet, the maverick moves I've made have generally kept me ahead of the curve.

The time and effort I put into my spreadsheets has sometimes been hinted at in screenshots I've shared on social media posts, but the time has now come for me to make them more readily available to other FPL managers.  There are four more gameweeks to go before my 2018-19 spreadsheets have enough data from the current season to dispense with data from the previous one.  Conveniently enough this period takes us up to the second international break.  With the help of feedback from my triallists I am sure to be kept busy in the meantime troubleshooting issues and ironing out wrinkles.


Another new development to be tested is the application of expected goals and assists values to individual players, rather than teams only.  Previously, I just looked at these for players from teams highlighted by my spreadsheets as being of most interest.  But I only ever did so on an ad hoc basis.  I believe a more systematic approach this time around, however, could prove a very powerful tool for prioritising transfer targets.


Towards that end, I have created a complex algorithm that takes expected goals and assists into account to calculate average expected FPL points for the next six gameweeks (see right).

What I will be providing volunteers with, however, is a much more user-friendly worksheet with easily sortable columns (see below).


The scores contained there will need to be refreshed each gameweek to reflect new data, as the rolling four gameweek window onto player form moves inexorably towards season's end.
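
To give a flavour of how expected goals and assists over a rolling window might be converted into expected FPL points, here is a deliberately simplified Python sketch.  It is not the algorithm behind these worksheets, which is not described in this post.  It uses the standard FPL points for goals by position and for assists, ignores appearance, clean sheet, and bonus points, and runs on invented xG and xA figures.

```python
# Standard FPL points for a goal, by position, and for an assist.
GOAL_POINTS = {"GK": 6, "DEF": 6, "MID": 5, "FWD": 4}
ASSIST_POINTS = 3
WINDOW = 4  # rolling window of gameweeks, as mentioned in the text


def expected_attacking_points(position, xg_by_gw, xa_by_gw, window=WINDOW):
    """Average expected points per gameweek from goals and assists only.

    xg_by_gw and xa_by_gw are lists in gameweek order; only the most
    recent `window` entries are used.
    """
    recent_xg = xg_by_gw[-window:]
    recent_xa = xa_by_gw[-window:]
    games = len(recent_xg)
    if games == 0:
        return 0.0
    total = sum(recent_xg) * GOAL_POINTS[position] + sum(recent_xa) * ASSIST_POINTS
    return total / games


# Hypothetical midfielder: five gameweeks of data, of which the last four count.
xg = [0.2, 0.5, 0.4, 0.6, 0.5]
xa = [0.1, 0.1, 0.3, 0.2, 0.2]
print(round(expected_attacking_points("MID", xg, xa), 2))
```

Because the window slides forward each gameweek, the oldest figures drop out as new ones arrive, which is why the scores need refreshing every week.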


Before each new gameweek, therefore, managers will be sent new worksheets that will supersede previous ones.

The elusive nature of form in football, however, and the sensitivity of my algorithms, mean that fluctuations in my player and team ratings are inevitable, but, if my spreadsheets perform well, these should be gradual rather than volatile.

Ultimately though, my spreadsheets will not eradicate difficult decisions regarding captaincy and transfers.  They should be used alongside managers' judgement and knowledge, not instead of them.

Getting the most from my cheat sheets will depend greatly on synthesising them with managers' instincts. The onus will still be on managers to make adjustments and allowances for events like key injuries (and returns from injury), suspensions, transfers, etc., the significance of which cannot be immediately captured by my spreadsheets.


Reality Check

The most common scoreline in the Premier League last season was one-one, which happened just under twelve percent of the time (11.84%).  The next most common score was one-nil (11.58%).  And then there were just as many nil-nil draws as there were two-one wins (8.42%), meaning that in just under a third of all matches (31.84%) neither team scored two goals or more.

Using a Poisson distribution applet I was able to calculate that the highest probability a 1-1 scoreline could ever have is roughly 13.53%.  That's longer than 6/1 in fractions, but shorter than 13/2.  Accepting any odds of 6/1 or less (11/2, 5/1, 9/2, etc.) for a 1-1 score draw, therefore, would mark you out as a 'mug punter'.
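
The ceiling on a 1-1 scoreline can be reproduced without an applet.  The Python sketch below assumes each team's goals follow an independent Poisson distribution, then scans a grid of expected-goal values to find where 1-1 is most likely.  The function names are made up for this example, and the independence assumption is a simplification.

```python
import math


def poisson_pmf(goals, expected):
    """Probability of exactly `goals` when `expected` goals are anticipated."""
    return expected ** goals * math.exp(-expected) / math.factorial(goals)


def scoreline_probability(home_goals, away_goals, home_xg, away_xg):
    """Chance of an exact scoreline, treating the two teams as independent."""
    return poisson_pmf(home_goals, home_xg) * poisson_pmf(away_goals, away_xg)


def fair_odds_against(probability):
    """Fair fractional odds expressed as X/1, so 6.0 means 6/1."""
    return (1 - probability) / probability


# Try every pairing of expected goals from 0.1 to 3.0 in steps of 0.1.
best_probability, best_home_xg, best_away_xg = max(
    (scoreline_probability(1, 1, h / 10, a / 10), h / 10, a / 10)
    for h in range(1, 31)
    for a in range(1, 31)
)

print(best_home_xg, best_away_xg)            # expected goals that maximise 1-1
print(round(best_probability * 100, 2))      # ceiling as a percentage
print(round(fair_odds_against(best_probability), 2))
```

The peak comes when both teams are expected to score exactly one goal each, and the fair price lands between 6/1 and 13/2.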


The implications for successfully predicting scorelines are considerable.  Even if we find 10 matches on a football coupon that all had the highest probability of a 1-1 scoreline possible, the chances of getting at least 3 out of the 10 correct will never exceed 29.75%.

In other words, don't be calling my spreadsheets out if they only get one or two score forecasts correct each gameweek, because in reality, achieving 3 out of 10 with any more regularity than once every three gameweeks is against the odds.

As for correctly predicting all the results of the Sky Sports Super 6, you will never have higher than a 0.00061% chance of winning the jackpot.  That's a one-in-one hundred and sixty-two thousand, seven hundred and fifty-two chance (162,751-1), in the best case scenario.  Little wonder then that I've never won it!
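
For anyone who wants to test coupon odds like these for themselves, the Python sketch below treats every match as an independent trial with the same chance of a correct score, taking the 1-1 ceiling discussed earlier as that chance.  Both the independence and the use of one probability for every match are simplifying assumptions, and the function names are invented for the example.

```python
import math

# Best-case chance of a correct 1-1 forecast under the Poisson assumption.
P_CORRECT = math.exp(-2)


def at_least(k, n, p):
    """Binomial chance of k or more correct forecasts out of n."""
    return sum(
        math.comb(n, i) * p ** i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


def all_correct(n, p):
    """Chance of getting every one of n forecasts right."""
    return p ** n


three_of_ten = at_least(3, 10, P_CORRECT)
super_six = all_correct(6, P_CORRECT)

print(round(three_of_ten * 100, 2))  # % chance of 3+ correct from 10
print(round(super_six * 100, 5))     # % chance of all 6 correct
print(round(1 / super_six))          # the "one in N" equivalent
```

Run as written, the chance of three or more correct scores comes out well below the ceiling quoted above, which only strengthens the point, and the Super 6 figure lands in the region of one in 162,750.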

If you were in any doubt about the random nature of much of what happens on a football pitch, then these startling odds should help you better understand the enormity of the task faced by those trying to provide accurate predictions.

If you understand predicting scorelines is difficult, then you will realise that forecasting who will be doing the scoring and assisting can be an even more unpredictable business!  Like weather presenters assuming the weather tomorrow will correspond with that of yesterday, much of what passes for FPL punditry too often presents evidence of what has happened in recent gameweeks as incontrovertible proof of what will happen in future gameweeks.  In my experience though, such linear thinking in FPL rarely works out how we expect it to.  Hence the reason for the phenomenon of 'kneejerking'.

Health Warning

Finally, I should warn all of my triallists about the dangers of dependency.  Use of spreadsheets can become seriously addictive.  Never binge drink algorithms as they can really go to your head.  They might give you Dutch courage to make maverick moves that leave you with a really bad overall rank hangover.

Please drink responsibly.


Cheers!

Coley aka FPL Poker Player @barCOLEYna