How I accidentally became a data scientist
Or why Ben Brereton Diaz changed his name.
I've never really been into data and statistics. I'm not great at maths and it gets tiring figuring things out from first principles every time I need to do some calculation. And while I'm pretty impressed with what Microsoft Excel (and other spreadsheets) can do, and how empowering they are, they've never interested me.
But that may be about to change.
If you've never played the Football Manager (formerly Championship Manager) series of games, you might not realise the depth that these games have. I like the games because you get to make tactical tweaks during the game itself. In the olden days you would watch small circles move around a pitch, notice that your left-back circle was getting consistently beaten by their right winger circle and tell your left-back circle to play deeper, or get your defensive midfielder circle to stay back.
But as time has gone by, FM has become bigger, deeper and, effectively, a football simulator, rather than just a game. The database of players has grown to cover every major league (and many minor ones) around the world. And each player has a set of attributes (pace, finishing, heading etc.) describing their abilities - these are painstakingly researched by a team of data analysts. In fact, the Football Manager database is so highly regarded that The Athletic, last summer, ran a series of articles rating player transfers in the real football leagues and they included Football Manager "ability" scores as part of their rating.
Aside: the story is that the Chilean national team was hunting for a new striker, eligible to play for the country. After searching the Football Manager database, they discovered that Ben Brereton, a Nottingham Forest academy graduate and kid from Stoke, was half-Chilean. They asked him to join their squad, he set the Copa America alight and became a cult hero in his newly adopted country. As a result, he adopted the Spanish tradition and added his mother's surname to his father's - becoming Ben Brereton Diaz.
The game is absolutely enormous and, like one of those open-world games, you can choose how you want to play it.
I mean, here is the list of roles and responsibilities at your club.
If you want to act as a director of football, overseeing the business side of things and delegating the day-to-day stuff to your staff, you can do. If you're a "tracksuit manager" and want to focus on training and the development of players, you can do. For me, I'm interested in match-days, choosing my team, tweaking my tactics and responding to events during the game.
However, there's a growing trend in Football Manager for running your club along Moneyball lines. "Moneyball: The Art of Winning an Unfair Game" was a book about Billy Beane and the Oakland Athletics baseball team. The underlying premise was that the way that clubs scouted new players was outdated. Instead of travelling round the country to watch players and relying on personal accounts and judgements, the Athletics identified particular traits that they wanted to improve within their team, looked through the statistical data for players who matched those traits, then went and scouted the players no-one else was interested in.
I know nothing about baseball, so here's the football equivalent.
Let's say your side has a big hefty number nine who's great at scoring headers. To feed your striker, you need crosses. So, when you're buying a new right winger you need to look at the "number of crosses per 90 minutes" and "successful cross completion percentage". If you can pick out players who excel in those two statistics, you've found your perfect candidate. You're not looking for a winger who's good at interceptions, at pressing, who can play a killer through ball or other skills that everyone else is looking for. All you want is accurate crosses and lots of them. And because their other stats might not be that hot, it's likely that their transfer fees will be lower.
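For the curious, here's roughly what that search looks like as a few lines of Python. This is only a sketch: the players, the numbers, the thresholds and the names (Winger, shortlist) are all made up for illustration and don't come from any real database.

```python
from dataclasses import dataclass


@dataclass
class Winger:
    name: str
    crosses_per_90: float        # crosses attempted per 90 minutes
    cross_completion_pct: float  # percentage of crosses that find a team-mate
    asking_price_m: float        # asking price in millions (invented figures)


def shortlist(players, min_crosses, min_completion):
    """Keep wingers who clear both crossing thresholds, cheapest first."""
    picks = [
        p for p in players
        if p.crosses_per_90 >= min_crosses
        and p.cross_completion_pct >= min_completion
    ]
    return sorted(picks, key=lambda p: p.asking_price_m)


# Hypothetical scouting results.
candidates = [
    Winger("Winger A", 7.2, 31.0, 4.5),
    Winger("Winger B", 3.1, 38.0, 22.0),  # accurate, but rarely crosses
    Winger("Winger C", 6.8, 18.0, 3.0),   # crosses a lot, but wastefully
    Winger("Winger D", 6.5, 29.5, 1.8),
]

picks = shortlist(candidates, min_crosses=6.0, min_completion=25.0)
for p in picks:
    print(p.name, p.asking_price_m)
```

Notice that nothing else about the player gets a look-in. Only the two crossing numbers decide who makes the list, and the price just sorts what's left.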
In the Premier League, Brighton and Brentford are two clubs who have over-performed using data analysis. Brighton, in particular, have a reputation for finding obscure players for very little money, who perform incredibly well and then get sold on for many times their original transfer fee.
And now people are bringing Moneyball to Football Manager.
As well as the player attributes you can see, FM has a whole host of hidden attributes that the game engine uses to add some unpredictability. So, when scouting for new players, you can see their main attributes, you can get a scout report (from your scouting team, who, depending upon your budget, may or may not be accurate) but, without those hidden attributes, you're always taking a gamble with every signing.
However, FM also keeps track of statistics, similar to the ones used by Opta or Transfermarkt. Things like "key passes per 90 minutes", "headers won", "expected goals", "expected assists". These are compiled as the season progresses and the engine simulates games in all the other leagues across your game world. The problem is it doesn't surface this information directly and it's definitely not searchable.
This is where the FM obsessives come in.
Football Manager allows for "skins" - alternative UIs - to be added. So FM boffins have built skins that show these stats on the player profile, search and scouting pages. You can ask your scout to find "right wingers in the top European leagues", then view their stats (for this season) on the recruitment page.
Then, the data people took it even further.
You can export your scouting results list to an HTML table. Copy and paste this table into a specially designed spreadsheet, and you can filter the results to find the players who meet your criteria.
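If spreadsheets aren't your thing, the same idea can be sketched in Python using only the standard library. I'm guessing at the shape of the export here: the column names (Name, Crosses/90, Cross %) and the sample table are invented, and a real FM export will almost certainly look different.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collects the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def read_table(html):
    """Return one dict per player, keyed by the header row."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Invented stand-in for an exported scouting list.
SAMPLE = """
<table>
<tr><th>Name</th><th>Crosses/90</th><th>Cross %</th></tr>
<tr><td>Winger A</td><td>7.2</td><td>31</td></tr>
<tr><td>Winger B</td><td>3.1</td><td>38</td></tr>
</table>
"""

players = read_table(SAMPLE)
frequent_crossers = [p["Name"] for p in players if float(p["Crosses/90"]) >= 6.0]
print(frequent_crossers)
```

Every value comes out of the table as text, so anything you want to compare numerically needs converting first, which is what the float() call is doing.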
But the really clever bit is this. The spreadsheet doesn't only let you filter your own results. These crazy people have run ten-year simulations of Football Manager, exported the collective statistics of all players over those ten years, and added them into the spreadsheet. When you paste in your search results, it colour-codes each statistic and can tell you how these players compare to the baselines from their massive database. You want a winger who is in the top 10% in the world for cross completion? The spreadsheet will show you which percentile they are in, compared to everyone else in the Football Manager world.
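I don't know exactly how the spreadsheet does its sums, but a percentile comparison against a baseline can be sketched like this. The baseline numbers, the colour bands and the function names (percentile_rank, colour_band) are my own inventions, not taken from the real spreadsheet.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, baseline):
    """Percentage of baseline values below `value`, counting ties as half."""
    ordered = sorted(baseline)
    below = bisect_left(ordered, value)
    ties = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def colour_band(pct):
    """Map a percentile to a traffic-light colour (thresholds are arbitrary)."""
    if pct >= 90:
        return "green"
    if pct >= 50:
        return "amber"
    return "red"


# Invented baseline: cross completion percentages from a long simulation.
baseline_cross_pct = [12, 15, 18, 20, 22, 24, 25, 27, 30, 35]

rank = percentile_rank(31, baseline_cross_pct)
print(rank, colour_band(rank))
```

With a real baseline of thousands of players, the same two functions would tell you whether your target winger is genuinely in the top 10% or just looks good next to the other names in your search.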
Now you really can Moneyball your season.
Choose your formation and how you want the team to play. Pick out the key statistics required for each role. Run a search for players and then choose the ones who outperform the rest in those stats. Because your search is so focussed, you will find players that everyone else overlooks, meaning you can get them on the cheap. And, once they're a success, sell them on for a huge fee and use the money to grow your club.
It's enough to make me want a copy of Excel.