On Nov. 30, 2005, the Boston Bruins made a catastrophically piss-poor decision to trade Joe Thornton to the San Jose Sharks. As a result of that trade, San Jose became a perennial contender. Boston did eventually become a great, great team, but trading Thornton could only be described as the agonizingly slow route.
The rationale for the trade was a belief in the gutless Thornton mystique, a story created from the nonsense of his first foray into the playoffs when, playing with a broken rib, he managed only one measly assist in a seven-game series loss to the Montreal Canadiens.
Over the years I have seen Joe Thornton carry mediocre San Jose teams to the playoffs only to be continuously thwarted by Doug Wilson’s inability to get a decent third line…
Or so I think …
My point in all of this is that Joe Thornton, to anyone who knows hockey, is most definitely not a Bruins player…
Unfortunately Facebook’s algorithms are a little bit less aware of the NHL and hockey… and produced this trending result:
Joe Thornton: Bruins’ Thornton fined $2,800 for squirting water
And herein lies the danger of computers and their lack of semantic awareness. They create such delightful moments of absurdity. The algorithms confused Joe Thornton with Shawn Thornton.
Without any knowledge of the internals of the implementation, I can only suppose that Facebook used the fact that I post stuff about Joe Thornton, including posts about how he once played for the Bruins, to surface a note about Shawn.
This is the inherent danger in an era of big data. We have all of this data, and we can run clever algorithms over it to create interesting and useful results that we can immediately share with the world. And most of the time the results are good.
The problem arises when the result is not good.
And it’s not that computers are less reliable than human beings; humans are just as prone to absurd errors. The problem with computers is how we humans interpret the results, or choose to share them without interpreting them at all because they are correct most of the time.
The key difference from the past is that computers back then had access to less data, so our ability to find relationships where none existed was limited. We knew that the data set was too small to trust any result that did not fit our intuition.
But now, with so much data, who is to say that the relationship doesn’t exist?
Perhaps our clever algorithms did find relationships that were only possible because of the amount of data we had collected.
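To make the flip side concrete, here is a small Python sketch of how sheer volume can manufacture relationships. Everything in it is my own illustration: the function names, the 30-row sample size, and the random columns are assumptions, and it has nothing to do with whatever Facebook actually runs. It compares one random column against a growing pile of equally random columns and reports the strongest correlation it stumbles on.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_spurious_correlation(n_columns, n_rows=30, seed=0):
    """Largest |correlation| between a random target and unrelated random columns.

    Hypothetical helper for illustration only: every column is pure noise,
    so any correlation it reports is a coincidence.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_columns):
        column = [rng.gauss(0, 1) for _ in range(n_rows)]
        best = max(best, abs(pearson(target, column)))
    return best


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(strongest_spurious_correlation(n), 3))
```

With the same seed, the call with 1000 columns checks the same first ten columns as the call with 10, plus 990 more, so the reported number can only stay the same or go up. None of those columns means anything, yet the best match keeps looking better.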
And so we need to be careful and humble as we look at the results. Sometimes we will find things in the data that we didn’t know existed, and sometimes we might be looking at the night sky and seeing a horse in the stars.
Because I can assure you that Joe Thornton is both not a Bruins player and not playing hockey right now.
And as I review my post, it’s tempting to ask myself: maybe the data is telling me a truth, that Joe Thornton is not as great in the playoffs as I think he is. Maybe I need to listen to the fact that he was -6 and scored only 3 points this year, and that this had something to do with the performance of the Sharks. Maybe. Or not.