The Scrabble Leaderboard
For some time now we have used the “Scrabble Club” as our baseline technical test for developers, and we have seen other organisations use the same test for their own technical entrance tests. It has served us all well.
There are no secrets or tricks to delivering a good solution to this exercise, but there can be a sting in the tail if you don’t heed one piece of advice: let the most important part of the final system inform your initial design. For more, read on.
There is a natural tendency in software to start with the easiest part of a development, get it working, then move on to enhancing the solution to meet the rest of the requirements. Sometimes this works just fine and is exactly what “emergent design” advocates. As a developer it is great: seeing working code early gives us all that feel-good sense of progress.
With the Scrabble Club this “emergent” approach generally means writing some basic CRUD to get entities like players and matches working. The problem with this is that the simplest CRUD schema can make it very difficult to generate the leaderboard using a single efficient database query. Once they hit this barrier, the developer usually decides that it is not the end of the world, as they can use the programming language to pull all the necessary data from the database and then loop to sort and select the answer.
The issue with this (and it is a big issue, a real show-stopper) is that everything works fine only so long as the data set is small. Once the database starts to grow, the impact on the application is catastrophic. In short, this “solution” does not scale.
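To make the contrast concrete, here is a minimal sketch of both approaches side by side. The schema (a players table and a scores table), the sample data and the ranking rule (highest average score first) are all assumptions made for illustration; they are not taken from the Scrabble Club requirements, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical schema: one row per player, one row per score a player records.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scores (
        player_id INTEGER NOT NULL REFERENCES players(id),
        score INTEGER NOT NULL
    );
""")
conn.executemany("INSERT INTO players VALUES (?, ?)",
                 [(1, "Ann"), (2, "Bob"), (3, "Cy")])
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [(1, 400), (1, 300), (2, 380), (2, 390), (3, 250)])


def leaderboard_by_looping(conn, top_n):
    """Query then loop: every score row is pulled into application memory."""
    scores_by_player = {}
    rows = conn.execute(
        "SELECT p.name, s.score FROM players p JOIN scores s ON s.player_id = p.id"
    )
    for name, score in rows:
        scores_by_player.setdefault(name, []).append(score)
    averages = [(name, sum(v) / len(v)) for name, v in scores_by_player.items()]
    # Sort by average descending, then by name to break ties.
    averages.sort(key=lambda row: (-row[1], row[0]))
    return averages[:top_n]


def leaderboard_by_query(conn, top_n):
    """Single query: the database aggregates, sorts and limits the result."""
    return conn.execute(
        """
        SELECT p.name, AVG(s.score) AS average_score
        FROM players p JOIN scores s ON s.player_id = p.id
        GROUP BY p.id, p.name
        ORDER BY average_score DESC, p.name
        LIMIT ?
        """,
        (top_n,),
    ).fetchall()
```

Both functions return the same rows on this tiny data set. The difference is that the looping version transfers and processes every score ever recorded, while the query version only ever hands back top_n rows, however large the scores table becomes.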
We have seen a classic example of this recently at one of our clients. Their system contained a function that queried the database and then looped over the results, hunting for the existence of a value. This started to cause problems two years after it had been deployed; a subsequent development increased the amount of underlying data eligible to be returned by the query to levels where the function grew from taking less than one second to almost two minutes. The culprit was “query then loop”, where a more refined query would have returned the answer directly. Better still, a well-written query would have continued to return the required results without any degradation in performance. Changing that code over to a query immediately solved the problem and removed the stress from the client, as the application’s performance was having a direct and major impact on their multi-million-pound business.
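The client’s code is not reproduced here, but the shape of the problem can be sketched in a few lines. The events table, its columns and both function names are hypothetical, invented purely to illustrate an existence check done by looping versus one done by the database.

```python
import sqlite3

# Hypothetical table: which event codes have been recorded against an account.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (account_id INTEGER NOT NULL, code TEXT NOT NULL)")
# An index on the searched columns lets the database answer without a full scan.
conn.execute("CREATE INDEX idx_events_account_code ON events (account_id, code)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "OPENED"), (1, "BILLED"), (2, "OPENED")])


def has_code_by_looping(conn, account_id, code):
    """Query then loop: fetch every row for the account, then hunt for the value."""
    rows = conn.execute(
        "SELECT code FROM events WHERE account_id = ?", (account_id,)
    ).fetchall()
    for (row_code,) in rows:
        if row_code == code:
            return True
    return False


def has_code_by_query(conn, account_id, code):
    """Refined query: the database reports whether a matching row exists."""
    (found,) = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM events WHERE account_id = ? AND code = ?)",
        (account_id, code),
    ).fetchone()
    return bool(found)
```

The looping version does work proportional to the number of rows the first query returns, which is exactly what grows when later developments add data. The EXISTS version returns a single value and, with a suitable index, should stay fast as the table grows.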
The lesson to learn:
Use the database to do the hard work on the data; that is what databases are designed to do. If the database can’t produce the final result you are looking for, start to question your schema design.
So back to the Scrabble Club. The questions you should ask yourself when looking at your solution are:
1. Is the leaderboard generated by a single query?
2. If the company were to ask you “Could you use the system to produce the leaderboard for this time last year so that we can compare it?”, could you do it without a major rewrite? This is outside the Scrabble Club requirements, but it helps to expose a second problem that we often see: throwing away base data and storing only the figure that the users are currently looking for. That gives us a second lesson:
Store the base data in a schema that allows a query to extract the answer you are currently looking for. If you store only the answer that the users are looking for today, you won’t have the data to answer the question they want to ask tomorrow. Once data is thrown away it is often difficult, if not impossible, to recreate.
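As a rough illustration of that second lesson, the sketch below keeps every score with the date it was played, so the “this time last year” question becomes one extra condition in the query. As before, the table names, columns, sample data and the average-score ranking are assumptions for the example, not part of the Scrabble Club requirements.

```python
import sqlite3

# Hypothetical schema: base data is kept as one dated row per recorded score.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE scores (
        player_id INTEGER NOT NULL REFERENCES players(id),
        score INTEGER NOT NULL,
        played_on TEXT NOT NULL  -- ISO 8601 date, so text comparison orders correctly
    );
""")
conn.executemany("INSERT INTO players VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", [
    (1, 300, "2023-03-01"),
    (2, 350, "2023-04-10"),
    (1, 450, "2024-02-01"),
    (2, 340, "2024-02-15"),
])


def leaderboard_as_of(conn, as_of_date, top_n=10):
    """Leaderboard using only the scores recorded on or before as_of_date."""
    return conn.execute(
        """
        SELECT p.name, AVG(s.score) AS average_score
        FROM players p JOIN scores s ON s.player_id = p.id
        WHERE s.played_on <= ?
        GROUP BY p.id, p.name
        ORDER BY average_score DESC, p.name
        LIMIT ?
        """,
        (as_of_date, top_n),
    ).fetchall()
```

With this data, calling leaderboard_as_of(conn, "2023-12-31") puts Bob ahead of Ann, while leaderboard_as_of(conn, "2024-12-31") puts Ann ahead of Bob. A schema that stored only each player’s current average could produce today’s leaderboard but could never answer the historical question.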
And yes, we have seen that problem in clients’ systems too.
Good luck with the Scrabble Club.