Friday, 23 November 2012

Ruby, facemash and the Elo rating system via BDD

Reproducing the Elo rating algorithm in Ruby is a little challenge that I took upon myself recently for a small hack. The same Elo rating system that was scrawled upon the glass in "The Social Network" as the algorithm that Mark Zuckerberg used on rank people on FaceMash. As an exercise, here's a pass at it with a little BDD thrown in for good measure.

A new file - elo.rb:

require 'minitest/spec'
require 'minitest/autorun'


describe "result" do
  describe "when both players start with rating of 100 and k factor of 30"  do
    it "must return 115 for the winner" do
      new_rating.must_equal 115
    end
  end
end

Starting a new algorithm is always tricky. We know that the Elo rating system is essentially concerned with assigning ratings to players based on the outcome of their games or matches. In fact, it is widely used as a chess ranking algorithm. At a first glance then, I thought I might want to jump in and start modelling some objects - players, games, ratings, player ranking. But that sounds like a big bite. A lot can go wrong with hunches.....bunches of hunches and hunches in bunches.

Small steps then - what are we doing? The big picture. The major output we want is a new rating! Our first test tries to check that the new rating is equal to 115. Why 115? Because the excellent online elo calculator told me so when I typed a bunch of numbers in there.

The numbers

To get some numbers, I first read wikipedia's entry on the Elo rating system. The algorithm relies on each player having a rating, each game delivering a score to each player and the mysterious "k" value.

I decided, rather arbitrarily, to assume that in my two player match, each player starts with a rating of 100. We could, equally have started with 0, but 100 felt right. Similarly, each player starts with a k value of 30. Using any method you fancy (the aforementioned online elo calculator, or even, working it out by hand), a winner in the above scenario, would have a new rating of 115.

Of course, running our code fails miserably:

Loaded suite elo
Started
E
Finished in 0.001000 seconds.

  1) Error:
test_0001_must_return_115_for_the_winner(ResultSpec::WhenBothPlayersStartWithRatingOf100AndKFactorOf30Spec):
NameError: undefined local variable or method `new_rating' for #<ResultSpec::WhenBothPlayersStartWithRatingOf100AndKFactorOf30Spec:0x28a8830>
    elo.rb:8:in `block (3 levels) in <class:TestElo>'

1 tests, 0 assertions, 0 failures, 1 errors, 0 skips

Test run options: --seed 62642

Player

We can get this to pass by declaring new_rating = 115. That would be very baby steps. I'm going to go a little faster. We are dealing with a "new rating". What has a rating? A player has a rating. Let's go with that for a moment and create a Player class.

class Player
  attr_accessor :rating
end

describe "result" do
  describe "when both players start with rating of 100 and k factor of 30"  do
    it "must return 115 for the winner" do
      winner = Player.new
      new_rating = winner.rating
      new_rating.must_equal 115
    end
  end
end

Run this and we still fail. However, we fail because our new rating is not 115 as we expect in our test.

  1) Failure:
test_0001_must_return_115_for_the_winner(result::when both players start with rating of 100 and k factor of 30) [elo.rb:19]:
Expected: 115
  Actual: nil

Getting rid of our nil is easy - we can define a new Player to default their rating to 100 if it isn't specified.

class Player
  attr_accessor :rating

  def initialize(rating=100)
    @rating = rating
  end
end

And our test starts to produce some more meaningful failure

  1) Failure:
test_0001_must_return_115_for_the_winner(result::when both players start with rating of 100 and k factor of 30) [elo.rb:19]:
Expected: 115
  Actual: 100

Current rating, New rating

Implied in all this Elo business is the idea that at some point, we will calculate the new rating based on a game between our players. To that end, we want the ability to update a player's rating based on a game outcome. That sounds like a pretty tall order. It starts me thinking towards game objects and....tooooooo complicated. Let's rather put some functionality into Player in the meantime - just to get it working.

class Player
  attr_accessor :rating

  def initialize(rating=100)
    @rating = rating
  end

  def calculate_new_rating
    @rating = 115
  end
end

describe "result" do
  describe "when both players start with rating of 100 and k factor of 30" do
    it "must return 115 for the winner" do
      winner = Player.new
      winner.calculate_new_rating
      new_rating = winner.rating
      new_rating.must_equal 115
    end
  end
end

Putting the above code together, our tests pass! Great - now all we have to do is write the code to actually do the calculation.

Calculating Elo

Looking back through Wikipedia and translating the algorithm into Ruby, we might end up with something like the following:

  def calculate_new_rating
    new_rating = rating + k_value * (score - expected_outcome)
  end

The k value, as we mentioned earlier, is going to be 30. Score is either 1 or 0 (win or lose - we ignore draws for now). That just leaves us with expected outcome.

Expected rating

We need to have expected outcome in place before we can calculate our actual new rating based on the outcome of our game. Time to code up the equation for expected outcome. I'm going to put it into Player for the moment. It might not actually belong there in the end, because an expected outcome sounds like something that should belong to a Game. But it is a start. Simply copying the formula from Wikipedia, we get the following piece of code:

  def expected_outcome
    1.0/(1 + 10**((opponent_rating - @rating)/400))
  end

Things to note

  • We are raising to a power - in Ruby we use ** to achieve this.
  • We need the opponent's rating
  • We should probably think about floating point arithmetic

We've meandered away from our tests at this point. Let's get back to them and start testing our expected outcome, we should have written this first really:

  describe "expected outcome" do
    it "must return 0.5 when ratings are equal"
      winner = Player.new
      winner.expected_outcome.must_equal 0.5
    end
  end

Even if you don't know the Elo algorithm inside and out - this makes sense - given two opponents who are the same level, the chances that one will win should be 0.5. The algorithm is fairly well documented though, and if you are inclined, you can hunt around and find tables of expected outcomes based on differences between opponent ratings. Running the above crashes - an undefined variable 'opponent_rating'. I'll make it a parameter

  def expected_outcome(opponent_rating)
  ....

Which gives us a wrong number of arguments error. We need to update our test and pass in an opponent rating

  describe "expected outcome" do
    it "must return 0.5 when ratings are equal"
      winner = Player.new
      winner.expected_outcome(100).must_equal 0.5
    end
  end

Passing tests. Good. Now we can flesh out that new rating algorithm. Feeling confident, I'm changing a fair amount in one go here, let's hope it works:

  def calculate_new_rating(score, opponent_rating)
    @rating = rating + 30* (score - expected_outcome(opponent_rating))
  end

and fixing our test

it "must return 115 for the winner" do
  winner = Player.new
  winner.calculate_new_rating(1, 100)
  new_rating = winner.rating
  new_rating.must_equal 115
end

Wow - it's good! Let's discuss the code a bit - I changed the calculate_new_rating method to take two parameters - a score and the opponent rating. Then I substituted 30 in for the k_value - at some point this should become a constant perhaps, or a parameter. I also set our rating instance variable upon calculating the new rating. This means that any time I call calculate_new_rating on a Player, it will update that player's rating to be their new rating. I guess I could start running around and adding in variables for current rating, old rating and such, but this feels a bit cleaner to me - you have a rating - as soon as a game is played, your rating should change to reflect that. I am passing in score - currently 1. In our implementation, it will only ever be 1 or 0 - games can be won or lost, and never tied (we are, after all, implementing a facemash like algorithm - there are no draws).

Improvements

This code is a first draft - a good working example - it has its drawbacks though, which become apparent when you start adding in more tests, and especially when you test the very real scenario of calculating the ratings of two players, one after the other. I'll leave you to play with it a bit and see if you can spot the gotchas. I'll pick it up again next time.

Monday, 12 November 2012

Stop asking for a ninja when you actually want a samurai

From the various appropriate entries on Wikipedia about ninja and samurai (emphasis mine)

A ninja or shinobi was a covert agent or mercenary. The functions of the ninja included espionage, sabotage, infiltration, and assassination
The shinobi proper, a specially trained group of spies and mercenaries
Samurai were the military nobility
Samurai felt that the path of the warrior was one of honor, emphasizing duty to one's master, and loyalty unto death

Now which one do you really want working for you?

Thursday, 8 November 2012

Overriding equality and Test Driven Development

Ruby has, at its root, an Object. Methods available in Object are available to every class because every class in Ruby inherits from Object somewhere in its own class hierarchy. Of course, you can override methods in subclasses, changing the functionality of a root method.

You might stumble on to this idea if you work through Test Driven Development By Example by Kent Beck, translating the Java code into Ruby as you go. At some point pretty early on, he overrides the equality method on the Currency class to better test if two instances are equal. I'm going to do the same here, working with Instruments instead of Currency.

Equality

Equality in Ruby can be expressed using any of the following three methods

object == other
equal?(other)
eql?(other)

These methods are defined on the base Object. The default implementation of equality will only return true if both objects are exactly the same. The interesting thing is that although these three methods start out functioning the same, the documentation does note the difference between them:

Equality—At the Object level, == returns true only if obj and other are the same object. Typically, this method is overridden in descendant classes to provide class-specific meaning. Unlike ==, the equal? method should never be overridden by subclasses: it is used to determine object identity (that is, a.equal?(b) iff a is the same object as b). The eql? method returns true if obj and anObject have the same value. Used by Hash to test members for equality.

So we are encouraged to override == to test equality of our own classes and objects.

Instruments in irb

Let's put together a simple implementation to see how this works. Fire this up in irb:

class Instrument

  def play
    puts 'lovely sound'
  end

end

guitar = Instrument.new
=> #<Instrument:0x2a1b9d8>
another_guitar = Instrument.new
=> #<Instrument:0x2940c70>
guitar == another_guitar
=> false
guitar.equal?(another_guitar)
=> false
guitar.eql?(another_guitar)
=> false

You get the idea. The two guitars are not equal. Everything functions as per the documentation. Just for completeness, Ruby considers an object to be the same when it is *exactly* the same - keep typing:

third_guitar = guitar
=> #<Instrument:0x2a1b9d8>
third_guitar == guitar
=> true

That makes sense and behaves as per the documentation. Equality tests that the "actual" objects are equal. This is the equivalent of saying, in the real world, no two guitars are the same - even if they are both, for example, Ibanez Jems (see what I did there? ruby gems, guitar jems) with the Floral pattern that came off he production line one after the other.

Okay, the pedantic among you will argue that they won't be the same, the floral pattern will probably be a bit different - that was the point of Jems right? The tone might be a little different, the feel will be different - minutely different, but different none the less. Buuuuuut, I would argue, to all intents and purposes, these two fictional guitars are the same. They will have the same price tag, they will play pretty much identically (in double blind playing trials), they have the same number of pickups, same number of frets, same volume and tone potentiometer, same pickup-selector switch. They both have the same monkey handle, they will both produce the same note when tuned to A440.

The same

Given the above, for our example, we are going to say they *are* the same. In fact, to make it even simpler to show the code, I'm going specify that any 'guitar' is the same as any other 'guitar' (like Bruce Lee said "A kick is a kick, a punch is a punch").

To represent this in Ruby, we have to override object equality and put in its place, our own code to compare two instruments. That sounds a little scary to just bash out and I heard the word "specify" a few sentences ago, so I'm going to wire up some tests using the awesome minitest (last seen integrating into Rails) to help us code this and see where we get to.

Using various asserts included in our testing framework, we can test equality of objects because things like assert_equal actually just take the two objects and call the "==" method on those objects. Climb out of irb and create yourself a new file - I'm calling mine minstrel.rb:

require 'minitest/autorun'

class TestInstruments < MiniTest::Unit::TestCase

  def test_guitars_are_guitars
    guitar = Instrument.new
    another_guitar = Instrument.new
    assert_equal guitar, another_guitar, "guitars should be guitars"
  end

end

Pretty easy right? We require our minitest library and create a new class that inherits from MiniTest::Unit::TestCase - because that's how you use minitest :). Then we set up our first little test, create two guitars and try to test the assertion that they are equal. Running this, of course, fails miserably:

Started
E
Finished in 0.000000 seconds.

  1) Error:
test_guitars_are_guitars(TestInstruments):
NameError: uninitialized constant TestInstruments::Instrument
    minstrel.rb:10:in `test_guitars_are_guitars'

1 tests, 0 assertions, 0 failures, 1 errors, 0 skips

And we can meander down a small intro to Test Driven Development if we fancy, but that's not really what I'm trying to show here. Instead, I'll just add our Instrument class that we played with in irb up above. A bit messy because I'm putting all my code in minstrel.rb....

require 'minitest/autorun'

class Instrument

  def play
    puts 'lovely note'
  end

end

class TestInstruments < MiniTest::Unit::TestCase

  def test_guitars_are_guitars
    guitar = Instrument.new
    another_guitar = Instrument.new
    assert_equal guitar, another_guitar, "guitars should be guitars"
  end

end

Now I can run the whole shebang again:

  1) Failure:
test_guitars_are_guitars(TestInstruments) [minstrel.rb:12]:
No visible difference.
You should look at your implementation of Instrument#==.
#

1 tests, 1 assertions, 1 failures, 0 errors, 0 skips

That' very interesting - look - it fails the assertion, but it actually tells us that there is no visible difference. Further, it points us nicely towards the method we should be examining which is, coincidentally, our equality on Instrument. Of course it is the equality method that is inherited from our base Object class, because as yet, we haven't defined an equality method explicitly.

Defining equality

At last we come to our point - how to override equality on our objects. Baby steps - I'm going to make my assertion pass by overriding equality in my Instrument class and always return true - not our final implementation, but, it is a start:

class Instrument
  
  def ==(other)
    true
  end

  def play
    puts 'lovely note'
  end

end

Running my tests now gives this output

# Running tests:

.

Finished tests in 0.000542s, 1845.7745 tests/s, 1845.7745 assertions/s.

1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

Great success

We have successfully overridden our equality method. Of course our implementation is not complete - because any instrument will be considered equal to any other instrument at this point. One implementation is to expose an Instrument Type property on Instrument. Initialize any instance with a type and then test the types are the same for our equality. You might end up with something like this:

minstrel.rb:

require 'minitest/autorun'

class Instrument
  attr_reader :type

  def initialize(type)
    @type = type
  end

  def ==(other)
    @type == other.type
  end

  def play
    puts 'lovely note'
  end
end

class TestInstruments < MiniTest::Unit::TestCase

  def test_guitars_are_guitars
    guitar = Instrument.new('guitar')
    another_guitar = Instrument.new('guitar')
    assert_equal guitar, another_guitar, "guitars should be equal"
  end

  def test_guitars_are_not_violings
    guitar = Instrument.new('guitar')
    violin = Instrument.new('violin')
    refute_equal guitar, violin, "guitars should not be violins"
  end

end

All over overriding

That's it - we started out on a simple mission to override our method to test equality between classes, came up with a contrived example that differs from the standard Currency example in Kent Beck's book, and ended up with a little test suite toboot. The take home message - override equality by defining your own method called "==" with one param of (other).