May 13, 2017 by Daniel P. Clark

Don’t Use Objects as Hash Keys in Ruby*

Ruby's hashes are optimized for symbol and string keys.  Those are technically objects too, but this article shows how much of a difference it makes to use other kinds of objects as hash keys.  In some cases the difference is big, but many times you won’t notice it.

I wrote a little experiment to see how different kinds of keys perform in a hash.  I have often used self as a hash key, and now I know that’s not good for performance.  Here’s the code.

The first part of the code sets up the objects and methods that act as keys for the hash, then comes the hash itself, and finally the benchmark.  Here are the results.

So in this benchmark, any time you index a hash with an arbitrary object as the key, the lookup is 313% slower than it would be with a Symbol key.
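If you want to try a comparison like this yourself, here is a minimal sketch.  It is not the original listing: the key values, the iteration count, and the use of `Benchmark.realtime` are all assumptions made for illustration, and the timings will vary with your Ruby version and machine.

```ruby
require 'benchmark'

# Hypothetical keys, one per kind being compared (not the original setup).
plain_object = Object.new
keys = {
  'Symbol' => :a,
  'String' => 'a',
  'Object' => plain_object,
  'Method' => plain_object.method(:to_s)
}

# One hash that holds every kind of key, each mapped to a distinct value.
table = {}
keys.values.each_with_index { |key, index| table[key] = index }

iterations = 200_000 # assumed count; raise it for steadier numbers

# Time the same number of lookups for each kind of key.
timings = keys.transform_values do |key|
  Benchmark.realtime { iterations.times { table[key] } }
end

timings.each { |label, seconds| puts format('%-6s %.4fs', label, seconds) }
```

Each key is looked up through a variable that already holds it, so the loop measures hashing and comparison rather than the cost of building a new key on every iteration.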

Summary

I had always heard that symbols were faster than strings in hash lookups, but I wasn’t aware that hashes were faster than method calls (see comment section below) or how slow objects were as keys.  It’s okay to use objects as hash keys if you really want to.  Just know that you pay a small price for doing so.
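The method call side of that comparison depends a lot on what the method body does.  As a minimal sketch (the names `allocating_value`, `shared_value`, and `allocations` are hypothetical and not taken from the benchmark), a method that returns an array literal builds a new array on every call, while one that returns a constant hands back the same object each time:

```ruby
SHARED = [1].freeze # built once, when the constant is defined

# Hypothetical method: the array literal creates a new Array on each call.
def allocating_value
  [1]
end

# Hypothetical method: returns the one shared Array every time.
def shared_value
  SHARED
end

# Number of objects allocated while the block runs, according to GC.stat.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

puts allocating_value.equal?(allocating_value) # prints false: two distinct arrays
puts shared_value.equal?(shared_value)         # prints true: same object both times

fresh  = allocations { 1_000.times { allocating_value } }
reused = allocations { 1_000.times { shared_value } }
puts "allocating: #{fresh} objects, shared: #{reused} objects"
```

If the method being timed allocates, part of what you measure is allocation and garbage collection rather than the call itself.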

Where you really need to be concerned about this is in code that will be used heavily throughout your code base.  When you implement something like a raw type that may be called thousands of times in one run, that difference really matters.  Generally you don’t have to worry about this performance loss, since Ruby itself is fast and in most cases the code you write doesn’t get called that much.

One example of where it would make a big difference is the Pathname class in Ruby’s standard library.  In older versions of Rails this was called many thousands of times per request because of the asset pipeline.  I wrote the gem FasterPath to implement this heavily used code in Rust just to improve performance.  The more times the code is called in a small time frame, the more you should keep performance in mind.

Hopefully you found this information useful!  I know this is a short post.  Let me know if you like posts like this and I’ll write more.  Please feel free to comment, share, subscribe to my RSS Feed, and follow me on twitter @6ftdan

God Bless!
-Daniel P. Clark

Image by Levan Gokadze via the Creative Commons Attribution-ShareAlike 2.0 Generic License

  • Thomas

    When testing the method call, you need to avoid unnecessary object allocations. So better change the method to:

    ~~~
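    X = [1] # allocated once, when the constant is defined
    def value_of
      X # returns the same array on every call, so nothing new is allocated
    end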
    ~~~

    Then you will find that the plain method call is fastest.

    Also note that calling `#value_of` and `Hash#[]` performs one method call each, so the only difference is in the implementation of the method. Although `#value_of` has a very simple implementation, Ruby has to still call Ruby code to perform this action. When invoking `Hash#[]`, a C method is invoked and although it does more, it’s still very fast.

    • Good catch! That is an important note to remember as it’s not an uncommon practice to write code like:


      • Thomas

        Yes, that’s sadly true. In most cases it won’t make a difference but using such a method in a critical execution path can lead to much work for the GC.

  • Ego

    This is somewhat ‘anal’ of me, but you might want to change the title of this article. Everything in Ruby is an object. You cannot create a non-object key for a hash in Ruby.

    • True, everything is basically an object in Ruby and hashes are optimized more so for specific objects such as symbols and strings.

      I believe many people may be aware that it’s recommended to look up hash values with either a symbol or a string so I’m generally inclined to believe that the title will lead them to think of other objects in general. I will try to clear it up some.

  • Itay Ben Ari

    Thanks for the post. If you freeze the string, the performance will be just like the symbol performance.

  • soulcutter

    When benchmarking it’s really handy to give Ruby version numbers since performance may vary (sometimes significantly) between versions.

    I ran the benchmark against ruby 2.3.4p301 (2017-03-30 revision 58214) [x86_64-darwin16] and the Method key was approximately the same as a String key.

    I suppose your title is clickbait, but I think the best practice is to avoid premature optimization and write good, clear code – towards that end I would expect most people to not change how they use hash keys based on this microbenchmark. It’s a nifty piece of trivia, though, and could certainly come in handy when optimizing hot code paths.

    • “I think the best practice is to avoid premature optimization and write good, clear code – towards that end I would expect most people to not change how they use hash keys based on this microbenchmark. It’s a nifty piece of trivia, though, and could certainly come in handy when optimizing hot code paths.”

      I agree with you on this.

      “I suppose your title is clickbait”

      It wasn’t intended to be. One of my popular posts was “Rails: Don’t “pluck” Unnecessarily” which was also a post on performance. I was merely going for a similar name for the title.