February 4, 2015 by Daniel P. Clark

Using Ruby Object Type Classes to Safely Build Data

When building collections of data you will find situations where the types aren’t what you planned to work with.  And when I say types I’m speaking generically of arrays, hashes, strings, integers, nil, etc.  Everything’s cosy when you know what you’re getting.  For example, putting 10 integers into an Array:
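The listing that belonged here is not available, so this is only a minimal sketch of the idea. The variable name `numbers` is an assumption for illustration.

```ruby
# Hypothetical example: `numbers` is an illustrative name.
numbers = []                   # the Array exists before we insert into it
10.times { |i| numbers << i }  # append the integers 0 through 9

numbers # => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```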

But if you want to work with something that is not yet defined, or is nil, then you’ll get an error:
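A small sketch of that failure, assuming a variable named `numbers` that was never given an Array. The error is rescued here only so the example runs to the end.

```ruby
# Hypothetical example: `numbers` stands in for any value that turned out to be nil.
numbers = nil
error = nil

begin
  numbers << 1          # nil has no << method
rescue NoMethodError => e
  error = e             # keep the exception so we can inspect it
end

error.class # => NoMethodError
```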

You will find yourself more likely to run into this kind of situation when working with Hashes:
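With a Hash it might look like the sketch below. The `example` Hash and the `:fiz` key come from later in this post; the rest is assumed.

```ruby
example = {}
example[:fiz]  # => nil, because the key has never been set

error = nil
begin
  example[:fiz] << 5    # same problem: we are calling << on nil
rescue NoMethodError => e
  error = e
end

error.class # => NoMethodError
```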

In each of these examples I’ve been using << to put something into what we would like to be an Array.  But the Array Object must first be instantiated before we can insert items into it.  We could do it like so:
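A sketch of the two-line version, assuming the same `example` Hash:

```ruby
example = {}

example[:fiz] = []   # line one: instantiate the Array
example[:fiz] << 5   # line two: insert the item

example[:fiz] # => [5]
```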

But now we’ve taken two lines to accomplish this.  We may end up writing LOTS of code with similar behavior so we don’t really want to have to write more than we need to.  Well there’s good news.  Ruby has classes that allow us to create the right Objects for just this kind of situation.
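A quick sketch of the two used in this post, each handed nil:

```ruby
empty_array = Array(nil)  # => []
empty_hash  = Hash(nil)   # => {}
```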

As you can see we handed these classes a nil Object and it returned an empty collection of the type class we used.  So with this we can take our example and one-line the nil Object assignment and insertion.
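Based on the description that follows, the one-liner would look roughly like this:

```ruby
example = {}

# Array(nil) gives us [], then << inserts 5, then the result is assigned back.
example[:fiz] = Array(example[:fiz]) << 5

example[:fiz] # => [5]
```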

And it worked!  The Array(example[:fiz]) created an empty Array since example[:fiz] was nil, which then allowed us to insert 5 into the Array and finally save it on the left side of the equals into example[:fiz].  It looks a lot like the way += works except that += will not work on nil.  Now let’s try it with the loop.
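A sketch of the loop version, assuming we insert the integers 0 through 9:

```ruby
example = {}

10.times do |i|
  # First pass: Array(nil) => []. Later passes: Array() returns the existing Array.
  example[:fiz] = Array(example[:fiz]) << i
end

example[:fiz] # => [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```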

In this situation the Array() class method ensured we had an Array the very first time.  It took nil and made an empty Array.  From there it used the same Array for each cycle of the loop and appended the new number to the end.  With this we don’t have to care about nil being an issue.  That issue’s been dealt with.

A Complex Example

When organizing data you will get into far more complex situations where you will need this.  One great way is to handle files and directories and organize collections with them.  Here’s the start of an MP3 play-list creator I’m working on.  The way it works is it finds all mp3s in subdirectories and then uses the folder as the play-list group that they will belong to.
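The original listing is not available here, so what follows is a hedged reconstruction built from the walkthrough below. The method name parse_dir, the result Hash, the m3u variable and the :files key come from the text; the :path and :filename keys and the exact way the calls are chained are assumptions.

```ruby
# Reconstruction sketch, not the author's exact code.
def parse_dir(dir)
  result = {}

  Dir.glob("#{dir}/**/*.mp3").each do |path|
    # Play-list name: the containing directory, underscored and lower-cased.
    m3u = path.split("/")[-2].gsub(" ", "_").downcase + ".m3u"

    # nil becomes {} the first time; an existing Hash is returned as-is.
    result[m3u] = Hash(result[m3u])

    # nil becomes [] the first time; after that the same Array keeps growing.
    # The :path and :filename keys are assumed for illustration.
    result[m3u].update(
      files: Array(result[m3u][:files]) << {
        path: File.dirname(path),
        filename: File.basename(path)
      }
    )
  end

  result
end
```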

There’s a lot going on here.  You can see that I’m using both Array() and Hash().  In each case, the first time these class methods are reached the Object within them will evaluate as nil.  So they will create a new empty Array or Hash that the collection gets built from.

Let’s break it down.  Dir.glob("#{dir}/**/*.mp3") takes the path we hand in to the parse_dir method and goes through all subdirectories no matter how deep.  The double stars ** are what has the glob method traversing all the directories.  The end of the string *.mp3 selects any file that ends with .mp3.  The result of this will be an Array of matching paths which we can iterate over (a list of results).

Now that we have the list of results we want to take each one and place them in a “play-list” group in our Hash called result.  So with each we hand each item in the list to the variable path and start our process.

First we want to create the play-list name.  So we take the file’s complete path and split it by the directory separator “/”.  From there we take the second from last [-2], which is the directory the file is in ([-1] is the file name itself).  We’ll want uniform file names so we replace the spaces with underscores and lower-case the whole thing.  From there we have our string “my_directory_name.m3u” stored in the variable m3u.

Now we apply the technique I’ve shown here with result[m3u] = Hash(result[m3u]).  The first time this code is cycled through result[m3u] is nil, so Hash() turns it into an empty Hash {} and it gets assigned to result[m3u] = {}.

Next we update the Hash with new values.  Now this may take a little bit to wrap your head around so I’ll see if I can simplify it.  The first time this block is run it looks like this (with nils):
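The original snippet is not available, so here is a sketch of that first pass with the nils written out by hand. The sample path and the :path and :filename keys are assumptions for illustration.

```ruby
result = {}
m3u = "my_directory_name.m3u"

# result[m3u] is nil on the first pass, so this is what Ruby effectively sees:
result[m3u] = Hash(nil)   # same as result[m3u] = {}

result[m3u].update(
  # Hash(nil)[:files] is {}[:files], which is nil, so Array(nil) gives [].
  files: Array(Hash(nil)[:files]) << {
    path: "/music/My Directory Name",   # hypothetical sample values
    filename: "song.mp3"
  }
)

result[m3u][:files].length # => 1
```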

The first time through, Hash(nil)[:files] is attempting to access the symbol :files on an empty Hash: {}[:files] #=> nil.  So that turns it into files: Array(nil).<< which further turns into files: [] << .  So the first time, the :files key inside the result[m3u] Hash is created with an empty Array.  It then inserts the first item into that Array via the << method.  The Object getting inserted into that Array is the hand-written Hash you see above, which will look something like:
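The exact keys are not available here. Going by the "path/filename" description further down, an entry might look like the following, where the key names and sample values are assumptions:

```ruby
# Hypothetical entry for one mp3 file.
entry = {
  path: "/music/My Directory Name",
  filename: "song.mp3"
}
```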

Each time the loop goes through it will now update the Array within the Hash within the Hash with the individual file details as hashes of their own.  Let me simplify it.  So result has multiple keys labelled as m3u lists (“my_directory_name.m3u”) based on the directory names.  Each of those keys will access the Hash that has the key :files which returns an Array (list) of files with their own hashes of path/filename.
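To make the nesting concrete, here is a sketch of what result could hold after two files from the same directory. The file names and the :path and :filename keys are made up for illustration.

```ruby
# Hypothetical contents of `result` after two passes of the loop.
result = {
  "my_directory_name.m3u" => {
    files: [
      { path: "/music/My Directory Name", filename: "one.mp3" },
      { path: "/music/My Directory Name", filename: "two.mp3" }
    ]
  }
}

# Key by play-list, then :files, then each file's own Hash.
filenames = result["my_directory_name.m3u"][:files].map { |file| file[:filename] }
filenames # => ["one.mp3", "two.mp3"]
```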

And now we have successfully grouped all the file details by directory name.  If you’re wondering why I used a Hash with a :files key instead of just an Array here, it is because I have more items in the Hash in my production code.  My next step in the project is to dynamically render an m3u play-list as a view and use it to stream audio over my local area network.  It’s fun and I’m looking forward to using it.

Cautionary Note:

These methods are good for working with collections.  But not all Ruby Type Classes are equal.  If you use the Integer() method on nil it doesn’t produce zero; it raises a TypeError.
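A small sketch of the difference. The error is rescued here only so the example can show both behaviors side by side.

```ruby
error = nil
begin
  Integer(nil)          # does not return 0
rescue TypeError => e
  error = e
end

error.class # => TypeError

zero = nil.to_i         # => 0
```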

So you’ll need to use .to_i for integers.  That being the case, if you want to use any other methods like this, test them first to confirm their behavior.

Summary

Using Type Classes is a good practice.  Incorporating them on the right side of an assignment saves time and effort.  It’s also nice that Array([]) won’t double the depth of the Array like [[]] but still returns [].  So these methods are a safety net against nil.  These methods are like having your cake and eating it too ^_^.
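A short sketch of how Array() treats a few different inputs. The last line is a caveat worth testing for yourself: a Hash is converted into key/value pairs rather than wrapped.

```ruby
from_nil   = Array(nil)       # => []
from_empty = Array([])        # => [], not [[]]
from_array = Array([1, 2])    # => [1, 2]
from_value = Array(5)         # => [5]
from_hash  = Array({ a: 1 })  # => [[:a, 1]]
```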

I hope you enjoyed this and that it was insightful. Please comment, share, subscribe to my RSS Feed, and follow me on twitter @6ftdan!

God Bless!
-Daniel P. Clark

Image by Mark Thurman via the Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic License.

Andrew Berls

So these are not really “type classes” in the traditional sense; type classes are a way to define functions that have different implementations depending on the type of data they’re given (“ad-hoc polymorphism”). In Haskell, you might define a typeclass and an associated “instance”:

class Eq a where
  (==) :: a -> a -> Bool

instance Eq Integer where
  x == y = x `integerEq` y

instance Eq Bool where
  -- ... and so on

…

Daniel P. Clark

Thank you for informing me about the traditional sense of “type classes”. I was using it in the more literal sense of a method of the class for the type. I had originally learned about the coercive methods like Array from Avdi Grimm’s blogs and “Confident Ruby” talk. There are many who seem to share your view of what is “simpler”. But when it comes to that I believe that may be a matter of preference/perception. If I were new to the language then all of this “{ |h,k| h[k] = [] }” would have me scratching my head in…

Stephan

For the hash example, you can use:

Daniel P. Clark

Thanks for the input! That works nicely.

Aaron Schrab

That will use the same Array object for every key that isn’t previously set. If after the above somebody did:

hash[:another] << 7

Then `hash[:test]` and `hash[:another]` would both return `[5,7]`.
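The snippet this reply refers to is not available. Assuming it passed a default Array to Hash.new, the behavior described can be reproduced with this sketch:

```ruby
# Assumption: the earlier suggestion was Hash.new([]).
hash = Hash.new([])   # one Array object is the default for every missing key

hash[:test] << 5      # mutates the shared default, stores nothing under :test
hash[:another] << 7   # mutates the very same Array again

hash[:test]    # => [5, 7]
hash[:another] # => [5, 7]
hash.keys      # => [], no key was ever assigned
```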

Daniel P. Clark

Good catch. Wouldn’t want that unexpected surprise.

Stephan

Yeah. I just realised. Strange that it keeps re-using the same array for every non-initialized key-value pair. I do get why though, as it would require a deep clone to fix the issue, which would significantly reduce performance. Guess this approach isn’t any good for arrays and hashes (which has the same issue). I have tried to come up with a clever way to fix it using lambdas etc. but I can’t seem to make this work. You could of course monkey-patch the hash-class itself and make it return an empty array in the case that the element does not…

Aaron Schrab

There’s no need for monkey patching or doing anything overly fancy. You can instead supply a block to Hash.new:

Hash.new{ |hash,key| hash[key] = [] }

The block is responsible for putting the new item into the hash, and returning the value for the initial access.

This method had already been mentioned in a previous comment, so I didn’t address it in my earlier reply.
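A brief sketch of the block form in use. The keys :test and :another and the values 5 and 7 are borrowed from the earlier reply.

```ruby
hash = Hash.new { |h, key| h[key] = [] }  # the block runs once per missing key

hash[:test] << 5      # block stores a fresh [] under :test, then 5 is appended
hash[:another] << 7   # a different fresh [] is stored under :another

hash[:test]    # => [5]
hash[:another] # => [7]
hash.keys      # => [:test, :another]
```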