tap vs. each_with_object: tap is faster and less typing.

posted 2012-Feb-17
— updated 2012-Feb-23

Ruby 1.9 introduced Enumerable#each_with_object, a crazy-specific method that takes an object, invokes each while also yielding that object, and then returns the object. For example:

by_id = items.each_with_object({}){ |item,h| h[item.id] = item }

It’s almost exactly the same as good old Enumerable#inject, except that you don’t have to ensure that the memo object is the last expression in the block:

by_id = items.inject({}){ |h,item| h[item.id] = item; h }
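To see why that trailing h matters, here is a minimal sketch. The sample array is made up for illustration. An assignment like h[n] = n evaluates to the assigned value, so without the trailing h the second iteration receives an Integer as its memo:

```ruby
nums = [3, 1, 2]  # hypothetical sample data

# With the memo as the last expression, inject builds the hash as intended.
good = nums.inject({}) { |h, n| h[n] = n; h }

# Without it, the block returns n, so the next iteration gets an Integer
# as its memo. Integer has no []= method, so NoMethodError is raised.
error = begin
  nums.inject({}) { |h, n| h[n] = n }
  nil
rescue NoMethodError => e
  e
end
```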

Ruby 1.9 also introduced Object#tap, a general-purpose method that yields the receiver to the block and returns it when done:

by_id = {}.tap{ |h| items.each{ |item| h[item.id] = item } }
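As a quick sanity check, the three forms above should build the same hash. The sketch below assumes a hypothetical Item struct standing in for whatever items holds, and also shows that tap ignores the block's return value:

```ruby
Item = Struct.new(:id, :name)  # hypothetical stand-in for the post's items
items = [Item.new(1, 'a'), Item.new(2, 'b')]

via_ewo    = items.each_with_object({}) { |item, h| h[item.id] = item }
via_inject = items.inject({}) { |h, item| h[item.id] = item; h }
via_tap    = {}.tap { |h| items.each { |item| h[item.id] = item } }

# tap always returns its receiver, whatever the block evaluates to.
tapped = [1, 2].tap { |a| a << 3; :ignored }
```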

I don’t really understand people who use each_with_object. Using tap/each is always fewer characters to type. It uses general-purpose methods instead of a special-case method whose yielded-parameter order you have to remember (it’s the opposite of the order for inject). And as an added bonus, it’s also faster in the benchmark below:

N = 1_000_000
nums = N.times.map{ rand(N) } # Lots of random numbers

require 'benchmark'
Benchmark.bmbm do |x|
  x.report('inject'){     nums.inject({}){ |h,n| h[n]=n; h }         }
  x.report('tap/each'){   {}.tap{ |h| nums.each{ |n| h[n]=n } }      }
  x.report('ea_wi_obj'){  nums.each_with_object({}){ |n,h| h[n]=n }  }
end
#=>                 user     system      total        real
#=> inject      0.660000   0.020000   0.680000 (  0.682896)
#=> tap/each    0.630000   0.010000   0.640000 (  0.636919)
#=> ea_wi_obj   0.950000   0.030000   0.980000 (  0.971507)

Michael Kohl
05:18PM ET

I think the order of block arguments makes sense; it’s consistent with each_with_index (first the yielded object, then the other thing).
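A small sketch of the argument orders being compared (the sample data is made up for illustration):

```ruby
letters = %w[a b c]  # hypothetical sample data

# each_with_index and each_with_object: element first, extra value second.
indexed = []
letters.each_with_index { |letter, i| indexed << [letter, i] }
collected = letters.each_with_object([]) { |letter, memo| memo << letter }

# inject: memo first, element second.
folded = letters.inject([]) { |memo, letter| memo << letter }
```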

My problem with tap is that it’s called on what’s about to be the result, instead of the data that’s gonna be transformed. However, a small alias makes this more convincing:

class Object
  alias :filled_with :tap
end

{}.filled_with { |h| items.each{ |item| h[item.id] = item } }

08:13PM ET

There’s no difference between using tap and pretending you don’t know about the existence of either inject or each_with_object.

That is, if you test the following, it will be slightly faster than the tap version in the example above, yet functionally identical to it:

h = {}
nums.each { |n| h[n]=n }

Also, it’s clearer what the developer’s intention is with this, whereas when I see tap I expect a developer to be tapping into a method chain to perform operations on intermediate results within the chain.
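For comparison, the method-chain usage described here might look like the following sketch, where tap inspects an intermediate result without changing what flows on to the next call. The chain itself is a made-up example:

```ruby
seen = []  # records the intermediate result

squares = (1..10)
  .select(&:even?)
  .tap { |evens| seen << evens.dup }  # peek at the chain without altering it
  .map { |n| n * n }
```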

12:37PM ET

Nice observation. I know you are comparing each_with_object vs tap/each here, but I’m compelled to point out that inject is slightly different here if you are using immutable types.

[1, 2, 3].inject(0) { |sum, i| sum + i }

In Ruby code I’ve seen, most people use inject to populate a Hash as you have done here. I see inject as more of an inline recursion pattern.
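A minimal sketch of that difference, assuming a small made-up array: inject passes the block's return value along as the next memo, while each_with_object always returns the object it was originally given, so it cannot accumulate into an immutable Integer.

```ruby
nums = [1, 2, 3]  # hypothetical sample data

# inject threads the block's return value through as the next memo.
total = nums.inject(0) { |sum, i| sum + i }

# each_with_object always returns the object it was given. Integers are
# immutable, so `sum += i` only rebinds the block-local variable.
unchanged = nums.each_with_object(0) { |i, sum| sum += i }
```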
