The Fastest Way to Concatenate Strings and Arrays in Ruby

by Mike Zazaian at 2009-08-15 06:06:35 UTC in syntax

remove hidden bottlenecks by assimilating these tips into your best practices toolbox


convention: the silent killer

There are a couple of conventions in programming that you never really think to question because they're just so damn commonplace. Method arguments, for the most part, act in the same way across different languages, as do equals signs, and concatenators, and all of those little bits that every language HAS to implement to be useful and high-level.

And while we accept these conventions, and never really think about them in any capacity, the fact that something is commonplace and conventional does not necessarily imply that it is the best available solution. And, as I'm coming to realize, it usually isn't. I know this is kind of a big fuck you to Occam and his Razor, but hear me out.

here be dragons

There comes a time in every programmer's life when she or he will want to concatenate one string to an existing one.

For PHP programmers, it looks something like:

$var . " add to var" . "some other stuff to add"

And here in Ruby, as in JavaScript and many other languages that use the plus symbol as their concatenator, something like this:

var + "a seemingly more expressive syntax" + "i agree entirely"

That seems to make a bit more sense, no? Looking back on my PHP existence, especially now being so firmly entrenched in Ruby, it seems that the language tended to invent its own conventions for seemingly arbitrary reasons or, at the very least, not for the sake of expressibility. Oh, well. We can't all be Ruby.

Anyway, concatenators. We use them constantly for Strings and Arrays (but not Hashes, sadly). I bet you'll find a + sign in ninety percent of your methods, maybe more. Ninety-five. Whatever the case, I've just always assumed that it's the superior way to concatenate and never thought twice about it.

Until today.

enter "string".concat()

The << operator is really the de facto way to append Strings to Arrays. You see it everywhere, it's VERY commonly used, and nobody ever really questions it.

But it's available to String objects, too, where it does the same job as the concat() method. And it's one of those things -- sometimes you'll use it because you're feeling whimsical, and just want to prove to yourself that you know how to do something simple like that in eight different ways. So you do, and you move on, and you smirk with a sense of self-importance, and you don't think about it again until the whim returns.

And it was amidst one such instance that it occurred to me, "one of these has to be faster".

testing string theory

So there are approximately three sane ways in which to concatenate strings. You can use the plus (+) or plus-equals (+=) concatenators, you can use <<, or you can write a single string literal and interpolate your variables into it using pound/brackets (#{}). We're going to test all of those right now. Let's write a benchmark:

require 'benchmark'

Benchmark.bm do |b| 
  b.report('+') do
    hyper = "hyper"
    crypto = "crypto"
    monkey = "monkey"

    10_000_000.times { hyper + crypto + monkey }
  end

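  b.report('<<') do
    hyper = "hyper"
    crypto = "crypto"
    monkey = "monkey"

    # Note: << appends in place, so hyper keeps growing across iterations
    # instead of starting from "hyper" each time like the other two reports.
    10_000_000.times { hyper << crypto << monkey }
  end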

  b.report('#{}') do
    hyper = "hyper"
    crypto = "crypto"
    monkey = "monkey"

    10_000_000.times { "#{hyper}#{crypto}#{monkey}" }
  end
end

That's right, we're going to produce thirty million hypercryptomonkey objects as quickly as we can just for the sake of doing so. And while no, we haven't yet distilled an actual application for the resulting hypercryptomonkeys, we here at do{block} labs (a contract employee of Insani-T Chemical Enterprises) are confident that we can establish a market for one with an aggressive and whimsical advertising campaign. How very American of us.

the string results

But that's another article. For right now all we care about is how quickly we can crank out said hypercryptomonkeys, and by which method. And here are the results:

          user     system      total        real
+    11.550000   1.430000  12.980000 ( 13.130550)
<<    8.760000   1.430000  10.190000 ( 10.299680)
#{}  13.770000   1.380000  15.150000 ( 15.316712)

Yeah, that's right. Going by the real (wall-clock) column, the + sign is about 27.5% slower than <<. And the #{} method, which never really seemed to be enormously efficient in the first place, clocks in at about 48.7% slower than <<.
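For anyone who wants to check the arithmetic, here is a minimal sketch of how those percentages fall out of the real column of the results above. The percent_slower helper and the REAL_SECONDS constant are names made up for this illustration; they are not part of the Benchmark library.

```ruby
# Wall-clock ("real") seconds copied from the benchmark results above.
REAL_SECONDS = { '+' => 13.130550, '<<' => 10.299680, '#{}' => 15.316712 }.freeze

# Hypothetical helper: how much slower candidate is than baseline, in percent.
def percent_slower(candidate, baseline)
  ((candidate / baseline - 1.0) * 100).round(1)
end

puts percent_slower(REAL_SECONDS['+'], REAL_SECONDS['<<'])   # prints 27.5
puts percent_slower(REAL_SECONDS['#{}'], REAL_SECONDS['<<']) # prints 48.7
```

The same helper works on the user or total columns if you would rather compare CPU time, though the figures shift slightly.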

And, luckily for Insani-T Chemical Enterprises, Ruby is open source, and implementing the superior << method costs exactly zero percent more than its portlier relatives.

but, why?

Fair question. The plus symbol, it seems, builds a brand-new String object for every + in the expression and copies both operands into it, whereas << and concat append directly onto the receiving string without first producing an intermediary copy. #{} isn't even in the same league because it's not concatenating so much as building a new string object altogether from the contents of the other strings.
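The difference is easy to observe from the outside. The following is a small sketch, not a look at Ruby's internals: it only checks whether each operation hands back the receiver itself or a different object. The method names are invented for the example, and the strings are dup'ed so that they are safe to mutate whatever your frozen-string settings are.

```ruby
# Fresh, unfrozen strings for each demo, so << is free to mutate them.
def sample_strings
  ["hyper".dup, "crypto".dup]
end

# + returns a different String object and leaves the receiver untouched.
def plus_demo
  hyper, crypto = sample_strings
  result = hyper + crypto
  { result: result, receiver: hyper, same_object: result.equal?(hyper) }
end

# << appends into the receiver and returns that very same object.
def append_demo
  hyper, crypto = sample_strings
  result = hyper << crypto
  { result: result, receiver: hyper, same_object: result.equal?(hyper) }
end

puts plus_demo[:same_object]   # prints false
puts plus_demo[:receiver]      # prints hyper
puts append_demo[:same_object] # prints true
puts append_demo[:receiver]    # prints hypercrypto
```

The flip side of that speed is the side effect: after <<, the original variable no longer holds its old value, so reach for + (or dup first) whenever the original string still matters.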

what about Arrays?

I'm glad you asked.

In their unrelenting quest to mass market marginally useful and unquestionably hazardous products, Insani-T Chemical Enterprises, a subsidiary of Mom's Old Fashioned Bavarian Pretzels, Inc. (did I forget to mention that?), is aggressively developing a new product called "animal basket", which, exactly as it sounds, is simply a basket full of animals.

After their success with the enormously popular hypercryptomonkey line (especially in the Czech Republic), and with limited overhead thanks to <<, they're once again researching the fastest way to assemble the baskets. So, like any studious contractor, I got wind of the project, put in a bid, and was hired to write a benchmark. Here it is:

require 'benchmark'
Benchmark.bm do |b|
  
  b.report('+') do
    animals = ["insane walrus", "maniacal otter", "eloquent raccoon"]   
    1_000_000.times do
      basket = []
      animals.each do |a|
        basket += [a]
      end
    end
  end

  b.report('<<') do
    animals = ["insane walrus", "maniacal otter", "eloquent raccoon"]    
    1_000_000.times do      
      basket = []
      animals.each do |a|
        basket << a
      end
    end
  end
  
  b.report('push') do
    animals = ["insane walrus", "maniacal otter", "eloquent raccoon"]    
    1_000_000.times do      
      basket = []
      animals.each do |a|
        basket.push a
      end
    end
  end

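  b.report('pipe') do
    animals = ["insane walrus", "maniacal otter", "eloquent raccoon"]
    1_000_000.times do
      basket = []
      animals.each do |a|
        # Note: | returns a new Array and the result is discarded here,
        # so basket stays empty; basket |= [a] would accumulate like +=.
        basket | [a]
      end
    end
  end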
end

As you'll see, we're testing a couple of new methods: push, which adds the string in question as the last item of the Array, and |, or the pipe method, which returns a new Array containing the union of two Arrays with any duplicate values removed.
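For reference, here is a short sketch of how the two newcomers behave on their own, outside the benchmark. The method names are made up for this example. push modifies the Array it is called on, while | leaves both operands alone and hands back a new Array with duplicates dropped.

```ruby
# push appends to the receiver in place and returns the receiver.
def push_demo
  basket = ["insane walrus"]
  returned = basket.push("maniacal otter")
  { basket: basket, same_object: returned.equal?(basket) }
end

# | builds a new Array holding the union of both operands, without
# duplicates; neither operand is modified.
def pipe_demo
  basket = ["insane walrus", "maniacal otter"]
  merged = basket | ["maniacal otter", "eloquent raccoon"]
  { basket: basket, merged: merged }
end

puts push_demo[:basket].inspect # prints ["insane walrus", "maniacal otter"]
puts pipe_demo[:merged].inspect # prints ["insane walrus", "maniacal otter", "eloquent raccoon"]
puts pipe_demo[:basket].inspect # prints ["insane walrus", "maniacal otter"]
```

Because | never touches its receiver, it only accumulates values if you assign the result back, for example with basket |= [a].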

I tried to convince them to spend the extra money to insert breaks in the test in case the insane walrus bit somebody, or if the eloquent raccoon began to recite Kantian philosophies, but they just ignored me and said "that's the cost of doing business". Fucking Mom's Old Fashioned Bavarian Pretzels. This is why I don't take corporate contracts.

Sorry, I get bitter sometimes.

Where were we? Oh yeah, the results:

           user     system      total        real
+      5.160000   1.020000   6.180000 (  6.234889)
<<     3.660000   1.020000   4.680000 (  4.727642)
push   3.720000   0.990000   4.710000 (  4.739347)
pipe   7.140000   1.060000   8.200000 (  8.274396)

So really no surprises here. The only way to add a String to an Array with + is to first create a whole new Array object in which to wrap the String. Unsurprisingly, that eats up some time. The same goes for the pipe method. So + and pipe are about 32% and 75% slower than <<, respectively.
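The extra allocation can be seen without a benchmark. In this sketch (method names invented for the example), a second variable keeps hold of the original basket so we can check whether basket still points at the same object afterwards.

```ruby
# basket += [a] wraps the string in a temporary Array, builds a third Array
# from the two, and rebinds basket to it. The original object is left behind.
def plus_equals_demo
  basket = []
  original = basket
  basket += ["insane walrus"]
  { same_object: basket.equal?(original), original_size: original.size }
end

# basket << a pushes onto the existing Array, so no new Array is involved.
def append_demo_for_arrays
  basket = []
  original = basket
  basket << "insane walrus"
  { same_object: basket.equal?(original), original_size: original.size }
end

puts plus_equals_demo[:same_object]         # prints false
puts plus_equals_demo[:original_size]       # prints 0
puts append_demo_for_arrays[:same_object]   # prints true
puts append_demo_for_arrays[:original_size] # prints 1
```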

Push, though, as you can see from the results, was about twelve thousandths of a second slower than << in real time. I'm thinking that this is because push isn't quite the same method as << in the Array class (push accepts any number of arguments, while << takes exactly one), but I'm not sure about this. Either way, the difference was so marginal that I tested just those two at ten million iterations rather than one million to see if anything changed:

           user     system      total        real
<<    36.010000  10.880000  46.890000 ( 47.008566)
push  36.370000  10.920000  47.290000 ( 47.706194)

And nothing did. If anything push proved to be a tad slower than it initially seemed.

bon voyage

Twenty-four million animal basket prototypes later, I parted ways with Insani-T Chemical with some cash in pocket, and several 15% off coupons for the entire line of Animal Basket products. I was a bit worldlier, knowing now that unless I had any special reason to do otherwise, << would always prove the quickest method by which to concatenate Strings or to append Strings to Arrays.

Farewell, Insani-T Chemical. Until we meet again.
