The Cost of Using Methods as Variables in Ruby

This post grew out of thinking about how Haskell, unlike most languages, draws no real distinction between variables and functions, and what would happen if the same idea were tried in Ruby.

In Haskell there aren’t any mutable variables. The code foo = 4 is like making foo an alias of 4: everywhere in scope where you see foo, you can replace it with 4 and the code will work the same. Functions are defined in the same way. For example, given bar = 4 + 4, the same idea applies: 4 + 4 can be substituted everywhere bar is used.

In Ruby we have separate syntax for defining variables and methods.

foo = 4

def bar
    # some code here
end

But we can also use methods as if they were variables.

def foo
    4
end

In your code, reading foo gives the same result whether it is defined as a variable or as a method. Using a method to return a single value is fairly common, especially in classes where you might want to override the method to return a different value, or later change it to include some logic. I wanted to see what the effects would be of converting variables that don’t change into methods, specifically the performance impact.
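As a small sketch of the class case mentioned above, here is a method standing in for a fixed value and a subclass overriding it. The Shape and Triangle classes and the sides and describe methods are made up purely for illustration.

```ruby
# Hypothetical classes, used only to illustrate overriding a "constant" method.
class Shape
  # Behaves like a variable holding 4, but subclasses can override it.
  def sides
    4
  end

  def describe
    "#{self.class.name} has #{sides} sides"
  end
end

class Triangle < Shape
  # Override the value without touching describe.
  def sides
    3
  end
end

puts Shape.new.describe
puts Triangle.new.describe
```

Had sides been a local variable or a literal inside describe, the subclass would have needed to reimplement describe as well.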

To do this I wrote a fairly simple test program.

def bar
	foo = 4
	total = 0
	100000000.times do
		total += foo
	end
	puts total
end

bar

And

def foo
	4
end

def bar
	total = 0
	100000000.times do
		total += foo
	end
	puts total
end

bar

Both tests produce the same result: each reads foo 100,000,000 times and adds it to total. One has foo as a variable and the other as a method.

And here are the results after running both:

Ruby variable test
400000000

real	0m3.257s
user	0m3.249s
sys	0m0.003s

Ruby function test
400000000

real	0m4.786s
user	0m4.777s
sys	0m0.003s
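The same comparison can also be run from a single script with Ruby's standard Benchmark module, which avoids timing interpreter start-up. This is only a sketch: the helper names sum_with_variable and sum_with_method are my own, and the iteration count is lowered from the 100,000,000 used above so it finishes quickly.

```ruby
require 'benchmark'

# Assumption: far fewer iterations than the tests above, to keep the run short.
ITERATIONS = 1_000_000

def foo
  4
end

# Reads a local variable on every iteration.
def sum_with_variable(n)
  foo_value = 4
  total = 0
  n.times { total += foo_value }
  total
end

# Calls the foo method on every iteration.
def sum_with_method(n)
  total = 0
  n.times { total += foo }
  total
end

Benchmark.bm(10) do |x|
  x.report('variable:') { sum_with_variable(ITERATIONS) }
  x.report('method:')   { sum_with_method(ITERATIONS) }
end
```

The absolute numbers will vary with the Ruby version and machine, so treat them as relative to each other rather than comparable to the timings above.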

This likely comes as no surprise to most developers: method calls have a higher overhead than variable accesses. But why do we even have to make a method call to get the number 4 from foo? Why can’t we be more like Haskell and simply copy the 4 into all the places foo is called? The answer, I believe, comes down to the fact that Ruby is a very dynamic language. You can change most things at runtime, so even though we can see that foo simply returns 4, we can’t know that this method won’t be redefined later to return something else.
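To make the redefinition point concrete, here is a minimal sketch in which foo is replaced while the program is running. The value 10 and the use of Object.class_eval are just one illustrative way to do it.

```ruby
def foo
  4
end

def bar
  foo * 2
end

puts bar # uses the original foo

# Redefine foo at runtime. Nothing in bar's source changes, yet its result does,
# so Ruby could not safely have replaced the call to foo with a literal 4.
Object.class_eval do
  define_method(:foo) { 10 }
end

puts bar # now uses the redefined foo
```

The first call prints 8 and the second prints 20, even though bar itself was never touched.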

But what if we try the same experiment in a more static language and see how far optimisation can get us when we can guarantee that certain things about the code will never change at runtime?

I rewrote the tests in C++:

#include <cstdint>
#include <iostream>

int main() {
	int foo = 4;

	int64_t total = 0;

	for (int i = 0; i < 1000000000; i++) {
		total += foo;
	}

	std::cout << total;
}

and

#include <cstdint>
#include <iostream>

int64_t foo() {
	return 4;
}

int main() {
	int64_t total = 0;

	for (int i = 0; i < 1000000000; i++) {
		total += foo();
	}

	std::cout << total;
}

These tests closely replicate the functionality of the Ruby versions, although the loops run 1,000,000,000 iterations, ten times as many as the Ruby tests. Let’s see how they run.

C++ variable test
4000000000
real	0m1.859s
user	0m1.857s
sys	0m0.000s

C++ function test
4000000000
real	0m2.045s
user	0m2.042s
sys	0m0.000s

Again, nothing shocking, but let’s see how they change when compiled with g++ optimisations turned on (-O3)!

C++ variable test optimised (O3)
4000000000
real	0m0.001s
user	0m0.001s
sys	0m0.000s

C++ function test optimised (O3)
4000000000
real	0m0.001s
user	0m0.001s
sys	0m0.000s

Wait, what? They now take virtually no time to run. To find out why, I had to check the compiled output and see what kind of trickery g++ is using to get these speeds.

[Image: compiled output]

Hmm, mov esi, 4000000000: that’s our final output. It seems g++ knows that the output can never change, so it simply precalculates it and hardcodes it into the binary. Smart, but it ruins what I wanted to show, which was that in C++ you could simply inline the function foo and lose the overhead of calling a function.

In conclusion, you can probably use methods to define values that shouldn’t change often, especially if you think they might be extended with more complex logic later. You won’t face any performance issues unless you are calling these methods many millions of times per second: in the Ruby test above, the method version cost roughly 15 nanoseconds more per call.
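For a rough sense of scale, the per-call overhead can be estimated from the Ruby timings above. This is back-of-the-envelope arithmetic on the "real" figures from a single run, and the helper name extra_ns_per_call is my own.

```ruby
# Estimate the extra cost of one method call, in nanoseconds, from two
# wall-clock timings of the same loop.
def extra_ns_per_call(variable_seconds, method_seconds, iterations)
  (method_seconds - variable_seconds) / iterations * 1e9
end

# "real" timings copied from the Ruby results above.
overhead = extra_ns_per_call(3.257, 4.786, 100_000_000)
puts format('%.1f ns extra per call', overhead)
```

At that rate it would take tens of thousands of calls to add even a millisecond, though the exact figure will depend on the Ruby version and hardware.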