Meta-Programming with Ruby

Meta-programming is the writing of computer programs that write or manipulate other programs (or themselves) as their data, or that do part of the work at compile time that is otherwise done at run time. In many cases, this allows programmers to get more done in the same amount of time as they would take to write all the code manually.

The language in which the metaprogram is written is called the metalanguage. The language of the programs that are manipulated is called the object language. The ability of a programming language to be its own metalanguage is called reflection or reflexivity.

Metaprogramming usually works in one of two ways. The first is to expose the internals of the run-time engine to the programming code through application programming interfaces (APIs). The second is dynamic execution of string expressions that contain programming commands. Thus, “programs can write programs”. Although both approaches can be used, most languages tend to lean toward one or the other.

We create objects in Ruby by sending #new to a class:
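As a minimal sketch (Book is a hypothetical, empty class used only for illustration), sending #new to a class might look like this:

```ruby
# Book is a hypothetical class used for illustration only.
class Book
end

# Sending #new to the class returns a fresh instance of it.
book = Book.new
puts book.class   # prints "Book"
```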

When a class receives #new it allocates space for a new object and then, if the method exists, invokes the #initialize instance method to put the new object in a valid state. Any arguments passed to #new are automatically passed to initialize, like this:
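A sketch of such a class, assuming a hypothetical Book with a title, an author and an ISBN (the sample values are placeholders):

```ruby
# Hypothetical Book class: the three arguments given to #new
# are handed straight to #initialize.
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, author, isbn)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

book = Book.new("Example Title", "Example Author", "0375760644")
puts book.title   # prints "Example Title"
```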

This implementation forces us to supply the title, the author and the ISBN every time we create a book — if we don’t know the details, we still need to provide some values. For example, if all I know is a title, I’m forced to do something like this:
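One way this might look, assuming the same hypothetical three-argument Book and using nil as the stand-in for "unknown":

```ruby
# Hypothetical Book class with three required parameters.
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, author, isbn)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

# Only the title is known, so the other two slots are padded with nil.
book = Book.new("Example Title", nil, nil)
p book.author   # => nil

# Leaving the arguments out altogether is not an option here:
# Book.new("Example Title")   # raises ArgumentError
```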

This is a bit cumbersome, and it also leaves the encoding of unknown values to the whim of each developer. It would be nice if there was a way to provide default values for the parameters I don’t know, and indeed, in Ruby there is. Here’s a new class definition that forces us to supply a title, but provides default values for author and ISBN:
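A sketch of that definition, assuming nil as the default for both optional parameters (the choice of default is an assumption made for this example):

```ruby
# Hypothetical Book class: title is required, author and isbn default to nil.
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, author = nil, isbn = nil)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

book = Book.new("Example Title")
p book.author   # => nil
p book.isbn     # => nil
```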

When a method has default values for parameters, the corresponding parameters can be omitted; since parameters are mapped by position, this means that if you want to let one parameter use its default, you have to accept the defaults for all the following parameters as well:
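The invocations below sketch the possibilities, again assuming a hypothetical Book whose author and ISBN default to nil:

```ruby
# Hypothetical Book class with positional defaults.
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, author = nil, isbn = nil)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

Book.new("Example Title")                                  # both defaults used
Book.new("Example Title", "Example Author")                # isbn defaults
Book.new("Example Title", "Example Author", "0375760644")  # nothing defaults

# Trying to skip the author and supply only the ISBN:
odd = Book.new("Example Title", "0375760644")
p odd.author   # => "0375760644"
p odd.isbn     # => nil
```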

The last example will compile and execute, but it will create a book with an author of “0375760644” and an ISBN of nil, which probably isn’t what was intended.

Now let’s imagine that we also want to list the people who own a book when we create the book, and that this list is arbitrarily long. The obvious solution is to pass the owners as an array, like this:
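A minimal sketch, assuming a pared-down hypothetical Book that takes only a title and an array of owners:

```ruby
# Hypothetical Book class that accepts the owners as an explicit array.
class Book
  attr_reader :title, :owners

  def initialize(title, owners = [])
    @title  = title
    @owners = owners
  end
end

book = Book.new("Example Title", ["Alice", "Bob"])
p book.owners   # => ["Alice", "Bob"]
```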

This certainly works, but Ruby gives us an easier way. Here’s something that’s semantically equivalent:
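The same hypothetical Book, sketched with a splat parameter instead of an explicit array:

```ruby
# Hypothetical Book class: *owners gathers any remaining arguments into an array.
class Book
  attr_reader :title, :owners

  def initialize(title, *owners)
    @title  = title
    @owners = owners
  end
end

book = Book.new("Example Title", "Alice", "Bob")
p book.owners                          # => ["Alice", "Bob"]
p Book.new("Example Title").owners     # => []
```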

When we invoke #new in this case, we don’t need to wrap the owners in an explicit array: specifying “*owners” in the method definition tells Ruby to pick up any remaining arguments and put them in an array called owners. If no extra arguments are supplied, owners is simply an empty Array ([]); it’s illegal to give this kind of parameter an explicit default value.

Any method definition may also include one other optional argument — a block of code that can be invoked within the method. We’ve already seen this when we listed examples of enumeration methods in an earlier column:

(1..5).each { |n| puts n*n }

In this code the {} denote the (nominally) optional block. Let’s show how you would define a method like this by adding a method to Book that returns a boolean indicating if the book matches some conditions contained in a block. Here’s how we might invoke the method:

and here’s the implementation:
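As a sketch, assuming a hypothetical method name of #matches? on a simplified Book, the implementation and an invocation might look like this:

```ruby
# Hypothetical Book class with a block-taking predicate method.
class Book
  attr_reader :title, :author

  def initialize(title, author = nil)
    @title  = title
    @author = author
  end

  # Hands this book to the caller's block and returns the block's result.
  def matches?(&block)
    yield self
  end
end

book = Book.new("Example Title", "Example Author")
puts book.matches? { |b| b.author == "Example Author" }   # prints "true"
puts book.matches? { |b| b.title.start_with?("Other") }   # prints "false"
```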

The “&” in &block tells Ruby to capture the optional block as a parameter named block, and the yield keyword passes control to whatever block was supplied, handing it self as an argument. The following code is logically equivalent, but demonstrates two other ways of accomplishing the same thing:
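A sketch of those two alternatives, using hypothetical method names: one calls the captured block explicitly, the other relies on yield without declaring a block parameter at all.

```ruby
# Hypothetical Book class showing two equivalent ways to run a caller's block.
class Book
  attr_reader :title

  def initialize(title)
    @title = title
  end

  # Variant 1: capture the block as a Proc and call it explicitly.
  def matches_with_call?(&block)
    block.call(self)
  end

  # Variant 2: no block parameter; yield finds the block implicitly.
  def matches_with_yield?
    yield self
  end
end

book = Book.new("Example Title")
puts book.matches_with_call?  { |b| b.title == "Example Title" }   # prints "true"
puts book.matches_with_yield? { |b| b.title == "Example Title" }   # prints "true"
```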

Ruby also has two more tricks up its sleeve that can improve the expressiveness of our code. First, parentheses are usually optional in method invocations. While programmers have come to accept the proliferation of parentheses, square brackets and braces in their programming languages, these characters aren’t a common part of natural language, and removing them makes code more readable. Second, Ruby is very flexible in the way it treats hashes as method arguments.

A hash, as you may recall, is a set of key value pairs, and we create literal hashes like this:
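For illustration (the constant names and values are placeholders), here is the same hash written with the older "hash rocket" syntax and with the shorter symbol-key syntax that later Ruby versions added:

```ruby
# Two literal spellings of the same hash with symbol keys.
ROCKET_STYLE = { :author => "Example Author", :isbn => "0375760644" }
COLON_STYLE  = { author: "Example Author", isbn: "0375760644" }

puts ROCKET_STYLE == COLON_STYLE   # prints "true"
puts ROCKET_STYLE[:isbn]           # prints "0375760644"
```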

Using hashes gives us a way to provide named parameters. Let’s change our Book class to use a hash instead of individual parameters for author and ISBN when we create a new instance.
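A sketch of that change, assuming a hypothetical second parameter called details that defaults to an empty hash:

```ruby
# Hypothetical Book class: optional details arrive in a single hash.
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, details = {})
    @title  = title
    @author = details[:author]   # nil when the key is absent
    @isbn   = details[:isbn]
  end
end

# Keys can appear in any order, and unknown details are simply left out.
book = Book.new("Example Title", { :isbn => "0375760644", :author => "Example Author" })
puts book.author                                                          # prints "Example Author"
p Book.new("Another Example Title", { :isbn => "0375760644" }).author    # => nil
```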

This is the literal use of a hash — we still don’t need to specify arguments that we don’t need, we can specify the arguments inside the hash in any order we like, and we can use the hash keys to make the method invocation more expressive. If we don’t have any optional arguments that need to be collected into an array, Ruby doesn’t even need us to put in the {} delimiters; it will collect up any key value pairs at the end of the method invocation and put them into a hash. So this is a completely legal constructor:
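For example, with the same hypothetical hash-based Book, the braces can simply be dropped:

```ruby
# Hypothetical Book class whose optional details arrive in a hash.
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, details = {})
    @title  = title
    @author = details[:author]
    @isbn   = details[:isbn]
  end
end

# No {} needed: the trailing key/value pairs are gathered into one hash.
book = Book.new("Example Title", :author => "Example Author", :isbn => "0375760644")
puts book.isbn   # prints "0375760644"
```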

There are some constraints on trying to use all these features together: an optional block must be the last parameter; an implicit array conventionally follows the other positional parameters (Ruby 1.9 and later relax this and allow required parameters after it); and key/value pairs will only be collected into a hash if they are the last parameters in a method invocation. The first invocation below is legal because the key/value pairs are the last parameters, but the second invocation is illegal because they are followed by other parameters, and Ruby can’t determine which ones should be gathered into the array and which ones should be gathered into the hash:
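A sketch of the two invocations, assuming a hypothetical Book that gathers its extra arguments with a splat and peels a trailing hash off the end; the illegal call is left commented out because it does not parse:

```ruby
# Hypothetical Book class combining an implicit array with trailing key/value pairs.
class Book
  attr_reader :title, :owners, :details

  def initialize(title, *owners)
    @title = title
    # Trailing key/value pairs arrive as a single Hash at the end of the array.
    @details = owners.last.is_a?(Hash) ? owners.pop : {}
    @owners  = owners
  end
end

# Legal: the key/value pairs come last.
book = Book.new("Example Title", "Alice", "Bob", :author => "Example Author")
p book.owners             # => ["Alice", "Bob"]
p book.details[:author]   # => "Example Author"

# Illegal: key/value pairs followed by another argument is a syntax error.
# Book.new("Example Title", :author => "Example Author", "Alice")
```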

Using these features we can create flexible, expressive method definitions and invocations, but Ruby still has a few more features that are useful when we’re constructing DSLs — eval, class_eval and instance_eval.

Fundamentally, eval lets us take a string and evaluate it as Ruby statements. (Eval also supports varying the context in which this is done, but we won’t cover that in this column.) #eval is a method on Kernel, so we can use it inside objects or in simple scripts.

Here’s an example:
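A tiny sketch, with a hard-coded string standing in for text that would normally be built up at run time:

```ruby
# The string is parsed and executed as Ruby code.
expression = "3 + 4"
puts eval(expression)        # prints "7"

# eval can also see local variables that are in scope where it is called.
price = 10
puts eval("price * 2")       # prints "20"
```

Because the string is run as ordinary Ruby, eval should only ever be given text from a trusted source.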

This can be very useful if we want to execute arbitrary strings that are being constructed dynamically, or are being read from some external location, such as a file.

#instance_eval is a public method available on every object. It takes either a string or a block as a parameter and executes the parameter in the context of the receiver. When the string or the block is evaluated, self is set to the receiver of instance_eval. This means that instance variables and private methods can be accessed within the block. That is succinct but also dangerous, since it breaks the encapsulation of the receiver! Let’s have a look at an alternative implementation of our book matching code that uses instance_eval:
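One possible sketch, assuming a hypothetical #matches? method on a Book that deliberately exposes no accessors, so the block can only work by reaching the instance variables directly:

```ruby
# Hypothetical Book class with no attr_readers at all.
class Book
  def initialize(title, author = nil)
    @title  = title
    @author = author
  end

  # Runs the caller's block with self set to this book.
  def matches?(&block)
    instance_eval(&block)
  end
end

book = Book.new("Example Title", "Example Author")
# Inside the block, @title and @author belong to the book itself.
puts book.matches? { @title == "Example Title" }    # prints "true"
puts book.matches? { @author == "Someone Else" }    # prints "false"
```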

Another common use for instance_eval is to define new methods at run time. In Ruby we can define methods that are unique for an instance, and this is what happens when we define a method using #instance_eval and the receiver is an instance:
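A sketch with a hypothetical #summary method, defined on one book only:

```ruby
# Hypothetical Book class.
class Book
  def initialize(title)
    @title = title
  end
end

book  = Book.new("Example Title")
other = Book.new("Another Example Title")

# def inside instance_eval creates a method on this one object only.
book.instance_eval do
  def summary
    "A book called #{@title}"
  end
end

puts book.summary                    # prints "A book called Example Title"
puts other.respond_to?(:summary)     # prints "false"
```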

When we do the same thing to a class, the same rules apply — in this case, the receiver is a named instance of Class, and the newly defined methods are Class methods, but available only on that specific instance, which is equivalent to creating a new (named) class method. This sounds confusing because the word “class” is overloaded, so let’s look at an example:
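A sketch using a deliberately trivial method body (the returned text is a placeholder):

```ruby
# Hypothetical Book class.
class Book
end

# The receiver is the class object itself, so the new method lives on Book.
Book.instance_eval do
  def content_description
    "Books contain words"
  end
end

book = Book.new
puts Book.content_description                  # prints "Books contain words"
puts book.respond_to?(:content_description)    # prints "false"
```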

Here we’re invoking #instance_eval on the class Book, and the result is a class method on Book, which cannot be invoked on the instance book.

The idea behind #class_eval is similar to the idea behind #instance_eval, but #class_eval is defined on Module (and is therefore available on every class and module), not on Object. When we invoke a method in the context of a #class_eval, the corresponding class method is executed. When we define a method in the context of a #class_eval, a new instance method is created. Here are some examples:
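A sketch of both cases, using hypothetical #location and #shelf methods:

```ruby
# Hypothetical Book class and two instances created before the class is changed.
class Book
  def initialize(title)
    @title = title
  end
end

book  = Book.new("Example Title")
book2 = Book.new("Another Example Title")

# Defining a method inside class_eval creates an instance method.
Book.class_eval do
  def location
    "in the library"
  end
end

puts book.location    # prints "in the library"
puts book2.location   # prints "in the library"

# Invoking a method inside class_eval runs it against the class itself,
# here the built-in attr_accessor.
Book.class_eval { attr_accessor :shelf }
book.shelf = "A3"
puts book.shelf       # prints "A3"
```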

The first invocation of #class_eval creates a new instance method on the class Book, so we can ask both book and book2 for their locations (though the implementation and the answer are a bit simplistic). We can also invoke the class method #content_description, defined earlier using #instance_eval on a class, as shown below:
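A self-contained sketch, with the class method defined first and a placeholder string as its body:

```ruby
# Hypothetical Book class with a class method added through instance_eval.
class Book
end

Book.instance_eval do
  def content_description
    "Books contain words"
  end
end

# Inside class_eval, self is Book, so the bare call reaches the class method.
puts Book.class_eval { content_description }   # prints "Books contain words"
```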

Finally, let’s put all this together to create a DSL for creating a library and the books within that library. Here’s an example of using the DSL:

and here’s the content of the library instance after we execute this code:

So what does the implementation of the DSL look like? Clearly we need a way to create a book, so let’s use one of our earlier implementations (which one isn’t terribly important):

Now we need to write our library class. We want to execute a block of code when we create the library, so we need to pass an optional block to #initialize. We want to be able to access a method called #new_book in that block; to keep the DSL clean, we don’t want to need to specify the receiver of the block, so we want the receiver to implicitly be the library. This points us to using #instance_eval to evaluate the block, which leads to something like this:
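Putting the pieces together, here is one possible sketch. The Book and Library classes, the #books reader and the sample titles are all assumptions made for illustration, not a definitive implementation:

```ruby
# Hypothetical Book class: a required title plus a hash of optional details.
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, details = {})
    @title  = title
    @author = details[:author]
    @isbn   = details[:isbn]
  end

  def to_s
    "#{title} by #{author || 'unknown'} (ISBN: #{isbn || 'unknown'})"
  end
end

# Hypothetical Library class: the block given to #new runs with self set to
# the library, so new_book needs no explicit receiver.
class Library
  attr_reader :books

  def initialize(&block)
    @books = []
    instance_eval(&block) if block
  end

  def new_book(title, details = {})
    @books << Book.new(title, details)
  end
end

# Using the DSL: no parentheses, no braces, no receiver.
library = Library.new do
  new_book "First Example Title", :author => "Example Author"
  new_book "Second Example Title", :isbn => "0375760644"
end

library.books.each { |book| puts book }
# prints "First Example Title by Example Author (ISBN: unknown)"
# prints "Second Example Title by unknown (ISBN: 0375760644)"
```

Evaluating the block with #instance_eval keeps the DSL terse, at the cost that the block can also reach the library's instance variables and private methods.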

The important thing here isn’t really the details of the implementation, it’s that Ruby gives us the features and flexibility that we need to create a language that clearly expresses the details of a specific domain, without a lot of the clutter and baggage that accompanies many programming languages. I hope that these examples encourage you to experiment with creating your own DSLs, and to learn more about features of Ruby that go beyond what you might have encountered in your current programming language.
