Meta-Programming with Ruby

Meta-programming is the writing of computer programs that write or manipulate other programs (or themselves) as their data, or that do part of the work at compile time that is otherwise done at run time. In many cases, this allows programmers to get more done in the same amount of time as they would take to write all the code manually.

The language in which the metaprogram is written is called the metalanguage. The language of the programs that are manipulated is called the object language. The ability of a programming language to be its own metalanguage is called reflection or reflexivity.

Metaprogramming usually works in one of two ways. The first is to expose the internals of the run-time engine to the programming code through application programming interfaces (APIs). The second is dynamic execution of string expressions that contain programming commands. Thus, “programs can write programs”. Although both approaches can be used, most languages tend to lean toward one or the other.

We create objects in Ruby by sending #new to a class:
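As a minimal sketch (the class name Book is only an illustrative choice, and the class is left empty on purpose), sending #new looks like this:

```ruby
# An intentionally empty class; Book is just an illustrative name.
class Book
end

book = Book.new   # send #new to the class to get a fresh instance
puts book.class   # prints "Book"
```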

When a class receives #new it allocates space for a new object and then, if the method exists, invokes the #initialize instance method to put the new object in a valid state. Any arguments passed to #new are automatically passed to initialize, like this:
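A sketch of what such a class might look like follows. The attribute readers and the sample values are assumptions added for illustration; only the three parameters (title, author, ISBN) come from the discussion here.

```ruby
class Book
  attr_reader :title, :author, :isbn

  # #new allocates the object, then forwards its arguments here.
  def initialize(title, author, isbn)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

# Placeholder values, not a real catalogue entry.
book = Book.new("A Sample Title", "A. Writer", "0375760644")
puts book.title   # prints "A Sample Title"
```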

This implementation forces us to supply the title, the author and the ISBN every time we create a book — if we don’t know the details, we still need to provide some values. For example, if all I know is a title, I’m forced to do something like this:
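One way this could look, assuming a three-argument constructor like the one described above and assuming we choose nil to stand for "unknown" (another developer might just as easily pick an empty string):

```ruby
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, author, isbn)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

# All three arguments are mandatory, so the unknowns must be spelled out.
book = Book.new("A Sample Title", nil, nil)
```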

This is a bit cumbersome, and it also leaves the encoding of unknown values to the whim of each developer. It would be nice if there was a way to provide default values for the parameters I don’t know, and indeed, in Ruby there is. Here’s a new class definition that forces us to supply a title, but provides default values for author and ISBN:
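A sketch of such a definition is below. The particular defaults ("Anonymous" and nil) are assumptions chosen for illustration.

```ruby
class Book
  attr_reader :title, :author, :isbn

  # title is required; author and isbn fall back to defaults when omitted.
  def initialize(title, author = "Anonymous", isbn = nil)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

book = Book.new("A Sample Title")
puts book.author   # prints "Anonymous"
```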

When a method has default values for parameters, the corresponding parameters can be omitted; since parameters are mapped by position, this means that if you want to let one parameter use its default, you have to accept the defaults for all the following parameters as well:
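The calls below sketch the possibilities, assuming a constructor with a required title and defaults for author and ISBN (the default values themselves are illustrative). In the last call the ISBN is supplied in the second position, where the author is expected.

```ruby
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, author = "Anonymous", isbn = nil)
    @title  = title
    @author = author
    @isbn   = isbn
  end
end

full       = Book.new("A Sample Title", "A. Writer", "0375760644")
no_isbn    = Book.new("A Sample Title", "A. Writer")   # isbn uses its default
title_only = Book.new("A Sample Title")                # both defaults apply
mistake    = Book.new("A Sample Title", "0375760644")  # ISBN lands in author
```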

The last example will compile and execute, but it will create a book with an author of “0375760644” and an ISBN of nil, which probably isn’t what was intended.

Now let’s imagine that we also want to list the people who own a book when we create the book, and that this list is arbitrarily long. The obvious solution is to pass the owners as an array, like this:
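A minimal sketch of the array version, with author and ISBN left out to keep it short (the owner names are placeholders):

```ruby
class Book
  attr_reader :title, :owners

  # owners is an ordinary parameter that happens to hold an array.
  def initialize(title, owners = [])
    @title  = title
    @owners = owners
  end
end

book = Book.new("A Sample Title", ["Alice", "Bob"])
puts book.owners.length   # prints 2
```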

This certainly works, but Ruby gives us an easier way. Here’s something that’s semantically equivalent:
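The same idea with a splat parameter might look like this (again a simplified sketch with placeholder names):

```ruby
class Book
  attr_reader :title, :owners

  # *owners gathers every argument after title into an array.
  def initialize(title, *owners)
    @title  = title
    @owners = owners
  end
end

book      = Book.new("A Sample Title", "Alice", "Bob")
no_owners = Book.new("Another Title")

puts book.owners.inspect        # prints ["Alice", "Bob"]
puts no_owners.owners.inspect   # prints []
```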

When we invoke #new in this case, we don’t need to make the beginnings of the owners array explicit — specifying “*owners” in the method definition tells Ruby to pick up any remaining arguments and put them in an array called owners. The optional array argument has an implicit default value of an empty Array ([]); it’s illegal to try to set an explicit default value for an optional array argument.

Any method definition may also include one other optional argument — a block of code that can be invoked within the method. We’ve already seen this when we listed examples of enumeration methods in an earlier column:

(1..5).each { |n| puts n*n }

In this code the {} denote the (nominally) optional block. Let’s show how you would define a method like this by adding a method to Book that returns a boolean indicating if the book matches some conditions contained in a block. Here’s how we might invoke the method:

and here’s the implementation:
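Here is one possible shape for both pieces. The method name matches? and the simplified two-argument constructor are assumptions made for this sketch.

```ruby
class Book
  attr_reader :title, :author

  def initialize(title, author = nil)
    @title  = title
    @author = author
  end

  # Hands this book to the caller's block and returns the block's result.
  def matches?(&block)
    yield self
  end
end

book = Book.new("A Sample Title", "A. Writer")

# Invocation: the block states the condition in terms of the book.
by_writer = book.matches? { |b| b.author == "A. Writer" }
by_other  = book.matches? { |b| b.author == "Someone Else" }
```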

The “&” in &block indicates that this is an optional block argument, and the yield keyword tells Ruby to pass control to the block, with self as the block’s parameter. The following code is logically equivalent, but demonstrates two other ways of accomplishing the same thing:
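Two plausible variants are sketched below; which two the column originally showed is not known, so treat these as illustrations. The first calls the captured block explicitly, the second drops the &block declaration and relies on yield alone. The method names are hypothetical.

```ruby
class Book
  attr_reader :author

  def initialize(author)
    @author = author
  end

  # Variant 1: capture the block as a Proc and call it explicitly.
  def matches_with_call?(&block)
    block.call(self)
  end

  # Variant 2: no block parameter at all; yield finds the block implicitly.
  def matches_with_yield?
    yield self
  end
end

book   = Book.new("A. Writer")
first  = book.matches_with_call?  { |b| b.author == "A. Writer" }
second = book.matches_with_yield? { |b| b.author == "A. Writer" }
```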

Ruby also has two more tricks up its sleeve that can improve the expressiveness of our code. First, parentheses are usually optional in method invocations. While programmers have come to accept the proliferation of parentheses, square brackets and braces in their programming languages, these characters aren’t a common part of natural language, and removing them makes code more readable. Second, Ruby is very flexible in the way it treats hashes as method arguments.

A hash, as you may recall, is a set of key value pairs, and we create literal hashes like this:
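For instance (the keys and values are placeholders):

```ruby
# Keys are symbols here; => associates each key with its value.
details = { :author => "A. Writer", :isbn => "0375760644" }

puts details[:author]   # prints "A. Writer"
puts details.length     # prints 2
```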

Using hashes gives us a way to provide named parameters. Let’s change our Book class to use a hash instead of individual parameters for author and ISBN when we create a new instance.
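A sketch of the hash-based constructor, assuming the option keys :author and :isbn:

```ruby
class Book
  attr_reader :title, :author, :isbn

  # options is a hash; any key that is not supplied simply reads as nil.
  def initialize(title, options = {})
    @title  = title
    @author = options[:author]
    @isbn   = options[:isbn]
  end
end

# The hash is written out explicitly, and its keys can come in any order.
book = Book.new("A Sample Title", { :isbn => "0375760644", :author => "A. Writer" })
```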

This is the literal use of a hash — we still don’t need to specify arguments that we don’t need, we can specify the arguments inside the hash in any order we like, and we can use the hash keys to make the method invocation more expressive. If we don’t have any optional arguments that need to be collected into an array, Ruby doesn’t even need us to put in the {} delimiters; it will collect up any key value pairs at the end of the method invocation and put them into a hash. So this is a completely legal constructor:
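With the same kind of hash-based constructor (sketched again here so the example stands alone), the braces can be dropped:

```ruby
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, options = {})
    @title  = title
    @author = options[:author]
    @isbn   = options[:isbn]
  end
end

# No braces: the trailing key/value pairs are gathered into one hash.
book = Book.new("A Sample Title", :author => "A. Writer", :isbn => "0375760644")
```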

There are some constraints on trying to use all these features together: an optional block must be the last parameter; an implicit array must follow any parameters other than an optional block; and key/value pairs will only be collected into a hash if they are the last parameters in a method invocation. The first invocation below is legal because the key/value pairs are the last parameters, but the second invocation is illegal because they are followed by other parameters, and Ruby can’t determine which ones should be gathered into the array and which ones should be gathered into the hash:
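The sketch below assumes a constructor that takes a title followed by a splat of owners, and that peels a trailing hash off the end of that array itself. This is one common idiom, not necessarily the column's original code. The rejected form is shown only as a comment, because the parser refuses it.

```ruby
class Book
  attr_reader :title, :owners, :author

  def initialize(title, *owners)
    # Trailing key/value pairs arrive as one hash at the end of the splat.
    options = owners.last.is_a?(Hash) ? owners.pop : {}
    @title  = title
    @owners = owners
    @author = options[:author]
  end
end

# Legal: the key/value pairs come last.
book = Book.new("A Sample Title", "Alice", "Bob", :author => "A. Writer")

# Not legal (a syntax error): positional arguments after the key/value pairs.
#   Book.new("A Sample Title", :author => "A. Writer", "Alice", "Bob")
```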

Using these features we can create flexible, expressive method definitions and invocations, but Ruby still has a few more features that are useful when we’re constructing DSLs — eval, class_eval and instance_eval.

Fundamentally, eval lets us take a string and evaluate it as Ruby statements. (Eval also supports varying the context in which this is done, but we won’t cover that in this column.) #eval is a method on Kernel, so we can use it inside objects or in simple scripts.

Here’s an example:
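A very small illustration, with the string hard-coded where a real program might build it dynamically:

```ruby
expression = "6 * 7"        # Ruby source held in an ordinary string
result = eval(expression)   # the string is parsed and executed as Ruby

puts result                 # prints 42
```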

This can be very useful if we want to execute arbitrary strings that are being constructed dynamically, or are being read from some external location, such as a file.

#instance_eval is a public method on Object that takes either a string or a block as a parameter and executes the parameter in the context of the receiver. When the string or the block is evaluated, self is set to the receiver of instance_eval. This means that instance variables and private methods can be accessed within the block — though this can be both succinct and dangerous, since it breaks the encapsulation of the receiver! Let’s have a look at an alternative implementation of our book matching code that uses instance_eval:
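One way this might be written is shown below. The method name matches? and the two-argument constructor are assumptions. Note that the block reads @author directly, since self inside the block is the book.

```ruby
class Book
  def initialize(title, author = nil)
    @title  = title
    @author = author
  end

  # Runs the block with self set to this book, so the block can see
  # instance variables without any accessor methods.
  def matches?(&block)
    instance_eval(&block)
  end
end

book = Book.new("A Sample Title", "A. Writer")
matched = book.matches? { @author == "A. Writer" }
```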

Another common use for instance_eval is to define new methods at run time. In Ruby we can define methods that are unique for an instance, and this is what happens when we define a method using #instance_eval and the receiver is an instance:
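For example (the method name summary is hypothetical):

```ruby
class Book
end

book  = Book.new
other = Book.new

# def inside instance_eval attaches the method to this one object only.
book.instance_eval do
  def summary
    "a method that only this book has"
  end
end

puts book.summary
puts other.respond_to?(:summary)   # prints false
```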

When we do the same thing to a class, the same rules apply — in this case, the receiver is a named instance of Class, and the newly defined methods are Class methods, but available only on that specific instance, which is equivalent to creating a new (named) class method. This sounds confusing because the word “class” is overloaded, so let’s look at an example:
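A sketch of that case, using the method name content_description that this column refers to later. The returned string is a placeholder.

```ruby
class Book
end

# The receiver is the class object itself, so the new method is a
# singleton method on Book, in other words a class method.
Book.instance_eval do
  def content_description
    "Books contain pages of text"
  end
end

book = Book.new

puts Book.content_description
puts book.respond_to?(:content_description)   # prints false
```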

Here we’re invoking #instance_eval on the class Book, and the result is a class method on Book, which cannot be invoked on the instance book.

The idea behind #class_eval is similar to the idea behind #instance_eval, but #class_eval is implemented on Module (and is therefore available on every class), not on Object. When we invoke a method in the context of a #class_eval, the corresponding class method is executed. When we define a method in the context of a #class_eval, a new instance method is created. Here are some examples:
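A sketch of the method-defining case follows. The method name location and its fixed answer are illustrative.

```ruby
class Book
end

book = Book.new   # created before the method exists

# def inside class_eval adds an ordinary instance method to Book.
Book.class_eval do
  def location
    "on the shelf"
  end
end

book2 = Book.new  # created afterwards

puts book.location    # prints "on the shelf"
puts book2.location   # prints "on the shelf"
```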

The first invocation of #class_eval creates a new instance method on the class Book, so we can ask both book and book2 for their locations (though the implementation and the answer are a bit simplistic). We can also invoke the class method #content_description, defined earlier using #instance_eval on a class, as shown below:
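The class-method case could be sketched like this, with content_description defined through #instance_eval first so that the example is self-contained:

```ruby
class Book
end

Book.instance_eval do
  def content_description
    "Books contain pages of text"
  end
end

# Inside class_eval, self is Book, so a bare method call reaches the
# class method.
description = Book.class_eval { content_description }
puts description
```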

Finally, let’s put all this together to create a DSL for creating a library and the books within that library. Here’s an example of using the DSL:

and here’s the content of the library instance after we execute this code:

So what does the implementation of the DSL look like? Clearly we need a way to create a book, so let’s use one of our earlier implementations (which one isn’t terribly important):

Now we need to write our library class. We want to execute a block of code when we create the library, so we need to pass an optional block to #initialize. We want to be able to access a method called #new_book in that block; to keep the DSL clean, we don’t want to need to specify the receiver of the block, so we want the receiver to implicitly be the library. This points us to using #instance_eval to evaluate the block, which leads to something like this:
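Putting the pieces together, a compact sketch of the whole DSL might read as follows. The hash-based Book, the books reader, and the sample data are assumptions made for this illustration; #new_book and the use of #instance_eval come from the description above.

```ruby
class Book
  attr_reader :title, :author, :isbn

  def initialize(title, options = {})
    @title  = title
    @author = options[:author]
    @isbn   = options[:isbn]
  end
end

class Library
  attr_reader :books

  def initialize(&block)
    @books = []
    # Evaluate the block with self set to this library, so that a bare
    # new_book call inside the block lands here.
    instance_eval(&block) if block
  end

  def new_book(title, options = {})
    @books << Book.new(title, options)
  end
end

# Using the DSL: no explicit receiver, no parentheses, no braces.
library = Library.new do
  new_book "A Sample Title", :author => "A. Writer", :isbn => "0375760644"
  new_book "Another Title"
end

library.books.each do |b|
  puts "#{b.title} by #{b.author || 'an unknown author'}"
end
```

Guarding the call with `if block` lets Library.new also work without a block, which yields an empty library.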

The important thing here isn’t really the details of the implementation, it’s that Ruby gives us the features and flexibility that we need to create a language that clearly expresses the details of a specific domain, without a lot of the clutter and baggage that accompanies many programming languages. I hope that these examples encourage you to experiment with creating your own DSLs, and to learn more about features of Ruby that go beyond what you might have encountered in your current programming language.


Twitter At Scale: Will It Work?

Only two days ago the contact messaging application Twitter suffered another bout of downtime, leaving some users frustrated and others asking why the platform continues to suffer problems.

I have recently spoken to an individual who is familiar with the technical problems at Twitter as well as the challenges that lie ahead for the startup. He reiterated his belief that the problems lie not with Blaine Cook (the former head of engineering who was shown the door), nor with Joyent NTT (their host) but with the early lack of understanding of how complex their problems would be.

The issue is that group messaging is very difficult to achieve at a grand scale. Other large sites such as WordPress and Digg are mostly dealing with known problems, such as how to serve a large number of pages or a large number of images. Twitter is unique in that it needs to parse a large number of messages and deliver them to multiple recipients, with each user having unique connections to other users.

Social networks have similar complexity issues, but they usually only need to route a message to a single user (or at most to a defined group). Even so, social networks like Friendster struggled for years with technical and scaling issues. Twitter is specifically dealing with text messages, and in most cases with active users those messages are very frequent and go out to hundreds of contacts (or followers, as they are referred to in Twitter). Every new Twitter user and every new connection results in an exponentially greater computational requirement.

Some of the best web applications are able to efficiently solve very complex problems to produce simple results for users (e.g. Google). The success of these applications is due to the innovative efforts of developers to solve large technical challenges, where they have often had to break new ground for solutions. For Twitter to reach a similar point of reliability they too will need a very comprehensive, ground-breaking solution.

The source that I spoke to also commented on how ill-prepared the Twitter team were and are for their current and future challenges. The small team contains a handful of engineers, with only a person or two committed to infrastructure and architecture. He goes on to point out that at Digg the team for network and systems alone is bigger than the total engineering team at Twitter, and that at Digg they are led by well-known “A-list rockstars”.

The problems at Twitter are often attributed to their use of Ruby on Rails, a web development framework. Twitter is almost certainly the largest site running on Rails, so fans of the framework and its developers have been quick to deflect the criticism and point it back at the engineers at Twitter. Utilizing a framework that has never conquered large-scale territory must certainly add to the risk and work required to find a solution. As an out-of-the-box framework, Rails certainly doesn’t lend itself to large-scale application development, but it was a big part of the reason why Twitter could experiment and release early.

Rails has enabled Twitter to prototype quickly, to quickly launch and then to easily iterate with new features. But the old adage of “Good, Fast, Cheap – pick two” certainly applies; and Rails would do itself no harm by conceding that it isn’t a platform that can compete with Java or C when it comes to intensive tasks. Twitter is at a cross-roads as an application and Rails has served its purpose very well to date, but you are unlikely to see a computational cluster built with Ruby at Apache any time soon.

What we see at Twitter today is a very useful and popular service, but one with very complex underlying technical challenges to overcome. Twitter will require not only a new architecture approach and a big injection of the best minds they can find ($15 million can help), but will also need a little patience from users and those of us observing.

PHP vs Ruby

PHP vs Ruby – Practical Language Differences

There are rather significant syntactical differences between PHP and Ruby. For example, PHP requires semicolons at the end of statements and generally requires curly brackets to enclose blocks of code. Ruby, on the other hand, uses newline characters to denote the end of a line of code, and many code constructs, such as function definitions and various loops, are ended with the word “end” rather than being surrounded by curly braces. Below is an example of PHP vs Ruby syntax.

PHP Function

function say_hello($name) {
    $out = "Hello $name";
    return $out;
}

Ruby Function

def say_hello(name)
  out = "Hello #{name}"
  return out
end

In Ruby, you don’t have to write the return line, because the value of the last evaluated expression is returned automatically. There are other differences in syntax between PHP and Ruby and, in general, Ruby is more concise without being cryptic. Another general note about PHP is that it has a very large number of “built-in” functions, many more than Ruby. The Rails framework adds some nice helper methods for formatting dates, numbers, currency, and the like. PHP, however, has a vast library of functions that can do almost anything you will need. The naming of the PHP functions and the order of the parameters they take are somewhat inconsistent, but the functions are there, making PHP an amazingly complete tool set for web application development. An example of the naming inconsistency is evident in function names such as strpos() vs str_replace(). Sometimes an underscore is used to separate words and other times it’s not. While that can be annoying, it’s something you can memorize in time and it’s not a huge deal. What really matters to me is what PHP can do that Ruby can’t, and vice versa.
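To illustrate the implicit-return point made at the start of the paragraph above, the Ruby function can be cut down to a single expression. This is a sketch of the idiom, equivalent in behavior to the longer version:

```ruby
# The string is the last expression evaluated, so it is the return value.
def say_hello(name)
  "Hello #{name}"
end

puts say_hello("World")   # prints "Hello World"
```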

PHP5 was first released over three years ago, on July 13, 2004. Frustratingly, and for various compatibility reasons, PHP5 is only now becoming an available choice on shared hosting accounts. Nevertheless, my comparison is between PHP5 and Ruby. Both PHP5 and Ruby have object oriented features, although Ruby is more sophisticated in terms of OO constructs. But I wanted to know what the real difference is between the languages. When I begin development of my next web application, what will I notice in terms of the differences between the languages?

The Similarities Between PHP and Ruby

Before getting started, I know that PHP and Ruby have very significant syntactic differences and different people may prefer one over the other. So it may be harder to implement certain features in one language over the other. Nevertheless both languages can do almost the same things. Both languages have try/catch/throw style exception handling. Exceptions are new to PHP5 as PHP4 does not have them. Both languages can be used in an object oriented way. Ruby has more powerful object oriented features but most developers probably won’t notice a difference in a normal web application. Both languages have additional functionality that can be added through libraries. In general, when developing web applications, I have not yet run into a time when I was working with one language and hit a road block where the language I was using was not capable of expressing the functionality I needed. Sometimes things are easier in one language versus the other but both PHP and Ruby have been able to “do” the same things.

About Frameworks

You can’t talk about Ruby for web development without mentioning Rails. Rails is fantastic and makes developing web applications much easier due to all of the built-in functionality. My favorite aspect of Rails is the ActiveRecord functionality. Often, many of the objects in my applications are virtually empty except for inheriting from ActiveRecord. Validation is a breeze with Rails and is usually one line such as:

validates_presence_of :attribute_name

The built-in error handling with error_messages_for is very helpful. If you need more flexibility with the display of error messages, there are plugins available for that. The ability to turn email sending on and off for testing is extremely nice. Having different configuration files for live, testing, and development environments is great.

PHP has over 40 different frameworks available. I’ve spent a significant amount of time studying PHP frameworks. The most popular ones appear to be CakePHP and Symfony. Both of these frameworks are essentially Rails clones. I think Symfony is the larger and more comprehensive of the two. My favorite PHP framework, however, is CodeIgniter. It is a Model-View-Controller style framework and has a strong Rails feel to it, but it is much lighter weight. CodeIgniter has no code generators like Rails has. It does have an ActiveRecord style Database class but it is not as powerful as the ActiveRecord in Rails. It is, however, quite nice and very helpful, and much better than nothing. CodeIgniter, as far as I know, does not have anything comparable to Migrations in Rails.

PHP Does Not Have Formal Namespaces

PHP does not officially support namespaces and Ruby does. So organizing your code in a PHP application is something that’s left to the developer to figure out. PHP5 has the ability to autoload classes that are undefined. It is a very useful feature and one that I have used to organize my code into namespaces. I essentially name classes with underscores separating directory names. Here is my autoload function that replaces underscores with slashes to create a path to the class files that I am using.

define("PROJECT", "/path/to/project/classes/");
define("LIBRARY", "/path/to/my/main/php5/library/classes/");

function __autoload($className) {
    $fileName = str_replace("_", DIRECTORY_SEPARATOR, $className) . ".php";
    if (preg_match("/^ProjectName_/", $className)) {
        include_once(PROJECT . $fileName);
    }
    else {
        include_once(LIBRARY . $fileName);
    }
}

I have a bunch of classes that I continually reuse in my PHP applications. The path to the root directory of my library is stored in the LIBRARY constant. Any classes I write that either extend my core library or are specific to the project I am working on get stored in the location specified by the PROJECT constant. This has been a system that works really well for me and is a good workaround for PHP not having namespaces. So PHP not having formal namespaces has not really hindered my development efforts.

Documentation

PHP kills Ruby and Rails when it comes to ease of finding, reading, and even generating documentation. The PHP website has wonderful, helpful, searchable documentation. The Rails Documentation is comprehensive but much harder to navigate and has far fewer code examples. PHPDocumentor also produces much better looking documentation that is easier to navigate than RDoc does.

Hosting Ruby (on Rails) Is Painful And Expensive

All the Ruby applications I have developed to date have used the Rails framework. So this is a comparison between hosting a PHP Application and a Ruby on Rails application. PHP has several good MVC style frameworks. My favorite is CodeIgniter and I’ll be posting an article about it soon. Using a framework or not, PHP is extremely easy to host. Locate virtually any hosting company, sign up for an account and start dropping your files on the server. I can’t say the same for Ruby on Rails.

I started off with a DreamHost account but performance and reliability were miserable. I ended up switching to a virtual dedicated server at Rimu Hosting and have been extremely pleased with their service. I’m running Apache 2.0 + Mongrel + ISPConfig and it’s working out nicely. But every time I want to deploy a new application I have to…

* Set up a mongrel cluster
* Configure two or three ports for the mongrel cluster
* Create a map file for my random load balancer
* Set up a way to restart the cluster if the server reboots
* Configure Apache to forward requests to the mongrel cluster
* Configure capistrano to deploy the application.

Fortunately capistrano handles all of the annoying tedium associated with deploying the application and restarting the server. So once capistrano is set up and working, future deployments are a breeze. Also worth noting, this assumes you have your source code in Subversion. So you have to make sure that Subversion is set up and accessible over the internet to the deployment server as well as your development machine. It is good practice to have your source code in version control and you should be doing this whether the project is PHP or Ruby. If you are not using Subversion and, therefore, not using capistrano, deploying a Rails application is very time consuming and frustrating. Lastly, you need ssh access to your deployment server if you are going to be using Ruby on Rails, whereas PHP can be deployed entirely over FTP.

To have a reasonable and reliable hosting environment for Ruby on Rails applications, you should have a virtual private server (VPS) or a dedicated server. So that will cost you at least $50 per month. You can host multiple Rails applications on the same VPS but the RAM requirements for running the mongrel clusters will quickly catch up to you. While you could host well over 50 PHP sites running on your VPS you will only be able to have a handful of Rails applications.

Conclusion

I suspect very few people will argue that PHP is a more elegant language or is more powerful than Ruby. Frankly, Ruby is probably my favorite language that I have ever worked with and I have worked with Classic ASP, ASP.NET, VB.NET, C#, Java, and Perl all rather extensively over the years. Ruby is both highly expressive and concise, which is rare and refreshing.

Rails is a very comprehensive and effective web development Framework and there’s nothing exactly like it in PHP. You get a huge amount of functionality for free. Developing in Ruby on Rails is also a very fast process because Ruby is a very concise language requiring much less typing than any other language I’ve worked with. CodeIgniter is a really nice PHP framework. It will give you a great boost when developing your next PHP application.

The hosting and deployment struggles with Ruby on Rails are a major sticking point for me though. As the owner of a web development company, I find that many of our smaller clients do not have the budget for their own VPS account and even if they did, we don’t have the staff to manage a large number of VPS accounts or dedicated servers. Keeping the security updates current, managing any issues that may occur with email, and all the other headaches that go along with managing your own VPS or dedicated server are more than we care to take on for the relatively small, practical difference between PHP and Ruby. For large projects, it may be worth the trouble, but for small to medium sized projects, PHP is much easier to deploy, less expensive to host, and the language is capable of taking on everything those types of sites require. For our projects, development time with PHP is not noticeably longer than with Ruby on Rails. Ruby on Rails integrates a lot of things for the developer.

There is ActiveRecord for managing the link between models and the database, migrations for keeping development and live databases in sync, built-in testing, the bundled Prototype JavaScript library for Ajax, and a well defined file system structure. While it may not all be packaged together as well, PHP can do all of the above.

Ruby or PHP which one is for you

Will Ruby kill PHP?

With the recent rise in popularity of the Ruby programming language (largely driven by the excellent but not perfect web framework called Rails), I’ve noticed a little fear in the air … fear on the part of some people in the PHP community.

Will Ruby kill PHP?

The short answer is: NO.

MY REASONING

Though Ruby and PHP are both scripting languages that make developing web applications much easier than it is, say in the Java world, they are very different beasts … each appeals to a different audience.

RUBY IS ELEGANT BUT COMPLEX

Before I go on, I want to point out that Ruby is a great language and I think it makes perfect sense for PHP developers to learn a little Ruby: it is always a good idea to learn other languages because it will make you a better programmer.

That said, I believe Ruby will not appeal to, or fill the need of most PHP’ers – Ruby can be a little too abstract.

JAVA NERDS LOVE RUBY

Ruby is attracting many from the Java world because it expresses very advanced concepts in a simple syntax – contrast this to Java’s (often times) kludgy and verbose representation.

Ruby appeals to the Java crowd because Java people have been trained to think in terms of large scale enterprise applications – regardless of the size of the project.

… These ‘abstractions’ (generally speaking) lend themselves well to larger projects.

WHY PHP WORKS

PHP is often criticized because it has both a procedural and an object oriented way of doing things. Some people think that this divergence (within the language), takes away from it … I think this is part of its strength!

Object oriented constructs are great for creating cleaner designs that are easier to maintain and promote the possibility of code reusability. Code reusability is an often touted advantage of OOP, but from what I’ve seen in the Java world, it is not achieved so often.

With OOP, there is a cost of added complexity and overhead – you simply have to write more code to do things when you do it via OOP.

PHP PROVES THAT NON OO LANGUAGES STILL HAVE THEIR PLACE

I would suggest that the vast majority of PHP work is found in simple projects:

* Email from a web page.
* Process a simple form and save to a database.
* Create a simple store with 10 items.

My point is, that for many PHP projects, OOP may be a little overkill.

WHY RUBY WILL NOT KILL PHP

In Ruby everything is an object (even numbers!) and the core language has very sophisticated constructs that need to be understood to use Ruby effectively – Ruby’s strength is also its weakness.

… I don’t see the majority of PHP users wanting to jump that deep into the world of programmatic abstraction – for most, there is simply no point.

Ruby is a very clean language and it has lots of things going for it. But PHP has lots of things going for it too. You can point out spots where Ruby code is cleaner than PHP, but you can point out spots where (I think) PHP code is cleaner than Ruby …

Today I would consider PHP the better choice because it is well established (lots of IDEs, open source projects, easy hosting etc ..) and proven.

Ruby is just starting to get into the mainstream … and there are still some fundamental issues with Ruby and web development.

For example: Ruby integration with APACHE is still not stable. It works … but there are known problems and can be a hassle to set up.

Ruby has a long way to go. I can’t argue the programming details, as I still like my Perl, but from the SysAdmin side Ruby, and Rails for that matter, are an absolute nightmare.

Ruby in itself is slow, doesn’t understand threads like every other language out there so it doesn’t integrate into apache like PHP or Perl or Python or the other hundreds of web scripting languages.

Until Mongrel came out you were forced to use fast_cgi, which is a dead method of rendering pages, apache’s mod_fcgi is crap and lighttpd’s fcgi is better but the server itself is not stable enough for production.

Ruby has to start by modernizing its preprocessor before it even has a small chance in hell of overtaking PHP, and even then Rails has a long way to go before it’s ready for the primetime; the ease with which a programmer can cause Rails to leak memory is astounding.

By: Stefan Mischook in Killersite.com