Share your knowledge and create a knowledgebase.

Archive for May, 2008


Twitter At Scale: Will It Work?

May 23, 2008 Author: Ashish | Filed under: Industry News, Ruby On Rails

Only two days ago the contact messaging application Twitter suffered another bout of downtime, leaving some users frustrated and others asking why the platform continues to suffer problems.

I recently spoke to an individual who is familiar with the technical problems at Twitter as well as the challenges that lie ahead for the startup. He reiterated his belief that the problems lie not with Blaine Cook (the former head of engineering who was shown the door), nor with Joyent NTT (their host), but with an early lack of understanding of how complex their problems would be.

The issue is that group messaging is very difficult to achieve at a grand scale. Other large sites such as Wordpress and Digg are mostly dealing with known problems, such as how to serve a large number of pages or a large number of images. Twitter is unique in that it needs to parse a large number of messages and deliver them to multiple recipients, with each user having unique connections to other users.

Social networks have similar complexity issues, but they usually only need to route a message to a single user (or at most to a defined group). Even so, social networks like Friendster struggled for years with technical and scaling issues. Twitter is specifically dealing with text messages, and for active users those messages are very frequent and go out to hundreds of contacts (or followers, as they are referred to in Twitter). Every new Twitter user and every new connection adds to a rapidly growing computational requirement, because each message has to reach every follower of its author.
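To make the delivery cost concrete, here is a minimal sketch of "fan-out on write", one common way to picture this kind of problem. It is not a description of Twitter's actual architecture; the follower graph, the fanOut() helper and the in-memory inboxes are all hypothetical, and the code assumes a current PHP version.

```php
<?php
// Hypothetical follower graph: author => list of followers (illustrative data only).
$followers = [
    'alice' => ['bob', 'carol', 'dave'],
    'bob'   => ['alice'],
    'carol' => [],
];

// Copy one message into the inbox of every follower of $author.
// Returns the number of deliveries performed.
function fanOut(array $followers, array &$inboxes, string $author, string $text): int
{
    $count = 0;
    foreach ($followers[$author] ?? [] as $recipient) {
        $inboxes[$recipient][] = "$author: $text";
        $count++;
    }
    return $count;
}

$inboxes = [];
$deliveries  = fanOut($followers, $inboxes, 'alice', 'hello'); // 3 followers
$deliveries += fanOut($followers, $inboxes, 'bob', 'hi');      // 1 follower
echo $deliveries, "\n";
```

In this model the work per message equals the author's follower count, so total work grows with both the number of messages and the number of connections.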

Some of the best web applications are able to efficiently solve very complex problems to produce simple results for users (e.g. Google). The success of these applications is due to the innovative efforts of developers to solve large technical challenges, where they have often had to break new ground for solutions. For Twitter to reach a similar point of reliability, they too will need a very comprehensive, ground-breaking solution.

The source that I spoke to also commented on how ill-prepared the Twitter team were and are for their current and future challenges. The small team contains a handful of engineers, with only a person or two committed to infrastructure and architecture. He went on to point out that at Digg the team for network and systems alone is bigger than the total engineering team at Twitter, and that at Digg they are led by well-known “A-list rockstars”.

The problems at Twitter are often attributed to their use of Ruby on Rails, a web development framework. Twitter is almost certainly the largest site running on Rails, so fans of the framework and its developers have been quick to deflect the criticism and point it back at the engineers at Twitter. Utilizing a framework that has never conquered large-scale territory must certainly add to the risk and work required to find a solution. As an out-of-the-box framework, Rails certainly doesn’t lend itself to large-scale application development, but it was a big part of the reason why Twitter could experiment and release early.

Rails has enabled Twitter to prototype quickly, to quickly launch and then to easily iterate with new features. But the old adage of “Good, Fast, Cheap - pick two” certainly applies; and Rails would do itself no harm by conceding that it isn’t a platform that can compete with Java or C when it comes to intensive tasks. Twitter is at a cross-roads as an application and Rails has served its purpose very well to date, but you are unlikely to see a computational cluster built with Ruby at Apache any time soon.

What we see at Twitter today is a very useful and popular service, but one with very complex underlying technical challenges to overcome. Twitter will require not only a new architecture approach and a big injection of the best minds they can find ($15 million can help), but will also need a little patience from users and those of us observing.

FriendFeed Launches Rooms

May 23, 2008 Author: Ashish | Filed under: Industry News

Activity stream aggregator FriendFeed launched a new feature called FriendFeed Rooms this afternoon, which are effectively topic-based accounts that anyone can create or join (depending on privacy settings). Users can then add links and messages to relevant content.

The main difference between Rooms and a normal FriendFeed account is the fact that multiple users can author it, and that you can’t pull third party feeds into the service.

FriendFeed usage continues to grow steadily, and has clearly gained from the constant downtime of Twitter (a competitor of sorts). I still haven’t gone religious on it, though, as some have. That’s mostly because I don’t like having a third party service centralize all this data about me and then not let that data back out again. See my rant on the Centralized Me for more on that.

According to FriendFeed a room is like a mini FriendFeed for a particular subject or group of people. You can make a room for your family, your work team, or your knitting club. If you’d like to see what a room looks like, check out the FriendFeed News room, a public room where people share and discuss FriendFeed in the press. Everyone in your room can share stuff with each other and leave comments that only other people in your room can see. You decide whether to make your room public, where anyone can join, or private, where you have to invite or approve each member. You can even choose to view everything from your rooms in your feed, instead of just in the rooms themselves.

PHP comes with an extensive catalog of date and time functions, all designed to let you easily retrieve temporal information, massage it into a format you require, and either use it in a calculation or display it to the user. However, if you’d like to do something more complicated, things get much, much hairier.

A simple example of this involves displaying the time on a Web page. With PHP, you can easily use the date() function to read the server’s clock and display the required information in a specific format. But what if you’d like to display the time in a different location - for example, if your company is located in a different country from your server and you want to see “home” time instead of local time? Well, then you have to figure out the difference between the two places and perform some date arithmetic to adjust for the different time zones. If the time difference is significant, you need to take account of whether the new time is on the day before or after, worry about daylight savings time, and keep track of end-of-the-month and leap year constraints.

As you can imagine, the math to perform such time zone conversions can quickly get very complicated if you do it manually. To be fair, PHP has built-in time zone functions to help with this, but these aren’t particularly intuitive and require a fair amount of time to get used to. A quicker alternative is to use the PEAR Date class, which comes with built-in support for time zones and is, by far, the simplest way to perform these conversions.

This tutorial will teach you how to convert temporal values between time zones with the PEAR Date class. It assumes that you have a working Apache and PHP installation and that the PEAR Date class has been correctly installed.

Note: You can install the PEAR Date package directly from the Web, either by downloading it or by using the instructions provided.

Getting started

Let’s begin with the basics - initialising and using a Date object. Create a PHP script with the following lines of code:

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2006-06-21 15:45:27");
 
// retrieve date
echo $d->getDate();
?>

This is fairly simple - include the class code, initialise a Date() object with a date/time string, and then use the getDate() method to display the value you just inserted. Here’s the output:

2006-06-21 15:45:27

What if you want the date in a different format? If the format is a standard one, such as the ISO format, simply pass getDate() a modifier indicating this:

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2006-06-21 15:45:27");
 
// retrieve date in ISO 8601 basic format
echo $d->getDate(DATE_FORMAT_ISO_BASIC);
?>

The output in this case conforms to the standard ISO format.

20060621T154527Z

If you’d like a custom format, you can do that too, with the format() method. Like PHP’s native date() function, this method accepts a series of format specifiers that indicate how each component of the date is to be formatted. Below is an example (look in the class documentation for a complete list of modifiers):

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2006-06-21 15:45:27");
 
// retrieve date as formatted string
echo $d->format("%A, %d %B %Y %T");
?>

And here’s the output:

Wednesday, 21 June 2006 15:45:27

Converting between time zones

Now that you’ve got the basics, let’s talk about time zones. Once you have a Date() object initialised, converting from one time zone to another is a simple two-step process:

1. Tell the Date class which time zone you’re converting from, with the setTZByID() method.
2. Then, tell the Date class which time zone you wish to convert to, with the convertTZByID() method.

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2006-06-21 10:36:27");
 
// set local time zone
$d->setTZByID("GMT");
 
// convert to foreign time zone
$d->convertTZByID("IST");
 
// retrieve converted date/time
echo $d->format("%A, %d %B %Y %T");
?>

In this case, I’m converting from Greenwich Mean Time (GMT) to Indian Standard Time (IST). India is 5.5 hours ahead of Greenwich, which is why the output of the script is:

Wednesday, 21 June 2006 16:06:27

Simple, isn’t it? Here’s another example, this one demonstrating how the class handles leap years and month end values.

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2008-03-01 06:36:27");
 
// set local time zone
$d->setTZByID("GMT");
 
// print local time
echo "Local time is " . $d->format("%A, %d %B %Y %T") . "\n";
 
// convert to foreign time zone
$d->convertTZByID("PST");
 
// retrieve converted date/time
echo "Destination time is " . $d->format("%A, %d %B %Y %T");
?>

And the output is:

Local time is Saturday, 01 March 2008 06:36:27
Destination time is Friday, 29 February 2008 22:36:27

Note: In case you’re wondering where the time zone IDs come from, you can find a complete list within the class documentation.
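For comparison, the same two conversions can be sketched with PHP's built-in DateTime and DateTimeZone classes instead of PEAR. This is an alternative technique rather than part of the PEAR Date class; the convertZone() helper is a hypothetical name, the zone identifiers (UTC standing in for GMT, Asia/Kolkata, America/Los_Angeles) are my choice of equivalents for the IDs used above, and the code assumes a current PHP version.

```php
<?php
// Built-in alternative (no PEAR): reinterpret a timestamp from one zone in another.
function convertZone(string $when, string $fromZone, string $toZone): string
{
    // Parse $when as a wall-clock time in the source zone...
    $date = new DateTime($when, new DateTimeZone($fromZone));
    // ...then switch the object to the destination zone.
    $date->setTimezone(new DateTimeZone($toZone));
    // l = weekday name, d = day, F = month name, Y = year, H:i:s = 24-hour time
    return $date->format('l, d F Y H:i:s');
}

echo convertZone('2006-06-21 10:36:27', 'UTC', 'Asia/Kolkata'), "\n";
echo convertZone('2008-03-01 06:36:27', 'UTC', 'America/Los_Angeles'), "\n";
```

The second call crosses both a day boundary and the leap-day month end, which the built-in classes also handle without manual arithmetic.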

Calculating GMT offsets

Another piece of information that’s sometimes useful when working with time zones is the GMT offset — that is, the difference between the specified time zone and standard GMT. The PEAR Date class lets you get this information easily, via its getRawOffset() method. Here’s an example:

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2006-06-21 10:36:27");
 
// set local time zone
$d->setTZByID("PST");
 
// get raw offset from GMT, in msec
echo $d->tz->getRawOffset();
?>

Here, the getRawOffset() method calculates the time difference between the local time and GMT. Here’s the output:

-28800000

Note that this offset value is expressed in milliseconds, so you will need to divide it by 3600000 (the number of milliseconds in one hour) to calculate the time zone difference in hours.
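As a small worked example of that division, the sketch below turns a raw millisecond offset into a signed hours-and-minutes string, which also copes with half-hour zones. The formatOffset() helper is a hypothetical name, not part of the PEAR Date class, and the code assumes PHP 7 or later for intdiv().

```php
<?php
// Convert a raw offset in milliseconds to a signed "+HH:MM" string.
function formatOffset(int $offsetMs): string
{
    $sign    = $offsetMs < 0 ? '-' : '+';
    // 60000 milliseconds in one minute
    $minutes = intdiv(abs($offsetMs), 60000);
    return sprintf('%s%02d:%02d', $sign, intdiv($minutes, 60), $minutes % 60);
}

echo formatOffset(-28800000), "\n"; // the PST value from above
echo formatOffset(19800000), "\n";  // 5.5 hours ahead of GMT
```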

Tip: You can use the inDaylightTime() method to see if the destination is currently observing daylight savings time. Look in the class documentation for details on this method.

Adding and subtracting timespans

The Date class also lets you perform sophisticated date arithmetic on temporal values, adding or subtracting durations to a date/time value. These durations (or timespans) are expressed as a string containing day, hour, minute and/or second components.

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2006-06-21 15:45:27");
 
// add 01:20 to it
$d->addSpan(new Date_Span("0,1,20,0"));
 
// retrieve date as formatted string
echo $d->format("%A, %d %B %Y %T");
?>

In this case, I’ve added an hour and twenty minutes to the initial timestamp, by calling the Date class’ addSpan() method and supplying it with a Date_Span() object initialised to that duration. The output is fairly easy to guess:

Wednesday, 21 June 2006 17:05:27

Just as you can add timespans, so too can you subtract them. That, in fact, is the purpose of the subtractSpan() method, which is illustrated below.

<?php
// include class
include ("Date.php");
 
// initialize object
$d = new Date("2006-06-21 15:45:27");
 
// add 01:20 to it
$d->addSpan(new Date_Span("0,1,20,0"));
 
// subtract 00:05 from it
$d->subtractSpan(new Date_Span("0,0,5,0"));
 
// retrieve date as formatted string
echo $d->format("%A, %d %B %Y %T");
?>

Here, I’ve first added an hour and twenty minutes, and then subtracted a further five minutes. The net effect is an addition of an hour and fifteen minutes, and the output reflects this:

Wednesday, 21 June 2006 17:00:27
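If PEAR is not available, the same arithmetic can be sketched with the built-in DateTimeImmutable and DateInterval classes. This is an alternative technique, not the PEAR API; the applySpans() helper is a hypothetical name, the UTC zone is an arbitrary choice to keep the result independent of server settings, and the code assumes a current PHP version.

```php
<?php
// Add 1 hour 20 minutes, then subtract 5 minutes, using built-in classes.
function applySpans(string $when): string
{
    $start = new DateTimeImmutable($when, new DateTimeZone('UTC'));
    // ISO 8601 durations: PT1H20M = 1 hour 20 minutes, PT5M = 5 minutes
    $result = $start
        ->add(new DateInterval('PT1H20M'))
        ->sub(new DateInterval('PT5M'));
    return $result->format('l, d F Y H:i:s');
}

echo applySpans('2006-06-21 15:45:27'), "\n";
```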

As the examples above illustrate, the PEAR Date class provides methods to intuitively and efficiently perform fairly complex date math. If you’re looking for a stress-free way to convert timestamps between different locations, I’d heartily recommend it to you.

Dynamic XML document construction with the PHP DOM

May 15, 2008 Author: Ashish | Filed under: PHP, XML

When working with XML-based applications, developers often find themselves facing the requirement to generate XML-encoded data structures on the fly. Examples of this include an XML order template based on user input in a Web form, or an XML representation of a server request or client response based on run-time parameters.

Although this task might seem intimidating, it’s actually quite simple when one takes into account PHP’s sophisticated DOM API for dynamic node construction and manipulation. Over the course of this article, I’ll be introducing you to the main functions in this API, showing you how to programmatically generate a complete well-formed XML document from scratch and save it to disk.

Note: This article assumes a working Apache/PHP5 installation with the DOM functions enabled, and a working knowledge of basic XML constructs such as elements, attributes and CDATA blocks. You can obtain an introduction to these topics from the introductory material at Melonfire (http://melonfire.com/community/columns/trog/article.php?id=78 and http://melonfire.com/community/columns/trog/article.php?id=79)

Creating the XML declaration

Let’s start right at the top, with the XML declaration. In PHP, this is fairly simple; it only requires you to instantiate an object of the DOMDocument class and supply it with a version number. To see it in action type out the example script in Listing A.

Listing A

<?php
// create document (this produces the XML declaration)
$dom = new DOMDocument("1.0");
 
// display document in browser as plain text
// for readability purposes
header("Content-Type: text/plain");
 
// save and display tree
echo $dom->saveXML();
?>

Notice the saveXML() method of the DOMDocument object. I’ll come back to it later; for the moment, simply note that it returns a current snapshot of the XML tree as a string, which you can echo to the browser or store in a variable. In this case, I’ve sent the output to the browser as plain text for readability; in real-world applications, you would probably send it with a Content-Type: text/xml header.

When you view the output in your browser, you should see something like this:

<?xml version="1.0"?>
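The DOMDocument constructor also accepts an optional second argument naming the document encoding, which then appears in the declaration. A minimal sketch, with UTF-8 chosen purely for illustration:

```php
<?php
// Pass an encoding as the second constructor argument.
$dom = new DOMDocument("1.0", "UTF-8");

// saveXML() returns the document as a string, ending in a newline.
$declaration = trim($dom->saveXML());
echo $declaration, "\n";
```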

Adding elements and text nodes

Now that’s all very pretty and fine, but the real power of XML comes from its elements and the content they enclose. Fortunately, once you’ve got the basic DOMDocument initialised, this becomes extremely simple. There are two steps to the process:

1. For each element or text node you wish to add, call the DOMDocument object’s createElement() or createTextNode() method with the element name or text content. This will result in the creation of a new object corresponding to the element or text node.
2. Append the element or text node to a parent node in the XML tree by calling that node’s appendChild() method and passing it the object produced in the previous step.

An example will make this clearer. Consider the script in Listing B.

Listing B

<?php
// create document (this produces the XML declaration)
$dom = new DOMDocument("1.0");
 
// display document in browser as plain text
// for readability purposes
header("Content-Type: text/plain");
 
// create root element
$root = $dom->createElement("toppings");
$dom->appendChild($root);
 
// create child element
$item = $dom->createElement("item");
$root->appendChild($item);
 
// create text node
$text = $dom->createTextNode("pepperoni");
$item->appendChild($text);
 
// save and display tree
echo $dom->saveXML();
?>

Here, I’ve first created a root element named <toppings> and attached it to the document. Next, I’ve created an element named <item> and attached it to the root element. And finally, I’ve created a text node with the value “pepperoni” and attached it to the <item> element. The result should look like this:

<?xml version="1.0"?>
<toppings>
<item>pepperoni</item>
</toppings>

If you’d like to add another topping, simply create another <item> and populate it with different content (Listing C).

Listing C

<?php
// create document (this produces the XML declaration)
$dom = new DOMDocument("1.0");
 
// display document in browser as plain text
// for readability purposes
header("Content-Type: text/plain");
 
// create root element
$root = $dom->createElement("toppings");
$dom->appendChild($root);
 
// create child element
$item = $dom->createElement("item");
$root->appendChild($item);
 
// create text node
$text = $dom->createTextNode("pepperoni");
$item->appendChild($text);
 
// create child element
$item = $dom->createElement("item");
$root->appendChild($item);
 
// create another text node
$text = $dom->createTextNode("tomato");
$item->appendChild($text);
 
// save and display tree
echo $dom->saveXML();
?>

And here’s the revised output:

<?xml version="1.0"?>
<toppings>
<item>pepperoni</item>
<item>tomato</item>
</toppings>
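Once there are more than a couple of items, the repeated create-and-append steps are easier to manage in a loop. The sketch below wraps them in a hypothetical buildToppings() helper that takes an array of names; the helper name and the sample data are illustrative, and the code assumes a current PHP version.

```php
<?php
// Build a <toppings> document with one <item> per array entry.
function buildToppings(array $names): DOMDocument
{
    $dom  = new DOMDocument("1.0");
    $root = $dom->createElement("toppings");
    $dom->appendChild($root);

    foreach ($names as $name) {
        $item = $dom->createElement("item");
        // createTextNode() escapes special characters such as & for you
        $item->appendChild($dom->createTextNode($name));
        $root->appendChild($item);
    }
    return $dom;
}

$dom = buildToppings(["pepperoni", "tomato", "olives & onions"]);
echo $dom->saveXML();
```

Passing the text through createTextNode() means values containing characters like & end up correctly escaped in the serialized output.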

Adding attributes

You can also add qualifying information to your elements, through the thoughtful use of attributes. With the PHP DOM API, attributes are added in a two-step process: first create an attribute node holding the name of the attribute with the DOMDocument object’s createAttribute() method, and then append a text node to it holding the attribute value. Listing D is an example.

Listing D

<?php
// create document (this produces the XML declaration)
$dom = new DOMDocument("1.0");
 
// display document in browser as plain text
// for readability purposes
header("Content-Type: text/plain");
 
// create root element
$root = $dom->createElement("toppings");
$dom->appendChild($root);
 
// create child element
$item = $dom->createElement("item");
$root->appendChild($item);
 
// create text node
$text = $dom->createTextNode("pepperoni");
$item->appendChild($text);
 
// create attribute node
$price = $dom->createAttribute("price");
$item->appendChild($price);
 
// create attribute value node
$priceValue = $dom->createTextNode("4");
$price->appendChild($priceValue);
 
// save and display tree
echo $dom->saveXML();
?>

And here’s what the output will look like:

<?xml version="1.0"?>
<toppings>
<item price="4">pepperoni</item>
</toppings>
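For simple name/value pairs there is also a shorter route: the DOMElement class has a setAttribute() method that creates the attribute and sets its value in one call. A minimal sketch of the same document built that way:

```php
<?php
$dom  = new DOMDocument("1.0");
$root = $dom->createElement("toppings");
$dom->appendChild($root);

$item = $dom->createElement("item");
$item->appendChild($dom->createTextNode("pepperoni"));
$root->appendChild($item);

// one call instead of createAttribute() plus a text node
$item->setAttribute("price", "4");

// serialize only the root element, without the XML declaration
$xml = $dom->saveXML($root);
echo $xml, "\n";
```

The two-step approach remains useful when you need to hold on to the attribute node itself, for example to attach it to an element later.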

Adding CDATA blocks and processing instructions

While not used quite as often, CDATA blocks and processing instructions (PI) are also well-supported by the PHP API, through the DOMDocument object’s createCDATASection() and createProcessingInstruction() methods. Listing E shows you an example.

Listing E

<?php
// create document (this produces the XML declaration)
$dom = new DOMDocument("1.0");
 
// display document in browser as plain text
// for readability purposes
header("Content-Type: text/plain");
 
// create root element
$root = $dom->createElement("toppings");
$dom->appendChild($root);
 
// create child element
$item = $dom->createElement("item");
$root->appendChild($item);
 
// create text node
$text = $dom->createTextNode("pepperoni");
$item->appendChild($text);
 
// create attribute node
$price = $dom->createAttribute("price");
$item->appendChild($price);
 
// create attribute value node
$priceValue = $dom->createTextNode("4");
$price->appendChild($priceValue);
 
// create CDATA section
$cdata = $dom->createCDATASection("\nCustomer requests that pizza be sliced into 16 square pieces\n");
$root->appendChild($cdata);
 
// create PI
$pi = $dom->createProcessingInstruction("pizza", "bake()");
$root->appendChild($pi);
 
// save and display tree
echo $dom->saveXML();
?>

And here’s the output:

<?xml version="1.0"?>
<toppings>
<item price="4">pepperoni</item>
<![CDATA[
Customer requests that pizza be sliced into 16 square pieces
]]>
<?pizza bake()?>
</toppings>

Saving the results

Once you’ve got the tree the way you want it, you can either save it to a file or store it in a PHP variable. The former function is performed by calling the save() method with a file name, while the latter is performed by calling the saveXML() method and assigning the result to a string. Here’s an example (Listing F).

Listing F

<?php
// create document (this produces the XML declaration)
$dom = new DOMDocument("1.0");
 
// indent the output for readability
$dom->formatOutput = true;
 
// create root element
$root = $dom->createElement("toppings");
$dom->appendChild($root);
 
// create child element
$item = $dom->createElement("item");
$root->appendChild($item);
 
// create text node
$text = $dom->createTextNode("pepperoni");
$item->appendChild($text);
 
// create attribute node
$price = $dom->createAttribute("price");
$item->appendChild($price);
 
// create attribute value node
$priceValue = $dom->createTextNode("4");
$price->appendChild($priceValue);
 
// create CDATA section
$cdata = $dom->createCDATASection("\nCustomer requests that pizza be sliced into 16 square pieces\n");
$root->appendChild($cdata);
 
// create PI
$pi = $dom->createProcessingInstruction("pizza", "bake()");
$root->appendChild($pi);
 
// save tree to file
$dom->save("order.xml");
 
// save tree to string
$order = $dom->saveXML();
?>
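One quick way to check what was generated is to load the string back into a fresh DOMDocument and read a value out of it. The sketch below is a minimal round trip under that assumption; it builds a smaller tree than Listing F and uses setAttribute() for brevity.

```php
<?php
// build a small tree
$dom = new DOMDocument("1.0");
$dom->formatOutput = true;
$root = $dom->createElement("toppings");
$dom->appendChild($root);
$item = $dom->createElement("item");
$item->appendChild($dom->createTextNode("pepperoni"));
$item->setAttribute("price", "4");
$root->appendChild($item);

// serialize to a string
$order = $dom->saveXML();

// parse the string again; loadXML() returns false if it is not well-formed
$copy  = new DOMDocument();
$ok    = $copy->loadXML($order);
$first = $copy->getElementsByTagName("item")->item(0);

echo $first->getAttribute("price"), " ", $first->textContent, "\n";
```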

And that’s about it. Hopefully you found this article interesting, and will be able to use these techniques in your daily work with XML. Happy coding!

10 reasons to use PEAR classes

May 15, 2008 Author: Ashish | Filed under: PHP

Most PHP Web developers have heard of PEAR, the PHP Extension and Application Repository, but very few of them actually use it on a regular basis. This is an oversight that should be corrected, because PEAR is actually a rich treasure trove of PHP widgets that can significantly simplify the average Web developer’s workday.

If you think I’m overstating the benefits, ask yourself if you’ve ever written custom code to (a) create HTML e-mail, (b) generate Web forms on the fly, or (c) validate email addresses. PEAR has pre-built PHP packages for all these tasks, and a few hundred more besides. These packages provide a robust, well-tested code base that can save you the time and effort you would otherwise spend on “rolling your own” code. You can’t beat the price either…they’re free!

PEAR classes cover a wide range of tasks, and so this document will focus specifically on classes of interest to developers working with Web pages and form input. If there are other categories you’d like to see addressed, send in your suggestions and we will examine those areas too…and until then, here’s to easier coding!

Note: You can install PEAR packages directly from the Web, by following the provided instructions.

Package Name: Validate
Description: This package provides validation routines for common input types: email addresses, credit card numbers, URLs, dates and times, string and number classes, and more.

Use this package to test user input entered in Web forms and ensure it is valid before using it in a calculation/saving it to a file or database.

URL: http://pear.php.net/package/Validate

Calendar
This package creates calendar data structures for one or more months. These data structures can then be combined with HTML formatting or a template to create a calendar display, complete with forward/backward navigation links.

Use this package to quickly integrate a pop-up Web calendar into a Web site.

http://pear.php.net/package/Calendar

Mail_Mime
This package provides routines to create a MIME-compliant multi-part message. Such a message can contain embedded HTML, images, file attachments, or other parts. The package also provides functions to decode received multi-part messages into their constituent parts.

Use this package to create HTML email with embedded images, or messages with one or more attachments. You can also use this class to decode multi-part messages - for example, as an attachment browser in a Web mail client.

http://pear.php.net/package/Mail_Mime

Cache
This package provides a simple caching framework for a Web site. It allows you to cache the output of PHP scripts as well as function calls, and reduce response times by rendering the cached pages to clients. Cached pages may be stored as files on disk, in a database, or using a custom storage engine.

If your site receives a lot of traffic, use this package to reduce server load and page processing time by occasionally providing clients with snapshots from the page cache instead of the “live” page. You can also reduce the load on your database server by caching the output of frequently-used SQL queries.

http://pear.php.net/package/Cache

Image_Graph
This package makes it possible to automatically convert numerical data into a graph suitable for display on a Web page. Bar, line, pie, radar and scatter graphs are just some of the supported graph types. X and Y axis customisations are supported, as are many different output formats.

Use this package to display numerical data in a visual manner for easier comprehension - for example, when calculating Web site traffic or ad clicks.

http://pear.php.net/package/Image_Graph

HTML_QuickForm
This package provides routines to generate, validate, and process Web forms programmatically. It supports all the HTML form input types, and comes with built-in validation routines for most common input types. It also provides built-in functionality for multi-page forms and file upload forms.

Use this package to significantly simplify the task of generating Web forms at run-time, or to efficiently validate and process form input.

http://pear.php.net/package/HTML_QuickForm

Auth
This package provides a framework for a basic login/authentication system using PHP. It can verify user credentials against a variety of data sources, including MySQL databases, ASCII files, LDAP servers and POP3 servers.

Use this package to quickly create a login system for a Web application. Because it has “out of the box” support for so many authentication sources, it can also be used to implement Web-based “single login” infrastructure.

http://pear.php.net/package/Auth

XML_RSS
This package is designed to parse RSS documents. It extracts information from an RSS feed as PHP data structures, which can be processed and formatted for display.

Use this package to integrate RSS feeds from other sites into your Web pages.

http://pear.php.net/package/XML_RSS

HTML_Progress
This package provides a framework for a progress bar on a Web site. It uses PHP, JavaScript and CSS to display and dynamically update a progress bar with visual notification of a task’s progress.

Use this package to display progress bars for time-consuming Web tasks — for example a file upload or a long-running loop.

http://pear.php.net/package/HTML_Progress

Translation
This package provides a framework for multi-lingual Web sites. It contains routines to retrieve a translation for each string value from a database and insert it into the appropriate location on each translation-enabled page.

Use this package to efficiently handle multi-language versions of a Web site.

http://pear.php.net/package/Translation/
