Ten PHP Design Patterns

Design patterns can speed up the development process by providing tested, proven development paradigms. Effective software design requires considering issues that may not become visible until later in the implementation. Reusing design patterns helps to prevent subtle issues that can cause major problems, and it also improves code readability for coders and architects who are familiar with the patterns.

Design patterns were introduced to the software community in Design Patterns, by Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides (colloquially known as the “Gang of Four”). The core concept behind design patterns, presented in the introduction, was simple. Over their years of developing software, Gamma et al. found certain patterns of solid design emerging, just as architects designing houses and buildings can develop templates for where a bathroom should be located or how a kitchen should be configured. Having those templates, or design patterns, means they can design better buildings more quickly. The same applies to software.

Design patterns not only present useful ways for developing robust software faster but also provide a way of encapsulating large ideas in friendly terms. For example, you can say you’re writing a messaging system to provide for loose coupling, or you can say you’re writing an observer, which is the name of that pattern.

It’s difficult to demonstrate the value of patterns using small examples. They often look like overkill because they really come into play in large code bases. This article can’t show huge applications, so you need to think about ways to apply the principles of the example — and not necessarily this exact code — in your larger applications. That’s not to say that you shouldn’t use patterns in small applications. Most good applications start small and become big, so there is no reason not to start with solid coding practices like these.

Now that you have a sense of what design patterns are and why they’re useful, it’s time to jump into ten design patterns for PHP V5.

The factory pattern

Many of the design patterns in the original Design Patterns book encourage loose coupling. To understand this concept, it’s easiest to talk about a struggle that many developers go through in large systems. The problem occurs when you change one piece of code and watch as a cascade of breakage happens in other parts of the system — parts you thought were completely unrelated.

The problem is tight coupling. Functions and classes in one part of the system rely too heavily on behaviors and structures in other functions and classes in other parts of the system. You need a set of patterns that lets these classes talk with each other, but you don’t want to tie them together so heavily that they become interlocked.

In large systems, lots of code relies on a few key classes. Difficulties can arise when you need to change those classes. For example, suppose you have a User class that reads from a file. You want to change it to a different class that reads from the database, but all the code references the original class that reads from a file. This is where the factory pattern comes in handy.

The factory pattern is a class that has some methods that create objects for you. Instead of using new directly, you use the factory class to create objects. That way, if you want to change the types of objects created, you can change just the factory. All the code that uses the factory changes automatically.

Listing 1 shows an example of a factory class.

Listing 1. Factory1.php

An interface called IUser defines what a user object should do. The implementation of IUser is called User, and a factory class called UserFactory creates IUser objects. This relationship is shown as UML in Figure 1.

Figure 1. The factory class and its related IUser interface and user class

If you run this code on the command line using the php interpreter, you get this result:

The test code asks the factory for a User object and prints the result of the getName method.
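A minimal sketch of the arrangement described above may help. The names IUser, User, UserFactory, and getName come from the text; the create method, the constructor argument, and the hard-coded name are assumptions made for illustration.

```php
<?php
// Sketch only: IUser, User, UserFactory and getName follow the text above.
// The create() method and the $id argument are hypothetical.
interface IUser
{
    public function getName();
}

class User implements IUser
{
    private $id;

    public function __construct($id)
    {
        $this->id = $id;
    }

    public function getName()
    {
        // Placeholder value: a real class would read the name from a file or database.
        return "Jack";
    }
}

class UserFactory
{
    // The only place in the code base that names the concrete class.
    public static function create($id)
    {
        return new User($id);
    }
}

// Test code: ask the factory for a user and print its name.
$user = UserFactory::create(1);
echo $user->getName() . "\n";
```

If User is later replaced by a class that reads from the database, only the body of create needs to change; callers keep working against IUser.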

A variation of the factory pattern uses factory methods. These public static methods in the class construct objects of that type. This approach is useful when creating an object of this type is nontrivial. For example, suppose you need to first create the object and then set many attributes. This version of the factory pattern encapsulates that process in a single location so that the complex initialization code isn’t copied and pasted all over the code base.

Listing 2 shows an example of using factory methods.

Listing 2. Factory2.php

This code is much simpler. It has only one interface, IUser, and one class called User that implements the interface. The User class has two static methods that create the object. This relationship is shown in UML in Figure 2.
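The factory-method variation could look something like the following sketch. IUser and User are named in the text; the method names load and create, the private constructor, and the placeholder values are assumptions.

```php
<?php
// Sketch only: load() and create() are hypothetical names for the two
// static factory methods mentioned in the text.
interface IUser
{
    public function getName();
}

class User implements IUser
{
    private $id;
    private $name;

    // Private constructor: callers must go through the static methods.
    private function __construct($id, $name)
    {
        $this->id = $id;
        $this->name = $name;
    }

    // Builds a user that already exists (the lookup is replaced by a placeholder).
    public static function load($id)
    {
        return new self($id, "Jack");
    }

    // Builds a new user with default values.
    public static function create()
    {
        return new self(null, "New user");
    }

    public function getName()
    {
        return $this->name;
    }
}

$user = User::load(1);
echo $user->getName() . "\n";
```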

Figure 2. The IUser interface and the user class with factory methods

Running the script on the command line yields the same result as the code in Listing 1, as shown here:

As stated, sometimes such patterns can seem like overkill in small situations. Nevertheless, it’s still good to learn solid coding forms like these for use in any size of project.

The singleton pattern

Some application resources are exclusive in that there is one and only one of this type of resource. For example, the connection to a database through the database handle is exclusive. You want to share the database handle in an application because repeatedly opening and closing connections adds overhead, particularly during a single page fetch.

The singleton pattern covers this need. An object is a singleton if the application can include one and only one of that object at a time. The code in Listing 3 shows a database connection singleton in PHP V5.

Listing 3. Singleton.php

This code shows a single class called DatabaseConnection. You can’t create your own DatabaseConnection because the constructor is private. But you can get the one and only one DatabaseConnection object using the static get method. The UML for this code is shown in Figure 3.

Figure 3. The database connection singleton

The proof of the pudding is that the database handle returned by the handle method is the same between two calls. You can see this by running the code on the command line.

The two handles returned are the same object. If you use the database connection singleton across the application, you reuse the same handle everywhere.
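As a rough sketch of the singleton described above: DatabaseConnection, the static get method, and the handle method are named in the text, while the internals are assumptions. To stay runnable without a database, a plain object stands in for the real handle.

```php
<?php
// Sketch only: a stdClass object stands in for a real database handle.
class DatabaseConnection
{
    private static $instance = null;
    private $handle;

    // Private constructor: code outside this class cannot call "new".
    private function __construct()
    {
        // Placeholder: a real implementation would open the connection here.
        $this->handle = new stdClass();
    }

    // Returns the one shared instance, creating it on first use.
    public static function get()
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    public function handle()
    {
        return $this->handle;
    }
}

$first = DatabaseConnection::get()->handle();
$second = DatabaseConnection::get()->handle();
var_dump($first === $second); // both calls return the same object
```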

You could use a global variable to store the database handle, but that approach only works for small applications. In larger applications, avoid globals, and go with objects and methods to get access to resources.

The observer pattern

The observer pattern gives you another way to avoid tight coupling between components. This pattern is simple: One object makes itself observable by adding a method that allows another object, the observer, to register itself. When the observable object changes, it sends a message to the registered observers. What those observers do with that information isn’t relevant or important to the observable object. The result is a way for objects to talk with each other without necessarily understanding why.

A simple example is a list of users in a system. The code in Listing 4 shows a user list that sends out a message when users are added. This list is watched by a logging observer that puts out a message when a user is added.

Listing 4. Observer.php

This code defines four elements: two interfaces and two classes. The IObservable interface defines an object that can be observed, and the UserList implements that interface to register itself as observable. The IObserver interface defines what it takes to be an observer, and the UserListLogger implements that IObserver interface. This is shown in the UML in Figure 4.

Figure 4. The observable user list and the user list event logger

If you run this on the command line, you see this output:

The test code creates a UserList and adds the UserListLogger observer to it. Then the code adds a customer, and the UserListLogger is notified of that change.
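The four elements could be sketched as follows. The interface and class names come from the text; the method names addObserver, onChanged, and addCustomer, the log property, and the message format are assumptions.

```php
<?php
// Sketch only: method names and the message format are hypothetical.
interface IObserver
{
    public function onChanged($sender, $args);
}

interface IObservable
{
    public function addObserver(IObserver $observer);
}

class UserList implements IObservable
{
    private $observers = [];

    public function addObserver(IObserver $observer)
    {
        $this->observers[] = $observer;
    }

    public function addCustomer($name)
    {
        // Storing the customer is omitted; only the notification is shown.
        foreach ($this->observers as $observer) {
            $observer->onChanged($this, $name);
        }
    }
}

class UserListLogger implements IObserver
{
    public $log = [];

    public function onChanged($sender, $args)
    {
        $this->log[] = $args;
        echo "'$args' added to user list\n";
    }
}

$list = new UserList();
$list->addObserver(new UserListLogger());
$list->addCustomer("Jack");
```

UserList only ever calls onChanged, so further observers can be registered without touching it.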

It’s critical to realize that the UserList doesn’t know what the logger is going to do. There could be one or more listeners that do other things. For example, you may have an observer that sends a message to the new user, welcoming them to the system. The value of this approach is that the UserList is ignorant of all the objects depending on it; it focuses on its job of maintaining the user list and sending out messages when the list changes.

This pattern isn’t limited to objects in memory. It’s the underpinning of the database-driven message queuing systems used in larger applications.

The chain-of-command pattern

Building on the loose-coupling theme, the chain-of-command pattern (more widely known as chain of responsibility) routes a message, command, or other request through a set of handlers. Each handler decides for itself whether it can handle the request. If it can, the request is handled, and the process stops. You can add or remove handlers from the system without influencing other handlers. Listing 5 shows an example of this pattern.

Listing 5. Chain.php

This code defines a CommandChain class that maintains a list of ICommand objects. Two classes implement the ICommand interface: one that responds to requests for mail and another that responds to adding users. The UML is shown in Figure 5.

Figure 5. The command chain and its related commands

If you run the script, which contains some test code, you see the following output:

The code first creates a CommandChain object and adds instances of the two command objects to it. It then runs two commands to see who responds to those commands. If the name of the command matches neither UserCommand nor MailCommand, the code falls through and nothing happens.
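One way the chain might be written is sketched below. CommandChain, ICommand, UserCommand, and MailCommand are named in the text; the method names addCommand, runCommand, and onCommand, and the command strings 'addUser' and 'mail', are assumptions.

```php
<?php
// Sketch only: method names and command strings are hypothetical.
interface ICommand
{
    // Returns true if this handler dealt with the request.
    public function onCommand($name, $args);
}

class CommandChain
{
    private $commands = [];

    public function addCommand(ICommand $command)
    {
        $this->commands[] = $command;
    }

    public function runCommand($name, $args)
    {
        foreach ($this->commands as $command) {
            if ($command->onCommand($name, $args)) {
                return true; // handled, so stop walking the chain
            }
        }
        return false; // no handler claimed the request
    }
}

class UserCommand implements ICommand
{
    public function onCommand($name, $args)
    {
        if ($name !== 'addUser') {
            return false;
        }
        echo "UserCommand handling 'addUser'\n";
        return true;
    }
}

class MailCommand implements ICommand
{
    public function onCommand($name, $args)
    {
        if ($name !== 'mail') {
            return false;
        }
        echo "MailCommand handling 'mail'\n";
        return true;
    }
}

$chain = new CommandChain();
$chain->addCommand(new UserCommand());
$chain->addCommand(new MailCommand());
$chain->runCommand('addUser', null);
$chain->runCommand('mail', null);
```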

The chain-of-command pattern can be valuable in creating an extensible architecture for processing requests, which can be applied to many problems.

The strategy pattern

The next design pattern is the strategy pattern. In this pattern, algorithms are extracted from complex classes so they can be replaced easily. For example, the strategy pattern is an option if you want to change the way pages are ranked in a search engine. Think about a search engine in several parts: one that iterates through the pages, one that ranks each page, and another that orders the results based on the rank. In a complex example, all those parts would be in the same class. Using the strategy pattern, you take the ranking portion and put it into another class so you can change how pages are ranked without interfering with the rest of the search engine code.

As a simpler example, Listing 6 shows a user list class that provides a method for finding a set of users based on a plug-and-play set of strategies.

Listing 6. Strategy.php

The UML for this code is shown in Figure 6.

Figure 6. The user list and the strategies for selecting users

The UserList class is a wrapper around an array of names. It implements a find method that takes one of several strategies for selecting a subset of those names. Those strategies are defined by the IStrategy interface, which has two implementations: One chooses users randomly and the other chooses all the names after a specified name. When you run the test code, you get the following output:

The test code runs the same user lists against two strategies and shows the results. In the first case, the strategy looks for any name that sorts after J, so you get Jack, Lori, and Megan. The second strategy picks names randomly and yields different results every time. In this case, the results are Andy and Megan.
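The following sketch shows how such a user list and its strategies might fit together. UserList, IStrategy, and find are named in the text; the filter method and the class names FindAfterStrategy and RandomStrategy are assumptions.

```php
<?php
// Sketch only: filter(), FindAfterStrategy and RandomStrategy are hypothetical names.
interface IStrategy
{
    // Returns true if the record should be part of the result.
    public function filter($record);
}

class FindAfterStrategy implements IStrategy
{
    private $name;

    public function __construct($name)
    {
        $this->name = $name;
    }

    public function filter($record)
    {
        // Keep records that sort at or after the given name.
        return strcmp($this->name, $record) <= 0;
    }
}

class RandomStrategy implements IStrategy
{
    public function filter($record)
    {
        // Keep each record with a probability of roughly one half.
        return mt_rand(0, 1) === 1;
    }
}

class UserList
{
    private $list;

    public function __construct(array $names)
    {
        $this->list = $names;
    }

    public function find(IStrategy $strategy)
    {
        $matches = [];
        foreach ($this->list as $name) {
            if ($strategy->filter($name)) {
                $matches[] = $name;
            }
        }
        return $matches;
    }
}

$users = new UserList(["Andy", "Jack", "Lori", "Megan"]);
print_r($users->find(new FindAfterStrategy("J")));
print_r($users->find(new RandomStrategy())); // varies from run to run
```

UserList never inspects how a strategy decides, so a new selection rule is a new class rather than a change to find.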

The strategy pattern is great for complex data-management systems or data-processing systems that need a lot of flexibility in how data is filtered, searched, or processed.

The adapter pattern

Use the adapter pattern when you need to convert an object of one type to an object of another type. Typically, developers handle this process through a bunch of assignment code, as in the listing that follows. The adapter pattern is a nice way to clean this type of code up and reuse all your assignment code in other places. Also, it hides the assignment code, which can simplify things quite a bit if you’re also doing some formatting along the way.

Listing 6. Using code to assign values between objects

This example uses an AddressDisplay object to display an address to a user. The AddressDisplay object has two parts: the type of address and a formatted address string.

After implementing the pattern (see Listing 6.1), the PHP script no longer needs to worry about exactly how the EmailAddress object is turned into the AddressDisplay object. That’s a good thing, especially if the AddressDisplay object changes or the rules that govern how an EmailAddress object is turned into an AddressDisplay object change. Remember, one of the main benefits of designing your code in a modular fashion is that you have to change as little code as possible when something in the business domain changes or you need to add a new feature to the software. Think about this even when you’re doing mundane tasks, such as assigning values from properties of one object to another.

Listing 6.1. Using the adapter pattern
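A minimal sketch of an adapter built by extending the target class is shown here. EmailAddress and AddressDisplay are named in the text; the adapter class name, the getters and setters, and the sample address are assumptions.

```php
<?php
// Sketch only: accessor names and EmailAddressDisplayAdapter are hypothetical.
class EmailAddress
{
    private $emailAddress;

    public function __construct($emailAddress)
    {
        $this->emailAddress = $emailAddress;
    }

    public function getEmailAddress()
    {
        return $this->emailAddress;
    }
}

// The two parts described in the text: a type and a formatted string.
class AddressDisplay
{
    private $addressType;
    private $addressText;

    public function setAddressType($type) { $this->addressType = $type; }
    public function getAddressType() { return $this->addressType; }
    public function setAddressText($text) { $this->addressText = $text; }
    public function getAddressText() { return $this->addressText; }
}

// The adapter keeps the assignment code in one place.
class EmailAddressDisplayAdapter extends AddressDisplay
{
    public function __construct(EmailAddress $email)
    {
        $this->setAddressType("email");
        $this->setAddressText($email->getEmailAddress());
    }
}

$display = new EmailAddressDisplayAdapter(new EmailAddress("user@example.com"));
echo $display->getAddressType() . ": " . $display->getAddressText() . "\n";
```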

Figure 6.2 shows a class diagram of the adapter pattern.

Figure 6.2. Class diagram of the adapter pattern

Alternate method

An alternate method of writing an adapter, and one that some prefer, is to implement an interface to adapt behavior, rather than extending an object. This is a very clean way of creating an adapter and doesn’t have the drawbacks of extending the object. One disadvantage of using the interface is that you need to add the implementation into the adapter class, as shown in Figure 6.3.

Figure 6.3. The adapter pattern (using an interface)

The iterator pattern

The iterator pattern provides a way to encapsulate looping through a collection or array of objects. It is particularly handy if you want to loop through different types of objects in the collection.

Look back at the e-mail and physical address example in Listing 6. Before adding an iterator pattern, if you’re looping through the person’s addresses, you might loop through the physical addresses and display them, then loop through the person’s e-mail addresses and display them, then loop through the person’s IM addresses and display those. That’s some messy looping!

Instead, by implementing an iterator, all you have to do is call while($itr->hasNext()) and deal with the next item $itr->next() returns. An example of one of the iterators is shown in Listing 7. An iterator is powerful because you can add new types of items through which to iterate, and you don’t have to change the code that loops through the items. In the Person example, for instance, you could add an array of IM addresses; simply by updating the iterator, you don’t have to change any code that loops through the addresses for display.
Listing 7. Using the iterator pattern to loop through objects

If the Person object is modified to return an implementation of the AddressIterator interface, the application code that uses the iterator doesn’t need to be modified if the implementation is extended to loop through additional objects. You can use a compound iterator that wraps the iterators that loop through each type of address, like the one shown in Listing 7.
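A compound iterator of this kind might be sketched as follows. AddressIterator, hasNext, and next come from the text; ArrayAddressIterator, CompoundAddressIterator, and the sample addresses are assumptions, and plain strings stand in for address objects.

```php
<?php
// Sketch only: the two iterator classes are hypothetical.
interface AddressIterator
{
    public function hasNext();
    public function next();
}

// Walks a single array of addresses.
class ArrayAddressIterator implements AddressIterator
{
    private $items;
    private $position = 0;

    public function __construct(array $items)
    {
        $this->items = array_values($items);
    }

    public function hasNext()
    {
        return $this->position < count($this->items);
    }

    public function next()
    {
        return $this->items[$this->position++];
    }
}

// Wraps several iterators and presents them as one sequence.
class CompoundAddressIterator implements AddressIterator
{
    private $iterators;
    private $current = 0;

    public function __construct(array $iterators)
    {
        $this->iterators = array_values($iterators);
    }

    public function hasNext()
    {
        // Skip past iterators that have been used up.
        while ($this->current < count($this->iterators)) {
            if ($this->iterators[$this->current]->hasNext()) {
                return true;
            }
            $this->current++;
        }
        return false;
    }

    public function next()
    {
        return $this->hasNext() ? $this->iterators[$this->current]->next() : null;
    }
}

$itr = new CompoundAddressIterator([
    new ArrayAddressIterator(["1 Main Street"]),
    new ArrayAddressIterator(["user@example.com", "other@example.com"]),
]);
while ($itr->hasNext()) {
    echo $itr->next() . "\n";
}
```

Adding IM addresses would mean passing one more iterator to the compound; the display loop stays the same.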

Figure 7 shows a class diagram of the iterator pattern.
Figure 7. Class diagram of the iterator pattern

The decorator pattern

Consider the code sample in Listing 8. The purpose of this code is to add a bunch of features onto a car for a Build Your Own Car site. Each car model has more features and an associated cost. With only two models, it would be fairly trivial to add these features with if-then statements. However, if a new model came along, you’d have to go back through the code and make sure the statements worked for the new model.
Listing 8. Using the decorator pattern to add features

Enter the decorator pattern, which allows you to add this functionality onto the AutomobileModel in a nice, clean class. Each class remains concerned only about its price and options and how they’re added to the base model.
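As an illustration, a decorator arrangement for the car example could look like this sketch. Only AutomobileModel is named in the text; the Automobile interface, the decorator classes, and all prices are invented for the example.

```php
<?php
// Sketch only: everything except the AutomobileModel name is hypothetical.
interface Automobile
{
    public function getPrice();
    public function getDescription();
}

class AutomobileModel implements Automobile
{
    public function getPrice() { return 20000; }
    public function getDescription() { return "Base model"; }
}

// Common base for decorators: holds the automobile being wrapped.
abstract class AutomobileDecorator implements Automobile
{
    protected $car;

    public function __construct(Automobile $car)
    {
        $this->car = $car;
    }
}

class SunroofOption extends AutomobileDecorator
{
    public function getPrice() { return $this->car->getPrice() + 1000; }
    public function getDescription() { return $this->car->getDescription() . ", sunroof"; }
}

class LeatherSeatsOption extends AutomobileDecorator
{
    public function getPrice() { return $this->car->getPrice() + 1500; }
    public function getDescription() { return $this->car->getDescription() . ", leather seats"; }
}

// Decorators stack: each one wraps the result of the previous one.
$car = new LeatherSeatsOption(new SunroofOption(new AutomobileModel()));
echo $car->getDescription() . ": " . $car->getPrice() . "\n";
```

Each option knows only its own price and label, so a new option is one more small class.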

Figure 8 shows a class diagram for the decorator pattern.
Figure 8. Class diagram of the decorator pattern

An advantage of the decorator pattern is that you can easily tack on more than one decorator to the base at a time.

If you’ve done much work with stream objects, you have used a decorator. Most stream constructs are decorators that take a base stream and then decorate it by adding functionality, such as reading from a file or buffering the data.

The delegate pattern

The delegate pattern provides a way of delegating behavior based on different criteria. Consider the code in Listing 9. This code contains several conditions. Based on the condition, the code selects the appropriate type of object to handle the request.
Listing 9. Using conditional statements to route shipping requests

With a delegate pattern, an object internalizes this routing process by setting an internal reference to the appropriate object when a method is called, like useRail() in Listing 10. This is especially handy if the criteria change for handling various packages or if a new type of shipping becomes available.
Listing 10. Using the delegate pattern to route shipping requests

The delegate provides the advantage that behavior can change dynamically by calling the useRail() or useTruck() method to switch which class handles the work.
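A small sketch of the idea follows. The useRail and useTruck methods are named in the text; the Shipper interface, the class names, the ship method, and the choice of truck as the default are assumptions.

```php
<?php
// Sketch only: all names except useRail() and useTruck() are hypothetical.
interface Shipper
{
    public function ship($package);
}

class RailShipper implements Shipper
{
    public function ship($package)
    {
        return "Shipping $package by rail";
    }
}

class TruckShipper implements Shipper
{
    public function ship($package)
    {
        return "Shipping $package by truck";
    }
}

class ShippingDelegate
{
    private $shipper;

    public function __construct()
    {
        $this->useTruck(); // assumed default
    }

    // Each use* method swaps the object that does the real work.
    public function useRail()
    {
        $this->shipper = new RailShipper();
    }

    public function useTruck()
    {
        $this->shipper = new TruckShipper();
    }

    // Callers always use this method; the routing stays internal.
    public function ship($package)
    {
        return $this->shipper->ship($package);
    }
}

$delegate = new ShippingDelegate();
echo $delegate->ship("crate") . "\n";
$delegate->useRail();
echo $delegate->ship("crate") . "\n";
```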

Figure 10 shows a class diagram of the delegate pattern.
Figure 10. Class diagram of the delegate pattern

The state pattern

The state pattern is similar to the command pattern, but the intent is quite different. Consider the code below.
Listing 11. Using code to build a robot

In this listing, the PHP code represents the operating system for a powerful robot that turns into a car. The robot can power up, power down, turn into a robot when it’s a vehicle, and turn into a vehicle when it’s a robot. The code is OK now, but you see that it can become complex if any of the rules change or if another state comes into the picture.

Now look at Listing 12, which has the same logic for handling the robot’s states, but this time puts the logic into the state pattern. The code in Listing 12 does the same thing as the original code, but the logic for handling states has been put into one object for each state. To illustrate the advantages of using the design pattern, imagine that after a while, these robots have discovered that they shouldn’t power down while being in robot mode. In fact, if they power down, they must change to vehicle mode first. If they’re already in vehicle mode, the robot just powers down. With the state pattern, the changes are pretty trivial.
Listing 12. Using the state pattern to handle the robot’s state

Listing 13. Small changes to one of the state objects

Something that doesn’t appear obvious when looking at Figure 14 is that each object in the state pattern has a reference to the context object (the robot), so each object can advance the state onto the appropriate one.
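The robot example might be sketched along these lines, including the rule that a robot in robot mode switches to vehicle mode before powering down. All class and method names and messages here are assumptions; the point is that each state object holds a reference to the robot and decides which state comes next.

```php
<?php
// Sketch only: all names and messages are hypothetical.
interface RobotState
{
    public function powerUp();
    public function powerDown();
    public function turnIntoRobot();
    public function turnIntoVehicle();
}

// The context: it forwards every request to its current state.
class Robot
{
    private $state;
    public $log = [];

    public function __construct() { $this->state = new OffState($this); }
    public function setState(RobotState $state) { $this->state = $state; }
    public function say($message) { $this->log[] = $message; echo $message . "\n"; }

    public function powerUp() { $this->state->powerUp(); }
    public function powerDown() { $this->state->powerDown(); }
    public function turnIntoRobot() { $this->state->turnIntoRobot(); }
    public function turnIntoVehicle() { $this->state->turnIntoVehicle(); }
}

// By default a state ignores requests that make no sense for it.
abstract class BaseState implements RobotState
{
    protected $robot;

    public function __construct(Robot $robot) { $this->robot = $robot; }
    public function powerUp() {}
    public function powerDown() {}
    public function turnIntoRobot() {}
    public function turnIntoVehicle() {}
}

class OffState extends BaseState
{
    public function powerUp()
    {
        $this->robot->say("Powering up in vehicle mode");
        $this->robot->setState(new VehicleState($this->robot));
    }
}

class VehicleState extends BaseState
{
    public function powerDown()
    {
        $this->robot->say("Powering down");
        $this->robot->setState(new OffState($this->robot));
    }

    public function turnIntoRobot()
    {
        $this->robot->say("Turning into a robot");
        $this->robot->setState(new RobotModeState($this->robot));
    }
}

class RobotModeState extends BaseState
{
    public function turnIntoVehicle()
    {
        $this->robot->say("Turning into a vehicle");
        $this->robot->setState(new VehicleState($this->robot));
    }

    public function powerDown()
    {
        // Rule change from the text: go to vehicle mode first, then power down.
        $this->turnIntoVehicle();
        $this->robot->powerDown();
    }
}

$robot = new Robot();
$robot->powerUp();
$robot->turnIntoRobot();
$robot->powerDown();
```

In this sketch the rule change touches only RobotModeState::powerDown; the Robot class and the other states are untouched.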
Figure 14. Class diagram of the state pattern

Summary

Using design patterns in your PHP code is one way to make your code more readable and maintainable. By using established patterns, you benefit from common design constructs that allow other developers on a team to understand your code’s purpose. It also allows you to benefit from the work done by other designers, so you don’t have to learn the hard lessons of design ideas that don’t work out. (Taken from IBM developerWorks.)


Prevent identity theft by avoiding these seven common mistakes

Identity theft may be on the rise, but you don’t have to make it easy for thieves — take steps to protect the personally identifiable information (PII) of your employees and clients.

Is your organization part of the solution or part of the problem? PII is pouring through the security floodgates and ending up in the wrong hands at an alarming rate.

To protect your organization’s employees and clients, you need to evaluate how well your company protects its PII. Here are seven common mistakes to avoid.

Keep users in the dark

Users will always be the weakest link in any enterprise network — and all of the gadgets and controls in the world won’t change that. If your users don’t know how to identify and handle PII, it’s only a matter of time before one of them discloses this data to the wrong source.

The solution is simple: Educate your users on your company’s policies and mechanisms to process PII. And don’t forget to include regularly scheduled refresher courses.

Partner with the wrong businesses

You’ve made sure your security is rock solid, and you’ve trained your users. But can your business partners say the same? Do you collect or share information with businesses that have little or no security?

If your company collects and shares PII with insecure partners, who do you think will end up in the paper, explaining to law enforcement how a breach occurred? Your company will.

The solution is just as simple as the last dilemma: Educate and train your business partners on how to protect this sensitive information. Charge them for your expertise if you want, but get the job done.

Keep data around past its prime

What do you do with data once it’s served its purpose? If you aren’t destroying PII when it’s no longer required, then you’re not doing your job. That doesn’t mean throwing it away either — that means destroying it.

Dumpster divers make a living off of old bank statements and credit card receipts. That’s why you need to wipe out PII when it’s no longer necessary. If your organization doesn’t have a shredder, you need to get one today.

Don’t worry about physical security

It’s imperative that you implement physical access controls to prevent unauthorized people — including employees — from gaining access to PII. Get a door lock and a badge reader, and start controlling access.

Don’t lock up your records

If you don’t have specific storage areas on your network (as well as file cabinets) for PII, then how can you properly protect it? Take inventory of your network, and your paper copies, and develop a plan to protect that data. This would be a good time to research encrypting data-at-rest and locking some file cabinets.

Ignore activity on your network

I’ve said this before in columns, but it’s worth repeating: If you’re not going to actively monitor your network for suspicious activity or incidents, then stop collecting the data. Develop a method that’s within your capabilities and budget to monitor your network for suspicious activity or incidents. And while you’re at it, develop a response and mitigation strategy for security incidents.

Audits? Who needs audits?

A lot of businesses either don’t know what security events to audit or don’t read their security logs — or both. If you’re not sure which events to audit, find out. Set up security auditing, and start reviewing your logs today.

Final thoughts

Identity theft may be on the rise, but you don’t have to make it easy for thieves. You can help prevent identity theft both at home and at the office — you just need to take a few extra steps.

Pop-up windows: Know the difference

There’s been a lot of publicity about pop-up windows, and most of it hasn’t exactly been rave reviews. But it hasn’t always been this way.

In fact, pop-up windows were a positive component in the beginning. Created long before tabbed browsers, their purpose was to present information without interfering with the current browser window.

These days, due to security risks as well as the annoyance factor, a standard feature among browsers is to block or control pop-up behavior. But before you start telling your browser or other privacy programs to block all those pop-ups, you need to understand why they happen and what you should really be doing about them.

Most pop-ups are part of the content from the Web site the user is visiting, containing either requested information or info the site thinks one might like. But other pop-ups are just spam that’s both invasive and malicious in nature.

These types of pop-ups are actually an alarm telling you that something’s wrong with your computer and you need to fix it. Let’s divide pop-ups into two general categories — normal and alarms.

Normal pop-ups

Some pop-ups are information you’ve requested — music or video content from a link you just clicked or a download you requested (hopefully from a trusted site). Web-access e-mail programs use pop-ups to create or reply to e-mail, which mimics a traditional e-mail client.

In addition, some pop-ups are targeted advertising marketed specifically to consumers visiting a Web site. If you find yourself getting too many of these advertisements, it’s probably due to the sites you’re visiting.

In general, all of these types of pop-ups are the kind you want. And if not, you can easily dismiss them with a click on the X. These are the pop-ups you should be controlling with your browser or privacy program. But the other types of pop-ups are the ones you want to see — because they’re alerting you that something’s wrong with your system.

Alarm pop-ups

You don’t want to block the pop-ups that indicate a problem with your system — these are the ones you want to see and take action on to resolve. For example, if pop-ups are launching through the Windows Messenger Service, you’ve got a potentially serious problem.

To get rid of these pop-ups, you need to turn off the Messenger Service. Follow these steps:

1. Go to Start | Run, type services.msc, and click OK to launch the Services applet.
2. Scroll down to find Messenger.
3. Right-click Messenger, and select Properties.
4. On the General tab, select Disabled from the Startup Type drop-down list, and click OK.

This is a serious security issue. While the Messenger Service pop-up starts with data on UDP 135, this pop-up indicates that the Windows networking ports (i.e., TCP/UDP 135, 137 through 139, and 445) are open to the public. This pop-up is an alarm that you need to block these ports with your firewall.

Another type of alarm pop-up is the browser flood. As soon as your browser opens, you start receiving a swarm of pop-ups. This browser “spam” is telling you that spyware/adware is running on your system. While this is usually why people enable pop-up blockers, that’s comparable to rolling down your window and sticking your head outside so you can see to drive.

What’s the real solution? Clean your Windows! Blocking the alarm doesn’t solve the problem. If your system has experienced this type of behavior, start shopping for a spyware/adware removal tool (maybe several), and clean your system.

Final thoughts

While pop-ups can be a pain, they sometimes indicate a more serious problem. Don’t ignore all pop-ups — investigate the problem and make your system safer.

The Complexity Complex

When you’re designing or writing software, one issue that can often be glossed over is the matter of efficiency. It’s so easy at the beginning of a project to just concentrate on getting something working, so you can demonstrate progress, and then worry about making it fast later on. The unfortunate fact is that optimisation can only take you so far; the true efficiency issues are going to lie in your algorithm design. Most IT professionals have learned the basics at some point in their career, but in case you’re a little rusty, read on and we’ll refresh your memory.

The first thing to consider is what kind of complexity you’re looking to reduce. The two major complexity areas are time — that is, how long an operation will take to complete — and space, or how much memory is needed. When talking complexity, we tend to rate speed in terms of how many steps (or blocks of memory for space complexity) are taken per input variable, rather than in absolutes, since they are so dependent on the specifics of the hardware. Likewise, the length of time an individual step will take is largely disregarded since for large inputs this time will be dominated by the complexity class.

To make comparing two algorithms easier we group them into classes by using a special kind of notation. There are a number of different ways to do this, based upon the best case, average case and worst case input scenario. I like to use the worst case most of the time, since that’s the time it’s going to make the most difference to how you perceive performance. To express this we use what’s called big O notation, which expresses the number of steps an algorithm will take for an input of size “n” in the worst case. So, take the following example, which simply sums the numbers in a list.

Treating each line as a single step, we can see that calling sum on a list of size n will take n+4 steps to complete, two for the initialisation of final_sum and n, one to set up the for loop, one for the return statement and then n times one for the loop body.
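A function matching that description might look like the sketch below. The names sum, final_sum, and n come from the text; the rest is an assumption, with the step counts from the paragraph above noted as comments.

```php
<?php
// Sketch only: reconstructed from the step count described in the text.
function sum(array $a)
{
    $final_sum = 0;                   // 1 step
    $n = count($a);                   // 1 step
    for ($i = 0; $i < $n; $i++) {     // 1 step to set up the loop
        $final_sum += $a[$i];         // 1 step, executed n times
    }
    return $final_sum;                // 1 step
}

echo sum([1, 2, 3, 4]) . "\n"; // prints 10
```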

Now suppose the problem changes, and you need to multiply each number by how many times it occurs in the list before adding it to the running total. Take the following implementation:

This works similarly to the last function, with the exception that before adding the current value to the running total, it goes through the list and counts the number of occurrences of that value. Calling this function on a list of size n means that 4 + n * (1 + n * 2) steps are carried out, since the outer loop now contains 2n + 1 steps. In total this means that calling this function “costs” 2n² + n + 4 steps. For a list of 10 numbers it takes 214 steps, but for a list of 100 numbers it will need more than 20,000 steps to complete. That’s quite an increase. When we rewrite it in another way, however, this changes:
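The quadratic version can be sketched as follows. The name sum_multiples comes from the text; the body is an assumption, so its line-by-line step count only approximates the 2n² + n + 4 figure quoted above. What matters is that the inner loop runs n times for each of the n outer iterations.

```php
<?php
// Sketch only: a nested-loop version reconstructed from the description.
function sum_multiples(array $a)
{
    $final_sum = 0;
    $n = count($a);
    for ($i = 0; $i < $n; $i++) {
        // Count how often the current value occurs by scanning the whole list.
        $occurrences = 0;
        for ($j = 0; $j < $n; $j++) {
            if ($a[$j] === $a[$i]) {
                $occurrences++;
            }
        }
        $final_sum += $a[$i] * $occurrences;
    }
    return $final_sum;
}

// 1*1 + 2*2 + 2*2 + 3*1 = 12
echo sum_multiples([1, 2, 2, 3]) . "\n";
```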

In this example we precompute the number of times each value occurs in the list. To do this we use a new data type which can store these values. It’s not particularly important how this is implemented so long as we can be sure that we can insert and retrieve values in constant time. In languages that support them as standard this could be a hash or a dictionary, or if you’re not that lucky (say you’re using C) then you can think of it as an integer array of size max(a). The method simply returns true if this type contains the given value.

Anyhow, you can see how, rather than work out how many times each number occurs as we reach it, we can do it all at the beginning and store it. Let’s look at how this helps. The sum_multiples2 function takes 3n + 6 steps: the usual initialisation steps, plus two for each input to build the dictionary of number occurrences, and then one for each input to sum them. For 10 inputs this will take 36 steps; for one hundred, 306. That’s more than 65 times faster for the second version when dealing with 100 inputs. If, say, we had one million inputs, it becomes two trillion versus three million, and the second version is more than 650,000 times faster.
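A linear version along those lines is sketched below. The names sum_multiples2 and numbers come from the text; the body is an assumption. A PHP array serves as the dictionary, and the inputs are assumed to be integers so they can be used as array keys.

```php
<?php
// Sketch only: assumes integer inputs, which become keys of the $numbers array.
function sum_multiples2(array $a)
{
    $final_sum = 0;
    $n = count($a);
    $numbers = []; // value => number of occurrences

    // First pass: count each value once, about two steps per input.
    for ($i = 0; $i < $n; $i++) {
        if (!isset($numbers[$a[$i]])) {
            $numbers[$a[$i]] = 0;
        }
        $numbers[$a[$i]]++;
    }

    // Second pass: one lookup and one addition per input.
    for ($j = 0; $j < $n; $j++) {
        $final_sum += $a[$j] * $numbers[$a[$j]];
    }
    return $final_sum;
}

// 1*1 + 2*2 + 2*2 + 3*1 = 12
echo sum_multiples2([1, 2, 2, 3]) . "\n";
```

The two loops run one after the other rather than nested, which is why the step count grows with n instead of n².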

Now, we’ve been taking a fairly casual view of the number of steps in each algorithm, treating each line as one step, when a statement like “sum += a[j] * numbers[a[j]]” contains multiple lookups and could be compiled into as many as 10 individual instructions at the hardware level. This is not really that important, though. Even if we assume that every step we’ve counted in the second example really takes 10, and the first program is unchanged, the second version is still more than 60,000 times faster for one million inputs.

Really, what we’re interested in is the order of the algorithm: for convenience, we reduce the step count to its fastest-growing term and drop constant factors. For example, we say sum_multiples is O(n²) whereas sum_multiples2 is O(n). This is often all you really need to know: for large enough values of n, O(n) algorithms will always beat O(n²) algorithms, regardless of the details.
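To see what “large enough” means in practice, the following Python sketch finds the smallest input size at which the quadratic step count overtakes the linear one, even when every linear step is penalised by a constant factor. The formulas are the ones derived above; the function names and the factor parameter are illustrative assumptions.

```python
def steps_quadratic(n):
    """2n^2 + n + 4, the step count for the nested-loop version."""
    return 2 * n * n + n + 4


def steps_linear(n, factor=1):
    """3n + 6 steps, each assumed to cost `factor` units."""
    return factor * (3 * n + 6)


def crossover(factor):
    """Smallest n >= 1 at which the quadratic version costs more."""
    n = 1
    while steps_quadratic(n) <= steps_linear(n, factor):
        n += 1
    return n


for factor in (1, 10, 100):
    print(f"factor {factor}: linear version wins from n = {crossover(factor)}")
```

A larger constant factor only moves the break-even point further out; past it, the linear version stays ahead because the n² term keeps growing faster.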

Understanding the pros and cons of the Waterfall Model of software development

Waterfall development is a software development model involving a phased progression of activities, marked by feedback loops, leading to the release of a software product. This article provides a quick and dirty introduction to the model, explaining what it is, how it’s supposed to work, what its six phases are, and why it can fail.

Say the words “waterfall development” to most people and chances are they’re going to be thinking of a bunch of condos under Niagara Falls. Imagine their surprise, then, when you tell them that waterfall development is actually a software development model which involves a phased progression of activities leading to the release of a software product. This article provides a quick and dirty introduction to the model, explaining what it is, how it’s supposed to work, and why it can fail.

Overview

Waterfall development isn’t new; it’s been around since 1970, yet most developers still have only a vague idea of what it means. Essentially, it’s a framework for software development in which work proceeds sequentially through a series of phases, starting with system requirements analysis and leading up to product release and maintenance. Feedback loops exist between adjacent phases, so that as new information is uncovered or problems are discovered, it is possible to “go back” a phase and make appropriate modifications. Progress “flows” from one stage to the next, much like the waterfall that gives the model its name.

A number of variants of this model exist, with each one quoting slightly different labels for the various stages. In general, however, the model may be considered as having six distinct phases, described below:

1. Requirements analysis: This first step is also the most important, because it involves gathering information about what the customer needs and defining, in the clearest possible terms, the problem that the product is expected to solve. Analysis includes understanding the customer’s business context and constraints, the functions the product must perform, the performance levels it must adhere to, and the external systems it must be compatible with. Techniques used to obtain this understanding include customer interviews, use cases, and “shopping lists” of software features. The results of the analysis are typically captured in a formal requirements specification, which serves as input to the next step.
2. Design: This step consists of “defining the hardware and software architecture, components, modules, interfaces, and data…to satisfy specified requirements” (Wikipedia). In practice, this means specifying performance and security parameters, designing data storage containers and constraints, choosing the IDE and programming language, and indicating strategies to deal with issues such as exception handling, resource management and interface connectivity. This is also the stage at which user interface design is addressed, including issues relating to navigation and accessibility. The output of this stage is one or more design specifications, which feed into the next stage, implementation.
3. Implementation: This step consists of actually constructing the product as per the design specification(s) developed in the previous step. Typically, this step is performed by a development team consisting of programmers, interface designers and other specialists, using tools such as compilers, debuggers, interpreters and media editors. The output of this step is one or more product components, built according to a pre-defined coding standard and debugged, tested and integrated to satisfy the system architecture requirements. For projects involving a large team, version control is recommended to track changes to the code tree and revert to previous snapshots in case of problems.
4. Testing: In this stage, both individual components and the integrated whole are methodically verified to ensure that they are error-free and fully meet the requirements outlined in the first step. An independent quality assurance team defines “test cases” to evaluate how far the product satisfies those requirements. Three types of testing typically take place: unit testing of individual code modules; system testing of the integrated product; and acceptance testing, formally conducted by or on behalf of the customer. Defects, if found, are logged, and feedback is provided to the implementation team to enable correction. This is also the stage at which product documentation, such as a user manual, is prepared, reviewed and published.
5. Installation: This step occurs once the product has been tested and certified as fit for use, and involves preparing the system or product for installation and use at the customer site. Delivery may take place via the Internet or physical media, and the deliverable is typically tagged with a formal revision number to facilitate updates at a later date.
6. Maintenance: This step occurs after installation, and involves making modifications to the system or an individual component to alter attributes or improve performance. These modifications arise either from change requests initiated by the customer or from defects uncovered during live use of the system. Typically, every change made to the product during the maintenance cycle is recorded, and a new product release (called a “maintenance release” and carrying an updated revision number) is issued so that the customer gains the benefit of the update.

Advantages

The waterfall model, as described above, offers numerous advantages for software developers. First, the staged development cycle enforces discipline: every phase has a defined start and end point, and progress can be conclusively identified (through the use of milestones) by both vendor and client. The emphasis on requirements and design before writing a single line of code ensures minimal wastage of time and effort and reduces the risk of schedule slippage, or of customer expectations not being met.

Getting the requirements and design out of the way first also improves quality; it’s much easier to catch and correct possible flaws at the design stage than at the testing stage, after all the components have been integrated and tracking down specific errors is more complex. Finally, because the first two phases end in the production of a formal specification, the waterfall model can aid efficient knowledge transfer when team members are dispersed in different locations.

Criticisms

Despite the seemingly obvious advantages, the waterfall model has come in for a fair share of criticism in recent times. The most prominent criticism revolves around the fact that very often, customers don’t really know what they want up-front; rather, what they want emerges out of repeated two-way interactions over the course of the project. In this situation, the waterfall model, with its emphasis on up-front requirements capture and design, is seen as somewhat unrealistic and unsuitable for the vagaries of the real world. Further, given the uncertain nature of customer needs, estimating time and costs with any degree of accuracy (as the model suggests) is often extremely difficult. In general, therefore, the model is recommended for use only in projects which are relatively stable and where customer needs can be clearly identified at an early stage.

Another criticism revolves around the model’s implicit assumption that designs can be feasibly translated into real products; this sometimes runs into roadblocks when developers actually begin implementation. Often, designs that look feasible on paper turn out to be expensive or difficult in practice, requiring a re-design and hence destroying the clear distinctions between phases of the traditional waterfall model. Some criticisms also center on the fact that the waterfall model implies a clear division of labor between, say, “designers”, “programmers” and “testers”; in reality, such a division of labor in most software firms is neither realistic nor efficient.

Customer needs

While the model does have critics, it remains useful for certain types of projects and can, when properly implemented, produce significant cost and time savings. Whether you should use it or not depends largely on how well you believe you understand your customer’s needs, and how much volatility you expect in those needs as the project progresses. It’s worth noting that for more volatile projects, other frameworks exist for thinking about project management, notably the so-called spiral model…but that’s a story for another day!