How Your HTML Becomes Garbage HTML


Garbage HTML: Impact and Causes

Developers around the world write HTML pages and try hard to handle cross-browser issues, but when it comes to standards they usually ignore them. We can call the resulting pages wasteful, or garbage, HTML. These pages cost you bandwidth, load time, and sometimes rendering time. If you look at the source code, you will often find whitespace amounting to almost three times the size of the actual HTML: useless bytes, measured in kilobytes. You might assume the whitespace is there for indentation, but that is often not the case; it is left there out of ignorance. Removing the extra whitespace can sometimes reduce page weight by up to 50%.
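To get a feel for how much of a page is whitespace, a rough measurement can be sketched in Python. The function names and the sample markup here are made up for illustration, and the approach is deliberately naive: it assumes the page contains no pre, textarea, or inline script content where whitespace matters, and that no layout depends on spaces between inline tags.

```python
import re

def collapse_whitespace(html: str) -> str:
    """Naively strip indentation and blank lines from HTML source.

    Assumption: no <pre>, <textarea>, or inline <script> content is
    present; a real minifier has to protect those regions.
    """
    # Drop whitespace that sits only between a closing '>' and the next '<'.
    html = re.sub(r">\s+<", "><", html)
    # Squeeze every remaining run of whitespace down to a single space.
    return re.sub(r"\s+", " ", html).strip()

def savings_percent(html: str) -> float:
    """Percentage of bytes removed by collapse_whitespace."""
    before = len(html.encode("utf-8"))
    after = len(collapse_whitespace(html).encode("utf-8"))
    return 0.0 if before == 0 else 100.0 * (before - after) / before

# Hypothetical sample: heavily indented markup with stray blank lines.
sample = """
<table>
        <tr>
                <td>  One  </td>


                <td>  Two  </td>
        </tr>
</table>
"""

if __name__ == "__main__":
    print(collapse_whitespace(sample))
    print(f"{savings_percent(sample):.0f}% smaller")
```

Running something like this against the saved source of a real page gives a quick estimate of whether whitespace is worth worrying about before reaching for a proper minifier.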

Moreover, instead of defining two cell classes in an external style sheet, which is downloaded once and then cached, and using the right class on each tag, HTML writers sometimes put an inline style attribute on every row tag. The same goes for JavaScript, which could and should be stored in an external file.
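The cost of repeating an inline style on every row can be estimated with a small sketch. The style string, the class names "odd" and "even", and both function names are hypothetical; the only point is to compare the size of the generated markup.

```python
# Hypothetical style that a page might repeat on every single row.
INLINE_STYLE = "background-color: #eeeeee; font-family: Arial; padding: 4px;"

def rows_with_inline_styles(values):
    """Repeat the full style string on every row tag."""
    return "".join(
        f'<tr style="{INLINE_STYLE}"><td>{value}</td></tr>' for value in values
    )

def rows_with_classes(values):
    """Use two short class names; the rules live in an external style sheet."""
    parts = []
    for index, value in enumerate(values):
        css_class = "odd" if index % 2 else "even"
        parts.append(f'<tr class="{css_class}"><td>{value}</td></tr>')
    return "".join(parts)

if __name__ == "__main__":
    values = list(range(100))
    print(len(rows_with_inline_styles(values)), "bytes with inline styles")
    print(len(rows_with_classes(values)), "bytes with classes")
```

The class-based version also pays for the style sheet itself, but only once per visitor, since the browser can cache it across pages.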

Whether you call this ignorance or plain mistakes, the result is the same: all of the inline JavaScript and CSS makes your HTML and templates significantly harder to maintain, and it discourages reuse, standardization, and other good habits.

Characteristics of Garbage HTML

Garbage HTML has a special something to it: it is not just invalid, but disgustingly so, going beyond minor misunderstandings or typos and far into the realm of negligence. Think of improperly nested tags, tags that are never closed, incorrect attribute usage, and so on.
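Two of these symptoms, bad nesting and unclosed tags, are easy to detect mechanically. Below is a minimal sketch built on Python's standard html.parser module. The class and function names are invented for this example, and the check is stricter than the HTML specification: it assumes every non-void element needs an explicit closing tag, so valid markup that relies on optional end tags (p, li, td) would be flagged too. It is not a substitute for a real validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

class NestingChecker(HTMLParser):
    """Tracks open tags on a stack and records nesting problems."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        elif tag in self.open_tags:
            # The closing tag arrived while inner tags were still open.
            while self.open_tags[-1] != tag:
                inner = self.open_tags.pop()
                self.problems.append(f"<{inner}> not closed before </{tag}>")
            self.open_tags.pop()
        else:
            self.problems.append(f"</{tag}> has no matching opening tag")

def check_nesting(html):
    """Return a list of human-readable nesting problems (empty if none)."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    problems = list(checker.problems)
    problems.extend(f"<{tag}> is never closed" for tag in checker.open_tags)
    return problems

if __name__ == "__main__":
    for message in check_nesting("<div><b><i>text</b></i>"):
        print(message)
```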

Why Developers Can’t Figure Out Garbage HTML Themselves

  • Ignorance: A large number of developers don’t know HTML properly; sometimes they aren’t keen to learn it, and sometimes they were never taught well. Many developers start learning HTML by copying and pasting code from some other source, which may or may not be correct. Usually, if the page looks right, they don’t care what the underlying HTML code says.
  • Poor tools: The tools developers use to produce HTML, whether a WYSIWYG editor or a framework or library, often generate bad code. There are reasons why some of these tools can’t generate good code. For example, until recently, on-screen, web-based HTML editors had a much easier time emitting the font tag than a CSS class or even an inline style. The end result? HTML widgets that were still generating the font tag nearly 10 years after font had been deprecated. Another problem is that there is no 1:1 mapping between stylistic effects and the markup that produces them. Unless an editor is strictly context-driven, the tool has no good way to know that [Ctrl]I should produce the em tag rather than a CSS style that makes the text italic.
  • HTML is too semantic: HTML has been transitioning toward being as purely semantic as possible without completely breaking backwards compatibility. In general, this is a good thing. But for a developer who is trying to finish their work and go home, it is a nightmare. Which tag should the developer use? What if the developer wants the effect that a tag produces, without the meaning of the tag? It’s enough to drive anyone insane. So what happens? Developers stop caring about what the HTML “means” and do just enough to make the page render the way the client wants it to look. In the shuffle, the HTML ends up a mess.
  • Developers who never updated their HTML skill sets: HTML keeps evolving, but many developers haven’t updated their skills or learned recent tags, tricks, or syntax. Maybe they learned HTML in 1997, or the book they bought from the bargain bin in 2002 was written in 1997. Who knows? But their HTML is stuck in the HTML 3 years, and they haven’t updated it since.
  • Server-side technologies: Many developers build HTML by concatenating strings and printing them according to program logic. This is where the problem lies, as there is no way to visualize the HTML output beforehand. That is a sure way to get the code wrong.

  • Work & Deadline Pressure: The deadlines on HTML projects generally don’t leave enough time to validate the HTML properly. Clients tend to think that HTML projects should not take much time, and they don’t care about proper HTML. They are happy as long as they see their desired output. The situation gets worse when an expert friend points out HTML issues to them. They come back yelling, without considering that they never allowed enough time to validate the HTML properly.
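The server-side concatenation problem from the list above is easy to reproduce. The following is an illustrative sketch, not an example taken from any particular codebase; all function names are hypothetical. The first function shows the error-prone style, with its bugs left in on purpose. The second routes every element through one helper so that opening and closing tags cannot drift apart, and it escapes the text content.

```python
from html import escape

def render_list_concat(items):
    """Concatenation style: nothing stops you from forgetting a tag."""
    out = "<ul>"
    for item in items:
        out += "<li>" + item   # bug: no </li>, and item is not escaped
    return out                 # bug: the closing </ul> is missing

def element(tag, content):
    """Build one element; the open and close tags come from the same place."""
    return f"<{tag}>{content}</{tag}>"

def render_list(items):
    """Same list, built from the helper with escaped text content."""
    list_items = "".join(element("li", escape(item)) for item in items)
    return element("ul", list_items)

if __name__ == "__main__":
    data = ["a & b", "<c>"]
    print(render_list_concat(data))
    print(render_list(data))
```

A template engine does the same job more thoroughly, but even a helper this small removes the two most common mistakes: unclosed tags and unescaped text.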

Is There Any Solution to the Garbage HTML Problem?

I think some parts of this problem are not likely to be corrected. Most developers who care about doing things right are using tools that combine just enough visualization to make life a bit easier, without completely taking over the process. Some developers may be running their code through validators or keeping current on what is correct and what isn’t. But at the end of the day, there is no way to force a developer to start caring about these things unless you are their boss. After all, browsers will display even the worst HTML code, as long as it somewhat makes sense.

In my experience, only about 20% of developers really understand how to accomplish something as well as why they should do it that way and what the consequences are. That’s the same 20% or so who are probably writing decent HTML.

