I recently ran across an unbelievable example of this type of bloat while I was preparing for a presentation on SEO. Here’s a link to the page, a calendar of events for a local library. The file has been altered only to hide the owner of the page.
With the page open, go to View Source. The <body> tag doesn’t appear until line 974, and the main content doesn’t appear until line 1945 of a total of 3253 lines of code. The total page size is 208kb, of which only 20% is true content.
Search engines are designed to handle approximately 100kb per page; the remainder is lucky to be crawled and indexed.
If all unnecessary code were removed from this page, the resulting page would be only around 62kb, which would load much quicker for site visitors and be a breeze for the search engines to handle.
Examination of the code
In reviewing the code from the example page above, you’ll notice several elements of unnecessary code:
- Cascading Style Sheets: The styles which control the formatting on the page are embedded in the head section as well as in-line, as attributes of various HTML elements. Each time the page loads, these styles must be loaded into the web browser.
- Tables for layout: I understand that tables continue to make sense for some layout situations, and a calendar would appear to be a perfect fit. However, I’m calling out this practice for two reasons:
  - It is still overused. Each table element (<td>, etc.) carries with it many opportunities for bloat due to the related attributes.
  - In this example, the calendar contains nested tables to represent each event.
- Outdated Tags: Outdated, deprecated HTML tags, such as the infamous <font> tag, provide a plethora of attribute opportunities.
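To make the three problem areas concrete, here is a minimal sketch of what a single calendar cell with this kind of bloat might look like. It is a hypothetical illustration, not code from the example page: the event text and every attribute value are invented.

```html
<!-- Hypothetical markup: event text and attribute values are invented. -->
<table width="100%" border="0" cellpadding="0" cellspacing="0">
  <tr>
    <!-- Layout attributes plus an in-line style on a single cell -->
    <td width="14%" valign="top" bgcolor="#FFFFFF" style="border: 1px solid #999999;">
      <!-- A nested table wrapping one event -->
      <table width="100%" border="0" cellpadding="2" cellspacing="0">
        <tr>
          <td align="left" valign="top">
            <!-- Deprecated <font> tag carrying presentation attributes -->
            <font face="Arial, Helvetica, sans-serif" size="1" color="#333333">Story Time, 10:00 am</font>
          </td>
        </tr>
      </table>
    </td>
  </tr>
</table>
```

Only the four words of event text are content; everything else is presentation repeated for every event on the calendar.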
Now that we know where the trouble areas are, we can begin to fix the problem.
Cascading Style Sheets (CSS)
By leveraging an external CSS file, all styles for the site are placed in one file, and each page that refers to that file can use any style in it. The page refers to the file from its <head> section. The browser requests the CSS file from the server once; each subsequent page loads the CSS from the browser cache, decreasing the overall load time. An external file also makes site-wide changes to your look and feel easy and efficient. Lastly, it can greatly reduce the size of your pages.
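As a minimal sketch, a page could point to an external style sheet from its <head> section like this. The file name "styles.css" and the page title are placeholders chosen for illustration, not names from the example page.

```html
<head>
  <title>Calendar of Events</title>
  <!-- "styles.css" is a placeholder name for the site's external style sheet -->
  <link rel="stylesheet" type="text/css" href="/styles.css" />
</head>
```

Every page that carries the same link element shares the one file, so the browser can reuse its cached copy rather than reloading the styles with each page.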
Tables for layout
Tables have been used for a decade to lay out pages, giving designers a grid system on which to base their designs. However, with these tables have come thousands of bytes of code, usability issues, and trouble for search engine spiders.
You should use tables only when presenting tabular data. There are several ways to leverage XHTML tags and CSS to provide a similar grid-like structure, and these methods should be leveraged whenever possible. Doing so will, like the methods addressed above, reduce load time, improve usability, and make it easier for the search engines to gobble up your content.
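One of those ways, sketched here under assumptions, is a float-based layout: each region of the page is a <div>, and the external style sheet positions it. The ids, widths, and text below are invented for illustration, and the CSS rules are shown in a comment only to keep the sketch in one place; in practice they would live in the external file.

```html
<!--
  Assumed rules in the external style sheet (ids and widths are hypothetical):
    #content { float: left;  width: 70%; }
    #sidebar { float: right; width: 28%; }
    #footer  { clear: both; }
-->
<div id="content">
  <h1>Calendar of Events</h1>
  <p>Main content goes here.</p>
</div>
<div id="sidebar">
  <p>Secondary content goes here.</p>
</div>
<div id="footer">
  <p>Footer content goes here.</p>
</div>
```

The markup carries no width, alignment, or spacing attributes at all, which is where most of the savings over a layout table come from.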
Opinion: There may be some need to leverage tables for layout. As long as the “mother” table is limited to the most basic grid structure (example: 2-3 columns, 1 row), this is currently acceptable as a means to move towards a more Web Standards / semantic approach.
Outdated Tags
Gone are the days when the following was acceptable coding:
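A hypothetical illustration of this style of markup (the text and attribute values are invented, not taken from the example page):

```html
<!-- Presentation is repeated in attributes on every piece of text -->
<font face="Arial, Helvetica, sans-serif" size="5" color="#003366"><b>Upcoming Events</b></font><br />
<font face="Arial, Helvetica, sans-serif" size="2" color="#000000">The library is closed on Mondays.</font>
```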
Content should now be coded in the following manner:
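A hypothetical illustration of semantic markup (the heading and paragraph text are invented, not taken from the example page):

```html
<!-- Elements describe what the content is; no presentation attributes -->
<h1>Upcoming Events</h1>
<p>The library is closed on Mondays.</p>
```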
Where’s the formatting, you ask? Why, it’s in the external CSS file. Note the decrease in the code used to represent the same content. That is the magic of CSS, and that’s why it should be leveraged to provide the presentation of the content.
Placing your pages on a diet using these best practices can only help: it decreases load time (making site visitors happy), increases accessibility (making individuals with disabilities happy and keeping site owners away from possible future lawsuits), and increases the ratio of content to code (making search engine spiders happy and improving the relevancy of your pages).