Thursday, May 29, 2008, 2:41 AM Website Design by John
One of the lowlights of using a dynamically generated URL system on your website with .htaccess is that your site can produce URLs that look OK but yield pages with little or no content. These sorta-orphaned URLs also dilute some of your search engine mojo by causing funny little extra pages to show up in search.
In order to play nice with the search engine bots, it is wise to scan the output of your URL and determine whether it is delivering real content or just answering one more query with nothing behind it.
In my system, for example, most of the pages I want to index well in search have URLs that combine the ID number of the database entry for that page, along with a constructed, scrubbed text version of the title.
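A minimal sketch of that kind of URL scheme might look like the following. The function names (scrub_title, build_url, extract_id) and the /ID-title.html layout are assumptions for illustration, not the code my system actually runs, and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: turn a title into a URL-safe, scrubbed slug.
function scrub_title(string $title): string
{
    $slug = strtolower($title);
    // Collapse every run of characters other than a-z and 0-9 into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);
    return trim($slug, '-');
}

// Hypothetical URL layout: /<database id>-<scrubbed title>.html
function build_url(int $id, string $title): string
{
    return '/' . $id . '-' . scrub_title($title) . '.html';
}

// Pull the database ID back out of a requested path.
// Returns null when the path does not follow the layout above.
function extract_id(string $path): ?int
{
    if (preg_match('#^/(\d+)-[a-z0-9-]*\.html$#', $path, $m)) {
        return (int) $m[1];
    }
    return null;
}
```

The ID does the real lookup work; the scrubbed title is there for readers and search engines. That also means a URL can parse cleanly and still point at an ID that has no row in the database, which is exactly the empty-page case to watch for.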
Now, here's why this matters: the search bots, especially GoogleBot, pick and stick. They pick up on the pattern in your URLs and stick in a request for the next one in the sequence, whether or not you ever linked to it.
For example, in the classifieds section of one of my websites, there are currently four pages of classified ads. But, in the server logs, sure enough, Google picks and sticks a 5.html request in after making a request for 4.html.
The thing is, this works against you if 5.html returns anything that looks like a normal page, because the bot has no way to know it is empty.
So, if we detect that 5.html doesn't have any real content, it is wise to tell the bots as much.
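For a paginated section like the classifieds, the detection can be as simple as comparing the requested page number against the number of pages the listing actually needs. This is a rough sketch; total_pages, page_has_content, and the ten-ads-per-page figure are made up for the example, and it assumes the per-page count is greater than zero.

```php
<?php
// Hypothetical helper: how many pages a listing needs for a given item count.
// Assumes $perPage is greater than zero.
function total_pages(int $itemCount, int $perPage): int
{
    return (int) ceil($itemCount / $perPage);
}

// True only when the requested page number falls inside the real page range.
function page_has_content(int $page, int $itemCount, int $perPage): bool
{
    return $page >= 1 && $page <= total_pages($itemCount, $perPage);
}

// Example: 40 ads at 10 per page makes four real pages.
var_dump(page_has_content(4, 40, 10)); // bool(true)
var_dump(page_has_content(5, 40, 10)); // bool(false)
```

With a check like this, the bot's speculative 5.html request gets caught before any page template runs.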
We do so in PHP like so:
header("HTTP/1.0 404 Not Found");
Make sure you send the header before echoing or printing any output. Now, Google notes that there isn't a page there and should eventually stop requesting it. More importantly, it will not index whatever output your CMS was belching out.
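If your page-building code echoes as it goes, one way to keep the header first is to buffer the output, decide on the status, and only then let anything leave the server. The sketch below assumes the page content arrives as a plain array of strings; status_line_for and render_page are hypothetical names, not part of any CMS.

```php
<?php
// Hypothetical helper: choose the status line based on whether any rows came back.
function status_line_for(array $rows): string
{
    return count($rows) > 0 ? 'HTTP/1.0 200 OK' : 'HTTP/1.0 404 Not Found';
}

// Hypothetical wrapper: build the body in a buffer so the status header
// can still be sent after we know whether there is real content.
function render_page(array $rows): string
{
    ob_start(); // hold everything echoed below instead of sending it
    foreach ($rows as $row) {
        echo '<p>' . htmlspecialchars($row) . "</p>\n";
    }
    $body = ob_get_clean(); // grab the buffered markup and stop buffering

    // header() only works while nothing has been sent to the client yet.
    if (!headers_sent()) {
        header(status_line_for($rows));
    }
    return $body;
}
```

In the calling script you would then echo the returned body, or a short "nothing here" message when the status is 404.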
Handy, huh? Especially since a good website should return the best content possible to the search engine bots, right?
© 2008 Pro Content and Design. All rights reserved.