Okay, that headline was just to get your attention. I'm sure this is old news
to many, but I recently decided to learn a little about XSLT. The XSLT language is a
standard for converting XML (a standard format for storing data) into pretty much any
other format.
An XML file is a file meant for a computer program to read and extract data from. XML files
are semi human-readable. Here's some sample XML from my website's RSS feed:
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:admin="http://webns.net/mvcb/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<channel>
<title>Greggman.com</title>
<link>http://greggman.com/</link>
<description>Games, Gadgets, Gregg and stuff about Japan</description>
<dc:language>en-us</dc:language>
<dc:date>2005-00-15T17:00:03+09:00</dc:date>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
<item>
<title>Indie Games have Arrived</title>
<link>http://greggman.com/edit/editheadlines/2005-01-04.htm</link>
<description>Game Tunnel picked their indie games of the year and
I gotta say, I'm pretty impressed.</description>
<guid isPermaLink="false">
http://greggman.com/edit/editheadlines/2005-01-04.htm
</guid>
<dc:subject>games</dc:subject>
<dc:date>2005-01-04T19:00:00+09:00</dc:date>
</item>
</channel>
</rss>
If you try to view that in your browser
you'll see pretty much the same thing.
You can see things like title, date, subject, etc. Because those
tags separate out the data, other programs, like RSS readers, can parse the file and
extract that data. Without specific tags like those, it would all be gibberish
as far as another program is concerned.
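To make "extracting the data" concrete, here is a minimal sketch of an XSLT 1.0 stylesheet that pulls the date, title and link out of each item in the sample feed above and writes them as plain text. The file name titles.xsl is made up for illustration, and the sketch assumes an RSS 2.0 feed that uses dc:date the way the sample does.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- titles.xsl: hypothetical file name, not one of the files from this post -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dc="http://purl.org/dc/elements/1.1/">

  <!-- Emit plain text rather than XML or HTML -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Start at the document root and visit every item in the channel -->
  <xsl:template match="/">
    <xsl:for-each select="rss/channel/item">
      <!-- normalize-space() trims and collapses whitespace in each value -->
      <xsl:value-of select="normalize-space(dc:date)"/>
      <xsl:text> | </xsl:text>
      <xsl:value-of select="normalize-space(title)"/>
      <xsl:text> | </xsl:text>
      <xsl:value-of select="normalize-space(link)"/>
      <!-- One item per output line -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Any XSLT 1.0 processor should be able to run it. With xsltproc, for example, the command would be along the lines of xsltproc titles.xsl feed.xml, where feed.xml is wherever you saved the sample. For the sample above that should give one line containing the item's date, title and link separated by " | ".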
Unfortunately, to a human the XML practically is gibberish. But add just one line, like this:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="example.xsl"?>
<rss version="2.0"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:admin="http://webns.net/mvcb/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<channel>
<title>Greggman.com</title>
<link>http://greggman.com/</link>
<description>Games, Gadgets, Gregg and stuff about Japan</description>
<dc:language>en-us</dc:language>
<dc:date>2005-00-15T17:00:03+09:00</dc:date>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>
<sy:updateBase>2000-01-01T12:00+00:00</sy:updateBase>
<item>
<title>Indie Games have Arrived</title>
<link>http://greggman.com/edit/editheadlines/2005-01-04.htm</link>
<description>Game Tunnel picked their indie games of the year and
I gotta say, I'm pretty impressed.</description>
<guid isPermaLink="false">
http://greggman.com/edit/editheadlines/2005-01-04.htm
</guid>
<dc:subject>games</dc:subject>
<dc:date>2005-01-04T19:00:00+09:00</dc:date>
</item>
</channel>
</rss>
And now look at it in a browser. Instead of the raw XML,
you should see a normally formatted page.
You can try it with your own RSS feed. Copy these two files, example.xsl
and example.css, to your computer (right
click and pick "save as"). Copy any RSS 2.0 feed to the same
folder, for example one of these feeds: arstechnica,
joelonsoftware, wired, cnn.
Edit the feed and add this line just below the first <?xml> line:
<?xml-stylesheet type="text/xsl" href="example.xsl"?>
If there is no <?xml> line at the top then put that line first. Now open the
feed file in your browser. You can probably just double click it.
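The contents of example.xsl are not reproduced in this post. For a rough idea of what a stylesheet for this job can look like, here is a minimal sketch for an RSS 2.0 feed. It is an illustration under assumptions, not the actual example.xsl: the HTML layout and the class names "item" and "date" are invented, and the real example.css may expect different markup.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only; NOT the example.xsl linked from this post -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="dc">

  <xsl:output method="html" encoding="UTF-8" indent="yes"/>

  <!-- Build the page skeleton from the channel's own title, link and description -->
  <xsl:template match="/">
    <html>
      <head>
        <title><xsl:value-of select="rss/channel/title"/></title>
        <!-- Styling comes from a CSS file in the same folder -->
        <link rel="stylesheet" type="text/css" href="example.css"/>
      </head>
      <body>
        <h1>
          <a href="{rss/channel/link}">
            <xsl:value-of select="rss/channel/title"/>
          </a>
        </h1>
        <p><xsl:value-of select="rss/channel/description"/></p>
        <!-- Hand each item to the template below -->
        <xsl:apply-templates select="rss/channel/item"/>
      </body>
    </html>
  </xsl:template>

  <!-- One block per item; class names are hypothetical hooks for the CSS -->
  <xsl:template match="item">
    <div class="item">
      <h2><a href="{link}"><xsl:value-of select="title"/></a></h2>
      <p class="date"><xsl:value-of select="dc:date"/></p>
      <p><xsl:value-of select="description"/></p>
    </div>
  </xsl:template>
</xsl:stylesheet>
```

One thing to keep in mind with a sketch like this: xsl:value-of writes the description out as text, so a feed that puts escaped HTML inside its description elements will show the tags literally rather than rendering them.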
The upshot is that you don't need a separate RSS feed. Your
front page would BE your RSS feed.
Note that the XSLT has to be written for a specific version of RSS so you'd
need a different XSLT file for RSS 0.91 or RSS 1.0 (RDF) or Atom etc.
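To see why each format needs its own stylesheet, compare what the same job might look like for Atom. This is a sketch that assumes the Atom 1.0 format (namespace http://www.w3.org/2005/Atom). The element names and paths are different from RSS 2.0, and every element lives in a namespace, so none of the RSS 2.0 match patterns would apply.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch for Atom 1.0 feeds; an assumption, not a file from this post -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:atom="http://www.w3.org/2005/Atom"
    exclude-result-prefixes="atom">

  <xsl:output method="html" encoding="UTF-8"/>

  <xsl:template match="/">
    <html>
      <head>
        <!-- Atom's root element is feed, not rss/channel -->
        <title><xsl:value-of select="atom:feed/atom:title"/></title>
      </head>
      <body>
        <!-- Entries are called entry, not item -->
        <xsl:for-each select="atom:feed/atom:entry">
          <h2>
            <!-- The URL is in the href attribute of link, not in its text -->
            <a href="{atom:link[@rel='alternate' or not(@rel)]/@href}">
              <xsl:value-of select="atom:title"/>
            </a>
          </h2>
          <!-- Atom carries its own date element instead of dc:date -->
          <p><xsl:value-of select="atom:updated"/></p>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```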
There's all this talk of the
semantic web coming sometime in the future, but basically we can do it today!
For anybody running a standard blog, it would take very little work to change your
pages to spit out XML (for example RDF) and add the line at the top that makes
your page look exactly as it does now. The advantage is
that other programs could now read your page, since the page would actually be XML
with everything separated out, telling those programs which part is the content, what
the subject and title are, who wrote it, etc.
The only legit problem is that some older browsers don't handle this but I
think for most blogs that's not an issue. Most people are running browsers
that handle this.
There are a couple more issues that I'm sure are just a lack of knowledge on
my part. One is that if you look at that page in Firefox it won't have a green
border. As far as I can tell that's a bug in Firefox. The other is
that I couldn't get XSLT to work with my
RSS 1.0 feed. I know that's
possible; I just need to dig a little harder, and I got tired of trying to figure it
out. Maybe tomorrow.
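One common stumbling block with RSS 1.0 is namespaces, though whether that was the actual problem here is only a guess. In RSS 1.0 the root element is rdf:RDF, the channel, item, title and link elements all belong to the namespace http://purl.org/rss/1.0/, and the items sit next to the channel rather than inside it. An XPath like rss/channel/item therefore matches nothing. A minimal sketch of a stylesheet that accounts for this follows; the prefix "rss" and the class names are arbitrary choices for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch for RSS 1.0 (RDF) feeds; an assumption, not a file from this post -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rss="http://purl.org/rss/1.0/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    exclude-result-prefixes="rdf rss dc">

  <xsl:output method="html" encoding="UTF-8"/>

  <xsl:template match="/">
    <html>
      <head>
        <!-- Every RSS 1.0 element needs the prefix bound to its namespace -->
        <title><xsl:value-of select="rdf:RDF/rss:channel/rss:title"/></title>
      </head>
      <body>
        <!-- Items are children of rdf:RDF, siblings of the channel -->
        <xsl:for-each select="rdf:RDF/rss:item">
          <div class="item">
            <h2><a href="{rss:link}"><xsl:value-of select="rss:title"/></a></h2>
            <p class="date"><xsl:value-of select="dc:date"/></p>
            <p><xsl:value-of select="rss:description"/></p>
          </div>
        </xsl:for-each>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

The prefix used in the stylesheet does not have to match anything in the feed. Only the namespace URI it is bound to matters, which is why a feed that declares the RSS 1.0 namespace as its default (with no prefix at all) still needs prefixed names in the XPath expressions.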
Another problem that may arise from this, though, is that search engines aren't yet ready to handle it, so you may get fewer hits from search engines. A good way to counter this (and the browser limitations) may be to do all the XSLT processing server-side and output the resulting XHTML. This also helps counter another current XSLT limitation: having more than one data source.
I'm sure you can figure out RSS 1.0 + XSLT easily. I use XSLT on RSS 1.0, 2.0, Atom and other formats on my "portal". As long as the source is XML, it should work. : )