Yahoo! doesn't! know! what! tags! are!
Published 8 May 2007
Hi! You've stumbled upon a blog post by a guy named Ryan. I'm not that guy anymore, but I've left his posts around because cool URIs don't change and to remind me how much I've learned and grown over time.
Ryan was a well-meaning but naïve and privileged person. His views don't necessarily represent the views of anyone.
I work with people learning HTML and other Web languages a lot, and one of my biggest frustrations is when the teachers confuse their students with sloppy language. Yahoo! today unveiled a new scheme to exclude parts of pages from search engines, and while I have a lot of respect for the work Yahoo! is doing with their developer network, the poorly written feature description is amateurish and confusing. If Yahoo! wants to remain a technology leader, they must take more pride in their work.
A little history
Tags are building blocks of SGML, the meta-language on which HTML was based. Tagging lets authors state what certain parts of a document are (“this is a link”) and attach extra information to the tagged text (the page to which the link points), which may then change presentation or behavior (e.g. making the text blue and underlined, or loading a new page when clicked). This extra information is stored in attributes, each of which has a name and a value. Some attributes can contain a list of values, or tokens, separated by spaces. Together the text, tags, attributes and values form elements.
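To make those terms concrete, here is a small sketch of a single element. The URL, the class names and the link text are placeholders invented for illustration, not anything from a real page.

```html
<!-- One element: start tag + text + end tag -->
<!-- "a" is the element name; href and class are attributes -->
<!-- class holds a space-separated list of two tokens -->
<a href="http://example.com/page.html" class="external reference">link text</a>
```

The tags are only the `<a ...>` and `</a>` delimiters; everything set inside the start tag is an attribute with a value.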
This was all fine, until accessibility became a hot topic in Web design. Suddenly people who didn't know very much about writing HTML, but knew quite a bit about how the visually impaired used computers, were handing out advice. Their suggestion: "Always add ALT tags to your images." Many people still say this. Clearly it's a good practice, but how?
<alt><img src="image.jpg"> Text</alt>
<img src="image.jpg"><alt>Text</alt>
<img src="image.jpg"><alt>Text</alt></img>
The problem is that these people meant setting a value for the alt attribute, as in <img src="image.jpg" alt="Text">, but instead they confused a lot of aspiring Web developers. But then it got worse.
Sites like Delicious and Flickr came along and decided that "tag" was an exceptionally sexy way to say "label." Technorati employees (and some friends) decided that a link could be a label and that the last part of some URLs could be used for categories. They called that a tag too, but only if the link had a value of "tag" for its rel attribute.
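As a rough sketch, a link marked up under that convention looks something like this. The URL and link text are placeholders; the point is that the label is read from the last path segment of the href, not from the visible text.

```html
<!-- rel="tag" marks the linked page as a label ("tag") for this page -->
<!-- the label here would be "html", taken from the end of the URL -->
<a href="http://example.com/tag/html" rel="tag">Web markup</a>
```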
Confused? Me too. Today we have several working definitions of "tag":
- The beginning and ending delimiters for HTML elements
- The value of an attribute within an HTML element
- A label or category for a link, photo, blog post or person in an HTML document
- The last part of a URL
- A value for the rel attribute of a link that labels something else
When teaching HTML, the biggest challenge becomes herding students away from all of the “bad knowledge”: out-of-date practices, sloppy and inaccurate language, and confusing terms coming from people who should know better.
Yahoo!: tagging tags with tags
The Yahoo! search blog (linked above) and the robots-nocontent explanation both describe the scheme as a "robots-nocontent" tag. So I mark my pages like this?
<robots-nocontent>Text not to index</robots-nocontent>
Of course not. That would violate the HTML spec. Oh, so it's an attribute with no value? Perhaps, in the XHTML world:
<div robots-nocontent="robots-nocontent">Text not to index</div>
That's just as bad! But Yahoo! later explains that it's a class=robots-nocontent attribute. Ohhhhhhhh. That means it should be:
<div class=robots-nocontent="What?">Text not to index</div>
Now that's just plain ridiculous.
What they're really saying
Yahoo! defined a value for the class attribute. This is, semantically, a Good Thing™: class exists to add information about what a block of text is. Technically, however, class contains a space-separated list of values. Luckily, they're not saying if their engine supports that.
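Put plainly, the markup Yahoo! is presumably asking for is a class value, which under the HTML rules can sit alongside other values. This is only a sketch: the sidebar class is a made-up placeholder, and whether Yahoo!'s crawler honors the value when other classes are present is not something their description settles.

```html
<!-- the class attribute carrying the single value Yahoo! defined -->
<div class="robots-nocontent">Text not to index</div>

<!-- the same value as one token in a space-separated list -->
<div class="sidebar robots-nocontent">Text not to index</div>
```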
Seriously, Yahoo! You should know better. Your language is sloppy. It's confusing. Use your words, Yahoo! The Internet will be better for it.