Best-practice-semantically-correct-xhtml
From RDFaWiki
Generating Semantically Correct XHTML
It is important that the XHTML you author is not only syntactically correct, but semantically correct as well. Placing semantic data into your XHTML page that has nothing to do with your site, and is instead intended to boost its search engine rank or placement, is strongly frowned upon.
Less seriously, it is also possible to generate semantically incorrect XHTML+RDFa by mistake. For example, the XHTML vocabulary document defines a number of vocabulary terms that are easy to misuse accidentally.
Example
In the following example, the author intended to keep the copyright on their paper, but release an audio recording of the paper under a Creative Commons Attribution 3.0 license. However, they mistakenly use the following markup:
<body>
<p id="#mypaper">
An audio recording of this paper is released under a
<a rel="license" href="http://creativecommons.org/licenses/by/3.0/">
Creative Commons Attribution 3.0 License
</a>
</p>
</body>
The markup above is valid, but semantically it makes the wrong statement. Because the author used "id" instead of "about" to specify the subject, the subject of the license statement defaults to the page itself. RDFa would therefore generate a triple stating that the page, which contains the full text of the paper, is licensed under a CC Attribution 3.0 license, rather than the audio recording. While this is clearly a mistake, a search engine indexing Creative Commons works on the Internet would not be able to detect the semantic error and might archive the page as a CC Attribution 3.0 work. The ramifications are worse if the author intended to keep their copyright on the text, but release an audio recording of the work into the public domain.
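The difference between "id" and "about" can be illustrated with a short Python sketch. It is a deliberately simplified model of subject resolution and not a conforming RDFa processor: it only tracks "about", "rel" and "href", and it leaves the "license" term unexpanded. The page URL http://example.org/paper.html, the audio file name paper-audio.mp3, and the names LicenseTripleSketch and license_triples are all hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LicenseTripleSketch(HTMLParser):
    """Simplified subject tracking; NOT a conforming RDFa processor."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        # With no "about" anywhere, the subject is the page itself.
        self.subjects = [base_url]
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        subject = self.subjects[-1]
        if "about" in attrs:
            # "about" sets a new subject; "id" is never consulted.
            subject = urljoin(self.base_url, attrs["about"])
        self.subjects.append(subject)
        if "rel" in attrs and "href" in attrs:
            target = urljoin(self.base_url, attrs["href"])
            self.triples.append((subject, attrs["rel"], target))

    def handle_endtag(self, tag):
        # Leave the scope of the element that was just closed.
        if len(self.subjects) > 1:
            self.subjects.pop()


def license_triples(markup, base_url):
    """Return (subject, rel, object) tuples found in the markup."""
    parser = LicenseTripleSketch(base_url)
    parser.feed(markup)
    parser.close()
    return parser.triples


PAGE = "http://example.org/paper.html"  # hypothetical page URL
CC_BY = "http://creativecommons.org/licenses/by/3.0/"

# The markup from the example: "id" does not change the subject.
MISTAKEN = (
    '<body><p id="#mypaper">An audio recording of this paper is released '
    'under a <a rel="license" href="' + CC_BY + '">Creative Commons '
    "Attribution 3.0 License</a></p></body>"
)

# One possible correction: "about" names the audio recording
# (paper-audio.mp3 is a hypothetical file name).
CORRECTED = MISTAKEN.replace('id="#mypaper"', 'about="paper-audio.mp3"')

if __name__ == "__main__":
    print(license_triples(MISTAKEN, PAGE))
    print(license_triples(CORRECTED, PAGE))
```

Under these assumptions, the mistaken markup yields a license statement whose subject is the page, while the version using "about" yields one whose subject is the audio recording. A real RDFa processor applies many more rules, so the sketch should be read only as an illustration of why the choice of attribute matters.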
Reasoning
Computing Agents
As the example above shows, if you state the wrong thing semantically on your site, computing agents may do the wrong thing with your data. Your copyright may be violated inadvertently, or visitors who use web browsers that are aware of semantic data may have a very bad experience on your website.
Search Engines
In the early days of web search engines, you could tag your page with meta elements that associated keywords with it. It was not long before unscrupulous web site authors placed popular keywords on their pages instead of keywords that reflected their site's content. To fight this meta/link spam, search engines stopped using the meta/link elements in their search ranking and eventually blocked the worst abusers of those elements.
The same issues are going to crop up again when less scrupulous search engine optimization companies discover semantic data spam. The search engine companies will once again adjust their methods to ignore, block, or ban such abuses.
Care must be taken when authoring semantic data. Semantic data that is correct will draw more visitors to your site than semantic data that is incorrect.