Html5-rdfa-wd-issues

From RDFaWiki

Introduction

The HTML5+RDFa Working Draft was introduced on September 1st, 2009.

Issues

General comments

  • I found it very hard to follow this document, since it seems to assume full knowledge of RDFa in XHTML and only defines a delta. As a result:
    • It was hard for me to understand the actual processing model, so that I'd understand what I had to do as an implementor.
    • I had no notion of the syntax, so I wouldn't know what to do as an author.
    • As a reviewer, it was impossible for me to determine if the processing requirements were precisely specified, free of contradictions and sane.
  • To me, the document implies that the changes here apply only to HTML (as in the text/html serialization), but XHTML (even XHTML5) should be processed strictly according to XHTML+RDFa and nothing else. The changes actually unify RDFa Processing rules for both XHTML and HTML markup - that needs to be clarified in the document.
  • There have been some discussions of some modest extensions to RDFa to allow cleaner use in text/html in a way that allows DOM consistency between text/html and XML. For example, there was the idea to use a "prefix" attribute instead of xmlns: declarations to define CURIE prefixes, and also the idea to allow full URIs as an alternative to CURIEs. Have these ideas been rejected?
    • DEFERRED to the next version of RDFa. The prefix attribute and URIs in CURIEs will be covered in a more HTML/XHTML-unified RDFa specification - RDFa 1.1.
  • At the very least, references to the appropriate sections of XHTML+RDFa should be made explicit. Right now it seems there is a lot of implicit linkage.
    • ADDRESSED by linking each section in the spec to the related XHTML+RDFa section.
  • ISSUE I disagree with the characterization of the deferred issues as "implementation details".
  • ISSUE Integrate all of Philip Taylor's feedback into the issues list: http://lists.w3.org/Archives/Public/public-html/2009Sep/0705.html
  • ISSUE Integrate addressable issues in Jonas Sicking's feedback into the issues list: http://lists.w3.org/Archives/Public/public-html/2009Sep/0745.html
  • ISSUE The browser-hosted dueling xmlns: attributes in the DOM issue is something that we should clarify in the spec. What happens if there is an attribute named xmlns:foo and a 'foo' item in the namespaces? The namespaces 'foo' value should override the xmlns:foo attribute.
  • ISSUE Cover what happens when the DOM is modified in the spec.

Status of this Document

2 Parsing model

3.1 Document Conformance

  • The Document Conformance requirements strike me as odd. They do not incorporate the full HTML5 conformance requirements (with appropriate extensions) or any of the RDFa conformance requirements, and the motivation for the specific requirements is unclear. The only MUST requirement is having a conforming HTML5 root element. It seems odd to call a document that only meets this one condition a "conforming HTML+RDFa document".
  • ISSUE It would be more consistent to use something like 'HTML5+RDFa1.0' as a value for the HTML5 superset. The semantics, content models, and element and attribute collections of HTML5 differ from (and are partly incompatible with) those of XHTML1.1, and neither is a superset of the other, so a separate version indication for the HTML5 variant seems essential to tell the two apart.

3.2 User Agent Conformance

  • "excluding those features which are specifically overridden by this specification": it might be useful to list these features.
    • ADDRESSED by linking to the "Modifications to XHTML+RDFa" section, which lists all features in XHTML+RDFa that are overridden by the HTML+RDFa specification.

4 Modifications to XHTML+RDFa

  • Are these modifications intended to apply only to the HTML serialization of HTML5? That seems like the intent, but it's not totally clear to me.
  • One concern I have with only applying the changes to HTML: what if an RDFa processor has a parsed DOM, but does not know if the DOM was originally created from parsing HTML or XML? It would be better if a single set of rules could be used once you have a DOM, without having to know what kind it is, since the DOM itself does not directly expose that information.

4.1 Specifying the language for a literal

  • I suggest referring to the HTML5 rules for determining the language of a node.
  • The last paragraph of 4.1 confuses DOCTYPE and MIME type.
  • ISSUE The first and second sentences seem potentially contradictory. The section should instead say that the language of the node is determined according to the HTML5 rules, and that the result is then used in the processing model in the place where RDFa in XHTML says xml:lang is used.
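
The suggestion to reuse the HTML5 rules for determining the language of a node can be illustrated with a minimal sketch. It assumes a simplified reading of those rules: walk from the node up through its ancestors, prefer a lang attribute in the XML namespace over a lang attribute in no namespace on the same element, and treat an empty value as "unknown". The function name language_of is hypothetical, and the fallbacks HTML5 defines beyond the ancestor walk (such as a pragma-set default language) are deliberately left out.

```python
from xml.dom import minidom, Node

XML_NS = "http://www.w3.org/XML/1998/namespace"


def language_of(node):
    """Return the language tag that applies to node, or None if unknown.

    Simplified sketch: only the ancestor walk is implemented, and the
    no-namespace lang attribute is honoured on every element (an assumption;
    HTML5 restricts it to HTML elements).
    """
    current = node if node.nodeType == Node.ELEMENT_NODE else node.parentNode
    while current is not None and current.nodeType == Node.ELEMENT_NODE:
        # xml:lang (in the XML namespace) wins over lang on the same element.
        if current.hasAttributeNS(XML_NS, "lang"):
            return current.getAttributeNS(XML_NS, "lang") or None
        if current.hasAttribute("lang"):
            return current.getAttribute("lang") or None
        current = current.parentNode
    return None
```

An RDFa processor built this way would call language_of on the current element and use the result wherever the XHTML+RDFa processing model refers to xml:lang, so that a single lookup serves both serializations.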

4.2 Invalid XMLLiteral values

  • Do XMLLiteral values only need to be well-formed, or do they need to be namespace well-formed? I think the definition here should point to a definition of well-formed XML fragment.
  • Section 4.2 talks about well-formed XML, but XMLLiterals are generally not well-formed (i.e. they aren't guaranteed to have a single root).
  • Section 4.2 should probably require the Coercion to Infoset rules be applied prior to applying the serialization algorithm.
  • It seems like the serialization as XHTML5 per the HTML5 spec rules should always be done. A DOM fragment doesn't really have a notion of being well-formed XML or not - it needs to be serialized somehow. And it probably makes sense to use the HTML5 algorithm regardless of whether the source DOM tree was HTML or XML. This might avoid the need to link to any well-formedness definitions (not sure though).
  • ISSUE The XMLLiteral example is wrong because it includes namespaces that are not used in the literal.

4.3 The xmlns: attribute

  • This section seems to lack technical precision. Some specific points:
    • There's no such thing as "the xmlns: attribute"; xmlns is a predefined namespace prefix, not an attribute at all
      • ADDRESSED by removing all references to "the xmlns: attribute" and using the phrase "xmlns:-prefixed attributes" or "attributes that start with xmlns:"
    • "CURIE prefix mappings specified using xmlns:" does not clearly specify how attributes starting with xmlns: turn into prefix mappings. The processing model for this should be defined precisely.
      • ADDRESSED by referencing Section 5.4 of the XHTML+RDFa specification that specifies how attributes starting with xmlns: are transformed into prefix mappings. The processing model is defined precisely in the XHTML+RDFa spec and has more than 8 conformant implementations demonstrating the understandability of the XHTML+RDFa text.
    • The reference to [Namespaces in XML] doesn't really help, because it defines how namespace declarations work in XML only, and does not have anything to say about HTML or CURIEs for that matter.
      • ADDRESSED by removing the reference to Namespaces in XML, as the XHTML+RDFa spec specifies the exact behavior of attributes starting with xmlns:, generating prefix mappings and using CURIEs.
    • Attributes starting with xmlns: will look different in HTML and XML DOMs, and perhaps different still when created with DOM API calls; the text should take account of this.
      • DEFERRED because this is an implementation detail that can be easily addressed by providing implementation details for various languages in a separate implementers note (on the rdfa.info/wiki or in a separate document).
    • ISSUE Please clarify what 4.3 intends to say in terms of DOM Level 2 or the Infoset. (need more feedback from Henri regarding this change, specifically, why he thinks the section needs to discuss attributes starting with xmlns: in terms of DOM Level 2 or the Infoset)
  • The draft references Namespaces in XML, not Section 5.5 of RDFa in XHTML.
    • ADDRESSED by referencing Section 5.4 of the XHTML+RDFa specification.
  • "Next the [current element] is parsed for [URI mapping]s... Mappings are provided by @xmlns. The value to be mapped is set by the XML namespace prefix, and the value to map is the value of the attribute—a URI."
    • DEFERRED because the language should be corrected in the XHTML+RDFa specification Errata document. An e-mail has been sent to one of the editors of XHTML+RDFa.
  • Since HTML doesn't really have a notion of XML namespace prefix, the processing rules need to be defined in terms of the textual name of the attribute for HTML DOMs; you can't soundly reference XML-only concepts to define things for HTML.
    • ADDRESSED by referencing Section 5.4 of the XHTML+RDFa spec, which doesn't reference any XML-only concepts when specifying the algorithm that should be used to generate the mappings.
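
To make concrete the request that mappings be defined in terms of the textual attribute name, here is a minimal sketch. It is not the algorithm from Section 5.4 of XHTML+RDFa. The function names prefix_mappings and expand_curie are hypothetical, attributes are assumed to arrive as a plain name-to-value dictionary, and the match on "xmlns:" is exact (case handling is left open, as it is in the issues above).

```python
XMLNS_PREFIX = "xmlns:"


def prefix_mappings(attributes, inherited=None):
    """Build CURIE prefix mappings from attribute names that start with 'xmlns:'.

    attributes: dict of attribute name -> value for the current element.
    inherited:  mappings already in scope from ancestor elements (optional).
    """
    mappings = dict(inherited or {})
    for name, value in attributes.items():
        # Purely textual test: no XML namespace machinery is consulted.
        if name.startswith(XMLNS_PREFIX) and len(name) > len(XMLNS_PREFIX):
            mappings[name[len(XMLNS_PREFIX):]] = value
    return mappings


def expand_curie(curie, mappings):
    """Expand 'prefix:reference' to a full URI, or return None if unmapped."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in mappings:
        return None
    return mappings[prefix] + reference
```

Because the function only inspects attribute names as strings, the same code could run over attributes taken from an HTML DOM or an XML DOM, provided the caller supplies the qualified attribute name in both cases.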

5 Modifications to HTML5

5.1 Preservation of the profile keyword

  • I'd suggest renaming this "The profile link type"; unknown rel values are not dropped from the DOM, so there's nothing to preserve; the change is adding an additional conformance link type.
  • I'd suggest restating the sentence to say something like "for content conforming to this specification, profile is a conforming link type."
  • I'd suggest adding a table row that matches the table in HTML5 Section 6.12.3 Link types to define all the needed info.
  • HTML+RDFa extends HTML5 by making <link rel="profile"> and xmlns:* attributes conforming (that seems to be the intent at least). But it does not seem to make @rev, @content, @about, @property, @resource, @datatype or @object conforming or define their conforming values in HTML5. It also does not make CURIEs allowed rel values in HTML5. Thus, it does not seem to actually make RDFa syntax legal for text/html documents.
    • ADDRESSED by adding a section called "The RDFa Attributes and Valid Values" that normatively references Section 2.1 of the XHTML+RDFa specification.

5.2 Preservation and Validation of xmlns:

  • I suggest changing the section header to "Attributes that start with xmlns:"
    • ADDRESSED by renaming the section header to "Attributes that Start with xmlns:".
  • The requirement to preserve attributes starting with "xmlns:" is redundant with what HTML5 already requires. I suggest changing it to an informative note that HTML5 parsing will have this result, or striking that requirement entirely.
    • ADDRESSED by noting that HTML5 preserves attributes starting with "xmlns:", but also specifying that those attributes should be considered conformant.
  • The change should be stated as a conformance change instead of a requirement for validators not to generate warnings (they will in fact generate errors, not warnings in text/html). Something like: "For documents conforming to this specification, attributes with names that have the case insensitive prefix 'xmlns:' are conforming."
    • ADDRESSED by adding the text specified above.
  • Which attribute names starting with 'xmlns:' are conforming - all of them? Even ones in uppercase or mixed case? Even ones that are not namespace-well-formed XML namespace declarations? Even ones that would trigger strange error behavior in text/html parsing? I think there actually needs to be a specific syntax for the additional attributes that are allowed, and it should be defined here. Then the spec can say that attributes with names matching that syntax are conforming.
    • DEFERRED because the definition of which xmlns: values are conforming should be specified in the main HTML5 specification because the XHTML5 serialization needs to have this defined as well. The conforming values for xmlns: should be the same for both HTML5 and XHTML5.
  • There should probably also be a requirement that the attribute value is whatever Namespaces in XML allows as values for namespace declarations
    • DEFERRED because there was a previous issue raised that non-XML mode HTML+RDFa should not reference the Namespaces in XML document for any normative language. The spec now also references XHTML+RDFa Section 5.4 on how to process attributes starting with xmlns: without referring to Namespaces in XML.
  • ISSUE Please clarify what 5.2 means for processors that expose the content of a text/html document using a non-DOM API view to the Infoset (e.g. XOM). (this probably belongs in an implementation guide)
  • So is it the idea to leave the issue of xmlns and xmlns:<prefix> being different in the DOM generated from a text/html byte stream and the DOM generated from a text/xml byte stream unaddressed? I don't see how that can work without more text because xmlns and xmlns:<prefix> as generated from a text/html byte stream do not carry the namespace semantics that xmlns and xmlns:<prefix> from a text/xml byte stream do. (They are not in a namespace.)
    • DEFERRED because this is an implementation detail that will be outlined in rdfa.info/wiki or a separate implementors guide document.
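
The difference between the two byte streams can be observed with a short sketch using only the Python standard library. This is an illustration under stated assumptions, not a model of a browser DOM: html.parser.HTMLParser is a tokenizer rather than an HTML5 tree builder, and the helper names html_view and xml_view are hypothetical. The same markup is fed to an HTML tokenizer and to a namespace-aware XML parser.

```python
from html.parser import HTMLParser
from xml.dom import minidom

MARKUP = '<div xmlns:foaf="http://xmlns.com/foaf/0.1/">x</div>'


class AttributeCollector(HTMLParser):
    """Record every (name, value) attribute pair seen on start tags."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.extend(attrs)


def html_view(markup):
    """Attributes as an HTML tokenizer reports them: plain textual names."""
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


def xml_view(markup):
    """The same attribute as a namespace-aware XML parser reports it."""
    element = minidom.parseString(markup).documentElement
    attr = element.getAttributeNode("xmlns:foaf")
    return (attr.namespaceURI, attr.localName, attr.value)
```

On the HTML side the attribute is just a name/value pair whose name happens to be the string xmlns:foaf. On the XML side the parser places it in the http://www.w3.org/2000/xmlns/ namespace with local name foaf. A processor that looks attributes up by namespace and local name would therefore find the declaration in one tree and miss it in the other.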

Review E-mails

Command to generate the above list:

grep "HTML+RDFa" rdfatfsept.html | sed -re 's/.*href="(.*\.html)".*<em>(.*)<\/em>.*/\* http:\/\/lists.w3.org\/Archives\/Public\/public-rdf-in-xhtml-tf\/2009Sep\/\1 - \2/' | sort | cut -f 1 -d '"'