Draft
 M. Wahl
 Informed Control Inc.
 September 6, 2007


Identity Schema Element Metadata: Basic Retrieval

Abstract

This document defines a procedure for retrieving metadata, expressed in RDF as RDF/XML or XHTML with RDFa, about an identity schema or schema element, by contacting the site named in the schema or schema element URI using HTTP or HTTPS.



Table of Contents

1.  Introduction
2.  Metadata Retrieval
    2.1.  Building the HTTP Request
    2.2.  Receiving the HTTP Response
    2.3.  Parsing the Returned Document to RDF Triples
        2.3.1.  Parsing application/rdf+xml
        2.3.2.  Parsing application/xhtml+xml
    2.4.  Predicates
3.  Example
4.  Security Considerations
5.  References
    5.1.  Normative References
    5.2.  Informative References
Appendix A.  Copyright
§  Author's Address




1.  Introduction

This document defines a procedure by which a retriever can obtain a description of an identity attribute type, an identity claim type, or an identity schema, from a web site.

The procedure defined in this document is applicable to some InfoCard (Microsoft, “A Technical Reference for InfoCard v1.0 on Windows,” August 2005.) [InfoCard.interop] claim types, OpenID AX (Hardt, D., Bufu, J., and J. Hoyt, “OpenID Attribute Exchange 1.0 - Draft 07,” August 2007.) [OpenID.attribute-1.0] attribute types, and SAML 2.0 (“Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0,” March 2005.) [SAML] attribute types.

The returned metadata of the identity schema or schema element is described using RDF (Resource Description Framework (Klyne, G. and J. Carroll, “Resource Description Framework (RDF): Concepts and Abstract Syntax,” February 2004.) [RDF.Concepts]).

This document does not specify procedures for any of the scenarios in which:

The following namespace prefixes are used in this document:

xml: for the XML xml:base and xml:lang attributes

rdf: as defined in the RDF syntax specification (Beckett, D., “RDF/XML Syntax Specification (Revised),” February 2004.) [RDF.SyntaxGrammar]: http://www.w3.org/1999/02/22-rdf-syntax-ns#

rdfs: as defined in the RDF schema (Brickley, D. and R. Guha, “RDF Vocabulary Description Language 1.0: RDF Schema,” February 2004.) [RDF.Schema] specification: http://www.w3.org/2000/01/rdf-schema#

owl: as defined in the OWL (“OWL Web Ontology Language Reference,” February 2004.) [OWL.reference] specification: http://www.w3.org/2002/07/owl#

dc: as defined in the Dublin Core (“Dublin Core Metadata Element Set, Version 1.1,” December 2006.) [DC.es] specification: http://purl.org/dc/elements/1.1/

higgins: as defined in the Higgins ontology (“Higgins Ontology”) [Higgins.Ontology] specification: http://www.eclipse.org/higgins/ontologies/2006/higgins.owl#

ex: an example schema ontology, http://www.example.com/schema.rdf#

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” March 1997.) [RFC2119].

Please send comments to the identity schemas WG mailing list at idschemas@idcommons.net.



2.  Metadata Retrieval

The input to the metadata retrieval procedure is a single Uniform Resource Identifier (the input URI), as defined in RFC 3986 (Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” January 2005.) [RFC3986], which is typically used by an application as an identifier for an attribute type or claim type.

This document assumes the input URI has one of the two URI schemes http or https. Other schemes are not currently supported by this mechanism, but might be addressed in future revisions of this document, or in companion documents.



2.1.  Building the HTTP Request

When the input URI is of the http scheme, a retriever SHOULD send a GET request using the HTTP/1.1 protocol, as described in RFC 2616 (“Hypertext Transfer Protocol -- HTTP/1.1,” June 1999.) [RFC2616].

If the input URI is of the https scheme, then a retriever SHOULD send a GET request using the HTTP/1.1 protocol layered atop SSL or TLS, as described in the HTTPS specification RFC 2818 (Rescorla, “HTTP Over TLS,” May 2000.) [RFC2818]. A retriever MUST implement the requirements on choice of transport layer security mechanisms specified in section 3 of the WS-I Basic Security Profile (“Basic Security Profile Version 1.0,” March 2007.) [WSI.BasicSecurityProfile].

Alternatively, in place of sending a GET request, a retriever that has cached a copy of a document retrieved from that base URI (the input URI without a fragment) MAY first send a HEAD request, to determine if the document has changed.

In the header of a GET or HEAD request, a retriever MUST include an "Accept" request header that lists one or both of the media types "application/rdf+xml", as defined in RFC 3870 (Swartz, “application/rdf+xml Media Type Registration,” September 2004.) [RFC3870], and "application/xhtml+xml", as defined in XHTML Media Types (W3C, “XHTML Media Types,” August 2002.) [W3C.XhtmlMediaTypes]. A retriever SHOULD support both application/rdf+xml (RDF/XML) and application/xhtml+xml (XHTML with RDFa). The retriever MAY specify other media types in addition to these.

Note that URI fragments are not sent to the web server.
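
The request construction described in this section can be sketched as follows. This is a minimal illustration in Python, not part of the procedure itself: the function name build_request and the constant ACCEPT are hypothetical, and the sketch only builds the request object without sending it.

```python
from urllib.parse import urldefrag, urlsplit
from urllib.request import Request

# Both media types named in this section (assumed ordering; no q-values).
ACCEPT = "application/rdf+xml, application/xhtml+xml"


def build_request(input_uri: str, method: str = "GET") -> Request:
    """Build, but do not send, the metadata request for an input URI."""
    scheme = urlsplit(input_uri).scheme.lower()
    if scheme not in ("http", "https"):
        raise ValueError("unsupported URI scheme: %r" % scheme)
    # The fragment names the schema element inside the document and is
    # not sent to the web server, so it is removed from the request URI.
    document_uri, _fragment = urldefrag(input_uri)
    return Request(document_uri, headers={"Accept": ACCEPT}, method=method)


req = build_request("http://www.example.com/schema.rdf#age")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

A retriever holding a cached copy could pass method="HEAD" to the same hypothetical function before deciding whether to issue the GET request.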



2.2.  Receiving the HTTP Response

If the HTTP response status code is in the redirection (3XX) range, then a retriever SHOULD follow the redirection to locate the document. A retriever SHOULD NOT rewrite the input URI based on this redirection.

If the HTTP response status code is in the error (4XX or 5XX) ranges, then a retriever SHOULD abort the procedure: no metadata is available from the web site in RDF.

Otherwise, a retriever MUST check that the media type returned in the Content-Type header of the response has a suffix "+xml". If the media type does not have the suffix "+xml", then the retriever SHOULD abort the procedure, as no metadata is available from the web site in RDF/XML or RDFa. (This is to prevent an RDF parser from attempting to parse a text/html document, which would typically result in error messages.)
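
The status code and media type checks above can be sketched as a single decision function. This is an illustrative sketch only: the function name next_step and its three return values are assumptions made for this example, and the sketch treats every 3XX status as a redirection, as the text above does.

```python
def next_step(status: int, content_type: str) -> str:
    """Return 'redirect', 'abort' or 'parse' for an HTTP response."""
    if 300 <= status <= 399:
        # Follow the redirection; the input URI itself is not rewritten.
        return "redirect"
    if 400 <= status <= 599:
        # No metadata is available from the web site in RDF.
        return "abort"
    # Drop parameters such as "; charset=utf-8" before testing the suffix.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if not media_type.endswith("+xml"):
        return "abort"
    return "parse"


print(next_step(200, "application/rdf+xml; charset=utf-8"))
print(next_step(200, "text/html"))
```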

A retriever MUST permit the returned XML content to be encoded in either the UTF-8 RFC 3629 (Yergeau, F., “UTF-8, a transformation format of ISO 10646,” November 2003.) [RFC3629] or UTF-16 RFC 2781 (Hoffman, P. and F. Yergeau, “UTF-16, an encoding of ISO 10646,” February 2000.) [RFC2781] charset encodings, and MUST allow the returned content to have an ISO 10646 Byte Order Mark at the beginning of the content. (This is encouraged by the WS-I Basic Profile (“Basic Profile Version 1.2,” March 2007.) [WSI.BasicProfile] requirements R4001 and R1010.)
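
As an illustration of the encoding requirement, the following Python sketch decodes a response body that is UTF-8 or UTF-16, with or without a leading Byte Order Mark. The function name decode_xml_bytes is hypothetical, and the sketch assumes that content without a Byte Order Mark is UTF-8. An XML parser that is given the raw bytes would typically perform this detection itself.

```python
import codecs


def decode_xml_bytes(data: bytes) -> str:
    """Decode UTF-8 or UTF-16 content, tolerating a Byte Order Mark."""
    if data.startswith(codecs.BOM_UTF8):
        # The "utf-8-sig" codec strips the leading Byte Order Mark.
        return data.decode("utf-8-sig")
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the Byte Order Mark to pick the byte order.
        return data.decode("utf-16")
    # Assumption for this sketch: no Byte Order Mark means UTF-8.
    return data.decode("utf-8")


sample = "<label>\u00c2ge</label>"
print(decode_xml_bytes(codecs.BOM_UTF8 + sample.encode("utf-8")) == sample)
```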

A retriever MUST allow an XML signature to be present. It is a local policy choice whether the retriever requires a signature to be present, or how the retriever obtains the certificate path necessary to validate the signature.



2.3.  Parsing the Returned Document to RDF Triples

An RDF-based XML document can be converted into a collection of RDF triples. Each RDF triple has a subject, a predicate URI and an object.

The metadata of the schema or schema element (e.g., the attribute or claim type) of interest is provided by the set of RDF triples from the document in which the subject of the RDF triple is the input URI of the schema or schema element. There might be RDF triples for other subjects present in the returned document.

If a fragment was part of the input URI, then the RDF triples that provide the metadata for the schema element are those in which the subject of the RDF triple matches the input URI, including its fragment.

The following two sections cover how a retriever parses a document returned in the Content-Type application/rdf+xml (RDF/XML) and a document returned in the Content-Type application/xhtml+xml (XHTML with RDFa).



2.3.1.  Parsing application/rdf+xml

A document with a Content-Type of application/rdf+xml is to be parsed as RDF/XML. A retriever MUST start parsing the section of the XML document which is contained by an element RDF in the XML namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#". A retriever MUST allow that other namespaces be present. If there is no RDF element present, then the retriever SHOULD abort the procedure.

The subject of RDF triples is specified using the rdf:about and rdf:ID XML attributes of the elements of the returned document. A retriever MUST parse any xml:base, rdf:ID and rdf:about XML attributes of the elements in the returned XML document, as described in sections 2.14 and 2.17 of the RDF/XML Syntax Specification (Beckett, D., “RDF/XML Syntax Specification (Revised),” February 2004.) [RDF.SyntaxGrammar]. This is typically handled by an RDF parser software component of the retriever.

The retriever MUST allow the subject to be expressed using any of the rdf:about or rdf:ID conventions. For example, if the input URI is http://www.example.com/schema.rdf#first, and the returned document has a base URI of http://www.example.com/schema.rdf, then all four of the following elements in that document describe the same schema element:

<rdf:Description rdf:about="http://www.example.com/schema.rdf#first"> ... </rdf:Description>
<rdf:Description rdf:about="#first"> ... </rdf:Description>
<rdf:Description rdf:ID="first"> ... </rdf:Description>
<owl:ObjectProperty rdf:ID="first"> ... </owl:ObjectProperty>

A fragment is case sensitive. The following rdf:Description does NOT describe that same schema element:

<rdf:Description rdf:ID="First"> ... </rdf:Description>
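
The equivalence of these forms can be checked with a short Python sketch that resolves rdf:about and rdf:ID values against the document base URI. The function name subject_uri is hypothetical, and the sketch covers only the two attributes discussed here; an RDF parser component would normally do this work.

```python
from urllib.parse import urljoin

BASE = "http://www.example.com/schema.rdf"


def subject_uri(base_uri, about=None, rdf_id=None):
    """Resolve an rdf:about or rdf:ID value to an absolute subject URI."""
    if rdf_id is not None:
        # rdf:ID="x" names the same resource as rdf:about="#x".
        return urljoin(base_uri, "#" + rdf_id)
    if about is not None:
        return urljoin(base_uri, about)
    return None


target = "http://www.example.com/schema.rdf#first"
print(subject_uri(BASE, about=target) == target)
print(subject_uri(BASE, about="#first") == target)
print(subject_uri(BASE, rdf_id="first") == target)
print(subject_uri(BASE, rdf_id="First") == target)  # differs only in case
```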


2.3.2.  Parsing application/xhtml+xml

A document with a Content-Type of application/xhtml+xml is to be parsed as XHTML containing RDFa (Adida, B. and M. Birbeck, “RDFa Primer 1.0: Embedding RDF in XHTML; W3C Editors' Draft,” September 2007.) [RDFa.Primer] markup.

A retriever MUST parse any xml:base, xml:lang, lang, xmlns, about, rel, rev, property, href, resource, src, datatype, content and instanceof XML attributes of the elements in the returned XML document, as described in sections 2.3 and 3.1 of the RDFa Syntax (Birbeck, M., Pemberton, S., Adida, B., and S. McCarron, “RDFa Syntax: A collection of attributes for layering RDF on XML languages; W3C Editors' Draft,” September 2007.) [RDFa.Syntax].

The subject of RDF triples in RDFa is typically specified using the about attributes of the elements of the returned document. A retriever MUST also recognize the xml:base and rev XML attributes, as these can affect how the subject is specified.



2.4.  Predicates

A retriever SHOULD recognize the predicates listed in the documents Identity Schema Element Metadata: Using Existing Ontologies (Wahl, M., “Identity Schema Element Metadata: Using Existing Specifications,” September 2007.) [Schema.Existing] and Identity Schema Element Metadata: New Ontology (Wahl, M., “Identity Schema Element Metadata: New Specification,” September 2007.) [Schema.New].

For example, a retriever SHOULD use the value or values of the RDF triples with predicate rdfs:label as a short plain text description of the schema or schema element, and the value or values of the RDF triples with predicate rdfs:comment as a long plain text description of the schema or schema element. (There may be multiple RDF triples for a given predicate URI, with different xml:lang or lang values).
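
One way a retriever might choose among several rdfs:label values is sketched below in Python. The function name pick_label, the (text, language) pair representation, and the fallback order (preferred language, then an untagged value, then the first value) are assumptions made for this illustration and are not requirements of this document.

```python
def pick_label(labels, preferred_lang=None):
    """Choose one label from a list of (text, language) pairs.

    The language is None for a value with no xml:lang or lang attribute.
    """
    if preferred_lang:
        for text, lang in labels:
            if lang and lang.lower() == preferred_lang.lower():
                return text
    # Fall back to a value that carries no language tag.
    for text, lang in labels:
        if lang is None:
            return text
    # Otherwise take the first value, if there is one.
    return labels[0][0] if labels else None


labels = [("Age", None), ("Alter", "de"), ("\u00c2ge", "fr")]
print(pick_label(labels, "de"))
print(pick_label(labels, "es"))
```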



3.  Example

For example, if the input URI for a schema element is http://www.example.com/schema.rdf#age, then the retriever would send a GET request to the HTTP port of www.example.com:

GET /schema.rdf HTTP/1.1
Host: www.example.com
Accept: application/rdf+xml, application/xhtml+xml

If the file is stored in RDF/XML, the web server returns the following response (where nnn in the second line of the header is the length of the XML file in bytes):

HTTP/1.1 200 OK
Content-Length: nnn
Content-Type: application/rdf+xml

<?xml version="1.0"?>
<rdf:RDF xml:base="http://www.example.com/schema.rdf"
         xmlns:ex="http://www.example.com/schema.rdf#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:higgins="http://www.eclipse.org/higgins/ontologies/2006/higgins.owl#">
<owl:Ontology rdf:about="">
 <rdfs:label>Example schema containing one attribute type.</rdfs:label>
</owl:Ontology>

<rdf:Description rdf:ID="age">
 <rdfs:label>Age</rdfs:label>
 <rdfs:label xml:lang="de">Alter</rdfs:label>
 <rdfs:label xml:lang="fr">&#xC2;ge</rdfs:label>
 <rdfs:comment>How old a person is (in years)</rdfs:comment>
 <rdf:type>
  <rdf:Description rdf:about="http://www.w3.org/2002/07/owl#ObjectProperty"/>
 </rdf:type>
 <rdfs:subPropertyOf>
  <rdf:Description rdf:about="http://www.eclipse.org/higgins/ontologies/2006/higgins.owl#attribute"/>
 </rdfs:subPropertyOf>
</rdf:Description>

</rdf:RDF>

If the file is stored in XHTML with RDFa, the web server returns the following response (where nnn in the second line of the header is the length of the XML file in bytes):

HTTP/1.1 200 OK
Content-Length: nnn
Content-Type: application/xhtml+xml

<?xml version="1.0"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<head about="">
 <title property="rdfs:label">Example schema containing one attribute type.</title>
 <link rel="rdf:type" href="http://www.w3.org/2002/07/owl#Ontology" />
</head>
<body>

 <ul about="#age">
  <li><span property="rdfs:label">Age</span></li>
  <li><span property="rdfs:label" lang="de">Alter</span> (German)</li>
  <li><span property="rdfs:label" lang="fr">&#xC2;ge</span> (French)</li>
  <li>Comment: <span property="rdfs:comment">How old a person is (in years)</span></li>
  <li>This is an <a rel="rdf:type" href="http://www.w3.org/2002/07/owl#ObjectProperty">OWL ObjectProperty</a>.</li>
  <li>This is a sub-property of a
   <a rel="rdfs:subPropertyOf" href="http://www.eclipse.org/higgins/ontologies/2006/higgins.owl#attribute">Higgins attribute</a>.</li>
  </ul>
</body>
</html>

For both the RDF/XML and RDFa document formats, the document base URI is http://www.example.com/schema.rdf, and the document contains the following RDF triples for the requested http://www.example.com/schema.rdf#age attribute type:

ex:age   rdfs:label          "Age"
ex:age   rdfs:label          "Alter"  (in locale for language "de")
ex:age   rdfs:label          "Âge"    (in locale for language "fr")
ex:age   rdfs:comment        "How old a person is (in years)"
ex:age   rdf:type            owl:ObjectProperty
ex:age   rdfs:subPropertyOf  higgins:attribute


4.  Security Considerations

As with other scenarios for HTTP-based clients, retrievers SHOULD implement a local policy on contacting URIs received from unfamiliar sources. For example, an attacker might supply claim type URIs whose retrieval has side effects.

It is anticipated that the majority of Internet-facing services which provide a schema retrieval service will be providing publicly visible schema. The protection against disclosure of private schema definitions, through authentication and access control checks, is outside of the scope of this document.

When not using the HTTPS transport protocol, there is a possibility for the XML documents to be modified while in transit. There is also a possibility for an alternative XML document to be provided to the retriever by an attacker in place of the intended XML document, if the attacker can spoof the identity of the contacted web site.

Security considerations for RDF in XML are included in section 6 of RFC 3870 (Swartz, “application/rdf+xml Media Type Registration,” September 2004.) [RFC3870]; those for XML media types in general are included in section 10 of RFC 3023 (Murata, M., St.Laurent, S., and D. Kohn, “XML Media Types,” January 2001.) [RFC3023]; and those for media types which trigger directives on the receiver are included in section 2.2.6 of RFC 2048 (Freed, N., Klensin, J., and J. Postel, “Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures,” November 1996.) [RFC2048].

Security considerations for directory schema are also included in section 5 of LDAPv3 Schema for User Applications (Sciberras, A., “Lightweight Directory Access Protocol (LDAP): Schema for User Applications,” June 2006.) [RFC4519].



5.  References



5.1. Normative References

[DC.es] “Dublin Core Metadata Element Set, Version 1.1,” December 2006.
[Higgins.Ontology] “Higgins Ontology.”
[OWL.reference] “OWL Web Ontology Language Reference,” February 2004.
[RDF.Concepts] Klyne, G. and J. Carroll, “Resource Description Framework (RDF): Concepts and Abstract Syntax,” February 2004.
[RDF.Schema] Brickley, D. and R. Guha, “RDF Vocabulary Description Language 1.0: RDF Schema,” February 2004.
[RDF.SyntaxGrammar] Beckett, D., “RDF/XML Syntax Specification (Revised),” February 2004.
[RDFa.Primer] Adida, B. and M. Birbeck, “RDFa Primer 1.0: Embedding RDF in XHTML; W3C Editors' Draft,” September 2007.
[RDFa.Syntax] Birbeck, M., Pemberton, S., Adida, B., and S. McCarron, “RDFa Syntax: A collection of attributes for layering RDF on XML languages; W3C Editors' Draft,” September 2007.
[RFC2119] Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels,” BCP 14, RFC 2119, March 1997.
[RFC2616] “Hypertext Transfer Protocol -- HTTP/1.1,” RFC 2616, June 1999.
[RFC2781] Hoffman, P. and F. Yergeau, “UTF-16, an encoding of ISO 10646,” RFC 2781, February 2000.
[RFC2818] Rescorla, “HTTP Over TLS,” RFC 2818, May 2000.
[RFC3023] Murata, M., St.Laurent, S., and D. Kohn, “XML Media Types,” RFC 3023, January 2001.
[RFC3629] Yergeau, F., “UTF-8, a transformation format of ISO 10646,” STD 63, RFC 3629, November 2003.
[RFC3870] Swartz, “application/rdf+xml Media Type Registration,” RFC 3870, September 2004.
[RFC3986] Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax,” RFC 3986, January 2005.
[Schema.Existing] Wahl, M., “Identity Schema Element Metadata: Using Existing Specifications,” September 2007.
[Schema.New] Wahl, M., “Identity Schema Element Metadata: New Specification,” September 2007.
[W3C.XhtmlMediaTypes] W3C, “XHTML Media Types,” August 2002.
[WSI.BasicSecurityProfile] “Basic Security Profile Version 1.0,” March 2007.


5.2. Informative References

[InfoCard.interop] Microsoft, “A Technical Reference for InfoCard v1.0 on Windows,” August 2005.
[OpenID.attribute-1.0] Hardt, D., Bufu, J., and J. Hoyt, “OpenID Attribute Exchange 1.0 - Draft 07,” August 2007.
[RFC2048] Freed, N., Klensin, J., and J. Postel, “Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures,” RFC 2048, November 1996.
[RFC4519] Sciberras, A., “Lightweight Directory Access Protocol (LDAP): Schema for User Applications,” RFC 4519, June 2006.
[SAML] “Assertions and Protocols for the OASIS Security Assertion Markup Language (SAML) V2.0,” March 2005.
[WSI.BasicProfile] “Basic Profile Version 1.2,” March 2007.


Appendix A.  Copyright

Copyright (C) Informed Control Inc. (2007).

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, AND THE ORGANIZATION HE/SHE REPRESENTS DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.



Author's Address

  Mark Wahl
  Informed Control Inc.
  PO Box 90626
  Austin, TX 78709
  US
Email:  mark.wahl@informed-control.com