Copyright ©2000 W3C® (MIT, INRIA, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.
This is the specification of the Platform for Privacy Preferences (P3P). This document, along with its normative references, includes all the specification necessary for the implementation of interoperable P3P applications.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. The latest status of this document series is maintained at the W3C.
This is a W3C Working Draft for review by W3C members and other interested parties. This document has been produced by the P3P Specification Working Group as part of the P3P Activity, and it is the second revision of the last call draft issued the 2nd of November 1999 (http://www.w3.org/TR/1999/WD-P3P-19991102). A change log is included at the end of this document for convenience. The last call period is expected to end on April 30, 2000. A revised version of this specification is expected to advance toward W3C Recommendation status after two interoperable implementations have been demonstrated.
This Working Draft includes an extension mechanism that can be used to extend the P3P vocabulary. The Working Group is particularly interested in feedback on how to improve this mechanism as well as examples of extensions that people would like to propose. These examples may be useful for improving the design of the extension mechanism. In addition, the Working Group may consider incorporating some of these ideas into the P3P vocabulary so that they need not be introduced later as extensions. The introduction to this document (Section 1) provides additional information about the status of this working draft and future versions of P3P.
While this document is in last call, it is still a draft document that may be updated, replaced, or obsoleted by other documents at any time. It is therefore inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress." A list of current W3C working drafts can be found at http://www.w3.org/TR/.
Please send comments to www-p3p-public-comments@w3.org (archived at http://lists.w3.org/Archives/Public/www-p3p-public-comments/).
The Platform for Privacy Preferences Project (P3P) enables Web sites to express their privacy practices in a standard format that can be retrieved automatically and interpreted easily by user agents. P3P user agents will allow users to be informed of site practices (in both machine- and human-readable formats) and to automate decision-making based on these practices when appropriate. Thus users need not read the privacy policies at every site they visit.
Although P3P provides a technical mechanism for ensuring that users can be informed about privacy policies before they release personal information, it does not provide a technical mechanism for making sure sites act according to their policies. Products implementing this specification MAY provide some assistance in that regard, but that is up to specific implementations and outside the scope of this specification. However, P3P is complementary to laws and self-regulatory programs that can provide enforcement mechanisms. In addition, P3P does not include mechanisms for transferring data or for securing personal data in transit or storage. P3P may be built into tools designed to facilitate data transfer. These tools should include appropriate security safeguards.
The P3P1.0 specification defines the syntax and semantics of P3P privacy policies, and the mechanisms for associating policies with Web resources. P3P policies consist of statements made using the P3P vocabulary for expressing privacy practices. P3P policies also reference elements of the P3P base data schema -- a standard set of data elements that all P3P user agents should be aware of. The P3P specification includes a mechanism for defining new data elements and data sets, and a simple mechanism that allows for extensions to the P3P vocabulary.
P3P version 1.0 is a protocol designed to inform Web users of the data-collection practices of Web sites. It provides a way for a Web site to encode its data-collection and data-use practices in a machine-readable XML format known as a P3P policy. The P3P specification defines:
The goal of P3P version 1.0 is twofold. First, it allows Web sites to present their data-collection practices in a standardized, machine-readable, easy-to-locate manner. Second, it enables Web users to understand what data will be collected by sites they visit, how that data will be used, and what data/uses they may "opt-out" of or "opt-in" to.
As an introduction to P3P, let us consider one common scenario which makes use of P3P. Sheila has decided to check out a store called TheCoolCatalog, located at http://www.thecoolcatalog.com/. Let us assume that TheCoolCatalog has placed P3P policies on all their pages, and that Sheila is using a Web browser with P3P built in.
Sheila types the address for TheCoolCatalog into her Web browser. When TheCoolCatalog's server returns their homepage, it also returns the P3P privacy policy which applies to that page. The policy states that the only data the site collects on its home page is the data found in standard HTTP access logs. Now Sheila's Web browser checks this policy against the preferences Sheila has given it. Is this policy acceptable to her, or should she be notified? Let's assume that Sheila has told her browser that this is acceptable. In this case, the homepage is displayed normally, with no pop-up messages appearing. Perhaps her browser displays a small icon somewhere along the edge of its window to tell her that a privacy policy was given by the site, and that it matched her preferences.
Next, Sheila clicks on a link to the site's online catalog. The catalog section of the site has some more complex software behind it. This software uses cookies to implement a "shopping cart" feature. Since more information is being gathered in this section of the web site, the Web server sends a new P3P policy to Sheila's browser. Again, let's assume that this policy matches Sheila's preferences, so she gets no pop-up messages. Sheila continues and selects a few items she wishes to purchase. Then she proceeds to the checkout page.
The checkout page of TheCoolCatalog requires some additional information: Sheila's name, address, credit card number, and telephone number. The web site sends a new P3P policy that describes the data that is collected here and states that her data will be used only for completing the current transaction, her order.
Sheila's browser examines this P3P policy. Imagine that Sheila has told her browser that she wants to be warned whenever a site asks for her phone number. In this case, the browser will pop up a message saying that this Web site is asking for her phone number, and then explain the contents of the P3P statement. Sheila can then decide if this is acceptable to her. If it is acceptable, she can continue with her order; otherwise she can cancel the transaction.
Alternatively, Sheila could have told her browser that she wanted to be warned only if a site asks for her telephone number and intends to give it to third parties or use it for purposes other than completing the current transaction. In that case, she would have received no prompts from her browser at all, and she could proceed with completing her order.
Note that this scenario describes one hypothetical implementation of P3P. Other types of user interfaces are also possible.
P3P policies use an XML encoding of the P3P vocabulary to identify the legal entity making the representation of privacy practices in a policy, enumerate the types of data or data elements collected, and explain how the data will be used. In addition, policies identify the data recipients, and make a variety of other disclosures including information about dispute resolution, and the address of a site's human-readable privacy policy. P3P policies must cover all relevant data elements and practices (but note that legal issues regarding law enforcement demands for information are not addressed by this specification; it is possible that a site that otherwise abides by its policy of not redistributing data to others may be required to do so by force of law). P3P declarations are positive, meaning that sites state what they do, rather than what they do not do. The P3P vocabulary is designed to be descriptive of a site's practices rather than simply an indicator of compliance with a particular law or code of conduct. However, user agents may be developed that can test whether a site's practices are compliant with a law or code.
P3P policies represent the practices of the site. Intermediaries such as telecommunication providers, Internet service providers, proxies and others may be privy to the exchange of data between a site and a user, but their practices may not be governed by the site's policies.
P3P1.0 user agents can be built into web browsers, browser plug-ins, or proxy servers. They can also be implemented as Java applets or JavaScript; or built into electronic wallets, automatic form-fillers, or other user data management tools. P3P user agents look for P3P headers in HTTP responses and for P3P META tags embedded in HTML content. These special headers and tags indicate the location of a relevant P3P policy. User agents can fetch the policy from the indicated location, parse it, and display symbols, play sounds, or generate user prompts that reflect a site's P3P privacy practices. They can also compare P3P policies with privacy preferences set by the user and take appropriate actions. P3P can perform a sort of "gate keeper" function for data transfer mechanisms such as electronic wallets and automatic form-fillers. A P3P user agent integrated into one of these mechanisms would retrieve P3P policies, compare them with the user's preferences, and authorize the release of data only if a) the policy is consistent with the user's preferences and b) the requested data transfer is consistent with the policy. If one of these conditions is not met, the user might be informed of the discrepancy and given an opportunity to authorize the data release themselves.
Web sites can implement P3P1.0 on their servers by translating their human-readable privacy policies into P3P syntax and configuring their servers to advertise the location of the P3P policy. Automated tools can assist sites in performing this translation. Many HTTP1.1 servers can be configured to support P3P1.0 without requiring the installation of additional software. Servers may be configured to insert a P3P extension header into all HTTP responses that indicates the location of a site's P3P policy, using the HTTP Extension Framework. Alternatively, they can be configured to insert this information into HTML content as a META tag. Web sites have some flexibility in how they use P3P: they can opt for one P3P policy for their entire site or they can designate different policies for different parts of their sites. A P3P policy MUST cover all data generated or exchanged as part of a site's HTTP interactions with visitors. In addition, some sites may wish to write policies that cover all data an entity collects, regardless of how the data is collected.
The P3P Specification Working Group removed significant sections from earlier drafts of the P3P1.0 specification in order to facilitate rapid implementation and deployment of a P3P first step. The group envisions the release of future versions of the P3P specification after P3P1.0 is deployed. This specification would likely include improvements based on feedback from implementation and deployment experience as well as four major components that were part of the original P3P vision but not included in P3P1.0:
This document, along with its normative references, includes all the specification necessary for the implementation of interoperable P3P applications.
The [ABNF] notation used in this specification is specified in RFC2234 and summarized in an Appendix. However, note that such syntax is only a grammar representative of the XML syntax: all the syntactic flexibilities of XML are also implicitly included; e.g. whitespace rules, quoting using either single quote (') or double quote ("), character escaping, comments, and case sensitivity. In addition, note that attributes and elements may appear in any order.
The following key words are used throughout the document and should be read as interoperability requirements. This specification uses words as defined in RFC2119 [KEY] for defining the significance of each particular requirement. These words are:
Note: concerns have been raised about the semantic meaning of the Policy-CC header defined in this section. The working group is currently investigating and there may be some major changes to this section in the next few weeks.
Referencing a privacy policy is one of the first steps in the operation of the P3P protocol. Services use policy references to state what policy applies to a specific URI or set of URIs. User agents will use policy references to locate the privacy policy which applies to a page, so that they can process that policy for the benefit of their user.
Policy references are used extensively as a performance optimization. Privacy policies are typically several kilobytes of data, while a URI which references a privacy policy is typically less than 100 bytes. In addition to the bandwidth savings, policy references also reduce the need for computation: policies can be uniquely associated with URIs, so that a user agent need only parse and process a policy once rather than process it with every document to which the policy applies.
P3P policies may be associated with documents retrieved by HTTP in one of two ways. First, a P3P policy may be associated with any type of document through the addition of a Policy response-header. Secondly, P3P policies may be associated with HTML or XML documents through meta tags.
This document does not specify how P3P policies may be associated with documents retrieved by other means.
P3P makes use of the HTTP Extension Framework [HTTP-EXT]. The HTTP Extension Framework allows new HTTP headers to be defined and used.
All HTTP headers associated with a given extension in a request or response are to be prefixed by an arbitrary two-digit qualifier. The qualifier may be chosen by implementations on a per-message basis. This guarantees a unique namespace for the extension's headers. In addition, the extension must identify itself (with a URI) when it declares the namespace.
The HTTP Extension Framework requires a globally unique URI identifying the extension (the extension declaration). The P3P extension declaration is the following URI:
http://www.w3.org/2000/P3Pv1
P3P policies may be associated with any document retrieved by HTTP through the use of a new response header, the Policy header. The Policy header contains the URI where the P3P policy can be fetched. This URI MUST NOT be used for any other purpose beyond identifying and referencing P3P policies (so, it MUST NOT be customized for users or sessions and used to maintain user browsing state).
The P3P extension declaration and policy header SHOULD be inserted whenever a P3P-enabled server responds to a relevant request, including when it responds to HEAD and OPTIONS requests.
The header syntax is:
[1] policy-header = prefix `-Policy: ` URI
Here, URI is defined as per RFC 2396 [URI]. prefix is the two-digit namespace declaration selected for the P3P headers in this message, according to [HTTP-EXT]. It may be any two-digit number that does not conflict with other namespace declarations in the response.
In keeping with the rules for other HTTP headers, the Policy portion of this header may be written in any case.
Example 2.1
1. Client makes a GET request.
    GET /index.html HTTP/1.1
    Host: thecoolcatalog.com
    Accept: */*
    Accept-Language: de, en
    User-Agent: WonderBrowser/5.2 (RT-11)
2. Server returns content and the Policy header pointing to the policy of the page.
    HTTP/1.1 200 OK
    Opt: "http://www.w3.org/2000/P3Pv1"; ns=11
    11-Policy: http://thecoolcatalog.com/P3PPolicy1.xml
    Content-Type: text/html
    Content-Length: 7413
    Server: CC-Galaxy/1.3.18
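The lookup a user agent performs on such a response can be sketched as follows. This is a minimal illustration, not a normative algorithm: the function name `find_policy_uris` and the representation of headers as a list of (name, value) pairs are assumptions made for the example, and the regular expression handles only the simple `Opt` form shown above.

```python
import re

# Extension declaration URI given in this specification.
P3P_EXTENSION_URI = "http://www.w3.org/2000/P3Pv1"


def find_policy_uris(headers):
    """Map each P3P namespace prefix to the policy URI it declares.

    `headers` is a list of (name, value) pairs (a hypothetical input shape).
    """
    # Step 1: collect the two-digit prefixes that Opt headers bind to P3P.
    prefixes = []
    for name, value in headers:
        if name.lower() != "opt":
            continue
        match = re.search(r'"([^"]+)"\s*;\s*ns=(\d{2})\b', value)
        if match and match.group(1) == P3P_EXTENSION_URI:
            prefixes.append(match.group(2))

    # Step 2: read the matching "<prefix>-Policy" headers.
    # Header names may be written in any case, so compare in lower case.
    policies = {}
    for name, value in headers:
        for prefix in prefixes:
            if name.lower() == f"{prefix}-policy":
                policies[prefix] = value.strip()
    return policies
```

Keying the result by prefix keeps each policy tied to its own namespace, which matters once a response carries more than one P3P declaration.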
There are also two other P3P headers, Prefix and Exclude, that allow a document to specify the policies corresponding to other documents. This lets the user agent know in advance the policy pertaining to other documents, without having to issue a separate request for each of them to learn its individual policy.
[2] prefix-header = prefix `-Prefix: ` local-URI *(` ` local-URI)
[3] prefix-exclude = prefix `-Exclude: ` local-URI *(` ` local-URI)
Here, local-URI is a local URI as defined per RFC 2396 [URI].
When a Prefix (and optionally, an Exclude) header is present in conjunction with a Policy header, it means that the policy specified in the Policy URI applies to all the URIs at the requested host corresponding to the local-URI(s) specified by the Prefix, but not specified by an Exclude header.
If no Prefix header is declared, it MUST be implicitly assumed that the policy applies to the current resource (as in Example 2.1).
Example 2.2
Client request:
    GET /index.html HTTP/1.1
    Host: thecoolcatalog.com
    Accept: */*
    Accept-Language: de, en
    User-Agent: WonderBrowser/5.2 (RT-11)
Server response:
    HTTP/1.1 200 OK
    Opt: "http://www.w3.org/2000/P3Pv1"; ns=11
    Opt: "http://www.w3.org/2000/P3Pv1"; ns=12
    11-Policy: http://thecoolcatalog.com/P3PPolicy1.xml
    12-Policy: http://thecoolcatalog.com/P3PPolicy2.xml
    12-Prefix: /images/ /styles/
    12-Exclude: /images/banners/
    Content-Type: text/html
    Content-Length: 7413
    Server: CC-Galaxy/1.3.18
The server response in Example 2.2 indicates (as in the previous example) that http://thecoolcatalog.com/index.html is under policy http://thecoolcatalog.com/P3PPolicy1.xml. It also carries a second declaration, under namespace 12: all the URIs of the form http://thecoolcatalog.com/images/* (with the exception of those in http://thecoolcatalog.com/images/banners/*) and http://thecoolcatalog.com/styles/* are under policy http://thecoolcatalog.com/P3PPolicy2.xml.
Notice that Prefix and Exclude matching is done as simple string prefix matching. As a result, a missing "/" at the end of a directory prefix might lead to unexpected results. For example, the header "12-Exclude: /images/logos" (notice the missing "/" at the end) will not only exclude all resources in the "/images/logos/" subdirectory but also, for example, a file with the relative URI "/images/logoschool.jpg"!
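The effect of the missing "/" can be reproduced with a few lines of code. The sketch below assumes the matching is exactly Python's `str.startswith`; the function name `policy_applies` is hypothetical and chosen only for this illustration.

```python
def policy_applies(path, prefixes, excludes=()):
    """Return True if `path` is covered by a Prefix entry and by no Exclude entry.

    Matching is plain string-prefix comparison, as described above.
    """
    included = any(path.startswith(prefix) for prefix in prefixes)
    excluded = any(path.startswith(exclude) for exclude in excludes)
    return included and not excluded


# Without the trailing "/", the Exclude entry also swallows logoschool.jpg.
loose = policy_applies("/images/logoschool.jpg", ["/images/"], ["/images/logos"])

# With the trailing "/", only the /images/logos/ subdirectory is excluded.
strict = policy_applies("/images/logoschool.jpg", ["/images/"], ["/images/logos/"])
```

Here `loose` is False (the file is unintentionally excluded) while `strict` is True.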
Servers may serve HTML or XML content with embedded meta tags that indicate the location of the relevant P3P policy. This use of P3P does not require a P3P-aware server (content may be modified to include the embedded meta tags without requiring any changes to the way the server operates). Note that the working group is currently investigating better syntaxes than the one presented in the current version of this specification.
The meta tags directly encode the information that could be expressed using the P3P Policy, Prefix and Exclude headers. Each sequence of extension headers identified by the extension declaration: "http://www.w3.org/2000/P3Pv1" can be directly expressed in HTML with meta tags using a block of the form:
[4] p3p-html-block =
    `<meta name="P3Pv1-Policy" content="begin">`
    `<meta name="P3Pv1-Policy" content="` URI `">`
    [`<meta name="P3Pv1-Prefix" content="` local-URI *(` ` local-URI) `">`]
    [`<meta name="P3Pv1-Exclude" content="` local-URI *(` ` local-URI) `">`]
    `<meta name="P3Pv1-Policy" content="end">`
For example, the policies expressed in Example 2.2 using HTTP headers could be expressed equally well by including in the web page http://thecoolcatalog.com/index.html the following piece of HTML:
    <meta name="P3Pv1-Policy" content="begin">
    <meta name="P3Pv1-Policy" content="http://thecoolcatalog.com/P3PPolicy1.xml">
    <meta name="P3Pv1-Policy" content="end">
    <meta name="P3Pv1-Policy" content="begin">
    <meta name="P3Pv1-Policy" content="http://thecoolcatalog.com/P3PPolicy2.xml">
    <meta name="P3Pv1-Prefix" content="/images/ /styles/">
    <meta name="P3Pv1-Exclude" content="/images/banners/">
    <meta name="P3Pv1-Policy" content="end">
Note that a policy reference expressed by meta tags is fully equivalent to a policy reference expressed using HTTP headers. If user agents handle HTML, they MUST handle both formats (policy references in HTTP headers or in meta tags) interchangeably; neither method overrides declarations made in the other format. See also the requirements for non-ambiguity.
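A user agent that handles HTML therefore needs to recover the same information from meta tags that it would read from headers. The following sketch uses Python's standard `html.parser` module to collect each begin/end block; the class name `P3PMetaParser` and the dictionary layout are assumptions for illustration, and comparing the `name` attribute case-insensitively is a lenient choice that this specification does not mandate.

```python
from html.parser import HTMLParser


class P3PMetaParser(HTMLParser):
    """Collect P3Pv1 policy blocks found in meta tags."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # completed begin/end blocks
        self._current = None  # block being filled, if any

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        content = attributes.get("content") or ""
        if name == "p3pv1-policy" and content == "begin":
            self._current = {"policy": None, "prefix": [], "exclude": []}
        elif self._current is None:
            return  # ignore tags outside a begin/end block
        elif name == "p3pv1-policy" and content == "end":
            self.blocks.append(self._current)
            self._current = None
        elif name == "p3pv1-policy":
            self._current["policy"] = content
        elif name == "p3pv1-prefix":
            self._current["prefix"] = content.split()
        elif name == "p3pv1-exclude":
            self._current["exclude"] = content.split()
```

After `parser.feed(html_text)`, each entry of `parser.blocks` carries the same three pieces of information as one Policy/Prefix/Exclude header group, so both sources can feed the same downstream logic.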
The very first rule of policy references is that of non-ambiguity: For each resource at a Web site there MUST be at most one policy active at any given time. This applies especially to forward declarations described in section 2.2.1 above, where multiple policy declarations MUST NOT reference two or more different policy URIs for the same resource. Also note that this applies to the combination of both header declarations and META tag declarations in an HTML document, since both formats MUST be handled interchangeably by user agents (however, conflicts between header and META tag declarations might be detectable only after the HTML/XML content has already been downloaded and any embedded META tag declarations have been found).
While the need to check for such ambiguities within the policy declarations for a single resource (both headers and, if applicable, META tags) is obvious, user agents SHOULD also track policy declarations across an entire website, in order to detect ambiguities in forward declarations. The actual policy declaration when requesting the resource directly MUST NOT differ from any previously given forward declaration for its corresponding URI prefix.
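One way a user agent might track declarations and detect such conflicts is sketched below. The data layout (a list of `(policy_uri, prefixes, excludes)` tuples) and the function names are assumptions for illustration only; the sketch also assumes every declaration lists explicit prefixes, whereas a declaration without a Prefix header applies only to the current resource.

```python
def policies_for(path, declarations):
    """Return the set of distinct policy URIs whose declarations cover `path`."""
    matching = set()
    for policy_uri, prefixes, excludes in declarations:
        covered = any(path.startswith(prefix) for prefix in prefixes)
        excluded = any(path.startswith(exclude) for exclude in excludes)
        if covered and not excluded:
            matching.add(policy_uri)
    return matching


def is_ambiguous(path, declarations):
    """A resource is ambiguous when more than one policy URI claims it."""
    return len(policies_for(path, declarations)) > 1
```

Repeating the same policy URI for a resource is harmless under this check; only two different URIs covering the same path count as a conflict.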
See also the sections on Immutability of Policies and Policy and Reference Cacheability for a discussion of non-ambiguity over time (immutability).
Multiple language versions (translations) of the same policy can be offered by the server using the HTTP "Content-Language" header to properly indicate that a particular language has been used for the policy. This is useful so that human-readable fields such as entity and consequence can be presented in multiple languages. Whenever Content-Language is used to distinguish policies at the same URI that are offered in multiple languages, the policies MUST have the same meaning in each language.
Services may directly refer to policies, or may make indirect references to policies. A direct reference to a policy is a policy URI which, when fetched, returns the XML document which makes up that policy. An indirect reference to a policy is a policy URI which, when fetched, returns a new policy URI. The new policy URI returned by an indirect reference MAY, itself, be an indirect reference, though this is discouraged for performance reasons.
Direct and indirect references are recognized by the HTTP return code given by the server when fetched. When the URI of a direct policy reference is fetched, the server SHOULD return a 200-class HTTP return code or a 301 (Moved Permanently) HTTP return code (or an error code, if appropriate). It MUST NOT give a 302 (Found), 303 (See Other) or 307 (Temporary Redirect) return code as a response. When the URI of an indirect policy reference is fetched, a 302, 303 or 307 return code MUST be given, unless an error (400- or 500-class) return code is appropriate. When a 302, 303 or 307 return code is returned, it MUST include a Location response header giving the actual policy URI.
Services MAY choose to use direct or indirect policy references as appropriate (so long as the requirements under Immutability of Policies are respected). A direct policy reference will result in the best performance for user agents that are processing those policies. Due to the immutability rule, if a user agent receives a direct policy reference to a URI that it has already fetched, then no additional network activity is required in order to process that policy. This results in quicker response time for the user agent.
Indirect policy references require at least one additional network round trip to locate the actual policy. This results in reduced performance for the user agent. However, it allows for more flexible policy deployment for certain organizations. An example will assist in illustrating:
Imagine that a fictional company, TheCoolCatalog, is establishing a worldwide Web presence. Its default Web site, www.thecoolcatalog.com, provides links to a number of country-specific sites. For purposes of this example, assume that TheCoolCatalog starts by deploying four localized sites: usa.coolcatalog.com (USA), www.thecoolcatalog.co.uk (United Kingdom), www.thecoolcatalog.com.ru (Russia), and www.thecoolcatalog.com.jp (Japan). Let us assume that each of these sites has its content developed locally. This allows the sites to be better tailored to their local audiences.
However, TheCoolCatalog has decided that it will have a single privacy policy which will apply to all of its sites around the world. It could do this by deploying that privacy policy on its master Web site (www.thecoolcatalog.com), and having pages on its localized servers reference that policy. When TheCoolCatalog wishes to update its privacy policy, then by the Immutability of Policies rule, it must place that policy at a new URI. Then the policy references on all of its sites must be changed. This will probably involve work by several Webmasters and Webmistresses in various parts of the globe. The problem becomes far worse when TheCoolCatalog expands to more of the world, and has perhaps 20 or 50 localized Web sites.
Indirect policy references are intended as a solution to this management problem. Each of the local TheCoolCatalog sites can contain a policy URI pointing to the main TheCoolCatalog server. Fetching this URI returns a reference to the currently-applicable privacy policy. For example, imagine that TheCoolCatalog wants to gather customers' e-mail addresses to send them a note listing weekly specials. Each of the local servers could use an indirect policy URI of http://www.thecoolcatalog.com/privacy/P3P/policy-weeklyspecial. Resolving this URI would then return a link to the actual privacy policy; perhaps this might be http://www.thecoolcatalog.com/privacy/P3P/policy-weeklyspecial-3.xml. Now, when the corporation wishes to update the privacy policy which applies to the weekly special registration form, it need only update the reference in a single location, regardless of how many servers reference that policy.
In general, services should use direct policy references whenever it is feasible. Indirect policy references are expected to be used only by organizations with large and diverse Web presences.
Note that services SHOULD make indirect policy references only across URIs which are under the same organizational control, to help ensure the accuracy of the policy statement. However, there is no technical means to enforce this requirement. Indirect policy references MAY be to URIs on other hosts or even in other domains, depending on the structure of an organization's Web presence.
When a user agent receives a policy reference, there is no way for it to tell if it is a direct or indirect policy reference. To process the policy properly, the user agent MUST fetch the URI specified in the policy reference. If that reference returns a 302 (Found), 303 (See Other), or 307 (Temporary Redirect) return code, then the user agent MUST fetch the URI given in the Location header to locate the actual policy in order to process the policy. Note that once a policy is fetched by direct reference, it need not be fetched again (as long as the user agent records the relevant information). However, an indirect reference requires rechecking to make sure it has not changed (unless an Expires or Cache-Control HTTP header indicates that it has not changed).
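The resolution procedure can be sketched as a short loop. This is an illustration under stated assumptions, not a prescribed implementation: `fetch` is a caller-supplied function (hypothetical) returning a `(status, location, body)` tuple, Location values are assumed to be absolute URIs, and the `max_hops` limit is a defensive choice that does not come from this specification.

```python
INDIRECT_CODES = (302, 303, 307)


def classify(status):
    """Classify a policy fetch by its HTTP return code."""
    if status in INDIRECT_CODES:
        return "indirect"
    if 200 <= status < 300 or status == 301:
        return "direct"
    return "error"


def resolve_policy(uri, fetch, max_hops=5):
    """Follow redirects until a policy document is returned.

    Returns (final_uri, body). Raises ValueError on errors or loops.
    """
    for _ in range(max_hops):
        status, location, body = fetch(uri)
        if classify(status) == "error":
            raise ValueError(f"cannot fetch policy at {uri}: HTTP {status}")
        if 200 <= status < 300:
            return uri, body
        # 301 (direct, moved) and 302/303/307 (indirect) both carry a Location.
        if not location:
            raise ValueError(f"redirect from {uri} lacks a Location header")
        uri = location
    raise ValueError("too many redirects while resolving policy reference")
```

Passing `fetch` as a parameter keeps the sketch independent of any particular HTTP library and makes the loop easy to exercise with canned responses.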
An essential requirement on policies is the so-called immutability of policies: with one exception, policies that are directly referenced at a certain URI cannot be changed. This way, the URI of a policy acts like a unique identifier for the policy, and any new policy must therefore use a new, different URI. The only exception to this general principle is when multiple language versions (translations) of the same policy are offered by the server using the HTTP "Content-Language" header.
P3P clients MAY check for immutability of policies, by comparing a cached version of a policy (and its Content-Language if present) with the corresponding freshly retrieved policy (and Content-Language if present). If a user agent discovers that the two policies are different but retain the same URI, then it MUST treat the resource covered by the changed policy as if it has no P3P policy, UNLESS they have two different values of Content-Language.
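A client-side check along these lines might look as follows. The sketch assumes that "different" means a byte-for-byte difference in the policy text and that the language exception applies only when both copies carry a Content-Language value; both points are interpretations made for the example, and the function name is hypothetical.

```python
def is_consistent(cached, fresh):
    """Compare a cached policy with a freshly retrieved copy of the same URI.

    Each argument is a (policy_text, content_language) pair, where
    content_language is None when the header was absent.
    Returns False when the resource should be treated as having no P3P policy.
    """
    cached_text, cached_language = cached
    fresh_text, fresh_language = fresh
    if cached_text == fresh_text:
        return True
    # Differing text is tolerated only for differing language versions.
    return (
        cached_language is not None
        and fresh_language is not None
        and cached_language != fresh_language
    )
```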
Note that immutability of policies holds only for policies that are directly referenced: as far as indirect references are concerned, the URI returned when an indirect policy reference is fetched MAY change over time (this is the purpose behind indirect policy references). Indirect policy references MUST NOT be changed into direct policy references; if this is desired, a new policy-URI MUST be used.
In a distributed system like the World Wide Web, with high network latencies and never enough bandwidth, caching is very important to give acceptable performance to users. In this light, it is important to consider the interaction of P3P with caches in the network.
This section discusses the P3P objects which can be cached, and how to determine lifetimes for those objects. Shared and private caches are not discriminated in this discussion; that is covered by the Cache-Control directives of HTTP 1.1. For a general discussion of caching in HTTP, see [HTTP1.1].
HTTP concerns itself with the caching of objects -- documents, graphic or audio objects, and so on. P3P needs to concern itself with the caching of policy documents, which are simply objects in the HTTP context. However, P3P also needs to explicitly discuss the caching of something else: policy-to-object references.
Policy documents are XML documents, and are retrieved over HTTP in the same manner as other objects. As a result, the metadata provided by HTTP is adequate for describing the cache lifetime of policy documents. Also, observe that due to the rule on immutability of policies, policy documents are eminently cacheable: they can never change. Thus servers MAY wish to use the Cache-Control: max-age HTTP header to specify a long lifetime (30 days, for example) for policy documents.
P3P user agents can productively make use of a second type of lifetime information: the lifetime of a policy reference. This lifetime, specified by a new response header, indicates to the user agent that a given policy applies to a document for a specified period of time. Having this information can simplify the processing at the user agent, since it can cache the results of processing that policy for the specified period of time.
For an example of how this benefits user agents, imagine a third-party trust service that serves multiple users. Assume that the trust service fetches P3P policies on behalf of the users, compares them against the user's preferences, and then provides the user with some information. If policy references do not have any lifetime information associated with them, the trust service must fetch every document the user requests to check for Policy headers and META tags. In the worst case, this can double the load on the server and significantly slow down the user's browsing experience. However, if a policy reference includes a cache lifetime, then the trust service can cache those references, much like proxy caches cache HTTP documents. This sort of caching is expected to provide dramatic savings in latency and bandwidth consumption.
Having established the desirability of caching both policy documents and policy references, it is now time to provide the means for services to indicate their lifetimes.
As mentioned above, HTTP provides several completely adequate mechanisms for specifying the lifetime of policy documents. Specifically, the Cache-Control and Expires HTTP headers MAY be used for this task.
A new response header, Policy-CC, is introduced to specify the cache lifetime of direct policy references. The name of this header is a short form of "Policy Cache Control", and borrows concepts from the Cache-Control and Expires HTTP headers. This is a response header which MAY be included with a Policy header, and it should be prefixed by the same namespace designator as the Policy header.
The header syntax is:
[5] | policy-cc-header |
= |
prefix `-Policy-CC: ` directive |
[6] | directive |
= |
expires-directive | max-age-directive | no-cache-directive | s-max-age-directive |
[7] | expires-directive |
= |
`expires="` HTTP-date `"` |
[8] | max-age-directive |
= |
`max-age=` delta-seconds |
[9] | no-cache-directive |
= |
`no-cache` |
[10] | s-max-age-directive |
= |
`s-max-age=` delta-seconds |
Here, prefix is the two-digit namespace declaration selected for the P3P headers in this message, according to [HTTP-EXT]. HTTP-date and delta-seconds are defined in [HTTP1.1]. |
In keeping with the rules for other HTTP headers, the Policy-CC portion of this header may be written in any case.
The meanings of the directives are given as follows:
If both a max-age directive and an s-max-age directive are specified, then a single-user cache SHOULD use the lifetime given by the max-age directive, and ignore the s-max-age directive. A shared cache MUST ignore the max-age directive and SHOULD use the value given in the s-max-age directive.
If only a max-age directive is specified, then shared or single-user caches SHOULD use the lifetime it specifies.
If only an s-max-age directive is specified, then shared caches SHOULD use the specified lifetime. Single-user caches MUST treat this as if a no-cache directive were specified.
Specifying a max-age or s-max-age directive in addition to an expires or no-cache directive is an error, and user agents MUST treat this combination as if only a no-cache directive were specified.
Finally, note that successfully revalidating an object according to the HTTP rules (cf. [HTTP1.1]) will reset the lifetime counters for the max-age and s-max-age fields of the Policy-CC header.
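The combination rules above can be sketched in code. The following is a minimal illustration, not a normative algorithm. It assumes the directives have already been parsed into a mapping (how several directives are combined on the wire is not shown here), that no-cache means a lifetime of zero, and that expires means "valid until that date"; the function name and the mapping layout are hypothetical. The 86400-second fallback is the default for direct references stated in this section.

```python
from datetime import datetime, timezone
from typing import Mapping, Union

DEFAULT_LIFETIME = 86400  # 24 hours: default when no Policy-CC header is present

DirectiveValue = Union[int, datetime, None]


def direct_reference_lifetime(
    directives: Mapping[str, DirectiveValue],
    shared_cache: bool,
    now: datetime,
) -> int:
    """Return the lifetime in seconds of a direct policy reference (0 = do not cache)."""
    if not directives:
        return DEFAULT_LIFETIME
    has_age = "max-age" in directives or "s-max-age" in directives
    has_other = "expires" in directives or "no-cache" in directives
    if has_age and has_other:
        # Erroneous combination: treat as if only no-cache were specified.
        return 0
    if "no-cache" in directives:
        return 0  # assumption: no-cache means the reference is not reused
    if "expires" in directives:
        # Assumption: the value is a timezone-aware datetime parsed from HTTP-date.
        remaining = (directives["expires"] - now).total_seconds()
        return max(0, int(remaining))
    if shared_cache:
        # A shared cache prefers s-max-age and uses max-age only when it is alone.
        if "s-max-age" in directives:
            return directives["s-max-age"]
        return directives.get("max-age", 0)
    if "max-age" in directives:
        return directives["max-age"]
    # Only s-max-age (or nothing recognized): a single-user cache must not cache.
    return 0
```

For instance, with both `max-age=600` and `s-max-age=60` present, this sketch yields 600 seconds for a single-user cache and 60 seconds for a shared cache.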
Services MAY specify how long an indirect reference refers to a specific policy, giving a cache lifetime for an indirect policy reference. With this information, the user agent need not resolve the indirect policy reference again for the period of the cache lifetime.
Services MAY specify the lifetime for the indirect policy reference by adding Expires or Cache-Control HTTP headers on the 302 response, as specified in [HTTP1.1].
Note that services MUST NOT use the Policy-CC header with indirect references (this header is meant to be used with direct references only).
The default cache lifetime for policy documents is determined by the HTTP specification.
Cache lifetimes of policy references SHOULD NOT be lower than the cache lifetime of the document that contains the reference.
The default lifetime for direct policy references is 24 hours. In other words, if no Policy-CC header is specified with a Policy header, then user agents MUST act as if an xx-Policy-CC: max-age=86400 header were included in the response (where xx is the namespace prefix).
The default lifetime for indirect policy references is 0 seconds. While this is not ideal from the perspective of P3P, it is required due to the definition of the 302 response in HTTP 1.1. Servers MUST specify Expires or Cache-Control headers on indirect policy reference responses to allow the indirect reference to be cached. User agents MUST NOT cache an indirect policy reference unless allowed to by the appropriate HTTP headers.
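As an illustration of the indirect-reference default, the sketch below derives a lifetime from the HTTP headers of a 302 response and falls back to zero when neither Cache-Control nor Expires permits caching. It is a simplified reading of the HTTP caching rules, not a complete implementation: only `max-age`, `no-cache`, `no-store`, and `Expires` are considered, and the function name is hypothetical.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping


def indirect_reference_lifetime(headers: Mapping[str, str], now: datetime) -> int:
    """Seconds an indirect (302) policy reference may be cached; 0 = do not cache."""
    lowered = {name.lower(): value for name, value in headers.items()}
    cache_control = lowered.get("cache-control", "").lower()
    directives = [d.strip() for d in cache_control.split(",") if d.strip()]
    if "no-cache" in directives or "no-store" in directives:
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age\s*=\s*(\d+)", directive)
        if match:
            return int(match.group(1))
    if "expires" in lowered:
        try:
            expires = parsedate_to_datetime(lowered["expires"])
            return max(0, int((expires - now).total_seconds()))
        except (TypeError, ValueError):
            return 0  # unparsable or naive date: be conservative
    # No explicit permission from HTTP headers: the default lifetime is 0 seconds.
    return 0
```

The `now` argument is expected to be a timezone-aware datetime so that it can be compared with the parsed Expires value.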
The time of expiration of a policy is calculated with respect to the P3P server clock. Due to the asynchronous nature of HTTP messages, there might in some cases be a significant discrepancy between client and server clocks. For example, suppose that a certain page has a policy that expires at 3:00pm; a request for that page sent by a client at 2:59pm could arrive at the server at 3:01pm, and would therefore be handled under a possibly different policy. P3P user agents should therefore be careful about relying on cached policies when the expiration time of the policy is approaching. Possible approaches in such cases are to discard the cached information, to inform the user that the cached policy is going to expire very soon, or (for example, for HTTP GET requests) to inform the user about the expiration time in some unintrusive manner and then make sure that any change in policy is shown once a different P3P policy header comes back from the server.
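One way a user agent might implement the first approach (discarding cached information near expiry) is to apply a safety margin before the nominal expiration time. The sketch below is only an illustration; the five-minute margin and the function name are arbitrary assumptions, not values taken from this specification.

```python
from datetime import datetime, timedelta, timezone


def cached_policy_usable(
    expires_at: datetime,
    now: datetime,
    safety_margin: timedelta = timedelta(minutes=5),  # assumed value
) -> bool:
    """Return True only if the cached policy stays valid beyond the safety margin."""
    return now + safety_margin < expires_at
```

With a policy expiring at 3:00pm, a request prepared at 2:59pm would not rely on the cached policy, while one prepared at 2:00pm would.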
Every P3P-enabled user agent and service SHOULD ensure that all the relevant communications that take place as part of fetching a P3P policy are part of a special "safe zone" in which minimal data collection takes place and any data that is collected is used only in non-identifiable ways.
To support this safe zone, P3P user agents SHOULD suppress the transmission of data unnecessary for the purpose of finding a site's policy until the policy has been fetched. Thus user agents SHOULD NOT send the HTTP Referer header, cookies, or user agent information while requesting a P3P policy.
In addition, P3P user agents MAY issue a HEAD request to a site in order to learn the location of the relevant policy before making other requests. This is a useful way to obtain a site's policy without making a request that could result in the transmission of data.
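The user agent side of the safe zone can be sketched as a filter over the headers the agent would otherwise send, combined with a HEAD request. This is a minimal illustration using Python's standard `urllib.request.Request`; the helper names and the URL in the usage note are hypothetical.

```python
from typing import Dict, Mapping
from urllib.request import Request

# Headers that should not be sent while a policy is being located or fetched.
SUPPRESSED_HEADERS = {"referer", "cookie", "user-agent"}


def safe_zone_headers(headers: Mapping[str, str]) -> Dict[str, str]:
    """Drop the headers that the safe zone asks user agents to suppress."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SUPPRESSED_HEADERS
    }


def build_policy_probe(url: str, headers: Mapping[str, str]) -> Request:
    """Build (but do not send) a HEAD request carrying only safe-zone headers."""
    return Request(url, headers=safe_zone_headers(headers), method="HEAD")
```

For example, `build_policy_probe("http://www.example.com/catalog/", {"Cookie": "id=1", "Accept": "*/*"})` keeps only the Accept header. Note that urllib's default opener adds its own User-agent header when a request is actually sent, so a real agent built on it would need to configure the opener as well.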
Servers SHOULD NOT require the receipt of an HTTP Referer header, cookies, user agent information, or other information unnecessary for responding to the request in order to serve a policy file. In addition, servers SHOULD NOT use in an identifiable way any information collected while serving a policy file or responding to a HEAD request.
Servers MAY return a Policy header in the response headers when a P3P policy is requested. However, it is important to note that the Policy header MUST be ignored, and that the "safe zone" requirements described in this section apply instead. Returning a Policy header in such cases is permitted in consideration of the fact that administrators may find it easier to apply a P3P policy to all documents on a server, and that requiring policies to be served without a Policy header may result in extra work for site administrators.
Note that the safe zone requirements do not say that sites cannot keep identifiable information -- only that they SHOULD NOT use in an identifiable way any information collected while serving a policy file. Tracking down denial of service attacks would be a legitimate reason to use this information and ignore the SHOULD.
There are three important further requirements on the server side:
P3P policies and references to P3P policies SHOULD NOT, in themselves, contain any sensitive information. This means that there are no additional security requirements for transporting a reference to a P3P policy beyond the requirements of the document it is associated with; so, if an HTML document would normally be served over a non-encrypted session, then the P3P protocol would not require that the document be served over an encrypted session when a reference to a P3P policy is included with that document.
Section 3.1 begins with an example of an English-language privacy policy and a corresponding P3P policy. P3P policies include general assertions that apply to the entire policy as well as specific assertions -- called statements -- that apply only to the handling of particular types of data referred to by data references. Section 3.2 describes the policy element and policy-level assertions. Section 3.3 describes statements and data references.
In the sections that follow a number of XML elements are introduced. Each element is given in <> brackets, followed by a list of valid attributes. All listed attributes are optional, except when tagged as mandatory. Note that many XML elements are shown in the BNF with separate beginning and ending tags, to allow optional elements inside them. If no elements are included, then, following standard XML rules, a self-closing element may be used instead. Thus both of the following are legal, and equivalent, syntax for the DISCLOSURE element:
Syntax example 1:
<DISCLOSURE discuri="http://www.TheCoolCatalog.com/privacy.html" access="none"> </DISCLOSURE>
Syntax example 2:
<DISCLOSURE discuri="http://www.TheCoolCatalog.com/privacy.html" access="none"/>
The following is an example of an English-language privacy policy to be encoded as a P3P policy.
TheCoolCatalog, of 123 Main Street, Bethesda, MD 20814, USA, makes the following statement for the Web page at http://www.TheCoolCatalog.com/catalog/. We have a privacy seal from PrivacySeal.org. Our privacy policy is posted at http://www.TheCoolCatalog.com/PrivacyPractice.html. We do not provide access capabilities to information we may have about you.
We use cookies and collect your gender, information about your clothing preferences, and (optionally) your home address to customize our entry catalog pages and for our own research and product development. We retain this information indefinitely.
We also maintain server logs that include information about visits to the http://www.TheCoolCatalog.com/catalog/ page, and the types of browsers our visitors use. We use this information in order to maintain and improve our web site. We retain this information indefinitely.
The following is a more formal description, using the P3P element and attribute names:
Entity: TheCoolCatalog, 123 Main Street, Bethesda, MD 20814, USA
Disputes:
  resolution type: independent
  service: http://www.privacyseal.org
  description: PrivacySeal.org
Disclosure:
  Disclosure URI: http://www.TheCoolCatalog.com/PrivacyPractice.html
  Access to Identifiable Information: none
We may collect:
  dynamic.cookies (category = state)
  user.gender
  dynamic.miscdata (category = preference)
  user.home. (optional)
  For purpose: Customization of the site to individuals, research and development
  Retention: Indefinitely
  Recipients: Only ourselves and our agents
  Consequence: A site with clothes you would appreciate
We collect:
  dynamic.clickstream.server
  dynamic.http.useragent
  For purpose: Web site and system administration, research and development
  Retention: Indefinitely
  Recipients: Only ourselves and our agents
The following piece of [XML] captures the information as expressed above. P3P policies are statements that are properly expressed as well-formed XML. The policy syntax will be explained in more detail in the sections that follow.
Example 3.1
<POLICY xmlns="http://www.w3.org/2000/P3Pv1">
  <ENTITY>
    <DATA name="business.name">TheCoolCatalog</DATA>
    <DATA name="business.contact-info.postal.street.line1">123 Main Street</DATA>
    <DATA name="business.contact-info.postal.city">Bethesda</DATA>
    <DATA name="business.contact-info.postal.stateprov">MD</DATA>
    <DATA name="business.contact-info.postal.postalcode">20814</DATA>
    <DATA name="business.contact-info.postal.countrycode">US</DATA>
  </ENTITY>
  <DISPUTES-GROUP>
    <DISPUTES resolution-type="independent"
      service="http://www.PrivacySeal.org"
      description="PrivacySeal.org"
      image="http://www.PrivacySeal.org/Logo.gif"/>
  </DISPUTES-GROUP>
  <DISCLOSURE discuri="http://www.TheCoolCatalog.com/PrivacyPractice.html" access="none"/>
  <STATEMENT>
    <CONSEQUENCE>A site with clothes you would appreciate</CONSEQUENCE>
    <RECIPIENT><ours/></RECIPIENT>
    <PURPOSE><custom/><develop/></PURPOSE>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP>
      <DATA name="dynamic.cookies" category="state"/>
      <DATA name="dynamic.miscdata" category="preference"/>
      <DATA name="user.gender"/>
      <DATA name="user.home." optional="yes"/>
    </DATA-GROUP>
  </STATEMENT>
  <STATEMENT>
    <RECIPIENT><ours/></RECIPIENT>
    <PURPOSE><admin/><develop/></PURPOSE>
    <RETENTION><indefinitely/></RETENTION>
    <DATA-GROUP>
      <DATA name="dynamic.clickstream.server"/>
      <DATA name="dynamic.http.useragent"/>
    </DATA-GROUP>
  </STATEMENT>
</POLICY>
This section defines the syntax and semantics of P3P policies. All policies are encoded using [UTF-8]. P3P servers MUST encode their policies using this syntax. P3P user agents MUST be able to parse this syntax.
The POLICY element contains a complete P3P policy. Each P3P policy MUST contain exactly one POLICY element. The policy element MUST contain an ENTITY element that identifies the legal entity making the representation of the privacy practices contained in the policy. In addition, the policy element MUST contain a DISCLOSURE element, at least one STATEMENT element, and optionally a DISPUTES-GROUP element and one or more extensions.
[11] | policy |
= |
`<POLICY xmlns="http://www.w3.org/2000/P3Pv1">` entity disclosure [remedies] [disputes-group] 1*statement-block *extension `</POLICY>` |
[12] | quoted-URI |
= |
`"` URI `"` |
Here, URI is defined as per RFC 2396 [URI]. |
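The top-level requirements on the POLICY element can be checked mechanically. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; it checks only the presence and count of the direct children named above, not their content or order, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List

P3P_NS = "http://www.w3.org/2000/P3Pv1"


def check_policy(xml_text: str) -> List[str]:
    """Return a list of problems found in the top-level structure of a policy."""
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{P3P_NS}}}POLICY":
        return ["root element is not POLICY in the P3P namespace"]

    def count(name: str) -> int:
        # Count direct children of POLICY with the given local name.
        return len(root.findall(f"{{{P3P_NS}}}{name}"))

    problems = []
    if count("ENTITY") != 1:
        problems.append("expected exactly one ENTITY element")
    if count("DISCLOSURE") != 1:
        problems.append("expected exactly one DISCLOSURE element")
    if count("STATEMENT") < 1:
        problems.append("expected at least one STATEMENT element")
    if count("DISPUTES-GROUP") > 1:
        problems.append("expected at most one DISPUTES-GROUP element")
    return problems
```

An empty list means the policy passed these structural checks; it does not imply that the policy is valid in every other respect.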
The ENTITY element gives a precise description of the legal entity making the representation of the privacy practices.
The ENTITY element contains a description of the legal entity consisting of DATA elements referencing (all or part of) the fields of the business dataset.
[13] | entity |
= |
"<ENTITY>" entitydescription *extension "</ENTITY>" |
[14] | entitydescription |
= |
`<DATA name="business.name">` PCDATA `</DATA>` *(`<DATA name="business.` string `">` PCDATA `</DATA>`) |
Here, string is defined as a [UTF-8] string (with " and & escaped) among the values that are allowed by the business dataset. PCDATA is defined as in [XML]. |
The DISCLOSURE element contains a number of general privacy disclosures. A disclosure element MUST include a discuri attribute, which indicates the URI of a site's human-readable privacy policy, and an access attribute, which indicates whether the site provides access to various kinds of information.
Note that service providers may also wish to provide capabilities to access information collected through means other than the Web at the discuri. However, the scope of P3P statements is limited to data collected through HTTP or other Web transport protocols. Also, if access is provided through the Web, use of strong authentication and security mechanisms for such access is recommended; however, security issues are outside the scope of this document.
There are six valid values for the access attribute:
[15] | disclosure |
= |
"<DISCLOSURE" " discuri=" quoted-URI " access=" `"` access-disclosure `">` *extension </DISCLOSURE> |
[16] | access-disclosure |
= |
"nonident" | ; Identifiable Data is Not Used
"contact" | ; Identifiable Contact Information
"other_ident" | ; Other Identifiable Information |
A policy SHOULD contain a DISPUTES-GROUP element, which contains one or more DISPUTES elements. These elements describe dispute resolution procedures that may be followed for disputes about a service's privacy practices.
[19] | disputes-group |
= |
"<DISPUTES-GROUP>" 1*dispute *extension "</DISPUTES-GROUP>" |
[20] | dispute |
= |
"<DISPUTES" " resolution-type=" '"'("service"|"independent"|"court"|"law")'"' " service=" quoted-URI [" description=" quoted-string] [" verification=" quoted-string] [" image=" quoted-URI [" width=" `"` number `"`] [" height=" `"` number `"`] [" alt=" quoted-string]] ">" *remedies *extension "</DISPUTES>" |
[21] | quoted-string |
= |
`"` string `"` |
Here, string is defined as a [UTF-8] string (with " and & escaped) |
Note that there can be multiple assurance services, specified via multiple occurrences of DISPUTES within the DISPUTES-GROUP element. These fields are expected to be used in a number of ways, from representing that one's privacy practices are self assured, audited by a third party, or under the jurisdiction of a regulatory authority.
Each DISPUTES element SHOULD contain a REMEDIES element that specifies the possible remedies in case a policy breach occurs.
The REMEDIES element must contain one or more among the following:
[17] | remedies |
= |
"<REMEDIES>" 1*remedy *extension "</REMEDIES>" |
[18] | remedy |
= |
"<correct/>" | "<money/>" | "<law/>" |
Statements describe data practices that are applied to particular types of data.
The STATEMENT element is a container that groups together a PURPOSE element, a RECIPIENT element, a RETENTION element, a DATA-GROUP element, and optionally a CONSEQUENCE element and one or more extensions. All of the data referenced by the DATA-GROUP is handled according to the disclosures made in the other elements contained by the statement. Thus, sites may group elements that are handled the same way and create a statement for each group. Sites that would prefer to disclose separate purposes and other information for each kind of data they collect can do so by creating a separate statement for each data element.
[22] | statement-block |
= |
"<STATEMENT>" [consequence] purpose recipient retention 1*data-group *extension "</STATEMENT>" |
To simplify practice declaration, service providers may aggregate any of the disclosures (purposes, recipients, and identifiable use) within a statement over data elements. Service providers MUST make such aggregations as an additive operation. For instance, a site that distributes your age to ours (ourselves and our agents), but distributes your zip code to unrelated (unrelated third parties), MAY say it distributes your age and zip code to ours and unrelated. Such a statement appears to distribute more data than actually happens. It is up to the service provider to determine whether its disclosure deserves specificity or brevity.
Also, one must always disclose all options that apply. Consider a site with the sole purpose of collecting information for the purposes of contact (Contacting Visitors for Marketing of Services or Products). Even though this is considered to be for the current (Completion and Support of Current Activity) purpose, the site must state both contact and current purposes. Consider a site which distributes information to ours in order to redistribute it to public: the site must state both ours and public recipients.
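The additive aggregation described above amounts to a set union over the per-element disclosures. The sketch below illustrates this for recipients; the function name and the informal element labels ("age", "zip code") are hypothetical and are not Base Data Schema names.

```python
from typing import Iterable, Mapping, Set


def aggregate_recipients(per_element: Mapping[str, Iterable[str]]) -> Set[str]:
    """Combine per-element recipients into one statement-level set (additive only)."""
    combined: Set[str] = set()
    for recipients in per_element.values():
        # Union never drops a recipient, so every applicable value stays disclosed.
        combined |= set(recipients)
    return combined
```

Aggregating `{"age": {"ours"}, "zip code": {"unrelated"}}` yields the set containing both `ours` and `unrelated`, which matches the example in the text: the combined statement discloses at least as much as each element-level practice.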
STATEMENT elements may optionally contain a CONSEQUENCE element that can be shown to a human user to provide further explanation about a site's practices.
[23] | consequence |
= |
"<CONSEQUENCE>" PCDATA "</CONSEQUENCE>" |
Each STATEMENT element MUST contain a PURPOSE element that contains one or more purposes of data collection or uses of data. Sites MUST classify their data practices into one or more of the purposes specified below.
The PURPOSE element must contain one or more among the following:
Each type of purpose can have the following optional attribute:
[24] | yesno |
= |
"yes" | "no" |
[25] | purpose |
= |
"<PURPOSE>" 1*purposevalue *extension "</PURPOSE>" |
[26] | purposevalue |
= |
"<current" [change] "/>" | ; Completion and Support of Current Activity
"<admin" [change] "/>" | ; Web Site and System Administration
"<develop" [change] "/>" | ; Research and Development
"<contact" [change] "/>" | ; Contacting Visitors for Marketing of Services or Products
"<customization" [change] "/>" | ; Affirmative Customization
"<targeting" [change] "/>" | ; One-time Targeting
"<profiling" [change] "/>" | ; Individual Profiling
"<other-purpose" [change] " >" PCDATA "</other-purpose>" ; Other Uses |
[27] | change |
= |
" change_preferences=" `"` yesno `"` |
Service providers MUST use the above elements to explain the purpose of data collection. Service providers MUST disclose all that apply. If a service provider does not disclose that a data element will be used for a given purpose, that is a representation that data will not be used for that purpose. Service providers that disclose that they use data for "other" purposes MUST provide human readable explanations of those purposes.
Note that the working group discussed at length the possibility of allowing sites to distinguish between purposes they may engage in and purposes they will engage in. The consensus of the working group was that such a distinction is not necessary. However, some members disagreed with this conclusion, stating:
Yes, no and may all need to be response options in the vocabulary. If no and may are the only options, then the meaning of may is corrupted to equal yes. May should be an option that reflects its true meaning -- yes or no. If may by default means yes, because yes is not provided as a response option, the consumer will be misled. May should be used to imply that there are a set of rules underlying the term that consumers can refer to understand a privacy policy. If may means yes, the consumer is less likely to investigate via a click-through to the Web site's privacy policy. Potentially, this seemingly simple solution -- no and may -- will be a significant barrier to commerce as consumers are confused by the meaning of the truncated choices of only no and may. Those who argue that providing all three choices -- yes, may, no -- is an attempt by Web sites to mislead consumers are missing the point. In the arena of privacy protection, accuracy in stating a privacy policy is critical to building trust and confidence in the consumer about how information is used. In the interest of software simplicity, limiting consumer preference choices to no and may will do a disservice to the consumer -- and to the Web sites that are trying to communicate accurately with consumers about their policies.
Each STATEMENT element MUST contain a RECIPIENT element that contains one or more recipients of the collected data. Sites MUST classify their recipients into one or more of the six recipients specified.
The RECIPIENT element must contain one or more among the following:
[28] | recipient |
= |
"<RECIPIENT>" 1*recipientvalue *extension "</RECIPIENT>" |
[29] | recipientvalue |
= |
"<ours/>" | ; only ourselves and our agents
"<same/>" | ; legal entities following our practices
"<other-recipient/>" | ; legal entities following different practices
"<delivery/>" | ; delivery services following different practices
"<public/>" | ; public fora
"<unrelated/>" ; unrelated third parties |
Service providers MUST disclose all the recipients that apply. Note that in some cases the above set of recipients may not completely describe all the recipients of data. For example, the issue of transaction facilitators, such as shipping or payment processors, who are necessary for the completion and support of the activity but may follow different practices was problematic. Currently, only delivery services can be explicitly represented in a policy. Other such transaction facilitators should be represented in whichever category most accurately reflects their practices with respect to the original service provider. The working group decided to include a special element for delivery services, but not for payment processors (such as banks or credit card companies) for the following reasons: Financial institutions will typically have separate agreements with their customers regarding the use of their financial data, while delivery recipients typically do not have an opportunity to review a delivery service's privacy policy.
Note that the <delivery/> element SHOULD NOT be used for delivery services that agree to use data only on behalf of the service provider for completion of the delivery.
Each STATEMENT element MUST contain a RETENTION element that indicates the kind of retention policy that applies to the data referenced in that statement.
The RETENTION element must contain one of the following:
[30] | retention |
= |
"<RETENTION>" retentionvalue *extension "</RETENTION>" |
[31] | retentionvalue |
= |
"<no-retention/>" | ; not retained
"<stated-purpose/>" | ; for the stated purpose
"<legal-requirement/>" | ; stated purpose by law
"<indefinitely/>" | ; indeterminate period of time
"<business-practices/>" ; by business practices |
Each STATEMENT element MUST contain at least one DATA-GROUP element that contains one or more DATA elements. DATA elements are used to describe the type of data that a site collects.
The following six attributes are only used when a new (not defined in the P3P [Base Data Schema]) data element or set is referenced.
[32] | data-group |
= |
"<DATA-GROUP>" 1*data-reference *extension "</DATA-GROUP>" |
[33] | data-reference |
= |
"<DATA name=" quoted-string
[" dataschema=" quoted-string]
[" optional=" yesno]
[" type=" quoted-string]
[" typeschema=" quoted-string]
[" template=" yesno]
[" category=" category]
[" short=" quoted-string]
[" long=" quoted-string]
[" size=" `"` number `"`] ; default is 0 (unlimited size)
">" [PCDATA] ; the eventual value of the data
"</DATA>" |
For example, to reference the user's home address city, all the elements of the data set user.business. and (optionally) all the elements of the data set user.home.phone., the service would send the following references inside a P3P policy:
<DATA-GROUP>
  <DATA name="user.home.city"/>
  <DATA name="user.home.phone." optional="yes"/>
  <DATA name="user.business."/>
</DATA-GROUP>
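The distinction between a reference to a single element and a reference to a whole data set (a name ending in a dot) can be sketched as a simple prefix test. This is an illustrative reading of the example above, not a normative matching rule; the function name and the element name `user.home.phone.number` used in the usage note are hypothetical.

```python
def reference_covers(reference: str, element: str) -> bool:
    """Return True if a DATA reference name covers the given element name."""
    if reference.endswith("."):
        # A trailing dot refers to all the elements of that data set.
        return element.startswith(reference)
    # Otherwise the reference names exactly one element.
    return element == reference
```

Under this reading, `user.home.phone.` covers `user.home.phone.number`, while `user.home.city` covers only itself and `user.business.` does not cover `user.home.city`.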
When the actual value of the data is known, it can be given as the content of the DATA element, as can any extensions. For example, as seen in the example policy:
<ENTITY>
  <DATA name="business.name">TheCoolCatalog</DATA>
  <DATA name="business.contact-info.postal.street.line1">123 Main Street</DATA>
  <DATA name="business.contact-info.postal.city">Bethesda</DATA>
  <DATA name="business.contact-info.postal.stateprov">MD</DATA>
  <DATA name="business.contact-info.postal.postalcode">20814</DATA>
  <DATA name="business.contact-info.postal.countrycode">US</DATA>
</ENTITY>
Categories are attributes of data elements that provide hints to users and user agents as to the intended uses of the data. Categories are vital to making P3P user agents easier to implement and use; they allow users to express more generalized preferences and rules over the exchange of their data. Categories are often included when defining a new element or when referring to data that the user is prompted to type in (as opposed to data stored in the user data repository).
In the current version of P3P, the following tokens are used to denote data categories:
[34] | category |
= |
"physical" | ; Physical Contact Information
"online" | ; Online Contact Information
"uniqueid" | ; Unique Identifiers
"purchase" | ; Purchase Information
"financial" | ; Financial Information
"computer" | ; Computer Information
"navigation" | ; Navigation and Click-stream Data
"interactive" | ; Interactive Data
"demographic" | ; Demographic and Socioeconomic Data
"content" | ; Content
"state" | ; State Management Mechanisms
"political" | ; Political Information
"health" | ; Health Information
"preference" | ; Preference Data
"other" ; Other |
The Computer, Navigation, Interactive and Content categories can be distinguished as follows. The Computer category includes information about the user's computer including IP address and software configuration. Navigation data describes actual user behavior related to browsing. When an IP address is stored in a log file with information related to browsing activity, both the Computer category and the Navigation category should be used. Interactive Data is data actively solicited to provide some useful service at a site beyond browsing. Content is information exchanged on a site for the purposes of communication.
The Other category should be used only when data is requested that does not fit into any other category.
P3P uses categories to give users and user agents additional hints as to what type of information is requested from a service. While most data in the Base Data Schema is in a known category (or a set of known categories), some data elements can be in a number of different categories, depending on the situation. The former are called fixed-category data elements (or "fixed data elements" for short), the latter variable-category data elements ("variable data elements"). Both types of elements are briefly described in the two sections below.
Most of the elements in the base data schema are so-called "fixed" data elements: they belong to one or at most two category classes. By assigning a category invariably to elements in the base data schema, services and users are able to refer to entire groups of elements simply by referencing the corresponding category. For example, using APPEL, the privacy preferences exchange language, users can write rules that prevent their user agent from giving out any data element in a certain category.
When creating data schemas for fixed data elements, schema creators have to explicitly enumerate the categories that these elements belong to. For example:
<DATA name="postal.street.line1" type="text"
short="Street Address, Line 1" category="physical" template="yes"/>
If an element belongs to multiple categories, multiple elements referencing the same data can be used (each with a different category). For example, the following piece of XML can be used to declare that the data elements in user.name. have both category "physical" and "demographic":
<DATA name="user.name." type="personname."
short="User's Name" category="physical" template="yes"/>
<DATA name="user.name." type="personname."
short="User's Name" category="demographic" template="yes"/>
Please note that the category classes of fixed data elements cannot be overridden, for example by writing rules or policies that assign a different category to a known fixed base data element. User Agents MUST ignore such categories and instead use the original category (or set of categories) listed in the schema definition. User Agents MAY also alert the user that a fixed data element is used together with a non-standard category class.
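The override rule for fixed data elements can be sketched as a lookup that always prefers the schema's categories and reports whether the policy tried to assign a different one. The table below is a hypothetical two-entry excerpt built from the examples in this section, not the actual Base Data Schema, and the function name is an assumption.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical excerpt of a schema's fixed categories, taken from the examples above.
FIXED_CATEGORIES: Dict[str, Set[str]] = {
    "postal.street.line1": {"physical"},
    "user.name.": {"physical", "demographic"},
}


def effective_categories(name: str, declared: Optional[str]) -> Tuple[Set[str], bool]:
    """Return (categories to use, whether the user could be alerted to an override)."""
    fixed = FIXED_CATEGORIES.get(name)
    if fixed is None:
        # Not a known fixed element: use whatever the policy declares.
        return ({declared} if declared else set()), False
    # Fixed element: ignore the declared category and keep the schema's set.
    mismatch = declared is not None and declared not in fixed
    return set(fixed), mismatch
```

For instance, a policy that labels `user.name.` as `financial` still yields the schema's `physical` and `demographic` categories, with the second value set to `True` so that the agent can optionally alert the user.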
Not all data elements in the base data schema belong to a pre-determined category class. Some elements can contain information from a range of categories, depending on a particular situation. Such elements are called variable-category data elements (or "variable data element" for short). Although most variable data elements in the P3P Base Data Schema are combined in the dynamic. element set, they can appear in any data set, even mixed with fixed-category data elements.
When creating a schema definition for such elements, schema authors MUST NOT list an explicit category attribute; otherwise the element becomes a fixed data element. For example, when specifying the "Year" data type, which can take various categories depending on the situation (e.g. when used for a credit card expiration date vs. for a birth date), the following schema definition can be used:
<DATA name="date.ymd.year" type="number" size="6"
short="Year" template="yes"/> <!-- Variable Data Element -->
This allows new schema extensions that reference such variable-category data types to assign a specific category to derived elements, depending on their usage in that extension. For example, an E-commerce schema extension could thus define a credit card expiration date as follows:
<DATA name="Card.ExpDate." type="date.ymd."
short="Card Expiration Date" category="financial" template="yes"/>
Under these conditions, the variable data type date.ymd. is assigned the fixed category Financial Information when it is used to specify a credit card expiration date.
Note that while user preferences can list such variable data elements without any additional category information (effectively expressing preferences over any usage of this element), services MUST always explicitly specify the categories that apply to the usage of a variable data element in their particular policy. This information has to appear as an attribute to the corresponding DATA element listed in the policy, for example as in:
<POLICY ... >
...
<DATA name="dynamic.cookies" category="uniqueid"/>
...
</POLICY>
where a service declares that cookies are used for identifying the user at this site (i.e. category Unique Identifiers).
If a service wants to declare a data element that is in multiple categories, it simply declares the same element multiple times (as shown in section 3.4.1 above):
<POLICY ... >
...
<DATA name="dynamic.cookies" category="uniqueid"/>
<DATA name="dynamic.cookies" category="preference"/>
...
</POLICY>
With the above declaration a service announces that it uses cookies both for identifying the user at this site and for storing user preference data. Note that for the purpose of P3P there is no difference whether this information is stored in two separate cookies or in a single one.
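The rule that a variable data element must carry an explicit category in every policy can be checked mechanically. The following is a minimal sketch in Python, not part of P3P itself: the function names and the `VARIABLE_ELEMENTS` set are hypothetical, and the policy fragment is a simplified, well-formed stand-in for a real policy.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of variable-category elements known to the checker.
VARIABLE_ELEMENTS = {"dynamic.cookies", "dynamic.miscdata"}

def categories_by_element(policy_xml):
    """Map each DATA name to the set of categories declared for it."""
    root = ET.fromstring(policy_xml)
    result = {}
    for data in root.iter("DATA"):
        cats = result.setdefault(data.get("name"), set())
        category = data.get("category")
        if category:
            cats.add(category)
    return result

def missing_categories(policy_xml):
    """Return variable elements that the policy lists without any category."""
    declared = categories_by_element(policy_xml)
    return sorted(name for name, cats in declared.items()
                  if name in VARIABLE_ELEMENTS and not cats)

# Simplified policy fragment used only for illustration.
POLICY = """
<POLICY>
  <DATA name="dynamic.cookies" category="uniqueid"/>
  <DATA name="dynamic.cookies" category="preference"/>
  <DATA name="dynamic.miscdata"/>
</POLICY>
"""
```

Here the two declarations of dynamic.cookies are merged into one set of categories, while dynamic.miscdata is reported because it appears without a category.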
P3P provides a flexible and powerful mechanism to extend its syntax and semantics using one element: EXTENSION. This element is used to indicate portions of the policy which belong to an extension. The meaning of the data within the EXTENSION element is defined by the extension itself.
[35] | extension | = | "<EXTENSION" [" optional=" '"' yesno '"'] ">" PCDATA "</EXTENSION>" |
For example, if www.TheCoolCatalog.com would like to add to P3P a feature to indicate that a certain set of data elements were only to be collected from users living in the United States, Canada, or Mexico, it could add a mandatory extension like this:
<DATA-GROUP>
  ...
  <EXTENSION>
    <COLLECTION-GEOGRAPHY type="include" xmlns="http://www.TheCoolCatalog.com/P3P/region">
      <USA/><Canada/><Mexico/>
    </COLLECTION-GEOGRAPHY>
  </EXTENSION>
</DATA-GROUP>
On the other hand, if www.TheCoolCatalog.com would like to add an extension stating what country the server is in, an optional extension might be more appropriate, such as the following:
<POLICY>
  <EXTENSION optional="yes">
    <ORIGIN xmlns="http://www.TheCoolCatalog.com/P3P/origin" country="USA"/>
  </EXTENSION>
  ...
</POLICY>
The xmlns attribute is significant since it specifies the namespace for interpreting the names of elements and attributes used in the extension. Note that, as specified in [XML-Name], the namespace URI is just intended to be a unique identifier for the XML entities used by the extension. Nevertheless, service providers MAY provide a page with a description of the extension at the corresponding URI.
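As an illustration of how a user agent might inspect extensions, the sketch below lists each extension's optional flag together with the namespace and local name of the element it carries. It is an assumption-laden example: the function name `describe_extensions` is hypothetical, and the policy fragment is reduced to the optional extension shown above.

```python
import xml.etree.ElementTree as ET

def describe_extensions(fragment_xml):
    """Return (optional, namespace, local_name) for each child of every EXTENSION."""
    root = ET.fromstring(fragment_xml)
    found = []
    for ext in root.iter("EXTENSION"):
        # An absent optional attribute is treated here as "not optional".
        optional = ext.get("optional") == "yes"
        for child in ext:
            # ElementTree renders namespaced tags as "{namespace}local".
            if child.tag.startswith("{"):
                namespace, local = child.tag[1:].split("}", 1)
            else:
                namespace, local = "", child.tag
            found.append((optional, namespace, local))
    return found

# Simplified policy fragment used only for illustration.
POLICY = """
<POLICY>
  <EXTENSION optional="yes">
    <ORIGIN xmlns="http://www.TheCoolCatalog.com/P3P/origin" country="USA"/>
  </EXTENSION>
</POLICY>
"""
```

The namespace is used purely as an identifier; the sketch never dereferences the URI.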
P3P has the ability to define data schemas to provide a common way for services and user agents to refer to data elements. A data schema describes specific data elements, which may be grouped into hierarchical data sets.
Services may declare and use data elements by creating a data schema and referencing it in a policy using the dataschema attribute. P3P comes with a standard data schema, the P3P Base Data Schema, which defines a wide variety of commonly used data elements and also provides basic data types that other schemas can conveniently reuse.
The format of a data schema is:
<DATASCHEMA xmlns="http://www.w3.org/2000/P3Pv1">
  <DATA ... />
  ...
  <DATA ... />
</DATASCHEMA>
[36] | dataschema | = | `<DATASCHEMA xmlns="http://www.w3.org/2000/P3Pv1">` *(data-reference|extension) "</DATASCHEMA>" |
The <DATASCHEMA> element contains references to the new data elements. Such references can be made using the <DATA> tag and the following attributes: name, type, typeschema (which may be omitted if the typeschema is the Base Data Schema), template, category, short, long, size.
For every data element, every missing attribute is presumed to be present with an empty string as its default value. In the case of the typeschema attribute, the empty string value has the special meaning that the type schema coincides with the namespace of the corresponding DATA element.
For example, suppose the company HyperSpeed wants to build the following data schema:
vehicle.model (of primitive type text)
vehicle.color (of primitive type text)
vehicle.built.year (of primitive type number)
vehicle.built.where. (of basic type postal.)
vehicle.price (of primitive type number)
car.model (of primitive type text)
car.color (of primitive type text)
car.built.year (of primitive type number)
car.built.where. (of basic type postal.)
car.price (of primitive type number)
Then it could place the following code at http://www.HyperSpeed.com/models-schema:
<DATASCHEMA xmlns="http://www.w3.org/2000/P3Pv1">
  <DATA name="vehicle.model" type="text" short="Model" category="preference" size="63"/>
  <DATA name="vehicle.color" type="text" short="Color" category="preference" size="63"/>
  <DATA name="vehicle.built.year" type="number" short="Construction Year" category="preference" size="63"/>
  <DATA name="vehicle.built.where." type="postal." short="Construction Place" category="preference" size="63"/>
  <DATA name="car." type="vehicle." typeschema="http://www.HyperSpeed.com/models-schema"/>
</DATASCHEMA>
Note that every time a data schema is created, it can be implicitly used as a type, just like the vehicle. case above.
Continuing with the example, in order to reference a car model and construction year the service could send the following references inside a P3P policy:
<DATA-GROUP>
  <DATA name="car.model" dataschema="http://www.HyperSpeed.com/models-schema"/>
  <DATA name="car.built.year" dataschema="http://www.HyperSpeed.com/models-schema"/>
</DATA-GROUP>
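The way car. inherits its members from vehicle. can be pictured as a simple name substitution. The sketch below is only an illustration under stated assumptions: the dictionary `SCHEMA` is a hypothetical in-memory form of the HyperSpeed schema (element name mapped to type), and `expand` is a hypothetical helper that performs one level of expansion within a single schema.

```python
# Hypothetical in-memory form of the HyperSpeed schema: element name -> type.
SCHEMA = {
    "vehicle.model": "text",
    "vehicle.color": "text",
    "vehicle.built.year": "number",
    "vehicle.built.where.": "postal.",
    "car.": "vehicle.",
}

def expand(schema, name):
    """Return the element names obtained by instantiating `name` from its type."""
    type_name = schema[name]
    if not type_name.endswith("."):
        return [name]  # primitive type, nothing to expand
    # Members of the structured type are the schema entries under its prefix.
    members = sorted(k for k in schema
                     if k.startswith(type_name) and k != type_name)
    if not members:
        return [name]  # structured type defined in another schema
    # Replace the type prefix with the name of the derived element.
    return [name + member[len(type_name):] for member in members]
```

With this data, `expand(SCHEMA, "car.")` yields car.built.where., car.built.year, car.color and car.model, which are the names a policy can then reference.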
In order to provide multilingual support for data schema files, a server can supply the right alternative based on the HTTP Accept-Language header.
Data elements can be classified according to whether or not they are in some fixed category (using the category attribute). Schema designers can use this attribute within their schema definitions to define an invariable category for each element. Once defined, this value cannot be changed when referencing such elements from within user preferences, P3P policies, or other schema definitions. However, if left undefined, this attribute MUST be explicitly listed in each P3P policy referencing such elements. Users can have different preferences depending on different attribute-values for the same element. And in the case of undefined attributes within data types, other schema definitions can explicitly set categories in derived elements (otherwise the original definition overrides any value in the derived schema).
Note that the data element names specified in the base data schema or in extension data schemas may be used for purposes other than P3P policies. For example, web sites may use these names to label HTML form fields. By referring to data the same way in P3P policies and forms, automated form-filling tools can be better integrated with P3P user agents. When P3P data element names are used as HTML form field names, underscores ("_") MUST be used in place of dot notation (e.g. user.name.first must be referenced as user_name_first). This allows interoperability with client-side JavaScript, which also uses the dot notation to access form field names and values.
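The renaming rule for form fields amounts to a single character substitution. A minimal sketch follows; the function name `to_form_field_name` is hypothetical.

```python
def to_form_field_name(element_name):
    """Convert a P3P data element name to its HTML form field name."""
    # Dots are replaced by underscores, as required for form field names.
    return element_name.replace(".", "_")
```

For example, `to_form_field_name("user.name.first")` gives `user_name_first`.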
Analogously to P3P policies, an essential requirement on dataschemas is immutability: with one exception, a dataschema that can be fetched at a certain URI cannot be changed. This way, the URI of a dataschema acts like a unique identifier for the dataschema, and any new dataschema must therefore use a new, different URI. The only exception to this general principle is when multiple language versions (translations) of the same dataschema are offered by the server using the HTTP "Content-Language" header to indicate that a particular language encoding has been used for the dataschema. P3P clients MAY check for immutability of dataschemas by comparing a cached version of a dataschema (and its Content-Language if present) with the corresponding freshly retrieved dataschema (and Content-Language if present). If a user agent discovers that the two dataschemas are different but retain the same URI, then it MUST treat the resource referencing the changed dataschema as if it has no P3P policy, UNLESS the dataschemas have two different values of their Content-Language.
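The optional immutability check can be sketched as a comparison of two retrievals of the same URI. The code below is illustrative only: the `FetchedSchema` record and the `schema_changed` function are hypothetical names, and how the body and Content-Language are obtained is left out.

```python
from typing import NamedTuple, Optional

class FetchedSchema(NamedTuple):
    """One retrieval of a dataschema (hypothetical record)."""
    body: bytes
    content_language: Optional[str]

def schema_changed(cached, fresh):
    """True when the same URI now serves a different schema in the same language."""
    if cached.content_language != fresh.content_language:
        # Different Content-Language values: the permitted translation exception.
        return False
    return cached.body != fresh.body
```

When `schema_changed` returns True, the user agent would treat the referencing resource as if it had no P3P policy.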
P3P schemas may refer to the following primitive data element types:
Primitive DataType | Definition |
text | [UTF-8] |
gender | "M" or "F". |
boolean | "false" or "true". |
binary | Base64 per RFC-2045. [MIME] |
number | text composed with the digits "0", "1", "2", "3", "4", "5", "6", "7", "8", "9". |
Country | two-letter country code as per [ISO3166] |
uri | [URI] |
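Some of the primitive types above have a closed form that is easy to check. The sketch below covers only gender, boolean, number and Country; the names `CHECKERS` and `is_valid` are hypothetical. Two choices are assumptions rather than statements of the table: an empty string is rejected as a number, and Country is checked only for the shape of two uppercase letters, not for membership in ISO 3166.

```python
import re

# Checkers for the primitive types with a simple closed form.
CHECKERS = {
    "gender": lambda v: v in ("M", "F"),
    "boolean": lambda v: v in ("false", "true"),
    # Only the ASCII digits 0-9 are accepted; empty strings are rejected (assumption).
    "number": lambda v: re.fullmatch(r"[0-9]+", v) is not None,
    # Shape check only: two uppercase letters (assumption).
    "Country": lambda v: re.fullmatch(r"[A-Z]{2}", v) is not None,
}

def is_valid(primitive_type, value):
    """Check a value against one of the primitive types handled by this sketch."""
    checker = CHECKERS.get(primitive_type)
    if checker is None:
        raise ValueError("type not covered by this sketch: " + primitive_type)
    return checker(value)
```

The text, binary and uri types are deliberately left out, since validating them requires the referenced specifications.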
The Basic Data Types are structured types used by the P3P Base Data Schema (and, possibly, reused by other different data schemas). All P3P-compliant user agent implementations MUST be aware of the Basic Data Types. Each table below specifies the elements of a basic data type, the categories associated, their types, and the display names shown to users. More than one category may be associated with a fixed data element. However, each base data element is assigned to only one category whenever possible. Data schema designers are recommended to do the same.
The date. type is a structured type that specifies a date. Since date information can be used in different ways, depending on the context, all date. information is tagged as being of "variable" category. Schema definitions have to explicitly set the corresponding category in the element referencing this data type. For example, soliciting the birthday of a user might be "Demographic and Socioeconomic Data", while the expiration date of a credit card belongs to the "Financial Account Identifiers" category.
date. | Category | Type | Short display name |
ymd.year | (variable-category) | number | Year |
ymd.month | (variable-category) | number | Month |
ymd.day | (variable-category) | number | Day |
hms.hour | (variable-category) | number | Hour |
hms.minute | (variable-category) | number | Minute |
hms.second | (variable-category) | number | Second |
fractionsecond | (variable-category) | number | Fraction of Second |
timezone | (variable-category) | text | Time Zone |
All the fields in the date. type must be in the same format as those in the most informative profile of the time standard ISO8601. Note that "date.ymd." and "date.hms." can be used as shorthand to reference the year/month/day and hour/minute/second blocks respectively.
The personname. type is a structured type that specifies information about the naming of a person.
personname. | Category | Type | Short display name |
prefix | Demographic and Socioeconomic Data | text | Name Prefix |
first | Physical Contact Information | text | First Name |
last | Physical Contact Information | text | Last Name |
middle | Physical Contact Information | text | Middle Name |
suffix | Demographic and Socioeconomic Data | text | Name Suffix |
formatted | Physical Contact Information, Demographic and Socioeconomic Data | text | formatted Name |
nickname | Demographic and Socioeconomic Data | text | Nickname |
The certificate. type is a structured type to specify identity certificates (like, for example, X.509).
certificate. | Category | Type | Short display name |
key | Unique Identifiers | binary | Certificate Key |
format | Unique Identifiers | text | Certificate Format |
The "format" field is an IANA registered public key or authentication certificate format, while the "key" field contains the corresponding certificate key.
The phonenum. type is a structured type that specifies the characteristics of a phone number.
phonenum. | Category | Type | Short display name |
intcode | Physical Contact Information | number | International Phone code |
loccode | Physical Contact Information | number | Local Phone Area code |
number | Physical Contact Information | number | Phone Number |
ext | Physical Contact Information | number | Phone Extension |
comment | Physical Contact Information | text | Phone Optional Comments |
The contact. type is a structured type used to specify contact information. Services can specify precisely which set of data they need, postal, telecommunication, or online address information.
contact. | Category | Type | Short display name |
postal. | Physical Contact Information, Demographic and Socioeconomic Data | postal. | Postal Address Information |
telecom. | Physical Contact Information | telecom. | Telecommunications Information |
online. | Online Contact Information | online. | Online Address Information |
The postal. type is a structured type that specifies a postal mailing address.
postal. | Category | Type | Short display name |
name. | Physical Contact Information, Demographic and Socioeconomic Data | personname. | Name |
street.line1 | Physical Contact Information | text | Street Address 1 |
street.line2 | Physical Contact Information | text | Street Address 2 |
street.line3 | Physical Contact Information | text | Street Address 3 |
city | Physical Contact Information | text | City |
stateprov | Physical Contact Information | text | State or Province |
postalcode | Demographic and Socioeconomic Data | text | Postal code |
countrycode | Demographic and Socioeconomic Data | Country | Country code |
country | Demographic and Socioeconomic Data | text | Country Name |
organization | Physical Contact Information, Demographic and Socioeconomic Data | text | Organization Name |
formatted | Demographic and Socioeconomic Data | text | formatted Postal Address |
Using three distinct fields for the street information allows service providers and user agents to split long addresses into multiple lines during solicitation. However, since all fields share the common street. prefix, this shorthand form can be used to reference all three fields at once.
The "formatted" field is used to specify the formatted text corresponding to the delivery address, as it could for example be printed on a label.
The telecom. type is a structured type that specifies telecommunication information about a person.
telecom. | Category | Type | Short display name |
phone. | Physical Contact Information | phonenum. | Phone number |
fax. | Physical Contact Information | phonenum. | Fax number |
mobile. | Physical Contact Information | phonenum. | Mobile Phone number |
pager. | Physical Contact Information | phonenum. | Pager number |
The online. type is a structured type that specifies online information about a person.
online. | Category | Type | Short display name |
email | Online Contact Information | text | Email Address |
uri | Online Contact Information | uri | Home Page Address |
All P3P-compliant user agent implementations MUST be aware of the data elements in the P3P Base Data Schema. The P3P Base Data Schema includes four element sets (user., thirdparty., business. and dynamic.). The user., thirdparty. and business. sets include elements that users and/or businesses might provide values for, while the dynamic. set includes elements that are dynamically generated in the course of a user's browsing session. User agents may support a variety of mechanisms that allow users to provide values for the elements in the user. set and store them in a data repository, including mechanisms that support multiple personae. Users may choose not to provide values for these data elements.
The formal XML definition of the P3P Base Data Schema is given in Appendix 2. In the following sections, we explain one by one the base data elements and sets. The members of this Working Group expect that in the future, there will be demand for the creation of other data sets and elements. Obvious applications include catalogue, payment, and agent/system attribute schemas. (An extensive set of system elements is provided for example in http://www.w3.org/TR/NOTE-agent-attributes.)
Each table below specifies a set, the elements within the set, the category associated with the element, its type, and the display name shown to users. More than one category may be associated with a fixed data element. However, we have tried to assign each base data element to only one category whenever possible. We recommend that data schema designers do the same.
The user. data set includes general information about the user.
user. | Category | Type | Short display name |
name. | Physical Contact Information, Demographic and Socioeconomic Data | personname. | User's Name |
bdate. | Demographic and Socioeconomic Data | date. | User's Birth Date |
cert. | Unique Identifiers | certificate. | User's Identity Certificate |
gender | Demographic and Socioeconomic Data | gender | User's Gender |
employer | Demographic and Socioeconomic Data | text | User's Employer |
department | Demographic and Socioeconomic Data | text | Department or division of organization where user is employed |
jobtitle | Demographic and Socioeconomic Data | text | User's Job Title |
home. | Physical Contact Information, Online Contact Information, Demographic and Socioeconomic Data | contact. | User's Home Contact Information |
business. | Physical Contact Information, Online Contact Information, Demographic and Socioeconomic Data | contact. | User's Business Contact Information |
Note that this data set includes elements that are actually sets of data themselves. These sets are defined in the data types subsection of this document. The short display name for an individual element contained within a data set is defined as the concatenation of the short display names that have been defined for the set and the element, separated by commas. For example, the short display name for user.home.postal.postalcode would be "User's Home Contact Information, Postal Address Information, Postal code". User agent implementations may prefer to develop their own short display names rather than using the concatenated names when displaying information for the user.
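The concatenation rule for short display names can be sketched as a walk through the type tables. The code below is an illustration only: `TABLES` is a hypothetical excerpt holding just the rows needed for the example above, and `short_display_name` is a hypothetical helper.

```python
# Excerpt of the tables in this document: set or type -> member -> (type, short name).
TABLES = {
    "user.": {"home.": ("contact.", "User's Home Contact Information")},
    "contact.": {"postal.": ("postal.", "Postal Address Information")},
    "postal.": {"postalcode": ("text", "Postal code")},
}

def short_display_name(element_name):
    """Concatenate the short display names along the path of an element."""
    set_name, _, rest = element_name.partition(".")
    current = set_name + "."
    names = []
    while rest:
        table = TABLES[current]
        # A member matches if it is the whole remainder, or a structured
        # member (ending in ".") that is a prefix of the remainder.
        member = next(m for m in table
                      if rest == m or (m.endswith(".") and rest.startswith(m)))
        current, short = table[member]
        names.append(short)
        rest = rest[len(member):]
    return ", ".join(names)
```

Calling `short_display_name("user.home.postal.postalcode")` reproduces the concatenated name quoted in the text.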
The thirdparty. data set allows users and businesses to provide values for a related third party. This can be useful whenever third party information needs to be exchanged, for example when ordering a present online that should be sent to another person, or when providing information about one's spouse or business partner. Such information could be stored in the user repository alongside the user. data set. User agents may offer to store multiple such thirdparty. data sets and allow users to select the appropriate values from a list when necessary.
The thirdparty. data set is identical with the user. data set. See section 4.4.1 User Data for details.
The business. data set features a subset of user. data relevant for organizations. In P3P 1.0, this data set is primarily used for declaring the policy entity, though it should also be applicable to business-to-business interactions.
business. | Category | Type | Short display name |
name | Demographic and Socioeconomic Data | text | Organization Name |
department | Demographic and Socioeconomic Data | text | Department or division of organization |
cert. | Unique Identifiers | certificate. | Organization Identity Certificate |
contact-info. | Physical Contact Information, Online Contact Information, Demographic and Socioeconomic Data | contact. | Contact Information for the Organization |
In some cases, there is a need to specify data elements that do not have fixed values that a user might type in or store in a repository. In the P3P Base Data Schema, all such elements are grouped under the dynamic. data set. Sites may refer to the types of data they collect using the dynamic data set only, rather than enumerating all of the specific data elements.
dynamic. | Category | Type | Short display name |
clickstream.client | Navigation and Click-stream Data | text | Click-stream collected on the client |
clickstream.server | Navigation and Click-stream Data | text | Click-stream collected on the server |
cookies | (variable-category) | text | Cookies are processed (read/write) |
http.useragent | Computer Information | text | User Agent information |
http.referrer | Navigation and Click-stream Data | uri | Last URI requested by the user |
miscdata | (variable-category) | text | Miscellaneous non-base data schema information |
searchtext | Interactive Data | text | Search terms |
interactionrecord | Interactive Data | text | Server stores the transaction history |
These elements are often implicit in navigation or Web interactions. They should be used with categories to describe the type of information collected through these methods. A brief description of each element follows.
"clickstream.client" should be used when the server accesses off-line browsing information that has been collected by the user's client. Some versions (e.g. 5.0) of Microsoft's Internet Explorer are known to support such behavior.
"clickstream.server" will probably apply to almost all sites on the Web today. It must be used whenever page access data is kept on the server side. Almost all known Web server implementations today will by default create such an access log, often including origin of the request (IP address or DNS name), time, requested resource, HTTP return code and transferred bytes. Any combination of resource name and originating address should be considered clickstream data (i.e. it allows the reconstruction of a visitor's movements through the site) and should be declared.
The logging of referer or user agent information (included in the headers of the HTTP request by many browsers) should explicitly be declared using the http.useragent and http.referrer data elements.
"cookies" should be used whenever information is placed on a user's machine using the HTTP cookie mechanism in order to be "solicited" (i.e. automatically sent) later. Please note that "cookies" is a variable data element and requires the explicit declaration of usage categories in a policy.
"http.useragent" indicates that the server stores additional information about the user agent in its logs, such as operating system, browser software and version.
"http.referrer" indicates that the server stores additional information about the page the user viewed previously, as indicated by the HTTP_REFERER header.
The "miscdata" element references information collected by the service that the service does not reference using a specific data element. Sites MUST reference a separate miscdata element in their policies for each category of miscdata they collect.
"searchtext" is a specific type of solicitation used for searching and indexing sites. For example, if the only fields on a search engine page are search fields, the site only needs to disclose that data element.
The "interactionrecord" element should be used if the server is keeping track of the interaction it has with the user (i.e. information other than clickstream data, for example account transactions, etc). This element is only meant to inform the user that such information will be retained, but does not indicate how long such data will be kept.
Policies that contain one or more of the variable data elements above have to explicitly declare the category of the information they solicit, for example:
<POLICY ... >
...
<DATA name="dynamic.miscdata" category="online"/>
...
</POLICY>
when asking for a user's IRC name (which would be in category Online Contact Information).
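P3P offers Web sites a great deal of flexibility in how they describe the types of data they collect. Sites can enumerate specific data elements from the base data schema, use the dynamic.miscdata element to disclose categories of data, or define new data schemas. These three methods may be combined within a single policy.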
By using the dynamic.miscdata element, sites can specify the types of data they collect without having to enumerate every individual data element. This may be convenient for sites that collect a lot of data or sites belonging to large organizations that want to offer a single P3P policy covering the entire organization. However, the disadvantage of this approach is that user agents will have to assume that the site might collect any data element belonging to the categories referenced by the site. So, for example, if a site's policy states that it collects dynamic.miscdata of the physical contact information category, but the only physical contact information it collects is business address, user agents will nonetheless assume that the site might also collect phone numbers. If the site wishes to be clear that it does not collect phone numbers or any other physical contact information other than business address, then it should disclose that it collects user.business.postal.. Furthermore, as user agents are developed with automatic form-filling capabilities, it is likely that sites that enumerate the data they collect will be able to better integrate with these tools.
By defining new data schemas, sites can precisely specify the data they collect beyond the base data set. However, if user agents are unfamiliar with the elements defined in these schemas, they will be able to provide only minimal information to the user about these new elements. The information they provide will be based on the category and display names specified for each element.
Regardless of whether a site wishes to make general or specific data disclosures, there are additional advantages to disclosing specific elements from the dynamic. data set. For example, by disclosing dynamic.cookies a site can indicate that it uses cookies and explain the purpose of this use. The working group encourages user agent implementations that offer users cookie control interfaces based on this information. Likewise, user agents that by default do not send the HTTP_REFERER header, might look for the http.referrer element in P3P policies and send the header if it will be used for a purpose the user finds acceptable.
The data schema corresponding to the P3P base data schema follows. In order to improve legibility, the code is indented and aligned along various attribute names. However, note that the whitespace in the actual schema is significant because the content of the document must remain unchanged (immutability of dataschemas).
<DATASCHEMA xmlns="http://www.w3.org/2000/P3Pv1">
<DATA-GROUP>
<!-- ********** Base Data Types **********
-->
<!-- "date." Data Type -->
<DATA name="date.ymd.year"
short="Year"
type="number" size="6"
template="yes"/> <!-- Variable Data Element
-->
<DATA name="date.ymd.month"
short="Month"
type="number" size="2"
template="yes"/> <!-- Variable Data Element
-->
<DATA name="date.ymd.day"
short="Day"
type="number" size="2"
template="yes"/> <!-- Variable Data Element
-->
<DATA name="date.hms.hour"
short="Hour"
type="number" size="2"
template="yes"/> <!-- Variable Data Element
-->
<DATA name="date.hms.minute"
short="Minute"
type="number" size="2"
template="yes"/> <!-- Variable Data Element
-->
<DATA name="date.hms.second"
short="Second"
type="number" size="2"
template="yes"/> <!-- Variable Data Element
-->
<DATA name="date.fractionsecond"
short="Fraction of Second"
type="number" size="6"
template="yes"/> <!-- Variable Data Element
-->
<DATA name="date.timezone"
short="Time Zone"
type="text" size="10"
template="yes"/> <!-- Variable Data Element
-->
<!-- "personname." Data Type
-->
<DATA name="personname.prefix"
short="Name Prefix"
type="text"
category="demographic"
template="yes"/>
<DATA name="personname.first"
short="First Name"
type="text"
category="physical" template="yes"/>
<DATA name="personname.last"
short="Last Name"
type="text"
category="physical" template="yes"/>
<DATA name="personname.middle"
short="Middle Name"
type="text"
category="physical" template="yes"/>
<DATA name="personname.suffix"
short="Name Suffix"
type="text"
category="demographic"
template="yes"/>
<DATA name="personname.formatted"
short="formatted Name"
type="text"
category="physical" template="yes"/>
<DATA name="personname.formatted"
short="formatted Name"
type="text"
category="demographic" template="yes"/>
<DATA name="personname.nickname"
short="Nickname"
type="text"
category="demographic"
template="yes"/>
<!-- "certificate." Data Type
-->
<DATA name="certificate.key"
short="Certificate Key"
type="binary" size="0"
category="uniqueid" template="yes"/>
<DATA name="certificate.format"
short="Certificate Format"
type="text" size="128"
category="uniqueid" template="yes"/>
<!-- "phonenum." Data Type -->
<DATA name="phonenum.intcode"
short="International Phone Code"
type="number" size="11"
category="physical" template="yes"/>
<DATA name="phonenum.loccode"
short="Local Phone Area Code"
type="number" size="11"
category="physical" template="yes"/>
<DATA name="phonenum.number"
short="Phone Number"
type="number" size="30"
category="physical" template="yes"/>
<DATA name="phonenum.ext"
short="Phone Extension"
type="number" size="11"
category="physical" template="yes"/>
<DATA name="phonenum.comment"
short="Phone Optional Comments"
type="text"
category="physical" template="yes"/>
<!-- "contact." Data Type -->
<DATA name="contact.postal."
short="Postal Address Information"
type="postal."
category="physical" template="yes"/>
<DATA name="contact.postal."
short="Postal Address Information"
type="postal."
category="demographic"
template="yes"/>
<DATA name="contact.telecom."
short="Telecommunications Information"
type="telecom."
category="physical" template="yes"/>
<DATA name="contact.online."
short="Online Address Information"
type="online."
category="online" template="yes"/>
<!-- "postal." Data Type -->
<DATA name="postal.name."
short="Name"
type="personname."
category="physical" template="yes"/>
<DATA name="postal.name."
short="Name"
type="personname."
category="demographic"
template="yes"/>
<DATA name="postal.street.line1"
short="Street Address, Line 1"
type="text"
category="physical" template="yes"/>
<DATA name="postal.street.line2"
short="Street Address, Line 2"
type="text"
category="physical" template="yes"/>
<DATA name="postal.street.line3"
short="Street Address, Line 3"
type="text"
category="physical" template="yes"/>
<DATA name="postal.city"
short="City"
type="text"
category="physical" template="yes"/>
<DATA name="postal.stateprov"
short="State or Province"
type="text"
category="physical" template="yes"/>
<DATA name="postal.postalcode"
short="Postal Code"
type="text"
category="demographic"
template="yes"/>
<DATA name="postal.organization"
short="Organization Name"
type="text"
category="physical" template="yes"/>
<DATA name="postal.organization"
short="Organization Name"
type="text"
category="demographic"
template="yes"/>
<DATA name="postal.formatted"
short="Formatted Postal Address"
type="text"
category="physical" template="yes"/>
<DATA name="postal.formatted"
short="Formatted Postal Address"
type="text"
category="demographic"
template="yes"/>
<DATA name="postal.country"
short="Country Name"
type="text"
category="demographic"
template="yes"/>
<DATA name="postal.countrycode"
short="Country Code"
type="Country" size="2"
category="demographic"
template="yes"/>
<!-- "telecom." Data Type -->
<DATA name="telecom.phone."
short="Phone Number"
type="phonenum."
category="physical" template="yes"/>
<DATA name="telecom.fax."
short="Fax Number"
type="phonenum."
category="physical" template="yes"/>
<DATA name="telecom.mobile."
short="Mobile Phone Number"
type="phonenum."
category="physical" template="yes"/>
<DATA name="telecom.pager."
short="Pager Number"
type="phonenum."
category="physical" template="yes"/>
<!-- "online." Data Type -->
<DATA name="online.email"
short="Email Address"
type="text"
category="online" template="yes"/>
<DATA name="online.uri"
short="Home Page Address"
type="uri"
category="online" template="yes"/>
<!-- ********** Base Data Schemas ********** -->
<!-- "dynamic." Data Schema
-->
<DATA
name="dynamic.clickstream.client"
short="Click-stream collected on the client"
type="text" source="service"
category="navigation"/>
<DATA
name="dynamic.clickstream.server"
short="Click-stream collected on the server"
type="text" source="service"
category="navigation"/>
<DATA name="dynamic.cookies"
short="cookies are processed (read/write)"
type="text" source="service"/>
<!-- Variable Data Element
-->
<DATA name="dynamic.http.useragent"
short="User Agent information"
type="text" source="service"
category="navigation"/>
<DATA name="dynamic.http.referrer"
short="Last URI requested by the user"
type="uri" source="service"
category="navigation"/>
<DATA name="dynamic.miscdata"
short="Miscellaneous non base data schema information"
type="text" source="service"/>
<!-- Variable Data Element
-->
<DATA name="dynamic.searchtext"
short="Search terms"
type="text" source="service"
category="interactive"/>
<DATA
name="dynamic.interactionrecord"
short="server stores the transaction history"
type="text" source="service"
category="interactive"/>
<!-- "user." Data Schema -->
<DATA name="user.name."
short="User's Name"
type="personname."
category="physical"/>
<DATA name="user.name."
short="User's Name"
type="personname."
category="demographic"/>
<DATA name="user.bdate."
short="User's Birth Date"
type="date."
category="demographic"/>
<DATA name="user.cert."
short="User's Identity certificate"
type="certificate."
category="uniqueid"/>
<DATA name="user.gender"
short="User's gender"
type="gender"
category="demographic"/>
<DATA name="user.jobtitle"
short="User's Job Title"
type="text"
category="demographic"/>
<DATA name="user.home."
short="User's Home Contact Information"
type="contact."
category="physical"/>
<DATA name="user.business."
short="User's Business Contact Information"
type="contact."
category="physical"/>
<DATA name="user.employer"
short="Name of User's Employer"
type="text"
category="demographic"/>
<DATA name="user.department"
short="Department or division of organization where user
is employed"
type="text"
category="demographic"/>
<!-- "thirdparty." Data Schema
-->
<DATA name="thirdparty.name."
short="Third Party's Name"
type="personname."
category="physical"/>
<DATA name="thirdparty.name."
short="Third Party's Name"
type="personname."
category="demographic"/>
<DATA name="thirdparty.bdate."
short="Third Party's Birth Date"
type="date."
category="demographic"/>
<DATA name="thirdparty.cert."
short="Third Party's Identity certificate"
type="certificate."
category="uniqueid"/>
<DATA name="thirdparty.gender"
short="Third Party's gender"
type="gender"
category="demographic"/>
<DATA name="thirdparty.jobtitle"
short="Third Party's Job Title"
type="text"
category="demographic"/>
<DATA name="thirdparty.home."
short="Third Party's Home Contact
Information"
type="contact."
category="physical"/>
<DATA name="thirdparty.business."
short="Third Party's Business Contact
Information"
type="contact."
category="physical"/>
<DATA name="thirdparty.employer"
short="Name of Third Party's Employer"
type="text"
category="demographic"/>
<DATA name="thirdparty.department"
short="Department or division of organization where third
party is employed"
type="text"
category="demographic"/>
<!-- "business." Data Schema
-->
<DATA name="business.name"
short="Organization Name"
type="text"
category="demographic"/>
<DATA name="business.department"
short="Department or division of
organization"
type="text"
category="demographic"/>
<DATA name="business.cert."
short="Organization Identity certificate"
type="certificate."
category="uniqueid"/>
<DATA name="business.contact-info."
short="Contact Information for the
Organization"
type="contact."
category="physical"/>
</DATA-GROUP>
</DATASCHEMA>
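As an illustration of how a user agent might read the schema above, the following is a minimal Python sketch that indexes DATA elements by name and collects the categories declared for each. It rests on two assumptions not stated in this section: that repeating a DATA element with a different category assigns the element to several categories, and that a trailing "." in a name marks a structured element with sub-elements. The helper names `categories_by_name` and `is_structured` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Excerpt copied from the base data schema above. The DATA-GROUP wrapper
# lets the fragment parse as a single XML document.
FRAGMENT = """
<DATA-GROUP>
<DATA name="online.email" short="Email Address" type="text"
      category="online" template="yes"/>
<DATA name="user.name." short="User's Name" type="personname."
      category="physical"/>
<DATA name="user.name." short="User's Name" type="personname."
      category="demographic"/>
<DATA name="dynamic.cookies" short="cookies are processed (read/write)"
      type="text" source="service"/>
</DATA-GROUP>
"""


def categories_by_name(xml_text):
    """Map each DATA name to the sorted list of categories declared for it."""
    result = defaultdict(set)
    for data in ET.fromstring(xml_text).iter("DATA"):
        name = data.get("name")
        category = data.get("category")
        result[name]  # names without a category attribute still get an entry
        if category is not None:
            result[name].add(category)
    return {name: sorted(cats) for name, cats in result.items()}


def is_structured(name):
    # Assumption: a trailing "." marks a name whose type has sub-elements.
    return name.endswith(".")


index = categories_by_name(FRAGMENT)
```

Here `index["user.name."]` collects both categories declared for the name, while `index["dynamic.cookies"]` is empty because that element carries no category attribute.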
This appendix contains two XML Schemas, one for P3P policy documents, and one for P3P dataschema documents. An XML Schema may be used to validate the structure and datatype values used in an instance of the schema given as an XML document. P3P policy and dataschema documents are XML documents that MUST conform to these schemas. Note that these schemas are based on the 17 December XML Schema working drafts [XML-Schema1][XML-Schema2], which are subject to change.
<?xml version='1.0'?>
<!-- XML Schema for P3P policies -->
<!DOCTYPE schema PUBLIC "-//W3C//DTD XMLSCHEMA 19991105//EN" "structures.dtd">
<schema xmlns='http://www.w3.org/1999/XMLSchema/' targetNamespace='http://www.w3.org/2000/P3Pv1' version='0.9'>
<!-- policy element -->
<element name='policy'>
  <type>
    <element ref='entity' minOccurs='1'/>
    <element ref='disclosure' minOccurs='1'/>
    <element ref='disputes-group' minOccurs='0'/>
    <element ref='statement' minOccurs='1' maxOccurs='*'/>
    <element ref='extension' minOccurs='0' maxOccurs='*'/>
  </type>
</element>
<!-- entity element -->
<element name='entity'>
  <type>
    <element ref='data' minOccurs='1' maxOccurs='*'/>
  </type>
</element>
<!-- disclosure element -->
<element name='disclosure'>
  <type>
    <attribute name='discuri'>
      <datatype basetype='URI'/>
    </attribute>
    <attribute name='access-disclosure'>
      <datatype basetype='STRING'>
        <enumeration>
          <literal>nonident</literal>
          <literal>contact</literal>
          <literal>other</literal>
          <literal>contact_and_other</literal>
          <literal>all</literal>
          <literal>none</literal>
        </enumeration>
      </datatype>
    </attribute>
  </type>
</element>
<!-- disputes-group element -->
<element name='disputes-group'>
  <type>
    <element name='disputes' maxOccurs='*'>
      <type>
        <attribute name='resolution-type'>
          <datatype basetype='STRING'>
            <enumeration>
              <literal>internal</literal>
              <literal>third-party</literal>
              <literal>law</literal>
            </enumeration>
          </datatype>
        </attribute>
        <attribute name='service' type='URI'/>
        <attribute name='description' type='string' minOccurs='0'/>
        <attribute name='verification' type='string' minOccurs='0'/>
        <attribute name='image' type='URI' minOccurs='0'/>
        <attribute name='width' type='non-negative-integer' minOccurs='0'/>
        <attribute name='height' type='non-negative-integer' minOccurs='0'/>
        <attribute name='alt' type='string' minOccurs='0'/>
        <element ref='remedies' minOccurs='0' maxOccurs='1'/>
      </type>
    </element>
  </type>
</element>
<!-- remedies element -->
<element name='remedies'>
  <type>
    <group minOccurs='1'>
      <element name='correct'/>
      <element name='money'/>
      <element name='law'/>
    </group>
  </type>
</element>
<!-- statement element -->
<element name='statement'>
  <type>
    <group>
      <element ref='consequence' minOccurs='0'/>
      <element ref='purpose' minOccurs='1'/>
      <element ref='recipient' minOccurs='1'/>
      <element ref='retention' minOccurs='1'/>
      <element ref='data-group' minOccurs='1' maxOccurs='*'/>
      <element ref='extension' maxOccurs='*'/>
    </group>
  </type>
</element>
<!-- consequence element -->
<element name='consequence' maxOccurs='*' type='string'/>
<!-- purpose element -->
<element name='purpose'>
  <type>
    <group minOccurs='1'>
      <element name='current'/>
      <attribute name='change_preferences' type='yesOrNo'/>
      <element name='admin'/>
      <attribute name='change_preferences' type='yesOrNo'/>
      <element name='develop'/>
      <attribute name='change_preferences' type='yesOrNo'/>
      <element name='contact'/>
      <attribute name='change_preferences' type='yesOrNo'/>
      <element name='customization'/>
      <attribute name='change_preferences' type='yesOrNo'/>
      <element name='targeting'/>
      <attribute name='change_preferences' type='yesOrNo'/>
      <element name='profiling'/>
      <attribute name='change_preferences' type='yesOrNo'/>
      <element name='other'>
        <attribute name='change_preferences' type='yesOrNo'/>
        <datatype basetype='string'/>
      </element>
    </group>
  </type>
</element>
<!-- recipient element -->
<element name='recipient'>
  <type>
    <group maxOccurs='1'>
      <element name='ours'/>
      <element name='same'/>
      <element name='other'/>
      <element name='public'/>
      <element name='unrelated'/>
      <element name='delivery'/>
    </group>
  </type>
</element>
<!-- retention element -->
<element name='retention'>
  <type>
    <element name='no-retention'/>
    <element name='stated-purpose'/>
    <element name='legal-requirement'/>
    <element name='indefinitely'/>
    <element name='business-practices'/>
  </type>
</element>
<!-- data-group element -->
<element name='data-group'>
  <type>
    <element ref='data' maxOccurs='*'/>
  </type>
</element>
<!-- data element -->
<element name='data'>
  <type>
    <attribute name='category'>
      <datatype basetype='STRING'>
        <enumeration>
          <literal>physical</literal>
          <literal>online</literal>
          <literal>uniqueid</literal>
          <literal>purchase</literal>
          <literal>financial</literal>
          <literal>computer</literal>
          <literal>navigation</literal>
          <literal>interactive</literal>
          <literal>demographic</literal>
          <literal>content</literal>
          <literal>state</literal>
          <literal>political</literal>
          <literal>health</literal>
          <literal>preference</literal>
          <literal>other</literal>
        </enumeration>
      </datatype>
    </attribute>
    <attribute name='dataschema' type='string'/>
    <attribute name='optional' type='yesOrNo'/>
    <attribute name='type' type='string'/>
    <attribute name='typeschema' type='string'/>
    <attribute name='template' type='yesOrNo'/>
    <attribute name='short' type='string'/>
    <attribute name='long' type='string'/>
    <attribute name='size' type='non-negative-integer'/>
    <attribute name='value' type='string'/>
  </type>
</element>
<element name='man-extension'>
  <!-- any mixed content -->
</element>
<element name='opt-extension'>
  <!-- any mixed content -->
</element>
</schema>

<?xml version='1.0'?>
<!-- XML Schema for P3P dataschemas -->
<!DOCTYPE schema PUBLIC "-//W3C//DTD XMLSCHEMA 19991105//EN" "structures.dtd">
<schema xmlns='http://www.w3.org/1999/XMLSchema' targetNamespace='http://www.w3.org/2000/P3Pv1' version='0.9'>
<!-- dataschema element -->
<element name='dataschema'>
  <type>
    <group maxOccurs='*'>
      <element ref='data-group'/>
      <element ref='man-extension'/>
      <element ref='opt-extension'/>
    </group>
  </type>
</element>
</schema>
This appendix contains the DTDs for policy documents and for dataschemas. The following is the XML DTD for P3P policy documents.
<!-- ************** Entities ************** -->
<!ENTITY % URI "CDATA">
<!ENTITY % NUMBER "CDATA">
<!ENTITY % LANG "CDATA">
<!ENTITY % dataelement "CDATA">
<!ENTITY % categories "( physical | online | uniqueid | purchase | financial | computer | navigation | interactive | demographic | content | state | political | health | preference | other )">
<!-- *********** POLICY *********** -->
<!ELEMENT POLICY (ENTITY,DISCLOSURE,DISPUTES-GROUP?,STATEMENT+,EXTENSION*)>
<!-- *********** ENTITY *********** -->
<!ELEMENT ENTITY (DATA+,EXTENSION*)+>
<!-- *********** DISCLOSURE *********** -->
<!ELEMENT DISCLOSURE (EXTENSION*)>
<!ATTLIST DISCLOSURE
  discuri %URI; #REQUIRED
  access (nonident | contact | other_ident | contact_and_other | all | none) #REQUIRED
>
<!-- *********** DISPUTES *********** -->
<!ELEMENT DISPUTES-GROUP (DISPUTES+,EXTENSION*)+>
<!ELEMENT DISPUTES (REMEDIES?,EXTENSION*)>
<!ATTLIST DISPUTES
  resolution-type (service|independent|court|law) #REQUIRED
  service %URI; #REQUIRED
  description CDATA #IMPLIED
  verification CDATA #IMPLIED
  image CDATA #IMPLIED
  width %NUMBER; #IMPLIED
  height %NUMBER; #IMPLIED
  alt CDATA #IMPLIED
>
<!-- *********** REMEDIES *********** -->
<!ELEMENT REMEDIES (correct?, money?, law?, EXTENSION*)>
<!ELEMENT correct EMPTY>
<!ELEMENT money EMPTY>
<!ELEMENT law EMPTY>
<!-- *********** STATEMENT *********** -->
<!ELEMENT STATEMENT (CONSEQUENCE?, PURPOSE, RECIPIENT, RETENTION, DATA-GROUP+, EXTENSION*)>
<!-- *********** CONSEQUENCE *********** -->
<!ELEMENT CONSEQUENCE (#PCDATA)>
<!-- *********** PURPOSE *********** -->
<!ELEMENT PURPOSE (current?, admin?, develop?, contact?, customization?, targeting?, profiling?, other-purpose?, EXTENSION*)>
<!ELEMENT current EMPTY>
<!ATTLIST current change_preferences (yes|no) "no" >
<!ELEMENT admin EMPTY>
<!ATTLIST admin change_preferences (yes|no) "no" >
<!ELEMENT develop EMPTY>
<!ATTLIST develop change_preferences (yes|no) "no" >
<!ELEMENT contact EMPTY>
<!ATTLIST contact change_preferences (yes|no) "no" >
<!ELEMENT customization EMPTY>
<!ATTLIST customization change_preferences (yes|no) "no" >
<!ELEMENT targeting EMPTY>
<!ATTLIST targeting change_preferences (yes|no) "no" >
<!ELEMENT profiling EMPTY>
<!ATTLIST profiling change_preferences (yes|no) "no" >
<!ELEMENT other-purpose (#PCDATA)>
<!ATTLIST other-purpose change_preferences (yes|no) "no" >
<!-- *********** RECIPIENT *********** -->
<!ELEMENT RECIPIENT (ours?, same?, other-recipient?, delivery?, public?, unrelated?, EXTENSION*)>
<!ELEMENT ours EMPTY>
<!ELEMENT same EMPTY>
<!ELEMENT other-recipient EMPTY>
<!ELEMENT delivery EMPTY>
<!ELEMENT public EMPTY>
<!ELEMENT unrelated EMPTY>
<!-- *********** RETENTION *********** -->
<!ELEMENT RETENTION ((no-retention | stated-purpose | legal-requirement | indefinitely | business-practices), EXTENSION*)>
<!ELEMENT no-retention EMPTY>
<!ELEMENT stated-purpose EMPTY>
<!ELEMENT legal-requirement EMPTY>
<!ELEMENT indefinitely EMPTY>
<!ELEMENT business-practices EMPTY>
<!-- *********** DATA *********** -->
<!ELEMENT DATA-GROUP (DATA+,EXTENSION*)+>
<!ELEMENT DATA (#PCDATA | EXTENSION)*>
<!ATTLIST DATA
  name %dataelement; #REQUIRED
  dataschema %URI; #IMPLIED
  optional (yes | no) "no"
  type CDATA #IMPLIED
  typeschema CDATA #IMPLIED
  template (yes | no) "no"
  category %categories; #IMPLIED
  short CDATA #IMPLIED
  long CDATA #IMPLIED
  size %NUMBER; #IMPLIED
>
<!-- *********** EXTENSION *********** -->
<!ELEMENT EXTENSION (#PCDATA)>
<!ATTLIST EXTENSION optional (yes | no) "yes" >
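The POLICY content model in the DTD above, (ENTITY,DISCLOSURE,DISPUTES-GROUP?,STATEMENT+,EXTENSION*), fixes the order and multiplicity of a policy's top-level children. The following is a minimal Python sketch of that one check, not a DTD validator: it ignores attributes and nested content, and the function name `policy_children_ok` and the skeleton policies in the usage note are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Content model copied from the DTD:
#   ENTITY,DISCLOSURE,DISPUTES-GROUP?,STATEMENT+,EXTENSION*
# Each child tag is written followed by one space, so the model can be
# expressed as a regular expression over the sequence of child tags.
POLICY_MODEL = re.compile(
    r"ENTITY DISCLOSURE (DISPUTES-GROUP )?(STATEMENT )+(EXTENSION )*"
)


def policy_children_ok(policy_xml):
    """Return True if the top-level children of POLICY follow the content model."""
    root = ET.fromstring(policy_xml)
    if root.tag != "POLICY":
        return False
    sequence = "".join(child.tag + " " for child in root)
    return POLICY_MODEL.fullmatch(sequence) is not None
```

For example, a skeleton such as `<POLICY><ENTITY/><DISCLOSURE/><STATEMENT/></POLICY>` passes this check, while a policy with no STATEMENT, or with DISCLOSURE before ENTITY, does not. A complete check would also have to verify the required attributes and the content models of the child elements.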
This diagram illustrates the RDF [RDF] data model for the policy shown in Example 3.1. Note that this representation may not be fully up to date with the current syntax; the Working Group is synchronizing the RDF representation with the most recent changes.
The formal grammar of P3P is given in this specification using a slight modification of [ABNF]. The following is a simple description of the ABNF.
name = (elements)
(element1 element2)
<a>*<b>element
<a>element
<a>*element
*<b>element
*element
[element]
"string" or 'string'
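The repetition forms listed above can be made concrete by translating them into regular-expression quantifiers. The following is a minimal Python sketch that assumes the usual ABNF reading: `<a>*<b>element` means at least a and at most b occurrences, a missing `<a>` defaults to 0, a missing `<b>` means no upper bound, and `[element]` is equivalent to `*1element`. The helper name `repeat_to_regex` is hypothetical.

```python
import re

# An ABNF repetition prefix: optional lower bound, optional "*", optional upper bound.
REPEAT = re.compile(r"(\d*)(\*?)(\d*)")


def repeat_to_regex(prefix):
    """Translate a prefix such as '1*4', '4', '4*', '*5' or '*' into a regex quantifier."""
    match = REPEAT.fullmatch(prefix)
    if match is None:
        raise ValueError(f"not an ABNF repetition prefix: {prefix!r}")
    low, star, high = match.groups()
    if not star:
        # "<a>element": exactly <a> occurrences; no prefix means exactly one.
        return "{%s}" % (low or "1")
    # "<a>*<b>element": a missing <a> is 0, a missing <b> leaves the range open.
    return "{%s,%s}" % (low or "0", high)


# "1*4element" applied to the literal "ab": one to four repetitions of "ab".
one_to_four = re.compile("(?:ab)" + repeat_to_regex("1*4"))
```

With these definitions, `one_to_four.fullmatch("ababab")` succeeds and `one_to_four.fullmatch("ab" * 5)` does not; the optional form `[element]` corresponds to `repeat_to_regex("*1")`.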
Other notations used in the productions are:
This appendix describes the intent of P3P development and recommends guidelines regarding the responsible use of P3P technology. An earlier version was published in the W3C Note "P3P Guiding Principles".
The Platform for Privacy Preferences Project (P3P) has been designed to be flexible and support a diverse set of user preferences, public policies, service provider policies, and applications. This flexibility will provide opportunities for using P3P in a wide variety of innovative ways that its designers had not imagined. The P3P Guiding Principles were created in order to express the intentions of the undersigned members of the P3P working groups when designing this technology, and to suggest how P3P can be used most effectively in order to maximize privacy and user confidence and trust on the Web. In keeping with our goal of flexibility, this document does not place requirements upon any party. Rather, it makes recommendations about 1) what should be done to be consistent with the intentions of the P3P designers and 2) how to maximize user confidence in P3P implementations and Web services. Organizations, individuals, policy-makers, and companies who use P3P are invited to join us in supporting these principles.
P3P has been designed to promote privacy and trust on the Web by enabling service providers to disclose their information practices, and enabling individuals to make informed decisions about the collection and use of their personal information. P3P user agents work on behalf of individuals to reach agreements with service providers about the collection and use of personal information. Trust is built upon the mutual understanding that each party will respect the agreement reached.
Service providers should preserve trust and protect privacy by applying relevant laws and principles of data protection and privacy to their information practices. The following is a list of privacy principles and guidelines that helped inform the development of P3P and may be useful to those who use P3P:
In addition, service providers and P3P implementers should recognize and address the special concerns surrounding children's privacy.
Service providers should provide timely and effective notices of their information practices, and user agents should provide effective tools for users to access these notices and make decisions based on them.
Service providers should:
User agents should:
Users should be given the ability to make meaningful choices about the collection, use, and disclosure of personal information. Users should retain control over their personal information and decide the conditions under which they will share it.
Service providers should:
User agents should:
Service providers should treat users and their personal information with fairness and integrity. This is essential for protecting privacy and promoting trust.
Service providers should:
User agents should:
While P3P itself does not include security mechanisms, it is intended to be used in conjunction with security tools. Users' personal information should always be protected with reasonable security safeguards in keeping with the sensitivity of the information.
Service providers should:
User agents should:
This specification was produced by the P3P Specification Working Group. The following individuals participated in the P3P Specification Working Group, chaired by Lorrie Cranor (AT&T): Mark Ackerman (University of California, Irvine), Margareta Björksten (Nokia), Joe Coco (Microsoft), Patrick Feng (RPI), Yuichi Koike (NEC/W3C), Daniel LaLiberte (Crystaliz), Marc Langheinrich (NEC/ETH Zurich), Daniel Lim (PrivacyBank), Massimo Marchiori (W3C/MIT), Christine McKenna (Phone.com, Inc.), Paul Perry (Microsoft), Martin Presler-Marshall (IBM), Joel Reidenberg (Fordham Law School), Dave Remy (Geotrust), Ari Schwartz (CDT), Rigo Wenning (W3C), Betty Whitaker (NCR), Sam Yen (Citigroup), Alan Zausner (American Express).
The P3P Specification Working Group inherited a large part of the specification from previous P3P Working Groups. The Working Group would like to acknowledge the contributions of the members of these previous groups (affiliations shown are the members' affiliations at the time of their participation in each Working Group).
The P3P Implementation and Deployment Working Group, chaired by Rolf Nelson (W3C) and Marc Langheinrich (NEC/ETH Zurich): Mark Ackerman (University of California, Irvine), Rob Barrett (IBM), Joe Coco (Microsoft), Lorrie Cranor (AT&T), Massimo Marchiori (W3C/MIT), Gabe Montero (IBM), Stephen Morse (Netscape), Paul Perry (Microsoft), Ari Schwartz (CDT), Gabriel Speyer (Citibank), Betty Whitaker (NCR).
The P3P Syntax Working Group, chaired by Steve Lucas (Matchlogic): Lorrie Cranor (AT&T), Melissa Dunn (Microsoft), Daniel Jaye (Engage Technologies), Massimo Marchiori (W3C/MIT), Maclen Marvit (Narrowline), Max Metral (Firefly), Paul Perry (Firefly), Martin Presler-Marshall (IBM), Drummond Reed (Intermind), Joseph Reagle (W3C).
The P3P Vocabulary Harmonization Working Group, chaired by Joseph Reagle (W3C): Liz Blumenfeld (America Online), Ann Cavoukian (Information and Privacy Commission/Ontario), Scott Chalfant (Matchlogic), Lorrie Cranor (AT&T), Jim Crowe (Direct Marketing Association), Josef Dietl (W3C), David Duncan (Information and Privacy Commission/Ontario), Melissa Dunn (Microsoft), Patricia Faley (Direct Marketing Association), Marit Köhntopp (Privacy Commissioner of Schleswig-Holstein, Germany), Tony Lam (Hong Kong Privacy Commissioner's Office), Tara Lemmey (Narrowline), Jill Lesser (America Online), Steve Lucas (Matchlogic), Deirdre Mulligan (Center for Democracy and Technology), Nick Platten (Data Protection Consultant, formerly of DG XV, European Commission), Ari Schwartz (Center for Democracy and Technology), Jonathan Stark (TRUSTe).
The P3P Protocols and Data Transport Working Group, chaired by Yves Leroux (Digital): Lorrie Cranor (AT&T), Philip DesAutels (Matchlogic), Melissa Dunn (Microsoft), Peter Heymann (Intermind), Tatsuo Itabashi (Sony), Dan Jaye (Engage), Steve Lucas (Matchlogic), Jim Miller (W3C), Michael Myers (VeriSign), Paul Perry (FireFly), Martin Presler-Marshall (IBM), Joseph Reagle (W3C), Drummond Reed (Intermind), Craig Vodnik (Pencom Web Worlds).
The P3P Vocabulary Working Group, chaired by Lorrie Cranor (AT&T): Mark Ackerman (W3C), Philip DesAutels (W3C), Melissa Dunn (Microsoft), Joseph Reagle (W3C), Upendra Shardanand (Firefly).
The P3P Architecture Working Group, chaired by Martin Presler-Marshall (IBM): Mark Ackerman (W3C), Lorrie Cranor (AT&T), Philip DesAutels (W3C), Melissa Dunn (Microsoft), Joseph Reagle (W3C).
Finally, Appendix 7 is drawn from the W3C Note "P3P Guiding Principles", whose signatories are: Azer Bestavros (Bowne Internet Solutions), Ann Cavoukian (Information and Privacy Commission Ontario Canada), Lorrie Faith Cranor (AT&T Labs-Research), Josef Dietl (W3C), Daniel Jaye (Engage Technologies), Marit Köhntopp (Land Schleswig-Holstein), Tara Lemmey (Narrowline; TrustE), Steven Lucas (MatchLogic), Massimo Marchiori (W3C/MIT), Dave Marvit (Fujitsu Labs), Maclen Marvit (Narrowline Inc.), Yossi Matias (Tel Aviv University), James S. Miller (MIT), Deirdre Mulligan (Center for Democracy and Technology), Joseph Reagle (W3C), Drummond Reed (Intermind), Lawrence C. Stewart (Open Market, Inc.).
Change log from the 2 November 1999 Specification (last call):