Multipart types may be returned on the condition that the client has indicated acceptability (using Accept:) of the multipart type and also of the content types of each constituent body part. The body parts (unlike in MIME) MAY contain HTTP metainformation header fields which ARE significant.

Multipart/alternative
This is normally used in mail to send different content-type variants when the receiver's capabilities are not known. This is not the case with HTTP. Multipart/alternative may however be used to provide metainformation on many instances of an object, in the case of an indirection response. This allows, for example, pointers to be returned by a name server to a set of instances of an object.

Multipart/related
This is the type to be used when the first body part contains references to other parts which the server wishes to send at the same time. For example, the first part could be an HTML document, and the included body parts could be the inline images mentioned within the text. The body parts may have URI: fields if they have URIs, so that they may be referred to by these URIs within the body parts. If the body parts are transient (as in speech insertions in mail messages) then the [proposed] "cid:" URI type may be used to refer to them by content-identifier.

Multipart/mixed
This may be used to simply transfer an unrelated, unstructured set of objects.

Multipart/parallel
This may be used as in MIME to indicate simultaneous presentation. [It is the author's belief that this is a trivial case of a compound presentation which in general should be described by a script which would be the first body part of a multipart/related document.]

DATE: DATE
Creation date of object. (Or last modified, and separately have a Created: field?) Format as in RFC 850, but GMT MUST be used.

EXPIRES: DATE
Gives the date after which the information given ceases to be valid and should be retrieved again.
This allows control of caching mechanisms, and also allows for the periodic refreshing of displays of volatile data. Format as for Date:. This does NOT imply that the original object will cease to exist.

LAST-MODIFIED: DATE
Last time the object was modified, i.e. the date of this version if the document is a "living document". Format as for Date:.

MESSAGE-ID: URI
A unique identifier for the message. As in RFC 850, except that the unlimited lifetime of HTTP objects requires that the Message-ID be unique for all time, not just for two years. A document may have only one Message-ID. No two documents, even if they are different versions of the same live document, may have the same Message-ID.

Note: Unlike the URI field, this does not give a way of accessing the document, so the Message-ID cannot be used to refer to the document. In the case of NNTP articles, the Message-ID may in fact be used within the URI for retrieval using NNTP.

URI: 1*URI
This gives a URI with which the object may be found. There is no guarantee that the object can be retrieved using the URI specified. However, it is guaranteed that if an object is successfully retrieved using that URI it will be, to a certain given degree, the same object as this one. If the URI is used to refer to a set of variants, then the dimensions in which the variants may differ must be given with the "vary" parameter:

Syntax
    URI: uri [ ; vary = dimension [ , dimension ]* ]
    dimension = content-type | language | version

If no "vary" parameters are given, then the URI may not return anything other than the same bit stream as this object. Multiple occurrences of this field give alternative access names or addresses for the object.

Examples
    URI: http://info.cern.ch/pub/www/doc/url6.multi; vary=content-type
This indicates that retrieval given the URI will return the same document, never an updated version, but optionally in a different rendition.
    URI: http://info.cern.ch/pub/www/doc/url.multi; vary=content-type, language, version
This indicates that the URI will return the same document, possibly in a different rendition, possibly updated, and without excluding the provision of translations into different languages.
    URI: http://info.cern.ch/pub/www/doc/url6.ps
This indicates that accessing the URI in question will return exactly the same bit stream.

VERSION:
This is a string defining the version of an evolving object. Its format is currently undefined, and so it should be treated as opaque to the reader, defined by the information provider. The version field is used to indicate evolution along a single path of a particular work. It is NOT used to indicate derived works (use a link), translations, or renditions in different representations.

Note: It would be useful to have sufficient semantics to be able to deduce whether one version predated or postdated another. However, it may also be useful to be able to insert a particular local code management system's own version stamp in this field. Typically, publishers will have quite complex version information containing hidden local semantics, which gives value to the idea of this field being opaque to other readers of the document.

DERIVED-FROM:
When an edited object is resubmitted, using PUT for example, this field gives the value of the Version: field of the object from which it was derived. This typically allows a server to check, for example, that two concurrent modifications by different parties will not be lost, and to use established version management techniques to merge both modifications.

LANGUAGE: CODE
The language code is the ISO code for the language in which the document is written. If the language is not known, this field should be omitted. The language code is an ISO 639 language code with an optional ISO 3166 country code to specify a national variant.
Example
    Language: en_UK
means that the content of the message is in British English, while
    Language: en
means that the language is English in one of its forms. (@@ If a document is in more than one language, for example if it requires Greek, Latin and French to be understood, should this be representable?)

See also: Accept-Language.

COST: TBS
The cost of retrieving the object is given. This is the cost of access to a copyright work. Format of units to be specified. Currently refers to an unspecified charging scheme to be agreed out of band between parties.

WWW-LINK:
Note: It is proposed that any HTML metainformation element (allowed within the HEAD as opposed to the BODY element of the document) be a valid candidate for an HTTP object header. WWW-Link is a required example. The suggestion was that the isomorphism should be realized by prepending "WWW-" to the HTML element name to make the HTTP header name, with the HTML attributes mapping to identically named semicolon-separated MIME-style header parameters. Other clear candidates include WWW-Title. It is open to discussion whether the "WWW-" should be removed.

This is semantically equivalent to the LINK element in an HTML document, which should be consulted for a full explanation.

Examples
    WWW-Link: href="http://info.cern.ch/a/b/c"; rel="includes"
    WWW-Link: href="mailto:timbl@info.cern.ch"; rev=made
The first example indicates that this object includes the specified /a/b/c object. The second indicates that the author of the object is identified by the given mail address.

Note: Client tolerance of bad servers
Servers not implementing the specification as written are not HTTP compliant. Servers should always be made completely compliant. However, clients should also tolerate deviant servers where possible.

BACK COMPATIBILITY
In order that clients using the HTTP protocol should be able to communicate with servers using the protocol originally implemented in the W3 data model, clients should tolerate responses which do not start with a numeric version number and response code. In this case, they should assume that the rest of the response is a document body of type text/html.

WHITE SPACE
Clients should be tolerant in parsing response status lines. In particular, they should accept any sequence of white space (SP and TAB) characters between fields. Lines should be regarded as terminated by the Line Feed, and the preceding Carriage Return character ignored.

HTTP NEGOTIATION ALGORITHM
This note defines the significance of the q, mxb and mxs values optionally sent in the Accept: field of the HTTP protocol request message.

It is assumed that there is a certain value of the presentation of the document, optimally rendered using all the information available in its original source. It is further assumed that one can allocate a number between 0 and 1 to represent the loss of value which occurs when a document is rendered into a representation with loss of information. Whilst this is a very subjective measurement, and in fact largely a function of the document in question, the approximation is made that one can define this "degradation" figure as a function of merely the representation involved.

The next assumption is that the other cost to the user of viewing the document is a function of the time taken for presentation. We first assume that the cost is linear in time, and then assume that the time is linear in the size of the message. The final net value to the user can therefore be written

    presented_value = initial_value * total_degradation - a - b * size

for a document in a given incoming representation. Suppose we normalize the initial value of the document to be 1.
The server may judge that the value in a particular format is less than 1 if a conversion on the server side has lost information. The total degradation is then the product of any degradation due to conversions internal to the server, and the degradation "q" sent in the Accept: field. If q is not sent, it defaults to 1.

The values of a and b have components from processing time on the server, network delays, and processing time on the client. These delays are not additive, as a good system will pipeline the processing, and whilst the result may be linear in message size, calculation of it in advance is not simple. The amount of pipelining and the loads on machines and network are all difficult to predict, so a very rough assumption must be made.

We make the client responsible for taking into account network delays. The client will in fact be in a better position to do this, as after one transaction the client will be aware of the round-trip time. We assume that the delays imposed by the server and by the client (including network) are additive. We assume that the client's delay is proportional to message size.

The three parameters given by the client to the server are:

q
    The degradation (quality) factor, between 0 and 1. If omitted, 1 is assumed.
mxb
    The size of message (in bytes) which, even if immediately available from the server, will cause the value to the reader to become zero.
mxs
    The delay (in seconds) which, even for a very small message with no length-related penalty, will cause the value to the reader to become zero.

These parameters are chosen in part because they are easy to visualize as the largest tolerable size and delay. If not sent, mxb and mxs default to infinity.

The server may optimize the presented value for the user when deciding what to return. The hope is that fine decisions will not have to be made, as in most cases the results for different formats will be very different, and there will be a clear winner.
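
As a rough illustration of how a server might apply q, mxb and mxs, the Python sketch below scores each candidate representation with linear size and delay penalties and returns the highest-scoring one. The function names, the tuple layout of a variant and the sample numbers are hypothetical and are not defined by the protocol; the linear form of the penalty is an assumption of this sketch.

```python
import math

def presented_value(u, q=1.0, size_bytes=0, delay_s=0.0,
                    mxb=math.inf, mxs=math.inf):
    """Assumed linear model: v = u*q - b/mxb - t/mxs.

    u          value left after any server-side conversion (1 = no loss)
    q          client's quality factor for this content type (default 1)
    size_bytes transfer length b
    delay_s    time t before the first byte is available
    mxb, mxs   client limits; infinity (the default) means no penalty
    """
    return u * q - size_bytes / mxb - delay_s / mxs

def best_variant(variants, mxb=math.inf, mxs=math.inf):
    """variants: iterable of (name, u, q, size_bytes, delay_s) tuples (hypothetical shape)."""
    return max(
        variants,
        key=lambda v: presented_value(v[1], v[2], v[3], v[4], mxb, mxs),
    )[0]

# Illustrative numbers only.
variants = [
    ("application/postscript", 1.0, 1.0, 400_000, 0.5),
    ("text/plain",             0.8, 0.5,  20_000, 0.1),
]
choice_fast_link = best_variant(variants, mxb=1_000_000, mxs=10)
choice_slow_link = best_variant(variants, mxb=100_000, mxs=10)
```

With a generous mxb the full-fidelity variant wins; with a small mxb its size penalty dominates and the smaller, degraded variant is chosen instead.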

A suitable algorithm is that the assumed value v of a document of initial value u, delivered to the network after a delay t, whose transfer length on the net is b bytes, is

    v = u * q - b/mxb - t/mxs

Note that t is the time from the arrival of the request to the first byte being available on the net.

See also: Design issues discussions around this point.

HTTP Negotiation algorithm
This note defines the significance of the d, a and b values optionally sent in the Accept: field of the HTTP protocol request message.

It is assumed that there is a certain value of the presentation of the document, optimally rendered using all the information available in its original source. It is further assumed that one can allocate a number between 0 and 1 to represent the loss of value which occurs when a document is rendered into a representation with loss of information. Whilst this is a very subjective measurement, and in fact largely a function of the document in question, the approximation is made that one can define this "degradation" figure as a function of merely the representation involved.

The next assumption is that the other cost to the user of viewing the document is a function of the time taken for presentation. We first assume that the cost is linear in time, and then assume that the time is linear in the size of the message. The final net value to the user can therefore be written

    presented_value = initial_value * total_degradation - a - b * size

for a document in a given incoming representation. Suppose we normalize the initial value of the document to be 1. The total degradation is the product of any degradation due to conversions internal to the server, and the degradation "d" sent in the Accept: field. The values of a and b are also sent by the protocol. If not sent, they default to no cost (d=1, a=b=0).

The server may optimize the presented value for the user when deciding what to return.
The hope is that fine decisions will not have to be made, as in most cases the results for different formats will be very different, and there will be a clear winner.

See also: Design issues discussions around this point.

Tim BL

Note: The cost of retrieval time
The assumption that the cost to the user associated with a certain retrieval time is linear in that time is wildly inaccurate. The real function could be very dependent on circumstances (for example, going to infinity at a deadline). A better general approximation might be logarithmic for large time delays, and linear for small ones, like a*log(b*t-1), which has two parameters.

REGISTRATION AUTHORITY
The HTTP Registration Authority is responsible for maintaining lists of:

    Charge account name spaces (see ChargeTo: field above)
    Authorization schemes (see Authorization: field above)
    Data format names (as MIME Content-Types)
    Data encoding names (as MIME Content-Encoding)

It is proposed that the Internet Assigned Numbers Authority or their successors take this role. Unregistered values may be used for experimental purposes if they start with "X-".

SECURITY CONSIDERATIONS
The HTTP protocol allows requests to be communicated to a remote server machine, and all the expected security considerations for client-server systems apply, including:

    Authentication of requests
    Authentication of servers
    Privacy of request and response
    Abuse of server features
    Abuse of servers by exploiting server bugs
    Unwitting actions on the net
    Abuse of log information

The bulk of these are well-known problems, tackled in part by some features of this protocol. Some aspects particular to HTTP are mentioned below.

Unwitting actions on the net
The writers of client software should be aware that the software represents the user in his interactions over the net, and should be careful to make the user aware of any action he takes which may be given an unexpected significance by others.

TCP PORT NUMBERS
Clients should prompt a user before allowing HTTP access to reserved ports other than the port reserved for HTTP (port 80). Otherwise, the user may unwittingly cause a transaction to occur in some other (present or future) protocol.

IDEMPOTENT METHODS
The convention should be established that the GET and HEAD methods never have the significance of taking an action. A link "click here to subscribe" which causes the reading of a special "magic" document is open to abuse by others making a link "click here to see a pretty picture" that points to the same document. These methods should be considered "safe" and should not have side effects. This allows the client software to represent other methods (such as POST) in a special way, so that the user is aware of the fact that an action is being requested.

Abuse of log information
A server is in a position to save large amounts of personal data about the information requested by different readers. This information is clearly confidential in nature, and its handling may be constrained by law in certain countries. Server providers shall ensure that such material is not distributed without the permission of any people or groups of people mentioned in the results published.

A feature which increases the amount of personal data transferred is the Referer: field. This allows reading patterns to be studied and reverse links to be drawn, and so is very useful. Its power can of course be abused if user details are not separated from the Referee-Referer pairs. Even when the personal information has been removed, the Referer: field may in fact be a secure document's URI, whose revelation is itself a breach of security. A method of suppressing the Referer: information in such cases may be the subject of further study.

REFERENCES
RFC 822 "Standard for ARPA Internet Text Messages", David H. Crocker. Describes the Internet mail message format.
RFC 850 "Standard for Interchange of USENET Messages". This RFC uses some field names in common with this specification, and is relevant reading.
RFC 977 "Network News Transfer Protocol", Kantor and Lapsley.
RFC 1341 "Multipurpose Internet Mail Extensions (MIME)", Nathaniel Borenstein and Ned Freed, 1992. Now obsoleted by RFC 1521.
RFC 1521 MIME. Not available in hypertext form yet.
RFC 1522 K. Moore, "MIME: Part Two: Message Header Extensions for Non-ASCII Text".
RFC 1523 MIME text/enriched.
RFC 1524 MIME...
URL Universal Resource Locators. RFCxxx. Currently available by anonymous FTP from info.cern.ch as /pub/ietf/url3.{ps,txt}.
MIME and PEM Internet Draft only.

OBJECT CONTENTS
The data (if any) sent with an HTTP request or reply is in a format and encoding defined by the object header fields, the default being "text/plain" type with "8bit" encoding. Note that while all the other information in the request (just as in the reply) is in ISO Latin1 with lines delimited by Carriage Return/Line Feed pairs, the data may contain 8-bit binary data.

Termination
The delimiting of the message is determined by the Content-Length: field. If this is present, then the message contains the specified number of bytes. Failing that, the Content-Type: field may contain a "boundary" attribute which gives the boundary string with exactly the same syntax as for a MIME multipart message. If neither of the above conditions holds, the data is terminated by the closing of the connection by the sending party. Note that this last method cannot be used for data sent with the request.

See also: note on server tolerance for back-compatibility, etc.
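
The order of precedence described under Termination can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name, the dictionary of lower-cased header names, and the returned tuples are all assumptions made for the example.

```python
import re

def choose_framing(headers, is_request=False):
    """Decide how the end of the data is found.

    headers: dict mapping lower-cased field names to raw values (assumed shape).
    Returns ("length", n), ("boundary", s) or ("close", None).
    """
    # 1. Content-Length takes precedence when present.
    length = headers.get("content-length")
    if length is not None and length.strip().isdigit():
        return ("length", int(length.strip()))

    # 2. Otherwise look for a boundary attribute on the content type.
    content_type = headers.get("content-type", "")
    match = re.search(r';\s*boundary\s*=\s*"?([^";]+)"?', content_type,
                      re.IGNORECASE)
    if match:
        return ("boundary", match.group(1).strip())

    # 3. Closing the connection is only usable for replies, because a
    #    client that closes the connection cannot receive the response.
    if is_request:
        raise ValueError("request data needs Content-Length or a boundary")
    return ("close", None)
```

For example, `choose_framing({"content-length": "42"})` selects length-based framing, while an empty header set on a reply falls through to reading until the connection closes.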