[ Contents ]
3. Data Element Formats This section describes the data elements used by OpenPGP. 3.1. Scalar numbers Scalar numbers are unsigned, and are always stored in big-endian format. Using n[k] to refer to the kth octet being interpreted, the value of a two-octet scalar is ((n[0] << 8) + n[1]). The value of a four-octet scalar is ((n[0] << 24) + (n[1] << 16) + (n[2] << 8) + n[3]). 3.2. Multi-Precision Integers Multi-Precision Integers (also called MPIs) are unsigned integers used to hold large integers such as the ones used in cryptographic calculations. An MPI consists of two pieces: a two-octet scalar that is the length of the MPI in bits followed by a string of octets that contain the actual integer. These octets form a big-endian number; a big-endian number can be made into an MPI by prefixing it with the appropriate length. Examples: (all numbers are in hexadecimal) The string of octets [00 01 01] forms an MPI with the value 1. The string [00 09 01 FF] forms an MPI with the value of 511. Additional rules: The size of an MPI is ((MPI.length + 7) / 8) + 2 octets. The length field of an MPI describes the length starting from its most significant non-zero bit. Thus, the MPI [00 02 01] is not formed correctly. It should be [00 01 01]. 3.3. Key IDs A Key ID is an eight-octet scalar that identifies a key. Implementations SHOULD NOT assume that Key IDs are unique. The section, "Enhanced Key Formats" below describes how Key IDs are formed. 3.4. Text The default character set for text is the UTF-8 [RFC2279] encoding of Unicode [ISO10646]. 3.5. Time fields A time field is an unsigned four-octet number containing the number of seconds elapsed since midnight, 1 January 1970 UTC. 3.6. String-to-key (S2K) specifiers String-to-key (S2K) specifiers are used to convert passphrase strings into symmetric-key encryption/decryption keys. They are used in two places, currently: to encrypt the secret part of private keys in the private keyring, and to convert passphrases to encryption keys for symmetrically encrypted messages. 3.6.1. String-to-key (S2k) specifier types There are three types of S2K specifiers currently supported, as follows: 3.6.1.1. Simple S2K This directly hashes the string to produce the key data. See below for how this hashing is done. Octet 0: 0x00 Octet 1: hash algorithm Simple S2K hashes the passphrase to produce the session key. The manner in which this is done depends on the size of the session key (which will depend on the cipher used) and the size of the hash algorithm's output. If the hash size is greater than or equal to the session key size, the high-order (leftmost) octets of the hash are used as the key. If the hash size is less than the key size, multiple instances of the hash context are created -- enough to produce the required key data. These instances are preloaded with 0, 1, 2, ... octets of zeros (that is to say, the first instance has no preloading, the second gets preloaded with 1 octet of zero, the third is preloaded with two octets of zeros, and so forth). As the data is hashed, it is given independently to each hash context. Since the contexts have been initialized differently, they will each produce different hash output. Once the passphrase is hashed, the output data from the multiple hashes is concatenated, first hash leftmost, to produce the key data, with any excess octets on the right discarded. 3.6.1.2. Salted S2K This includes a "salt" value in the S2K specifier -- some arbitrary data -- that gets hashed along with the passphrase string, to help prevent dictionary attacks. Octet 0: 0x01 Octet 1: hash algorithm Octets 2-9: 8-octet salt value Salted S2K is exactly like Simple S2K, except that the input to the hash function(s) consists of the 8 octets of salt from the S2K specifier, followed by the passphrase. 3.6.1.3. Iterated and Salted S2K This includes both a salt and an octet count. The salt is combined with the passphrase and the resulting value is hashed repeatedly. This further increases the amount of work an attacker must do to try dictionary attacks. Octet 0: 0x03 Octet 1: hash algorithm Octets 2-9: 8-octet salt value Octet 10: count, a one-octet, coded value The count is coded into a one-octet number using the following formula: #define EXPBIAS 6 count = ((Int32)16 + (c & 15)) << ((c >> 4) + EXPBIAS); The above formula is in C, where "Int32" is a type for a 32-bit integer, and the variable "c" is the coded count, Octet 10. Iterated-Salted S2K hashes the passphrase and salt data multiple times. The total number of octets to be hashed is specified in the encoded count in the S2K specifier. Note that the resulting count value is an octet count of how many octets will be hashed, not an iteration count. Initially, one or more hash contexts are set up as with the other S2K algorithms, depending on how many octets of key data are needed. Then the salt, followed by the passphrase data is repeatedly hashed until the number of octets specified by the octet count has been hashed. The one exception is that if the octet count is less than the size of the salt plus passphrase, the full salt plus passphrase will be hashed even though that is greater than the octet count. After the hashing is done the data is unloaded from the hash context(s) as with the other S2K algorithms. 3.6.2. String-to-key usage Implementations SHOULD use salted or iterated-and-salted S2K specifiers, as simple S2K specifiers are more vulnerable to dictionary attacks. 3.6.2.1. Secret key encryption An S2K specifier can be stored in the secret keyring to specify how to convert the passphrase to a key that unlocks the secret data. Older versions of PGP just stored a cipher algorithm octet preceding the secret data or a zero to indicate that the secret data was unencrypted. The MD5 hash function was always used to convert the passphrase to a key for the specified cipher algorithm. For compatibility, when an S2K specifier is used, the special value 255 is stored in the position where the hash algorithm octet would have been in the old data structure. This is then followed immediately by a one-octet algorithm identifier, and then by the S2K specifier as encoded above. Therefore, preceding the secret data there will be one of these possibilities: 0: secret data is unencrypted (no pass phrase) 255: followed by algorithm octet and S2K specifier Cipher alg: use Simple S2K algorithm using MD5 hash This last possibility, the cipher algorithm number with an implicit use of MD5 and IDEA, is provided for backward compatibility; it MAY be understood, but SHOULD NOT be generated, and is deprecated. These are followed by an 8-octet Initial Vector for the decryption of the secret values, if they are encrypted, and then the secret key values themselves. 3.6.2.2. Symmetric-key message encryption OpenPGP can create a Symmetric-key Encrypted Session Key (ESK) packet at the front of a message. This is used to allow S2K specifiers to be used for the passphrase conversion or to create messages with a mix of symmetric-key ESKs and public-key ESKs. This allows a message to be decrypted either with a passphrase or a public key. PGP 2.X always used IDEA with Simple string-to-key conversion when encrypting a message with a symmetric algorithm. This is deprecated, but MAY be used for backward-compatibility.
Updated: 1999-09-30 wkoch