|
| 1 | +====================== |
| 2 | +Basics of ECC handling |
| 3 | +====================== |
| 4 | + |
| 5 | +The :term:`ECC`, as any asymmetric cryptography system, deals with private |
| 6 | +keys and public keys. Private keys are generally used to create signatures, |
| 7 | +and are kept, as the name suggests, private. That's because possession of a |
| 8 | +private key allows creating a signature that can be verified with a public key. |
| 9 | +If the public key is associated with an identity (like a person or an |
| 10 | +institution), possession of the private key allows impersonating |
| 11 | +that identity. |
| 12 | + |
| 13 | +The public keys, on the other hand, are widely distributed, and they don't |
| 14 | +have to be kept private. Their primary purpose is to allow |
| 15 | +checking if a given signature was made with the associated private key. |
| 16 | + |
| 17 | +Number representations |
| 18 | +====================== |
| 19 | + |
| 20 | +At a lower level, the private key is a single number, usually about the |
| 21 | +size of the curve: a NIST P-256 private key will have a size of 256 bits, |
| 22 | +though as it needs to be selected randomly, it may be a slightly smaller |
| 23 | +number (255-bit, 248-bit, etc.). |
| 24 | +Public points are a pair of numbers. That pair specifies a point on an |
| 25 | +elliptic curve (a pair of integers that satisfy the curve equation). |
| 26 | +Those two numbers are similarly close in size to the curve size, so both the |
| 27 | +``x`` and ``y`` coordinates of a NIST P-256 point will also be around 256 bits in |
| 28 | +size. |
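The size relationship described above can be illustrated with a minimal sketch. It assumes the private key is drawn uniformly from the range ``[1, order - 1]``; the helper name ``random_private_key`` is hypothetical and not part of any library.

```python
import secrets

# Order of the NIST P-256 base point (the curve *order*), a 256-bit number.
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def random_private_key(order):
    """Return a uniformly random integer in the range [1, order - 1].

    Hypothetical helper for illustration only.
    """
    return secrets.randbelow(order - 1) + 1


private_key = random_private_key(P256_ORDER)
# The number of bits actually needed varies from key to key, but it can
# never exceed the bit length of the order.
print(private_key.bit_length())
```

Running it several times should print values at or slightly below 256, as the random number sometimes has leading zero bits.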
| 29 | + |
| 30 | +.. note:: |
| 31 | + To be more precise, the size of the private key is related to the |
| 32 | + curve *order*, i.e. the number of points on a curve. The coordinates |
| 33 | + of the curve depend on the *field* of the curve, which usually means the |
| 34 | + size of the *prime* used for operations on points. While the *order* and |
| 35 | + the *prime* size are related and fairly close in size, it's possible |
| 36 | + to have a curve where either of them is larger by a bit (e.g. |
| 37 | + it's possible to have a curve that uses a 256 bit *prime* that has a 257 bit |
| 38 | + *order*). |
| 39 | + |
| 40 | +Since normally computers work with much smaller numbers, like 32 bit or 64 bit, |
| 41 | +we need to use special approaches to represent numbers that are hundreds of |
| 42 | +bits large. |
| 43 | + |
| 44 | +The first decision is whether the numbers should be stored in big |
| 45 | +endian format or in little endian format. In big endian, the most |
| 46 | +significant bytes are stored first, so a number like :math:`2^{16}` is saved |
| 47 | +as three bytes: a byte with value 1 and then two bytes with value 0. |
| 48 | +In little endian format the least significant bytes are stored first, so |
| 49 | +a number like :math:`2^{16}` would be stored as three bytes: |
| 50 | +first two bytes with value 0, then a byte with value 1. |
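As a small illustration of the two byte orders, the following sketch uses only the built-in ``int.to_bytes`` and ``int.from_bytes`` methods of Python to encode :math:`2^{16}` in three bytes.

```python
value = 2 ** 16

big = value.to_bytes(3, "big")        # most significant byte first
little = value.to_bytes(3, "little")  # least significant byte first

print(big.hex())     # 010000
print(little.hex())  # 000001

# Decoding with the matching byte order recovers the original number.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```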
| 51 | + |
| 52 | +For :term:`ECDSA`, big endian encoding is usually used; for :term:`EdDSA`, |
| 53 | +little endian encoding is usually used. |
| 54 | + |
| 55 | +Secondly, we need to decide if the numbers need to be stored as fixed length |
| 56 | +strings (zero padded if necessary), or if they should be stored with |
| 57 | +the minimal number of bytes necessary. |
| 58 | +That depends on the format and the place it's used. Some require strict |
| 59 | +sizes (so even if the number encoded is 1, but the curve used is 128 bit large, |
| 60 | +that number 1 still needs to be encoded with 16 bytes, with the fifteen most |
| 61 | +significant bytes equal to zero). |
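The difference between the two choices can be sketched as follows. The helper names ``encode_fixed`` and ``encode_minimal`` are hypothetical, and big endian byte order is assumed.

```python
def encode_fixed(number, length):
    """Encode as exactly `length` bytes, zero padded on the left.

    Raises OverflowError if the number does not fit.
    """
    return number.to_bytes(length, "big")


def encode_minimal(number):
    """Encode with the fewest bytes that can hold the number (at least one)."""
    length = max(1, (number.bit_length() + 7) // 8)
    return number.to_bytes(length, "big")


# The number 1 on a 128-bit curve: fifteen zero bytes followed by 0x01.
print(encode_fixed(1, 16).hex())
# The same number with minimal length is a single byte.
print(encode_minimal(1).hex())
```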
| 62 | + |
| 63 | +Public key encoding |
| 64 | +=================== |
| 65 | + |
| 66 | +Generally, public keys (i.e. points) are expressed as fixed size byte strings. |
| 67 | + |
| 68 | +While public keys can be saved as two integers, one to represent the |
| 69 | +``x`` coordinate and one to represent ``y`` coordinate, that actually |
| 70 | +provides a lot of redundancy. Because of the specifics of elliptic curves, |
| 71 | +for every valid ``x`` value there are only two valid ``y`` values. |
| 72 | +Moreover, if you have an ``x`` value, you can compute those two possible |
| 73 | +``y`` values (if they exist). |
| 74 | +As such, it's possible to save just the ``x`` coordinate and the sign |
| 75 | +of the ``y`` coordinate (as the two possible values are negatives of |
| 76 | +each other: :math:`y_1 = -y_2`). |
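To make the relationship between ``x`` and the two ``y`` values concrete, here is a sketch on a tiny textbook curve, :math:`y^2 = x^3 + 2x + 2` over the integers modulo 17. The curve and the brute-force search are assumptions chosen for readability; real curves use primes hundreds of bits large and a modular square root algorithm instead.

```python
# Toy curve y^2 = x^3 + 2x + 2 over the integers modulo 17.
# Far too small for real use; chosen so the numbers can be checked by hand.
P, A, B = 17, 2, 2


def y_candidates(x):
    """Return every y in [0, P) for which (x, y) satisfies the curve equation."""
    rhs = (x ** 3 + A * x + B) % P
    return [y for y in range(P) if (y * y) % P == rhs]


print(y_candidates(5))  # [1, 16]
print(y_candidates(1))  # []
```

For ``x = 5`` the two results, 1 and 16, are negatives of each other modulo 17. For ``x = 1`` no ``y`` exists, so that ``x`` does not correspond to any point on this curve.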
| 77 | + |
| 78 | +That gives us a few options to represent the public point; the most common are: |
| 79 | + |
| 80 | +1. As a concatenation of two fixed-length big-endian integers, so called |
| 81 | + :term:`raw encoding`. |
| 82 | +2. As a concatenation of two fixed-length big-endian integers prefixed with |
| 83 | + the type of the encoding, so called :term:`uncompressed` point |
| 84 | + representation (the type is represented by a 0x04 byte). |
| 85 | +3. As a fixed-length big-endian integer representing the ``x`` coordinate |
| 86 | + prefixed with the byte representing the combined type of the encoding |
| 87 | + and the sign of the ``y`` coordinate, so called :term:`compressed` |
| 88 | + point representation (the type is then represented by a 0x02 or a 0x03 |
| 89 | + byte). |
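The three representations listed above can be sketched as one function. The name ``encode_point`` is hypothetical. The sketch assumes the convention from the SEC 1 standard, where the "sign" of ``y`` is carried by its parity: 0x02 for an even ``y`` and 0x03 for an odd ``y``. The example coordinates are toy values, not a point on a real curve.

```python
def encode_point(x, y, length, encoding="uncompressed"):
    """Encode the point (x, y) using `length` bytes per coordinate."""
    x_bytes = x.to_bytes(length, "big")
    y_bytes = y.to_bytes(length, "big")
    if encoding == "raw":
        # Two fixed-length integers, no prefix.
        return x_bytes + y_bytes
    if encoding == "uncompressed":
        # The 0x04 prefix marks the presence of both coordinates.
        return b"\x04" + x_bytes + y_bytes
    if encoding == "compressed":
        # The prefix combines the encoding type with the parity of y.
        prefix = b"\x03" if y & 1 else b"\x02"
        return prefix + x_bytes
    raise ValueError("unknown encoding: " + encoding)


# Toy coordinates with one byte each; a NIST P-256 point would use length=32.
print(encode_point(5, 1, 1, "raw").hex())           # 0501
print(encode_point(5, 1, 1, "uncompressed").hex())  # 040501
print(encode_point(5, 1, 1, "compressed").hex())    # 0305
print(encode_point(5, 16, 1, "compressed").hex())   # 0205
```

For a 256-bit curve this gives 64 bytes for the raw form, 65 bytes for the uncompressed form and 33 bytes for the compressed form.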
| 90 | + |
| 91 | +Interoperable file formats |
| 92 | +========================== |
| 93 | + |
| 94 | +Now, while we can save the byte strings as-is and "remember" which curve |
| 95 | +was used to generate those private and public keys, interoperability usually |
| 96 | +requires also saving information about the curve together with the |
| 97 | +corresponding key. Here too there are many ways to do it: |
| 98 | +save the parameters of the used curve explicitly, use the name of the |
| 99 | +well-known curve as a string, use a numerical identifier of the well-known |
| 100 | +curve, etc. |
| 101 | + |
| 102 | +For public keys the most interoperable format is the one described |
| 103 | +in RFC5912 (look for SubjectPublicKeyInfo structure). |
| 104 | +For private keys, the RFC5915 format (also known as the ssleay format) |
| 105 | +and the PKCS#8 format (described in RFC5958) are the most popular. |
| 106 | + |
| 107 | +All three formats effectively support two ways of providing the information |
| 108 | +about the curve used: by specifying the curve parameters explicitly or |
| 109 | +by specifying the curve using ASN.1 OBJECT IDENTIFIER (OID), which is |
| 110 | +called ``named_curve``. ASN.1 OIDs are a hierarchical system of representing |
| 111 | +types of objects, for example, NIST P-256 curve is identified by the |
| 112 | +1.2.840.10045.3.1.7 OID (in dotted-decimal formatting of the OID, also |
| 113 | +known by the ``prime256v1`` OID node name or short name). Those OIDs |
| 114 | +uniquely identify a particular curve, but the receiver needs to know |
| 115 | +which numerical OID maps to which curve parameters. While the prospect of |
| 116 | +using the explicit encoding, where all the needed parameters are provided, |
| 117 | +is tempting, the downside is that curve parameters may specify a *weak* |
| 118 | +curve, which is easy to attack and break (that is, to deduce the private key |
| 119 | +from the public key). Verifying curve parameters is complex and computationally |
| 120 | +expensive, thus protocols generally use a few specific curves and require |
| 121 | +all implementations to carry their parameters. As such, use of |
| 122 | +``named_curve`` parameters is generally recommended. |
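To show what a ``named_curve`` identifier looks like on the wire, the following sketch converts a dotted-decimal OID into its DER bytes. The helper names are hypothetical. It assumes the OID body is shorter than 128 bytes and that the first two arcs fit in a single byte, which holds for the curve OIDs discussed here.

```python
def encode_base128(number):
    """Encode one OID arc in base 128, high bit set on all but the last byte."""
    chunks = [number & 0x7F]
    number >>= 7
    while number:
        chunks.append((number & 0x7F) | 0x80)
        number >>= 7
    return bytes(reversed(chunks))


def encode_oid(dotted):
    """Return the DER encoding (tag, length, value) of a dotted-decimal OID."""
    arcs = [int(part) for part in dotted.split(".")]
    # The first two arcs share a single byte: 40 * first + second.
    body = bytes([40 * arcs[0] + arcs[1]])
    for arc in arcs[2:]:
        body += encode_base128(arc)
    # 0x06 is the OBJECT IDENTIFIER tag; short-form length is assumed.
    return bytes([0x06, len(body)]) + body


print(encode_oid("1.2.840.10045.3.1.7").hex())  # 06082a8648ce3d030107
```

The ten bytes printed are what identifies NIST P-256 inside a key file; the receiver has to map them back to the curve parameters on its own.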
| 123 | + |
| 124 | +All of the mentioned formats specify a binary encoding, called DER. That |
| 125 | +encoding uses bytes with all possible numerical values, which means it's not |
| 126 | +possible to embed it directly in text files. For uses where it's useful to |
| 127 | +limit bytes to printable characters, so that the keys can be embedded in text |
| 128 | +files or text-only protocols (like email), the PEM formatting of the |
| 129 | +DER-encoded data can be used. The PEM formatting is just a base64 encoding |
| 130 | +with appropriate header and footer. |
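The PEM wrapping can be sketched with the standard ``base64`` module. The helper names are hypothetical, the 64 character line width and the ``PUBLIC KEY`` label follow common practice for SubjectPublicKeyInfo files, and the DER bytes below are placeholders rather than a real key.

```python
import base64


def der_to_pem(der, label):
    """Wrap DER bytes in a PEM header and footer, 64 characters per line."""
    b64 = base64.b64encode(der).decode("ascii")
    lines = [b64[i:i + 64] for i in range(0, len(b64), 64)]
    body = "\n".join(lines)
    return f"-----BEGIN {label}-----\n{body}\n-----END {label}-----\n"


def pem_to_der(pem):
    """Strip the header and footer lines and decode the base64 body."""
    lines = [line for line in pem.strip().splitlines()
             if not line.startswith("-----")]
    return base64.b64decode("".join(lines))


# Placeholder bytes standing in for a DER-encoded key (not a valid key).
der = bytes(range(100))
pem = der_to_pem(der, "PUBLIC KEY")
print(pem)
assert pem_to_der(pem) == der
```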
| 131 | + |
| 132 | +Signature formats |
| 133 | +================= |
| 134 | + |
| 135 | +Finally, ECDSA signatures at the lowest level are a pair of numbers, usually |
| 136 | +called ``r`` and ``s``. While ``r`` is derived from the ``x`` coordinate of a |
| 137 | +special point on the curve, both are saved modulo *order* of the curve, not |
| 138 | +modulo *prime* of the curve (as a coordinate needs to be). |
| 139 | + |
| 140 | +That again means we have multiple ways of encoding those two numbers. |
| 141 | +The two most popular formats are to save them as a concatenation of big-endian |
| 142 | +integers of fixed size (determined by the curve *order*) or as a DER |
| 143 | +structure with two INTEGERs. |
| 144 | +The first of those is called the :term:`raw encoding` inside the Python |
| 145 | +ecdsa library. |
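A minimal sketch of the fixed-size concatenation follows. The function names are hypothetical and are not the API of the Python ecdsa library; the length of each half is derived from the curve *order*, as described above.

```python
# Order of the NIST P-256 base point.
P256_ORDER = 0xFFFFFFFF00000000FFFFFFFFFFFFFFFFBCE6FAADA7179E84F3B9CAC2FC632551


def encode_raw_signature(r, s, order):
    """Concatenate r and s as fixed-size big-endian integers."""
    length = (order.bit_length() + 7) // 8
    return r.to_bytes(length, "big") + s.to_bytes(length, "big")


def decode_raw_signature(signature, order):
    """Split a raw signature back into the (r, s) pair."""
    length = (order.bit_length() + 7) // 8
    if len(signature) != 2 * length:
        raise ValueError("unexpected signature length")
    r = int.from_bytes(signature[:length], "big")
    s = int.from_bytes(signature[length:], "big")
    return r, s


# Even numerically small values always produce 64 bytes for P-256.
raw = encode_raw_signature(1, 255, P256_ORDER)
print(len(raw))  # 64
assert decode_raw_signature(raw, P256_ORDER) == (1, 255)
```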
| 146 | + |
| 147 | +As the ASN.1 signature format requires the encoding of INTEGERs, and DER INTEGERs |
| 148 | +must use the fewest possible number of bytes, a numerically small value of |
| 149 | +``r`` or ``s`` will require fewer |
| 150 | +bytes to represent in the DER structure. Thus, DER encoding isn't fixed |
| 151 | +size for a given curve, but has a maximum possible size. |
| 152 | + |
| 153 | +.. note:: |
| 154 | + |
| 155 | + As DER INTEGER uses so-called two's complement representation of |
| 156 | + numbers, the most significant bit of the most significant byte |
| 157 | + represents the *sign* of the number. If that bit is set, then the |
| 158 | + number is considered to be negative. Thus, to represent a number like |
| 159 | + 255, which in binary representation is 0b11111111 (i.e. a byte with all |
| 160 | + bits set high), the DER encoding of it will require two bytes, one |
| 161 | + zero byte to make sure the sign bit is 0, and a byte with value 255 to |
| 162 | + encode the numerical value of the integer. |
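The sign byte rule and the resulting variable size can be sketched as below. The helper names are hypothetical, only non-negative integers are handled, and the signature is assumed to be a plain SEQUENCE of two INTEGERs.

```python
def der_length(length):
    """Encode a DER length in short or long form."""
    if length < 0x80:
        return bytes([length])
    encoded = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(encoded)]) + encoded


def der_integer(value):
    """Encode a non-negative integer as a DER INTEGER."""
    if value < 0:
        raise ValueError("only non-negative integers are handled here")
    # bit_length() // 8 + 1 bytes is the minimal size that keeps the
    # most significant (sign) bit clear.
    body = value.to_bytes(value.bit_length() // 8 + 1, "big")
    return b"\x02" + der_length(len(body)) + body


def der_signature(r, s):
    """Encode (r, s) as a DER SEQUENCE of two INTEGERs."""
    body = der_integer(r) + der_integer(s)
    return b"\x30" + der_length(len(body)) + body


print(der_integer(127).hex())       # 02017f
print(der_integer(255).hex())       # 020200ff
print(der_signature(1, 255).hex())  # 3007020101020200ff
```

The value 127 fits in one byte, while 255 needs the extra zero byte described in the note, so two signatures made with the same curve can differ in length.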