clarified blob encoding
parent
e9b1688c51
commit
ee377f46c3
2 changed files with 37 additions and 43 deletions
40
index.html
|
@ -893,7 +893,7 @@ OP_SUPPORT_CLAIM <name> <claimID> OP_2DROP OP_DROP <outputScript&
|
||||||
|
|
||||||
<h4 id="blobs">Blobs</h4>
|
<h4 id="blobs">Blobs</h4>
|
||||||
|
|
||||||
<p>The smallest unit of data is called a <em>blob</em>. A blob is an encrypted chunk of data up to 2MiB in size. Each blob is indexed by its <em>blob hash</em>, which is a SHA384 hash of the blob. Addressing blobs by their hash protects against naming collisions and ensures that data cannot be accidentally or maliciously modified.</p>
|
<p>The smallest unit of data is called a <em>blob</em>. A blob is an encrypted chunk of data up to 2MiB in size. Each blob is indexed by its <em>blob hash</em>, which is a SHA-384 hash of the blob. Addressing blobs by their hash protects against naming collisions and ensures that data cannot be accidentally or maliciously modified.</p>
|
||||||
|
|
||||||
<p>Blobs are encrypted using AES-256 in CBC mode with PKCS7 padding. In order to keep each encrypted blob at 2MiB max, a blob can hold at most 2097151 bytes (2MiB minus 1 byte) of plaintext data. The source code for the exact algorithm is available <a href="https://github.com/lbryio/lbry.go/blob/master/stream/blob.go">here</a>. The encryption key and initialization vector for each blob are stored as described below.</p>
|
<p>Blobs are encrypted using AES-256 in CBC mode with PKCS7 padding. In order to keep each encrypted blob at 2MiB max, a blob can hold at most 2097151 bytes (2MiB minus 1 byte) of plaintext data. The source code for the exact algorithm is available <a href="https://github.com/lbryio/lbry.go/blob/master/stream/blob.go">here</a>. The encryption key and initialization vector for each blob are stored as described below.</p>
|
||||||
|
|
||||||
|
@ -905,17 +905,7 @@ OP_SUPPORT_CLAIM <name> <claimID> OP_2DROP OP_DROP <outputScript&
|
||||||
|
|
||||||
<h4 id="manifest-contents">Manifest Contents</h4>
|
<h4 id="manifest-contents">Manifest Contents</h4>
|
||||||
|
|
||||||
<p>A manifest blob’s contents are encoded using a canonical JSON encoding. The JSON encoding must be canonical to support consistent hashing and validation. The encoding is the same as standard JSON, but adds the following rules:</p>
|
<p>A manifest blob’s contents are encoded using <a href="http://wiki.laptop.org/go/Canonical_JSON">canonical JSON encoding</a>. The JSON encoding must be canonical to support consistent hashing and validation. Here’s an example manifest:</p>
|
||||||
|
|
||||||
<ul>
|
|
||||||
<li>Object keys must be quoted and lexicographically sorted.</li>
|
|
||||||
<li>All strings are hex-encoded. Hex letters must be lowercase.</li>
|
|
||||||
<li>Whitespace before, after, or between tokens is not permitted.</li>
|
|
||||||
<li>Floating point numbers, leading zeros, and “minus 0” for integers are not permitted.</li>
|
|
||||||
<li>Trailing commas after the last item in an array or object are not permitted.</li>
|
|
||||||
</ul>
|
|
||||||
|
|
||||||
<p>Here’s an example manifest:</p>
|
|
||||||
|
|
||||||
<!-- originally from 053b2f0f0e82e7f022837382733d5f5817dcd67027103fe43f00fa7a6f9fa8742c1022a851616c1ac15d1c60e89db3f4 -->
|
<!-- originally from 053b2f0f0e82e7f022837382733d5f5817dcd67027103fe43f00fa7a6f9fa8742c1022a851616c1ac15d1c60e89db3f4 -->
|
||||||
|
|
||||||
|
@ -927,18 +917,18 @@ OP_SUPPORT_CLAIM <name> <claimID> OP_2DROP OP_DROP <outputScript&
|
||||||
<pre><code>{
|
<pre><code>{
|
||||||
"blobs":[
|
"blobs":[
|
||||||
{
|
{
|
||||||
"blob_hash":"a6daea71be2bb89fab29a2a10face08143411a5245edcaa5efff48c2e459e7ec01ad20edfde6da43a932aca45b2cec61",
|
"blobHash":"a6daea71be2bb89fab29a2a10face08143411a5245edcaa5efff48c2e459e7ec01ad20edfde6da43a932aca45b2cec61",
|
||||||
"iv":"ef6caef207a207ca5b14c0282d25ce21",
|
"iv":"ef6caef207a207ca5b14c0282d25ce21",
|
||||||
"length":2097152
|
"length":2097152
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"blob_hash":"bf2717e2c445052366d35bcd58edb108cbe947af122d8f76b4856db577aeeaa2def5b57dbb80f7b1531296bd3e0256fc",
|
"blobHash":"bf2717e2c445052366d35bcd58edb108cbe947af122d8f76b4856db577aeeaa2def5b57dbb80f7b1531296bd3e0256fc",
|
||||||
"iv":"a37b291a37337fc1ff90ae655c244c1d",
|
"iv":"a37b291a37337fc1ff90ae655c244c1d",
|
||||||
"length":2097152
|
"length":2097152
|
||||||
},
|
},
|
||||||
...,
|
...,
|
||||||
{
|
{
|
||||||
"blob_hash":"322973617221ddfec6e53bff4b74b9c21c968cd32ba5a5094d84210e660c4b2ed0882b114a2392a08b06183f19330aaf",
|
"blobHash":"322973617221ddfec6e53bff4b74b9c21c968cd32ba5a5094d84210e660c4b2ed0882b114a2392a08b06183f19330aaf",
|
||||||
"iv": "a00f5f458695bdc9d50d3dbbc7905abc",
|
"iv": "a00f5f458695bdc9d50d3dbbc7905abc",
|
||||||
"length": 600160
|
"length": 600160
|
||||||
}
|
}
|
||||||
|
@ -949,7 +939,13 @@ OP_SUPPORT_CLAIM <name> <claimID> OP_2DROP OP_DROP <outputScript&
|
||||||
}
|
}
|
||||||
</code></pre>
|
</code></pre>
|
||||||
|
|
||||||
<p>The <code>key</code> field contains the key to decrypt the stream, and is optional. The key may be stored by a third party and made available to a client when presented with proof that the content was purchased. The <code>version</code> field is always 1. It is intended to signal structure changes in future versions of this protocol. The <code>length</code> field for each blob is the length of the encrypted blob, not the original file chunk.</p>
|
<p>The <code>blobs</code> field is an ordered list of blobs in the stream. Each item in the list has the blob hash for that blob, the hex-encoded initialization vector used to create the blob, and the length of the encrypted blob (not the original file chunk).</p>
|
||||||
|
|
||||||
|
<p>The <code>filename</code> is the hex-encoded name of the original file.</p>
|
||||||
|
|
||||||
|
<p>The <code>key</code> field contains the hex-encoded <em>stream key</em>, which is used to decrypt the blobs in the stream. This field is optional. The stream key may instead be stored by a third party and made available to a client when presented with proof that the content was purchased.</p>
|
||||||
|
|
||||||
|
<p>The <code>version</code> field is always 1. It is intended to signal structure changes in future versions of this protocol.</p>
|
||||||
|
|
||||||
<p>Every stream must have at least two blobs - the manifest blob and a content blob. Consequently, zero-length streams are not allowed.</p>
|
<p>Every stream must have at least two blobs - the manifest blob and a content blob. Consequently, zero-length streams are not allowed.</p>
|
||||||
|
|
||||||
|
@ -960,7 +956,7 @@ OP_SUPPORT_CLAIM <name> <claimID> OP_2DROP OP_DROP <outputScript&
|
||||||
<h5 id="setup">Setup</h5>
|
<h5 id="setup">Setup</h5>
|
||||||
|
|
||||||
<ol>
|
<ol>
|
||||||
<li>Generate a random 32-byte key for the stream. This <em>stream key</em> will be used to encrypt each content blob.</li>
|
<li>Generate a random 32-byte stream key. This key will be used to encrypt each content blob in the stream.</li>
|
||||||
</ol>
|
</ol>
|
||||||
|
|
||||||
<h5 id="content-blobs">Content Blobs</h5>
|
<h5 id="content-blobs">Content Blobs</h5>
|
||||||
|
@ -976,8 +972,8 @@ OP_SUPPORT_CLAIM <name> <claimID> OP_2DROP OP_DROP <outputScript&
|
||||||
<h5 id="manifest-blob">Manifest Blob</h5>
|
<h5 id="manifest-blob">Manifest Blob</h5>
|
||||||
|
|
||||||
<ol>
|
<ol>
|
||||||
<li>Fill in the manifest data.</li>
|
<li>Fill in the manifest data as described in <a href="#manifest-contents">Manifest Contents</a>.</li>
|
||||||
<li>Encode the data using the canonical JSON encoding described above.</li>
|
<li>Encode the data using the canonical JSON encoding.</li>
|
||||||
<li>Compute the stream hash.</li>
|
<li>Compute the stream hash.</li>
|
||||||
</ol>
|
</ol>
|
||||||
|
|
||||||
|
@ -990,10 +986,10 @@ OP_SUPPORT_CLAIM <name> <claimID> OP_2DROP OP_DROP <outputScript&
|
||||||
<p>Decoding a stream is like encoding in reverse, and with the added step of verifying that the expected blob hashes match the actual data.</p>
|
<p>Decoding a stream is like encoding in reverse, and with the added step of verifying that the expected blob hashes match the actual data.</p>
|
||||||
|
|
||||||
<ol>
|
<ol>
|
||||||
<li>Compute a SHA384 has of the manifest blob and verify that it matches the stream hash.</li>
|
<li>Verify that the hash of the manifest blob matches the stream hash.</li>
|
||||||
<li>Parse the manifest blob contents.</li>
|
<li>Parse the JSON in the manifest blob.</li>
|
||||||
<li>Verify the hashes of the content blobs.</li>
|
<li>Verify the hashes of the content blobs.</li>
|
||||||
<li>Decrypt and remove the padding from each content blob using the key and IVs in the manifest.</li>
|
<li>Decrypt and remove the padding from each content blob using the stream key and IVs in the manifest.</li>
|
||||||
<li>Concatenate the decrypted chunks in order.</li>
|
<li>Concatenate the decrypted chunks in order.</li>
|
||||||
</ol>
|
</ol>
|
||||||
|
|
||||||
|
|
40
index.md
|
@ -799,7 +799,7 @@ Content on LBRY is encoded to facilitate distribution.
|
||||||
|
|
||||||
#### Blobs
|
#### Blobs
|
||||||
|
|
||||||
The smallest unit of data is called a _blob_. A blob is an encrypted chunk of data up to 2MiB in size. Each blob is indexed by its _blob hash_, which is a SHA384 hash of the blob. Addressing blobs by their hash protects against naming collisions and ensures that data cannot be accidentally or maliciously modified.
|
The smallest unit of data is called a _blob_. A blob is an encrypted chunk of data up to 2MiB in size. Each blob is indexed by its _blob hash_, which is a SHA-384 hash of the blob. Addressing blobs by their hash protects against naming collisions and ensures that data cannot be accidentally or maliciously modified.
|
||||||
|
|
||||||
Blobs are encrypted using AES-256 in CBC mode with PKCS7 padding. In order to keep each encrypted blob at 2MiB max, a blob can hold at most 2097151 bytes (2MiB minus 1 byte) of plaintext data. The source code for the exact algorithm is available [here](https://github.com/lbryio/lbry.go/blob/master/stream/blob.go). The encryption key and initialization vector for each blob are stored as described below.
|
Blobs are encrypted using AES-256 in CBC mode with PKCS7 padding. In order to keep each encrypted blob at 2MiB max, a blob can hold at most 2097151 bytes (2MiB minus 1 byte) of plaintext data. The source code for the exact algorithm is available [here](https://github.com/lbryio/lbry.go/blob/master/stream/blob.go). The encryption key and initialization vector for each blob are stored as described below.
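
The "2MiB minus 1 byte" limit follows from PKCS7 padding, which always appends between 1 and 16 bytes when the AES block size is 16. The following is a minimal sketch of that arithmetic and of computing a blob hash. The function names `padded_length` and `blob_hash` are illustrative assumptions, not names taken from the linked source code.

```python
import hashlib

AES_BLOCK_SIZE = 16              # AES block size in bytes
MAX_BLOB_SIZE = 2 * 1024 * 1024  # 2 MiB

def padded_length(plaintext_len: int) -> int:
    """Length after PKCS7 padding; at least one padding byte is always added."""
    return (plaintext_len // AES_BLOCK_SIZE + 1) * AES_BLOCK_SIZE

def blob_hash(encrypted_blob: bytes) -> str:
    """Hex-encoded SHA-384 hash of an (already encrypted) blob."""
    return hashlib.sha384(encrypted_blob).hexdigest()

# 2097151 plaintext bytes pad to exactly 2 MiB; one more byte would overflow.
print(padded_length(2097151) <= MAX_BLOB_SIZE)
print(padded_length(2097152) <= MAX_BLOB_SIZE)
```

A plaintext chunk of exactly 2097152 bytes would need a full extra block of padding, which is why the limit is one byte lower.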
|
||||||
|
|
||||||
|
@ -811,15 +811,7 @@ The blob hash of the manifest blob is called the _stream hash_. It uniquely iden
|
||||||
|
|
||||||
#### Manifest Contents
|
#### Manifest Contents
|
||||||
|
|
||||||
A manifest blob's contents are encoded using a canonical JSON encoding. The JSON encoding must be canonical to support consistent hashing and validation. The encoding is the same as standard JSON, but adds the following rules:
|
A manifest blob's contents are encoded using [canonical JSON encoding](http://wiki.laptop.org/go/Canonical_JSON). The JSON encoding must be canonical to support consistent hashing and validation. Here's an example manifest:
|
||||||
|
|
||||||
- Object keys must be quoted and lexicographically sorted.
|
|
||||||
- All strings are hex-encoded. Hex letters must be lowercase.
|
|
||||||
- Whitespace before, after, or between tokens is not permitted.
|
|
||||||
- Floating point numbers, leading zeros, and "minus 0" for integers are not permitted.
|
|
||||||
- Trailing commas after the last item in an array or object are not permitted.
|
|
||||||
|
|
||||||
Here's an example manifest:
|
|
||||||
|
|
||||||
<!-- originally from 053b2f0f0e82e7f022837382733d5f5817dcd67027103fe43f00fa7a6f9fa8742c1022a851616c1ac15d1c60e89db3f4 -->
|
<!-- originally from 053b2f0f0e82e7f022837382733d5f5817dcd67027103fe43f00fa7a6f9fa8742c1022a851616c1ac15d1c60e89db3f4 -->
|
||||||
|
|
||||||
|
@ -833,18 +825,18 @@ Here's the same manifest, with whitespace added for readability:
|
||||||
{
|
{
|
||||||
"blobs":[
|
"blobs":[
|
||||||
{
|
{
|
||||||
"blob_hash":"a6daea71be2bb89fab29a2a10face08143411a5245edcaa5efff48c2e459e7ec01ad20edfde6da43a932aca45b2cec61",
|
"blobHash":"a6daea71be2bb89fab29a2a10face08143411a5245edcaa5efff48c2e459e7ec01ad20edfde6da43a932aca45b2cec61",
|
||||||
"iv":"ef6caef207a207ca5b14c0282d25ce21",
|
"iv":"ef6caef207a207ca5b14c0282d25ce21",
|
||||||
"length":2097152
|
"length":2097152
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"blob_hash":"bf2717e2c445052366d35bcd58edb108cbe947af122d8f76b4856db577aeeaa2def5b57dbb80f7b1531296bd3e0256fc",
|
"blobHash":"bf2717e2c445052366d35bcd58edb108cbe947af122d8f76b4856db577aeeaa2def5b57dbb80f7b1531296bd3e0256fc",
|
||||||
"iv":"a37b291a37337fc1ff90ae655c244c1d",
|
"iv":"a37b291a37337fc1ff90ae655c244c1d",
|
||||||
"length":2097152
|
"length":2097152
|
||||||
},
|
},
|
||||||
...,
|
...,
|
||||||
{
|
{
|
||||||
"blob_hash":"322973617221ddfec6e53bff4b74b9c21c968cd32ba5a5094d84210e660c4b2ed0882b114a2392a08b06183f19330aaf",
|
"blobHash":"322973617221ddfec6e53bff4b74b9c21c968cd32ba5a5094d84210e660c4b2ed0882b114a2392a08b06183f19330aaf",
|
||||||
"iv": "a00f5f458695bdc9d50d3dbbc7905abc",
|
"iv": "a00f5f458695bdc9d50d3dbbc7905abc",
|
||||||
"length": 600160
|
"length": 600160
|
||||||
}
|
}
|
||||||
|
@ -855,7 +847,13 @@ Here's the same manifest, with whitespace added for readability:
|
||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
The `key` field contains the key to decrypt the stream, and is optional. The key may be stored by a third party and made available to a client when presented with proof that the content was purchased. The `version` field is always 1. It is intended to signal structure changes in future versions of this protocol. The `length` field for each blob is the length of the encrypted blob, not the original file chunk.
|
The `blobs` field is an ordered list of blobs in the stream. Each item in the list has the blob hash for that blob, the hex-encoded initialization vector used to create the blob, and the length of the encrypted blob (not the original file chunk).
|
||||||
|
|
||||||
|
The `filename` is the hex-encoded name of the original file.
|
||||||
|
|
||||||
|
The `key` field contains the hex-encoded _stream key_, which is used to decrypt the blobs in the stream. This field is optional. The stream key may instead be stored by a third party and made available to a client when presented with proof that the content was purchased.
|
||||||
|
|
||||||
|
The `version` field is always 1. It is intended to signal structure changes in future versions of this protocol.
|
||||||
|
|
||||||
Every stream must have at least two blobs - the manifest blob and a content blob. Consequently, zero-length streams are not allowed.
|
Every stream must have at least two blobs - the manifest blob and a content blob. Consequently, zero-length streams are not allowed.
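
As a rough illustration of the fields described above, the sketch below builds a manifest and serializes it with sorted keys and no whitespace. It approximates canonical JSON with Python's `json` module, which is an assumption that holds here only because every string value is hex-encoded. The helper names `encode_manifest` and `stream_hash` are hypothetical, and the manifest may contain fields not shown in this excerpt.

```python
import hashlib
import json

def encode_manifest(blobs, filename, key=None, version=1):
    """Serialize a manifest dict with sorted keys and no whitespace.

    `blobs` is an ordered list of dicts with "blobHash", "iv" and "length".
    `filename` is a str and `key` is raw bytes; both are hex-encoded here.
    """
    manifest = {
        "blobs": blobs,
        "filename": filename.encode("utf-8").hex(),  # lowercase hex
        "version": version,
    }
    if key is not None:  # the key field is optional
        manifest["key"] = key.hex()
    text = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")

def stream_hash(manifest_bytes: bytes) -> str:
    """The stream hash is the SHA-384 blob hash of the manifest blob."""
    return hashlib.sha384(manifest_bytes).hexdigest()
```

Because the encoding is deterministic, two encoders given the same manifest data should produce the same bytes and therefore the same stream hash.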
|
||||||
|
|
||||||
|
@ -867,7 +865,7 @@ A file must be encoded into a stream before it can be published. Encoding involv
|
||||||
|
|
||||||
##### Setup
|
##### Setup
|
||||||
|
|
||||||
1. Generate a random 32-byte key for the stream. This _stream key_ will be used to encrypt each content blob.
|
1. Generate a random 32-byte stream key. This key will be used to encrypt each content blob in the stream.
|
||||||
|
|
||||||
##### Content Blobs
|
##### Content Blobs
|
||||||
|
|
||||||
|
@ -879,9 +877,9 @@ A file must be encoded into a stream before it can be published. Encoding involv
|
||||||
|
|
||||||
##### Manifest Blob
|
##### Manifest Blob
|
||||||
|
|
||||||
1. Fill in the manifest data.
|
1. Fill in the manifest data as described in [Manifest Contents](#manifest-contents).
|
||||||
1. Encode the data using the canonical JSON encoding described above.
|
2. Encode the data using the canonical JSON encoding.
|
||||||
1. Compute the stream hash.
|
3. Compute the stream hash.
|
||||||
|
|
||||||
An implementation of this process is available [here](https://github.com/lbryio/lbry.go/tree/master/stream).
|
An implementation of this process is available [here](https://github.com/lbryio/lbry.go/tree/master/stream).
|
||||||
|
|
||||||
|
@ -892,10 +890,10 @@ An implementation of this process is available [here](https://github.com/lbryio/
|
||||||
|
|
||||||
Decoding a stream is like encoding in reverse, and with the added step of verifying that the expected blob hashes match the actual data.
|
Decoding a stream is like encoding in reverse, and with the added step of verifying that the expected blob hashes match the actual data.
|
||||||
|
|
||||||
1. Compute a SHA384 has of the manifest blob and verify that it matches the stream hash.
|
1. Verify that the hash of the manifest blob matches the stream hash.
|
||||||
2. Parse the manifest blob contents.
|
2. Parse the JSON in the manifest blob.
|
||||||
3. Verify the hashes of the content blobs.
|
3. Verify the hashes of the content blobs.
|
||||||
4. Decrypt and remove the padding from each content blob using the key and IVs in the manifest.
|
4. Decrypt and remove the padding from each content blob using the stream key and IVs in the manifest.
|
||||||
5. Concatenate the decrypted chunks in order.
|
5. Concatenate the decrypted chunks in order.
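
The decoding steps above can be sketched as follows. This is a minimal outline under stated assumptions, not the reference implementation: `fetch_blob` and `decrypt` are hypothetical callables supplied by the caller (Python's standard library has no AES, so AES-256-CBC decryption would come from a third-party library), and the manifest is assumed to be already parsed into a dict.

```python
import hashlib

def verify_blob(blob: bytes, expected_hash: str) -> None:
    """Raise if the SHA-384 hash of the blob does not match the expected hash."""
    actual = hashlib.sha384(blob).hexdigest()
    if actual != expected_hash:
        raise ValueError(f"blob hash mismatch: expected {expected_hash}, got {actual}")

def pkcs7_unpad(data: bytes, block_size: int = 16) -> bytes:
    """Validate and strip PKCS7 padding from decrypted data."""
    if not data or len(data) % block_size != 0:
        raise ValueError("invalid padded length")
    n = data[-1]
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS7 padding")
    return data[:-n]

def decode_stream(manifest: dict, stream_key: bytes, fetch_blob, decrypt) -> bytes:
    """Verify, decrypt, unpad and concatenate the content blobs in order.

    fetch_blob(blob_hash) -> bytes and decrypt(key, iv, ciphertext) -> bytes
    are hypothetical hooks provided by the caller.
    """
    chunks = []
    for entry in manifest["blobs"]:
        blob = fetch_blob(entry["blobHash"])
        verify_blob(blob, entry["blobHash"])
        padded = decrypt(stream_key, bytes.fromhex(entry["iv"]), blob)
        chunks.append(pkcs7_unpad(padded))
    return b"".join(chunks)
```

Verifying each hash before decrypting means a corrupted or substituted blob is rejected before any of its contents are used.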
|
||||||
|
|
||||||
|
|
||||||
|
|