<!-- spec/index.html, 2018-10-25 14:53:09 -04:00 -->

<!DOCTYPE html>
<html>
<head>
<title>LBRY: A Decentralized Digital Content Marketplace</title>
<meta name="GENERATOR" content="github.com/mmarkdown/mmark Mmark Markdown Processor - mmark.nl">
<meta charset="utf-8">
<link rel="stylesheet" type="text/css" href="normalize.css">
<link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>
<h1 id="lbry-a-decentralized-digital-content-marketplace">LBRY: A Decentralized Digital Content Marketplace</h1>
<aside>
<p>Please excuse the unfinished state of this paper. It is being actively worked on. The content here is made available early because it contains useful information for developers.</p>
<p>For more technical information about LBRY, visit <a href="https://lbry.tech">lbry.tech</a>.</p>
</aside>
<h2 id="introduction">Introduction</h2>
<p>LBRY is a protocol for accessing and publishing digital content in a global, decentralized marketplace. Clients can use LBRY to publish, host, find, download, and pay for content — books, movies, music, or anything else. Anyone can participate and no permission is required, nor can anyone be blocked from participating. The system is distributed, so no single entity has unilateral control, nor will the removal of any single entity prevent the system from functioning.</p>
<p>TODO:</p>
<ul>
<li>why is it significant</li>
<li>whom does it help</li>
<li>why is it different/better than what existed before</li>
</ul>
<h2 id="table-of-contents">Table of Contents</h2>
<div id="toc">
<!--ts-->
<ul>
<li><a href="#overview">Overview</a></li>
<li><a href="#conventions-and-terminology">Conventions and Terminology</a></li>
<li><a href="#blockchain">Blockchain</a>
<ul>
<li><a href="#claims">Claims</a>
<ul>
<li><a href="#claim-properties">Claim Properties</a></li>
<li><a href="#claim-example">Claim Example</a></li>
<li><a href="#claim-operations">Claim Operations</a></li>
<li><a href="#claimtrie">Claimtrie</a></li>
<li><a href="#claim-statuses">Claim Statuses</a>
<ul>
<li><a href="#accepted">Accepted</a></li>
<li><a href="#abandoned">Abandoned</a></li>
<li><a href="#active">Active</a></li>
<li><a href="#controlling">Controlling</a></li>
</ul></li>
<li><a href="#normalization">Normalization</a></li>
</ul></li>
<li><a href="#urls">URLs</a>
<ul>
<li><a href="#components">Components</a></li>
<li><a href="#grammar">Grammar</a></li>
<li><a href="#design-notes">Design Notes</a></li>
</ul></li>
<li><a href="#transactions">Transactions</a>
<ul>
<li><a href="#operations-and-opcodes">Operations and Opcodes</a></li>
<li><a href="#addresses">Addresses</a></li>
<li><a href="#proof-of-payment">Proof of Payment</a></li>
</ul></li>
<li><a href="#consensus">Consensus</a>
<ul>
<li><a href="#block-timing">Block Timing</a></li>
<li><a href="#difficulty-adjustment">Difficulty Adjustment</a></li>
<li><a href="#block-hash-algorithm">Block Hash Algorithm</a></li>
<li><a href="#block-rewards">Block Rewards</a></li>
</ul></li>
</ul></li>
<li><a href="#metadata">Metadata</a>
<ul>
<li><a href="#metadata-specification">Metadata Specification</a></li>
<li><a href="#key-metadata-fields">Key Metadata Fields</a>
<ul>
<li><a href="#streams-and-stream-hashes">Streams and Stream Hashes</a></li>
<li><a href="#fees-and-fee-structure">Fees and Fee Structure</a></li>
</ul></li>
<li><a href="#identities">Identities</a></li>
<li><a href="#metadata-validation">Metadata Validation</a></li>
</ul></li>
<li><a href="#data">Data</a>
<ul>
<li><a href="#encoding-and-decoding">Encoding and Decoding</a>
<ul>
<li><a href="#blobs">Blobs</a></li>
<li><a href="#streams">Streams</a></li>
<li><a href="#manifest-contents">Manifest Contents</a></li>
<li><a href="#stream-encoding">Stream Encoding</a>
<ul>
<li><a href="#setup">Setup</a></li>
<li><a href="#content-blobs">Content Blobs</a></li>
<li><a href="#manifest-blob">Manifest Blob</a></li>
</ul></li>
<li><a href="#stream-decoding">Stream Decoding</a></li>
</ul></li>
<li><a href="#announce">Announce</a>
<ul>
<li><a href="#distributed-hash-table">Distributed Hash Table</a></li>
<li><a href="#announcing-to-the-dht">Announcing to the DHT</a></li>
</ul></li>
<li><a href="#download">Download</a>
<ul>
<li><a href="#querying-the-dht">Querying the DHT</a></li>
<li><a href="#blob-exchange-protocol">Blob Exchange Protocol</a>
<ul>
<li><a href="#pricecheck">PriceCheck</a></li>
<li><a href="#downloadcheck">DownloadCheck</a></li>
<li><a href="#download-1">Download</a></li>
<li><a href="#uploadcheck">UploadCheck</a></li>
<li><a href="#upload">Upload</a></li>
</ul></li>
</ul></li>
<li><a href="#reflector--blobex-upload">Reflector / BlobEx Upload</a></li>
<li><a href="#data-markets">Data Markets</a></li>
</ul></li>
<li><a href="#conclusion">Conclusion</a>
<!--te--></li>
</ul>
</div>
<h2 id="overview">Overview</h2>
<p>This document defines the LBRY protocol, its components, and how they fit together. At its core, LBRY consists of several discrete components that are used together in order to provide the end-to-end capabilities of the protocol. There are two distributed data stores (blockchain and DHT), a peer-to-peer protocol for exchanging data, and several specifications for data structure, transformation, and retrieval.</p>
<p>This document assumes that the reader is familiar with Bitcoin and blockchain technology. It does not attempt to document the Bitcoin protocol or explain how it works. The <a href="https://bitcoin.org/en/developer-reference">Bitcoin developer reference</a> is recommended for anyone wishing to understand the technical details.</p>
<h2 id="conventions-and-terminology">Conventions and Terminology</h2>
<p>(Rather than this section, maybe we can use a syntax like brackets around keywords to inline key definitions?)</p>
<dl>
<dt>file</dt>
<dd>A single piece of content published using LBRY.</dd>
<dt>blob</dt>
<dd>The unit of data transmission on the data network. A published file is split into many blobs.</dd>
<dt>stream</dt>
<dd>A set of blobs that can be reassembled into a file. Every stream has a descriptor blob and one or more content blobs.</dd>
<dt>blob hash</dt>
<dd>The output of a cryptographic hash function applied to a blob. Hashes are used to uniquely identify blobs and to verify that the contents of the blob are correct. Unless otherwise specified, LBRY uses SHA384 as the hash function.</dd>
<dt>metadata</dt>
<dd>Information about the contents of a stream (e.g. creator, description, stream descriptor hash, etc). Metadata is stored in the blockchain.</dd>
<dt>claim</dt>
<dd>A single metadata entry in the blockchain.</dd>
<dt>name</dt>
<dd>A human-readable UTF-8 string that is associated with a published claim.</dd>
<dt>channel</dt>
<dd>The unit of pseudonymous publisher identity. Claims may be part of a channel.</dd>
<dt>URL</dt>
<dd>A reference to a claim that specifies how to retrieve it.</dd>
</dl>
<h2 id="blockchain">Blockchain</h2>
<!-- done -->
<p>The LBRY blockchain is a public, proof-of-work blockchain. It serves three key purposes:</p>
<ol>
<li>An index of the content available on the network</li>
<li>A payment system and record of purchases for priced content</li>
<li>Trustful publisher identities</li>
</ol>
<p>The LBRY blockchain is a fork of the <a href="https://bitcoin.org/bitcoin.pdf">Bitcoin</a> blockchain, with substantial modifications. This document will not cover or specify any aspects of LBRY that are identical to Bitcoin, and will instead focus on the differences.</p>
<h3 id="claims">Claims</h3>
<!-- done -->
<p>A <em>claim</em> is a single metadata entry in the blockchain. There are three types of claims:</p>
<dl>
<dt>stream</dt>
<dd>Declare the availability, access method, and publisher of a stream of bytes (typically a file).</dd>
<dt>identity</dt>
<dd>Create a trustful pseudonym that can be used to identify the origin of stream claims.</dd>
<dt>support</dt>
<dd>Add their amount to a stream or identity claim.</dd>
</dl>
<h4 id="claim-properties">Claim Properties</h4>
<p>Claims have four properties:</p>
<dl>
<dt>claimId</dt>
<dd>A 20-byte hash unique among all claims. See <a href="#claim-identifier-generation">Claim Identifier Generation</a>.</dd>
<dt>name</dt>
<dd>A normalized UTF-8 string of up to 255 bytes used to address the claim. See <a href="#urls">URLs</a> and <a href="#normalization">Normalization</a>.</dd>
<dt>amount</dt>
<dd>A quantity of tokens used to stake the claim. See <a href="#controlling">Controlling</a>.</dd>
<dt>value</dt>
<dd>Metadata about a piece of content or an identity. Empty for support claims. See <a href="#metadata">Metadata</a>.</dd>
</dl>
<h4 id="claim-example">Claim Example</h4>
<!-- done -->
<p>Here is an example stream claim:</p>
<pre><code>{
&quot;claimId&quot;: &quot;fa3d002b67c4ff439463fcc0d4c80758e38a0aed&quot;,
&quot;name&quot;: &quot;lbry&quot;,
&quot;amount&quot;: 100000000,
&quot;value&quot;: &quot;{\&quot;ver\&quot;: \&quot;0.0.3\&quot;, \&quot;description\&quot;: \&quot;What is LBRY? An introduction with Alex Tabarrok\&quot;,
\&quot;license\&quot;: \&quot;LBRY inc\&quot;, \&quot;title\&quot;: \&quot;What is LBRY?\&quot;, \&quot;author\&quot;: \&quot;Samuel Bryan\&quot;,
\&quot;language\&quot;: \&quot;en\&quot;, \&quot;sources\&quot;: {\&quot;lbry_sd_hash\&quot;:
\&quot;e1e324bce7437540fac6707fa142cca44d76fc4e8e65060139a88ff7cdb218b4540cb9cff8bb3d5e06157ae6b08e5cb5\&quot;},
\&quot;content_type\&quot;: \&quot;video/mp4\&quot;, \&quot;nsfw\&quot;: false, \&quot;thumbnail\&quot;:
\&quot;https://s3.amazonaws.com/files.lbry.io/logo.png\&quot;}&quot;,
&quot;txid&quot;: &quot;53ed05d9dfd728a94bedf952d67783bbe9da5d2ab436a84338bb53f0b85301b5&quot;,
&quot;n&quot;: 0,
&quot;height&quot;: 146117
}
</code></pre>
<h4 id="claim-operations">Claim Operations</h4>
<!-- done -->
<p>There are three claim operations: <em>create</em>, <em>update</em>, and <em>abandon</em>.</p>
<dl>
<dt>create</dt>
<dd>Makes a new claim.</dd>
<dt>update</dt>
<dd>Changes the value or amount of an existing claim. Updates do not change the claim ID, so an updated claim retains any supports attached to it. </dd>
<dt>abandon</dt>
<dd>Withdraws a claim, freeing the associated credits to be used for other purposes.</dd>
</dl>
<h4 id="claimtrie">Claimtrie</h4>
<!-- done -->
<p>The <em>claimtrie</em> is the data structure used to store the set of all claims and prove the correctness of claim resolution.</p>
<p>The claimtrie is implemented as a <a href="https://en.wikipedia.org/wiki/Merkle_tree">Merkle tree</a> that maps names to claims. Claims are stored as leaf nodes in the tree. Names are stored as the path from the root node to the leaf node.</p>
<p>The <em>root hash</em> is the hash of the root node. It is stored in the header of each block in the blockchain. Nodes in the LBRY network use the root hash to efficiently and securely validate the state of the claimtrie.</p>
<p>Multiple claims can exist for the same name. They are all stored in the leaf node for that name, sorted in decreasing order by the total amount of credits backing each claim.</p>
<p>For more details on the specific claimtrie implementation, see <a href="https://github.com/lbryio/lbrycrd/blob/master/src/claimtrie.cpp">the source code</a>.</p>
<h4 id="claim-statuses">Claim Statuses</h4>
<!-- done -->
<p>A claim can have one or more of the following statuses at a given block.</p>
<h5 id="accepted">Accepted</h5>
<!-- done -->
<p>An <em>accepted</em> claim is one that has been entered into the blockchain. This happens when the transaction containing the claim is included in a block.</p>
<p>Accepted claims do not appear in or affect the claimtrie state until they are <a href="#active">Active</a>.</p>
<h5 id="abandoned">Abandoned</h5>
<!-- done -->
<p>An <em>abandoned</em> claim is one that was withdrawn by its creator. Spending a transaction that contains a claim will cause that claim to become abandoned.</p>
<p>Abandoned stream and identity claims are no longer stored in the claimtrie. Abandoned support claims no longer contribute their amount to the sort order of claims listed in a leaf.</p>
<p>While data related to abandoned claims technically still resides in the blockchain, it is improper to use this data to fetch the associated content.</p>
<h5 id="active">Active</h5>
<p>An <em>active</em> claim is an accepted and non-abandoned claim that has been in the blockchain long enough to be activated. The length of time required is called the <em>activation delay</em>.</p>
<p>The activation delay depends on the claim operation, the height of the current block, and the height at which the claimtrie state for that name last changed.</p>
<p>If the claim is an update or support to an already active claim, or if it is the first claim for a name, the claim becomes active as soon as it is accepted. Otherwise it becomes active at the block height determined by the following formula:</p>
<pre><code>C + min(4032, floor((H-T) / 32))
</code></pre>
<p>Where:</p>
<ul>
<li>C = claim height (height when the claim was accepted)</li>
<li>H = current height</li>
<li>T = takeover height (the most recent height at which the claimtrie state for the name changed)</li>
</ul>
<p>In plain English, the delay before a claim becomes active is equal to the claim&rsquo;s height minus the height of the last takeover, divided by 32 and rounded down. The delay is capped at 4032 blocks, which is 7 days of blocks at 2.5 minutes per block (our target block time). The max delay is reached 224 (7x32) days after the last takeover. The goal of this delay function is to give long-standing claimants time to respond to takeover attempts, while still keeping takeover times reasonable and allowing recent or contentious claims to be taken over quickly.</p>
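<p>The formula can be sketched in a few lines of Python. This is an illustration rather than the reference implementation: the function and constant names are made up for this example, and it assumes the delay is evaluated at the block where the claim is accepted, so that H equals C.</p>

```python
MAX_ACTIVATION_DELAY = 4032  # blocks; 7 days at 2.5 minutes per block


def activation_delay(current_height, takeover_height):
    """Delay in blocks: min(4032, floor((H - T) / 32))."""
    return min(MAX_ACTIVATION_DELAY, (current_height - takeover_height) // 32)


def activation_height(claim_height, takeover_height):
    """Height at which a competing claim becomes active: C + delay.

    Assumption: the delay is computed when the claim is accepted (H == C).
    """
    return claim_height + activation_delay(claim_height, takeover_height)


# A claim accepted at height 1001 when the last takeover was at height 13.
print(activation_height(1001, 13))           # 1001 + 30 = 1031
# The cap is reached 4032 * 32 blocks after the last takeover.
print(activation_delay(13 + 4032 * 32, 13))  # 4032
```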
<h5 id="controlling">Controlling</h5>
<p>The controlling claim is the claim that has the highest total effective amount, which is the sum of its own amount and the amounts of all of its supports. It must be active and cannot itself be a support.</p>
<p>Only one claim can be controlling for a given name at a given block. To determine which claim is controlling for a given name at a given block, the following algorithm is used:</p>
<ol>
<li><p>For each active claim for the name, add up the amount of the claim and the amount of all the active supports for that claim.</p></li>
<li><p>Determine if a takeover is happening</p>
<ol>
<li><p>If the claim with the greatest total is the controlling claim from the previous block, then nothing changes. That claim is still controlling at this block.</p></li>
<li><p>Otherwise, a takeover is occurring. Set the takeover height for this name to the current height, recalculate which claims and supports are now active, and then perform step 1 again.</p></li>
</ol></li>
<li><p>At this point, the claim with the greatest total is the controlling claim at this block.</p></li>
</ol>
<p>The purpose of step 2b is to handle the case when multiple competing claims are made on the same name in different blocks, and one of those claims becomes active but another still-inactive claim has the greatest amount. Step 2b will cause the greater claim to also activate and become the controlling claim.</p>
<p>Here is a step-by-step example to illustrate the different scenarios. All claims are for the same name.</p>
<p><strong>Block 13:</strong> Claim A for 10LBC is accepted. It is the first claim, so it immediately becomes active and controlling.
<br>State: A(10) is controlling</p>
<p><strong>Block 1001:</strong> Claim B for 20LBC is accepted. Its activation height is <code>1001 + min(4032, floor((1001-13) / 32)) = 1001 + 30 = 1031</code>.
<br>State: A(10) is controlling, B(20) is accepted.</p>
<p><strong>Block 1010:</strong> Support X for 14LBC for claim A is accepted. Since it is a support for the controlling claim, it activates immediately.
<br>State: A(10+14) is controlling, B(20) is accepted.</p>
<p><strong>Block 1020:</strong> Claim C for 50LBC is accepted. The activation height is <code>1020 + min(4032, floor((1020-13) / 32)) = 1020 + 31 = 1051</code>.
<br>State: A(10+14) is controlling, B(20) is accepted, C(50) is accepted.</p>
<p><strong>Block 1031:</strong> Claim B activates. It has 20LBC, while claim A has 24LBC (10 original + 14 from support X). There is no takeover, and claim A remains controlling.
<br>State: A(10+14) is controlling, B(20) is active, C(50) is accepted.</p>
<p><strong>Block 1040:</strong> Claim D for 300LBC is accepted. The activation height is <code>1040 + min(4032, floor((1040-13) / 32)) = 1040 + 32 = 1072</code>.
<br>State: A(10+14) is controlling, B(20) is active, C(50) is accepted, D(300) is accepted.</p>
<p><strong>Block 1051:</strong> Claim C activates. It has 50LBC, while claim A has 24LBC, so a takeover is initiated. The takeover height for this name is set to 1051, and therefore the activation delay for all the claims becomes <code>min(4032, floor((1051-1051) / 32)) = 0</code>. All the claims become active. The totals for each claim are recalculated, and claim D becomes controlling because it has the highest total.
<br>State: A(10+14) is active, B(20) is active, C(50) is active, D(300) is controlling.</p>
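<p>The walkthrough above can be replayed with a small simulation. This is a rough sketch under several assumptions: the <code>NameState</code> class and its methods are hypothetical names, abandoned claims are not modeled, ties are broken by insertion order, and a support is taken to activate immediately whenever the claim it supports is already active.</p>

```python
MAX_DELAY = 4032


class NameState:
    """Tracks the claims competing for one name (illustrative only)."""

    def __init__(self):
        self.claims = {}      # label -> [amount, activation height]
        self.supports = []    # [claim label, amount, activation height]
        self.takeover_height = None
        self.controlling = None

    def _delay(self, height):
        return min(MAX_DELAY, (height - self.takeover_height) // 32)

    def add_claim(self, label, amount, height):
        # The first claim for a name activates as soon as it is accepted.
        delay = 0 if not self.claims else self._delay(height)
        self.claims[label] = [amount, height + delay]
        self.advance(height)

    def add_support(self, label, amount, height):
        # Assumption: a support for an already active claim activates at once.
        active = height >= self.claims[label][1]
        delay = 0 if active else self._delay(height)
        self.supports.append([label, amount, height + delay])
        self.advance(height)

    def _totals(self, height):
        totals = {}
        for label, (amount, activates) in self.claims.items():
            if height >= activates:
                totals[label] = amount
        for label, amount, activates in self.supports:
            if label in totals and height >= activates:
                totals[label] += amount
        return totals

    def advance(self, height):
        """Apply steps 1 to 3 of the algorithm at the given block height."""
        totals = self._totals(height)                        # step 1
        if not totals:
            return
        if max(totals, key=totals.get) != self.controlling:  # step 2b
            self.takeover_height = height
            # The delay is now zero, so pending claims and supports activate.
            for entry in self.claims.values():
                entry[1] = min(entry[1], height)
            for entry in self.supports:
                entry[2] = min(entry[2], height)
            totals = self._totals(height)                    # step 1 again
        self.controlling = max(totals, key=totals.get)       # step 3


state = NameState()
state.add_claim("A", 10, 13)
state.add_claim("B", 20, 1001)
state.add_support("A", 14, 1010)
state.add_claim("C", 50, 1020)
state.advance(1031)               # B activates, A stays in control
state.add_claim("D", 300, 1040)
state.advance(1051)               # C activates and triggers a takeover
print(state.controlling, state.takeover_height)  # D 1051
```

<p>A real node would evaluate this at every block. The sketch only calls <code>advance</code> at the heights where something changes, which gives the same result for this example.</p>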
<h4 id="normalization">Normalization</h4>
<p>Names in the claimtrie are normalized to avoid confusion due to Unicode equivalence or casing. All names are normalized using the NFD normalization form, then lowercased using the en_US locale.</p>
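<p>As a minimal sketch of this rule in Python: <code>normalize_name</code> is a hypothetical helper, and Python&rsquo;s locale-independent <code>str.lower()</code> is used as a stand-in for en_US lowercasing, so the result may differ from a locale-aware implementation for some characters.</p>

```python
import unicodedata


def normalize_name(name):
    """NFD-normalize a name, then lowercase it.

    Assumption: str.lower() approximates lowercasing in the en_US locale.
    """
    return unicodedata.normalize("NFD", name).lower()


composed = "Caf\u00e9"      # accented "e" as a single code point
decomposed = "Cafe\u0301"   # "e" followed by a combining acute accent
print(normalize_name(composed) == normalize_name(decomposed))  # True
```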
<h3 id="urls">URLs</h3>
<p>URLs are human-readable references to claims. All URLs contain a name, and can be resolved to a specific claim for that name. The ultimate purpose of much of the claim design, including controlling claims and the claimtrie structure, is to provide human-readable URLs that can be trustfully resolved by clients that don&rsquo;t have a full copy of the blockchain.</p>
<h4 id="components">Components</h4>
<p>A URL is a name with optional modifiers. A bare name on its own will resolve to the controlling claim at the latest block height, for reasons covered in <a href="#design-notes">Design Notes</a>. Common URL structures are:</p>
<p><strong>Name:</strong> a basic claim for a name</p>
<pre><code>lbry://meet-LBRY
</code></pre>
<p><strong>Claim ID:</strong> a claim for this name with this claim ID (does not have to be the controlling claim). Partial prefix matches are allowed.</p>
<pre><code>lbry://meet-LBRY#7a0aa95c5023c21c098
lbry://meet-LBRY#7a
</code></pre>
<p><strong>Claim Sequence:</strong> the Nth claim for this name, in the order the claims entered the blockchain. N must be a positive number. This can be used to determine which claim came first, rather than which claim has the most support.</p>
<pre><code>lbry://meet-LBRY:1
</code></pre>
<p><strong>Bid Position:</strong> the Nth claim for this name, in order of most support to least support. N must be a positive number. This is useful for resolving non-winning bids in bid order, e.g. if you want to list the top three winning claims in a voting contest or want to ignore the activation delay.</p>
<pre><code>lbry://meet-LBRY$2
lbry://meet-LBRY$3
</code></pre>
<p><strong>Query Params:</strong> extra parameters (reserved for future use)</p>
<pre><code>lbry://meet-LBRY?arg=value&amp;arg2=value2
</code></pre>
<p><strong>Channel:</strong> a claim for a channel</p>
<pre><code>lbry://@lbry
</code></pre>
<p><strong>Claim in Channel:</strong> URLs with a channel and a claim name are resolved in two steps. First the channel is resolved to get the claim for that channel. Then the name is resolved to get the appropriate claim from among the claims in the channel.</p>
<pre><code>lbry://@lbry/meet-LBRY
</code></pre>
<h4 id="grammar">Grammar</h4>
<p>The full URL grammar is defined using <a href="https://www.w3.org/TR/2017/REC-xquery-31-20170321/#EBNFNotation">XQuery EBNF notation</a>:</p>
<!-- see http://bottlecaps.de/rr/ui for visuals-->
<pre><code>URL ::= Scheme Path Query?
Scheme ::= 'lbry://'
Path ::= ClaimNameAndModifier | ChannelAndModifier ( '/' ClaimNameAndModifier )?
ClaimNameAndModifier ::= ClaimName Modifier?
ChannelAndModifier ::= Channel Modifier?
ClaimName ::= NameChar+
Channel ::= '@' ClaimName
Modifier ::= ClaimID | ClaimSequence | BidPosition
ClaimID ::= '#' Hex+
ClaimSequence ::= ':' PositiveNumber
BidPosition ::= '$' PositiveNumber
Query ::= '?' QueryParameterList
QueryParameterList ::= QueryParameter ( '&amp;' QueryParameterList )*
QueryParameter ::= QueryParameterName ( '=' QueryParameterValue )?
QueryParameterName ::= NameChar+
QueryParameterValue ::= NameChar+
PositiveDigit ::= [123456789]
Digit ::= '0' | PositiveDigit
PositiveNumber ::= PositiveDigit Digit*
HexAlpha ::= [abcdef]
Hex ::= (Digit | HexAlpha)+
NameChar ::= Char - [=&amp;#:$@?/] /* any character that is not reserved */
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
</code></pre>
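<p>To make the grammar concrete, here is a rough Python sketch of a parser for it. The names <code>parse_url</code> and <code>parse_segment</code> are hypothetical. The sketch simplifies the grammar in two ways: it does not validate the query string, and it does not reject the few code points that <code>Char</code> excludes.</p>

```python
SCHEME = "lbry://"
RESERVED = set("#$&/:=?@")
MODIFIERS = {"#": "claim_id", ":": "sequence", "$": "bid_position"}
DIGITS = "0123456789"


def parse_segment(segment):
    """Split a ClaimNameAndModifier such as 'meet-LBRY#7a' into its parts."""
    name, kind, value = segment, None, None
    for i, ch in enumerate(segment):
        if ch in MODIFIERS:
            name, kind, value = segment[:i], MODIFIERS[ch], segment[i + 1:]
            break
    if not name or any(ch in RESERVED for ch in name):
        raise ValueError("invalid name: %r" % segment)
    if kind == "claim_id":
        if not value or any(ch not in DIGITS + "abcdef" for ch in value):
            raise ValueError("claim ID must be lowercase hex: %r" % segment)
    elif kind is not None:
        if not value or value[0] == "0" or any(ch not in DIGITS for ch in value):
            raise ValueError("expected a positive number: %r" % segment)
        value = int(value)
    return {"name": name, "modifier": kind, "value": value}


def parse_url(url):
    """Parse a URL into its channel, claim, and raw query string."""
    if not url.startswith(SCHEME):
        raise ValueError("missing lbry:// scheme")
    path, _, query = url[len(SCHEME):].partition("?")
    parts = path.split("/")
    channel, claim = None, None
    if parts[0].startswith("@"):
        if len(parts) > 2:
            raise ValueError("too many path segments")
        channel = parse_segment(parts[0][1:])
        channel["name"] = "@" + channel["name"]
        if len(parts) == 2:
            claim = parse_segment(parts[1])
    else:
        if len(parts) != 1:
            raise ValueError("only a channel may be followed by '/'")
        claim = parse_segment(parts[0])
    return {"channel": channel, "claim": claim, "query": query or None}


print(parse_url("lbry://@lbry/meet-LBRY$2"))
```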
<h4 id="design-notes">Design Notes</h4>
<p>Most existing public name schemes are first-come, first-served. This leads to several bad outcomes. When the system is young, users are incentivized to register common names even if they don&rsquo;t intend to use them, in hopes of selling them to the proper owner in the future for an exorbitant price. In a centralized system, the authority may allow for appeals to reassign names based on trademark or other common use reasons. There may also be a process to &ldquo;verify&rdquo; that a name belongs to the entity you think it does (e.g. Twitter&rsquo;s verified accounts). Such processes are often arbitrary, change over time, involve significant transaction costs, and may still lead to names being used in ways that are contrary to user expectation (e.g. <a href="http://nissan.com">nissan.com</a> is not what you&rsquo;d expect).</p>
<p>In a decentralized system, such approaches are not possible, so name squatting is especially dangerous (see Namecoin). Instead, LBRY creates an efficient allocation of names via a market. Following <a href="https://en.wikipedia.org/wiki/Coase_theorem">Coase</a>, we believe that if the rules for name ownership and exchange are clearly defined, transaction costs are low, and there is no information asymmetry, then control of URLs will flow to their highest-valued use.</p>
<p>Note that only vanity URLs (i.e. URLs without a ClaimID or ClaimSequence modifier) have this property. Permanent URLs like <code>lbry://myclaimname#abc</code> exist and are available for the small cost of issuing a <code>create</code> claim transaction.</p>
<h3 id="transactions">Transactions</h3>
<p>To support claims, the LBRY blockchain makes the following changes on top of Bitcoin.</p>
<h4 id="operations-and-opcodes">Operations and Opcodes</h4>
<p>To enable <a href="#claim-operations">claim operations</a>, three new opcodes were added to the blockchain scripting language: <code>OP_CLAIM_NAME</code>, <code>OP_SUPPORT_CLAIM</code>, and <code>OP_UPDATE_CLAIM</code> (in Bitcoin they are respectively <code>OP_NOP6</code>, <code>OP_NOP7</code>, and <code>OP_NOP8</code>). Each opcode pushes a zero onto the execution stack and triggers the claimtrie to perform the calculations necessary for that operation. Below are the three supported transaction scripts using these opcodes.</p>
<pre><code>OP_CLAIM_NAME &lt;name&gt; &lt;value&gt; OP_2DROP OP_DROP &lt;pubKey&gt;
OP_UPDATE_CLAIM &lt;name&gt; &lt;claimId&gt; &lt;value&gt; OP_2DROP OP_2DROP &lt;pubKey&gt;
OP_SUPPORT_CLAIM &lt;name&gt; &lt;claimId&gt; OP_2DROP OP_DROP &lt;pubKey&gt;
</code></pre>
<p><code>&lt;pubKey&gt;</code> can be any valid Bitcoin payout script, so a claimtrie script is also a pay-to-pubkey script to a user-controlled address. Note that the zeros pushed onto the stack by the claimtrie opcodes and vectors are all dropped by <code>OP_2DROP</code> and <code>OP_DROP</code>. This means that claimtrie transactions exist as prefixes to Bitcoin payout scripts and can be spent just like standard transactions.</p>
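<p>The following toy stack machine illustrates why the prefixes leave the stack clean. It is only a sketch, not a script interpreter: <code>run_prefix</code> is a hypothetical helper, the placeholder tokens stand in for real data vectors, and it models the behavior described above, where each claim opcode pushes a zero.</p>

```python
CLAIM_OPCODES = {"OP_CLAIM_NAME", "OP_UPDATE_CLAIM", "OP_SUPPORT_CLAIM"}
DROPS = {"OP_DROP": 1, "OP_2DROP": 2}


def run_prefix(tokens):
    """Return the stack left behind by a claim script prefix."""
    stack = []
    for token in tokens:
        if token in CLAIM_OPCODES:
            stack.append(0)          # each claim opcode pushes a zero
        elif token in DROPS:
            for _ in range(DROPS[token]):
                stack.pop()
        else:
            stack.append(token)      # data vector such as a name or value
    return stack


PREFIXES = {
    "claim": "OP_CLAIM_NAME name value OP_2DROP OP_DROP",
    "update": "OP_UPDATE_CLAIM name claimId value OP_2DROP OP_2DROP",
    "support": "OP_SUPPORT_CLAIM name claimId OP_2DROP OP_DROP",
}

for label, script in PREFIXES.items():
    print(label, run_prefix(script.split()))  # each prefix leaves []
```

<p>Because every prefix leaves an empty stack, the payout script that follows runs exactly as it would in a standard transaction.</p>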
<p>For example, a claim transaction setting the name “Fruit” to “Apple” and using a pay-to-pubkey script will have the following payout script:</p>
<pre><code>OP_CLAIM_NAME Fruit Apple OP_2DROP OP_DROP OP_DUP OP_HASH160 &lt;addressOne&gt; OP_EQUALVERIFY OP_CHECKSIG
</code></pre>
<p>Like any standard Bitcoin transaction output script, it will be associated with a transaction hash and output index. The transaction hash and output index are concatenated and hashed to create the claimID for this claim. For the example above, let&rsquo;s say the above transaction hash is <code>7560111513bea7ec38e2ce58a58c1880726b1515497515fd3f470d827669ed43</code> and the output index is <code>1</code>. Then the claimID would be <code>529357c3422c6046d3fec76be2358004ba22e323</code>.</p>
<p>A support for this bid will have the following payout script:</p>
<pre><code>OP_SUPPORT_CLAIM Fruit 529357c3422c6046d3fec76be2358004ba22e323 OP_2DROP OP_DROP OP_DUP OP_HASH160 &lt;addressTwo&gt; OP_EQUALVERIFY OP_CHECKSIG
</code></pre>
<p>And now let&rsquo;s say we want to update the original claim to change the value to “Banana”. An update transaction has a special requirement that it must spend the existing claim that it wishes to update in its redeem script. Otherwise, it will be considered invalid and will not make it into the claimtrie. Thus it will have the following redeem script:</p>
<pre><code>&lt;signature&gt; &lt;pubKeyForAddressOne&gt;
</code></pre>
<p>This is identical to the standard way of redeeming a pay-to-pubkey script in Bitcoin.</p>
<p>The payout script for the update transaction is:</p>
<pre><code>OP_UPDATE_CLAIM Fruit 529357c3422c6046d3fec76be2358004ba22e323 Banana OP_2DROP OP_2DROP OP_DUP OP_HASH160 &lt;addressThree&gt; OP_EQUALVERIFY OP_CHECKSIG
</code></pre>
<h4 id="addresses">Addresses</h4>
<p>The address version byte is set to <code>0x55</code> for standard (pay-to-public-key-hash) addresses and <code>0x7a</code> for multisig (pay-to-script-hash) addresses. P2PKH addresses start with the letter <code>b</code>, and P2SH addresses start with <code>r</code>.</p>
<p>All the chain parameters are defined <a href="https://github.com/lbryio/lbrycrd/blob/master/src/chainparams.cpp">here</a>.</p>
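<p>The link between version bytes and address prefixes can be checked with a short sketch. It assumes LBRY keeps Bitcoin&rsquo;s Base58Check address encoding and only changes the version byte; <code>base58check</code> is a hypothetical helper and the all-zero hash is a placeholder, not a real address.</p>

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check(version, payload):
    """Encode a version byte plus payload with a 4-byte double-SHA256 checksum."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is written as a leading "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


placeholder_hash = bytes(20)  # stands in for a 20-byte public key or script hash
print(base58check(0x55, placeholder_hash)[0])  # b
print(base58check(0x7A, placeholder_hash)[0])  # r
```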
<h4 id="proof-of-payment">Proof of Payment</h4>
<p>TODO: Explain how transactions serve as proof that a client has made a valid payment for a piece of content.</p>
<h3 id="consensus">Consensus</h3>
<p>LBRY makes a few small changes to consensus rules.</p>
<h4 id="block-timing">Block Timing</h4>
<p>The target block time was lowered from 10 minutes to 2.5 minutes to facilitate faster transaction confirmation.</p>
<h4 id="difficulty-adjustment">Difficulty Adjustment</h4>
<p>The proof-of-work target is adjusted every block to better adapt to sudden changes in hash rate. The exact adjustment algorithm can be seen <a href="https://github.com/lbryio/lbrycrd/blob/master/src/lbry.cpp">here</a>.</p>
<h4 id="block-hash-algorithm">Block Hash Algorithm</h4>
<p>LBRY uses a combination of SHA256, SHA512, and RIPEMD160. The exact hashing algorithm can be seen <a href="https://github.com/lbryio/lbrycrd/blob/master/src/hash.cpp#L18">here</a>.</p>
<h4 id="block-rewards">Block Rewards</h4>
<p>The block reward schedule was adjusted to provide an initial testing period, a quick ramp-up to max block rewards, then a logarithmic decay to 0. The source for the algorithm is <a href="https://github.com/lbryio/lbrycrd/blob/master/src/main.cpp#L1594">here</a>.</p>
<h2 id="metadata">Metadata</h2>
<p>Claim metadata is stored in a serialized format using <a href="https://developers.google.com/protocol-buffers/">Protocol Buffers</a>. This was chosen for several reasons:</p>
<ul>
<li><strong>Extensibility</strong>. The metadata structure could grow to encompass thousands of fields for dozens of types of content. It should be easy to modify the structure while maintaining backward compatibility. Blockchain data is permanent and cannot be migrated.</li>
<li><strong>Compactness</strong>. Blockchain space is expensive. Data should be stored as compactly as possible.</li>
<li><strong>Interoperability</strong>. These definitions will be used by many projects written in different languages. Protocol buffers are language-independent and have great support for most popular languages.</li>
</ul>
<p>No enforcement or validation on metadata happens at the blockchain level. Instead, metadata encoding, decoding, and validation are done by clients. This allows evolution of the metadata without changes to consensus rules.</p>
<h3 id="metadata-specification">Metadata Specification</h3>
<p>A useful index of available content must be succinct yet meaningful. It should be machine-readable, comprehensive, and should ideally point you toward the content you&rsquo;re looking for. LBRY achieves this by defining a set of common properties for streams.</p>
<p>The metadata contains structured information describing the content, such as the title, author, language, and so on.</p>
<p>Here&rsquo;s an example:</p>
<pre><code>&quot;metadata&quot;: {
&quot;author&quot;: &quot;&quot;,
&quot;description&quot;: &quot;All proceeds go to holly for buying toys, i will post the video with those toys on Xmas day&quot;,
&quot;language&quot;: &quot;en&quot;,
&quot;license&quot;: &quot;All rights reserved.&quot;,
&quot;licenseUrl&quot;: &quot;&quot;,
&quot;nsfw&quot;: false,
&quot;preview&quot;: &quot;&quot;,
&quot;thumbnail&quot;: &quot;http://www.thetoydiscounter.com/happy.jpg&quot;,
&quot;title&quot;: &quot;Holly singing The Happy Working Song&quot;,
&quot;source&quot;: {
&quot;contentType&quot;: &quot;video/mp4&quot;,
&quot;source&quot;: &quot;92b8aae7a901c56901fd5602c1f1acc0e63fb5492ef2a3cd5b9c631d92cab2e060e2a908baa922c24dee6c5229a98136&quot;,
&quot;sourceType&quot;: &quot;lbry_sd_hash&quot;,
&quot;version&quot;: &quot;_0_0_1&quot;
},
&quot;version&quot;: &quot;_0_1_0&quot;
}
</code></pre>
<p>Because the metadata structure can and does change frequently, a complete specification is omitted from this document. Instead, <a href="https://github.com/lbryio/types">github.com/lbryio/types</a> should be consulted for the precise definition of current metadata structure.</p>
<h3 id="key-metadata-fields">Key Metadata Fields</h3>
<p>Despite not covering the full metadata structure, a few important metadata fields are highlighted below.</p>
<h4 id="streams-and-stream-hashes">Streams and Stream Hashes</h4>
<p>(The metadata property <code>lbry_sd_hash</code> contains a unique identifier used to locate the content in the data network. See <a href="#data">Data</a>.)</p>
<h4 id="fees-and-fee-structure">Fees and Fee Structure</h4>
<ul>
<li>LBC</li>
<li>Currencies?</li>
<li>channel signatures and private keys</li>
</ul>
<h3 id="identities">Identities</h3>
<p>Channels are the unit of identity in the LBRY system. A channel is a claim that starts with <code>@</code> and contains a metadata structure for identities rather than content. The most important part of a channel&rsquo;s metadata is the public key. Claims belonging to a channel are signed with the corresponding private key. A valid signature proves channel membership.</p>
<p>The purpose of channels is to allow content to be clustered under a single pseudonym or identity. This allows publishers to easily list all their content, maintain attribution, and build their brand.</p>
<p>Here&rsquo;s the value of an example channel claim:</p>
<pre><code>&quot;certificate&quot;: {
&quot;keyType&quot;: &quot;SECP256k1&quot;,
&quot;publicKey&quot;: &quot;3056301006072a8648ce3d020106052b8104000a0342
0004180488ffcb3d1825af538b0b952f0eba6933faa6
d8229609ac0aeadfdbcf49C59363aa5d77ff2b7ff06c
ddc07116b335a4a0849b1b524a4a69d908d69f1bcebb&quot;,
&quot;version&quot;: &quot;_0_0_1&quot;
}
</code></pre>
<p>When a claim is published into a channel, the claim data is signed and the following is added to the claim:</p>
<pre><code>&quot;publisherSignature&quot;: {
&quot;channelClaimID&quot;: &quot;2996b9a087c18456402b57cba6085b2a8fcc136d&quot;,
&quot;signature&quot;: &quot;bf82d53143155bb0cac1fd3d917c000322244b5aD17
e7865124db2ed33812ea66c9b0c3f390a65a9E2d452
e315e91ae695642847d88e90348ef3c1fa283a36a8&quot;,
&quot;signatureType&quot;: &quot;SECP256k1&quot;,
&quot;version&quot;: &quot;_0_0_1&quot;
}
</code></pre>
<h3 id="metadata-validation">Metadata Validation</h3>
<p>Clients are responsible for validating metadata, including data structure and signatures.</p>
<p>(expand)</p>
<ul>
<li>Validation 101</li>
<li>Channel / identity validation</li>
</ul>
<h2 id="data">Data</h2>
<p>(This portion covers how content is actually encoded and decoded, fetched, and announced. Expand/fix.)</p>
<h3 id="encoding-and-decoding">Encoding and Decoding</h3>
<!-- done -->
<h4 id="blobs">Blobs</h4>
<!-- done -->
<p>The unit of data in our network is called a <em>blob</em>. A blob is an encrypted chunk of data up to 2MiB in size. Each blob is indexed by its <em>blob hash</em>, which is a SHA384 hash of the blob contents. Addressing blobs by their hashes simultaneously protects against naming collisions and ensures that the content you get is what you expect.</p>
<p>Blobs are encrypted using AES-256 in CBC mode and PKCS7 padding. In order to keep each encrypted blob at 2MiB max, a blob can hold at most 2097151 bytes (2MiB minus 1 byte) of plaintext data. The source code for the exact algorithm is available <a href="https://github.com/lbryio/lbry.go/blob/master/stream/blob.go">here</a>. The encryption key and IV for each blob are stored as described below.</p>
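<p>The size limit follows from how PKCS7 padding works: it always adds between 1 and 16 bytes, so 2097151 bytes of plaintext pad to exactly 2MiB. The following is a minimal Python sketch of that arithmetic and of the blob hash. The function names are illustrative and are not taken from the LBRY code base.</p>

```python
import hashlib

MAX_BLOB_SIZE = 2 * 1024 * 1024          # 2 MiB limit on an encrypted blob
AES_BLOCK_SIZE = 16                      # AES block size in bytes
MAX_PLAINTEXT_SIZE = MAX_BLOB_SIZE - 1   # 2097151 bytes of plaintext


def blob_hash(blob: bytes) -> str:
    """Return the lowercase hex SHA-384 digest of a blob's contents."""
    return hashlib.sha384(blob).hexdigest()


def padded_length(plaintext_length: int) -> int:
    """Length after PKCS7 padding, which always adds 1 to 16 bytes."""
    return (plaintext_length // AES_BLOCK_SIZE + 1) * AES_BLOCK_SIZE
```

<p>With CBC mode the ciphertext is as long as the padded plaintext, so <code>padded_length(2097151)</code> is the largest value that still fits in a blob, while one more byte of plaintext would spill into an extra 16-byte block.</p>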
<h4 id="streams">Streams</h4>
<!-- done -->
<p>Multiple blobs are combined into a <em>stream</em>. A stream may be a book, a movie, a CAD file, etc. All content on the network is shared as streams. Every stream begins with the <em>manifest blob</em>, followed by one or more <em>content blobs</em>. The content blobs hold the actual content of the stream. The manifest blob contains information necessary to find the content blobs and convert them into a file. This includes the hashes of the content blobs, their order in the stream, and cryptographic material for decrypting them.</p>
<p>The blob hash of the manifest blob is called the <em>stream hash</em>. It uniquely identifies each stream.</p>
<h4 id="manifest-contents">Manifest Contents</h4>
<!-- done -->
<p>A manifest blob&rsquo;s contents are encoded using a canonical JSON encoding. The JSON encoding must be canonical to support consistent hashing and validation. The encoding is the same as standard JSON, but adds the following rules:</p>
<ul>
<li>Object keys must be quoted and lexicographically sorted.</li>
<li>All strings are hex-encoded. Hex letters must be lowercase.</li>
<li>Whitespace before, after, or between tokens is not permitted.</li>
<li>Floating point numbers, leading zeros, and &ldquo;minus 0&rdquo; for integers are not permitted.</li>
<li>Trailing commas after the last item in an array or object are not permitted.</li>
</ul>
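<p>As a rough illustration of these rules, the Python sketch below produces a canonical encoding with the standard <code>json</code> module. The helper names are hypothetical, and hex-encoding the UTF-8 bytes of a string is an assumption; this document does not state the character encoding.</p>

```python
import json


def hex_string(text: str) -> str:
    """Hex-encode a text string (assumed UTF-8) with lowercase hex digits."""
    return text.encode("utf-8").hex()


def reject_floats(value) -> None:
    """Raise ValueError if a float appears anywhere in the structure."""
    if isinstance(value, float):
        raise ValueError("floating point numbers are not permitted")
    if isinstance(value, dict):
        for item in value.values():
            reject_floats(item)
    elif isinstance(value, list):
        for item in value:
            reject_floats(item)


def canonical_json(value) -> bytes:
    """Serialize with sorted, quoted keys and no whitespace between tokens."""
    reject_floats(value)
    text = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return text.encode("ascii")
```

<p>Python integers are never written with leading zeros or as &ldquo;minus 0&rdquo;, and <code>json.dumps</code> never emits trailing commas, so those two rules need no extra handling here.</p>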
<p>Here&rsquo;s an example manifest, with whitespace added for readability:</p>
<!-- originally from 053b2f0f0e82e7f022837382733d5f5817dcd67027103fe43f00fa7a6f9fa8742c1022a851616c1ac15d1c60e89db3f4 -->
<pre><code>{
  &quot;blobs&quot;:[
    {
      &quot;blob_hash&quot;:&quot;a6daea71be2bb89fab29a2a10face08143411a5245edcaa5efff48c2e459e7ec01ad20edfde6da43a932aca45b2cec61&quot;,
      &quot;iv&quot;:&quot;ef6caef207a207ca5b14c0282d25ce21&quot;,
      &quot;length&quot;:2097152
    },
    {
      &quot;blob_hash&quot;:&quot;bf2717e2c445052366d35bcd58edb108cbe947af122d8f76b4856db577aeeaa2def5b57dbb80f7b1531296bd3e0256fc&quot;,
      &quot;iv&quot;:&quot;a37b291a37337fc1ff90ae655c244c1d&quot;,
      &quot;length&quot;:2097152
    },
    ...,
    {
      &quot;blob_hash&quot;:&quot;322973617221ddfec6e53bff4b74b9c21c968cd32ba5a5094d84210e660c4b2ed0882b114a2392a08b06183f19330aaf&quot;,
      &quot;iv&quot;:&quot;a00f5f458695bdc9d50d3dbbc7905abc&quot;,
      &quot;length&quot;:600160
    }
  ],
  &quot;filename&quot;:&quot;6b706a7977755477704d632e6d7034&quot;,
  &quot;key&quot;:&quot;94d89c0493c576057ac5f32eb0871180&quot;,
  &quot;version&quot;:1
}
</code></pre>
<p>The <code>key</code> field contains the key to decrypt the stream, and is optional. The key may be stored by a third party and made available to a client when presented with proof that the content was purchased. The <code>version</code> field is always 1. It is intended to signal structure changes in the future. The <code>length</code> field for each blob is the length of the encrypted blob, not the original file chunk.</p>
<p>Every stream must have at least two blobs: the manifest blob and a content blob. Consequently, zero-length streams are not allowed.</p>
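<p>A client might sanity-check a parsed manifest along the following lines. This is only a sketch. The checks are derived from statements in this section (version 1, at least one content blob, SHA384 blob hashes, encrypted length of at most 2MiB) plus the AES block size of 16 bytes; the function names and error messages are made up for illustration.</p>

```python
import json
import string

MAX_BLOB_SIZE = 2 * 1024 * 1024
AES_BLOCK_SIZE = 16
HEX_DIGITS = set(string.digits + "abcdef")


def is_lower_hex(value, length: int) -> bool:
    """True if value is a lowercase hex string with exactly `length` characters."""
    return (isinstance(value, str) and len(value) == length
            and set(value).issubset(HEX_DIGITS))


def check_manifest(raw: bytes) -> dict:
    """Parse a manifest blob and raise ValueError if it looks malformed."""
    manifest = json.loads(raw)
    if manifest.get("version") != 1:
        raise ValueError("unsupported manifest version")
    blobs = manifest.get("blobs")
    if not isinstance(blobs, list) or not blobs:
        raise ValueError("a stream needs at least one content blob")
    # Encrypted lengths are whole AES blocks, from 16 bytes up to 2 MiB.
    valid_lengths = range(AES_BLOCK_SIZE, MAX_BLOB_SIZE + 1, AES_BLOCK_SIZE)
    for blob in blobs:
        if not is_lower_hex(blob.get("blob_hash"), 96):   # SHA-384 is 48 bytes
            raise ValueError("blob_hash must be 96 lowercase hex characters")
        if not is_lower_hex(blob.get("iv"), 32):          # one AES block
            raise ValueError("iv must be 32 lowercase hex characters")
        length = blob.get("length")
        if not isinstance(length, int) or length not in valid_lengths:
            raise ValueError("length must be a multiple of 16, at most 2 MiB")
    return manifest
```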
<h4 id="stream-encoding">Stream Encoding</h4>
<!-- done -->
<p>A file must be encoded into a stream before it can be published. Encoding involves breaking the file into chunks, encrypting the chunks into content blobs, and creating the manifest blob. Here are the steps:</p>
<h5 id="setup">Setup</h5>
<!-- done -->
<ol>
<li>Generate a random 32-byte key for the stream.</li>
</ol>
<h5 id="content-blobs">Content Blobs</h5>
<!-- done -->
<ol>
<li>Break the file into chunks of at most 2097151 bytes.</li>
<li>Generate a random IV for each chunk.</li>
<li>Pad each chunk using PKCS7 padding.</li>
<li>Encrypt each chunk with AES-CBC using the stream key and the IV for that chunk.</li>
<li>An encrypted chunk is a blob.</li>
</ol>
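<p>The content blob steps above can be sketched in Python as follows. Python&rsquo;s standard library has no AES implementation, so the cipher is passed in as a caller-supplied function <code>encrypt_cbc(key, iv, padded)</code>; that parameter and all other names here are illustrative assumptions, not part of the LBRY code base.</p>

```python
import os
from typing import Callable, Iterator, List, Tuple

MAX_CHUNK_SIZE = 2097151   # 2 MiB minus 1 byte of plaintext per blob
AES_BLOCK_SIZE = 16        # also the IV length for AES-CBC


def pkcs7_pad(chunk: bytes) -> bytes:
    """Append N bytes of value N so the length becomes a multiple of 16."""
    pad = AES_BLOCK_SIZE - len(chunk) % AES_BLOCK_SIZE
    return chunk + bytes([pad]) * pad


def split_chunks(data: bytes) -> Iterator[bytes]:
    """Yield consecutive chunks of at most MAX_CHUNK_SIZE bytes."""
    for start in range(0, len(data), MAX_CHUNK_SIZE):
        yield data[start:start + MAX_CHUNK_SIZE]


def make_content_blobs(
    data: bytes,
    key: bytes,
    encrypt_cbc: Callable[[bytes, bytes, bytes], bytes],
) -> List[Tuple[bytes, bytes]]:
    """Return ordered (iv, blob) pairs, one per chunk of the input."""
    blobs = []
    for chunk in split_chunks(data):
        iv = os.urandom(AES_BLOCK_SIZE)  # fresh random IV for every chunk
        blobs.append((iv, encrypt_cbc(key, iv, pkcs7_pad(chunk))))
    return blobs
```

<p>Empty input yields no blobs at all, which matches the rule that zero-length streams are not allowed; a caller would reject that case before building a manifest.</p>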
<h5 id="manifest-blob">Manifest Blob</h5>
<!-- done -->
<ol>
<li>Fill in the manifest data.</li>
<li>Encode the data using the canonical JSON encoding described above.</li>
<li>Compute the stream hash.</li>
</ol>
<p>An implementation of this process is available <a href="https://github.com/lbryio/lbry.go/tree/master/stream">here</a>.
fixme: this link is for v0, not v1. need to implement v1 or drop the link.</p>
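<p>The manifest blob steps can be sketched in Python as shown below. The sketch assumes that the manifest blob consists of the canonical JSON bytes themselves and that the filename is hex-encoded from its UTF-8 bytes; neither detail is spelled out in this document, and the function names are hypothetical.</p>

```python
import hashlib
import json
from typing import List, Tuple


def build_manifest(filename: str, key: bytes,
                   blobs: List[Tuple[bytes, bytes]]) -> bytes:
    """Build the manifest from ordered (iv, encrypted_blob) pairs."""
    manifest = {
        "blobs": [
            {
                "blob_hash": hashlib.sha384(blob).hexdigest(),
                "iv": iv.hex(),
                "length": len(blob),   # length of the encrypted blob
            }
            for iv, blob in blobs
        ],
        "filename": filename.encode("utf-8").hex(),
        "key": key.hex(),
        "version": 1,
    }
    # Canonical form: sorted keys, no whitespace between tokens.
    text = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return text.encode("ascii")


def stream_hash(manifest_blob: bytes) -> str:
    """The stream hash is the blob hash (SHA-384) of the manifest blob."""
    return hashlib.sha384(manifest_blob).hexdigest()
```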
<h4 id="stream-decoding">Stream Decoding</h4>
<!-- done -->
<p>Decoding a stream is like encoding in reverse, with the added step of verifying that the expected blob hashes match the actual data.</p>
<ol>
<li>Verify that the manifest blob hash matches the stream hash you expect.</li>
<li>Parse the manifest blob contents.</li>
<li>Verify the hashes of the content blobs.</li>
<li>Decrypt and remove the padding from each content blob using the key and IVs in the manifest.</li>
<li>Concatenate the decrypted chunks in order.</li>
</ol>
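<p>A possible shape for these decoding steps in Python is sketched below. The AES-CBC decryption and the blob retrieval are passed in as caller-supplied functions (<code>decrypt_cbc</code> and <code>fetch_blob</code>, both hypothetical names), and the sketch assumes the manifest includes the optional <code>key</code> field.</p>

```python
import hashlib
import json


def sha384_hex(data: bytes) -> str:
    return hashlib.sha384(data).hexdigest()


def pkcs7_unpad(padded: bytes) -> bytes:
    """Strip PKCS7 padding, raising ValueError if it is malformed."""
    pad = padded[-1] if padded else 0
    if pad not in range(1, 17) or padded[-pad:] != bytes([pad]) * pad:
        raise ValueError("invalid PKCS7 padding")
    return padded[:-pad]


def decode_stream(expected_stream_hash, manifest_blob, fetch_blob, decrypt_cbc):
    """Verify, decrypt, and reassemble a stream.

    fetch_blob(blob_hash) returns the encrypted blob bytes;
    decrypt_cbc(key, iv, blob) returns the padded plaintext.
    """
    # 1. The manifest blob must hash to the stream hash we asked for.
    if sha384_hex(manifest_blob) != expected_stream_hash:
        raise ValueError("manifest blob does not match the stream hash")
    # 2. Parse the manifest.
    manifest = json.loads(manifest_blob)
    key = bytes.fromhex(manifest["key"])
    chunks = []
    for entry in manifest["blobs"]:
        blob = fetch_blob(entry["blob_hash"])
        # 3. Each content blob must hash to the value listed in the manifest.
        if sha384_hex(blob) != entry["blob_hash"]:
            raise ValueError("content blob hash mismatch")
        # 4. Decrypt and remove the padding.
        padded = decrypt_cbc(key, bytes.fromhex(entry["iv"]), blob)
        chunks.append(pkcs7_unpad(padded))
    # 5. Concatenate the chunks in manifest order.
    return b"".join(chunks)
```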
<h3 id="announce">Announce</h3>
<p>After a <a href="#streams">stream</a> is encoded, it must be <em>announced</em> to the network. Announcing is the process of letting other nodes on the network know that you have content available for download. The LBRY network tracks announced content using a distributed hash table.</p>
<h4 id="distributed-hash-table">Distributed Hash Table</h4>
<p><em>Distributed hash tables</em> (or DHTs) have proven to be an effective way to build a decentralized content network. Our DHT implementation follows the <a href="https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf">Kademlia</a>
spec fairly closely, with some modifications.</p>
<p>A distributed hash table is a key-value store that is spread over multiple host nodes in a network. Nodes may join or leave the network anytime, with no central coordination necessary. Nodes communicate with each other using a peer-to-peer protocol to advertise what data they have and what they are best positioned to store.</p>
<p>When a host connects to the DHT, it announces the hash for every <a href="#blobs">blob</a> it wishes to share. Downloading a blob from the network requires querying the DHT for a list of hosts that announced that blob&rsquo;s hash (called <em>peers</em>), then requesting the blob from the peers directly.</p>
<h4 id="announcing-to-the-dht">Announcing to the DHT</h4>
<p>A host announces a hash to the DHT in two steps. First, the host looks for nodes that are closest to the target hash that will be announced. Then the host announces the target hash to those nodes.</p>
<p>Finding the closest nodes is done via iterative <code>FindNode</code> DHT requests. The host starts with the closest nodes it knows about and sends a <code>FindNode(target_hash)</code> request to each of them. If any of the requests return nodes that are closer to the target hash, the host sends <code>FindNode</code> requests to those nodes to try to get even closer. When the <code>FindNode</code> requests no longer return nodes that are closer, the search ends.</p>
<p>Once the search is over, the host takes the 8 closest nodes it found and sends a <code>Store(target_hash)</code> request to them. The nodes receiving this request store the fact that the host is a peer for the target hash.</p>
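<p>The two announce steps can be simulated in a few lines of Python. In this sketch the network is an in-memory dictionary mapping each node ID to the node IDs it knows about, so a dictionary lookup stands in for a real <code>FindNode</code> request, and closeness is measured with Kademlia&rsquo;s XOR distance. IDs are small integers for readability, and every name is illustrative rather than taken from the LBRY implementation.</p>

```python
import heapq

K = 8  # the host sends Store to the 8 closest nodes it found


def distance(a: int, b: int) -> int:
    """Kademlia's XOR distance between two IDs (node IDs or hashes)."""
    return a ^ b


def find_closest_nodes(network, start_nodes, target, k=K):
    """Iterative FindNode: keep querying while closer nodes are returned."""
    seen = set(start_nodes)
    frontier = set(start_nodes)
    best = min(distance(n, target) for n in seen)
    while frontier:
        returned = set()
        for node in frontier:
            returned.update(network[node])  # simulated FindNode(target) reply
        # Only nodes that are new and closer than anything seen so far
        # are queried in the next round.
        closer = {n for n in returned - seen if best > distance(n, target)}
        seen |= returned
        if closer:
            best = min(distance(n, target) for n in closer)
        frontier = closer
    return heapq.nsmallest(k, seen, key=lambda n: distance(n, target))


def announce(network, stored, host, start_nodes, target, k=K):
    """Find the closest nodes, then record host as a peer on each of them."""
    closest = find_closest_nodes(network, start_nodes, target, k)
    for node in closest:
        # Simulated Store(target) request.
        stored.setdefault(node, {}).setdefault(target, set()).add(host)
    return closest
```

<p>A real implementation would send requests in parallel, handle timeouts, and bound the number of nodes in each response; those details are left out here.</p>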
<h3 id="download">Download</h3>
<p>A client wishing to download a <a href="#streams">stream</a> must first query the <a href="#distributed-hash-table">DHT</a> to find peers hosting the <a href="#blobs">blobs</a> in that stream, then contact those peers directly to download the blobs.</p>
<h4 id="querying-the-dht">Querying the DHT</h4>
<p>Querying works almost the same way as <a href="#announcing-to-the-dht">announcing</a>. A client looking for a target hash will start by sending iterative <code>FindValue(target_hash)</code> requests to the nodes it knows that are closest to the target hash. If a node receives a <code>FindValue</code> request and knows of any peers for the target hash, it will respond with a list of those peers. Otherwise, it will respond with the closest nodes to the target hash that it knows about. The client then queries those closer nodes using the same <code>FindValue</code> call. This way, each call either finds the client some peers, or brings it closer to finding those peers. If no peers are found and no closer nodes are being returned, the client will determine that the target hash is not available and will give up.</p>
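<p>The query loop can be simulated in Python in the same spirit. In this sketch <code>network</code> maps each node ID to the node IDs it knows about, and <code>peers_by_node</code> maps a node ID to the peers it has stored for each hash; both are in-memory stand-ins for real <code>FindValue</code> traffic, and all names are hypothetical.</p>

```python
def distance(a: int, b: int) -> int:
    """Kademlia's XOR distance between two IDs."""
    return a ^ b


def find_value(network, peers_by_node, start_nodes, target):
    """Return a sorted list of peers for target, or [] if none are found."""
    seen = set(start_nodes)
    frontier = set(start_nodes)
    best = min(distance(n, target) for n in seen)
    while frontier:
        peers = set()
        returned = set()
        for node in frontier:
            known_peers = peers_by_node.get(node, {}).get(target)
            if known_peers:
                peers.update(known_peers)       # reply carries peers
            else:
                returned.update(network[node])  # reply carries closer nodes
        if peers:
            return sorted(peers)
        # Continue only with new nodes that are closer than any seen so far.
        closer = {n for n in returned - seen if best > distance(n, target)}
        seen |= returned
        if closer:
            best = min(distance(n, target) for n in closer)
        frontier = closer
    return []  # no peers and no closer nodes: the hash is not available
```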
<h4 id="blob-exchange-protocol">Blob Exchange Protocol</h4>
<p>Downloading a blob from a peer is governed by the <em>Blob Exchange Protocol</em>. It is used by hosts and clients to check blob availability, exchange blobs, and negotiate data prices. The protocol is an RPC protocol using Protocol Buffers and the gRPC framework. It has five types of requests.</p>
<p>fixme: protocol does not <strong>negotiate</strong> anything right now. It just checks the price. Should we include negotiation in v1?</p>
<h5 id="pricecheck">PriceCheck</h5>
<p>PriceCheck gets the price that the server is charging for data transfer. It returns the prices in [[deweys]] per KB.</p>
<h5 id="downloadcheck">DownloadCheck</h5>
<p>DownloadCheck checks whether the server has certain blobs available for download. For each hash in the request, the server returns a true or false to indicate whether the blob is available.</p>
<h5 id="download-1">Download</h5>
<p>Download requests the blob for a given hash. The response contains the blob, its hash, and the address to which payment for the data transfer should be sent. If the blob is not available on the server, the response will instead contain an error.</p>
<h5 id="uploadcheck">UploadCheck</h5>
<p>UploadCheck asks the server whether blobs can be uploaded to it. For each hash in the request, the server returns a true or false to indicate whether it would accept a given blob for upload. In addition, if any of the hashes in the request is a stream hash and the server has the manifest blob for that stream but is missing some content blobs, it may include the hashes of those content blobs in the response.</p>
<h5 id="upload">Upload</h5>
<p>Upload sends a blob to the server. If uploading many blobs, the client should use the UploadCheck request to check which blobs the server actually needs. This avoids needlessly uploading blobs that the server already has. If a client tries to upload too many blobs that the server does not want, this may be considered a denial-of-service attack.</p>
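<p>The recommended order of calls (UploadCheck first, then Upload only for the blobs the server wants) might look like this in Python. The <code>InMemoryServer</code> class is a toy stand-in invented for this example. It is not the gRPC API, and its method names and return shapes are assumptions.</p>

```python
import hashlib


class InMemoryServer:
    """Toy stand-in for a blob exchange server (not the real gRPC service)."""

    def __init__(self):
        self.blobs = {}

    def upload_check(self, hashes):
        """For each hash, report whether the server would accept the blob."""
        return {h: h not in self.blobs for h in hashes}

    def upload(self, blob_hash, blob):
        """Store a blob after checking that it matches its claimed hash."""
        if hashlib.sha384(blob).hexdigest() != blob_hash:
            raise ValueError("blob does not match its hash")
        self.blobs[blob_hash] = blob


def upload_missing(server, blobs):
    """Upload only the blobs the server says it wants; return their hashes.

    blobs maps blob hash to blob bytes.
    """
    wanted = server.upload_check(list(blobs))
    sent = []
    for blob_hash, blob in blobs.items():
        if wanted[blob_hash]:
            server.upload(blob_hash, blob)
            sent.append(blob_hash)
    return sent
```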
<p>The protocol calls and message types are defined in detail <a href="https://github.com/lbryio/lbry.go/blob/master/blobex/blobex.proto">here</a>.</p>
<h3 id="reflector-blobex-upload">Reflector / BlobEx Upload</h3>
<h3 id="data-markets">Data Markets</h3>
<p>To incentivize hosts and reflectors, the blob exchange protocol supports payment for data.</p>
<p>(Price negotiation.)</p>
<!--
### Data Market
Hosts in the DHT can treat blobs as opaque chunks of data. There is price negotiation mechanism for data. So some hosts can be
purely interested in storing data and selling it. They may create algorithms for what data is more in demand (e.g. the first content
blob in a stream is probably requested more often than the last blob).
Talk about reputation system for hosts.
Talk about how lightning can be used for streaming payments.
-->
<h2 id="conclusion">Conclusion</h2>
<p><em>TODO</em></p>
<p><em>Edit this on GitHub: <a href="https://github.com/lbryio/spec">https://github.com/lbryio/spec</a></em></p>
<!---
### Supports, Tips
Supports add weight to name claims. They are kind of like voting. You retain control of the credits.
Tips are supports where person you tip gains control of the credits.
### Discovery
### Search
Search will be handled primarily by external indexing services. There are many existing search solutions
that would ingest the blockchain data and build an index of the content. The novel aspect of our system is
that the credits committed to a claim are a strong signal of relevance.
### Tagging
Tags provide extra information for content discovery. A tag has a claim ID and a name. Tags can be created,
supported, updated, and abandoned, just like claims. One key difference is that tag supports may be
labeled “negative” supports. A negative support reduces the effective amount of credits attached to a
tag. This is a signal that the tag is not a good fit for the content of the claim.
## Trust and Security
We believe that the
## Combatting the Ugly
Use this section to rebut some of the most common concerns regarding the nature of LBRY.
One of our core beliefs is that people want to pay the legitimate content owners and creators, as
long as the content reasonably-priced and the payment process is convenient.
Conclusion
Summary
-->
</body>
</html>