<main>

# LBRY: A Decentralized Digital Content Marketplace


<div class="notice">
  <p>Please excuse the unfinished state of this paper. It is being actively worked on. The content here is made available early because it contains useful information for developers.</p>
  <p>For more technical information about LBRY, visit <a href="https://lbry.tech">lbry.tech</a>.</p>
</div>


<div class="toc-menu">Menu</div>
<nav class="toc"></nav>
<div id="content">

<noscript>

## Table of Contents
 
<!-- this TOC is autogenerated for github preview or js-challenged browsers -->

<!--ts-->
* [Introduction](#introduction)
   * [Overview](#overview)
   * [Assumptions](#assumptions)
   * [Conventions and Terminology](#conventions-and-terminology)
* [Blockchain](#blockchain)
   * [Stakes](#stakes)
      * [Claims](#claims)
      * [Claim Properties](#claim-properties)
      * [Example Claim](#example-claim)
      * [Claim Operations](#claim-operations)
      * [Supports](#supports)
      * [Claimtrie](#claimtrie)
      * [Statuses](#stake-statuses)
         * [Accepted](#accepted)
         * [Abandoned](#abandoned)
         * [Active](#active)
         * [Controlling (claims only)](#controlling)
      * [Claimtrie Transitions](#claimtrie-transitions)
         * [Stake Activation](#stake-activation)
         * [Example](#transition-example)
      * [Normalization](#normalization)
   * [URLs](#urls)
      * [Components](#components)
         * [Stream Claim Name](#stream-claim-name)
         * [Channel Claim Name](#channel-claim-name)
         * [Channel Claim Name and Stream Claim Name](#channel-claim-name-and-stream-claim-name)
         * [Claim ID](#claim-id)
         * [Claim Sequence](#claim-sequence)
         * [Bid Position](#bid-position)
         * [Query Params](#query-params)
      * [Grammar](#grammar)
      * [Resolution](#resolution)
         * [No Modifier](#no-modifier)
         * [Claim ID](#claim-id-1)
         * [Claim Sequence](#claim-sequence-1)
         * [Bid Position](#bid-position-1)
         * [ChannelName and ClaimName](#channelname-and-claimname)
         * [Examples](#url-resolution-examples)
      * [Design Notes](#design-notes)
   * [Transactions](#transactions)
      * [Operations and Opcodes](#operations-and-opcodes)
         * [Claim Identifier Generation](#claim-identifier-generation)
         * [OP_CLAIM_NAME](#op-claim-name)
         * [OP_UPDATE_CLAIM](#op-update-claim)
         * [OP_SUPPORT_CLAIM](#op-support-claim)
      * [Tips](#tips)
      * [Addresses](#addresses)
      * [Proof of Payment](#proof-of-payment)
   * [Consensus](#consensus)
      * [Block Timing](#block-timing)
      * [Difficulty Adjustment](#difficulty-adjustment)
      * [Block Hash Algorithm](#block-hash-algorithm)
      * [Block Rewards](#block-rewards)
* [Metadata](#metadata)
   * [Specification](#specification)
      * [Example](#metadata-example)
   * [Key Fields](#key-fields)
      * [Source and Stream Hashes](#source-and-stream-hashes)
      * [Fees and Fee Structure](#fees-and-fee-structure)
      * [Title, Author, Description](#title-author-description)
      * [Language](#language)
      * [Thumbnail](#thumbnail)
      * [Media Type](#media-type)
   * [Channels (Identities)](#channels)
      * [Signing](#signing)
         * [Format Versions](#format-versions)
         * [Signing Process](#signing-process)
         * [Signature Validation](#signature-validation)
   * [Validation](#metadata-validation)
* [Data](#data)
   * [Encoding](#encoding)
      * [Blobs](#blobs)
      * [Streams](#streams)
      * [Manifest Contents](#manifest-contents)
      * [Stream Encoding](#stream-encoding)
         * [Setup](#setup)
         * [Content Blobs](#content-blobs)
         * [Manifest Blob](#manifest-blob)
      * [Stream Decoding](#stream-decoding)
   * [Announce](#announce)
      * [Distributed Hash Table](#distributed-hash-table)
      * [Announcing to the DHT](#announcing-to-the-dht)
   * [Download](#download)
      * [Querying the DHT](#querying-the-dht)
      * [Blob Exchange Protocol](#blob-exchange-protocol)
         * [PriceCheck](#pricecheck)
         * [DownloadCheck](#downloadcheck)
         * [Download](#download-1)
         * [UploadCheck](#uploadcheck)
         * [Upload](#upload)
   * [Reflectors and Data Markets](#reflectors-and-data-markets)
<!--te-->

</noscript>

<!-- 
fixme final polish checklist:

- go over the paper to make sure we use active voice in most places (though passive is better sometimes)
- standardize when we say "we do X" vs "LBRY does X"
- check that all anchors work
- check css across browsers/mobile
- 

-->


## Introduction

<!-- fixme -->

LBRY is a protocol for accessing and publishing digital content in a global, decentralized marketplace. Clients can use LBRY to publish, host, find, download, and pay for content — books, movies, music, or anything else that can be represented as a stream of bits. Anyone can participate and no permission is required, nor can anyone be blocked from participating. The system is distributed, so no single entity has unilateral control, nor will the removal of any single entity prevent the system from functioning.

TODO:

- why is it significant
- whom does it help
- why is it different/better than what existed before

### Overview

<!-- fix me -->

This document defines the LBRY protocol, its components, and how they fit together. LBRY consists of several discrete components that are used together in order to provide the end-to-end capabilities of the protocol. There are two distributed data stores (blockchain and DHT), a peer-to-peer protocol for exchanging data, and several specifications for data structure, transformation, and retrieval. 

### Assumptions

<!-- fix me -->

This document assumes that the reader is familiar with Bitcoin and blockchain technology. It does not attempt to document the Bitcoin protocol or explain how it works. The [Bitcoin developer reference](https://bitcoin.org/en/developer-reference) is recommended for anyone wishing to understand the technical details.

### Conventions and Terminology

<!-- fix me - rather than inline this here, I think we should use lbry.tech glossary definitions and the [[keyword]] syntax -->

<dl>
  <dt>file</dt>
  <dd>A single piece of content published using LBRY.</dd>

  <dt>blob</dt>
  <dd>The unit of data transmission on the data network. A published file is split into many blobs.</dd>

  <dt>stream</dt>
  <dd>A set of blobs that can be reassembled into a file. Every stream has a descriptor blob and one or more content blobs.</dd>

  <dt>blob hash</dt>
  <dd>The output of a cryptographic hash function applied to a blob. Hashes are used to uniquely identify blobs and to verify that the contents of the blob are correct. Unless otherwise specified, LBRY uses SHA-384 as the hash function.</dd>

  <dt>metadata</dt>
  <dd>Information about the contents of a stream (e.g. creator, description, stream descriptor hash, etc). Metadata is stored in the blockchain.</dd>

  <dt>name</dt>
  <dd>A human-readable UTF-8 string that is associated with a stream.</dd>

  <dt>stake</dt>
  <dd>An entry in the blockchain that commits credits toward a name.</dd>

  <dt>claim</dt>
  <dd>A stake that contains metadata about a stream or channel.</dd>

  <dt>support</dt>
  <dd>A stake that lends its credits to bolster an existing claim.</dd>

  <dt>channel</dt>
  <dd>The unit of pseudonymous publisher identity. Claims may be part of a channel.</dd>

  <dt>URL</dt>
  <dd>A reference to a claim that specifies how to retrieve it.</dd>
</dl>


## Blockchain


The LBRY blockchain is a public, proof-of-work blockchain. It serves three key purposes: 

1. An index of the content available on the network 
2. A payment system and record of purchases for priced content
3. Trustworthy publisher identities

The LBRY blockchain is a fork of the [Bitcoin](https://bitcoin.org/bitcoin.pdf) blockchain, with substantial modifications. This document will not cover or specify any aspects of LBRY that are identical to Bitcoin, and will instead focus on the differences.

### Stakes

A _stake_ is a single entry in the blockchain that commits credits toward a name. The two types of stakes are [_claims_](#claims) and [_supports_](#supports).

All stakes have these properties:

<dl>
  <dt>id</dt>
  <dd>A 20-byte hash unique among all stakes. See <a href="#claim-identifier-generation">Claim Identifier Generation</a>.</dd>
  <dt>amount</dt>
  <dd>A quantity of tokens used to back the stake. See <a href="#controlling">Controlling</a>.</dd>
</dl>


#### Claims

A _claim_ is a stake that stores metadata. There are two types of claims:

<dl>
  <dt>stream claim</dt>
  <dd>Declares the availability, access method, and publisher of a stream of bytes (an encoded file).</dd>
  <dt>channel claim</dt>
  <dd>Creates a pseudonym that can be declared as the publisher of a set of stream claims.</dd>
</dl>

#### Claim Properties

In addition to the properties that all stakes have, claims have two more properties:

<dl>
  <dt>name</dt>
  <dd>A normalized UTF-8 string of up to 255 bytes used to address the claim. See <a href="#urls">URLs</a> and <a href="#normalization">Normalization</a>.</dd>
  <dt>value</dt>
  <dd>Metadata about a stream or a channel. See <a href="#metadata">Metadata</a>.</dd>
</dl>
  
#### Example Claim

Here is an example stream claim:

```
{
  "claim_id": "6e56325c5351ceda2dd0795a30e864492910ccbf",
  "name": "lbry",
  "amount": 1.0,
  "value": {
    "stream": {
      "title": "What is LBRY?",
      "author": "Samuel Bryan",
      "description": "What is LBRY? An introduction with Alex Tabarrok",
      "language": "en",
      "license": "LBRY inc",
      "thumbnail": "https://s3.amazonaws.com/files.lbry.io/logo.png",
      "mediaType": "video/mp4",
      "stream_hash": "232068af6d51325c4821ac897d13d7837265812164021ec832cb7f18b9caf6c77c23016b31bac9747e7d5d9be7f4b752"
    }
  }
}
```
Note: The blockchain treats the `value` as an opaque byte string and does not impose any structure on it. Structure is applied and validated [higher in the stack](#metadata-validation). In this example, the value is shown in decoded form for demonstration purposes only.

#### Claim Operations

There are three claim operations: _create_, _update_, and _abandon_.

<dl>
  <dt>create</dt>
  <dd>Makes a new claim.</dd>
  <dt>update</dt>
  <dd>Changes the value or amount of an existing claim, without changing the claim ID.</dd>
  <dt>abandon</dt>
  <dd>Withdraws a claim, freeing the associated credits to be used for other purposes.</dd>
</dl>

#### Supports

A _support_ is a stake that lends its _amount_ to an existing claim.

Supports have one extra property on top of the basic stake properties: a `claim_id`. This is the ID of the claim that the support is bolstering. 

Supports are created and abandoned just like claims (see [Claim Operations](#claim-operations)). They cannot be updated or themselves supported.

#### Claimtrie

The _claimtrie_ is the data structure used to store the set of all claims and prove the correctness of claim resolution.

The claimtrie is implemented as a [Merkle tree](https://en.wikipedia.org/wiki/Merkle_tree) that maps names to claims. Claims are stored as leaf nodes in the tree. Names are stored as the path from the root node to the leaf node.

The _root hash_ is the hash of the root node. It is stored in the header of each block in the blockchain. Nodes in the LBRY network use the root hash to efficiently and securely validate the state of the claimtrie.

Multiple claims can exist for the same name. They are all stored in the leaf node for that name, sorted by the amount of credits backing the claim (descending), then by block height (ascending), then by transaction order in the block (ascending).

<!-- fix me above? "amount of credits backing each claim" is the effective amount, but that's not defined till later -->

For more details on the specific claimtrie implementation, see [the source code](https://github.com/lbryio/lbrycrd/blob/master/src/claimtrie.cpp).

#### Statuses {#stake-statuses}

Stakes can have one or more of the following statuses at a given block.

##### Accepted

An _accepted_ stake is one that has been entered into the blockchain. This happens when the transaction containing it is included in a block.

Accepted stakes do not affect the intra-leaf claim order until they are [active](#active).

The sum of the amount of a claim stake and all accepted supports is called the _total amount_.

##### Abandoned

An _abandoned_ stake is one that was withdrawn by its creator or current owner. Spending a transaction output that contains a stake will cause that stake to become abandoned. Abandoned claim stakes are removed from the claimtrie.

While data related to abandoned stakes still resides in the blockchain, it is improper to use this data to fetch the associated content. Active claim stakes signed by abandoned identities will no longer be reported as valid.

##### Active

<!-- fix me a lot -->

An _active_ stake is an accepted and non-abandoned stake that has been in the blockchain for an algorithmically determined number of blocks. The required length of time is called the _activation delay_.

If the stake is an update to an already active claim, is the first claim for a name, or does not cause a change in which claim is controlling the name, the activation delay is 0 (i.e. the stake becomes active in the same block it is accepted).

Otherwise, the activation delay is determined by a formula covered in [Claimtrie Transitions](#claimtrie-transitions). The formula's inputs are the height of the current block, the height at which the stake was accepted, and the height at which the controlling claim for that name last changed.

The sum of the amount of an active claim and all of its active supports is called its _effective amount_. Only the effective amount affects the sort order of claims in a leaf node. Claims that are not active have an effective amount of 0.

##### Controlling (claims only) {#controlling}

A _controlling_ claim is the active claim that is first in the sort order of a leaf node. That is, it has the highest effective amount of all claims with the same name. 

Only one claim can be controlling for a given name at a given block. 

#### Claimtrie Transitions

To determine the sort order of claims in a leaf node, the following algorithm is used:

1. For each claim, recalculate the effective amount.

2. Sort the claims by effective amount in descending order. Claims tied for the same amount are ordered by block height (ascending), then by transaction order within the block (ascending).

3. If the controlling claim from the previous block is still first in the order, then the sort is finished.

4. Otherwise, a takeover is occurring. Set the takeover height for this name to the current height, recalculate which stakes are now active, and return to step 1.

5. At this point, the claim with the greatest effective amount is the controlling claim at this block.

The purpose of step 4 is to handle the case when multiple competing claims are made on the same name in different blocks, and one of those claims becomes active but another still-inactive claim has the greatest effective amount. Step 4 will cause the greater claim to also activate and become the controlling claim.
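The steps above can be sketched in Python. This is a minimal illustration, not the lbrycrd implementation: the `Claim` and `Support` classes, the `order_leaf` function, and the `activation_height` field are hypothetical names, and each stake's activation height is assumed to have been computed when the stake was accepted. Step 4 is modeled by activating every accepted stake at the takeover height, which assumes the activation delay drops to 0 for all stakes on the name once a takeover occurs.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Support:
    amount: int
    activation_height: int


@dataclass
class Claim:
    claim_id: str
    amount: int
    accepted_height: int
    tx_position: int
    activation_height: int
    supports: List[Support] = field(default_factory=list)

    def effective_amount(self, height: int) -> int:
        # Claims that are not yet active have an effective amount of 0.
        if self.activation_height > height:
            return 0
        active = sum(s.amount for s in self.supports if s.activation_height <= height)
        return self.amount + active


def sort_claims(claims: List[Claim], height: int) -> List[Claim]:
    # Effective amount descending, then block height and tx order ascending.
    return sorted(
        claims,
        key=lambda c: (-c.effective_amount(height), c.accepted_height, c.tx_position),
    )


def order_leaf(
    claims: List[Claim], previous_controlling: Optional[str], height: int
) -> Tuple[List[Claim], bool]:
    """Return (ordered claims, whether a takeover happened) at `height`."""
    ordered = sort_claims(claims, height)  # steps 1 and 2
    if not ordered or ordered[0].claim_id == previous_controlling:
        return ordered, False  # step 3: no change in control
    # Step 4 (assumption): a takeover activates every accepted stake now.
    for claim in claims:
        claim.activation_height = min(claim.activation_height, height)
        for support in claim.supports:
            support.activation_height = min(support.activation_height, height)
    return sort_claims(claims, height), True  # step 5


def example_claims() -> List[Claim]:
    # Hypothetical data: four competing claims, one with a support.
    return [
        Claim("A", 10, 13, 0, 13, [Support(14, 1010)]),
        Claim("B", 20, 1001, 0, 1031),
        Claim("C", 50, 1020, 0, 1051),
        Claim("D", 300, 1040, 0, 1072),
    ]
```

Calling `order_leaf(example_claims(), "A", 1051)` puts claim D first, because claim C's activation triggers a takeover that also activates the larger claim D. The sketch assumes every claim passed in has already been accepted at or before `height`.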

##### Stake Activation

If a stake does not become active immediately, it becomes active at the block height determined by the following formula:

```
C + min(4032, floor((H-T) / 32))
```

Where: 

- C = stake height (height when the stake was accepted)
- H = current height
- T = takeover height (the most recent height at which the controlling claim for the name changed)

In written form, the delay before a stake becomes active is equal to the stake's height minus the height of the last takeover, divided by 32 and rounded down. The delay is capped at 4032 blocks, which is 7 days of blocks at 2.5 minutes per block (our target block time). The max delay is reached 224 (7x32) days after the last takeover.

The purpose of this delay function is to give long-standing claimants time to respond to changes, while still keeping takeover times reasonable and allowing recent or contentious claims to change state quickly.
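As a minimal sketch, the formula can be written as two small Python functions. The function and constant names are illustrative assumptions rather than names from the LBRY codebase. The sketch evaluates the formula at the height where the stake is accepted, so `H` equals `C`.

```python
MAX_DELAY_BLOCKS = 4032  # cap from the formula: 7 days at 2.5 minutes per block
DELAY_DIVISOR = 32


def activation_delay(current_height: int, takeover_height: int) -> int:
    """min(4032, floor((H - T) / 32)) for non-negative H - T."""
    return min(MAX_DELAY_BLOCKS, (current_height - takeover_height) // DELAY_DIVISOR)


def activation_height(stake_height: int, takeover_height: int) -> int:
    """C + delay, evaluated at the block where the stake was accepted (H == C)."""
    return stake_height + activation_delay(stake_height, takeover_height)


# A stake accepted at height 1001, with the last takeover at height 13:
# floor((1001 - 13) / 32) = 30, so it activates at height 1031.
print(activation_height(1001, 13))
```

The cap is reached once `H - T` is at least 4032 x 32 = 129,024 blocks, which at 2.5 minutes per block is the 224 days mentioned above.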

##### Example {#transition-example}

Here is a step-by-step example to illustrate the different scenarios. All stakes are for the same name.

**Block 13:** Claim A for 10LBC is accepted. It is the first claim, so it immediately becomes active and controlling.
<br>State: A(10) is controlling

**Block 1001:** Claim B for 20LBC is accepted. Its activation height is `1001 + min(4032, floor((1001-13) / 32)) = 1001 + 30 = 1031`.
<br>State: A(10) is controlling, B(20) is accepted.

**Block 1010:** Support X for 14LBC for claim A is accepted. Since it is a support for the controlling claim, it activates immediately.
<br>State: A(10+14) is controlling, B(20) is accepted.

**Block 1020:** Claim C for 50LBC is accepted. The activation height is `1020 + min(4032, floor((1020-13) / 32)) = 1020 + 31 = 1051`.
<br>State: A(10+14) is controlling, B(20) is accepted, C(50) is accepted.

**Block 1031:** Claim B activates. It has 20LBC, while claim A has 24LBC (10 original + 14 from support X). There is no takeover, and claim A remains controlling.
<br>State: A(10+14) is controlling, B(20) is active, C(50) is accepted.

**Block 1040:** Claim D for 300LBC is accepted. The activation height is `1040 + min(4032, floor((1040-13) / 32)) = 1040 + 32 = 1072`.
<br>State: A(10+14) is controlling, B(20) is active, C(50) is accepted, D(300) is accepted.

**Block 1051:** Claim C activates. It has 50LBC, while claim A has 24LBC, so a takeover is initiated. The takeover height for this name is set to 1051, and therefore the activation delay for all the claims becomes `min(4032, floor((1051-1051) / 32)) = 0`. All the claims become active. The totals for each claim are recalculated, and claim D becomes controlling because it has the highest total.
<br>State: A(10+14) is active, B(20) is active, C(50) is active, D(300) is controlling.


#### Normalization

Names in the claimtrie are normalized prior to comparison to avoid confusion due to Unicode equivalence or casing. When names are being compared, they are first converted using [Unicode Normalization Form D](http://unicode.org/reports/tr15/#Norm_Forms) (NFD), then lowercased using the en_US locale when possible. Since claims competing for the same name are stored in the same node in the claimtrie, names are also normalized to determine the claimtrie path to the node.
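A rough Python approximation of this normalization is shown below. The function name is hypothetical. Python's `str.lower()` is locale-independent, so it only approximates "lowercased using the en_US locale"; the exact casing rules used by the blockchain may differ for some characters.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Approximate claimtrie name normalization: NFD, then lowercase."""
    decomposed = unicodedata.normalize("NFD", name)
    # Assumption: locale-independent lowercasing stands in for en_US casing.
    return decomposed.lower()


# A precomposed "é" and an "E" followed by a combining accent compare equal
# after normalization.
print(normalize_name("Caf\u00e9") == normalize_name("CAFE\u0301"))
```

Comparing normalized names in this way is what lets two visually identical names compete in the same claimtrie node.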

### URLs

<!-- fix me:
  jeremy: @grin does SPV need a mention inside of the document? 
  grin: no, but we should probably include an example for how to do the validation using the root hash. its not strictly necessary because its similar to how bitcoin does it. so maybe link to https://lbry.tech/resources/claimtrie (which needs an update) and add a validation example there?
  -->

URLs are human-readable references to claims. All URLs:

1. must contain a name (see [Claim Properties](#claim-properties)), and
2. resolve to a single, specific claim for that name

The ultimate purpose of much of the claim and blockchain design is to provide human-readable URLs that can be provably resolved by clients without a full copy of the blockchain (i.e. [Simplified Payment Verification](https://lbry.tech/glossary#spv) wallets).

It is possible to write short, human-readable, and memorable URLs. 


#### Components

A URL is a name with one or more modifiers. A bare name on its own will resolve to the [controlling claim](#controlling) at the latest block height. Common URL structures are:

##### Stream Claim Name

A basic stream claim.

```
lbry://meet-lbry
```

##### Channel Claim Name

A basic channel claim.

```
lbry://@lbry
```

##### Channel Claim Name and Stream Claim Name

A URL containing both a channel and a stream claim name. URLs containing both are resolved in two steps. First, the channel is resolved to its associated claim. Then the stream claim name is resolved to get the appropriate claim from among the claims in the channel.

```
lbry://@lbry/meet-lbry
```

##### Claim ID

A claim for this name with this claim ID. Partial prefix matches are allowed (see [Resolution](#resolution)).

```
lbry://meet-lbry#7a0aa95c5023c21c098
lbry://meet-lbry#7a
lbry://@lbry#3f/meet-lbry
```

##### Claim Sequence

The _n_th claim for this name, in the order the claims entered the blockchain. _n_ must be a positive number. This can be used to resolve claims in the order in which they were accepted, rather than by which claim has the most credits backing it.

```
lbry://meet-lbry:1
lbry://@lbry:1/meet-lbry
```

##### Bid Position

The _n_th claim for this name, ordered by total amount (highest first). _n_ must be a positive number. This is useful for resolving non-winning bids in bid order.

```
lbry://meet-lbry$2
lbry://meet-lbry$3
lbry://@lbry$2/meet-lbry
```

##### Query Params

These parameters are reserved for future use.

```
lbry://meet-lbry?arg=value&arg2=value2
```

#### Grammar

The full URL grammar is defined using [XQuery EBNF notation](https://www.w3.org/TR/2017/REC-xquery-31-20170321/#EBNFNotation):

<!-- use http://bottlecaps.de/rr/ui for visuals -->

```
URL ::= Scheme Path Query?

Scheme ::= 'lbry://'

Path ::=  StreamClaimNameAndModifier | ChannelClaimNameAndModifier ( '/' StreamClaimNameAndModifier )?

StreamClaimNameAndModifier ::= StreamClaimName Modifier?
ChannelClaimNameAndModifier ::= ChannelClaimName Modifier?

StreamClaimName ::= NameChar+
ChannelClaimName ::= '@' NameChar+

Modifier ::= ClaimID | ClaimSequence | BidPosition
ClaimID ::= '#' Hex+
ClaimSequence ::= ':' PositiveNumber
BidPosition ::= '$' PositiveNumber

Query ::= '?' QueryParameterList
QueryParameterList ::= QueryParameter ( '&' QueryParameterList )*
QueryParameter ::= QueryParameterName ( '=' QueryParameterValue )?
QueryParameterName ::= NameChar+
QueryParameterValue ::= NameChar+

PositiveDigit ::= [123456789]
Digit ::= '0' | PositiveDigit
PositiveNumber ::= PositiveDigit Digit*

HexAlpha ::= [abcdef]
Hex ::= (Digit | HexAlpha)+

NameChar ::= Char - [=&#:$@?/]  /* any character that is not reserved */
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
```
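To illustrate how a client might apply this grammar, here is a small Python parser sketch. The `Part` type and the `parse_part` and `parse_url` functions are hypothetical names. The sketch simplifies the grammar in two ways: it does not enforce the `Char` restrictions on control characters, and it returns the query string unparsed because query parameters are reserved for future use.

```python
import re
from typing import NamedTuple, Optional, Tuple

SCHEME = "lbry://"

# One path component: a name (optionally prefixed with '@') and at most one modifier.
PART_RE = re.compile(
    r"(?P<name>@?[^=&#:$@?/]+)"
    r"(?:#(?P<claim_id>[0-9a-f]+)"
    r"|:(?P<sequence>[1-9][0-9]*)"
    r"|\$(?P<bid_position>[1-9][0-9]*))?"
)


class Part(NamedTuple):
    name: str
    claim_id: Optional[str] = None
    sequence: Optional[int] = None
    bid_position: Optional[int] = None


def parse_part(text: str) -> Part:
    match = PART_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid name or modifier: {text!r}")
    seq, bid = match["sequence"], match["bid_position"]
    return Part(
        match["name"],
        match["claim_id"],
        int(seq) if seq else None,
        int(bid) if bid else None,
    )


def parse_url(url: str) -> Tuple[Optional[Part], Optional[Part], Optional[str]]:
    """Return (channel, stream, raw query string)."""
    if not url.startswith(SCHEME):
        raise ValueError("URL must start with lbry://")
    path, _, query = url[len(SCHEME):].partition("?")
    pieces = path.split("/")
    if len(pieces) == 1:
        part = parse_part(pieces[0])
        channel, stream = (part, None) if part.name.startswith("@") else (None, part)
    elif len(pieces) == 2:
        channel, stream = parse_part(pieces[0]), parse_part(pieces[1])
        if not channel.name.startswith("@") or stream.name.startswith("@"):
            raise ValueError("expected @channel/stream")
    else:
        raise ValueError("too many path components")
    return channel, stream, query or None
```

For example, `parse_url("lbry://@lbry#3f/meet-lbry")` yields a channel part with `claim_id="3f"` and a stream part named `meet-lbry`, while `parse_url("lbry://meet-lbry:0")` raises `ValueError` because a claim sequence must be a positive number.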

#### Resolution

URL _resolution_ is the process of translating a URL into the associated claim ID and metadata. 

##### No Modifier

Return the controlling claim for the name. Stream claims and channel claims are resolved the same way.

##### Claim ID

Get all claims for the claim name whose IDs start with the given `ClaimID`. Sort the claims in ascending order by block height and position within the block. Return the first claim.

##### Claim Sequence

Get all claims for the claim name. Sort the claims in ascending order by block height and position within the block. Return the Nth claim, where N is the given `ClaimSequence` value.

##### Bid Position

Get all claims for the claim name. Sort the claims in descending order by total effective amount. Return the Nth claim, where N is the given `BidPosition` value.
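The modifier rules above can be sketched as a single Python function over an in-memory list of claims. The `Claim` type, its field names, and `resolve` are hypothetical. The sketch assumes names are already normalized, treats the controlling claim as the first claim in bid order, and borrows the claimtrie tie-breaking order (block height, then position) for claims with equal amounts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    name: str
    claim_id: str
    height: int            # block height at which the claim was accepted
    tx_position: int       # position within that block
    effective_amount: int


def by_age(claims: List[Claim]) -> List[Claim]:
    return sorted(claims, key=lambda c: (c.height, c.tx_position))


def by_bid(claims: List[Claim]) -> List[Claim]:
    # Assumption: ties are broken the same way as in the claimtrie leaf order.
    return sorted(claims, key=lambda c: (-c.effective_amount, c.height, c.tx_position))


def resolve(
    claims: List[Claim],
    name: str,
    claim_id: Optional[str] = None,
    sequence: Optional[int] = None,
    bid_position: Optional[int] = None,
) -> Optional[Claim]:
    """Resolve a name plus at most one modifier; return None if not found."""
    candidates = [c for c in claims if c.name == name]
    if claim_id is not None:
        # Claim ID: earliest claim whose ID starts with the given prefix.
        matches = by_age([c for c in candidates if c.claim_id.startswith(claim_id)])
        return matches[0] if matches else None
    if sequence is not None:
        # Claim Sequence: Nth claim in the order claims were accepted.
        ordered = by_age(candidates)
        return ordered[sequence - 1] if 0 < sequence <= len(ordered) else None
    # Bid Position, or no modifier (treated here as bid position 1).
    ordered = by_bid(candidates)
    n = 1 if bid_position is None else bid_position
    return ordered[n - 1] if 0 < n <= len(ordered) else None
```

Because a claim ID prefix can match several claims, returning the earliest match means that a short prefix keeps resolving to the same claim even if later claims share it.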

##### ChannelName and ClaimName

<!-- fix me: explain how claim signing works, and what it means to be **in** a channel -->

If both a channel name and a claim name are present, resolution happens in two steps. First, remove the `/` and `StreamClaimNameAndModifier` from the path, and resolve the URL as if it only had a `ChannelClaimNameAndModifier`. Then get the list of all claims in that channel. Finally, resolve the `StreamClaimNameAndModifier` as if it were its own URL, but instead of considering all claims, only consider the set of claims in the channel.

If multiple claims for the same name exist inside the same channel, they are resolved via the same resolution rules applied entirely within the sub-scope of the channel.

##### Examples {#url-resolution-examples}

Suppose the following names were claimed in the following order:


Name            | Claim ID  | Amount
:---            | :---      | :---
apple           | 690eea    | 1
banana          | 714a3f    | 2
cherry          | bfaabb    | 100
apple           | 690eea    | 10
@Arthur         | b7bab5    | 1
@Bryan          | 0da517    | 1
@Chris          | b3f7b1    | 1
@Chris/banana   | fc861c    | 1
@Arthur/apple   | a37ee1    | 20
@Bryan/cherry   | a18bca    | 10
@Chris          | 005a7d    | 100
@Arthur/cherry  | d39aa0    | 20

Here is how the following URLs should resolve:

URL                          | Claim ID     | Note
:---                         | :---         | :---
`lbry://apple`               | a37ee1 
`lbry://banana`              | 714a3f 
`lbry://@Chris`              | 005a7d 
`lbry://@Chris/banana`       | _not found_  | the controlling `@Chris` does not have a `banana`
`lbry://@Chris:1/banana`     | fc861c
`lbry://@Chris:#fc8/banana`  | fc861c
`lbry://cherry`              | bfaabb 
`lbry://@Arthur/cherry`      | d39aa0 
`lbry://@Bryan`              | 0da517 
`lbry://banana$1`            | 714a3f 
`lbry://banana$2`            | fc861c 
`lbry://banana$3`            | _not found_
`lbry://@Arthur:1`           |  b7bab5

#### Design Notes

The most contentious aspect of this design has been the choice to resolve naked names (sometimes called _vanity names_) to the claim with the highest effective amount.

First, it is important to note the problems in existing domain allocation design. Most existing public name schemes are first-come, first-served with a fixed price. This leads to several bad outcomes:

1. Speculation and extortion. Entrepreneurs are incentivized to register common names even if they don't intend to use them, in hopes of selling them to the proper owner in the future for an exorbitant price. While speculation in general can have positive externalities (stable prices and price signals), in this case it is pure value extraction. Speculation also harms users, who will see the vast majority of URLs sitting unused (cf. Namecoin).

2. Bureaucracy and transaction costs. While a centralized system can allow for an authority to use a process to reassign names based on trademark or other common use reasons, this system is also imperfect. Most importantly, it is a censorship point and an avenue for complete exclusion. Additionally, such processes are often arbitrary, change over time, involve significant transaction costs, and _still_ lead to names being used in ways that are contrary to user expectation (e.g. [nissan.com](http://nissan.com)).

3. Inefficiencies from price controls. Any system that does not allow a price to float freely creates inefficiencies. If the set price is too low, we facilitate speculation and rent-seeking. If the price is too high, we see people excluded from a good that it would otherwise be beneficial for them to purchase.

We sought an algorithmic design built into consensus that would allow URLs to flow to their highest valued use. Following [Coase](https://en.wikipedia.org/wiki/Coase_theorem), this staking design allows for clearly defined rules, low transaction costs, and no information asymmetry, minimizing inefficiency in URL allocation.

Finally, it's important to note that _only_ vanity URLs have this property. Extremely short, memorable URLs like `lbry://myclaimname#a` exist and are available for the minimal cost of issuing a transaction.


### Transactions

The LBRY blockchain includes the following changes to Bitcoin's transaction scripting language.

#### Operations and Opcodes

To enable [claim operations](#claim-operations), we added three new opcodes to the scripting language: `OP_CLAIM_NAME`, `OP_UPDATE_CLAIM`, and `OP_SUPPORT_CLAIM`. In Bitcoin they are respectively `OP_NOP6`, `OP_NOP7`, and `OP_NOP8`. The opcodes are used in output scripts to interact with the claimtrie. Each opcode is followed by one or more parameters. Here's how these opcodes are used:

```
OP_CLAIM_NAME <name> <value> OP_2DROP OP_DROP <outputScript>

OP_UPDATE_CLAIM <name> <claimId> <value> OP_2DROP OP_2DROP <outputScript>

OP_SUPPORT_CLAIM <name> <claimId> OP_2DROP OP_DROP <outputScript>
```

The `<name>` parameter is the [[name]] that the claim is associated with. `<value>` is the protobuf-encoded claim metadata and optional channel signature (see [Metadata](#metadata) for more about this value). The `<claimId>` is the claim ID of a previous claim that is being updated or supported.

Each opcode will push a zero onto the execution stack. Those zeros, as well as any additional parameters after the opcodes, are all dropped by `OP_2DROP` and `OP_DROP`. `<outputScript>` can be any valid script, so a script using these opcodes is also a pay-to-pubkey script. This means that claim scripts can be spent just like regular Bitcoin output scripts.
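As an illustration of the three script layouts, the following Python sketch splits a human-readable script into its claim operation, parameters, and trailing output script. It is an assumption-laden toy: `parse_claim_script` and `CLAIM_OPS` are hypothetical names, and the sketch works on whitespace-separated text, whereas real scripts are binary and their parameters may contain arbitrary bytes.

```python
from typing import Dict, List, Optional, Tuple, Union

# For each opcode: (parameter names, drop opcodes that must follow them).
CLAIM_OPS: Dict[str, Tuple[Tuple[str, ...], Tuple[str, ...]]] = {
    "OP_CLAIM_NAME": (("name", "value"), ("OP_2DROP", "OP_DROP")),
    "OP_UPDATE_CLAIM": (("name", "claim_id", "value"), ("OP_2DROP", "OP_2DROP")),
    "OP_SUPPORT_CLAIM": (("name", "claim_id"), ("OP_2DROP", "OP_DROP")),
}


def parse_claim_script(asm: str) -> Optional[Dict[str, Union[str, List[str]]]]:
    """Return the claim operation in a text-form script, or None if it has none."""
    tokens = asm.split()
    if not tokens or tokens[0] not in CLAIM_OPS:
        return None  # an ordinary output script with no claim operation
    op = tokens[0]
    fields, drops = CLAIM_OPS[op]
    params = tokens[1:1 + len(fields)]
    drop_start = 1 + len(fields)
    drop_end = drop_start + len(drops)
    if len(params) != len(fields) or tuple(tokens[drop_start:drop_end]) != drops:
        raise ValueError(f"malformed {op} script")
    result: Dict[str, Union[str, List[str]]] = {"op": op}
    result.update(zip(fields, params))
    # Whatever follows the drops is the regular output script.
    result["output_script"] = tokens[drop_end:]
    return result


parsed = parse_claim_script(
    "OP_SUPPORT_CLAIM Fruit 529357c3 OP_2DROP OP_DROP OP_DUP OP_HASH160 <address> OP_EQUALVERIFY OP_CHECKSIG"
)
print(parsed["op"], parsed["name"], parsed["claim_id"])
```

The drop opcodes are checked explicitly because they are what removes the claim parameters from the stack before the ordinary output script runs.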

##### Claim Identifier Generation

Like any standard Bitcoin output script, a claim script will be associated with a transaction hash and output index. This combination of transaction hash and index is called an _outpoint_. Each claim script has a unique outpoint. The outpoint is hashed using SHA-256 and RIPEMD-160 to generate the claim ID for a claim. For example, suppose a claim script is included in transaction `7560111513bea7ec38e2ce58a58c1880726b1515497515fd3f470d827669ed43` at output index `1`. Then the claim ID would be `529357c3422c6046d3fec76be2358004ba22e323`. An implementation of this is available [here](https://github.com/lbryio/lbry.go/blob/master/lbrycrd/blockchain.go).


##### OP\_CLAIM\_NAME

New claims are created using `OP_CLAIM_NAME`. For example, a claim transaction setting the name `Fruit` to the value `Apple` will look like this:

```
OP_CLAIM_NAME Fruit Apple OP_2DROP OP_DROP OP_DUP OP_HASH160 <address> OP_EQUALVERIFY OP_CHECKSIG
```


##### OP\_UPDATE\_CLAIM

`OP_UPDATE_CLAIM` updates a claim by replacing its metadata. An update transaction has an added requirement that it must spend the output for the existing claim that it wishes to update. Otherwise, it will be considered invalid and will not make it into the claimtrie. Thus it must have the following redeem script:

```
<signature> <pubKeyForPreviousAddress>
```

The syntax is identical to the standard way of redeeming a pay-to-pubkey script in Bitcoin, with the caveat that `<pubKeyForPreviousAddress>` must be the public key for the address of the output that contains the claim that will be updated. 

To change the value of the previous example claim to “Banana”, the payout script would be:

```
OP_UPDATE_CLAIM Fruit 529357c3422c6046d3fec76be2358004ba22e323 Banana OP_2DROP OP_2DROP OP_DUP OP_HASH160 <address> OP_EQUALVERIFY OP_CHECKSIG
```

The `<address>` in this script may be the same as the address in the original transaction, or it may be a new address.

##### OP\_SUPPORT\_CLAIM

A support for the original example claim will have the following payout script:

```
OP_SUPPORT_CLAIM Fruit 529357c3422c6046d3fec76be2358004ba22e323 OP_2DROP OP_DROP OP_DUP OP_HASH160 <address> OP_EQUALVERIFY OP_CHECKSIG
```

The `<address>` in this script may be the same as the address in the original transaction, or it may be a new address.


#### Tips

<!-- fixme: describe how tips are different from supports -->


#### Addresses

The address version byte is set to `0x55` for standard (pay-to-public-key-hash) addresses and `0x7a` for multisig (pay-to-script-hash) addresses. P2PKH addresses start with the letter `b`, and P2SH addresses start with `r`.
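The effect of the version byte on the leading character can be checked with a short Python sketch. It assumes that LBRY keeps Bitcoin's Base58Check encoding unchanged apart from the version byte; the function and constant names are illustrative, and the 20-byte hashes used below are placeholders rather than real keys.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
P2PKH_VERSION = 0x55  # standard addresses
P2SH_VERSION = 0x7A   # pay-to-script-hash addresses


def base58check(version: int, payload: bytes) -> str:
    """Encode version byte + payload + 4-byte double-SHA-256 checksum in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by the first alphabet character.
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return ALPHABET[0] * leading_zeros + encoded


placeholder_hash = bytes(20)  # hypothetical 20-byte public key hash
print(base58check(P2PKH_VERSION, placeholder_hash)[0])
print(base58check(P2SH_VERSION, placeholder_hash)[0])
```

With a 20-byte payload, the version byte dominates the most significant Base58 digit, which is why every address of a given type shares the same first letter.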

All the chain parameters are defined [here](https://github.com/lbryio/lbrycrd/blob/master/src/chainparams.cpp).

#### Proof of Payment

<!-- fixme -->

TODO: Explain how transactions serve as proof that a client has made a valid payment for a piece of content.


### Consensus

LBRY makes a few small changes to consensus rules.

#### Block Timing 

The target block time was lowered from 10 minutes to 2.5 minutes to facilitate faster transaction confirmation.

#### Difficulty Adjustment

The proof-of-work target is adjusted every block to better adapt to sudden changes in hash rate. The exact adjustment algorithm can be seen [here](https://github.com/lbryio/lbrycrd/blob/master/src/lbry.cpp).

#### Block Hash Algorithm

LBRY uses a combination of SHA-256, SHA-512, and RIPEMD-160. The exact hashing algorithm can be seen [here](https://github.com/lbryio/lbrycrd/blob/master/src/hash.cpp#L18).

#### Block Rewards

The block reward schedule was adjusted to provide an initial testing period, a quick ramp-up to max block rewards, then a logarithmic decay to 0. The source for the algorithm is [here](https://github.com/lbryio/lbrycrd/blob/master/src/main.cpp#L1594).


## Metadata

Metadata is structured information about the stream or channel separate from the content itself (e.g. the title, language, media type, etc.). It is stored in the [value property](#claim-properties) of a claim.

Metadata is stored in a serialized binary format via [Protocol Buffers](https://developers.google.com/protocol-buffers/). This allows for metadata to be:

- **Extensible**. Metadata can encompass thousands of fields for dozens of types of content. It must be efficient to both modify the structure and maintain backward compatibility.
- **Compact**. Blockchain space is expensive. Data must be stored as compactly as possible.
- **Interoperable**. Metadata will be used by many projects written in different languages.

The serialized metadata may be signed to indicate membership in a channel. See [Channels](#channels) for more info.

### Specification

As the metadata specification is designed to grow and change frequently, the full specification will not be examined here. The [types](https://github.com/lbryio/types) repository is considered the precise specification.

Instead, let's look at an example and some key fields.

#### Example {#metadata-example}

Here’s some example metadata:

```
{
  "stream": {
    "title": "What is LBRY?",
    "author": "Samuel Bryan",
    "description": "What is LBRY? An introduction with Alex Tabarrok",
    "language": "en",
    "license": "LBRY inc",
    "thumbnail": "https://s3.amazonaws.com/files.lbry.io/logo.png",
    "mediaType": "video/mp4",
    "streamHash": "232068af6d51325c4821ac897d13d7837265812164021ec832cb7f18b9caf6c77c23016b31bac9747e7d5d9be7f4b752"
  }
}
```

### Key Fields

Some important metadata fields are highlighted below.

#### Source and Stream Hashes

The `source` property contains information about how to fetch the data from the network. Within the `source` is a unique identifier used to locate the content in the data network. More in [[Data]].

#### Fees and Fee Structure

<!-- fix me extensively -->

- LBC
- Currencies?
- channel signatures and private keys

#### Title, Author, Description 

Basic information about the stream.

#### Language

The [ISO 639-1](https://www.iso.org/iso-639-language-codes.html) two-letter code for the language of the stream.

#### Thumbnail

A URL to be used to display an image associated with the content.

#### Media Type

The media type of the item as [defined](https://www.iana.org/assignments/media-types/media-types.xhtml) by the IANA.


### Channels (Identities) {#channels}

Channels are the unit of identity in the LBRY system. A channel is a claim for a name beginning with `@` that contains a metadata structure for identity rather than content. Included in the metadata is the channel's public key. Here's an example:

```
"claim_id": "6e56325c5351ceda2dd0795a30e864492910ccbf",
"name": "@lbry",
"amount": 6.26,
"value": {
  "channel": {
    "keyType": "SECP256k1",
    "publicKey": "3056301006072a8648ce3d020106052b8104000a03420004180488ffcb3d1825af538b0b952f0eba6933faa6d8229609ac0aeadfdbcf49C59363aa5d77ff2b7ff06cddc07116b335a4a0849b1b524a4a69d908d69f1bcebb"
  }
}
```

Claims published to a channel contain a signature made with the corresponding private key. A valid signature proves channel membership.

The purpose of channels is to allow content to be clustered under a single pseudonym or identity. This allows publishers to easily list all their content, maintain attribution, and build their brand.


#### Signing

A claim is considered part of a channel when its metadata is signed by the channel's private key. Here's the structure of a signed metadata value:


field          | size     | description
:---           | :---     | :---
Version        | 1 byte   | Format version. See [Format Versions](#format-versions).
Channel ID     | 20 bytes | Claim ID of the channel claim that contains the matching public key. _Skip this field if there is no signature._
Signature      | 64 bytes | The signature. _Skip this field if there is no signature._
Payload        | variable | The protobuf-encoded metadata.


##### Format Versions

The following formats are supported: 

format     | description
:---       | :--- 
`00000000` | No signature.
`00000001` | Signature using ECDSA SECP256k1 key and SHA-256 hash.

##### Signing Process

1. Encode the metadata using protobuf.
1. Hash the encoded claim using SHA-256.
1. Sign the hash using the private key associated with the channel.
1. Append all the values (the version, the claim ID of the corresponding channel claim, the signature, and the protobuf-encoded metadata).

##### Signature Validation

1. Split out the version from the rest of the data.
1. Check the version field. If it indicates that there is no signature, then no validation is necessary.
1. Split out the channel ID and signature from the rest of the data.
1. Look up the channel claim to ensure it exists and contains a public key.
1. Use the public key to verify the signature.
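The byte layout used by the signing and validation steps can be sketched as follows. This is a minimal illustration of the table in [Signing](#signing), not the reference implementation. The function names are hypothetical, the channel ID is treated as an opaque 20-byte string (its byte order is not specified here), and the ECDSA signing and verification calls are omitted because they require a third-party library.

```python
import hashlib
from typing import Optional, Tuple

VERSION_UNSIGNED = 0x00          # format `00000000`: no signature
VERSION_ECDSA_SECP256K1 = 0x01   # format `00000001`: ECDSA SECP256k1 + SHA-256
CHANNEL_ID_SIZE = 20
SIGNATURE_SIZE = 64


def payload_digest(payload: bytes) -> bytes:
    """SHA-256 digest of the protobuf-encoded metadata (the value that gets signed)."""
    return hashlib.sha256(payload).digest()


def pack_value(payload: bytes, channel_id: bytes = b"", signature: bytes = b"") -> bytes:
    """Join version, channel ID, signature, and payload into one value."""
    if not channel_id and not signature:
        return bytes([VERSION_UNSIGNED]) + payload
    if len(channel_id) != CHANNEL_ID_SIZE or len(signature) != SIGNATURE_SIZE:
        raise ValueError("channel ID must be 20 bytes and signature 64 bytes")
    return bytes([VERSION_ECDSA_SECP256K1]) + channel_id + signature + payload


def unpack_value(value: bytes) -> Tuple[int, Optional[bytes], Optional[bytes], bytes]:
    """Split a value into (version, channel_id, signature, payload)."""
    if not value:
        raise ValueError("empty value")
    version, rest = value[0], value[1:]
    if version == VERSION_UNSIGNED:
        return version, None, None, rest  # nothing to validate
    if version != VERSION_ECDSA_SECP256K1:
        raise ValueError(f"unsupported format version {version}")
    if len(rest) < CHANNEL_ID_SIZE + SIGNATURE_SIZE:
        raise ValueError("value too short for a signed claim")
    channel_id = rest[:CHANNEL_ID_SIZE]
    signature = rest[CHANNEL_ID_SIZE:CHANNEL_ID_SIZE + SIGNATURE_SIZE]
    payload = rest[CHANNEL_ID_SIZE + SIGNATURE_SIZE:]
    return version, channel_id, signature, payload
```

A validating client would call `unpack_value`, look up the channel claim identified by `channel_id`, and check `signature` against `payload_digest(payload)` using the channel's public key.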

### Validation {#metadata-validation}

No enforcement or validation on metadata happens at the blockchain level. Instead, metadata encoding, decoding, and validation are done by clients. This allows evolution of the metadata without changes to consensus rules.

Clients are responsible for validating metadata, including data structure and signatures. This typically happens when the raw binary data stored in the blockchain is decoded client side via Protocol Buffers. 

## Data

<!-- fixme this section -->

Data refers to the full binary data which is ultimate distributed by blah blah blah.

The purpose of blah blah blah is to blah blah.


### Encoding

Content on the LBRY network is encoded to facilitate distribution.

#### Blobs

The unit of data in the LBRY network is called a _blob_. A blob is an encrypted chunk of data up to 2MiB in size. Each blob is indexed by its _blob hash_, which is a SHA-384 hash of the blob contents. Addressing blobs by their hash protects against naming collisions and ensures that the content you get is what you expect.

Blobs are encrypted using AES-256 in CBC mode and PKCS7 padding. In order to keep each encrypted blob at 2MiB max, a blob can hold at most 2097151 bytes (2MiB minus 1 byte) of plaintext data. The source code for the exact algorithm is available [here](https://github.com/lbryio/lbry.go/blob/master/stream/blob.go). The encryption key and IV for each blob are stored as described below.
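The one-byte difference between the plaintext limit and the blob limit comes from PKCS7, which always appends at least one byte of padding. The sketch below shows the padding arithmetic and the blob hash computation. The function names are hypothetical, and the AES-CBC encryption step itself is omitted because it requires a third-party library; it relies only on the fact that CBC ciphertext has the same length as the padded plaintext.

```python
import hashlib

MAX_BLOB_SIZE = 2 * 1024 * 1024         # 2 MiB limit on the encrypted blob
MAX_PLAINTEXT_SIZE = MAX_BLOB_SIZE - 1  # 2097151 bytes of plaintext
AES_BLOCK_SIZE = 16                     # AES block size in bytes


def pkcs7_pad(chunk: bytes) -> bytes:
    """Pad to a multiple of the AES block size; always adds 1 to 16 bytes."""
    pad_len = AES_BLOCK_SIZE - (len(chunk) % AES_BLOCK_SIZE)
    return chunk + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes) -> bytes:
    """Remove PKCS7 padding, rejecting malformed input."""
    if not padded or len(padded) % AES_BLOCK_SIZE:
        raise ValueError("padded data must be a non-empty multiple of 16 bytes")
    pad_len = padded[-1]
    if not 1 <= pad_len <= AES_BLOCK_SIZE:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


def blob_hash(blob: bytes) -> str:
    """Hex-encoded SHA-384 hash of the (encrypted) blob contents."""
    return hashlib.sha384(blob).hexdigest()
```

A chunk of exactly `MAX_PLAINTEXT_SIZE` bytes pads to exactly `MAX_BLOB_SIZE` bytes, while a chunk one byte larger would pad to 2MiB plus 16 bytes and exceed the limit.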

#### Streams

Multiple blobs are combined into a _stream_. A stream may be a book, a movie, a CAD file, etc. All content on the network is shared as streams. Every stream begins with the _manifest blob_, followed by one or more _content blobs_. The content blobs hold the actual content of the stream. The manifest blob contains information necessary to find the content blobs and decode them into a file. This includes the hashes of the content blobs, their order in the stream, and cryptographic material for decrypting them.

The blob hash of the manifest blob is called the _stream hash_. It uniquely identifies each stream.

#### Manifest Contents

A manifest blob's contents are encoded using a canonical JSON encoding. The JSON encoding must be canonical to support consistent hashing and validation. The encoding is the same as standard JSON, but adds the following rules:

- Object keys must be quoted and lexicographically sorted.
- All strings are hex-encoded. Hex letters must be lowercase.
- Whitespace before, after, or between tokens is not permitted.
- Floating point numbers, leading zeros, and "minus 0" for integers are not permitted.
- Trailing commas after the last item in an array or object are not permitted.
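As a rough illustration of these rules, the sketch below produces a canonical encoding with Python's standard `json` module. The function names are hypothetical. Hex-encoding of string values is left to the caller via `hex_string`, and only the float rule is checked explicitly; the remaining rules follow from how `json.dumps` serializes sorted keys and integers.

```python
import json


def hex_string(text: str) -> str:
    """Hex-encode a string value using lowercase hex letters."""
    return text.encode("utf-8").hex()


def _reject_floats(value) -> None:
    """Walk the structure and refuse floating point numbers."""
    if isinstance(value, float):
        raise ValueError("floating point numbers are not permitted")
    if isinstance(value, dict):
        for item in value.values():
            _reject_floats(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            _reject_floats(item)


def canonical_json(value) -> bytes:
    """Serialize with sorted keys and no whitespace between tokens."""
    _reject_floats(value)
    # sort_keys gives lexicographic key order; the separators remove all
    # optional whitespace; integers are written without leading zeros.
    text = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return text.encode("ascii")
```

For example, `canonical_json({"version": 1, "key": "94d8", "blobs": []})` returns `b'{"blobs":[],"key":"94d8","version":1}'` regardless of the order in which the keys were supplied.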

Here's an example manifest: 

<!-- originally from 053b2f0f0e82e7f022837382733d5f5817dcd67027103fe43f00fa7a6f9fa8742c1022a851616c1ac15d1c60e89db3f4 -->

```
{"blobs":[{"blob_hash":"a6daea71be2bb89fab29a2a10face08143411a5245edcaa5efff48c2e459e7ec01ad20edfde6da43a932aca45b2cec61","iv":"ef6caef207a207ca5b14c0282d25ce21","length":2097152},{"blob_hash":"bf2717e2c445052366d35bcd58edb108cbe947af122d8f76b4856db577aeeaa2def5b57dbb80f7b1531296bd3e0256fc","iv":"a37b291a37337fc1ff90ae655c244c1d","length":2097152},...,{"blob_hash":"322973617221ddfec6e53bff4b74b9c21c968cd32ba5a5094d84210e660c4b2ed0882b114a2392a08b06183f19330aaf","iv":"a00f5f458695bdc9d50d3dbbc7905abc","length":600160}],"filename":"6b706a7977755477704d632e6d7034","key":"94d89c0493c576057ac5f32eb0871180","version":1}
```

Here's the same manifest, with whitespace added for readability:

```
{
  "blobs":[
    {
      "blob_hash":"a6daea71be2bb89fab29a2a10face08143411a5245edcaa5efff48c2e459e7ec01ad20edfde6da43a932aca45b2cec61",
      "iv":"ef6caef207a207ca5b14c0282d25ce21",
      "length":2097152
    },
    {
      "blob_hash":"bf2717e2c445052366d35bcd58edb108cbe947af122d8f76b4856db577aeeaa2def5b57dbb80f7b1531296bd3e0256fc",
      "iv":"a37b291a37337fc1ff90ae655c244c1d",
      "length":2097152
    },
    ...,
    {
      "blob_hash":"322973617221ddfec6e53bff4b74b9c21c968cd32ba5a5094d84210e660c4b2ed0882b114a2392a08b06183f19330aaf",
      "iv":"a00f5f458695bdc9d50d3dbbc7905abc",
      "length":600160
    }
  ],
  "filename":"6b706a7977755477704d632e6d7034",
  "key":"94d89c0493c576057ac5f32eb0871180",
  "version":1
}
```

The `key` field contains the key to decrypt the stream, and is optional. The key may be stored by a third party and made available to a client when presented with proof that the content was purchased. The `version` field is always 1. It is intended to signal structure changes in future versions of this protocol. The `length` field for each blob is the length of the encrypted blob, not the original file chunk.

Every stream must have at least two blobs - the manifest blob and a content blob. Consequently, zero-length streams are not allowed.


#### Stream Encoding

A file must be encoded into a stream before it can be published. Encoding involves breaking the file into chunks, encrypting the chunks into content blobs, and creating the manifest blob. Here are the steps:

##### Setup

1. Generate a random 32-byte key for the stream.

##### Content Blobs

1. Break the file into chunks of at most 2097151 bytes.
1. Generate a random IV for each chunk.
1. Pad each chunk using PKCS7 padding.
1. Encrypt each chunk with AES-CBC using the stream key and the IV for that chunk.
1. An encrypted chunk is a blob.

##### Manifest Blob

1. Fill in the manifest data.
1. Encode the data using the canonical JSON encoding described above.
1. Compute the stream hash.

An implementation of this process is available [here](https://github.com/lbryio/lbry.go/tree/master/stream).
fixme: this link is for v0, not v1. need to implement v1 or drop the link.
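The encoding steps can also be sketched in a few lines of Python. This is an illustration under stated assumptions, not the linked implementation: the `encrypt` callable is a hypothetical stand-in for AES-CBC supplied by a cryptography library, the IV is assumed to be 16 bytes (one AES block), and the manifest blob is assumed to be the canonical JSON bytes themselves. The `chunk_size` parameter exists only so that the sketch can be exercised with small inputs.

```python
import hashlib
import json
import os
from typing import Callable, List, Tuple

MAX_CHUNK_SIZE = 2097151  # 2 MiB minus 1 byte of plaintext per blob

# Hypothetical signature: encrypt(key, iv, padded_chunk) -> ciphertext
# of the same length as padded_chunk (as AES-CBC would produce).
EncryptFn = Callable[[bytes, bytes, bytes], bytes]


def pkcs7_pad(chunk: bytes, block_size: int = 16) -> bytes:
    pad_len = block_size - (len(chunk) % block_size)
    return chunk + bytes([pad_len]) * pad_len


def encode_stream(
    data: bytes,
    filename: str,
    encrypt: EncryptFn,
    chunk_size: int = MAX_CHUNK_SIZE,
) -> Tuple[str, bytes, List[bytes]]:
    """Return (stream_hash, manifest_blob, content_blobs)."""
    if not data:
        raise ValueError("zero-length streams are not allowed")
    key = os.urandom(32)  # Setup: random 32-byte stream key
    content_blobs: List[bytes] = []
    entries = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        iv = os.urandom(16)  # fresh random IV per chunk
        blob = encrypt(key, iv, pkcs7_pad(chunk))
        content_blobs.append(blob)
        entries.append({
            "blob_hash": hashlib.sha384(blob).hexdigest(),
            "iv": iv.hex(),
            "length": len(blob),  # length of the encrypted blob
        })
    manifest = {
        "blobs": entries,
        "filename": filename.encode("utf-8").hex(),
        "key": key.hex(),
        "version": 1,
    }
    # Canonical JSON: sorted keys, no whitespace between tokens.
    manifest_blob = json.dumps(
        manifest, sort_keys=True, separators=(",", ":")
    ).encode("ascii")
    stream_hash = hashlib.sha384(manifest_blob).hexdigest()
    return stream_hash, manifest_blob, content_blobs
```

Passing the cipher in as a parameter keeps the sketch free of third-party imports. A real caller would supply a function that wraps AES-CBC with the given key and IV.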


#### Stream Decoding

Decoding a stream is like encoding in reverse, with the added step of verifying that the expected blob hashes match the actual data.

1. Verify that the manifest blob hash matches the stream hash you expect.
1. Parse the manifest blob contents.
1. Verify the hashes of the content blobs.
1. Decrypt and remove the padding from each content blob using the key and IVs in the manifest.
1. Concatenate the decrypted chunks in order.
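The verification steps (1 through 3) need only a hash function, so they can be sketched with the standard library. The function name and the `content_blobs` mapping are hypothetical. Decryption and unpadding are omitted because they depend on a third-party AES implementation.

```python
import hashlib
import json
from typing import Dict, List


def verify_stream(
    stream_hash: str,
    manifest_blob: bytes,
    content_blobs: Dict[str, bytes],
) -> List[dict]:
    """Check the manifest and content blobs; return blob entries in stream order.

    content_blobs maps each claimed blob hash to the downloaded bytes.
    """
    # Step 1: the manifest blob must hash to the expected stream hash.
    if hashlib.sha384(manifest_blob).hexdigest() != stream_hash:
        raise ValueError("manifest blob does not match the stream hash")
    # Step 2: parse the manifest contents.
    manifest = json.loads(manifest_blob)
    # Step 3: every content blob must match its listed hash and length.
    for entry in manifest["blobs"]:
        blob = content_blobs.get(entry["blob_hash"])
        if blob is None:
            raise KeyError(f"missing blob {entry['blob_hash']}")
        if hashlib.sha384(blob).hexdigest() != entry["blob_hash"]:
            raise ValueError(f"hash mismatch for blob {entry['blob_hash']}")
        if len(blob) != entry["length"]:
            raise ValueError(f"length mismatch for blob {entry['blob_hash']}")
    return manifest["blobs"]
```

Once `verify_stream` returns, the client can decrypt each blob in the returned order using the manifest's key and the entry's `iv`, strip the padding, and concatenate the results.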


### Announce

After a [[stream]] is encoded, it must be _announced_ to the network. Announcing is the process of letting other nodes on the network know that you have content available for download. The LBRY network tracks announced content using a distributed hash table.

#### Distributed Hash Table

_Distributed hash tables_ (or DHTs) have proven to be an effective way to build a decentralized content network. Our DHT implementation follows the [Kademlia](https://pdos.csail.mit.edu/~petar/papers/maymounkov-kademlia-lncs.pdf) specification fairly closely, with some modifications.

A distributed hash table is a key-value store that is spread over multiple nodes in a network. Nodes may join or leave the network at any time, with no central coordination necessary. Nodes communicate with each other using a peer-to-peer protocol to advertise what data they have and what they are best positioned to store.

When a host connects to the DHT, it announces the hash for every [[blob]] it wishes to share. Downloading a blob from the network requires querying the DHT for a list of hosts that announced that blob’s hash (called _peers_), then requesting the blob from the peers directly.

#### Announcing to the DHT

A host announces a hash to the DHT in two steps. First, the host looks for nodes that are closest to the target hash. Then the host asks those nodes to store the fact that the host has the target hash available for download.

Finding the closest nodes is done via iterative `FindNode` DHT requests. The host starts with the closest nodes it knows about and sends a `FindNode(target_hash)` request to each of them. If any of the requests return nodes that are closer to the target hash, the host sends `FindNode` requests to those nodes to try to get even closer. When the `FindNode` requests no longer return nodes that are closer, the search ends.

Once the search is over, the host takes the 8 closest nodes it found and sends a `Store(target_hash)` request to them. The nodes receiving this request store the fact that the host is a peer for the target hash. 
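The two-step announce can be sketched as follows. This is one simple way to express the iterative search, assuming closeness is measured with the XOR metric that Kademlia defines. The `find_node` and `store` callables are hypothetical stand-ins for the corresponding DHT requests, and details such as parallelism, timeouts, and routing table maintenance are left out.

```python
from typing import Callable, Iterable, List

K = 8  # number of closest nodes that receive the Store request

# Hypothetical RPC stand-ins:
#   find_node(node_id, target_hash) -> node IDs known to that node
#   store(node_id, target_hash)     -> asks that node to record us as a peer
FindNodeFn = Callable[[bytes, bytes], Iterable[bytes]]
StoreFn = Callable[[bytes, bytes], None]


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia distance: XOR of the two IDs read as big-endian integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def find_closest_nodes(
    target: bytes,
    known_nodes: Iterable[bytes],
    find_node: FindNodeFn,
    k: int = K,
) -> List[bytes]:
    """Iteratively query nodes until the k closest seen have all been asked."""
    seen = set(known_nodes)
    queried = set()
    while True:
        shortlist = sorted(seen, key=lambda n: xor_distance(n, target))[:k]
        pending = [n for n in shortlist if n not in queried]
        if not pending:
            # No unqueried node is among the closest: the search is over.
            return shortlist
        for node in pending:
            queried.add(node)
            seen.update(find_node(node, target))


def announce(
    target: bytes,
    known_nodes: Iterable[bytes],
    find_node: FindNodeFn,
    store: StoreFn,
    k: int = K,
) -> List[bytes]:
    """Find the closest nodes, then ask each of them to store the hash."""
    closest = find_closest_nodes(target, known_nodes, find_node, k)
    for node in closest:
        store(node, target)
    return closest
```

The search terminates because each round queries at least one node that has not been queried before, and the set of nodes in the network is finite.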


### Download

A client wishing to download a [[stream]] must first query the [[DHT]] to find [[peers]] hosting the [[blobs]] in that stream, then contact those peers to download the blobs directly.

#### Querying the DHT

Querying works almost the same way as [[announcing]]. A client looking for a target hash will start by sending iterative `FindValue(target_hash)` requests to the nodes it knows that are closest to the target hash. If a node receives a `FindValue` request and knows of any peers for the target hash, it will respond with a list of those peers. Otherwise, it will respond with the closest nodes to the target hash that it knows about. The client then queries those closer nodes using the same `FindValue` call. This way, each call either finds the client some peers, or brings it closer to finding those peers. If no peers are found and no closer nodes are being returned, the client will determine that the target hash is not available and will give up.
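The difference from announcing is that each response is either a list of peers or a list of closer nodes. A minimal sketch of that lookup is shown below, again assuming the Kademlia XOR metric. The `find_value` callable and its `("peers", ...)` / `("nodes", ...)` response shape are hypothetical stand-ins for the real request and response messages.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical RPC stand-in: find_value(node_id, target_hash) returns
# ("peers", [...]) if the node knows peers for the hash, otherwise
# ("nodes", [...]) with the closest node IDs it knows about.
FindValueFn = Callable[[bytes, bytes], Tuple[str, list]]


def xor_distance(a: bytes, b: bytes) -> int:
    """Kademlia distance: XOR of the two IDs read as big-endian integers."""
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")


def find_peers(
    target: bytes,
    known_nodes: Iterable[bytes],
    find_value: FindValueFn,
    k: int = 8,
) -> List:
    """Return peers for the target hash, or an empty list if none are found."""
    seen = set(known_nodes)
    queried = set()
    while True:
        shortlist = sorted(seen, key=lambda n: xor_distance(n, target))[:k]
        pending = [n for n in shortlist if n not in queried]
        if not pending:
            # No peers found and no closer nodes returned: give up.
            return []
        for node in pending:
            queried.add(node)
            kind, items = find_value(node, target)
            if kind == "peers":
                return list(items)
            seen.update(items)
```

This sketch stops at the first node that returns peers. A client that wants more sources could instead keep querying and merge the peer lists.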


#### Blob Exchange Protocol

Downloading a blob from a peer is governed by the _Blob Exchange Protocol_. It is used by hosts and clients to check blob availability, exchange blobs, and negotiate data prices. The protocol is an RPC protocol using Protocol Buffers and the gRPC framework. It has five types of requests.

fixme: protocol does not **negotiate** anything right now. It just checks the price. Should we include negotiation in v1?

##### PriceCheck

PriceCheck gets the price that the server is charging for data transfer. It returns the prices in [[deweys]] per KB.

##### DownloadCheck

DownloadCheck checks whether the server has certain blobs available for download. For each hash in the request, the server returns a true or false to indicate whether the blob is available.

##### Download

Download requests the blob for a given hash. The response contains the blob, its hash, and the address to which payment for the data transfer should be sent. If the blob is not available on the server, the response will instead contain an error.

##### UploadCheck

UploadCheck asks the server whether blobs can be uploaded to it. For each hash in the request, the server returns a true or false to indicate whether it would accept a given blob for upload. In addition, if any of the hashes in the request is a stream hash and the server has the manifest blob for that stream but is missing some content blobs, it may include the hashes of those content blobs in the response.

##### Upload

Upload sends a blob to the server. If uploading many blobs, the client should use the UploadCheck request to check which blobs the server actually needs. This avoids needlessly uploading blobs that the server already has. If a client tries to upload too many blobs that the server does not want, this may be considered a denial-of-service attack.


The protocol calls and message types are defined in detail [here](https://github.com/lbryio/lbry.go/blob/master/blobex/blobex.proto).
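To show how the five requests fit together, here is a small in-memory model of a host and a client download routine. It is a sketch for illustration only: the class, method names, and dictionary-shaped responses are hypothetical and do not reflect the actual gRPC messages. The policy of accepting any blob the server does not already have is an assumption, and the stream-aware part of UploadCheck is omitted.

```python
import hashlib
from typing import Dict, List, Optional


class BlobServer:
    """In-memory stand-in for a host; one method per request type."""

    def __init__(self, price_per_kb: int, payment_address: str):
        self.price_per_kb = price_per_kb      # price in deweys per KB
        self.payment_address = payment_address
        self.blobs: Dict[str, bytes] = {}

    def price_check(self) -> int:
        return self.price_per_kb

    def download_check(self, hashes: List[str]) -> Dict[str, bool]:
        return {h: h in self.blobs for h in hashes}

    def download(self, blob_hash: str) -> dict:
        if blob_hash not in self.blobs:
            return {"error": "blob not available"}
        return {
            "blob": self.blobs[blob_hash],
            "hash": blob_hash,
            "address": self.payment_address,
        }

    def upload_check(self, hashes: List[str]) -> Dict[str, bool]:
        # Assumed policy: accept any blob the server does not already have.
        return {h: h not in self.blobs for h in hashes}

    def upload(self, blob: bytes) -> str:
        blob_hash = hashlib.sha384(blob).hexdigest()
        self.blobs[blob_hash] = blob
        return blob_hash


def fetch_blob(server: BlobServer, blob_hash: str, max_price: int) -> Optional[bytes]:
    """Check price and availability, download, then verify the blob hash."""
    if server.price_check() > max_price:
        return None
    if not server.download_check([blob_hash])[blob_hash]:
        return None
    response = server.download(blob_hash)
    if "error" in response:
        return None
    blob = response["blob"]
    if hashlib.sha384(blob).hexdigest() != blob_hash:
        return None  # never trust data that does not match its hash
    return blob
```

The final hash comparison in `fetch_blob` is what lets a client download from untrusted peers: a blob that does not hash to the requested value is simply discarded.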


### Reflectors and Data Markets

In order for a client to download content, there must be hosts online that have the content the client wants, when the client wants it. To incentivize the continued hosting of data, the blob exchange protocol supports data upload and payment for data. _Reflectors_ are hosts that accept data uploads. They rehost (reflect) the uploaded data and charge for downloads. Using a reflector is optional, but most publishers will probably choose to use them. Doing so obviates the need for the publisher's server to be online and connectable, which can be especially useful for mobile clients or those behind a firewall.

The current version of the protocol does not support sophisticated price negotiation between clients and hosts. The host simply chooses the price it will charge. Clients check this price before downloading, and pay the price after the download is complete. Future protocol versions will include more options for price negotiation, as well as stronger proofs of payment.


<pre style="font: 10px/5px monospace;overflow:hidden;text-align: center;margin: 10rem 0">
                                                                                           
                                             ++                                            
                                           :+++++                                          
                                          +++++++++                                        
                                        '++++++++++++`                                     
                                      .++++++',++++++++`                                   
                                     +++++++    .++++++++.                                 
                                   ;++++++:       `++++++++,                               
                                  +++++++            ++++++++,                             
                                +++++++`               ++++++++:                           
                              ,++++++'                   ++++++++;                         
                             +++++++                       ++++++++'                       
                           '++++++,                          ++++++++'                     
                         `+++++++                              '++++++++                   
                        +++++++                                  '++++++++                 
                      :++++++;                                     ;++++++++               
                     +++++++                                         :++++++++             
                   '++++++.                                            ,++++++++           
                 .++++++'                                                ,++++++++`        
                +++++++                                                    .++++++++`      
              ;++++++,                                                       `+++++++      
            `+++++++                                                           `+++++      
           +++++++`                                                              ++++      
         :++++++;                                                               +++++      
        +++++++                                                               ,++++++      
      '++++++.                                                               +++++++       
    .++++++'                                                               '++++++,        
   +++++++                                                               `+++++++          
   +++++:                                                               +++++++            
   ++++                                                               ;++++++:             
   ++++                                                              +++++++               
   ++++      ++                                                    +++++++`                
   ++++      ++++                                                ,++++++'             .:   
   ++++      ++++++                                             +++++++     :'++++++++++   
   ++++      ++++++++                                         '++++++.       ++++++++++    
   ++++       :++++++++                                     .++++++'         .+++++++++    
   ++++         :++++++++                                  +++++++            ++++++++.    
   ++++           ,++++++++                              ;++++++:            :++++++++     
   ++++             ,++++++++                          `+++++++             +++++++++,     
   ++++               ,++++++++                       +++++++`            +++++++++++      
   +++++.               .++++++++`                  :++++++;            ,+++++++ +++;      
   +++++++.               .++++++++`               +++++++             +++++++    ++       
    ++++++++,               `++++++++`           '++++++.            '++++++:     ,'       
      ++++++++,               `++++++++`       .++++++'            .+++++++                
        ++++++++,               `++++++++.    +++++++             +++++++`                 
          ++++++++,                ++++++++.'++++++,            ;++++++;                   
            ++++++++:                +++++++++++++            `+++++++                     
              ++++++++:                +++++++++`            +++++++.                      
                ++++++++:                +++++;            ,++++++'                        
                  ++++++++:                ++             +++++++                          
                    ++++++++;                           '++++++,                           
                      '+++++++;                       .+++++++                             
                        '+++++++;                    +++++++                               
                          '+++++++;                ;++++++:                                
                            '+++++++'            `+++++++                                  
                              '+++++++'         +++++++`                                   
                                '+++++++'     :++++++;                                     
                                  ;+++++++'  +++++++                                       
                                    ;+++++++++++++.                                        
                                      ;+++++++++'                                          
                                        ;++++++                                            
                                          :++,                                             

</pre>


---


_Edit this on GitHub at https://github.com/lbryio/spec_


</div></main> <!-- DONT DELETE THIS, its for the TOC -->