Include non-winning claims in the claimtrie hash #107

Closed
opened 2018-03-19 18:57:15 +01:00 by lyoshenka · 7 comments
lyoshenka commented 2018-03-19 18:57:15 +01:00 (Migrated from github.com)

Problem

Right now, only the winning claim for each name is included in the hash of the claimtrie. As a result, SPV clients cannot validate the resolution of non-winning claims.

Solution

Include all claims in the claimtrie hash.

BrannonKing commented 2018-08-31 18:31:13 +02:00 (Migrated from github.com)

We currently hash in the last take-over height as well as the currently active claim. With this change, do we need to hash in take-over heights or valid heights for each claim, or is the single take-over height sufficient?

lyoshenka commented 2018-09-04 15:44:19 +02:00 (Migrated from github.com)

pinging @kaykurokawa or @lbrynaut

BrannonKing commented 2018-09-07 20:20:42 +02:00 (Migrated from github.com)

Clarified goal: allow a 3rd party to prove the existence of a non-winning claim.

The design for this hard fork:

First, a word on the current computation: the claimID == hash(outPoint.hash, outPoint.n). However, the per-leaf hash used in the Merkle tree == hash(child[0], child[1], ..., hash(outPoint.hash, outPoint.n, last takeover height)), where outPoint in this context refers to the current winning claim at that leaf node.

Proposal A: At some particular block number the method `CClaimTrieCache::recursiveComputeMerkleHash` will change its behavior. The inner hash on the per-leaf computation will change to: hash(last takeover height, claimID[0], claimID[1], etc.).
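
To make the before/after concrete, here is a minimal Python sketch of the two per-leaf computations. The hash function, integer widths, and field ordering are illustrative assumptions, not lbrycrd's actual serialization:

```python
import hashlib

def sha256d(*parts: bytes) -> bytes:
    """Double SHA-256 over the concatenated parts -- a stand-in for the
    exact hash and serialization, which lbrycrd's consensus code defines."""
    data = b"".join(parts)
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

# Current rule: the leaf commits only to the winning claim's outpoint.
def leaf_hash_current(child_hashes: list, winning_txid: bytes,
                      winning_n: int, last_takeover_height: int) -> bytes:
    inner = sha256d(winning_txid,
                    winning_n.to_bytes(4, "little"),
                    last_takeover_height.to_bytes(4, "little"))
    return sha256d(*child_hashes, inner)

# Proposal A: the leaf commits to every claim's ID at the node.
def leaf_hash_proposal_a(child_hashes: list, claim_ids: list,
                         last_takeover_height: int) -> bytes:
    inner = sha256d(last_takeover_height.to_bytes(4, "little"), *claim_ids)
    return sha256d(*child_hashes, inner)
```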

With that change in the hash computation, `getnameproof` no longer returns sufficient contextual data to recompute the tree hash. We have two options to address that shortcoming:

  1. getnameproof would be augmented to take an optional claimID. If that parameter is included, an additional layer of hierarchy would be returned for each leaf node -- the other claimIDs and the index of the missing one (see the sketch after this list).
  2. those performing the proof can call `getclaimsforname` and utilize that data to rebuild the tree. Unfortunately, it would need to be called for each leaf node in the hierarchy. It could potentially add |claim| more RPC calls.
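
For option 1, a verifier who already knows the claim it resolved could rebuild the node's inner hash from that proof data. A minimal sketch, assuming the Proposal A encoding above (`sha256d`, the height width, and the field order are all illustrative):

```python
import hashlib

def sha256d(*parts: bytes) -> bytes:
    # Stand-in for lbrycrd's actual hash and serialization rules.
    data = b"".join(parts)
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def rebuild_inner_hash(proved_claim_id: bytes, other_claim_ids: list,
                       missing_index: int, last_takeover_height: int) -> bytes:
    """Re-insert the claim being proven at the index reported by the
    augmented getnameproof, then hash the full list as in Proposal A."""
    claim_ids = list(other_claim_ids)
    claim_ids.insert(missing_index, proved_claim_id)
    return sha256d(last_takeover_height.to_bytes(4, "little"), *claim_ids)
```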

Either way, those using `getnameproof` would need to be updated. I'm only aware of this usage: https://github.com/lbryio/lbry/blob/ee5be5adc80d4513fbd2fa26881833de3487dd54/lbrynet/wallet/claim_proofs.py#L20 .

Proposal B: same as above except that we would also change to use a binary tree hashing combination for the claimtrie nodes and the claims in the nodes (or just the latter). As I understand it, this is how the transactions are hashed already. See the answers here for an explanation of what I mean: https://bitcoin.stackexchange.com/questions/50674/why-is-the-full-merkle-path-needed-to-verify-a-transaction . I think this would significantly reduce the amount of data that we need to return for `getnameproof`.
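
For a picture of what Proposal B buys, here is a minimal sketch of the Bitcoin-style pairwise combination (the last hash is duplicated on odd levels, as Bitcoin does for transactions; the hash function is again a stand-in):

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(leaves: list) -> bytes:
    """Bitcoin-style binary Merkle root: hash pairs level by level,
    duplicating the last hash when a level has an odd count."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

With this shape, a proof for a single claim needs only the ~log2(n) sibling hashes along its path rather than every other claimID at the node, which is where the data reduction comes from.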

Open questions:

  1. Why do we hash in the last takeover height?
  2. Do we need to include the takeover height or valid height for all claims as part of the hash?
  3. Should I modify the getnameproof output (when the to-be-optional claimID is passed in)?
  4. Do you like the binary tree hashing approach?
  5. Why did we compute the hashes for the claimtrie nodes as an appended list rather than a binary tree?
BrannonKing commented 2018-09-10 20:51:24 +02:00 (Migrated from github.com)

Answers discussed with @kaykurokawa :

  1. We hash the takeover height so that it can be verified.
  2. We should include a valid height as part of our claim hash in the tree. We should also include effective amount, but we don't have that data (yet -- we need a one-time upgrade to get it stored).
  3. Yes, modify getnameproof output.
  4. Yes, use the binary tree hashing approach for both claims and node children.
  5. Eh... MVP.
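
Combining answers 2 and 4, each claim's contribution to the node hash might look like the sketch below. The field set follows answer 2, but the encodings are assumptions, and effective amount is included even though (per answer 2) a one-time upgrade is still needed before it is available:

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def claim_leaf_hash(claim_id: bytes, valid_height: int,
                    effective_amount: int) -> bytes:
    # Hypothetical per-claim hash committing to the fields from answer 2.
    return sha256d(claim_id
                   + valid_height.to_bytes(4, "little")
                   + effective_amount.to_bytes(8, "little"))
```

Per answer 4, these per-claim hashes would then be combined pairwise into the node hash rather than appended into one flat list.
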
BrannonKing commented 2018-09-18 00:22:02 +02:00 (Migrated from github.com)

One of my concerns on this was the ordering issue mentioned in #196 . Essentially, you can't re-sort the claims in a node without first computing the effective amount (EA) for those claims. Unfortunately, the EA isn't persisted; it's set to random bits when the claims are loaded from disk. The method that sorts the claims does recompute the value. I'm just going to run on the assumption that the claims were sorted when they were persisted to disk, and that they don't lose their ordering when the data is retrieved from disk.
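
For context, the sort in question might look like the following minimal sketch, assuming effective amount is the claim's own amount plus its active supports (the types and field names here are illustrative, not lbrycrd's):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Claim:
    claim_id: str
    amount: int                      # credits staked by the claim itself
    support_amounts: List[int] = field(default_factory=list)

    @property
    def effective_amount(self) -> int:
        # Assumption: effective amount = own amount + sum of active supports.
        return self.amount + sum(self.support_amounts)

def sort_claims(claims: List[Claim]) -> List[Claim]:
    """Highest effective amount first; the EA must be recomputed here
    because it is not persisted to disk."""
    return sorted(claims, key=lambda c: c.effective_amount, reverse=True)
```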

BrannonKing commented 2019-02-13 18:43:13 +01:00 (Migrated from github.com)

We've had new requirements on this. It now needs to implement all necessary support for this: https://www.notion.so/lbry/Q-A-on-RPC-results-verification-44888d23efa3475a90eec997f9bf3103 . It also needs to return the `claim_sequence` field in all claim RPC calls.

BrannonKing commented 2019-09-06 22:17:24 +02:00 (Migrated from github.com)

Merged via https://github.com/lbryio/lbrycrd/pull/302 and https://github.com/lbryio/lbrycrd/pull/303