Reimplement ClaimTrie as Patricia Trie #106
Labels
No labels
area: devops
area: discovery
area: docs
area: livestream
area: proposal
consider soon
Epic
good first issue
hacktoberfest
hard fork
help wanted
icebox
Invalid
level: 0
level: 1
level: 2
level: 3
level: 4
needs: exploration
needs: grooming
needs: priority
needs: repro
needs: tech design
on hold
priority: blocker
priority: high
priority: low
priority: medium
resilience
soft fork
Tom's Wishlist
type: bug
type: discussion
type: improvement
type: new feature
type: refactor
type: task
type: testing
unplanned
work in progress
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: LBRYCommunity/lbrycrd#106
Loading…
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Our Claimtrie is inefficient because it must store lots of empty nodes when storing a long name. The solution is to use this: https://github.com/ethereum/wiki/wiki/Patricia-Tree
For the sake of having this documented, we should be mindful that there is a space efficiency optimization and a time efficiency one mixed in this change. If we are doing one or both is a pending discussion.
If we just change the trie without changing how proofs are calculated we could solve the memory issue without a hardfork. While changing the proof calculation to match the new data structure (skipping empty nodes) would require a hardfork.
I was just doing some testing on node counts for this. With a million random strings (standard 60 characters), I end up with 25 million nodes in traditional trie. With the collapsed prefixes, I end up with 1.5 million nodes after a million insertions. That's a substantial difference! Next test: the qp trie. We can talk about where to check in my tests and what to name them next week.
I'm attaching my experiments on this. I originally concluded that the PrefixTrie is the right structure, and that it would drastically reduce our (present) memory usage. I also concluded that we don't need a separate library as it is not a lot of code. Since then a discussion with @shyba pointed out that std::map may be the best container. I did some testing and came to that conclusion as well. I'll leave it here on the shelf and hope that we can get c++11 support in lbrycrd in the very near future. I intend to bring it in without a hard fork (keep the hash calculation functioning the way it is).
lbry_tries.zip
Last week I tested the code here with my same sample data: https://github.com/ytakano/radix_tree. It worked well enough, but it was not as fast as std::map. I didn't see a lot of other libraries that were easily accessible. I did not attempt to use Ethereum's trie code. It appeared complex, and I don't want to mix their code in with ours.