Reimplement ClaimTrie as Patricia Trie #106

New issue

Closed

opened 2018-03-14 14:42:23 +01:00 by lyoshenka · 4 comments

lyoshenka commented

2018-03-14 14:42:23 +01:00

(Migrated from github.com)

Our Claimtrie is inefficient because it must store lots of empty nodes when storing a long name. The solution is to use this: https://github.com/ethereum/wiki/wiki/Patricia-Tree

👍 1

shyba commented

2018-05-02 20:28:54 +02:00

(Migrated from github.com)

For the sake of having this documented, we should be mindful that there is a space efficiency optimization and a time efficiency one mixed in this change. If we are doing one or both is a pending discussion.
If we just change the trie without changing how proofs are calculated we could solve the memory issue without a hardfork. While changing the proof calculation to match the new data structure (skipping empty nodes) would require a hardfork.

For the sake of having this documented, we should be mindful that there is a space efficiency optimization and a time efficiency one mixed in this change. If we are doing one or both is a pending discussion. If we just change the trie without changing how proofs are calculated we could solve the memory issue without a hardfork. While changing the proof calculation to match the new data structure (skipping empty nodes) would require a hardfork.

BrannonKing commented

2018-09-08 01:28:36 +02:00

(Migrated from github.com)

I was just doing some testing on node counts for this. With a million random strings (standard 60 characters), I end up with 25 million nodes in traditional trie. With the collapsed prefixes, I end up with 1.5 million nodes after a million insertions. That's a substantial difference! Next test: the qp trie. We can talk about where to check in my tests and what to name them next week.

BrannonKing commented

2018-09-18 23:41:43 +02:00

(Migrated from github.com)

I'm attaching my experiments on this. I originally concluded that the PrefixTrie is the right structure, and that it would drastically reduce our (present) memory usage. I also concluded that we don't need a separate library as it is not a lot of code. Since then a discussion with @shyba pointed out that std::map may be the best container. I did some testing and came to that conclusion as well. I'll leave it here on the shelf and hope that we can get c++11 support in lbrycrd in the very near future. I intend to bring it in without a hard fork (keep the hash calculation functioning the way it is).

lbry_tries.zip

BrannonKing commented

2018-09-19 00:53:54 +02:00

(Migrated from github.com)

Last week I tested the code here with my same sample data: https://github.com/ytakano/radix_tree. It worked well enough, but it was not as fast as std::map. I didn't see a lot of other libraries that were easily accessible. I did not attempt to use Ethereum's trie code. It appeared complex, and I don't want to mix their code in with ours.

No milestone

No project

No assignees

1 participant

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference

LBRYCommunity/lbrycrd#106

No description provided.

Rows
Columns