CClaimTrieCache is caching Nodes inefficiently during updates #90

Closed
opened 2018-02-02 21:22:54 +01:00 by kaykurokawa · 1 comment
kaykurokawa commented 2018-02-02 21:22:54 +01:00 (Migrated from github.com)

lbrycrd uses space inefficient map and set structure to cache Nodes (cacheing nodes is done when updating the claim trie structure to keep track of which node should or should not be updated)

CClaimTrieCache.cacheHashes -> (hashMapType, std::map<std::string, uint256> , stores hashes for each nodes
CClaimTrieCache.cache -> (nodeCacheType, std::map<std::string, CClaimTrieNode*>
CClaimTrieCache.block_originals -> (nodeCacheType)
CClaimTrieCache.dirtyHashes -> (std::setstd::string)
CClaimTrie.dirtyNodes -> (nodeCacheType)

This is problematic for caching large name claims (max length 255) since each subnode gets put into the map/set (i.e, claim "test" must cache "t", "te" ,"tes" , "test") thus 32 kilobytes alone (255*256 /2) is used to store the string keys for a max length name claim in the cache. We could improve the memory usage by utilizing a trie structure.

We actually do not have a base template trie class although ClaimTrie itself is a trie structure. It would make sense to create this base template tire class so that a) ClaimTrie can inherit from it and b) we can use it instead of map and set structure used to cache nodes.

Having a base trie class will also have the benefit of improving code readability and testability.

lbrycrd uses space inefficient map and set structure to cache Nodes (cacheing nodes is done when updating the claim trie structure to keep track of which node should or should not be updated) CClaimTrieCache.cacheHashes -> (hashMapType, std::map<std::string, uint256> , stores hashes for each nodes CClaimTrieCache.cache -> (nodeCacheType, std::map<std::string, CClaimTrieNode*> CClaimTrieCache.block_originals -> (nodeCacheType) CClaimTrieCache.dirtyHashes -> (std::set<std::string>) CClaimTrie.dirtyNodes -> (nodeCacheType) This is problematic for caching large name claims (max length 255) since each subnode gets put into the map/set (i.e, claim "test" must cache "t", "te" ,"tes" , "test") thus 32 kilobytes alone (255*256 /2) is used to store the string keys for a max length name claim in the cache. We could improve the memory usage by utilizing a trie structure. We actually do not have a base template trie class although ClaimTrie itself is a trie structure. It would make sense to create this base template tire class so that a) ClaimTrie can inherit from it and b) we can use it instead of map and set structure used to cache nodes. Having a base trie class will also have the benefit of improving code readability and testability.
kaykurokawa commented 2018-07-19 18:52:36 +02:00 (Migrated from github.com)
closing as this is a duplicate of https://github.com/lbryio/lbrycrd/issues/108 and https://github.com/lbryio/lbrycrd/issues/106
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: LBRYCommunity/lbrycrd#90
No description provided.