Need to cache fixed amount of CClaimTrieNode's, and store the rest on disk #166

Closed
opened 2018-06-27 21:12:21 +02:00 by kaykurokawa · 15 comments
kaykurokawa commented 2018-06-27 21:12:21 +02:00 (Migrated from github.com)

Currently, lbrycrdd loads the entire claim trie, composed of CClaimTrieNodes, into memory.
The total size of the CClaimTrieNodes can exceed what is available in memory, so we need to create a cache that keeps a fixed number of CClaimTrieNodes in memory as needed and stores the rest on disk.

Say we create a class CClaimTrieCache, perhaps something like this in pseudo-code (just trying to brainstorm here, open to different approaches).

CClaimTrieCache(int num_nodes_in_cache) : num_nodes_in_cache is the maximum number of CClaimTrieNodes that can be loaded into the cache

public methods:

void loadFromDisk() : loads as many CClaimTrieNodes as possible; runs on startup

CClaimTrieNode* root() : always returns a pointer to the root claim trie node

nodeMapType::iterator children(CClaimTrieNode* node) : returns an iterator over CClaimTrieNode.children that allows traversal of the claim trie. If the children of "node" have not been loaded into the cache, then CClaimTrieCache will need to load them from disk into the cache.

Note that currently CClaimTrieNodes are stored on disk using key-value storage where the key is the node name. The "children" of a CClaimTrieNode are not stored on disk; they are linked up when the nodes are loaded into memory (see InsertFromDisk(): https://github.com/lbryio/lbrycrd/blob/master/src/claimtrie.cpp#L1108 ).

This means that for the on-disk storage of a CClaimTrieNode we should also store a vector of its children's letters. This will allow us to look up all of a node's children efficiently in the key-value storage.

I think for the cache to perform efficiently, we need to evict based on the depth of the node, and maybe its distance from the new node being inserted. The logic is that a CClaimTrieNode closer to the root will see far more accesses than one further from the root.

bvbfan commented 2018-06-28 16:32:00 +02:00 (Migrated from github.com)

I took a different approach; you can decide whether it's valuable or not. We can use basically the same code structure as is, but use a memory-mapped file to allocate the space, i.e. all node allocations become placement new into the virtual memory, and shared_ptr uses a custom deallocator.
https://www.boost.org/doc/libs/1_67_0/doc/html/interprocess/managed_memory_segments.html
If you have another idea, share it; my only other idea is again placement new with custom allocations, which would be error prone.

bvbfan commented 2018-07-05 18:03:39 +02:00 (Migrated from github.com)

I implemented it on a memory-managed file, but it hits a bug, not in my code I believe; I reported it here: https://svn.boost.org/trac10/ticket/13626. Do you have a suggestion for how to investigate?

kaykurokawa commented 2018-07-09 02:32:20 +02:00 (Migrated from github.com)

Do you have this implemented in a branch somewhere I could look at?

bvbfan commented 2018-07-09 07:37:10 +02:00 (Migrated from github.com)

It depends on #160. Is that ready to merge?

bvbfan commented 2018-07-09 09:42:34 +02:00 (Migrated from github.com)

Here it is on pastebin: https://pastebin.com/xNuF4YZ4

bvbfan commented 2018-07-15 07:31:08 +02:00 (Migrated from github.com)

One possible solution: a CMemoryFile manages a fixed-size file, and a CMemoryFileManager manages a bunch of memory files. For every new allocation it will look through all created files for enough space; if none has it, it will construct a new one. Growing is then just creating a new file of the desired size, and shrinking searches for empty files to delete.

bvbfan commented 2018-07-16 16:09:08 +02:00 (Migrated from github.com)

@kaykurokawa, I've implemented it as I described above; tests pass, and it looks good to me. I can upload it to pastebin, or I can wait for #160 to be merged.

bvbfan commented 2018-07-18 16:32:46 +02:00 (Migrated from github.com)

Here it is on pastebin for now -> https://pastebin.com/KBXdEtyw

BrannonKing commented 2018-07-18 16:40:03 +02:00 (Migrated from github.com)

I want to get #160 merged today. Can you look into using `make_shared` instead of the `new` keyword? Return shared pointers from your methods instead of raw pointers.

bvbfan commented 2018-07-18 16:41:02 +02:00 (Migrated from github.com)

No, it cannot be used; `make_shared` does not work with placement new.

BrannonKing commented 2018-08-21 17:19:27 +02:00 (Migrated from github.com)

@bvbfan , Roy's approach to this problem (in btcd) was to store all of the claimtrie nodes in LevelDB. He used the node hash code as the key. This has the advantage of indexed lookups for replay. He also kept the active claims in a dictionary so that no DB lookup is required for the common usage. Do you see any advantages/disadvantages over the mmap approach? I'm assuming that you rebuild the mmap on startup, whereas the LevelDB could resume construction at the last saved height.

bvbfan commented 2018-08-21 18:09:06 +02:00 (Migrated from github.com)

I'm not sure how well LevelDB holds up when the size increases significantly, e.g. when we have more than 8GB of claims. Mapped files (the last implementation has a couple of files with a compile-time-configured size) benefit from OS-level access to the virtual address space, which will be as fast as it can be, in my opinion. Also note @kaykurokawa's comment:

> Note that currently CClaimTrieNode's are stored on disk using key value storage where the key is the node name. The "children" of CClaimTrieNode's are not stored on disk, they are loaded when they are put into memory ( see InsertFromDisk() https://github.com/lbryio/lbrycrd/blob/master/src/claimtrie.cpp#L1108 )

bvbfan commented 2018-08-21 18:09:39 +02:00 (Migrated from github.com)

I made a mistake by closing, sorry.

BrannonKing commented 2019-07-26 00:32:37 +02:00 (Migrated from github.com)

Now that we have the cache trie separated from the base trie, the next step is to make the base trie actually just be the LevelDB data that we already write to disk to persist that trie. We don't need to load the base trie into RAM. Instead, we can just pull the nodes directly from there as necessary. Things that need to happen:

  1. base->find(...) becomes a DB lookup on node name.
  2. base->nodes(...) would be multiple DB lookups.
  3. ensure that we compute nEffectiveAmount where it's needed, or maybe do a one-time upgrade where we persist it to the DB.
  4. we would need to write a custom iterator for it.
BrannonKing commented 2019-09-06 22:19:52 +02:00 (Migrated from github.com)
Merged via https://github.com/lbryio/lbrycrd/pull/308