Separate disk related functions in CClaimTrieDb #140
No reviewers
Labels
No labels
area: devops
area: discovery
area: docs
area: livestream
area: proposal
consider soon
Epic
good first issue
hacktoberfest
hard fork
help wanted
icebox
Invalid
level: 0
level: 1
level: 2
level: 3
level: 4
needs: exploration
needs: grooming
needs: priority
needs: repro
needs: tech design
on hold
priority: blocker
priority: high
priority: low
priority: medium
resilience
soft fork
Tom's Wishlist
type: bug
type: discussion
type: improvement
type: new feature
type: refactor
type: task
type: testing
unplanned
work in progress
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: LBRYCommunity/lbrycrd#140
Loading…
Reference in a new issue
No description provided.
Delete branch "master"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Signed-off-by: Anthony Fieroni bvbfan@abv.bg
@kaykurokawa, OSX build fails, but it's not looks like pull request related.
Yes OSX build on travis is currently failing for all : https://github.com/lbryio/lbrycrd/issues/120
@ -1,18 +1,25 @@
#include "claimtrie.h"
I think these can go in claimtriedb.h
@ -771,2 +936,2 @@
if (hasChild && nNextHeight < Params().GetConsensus().nMaxTakeoverWorkaroundHeight) {
removalWorkaround.insert(name);
success = currentNode->removeClaim(outPoint, claim);
base->removeFromClaimIndex(claim);
Ideally this block of code would reside in CClaimTrieDb , the motivation would be that it would be much clearer exactly what is being written/read in the database if we have it in a well defined place. As you said, a more aggressive refactor could have CClaimTrieDb manage the dirty* variables and would make this possible.
Hi, as you mentioned in https://github.com/lbryio/lbrycrd/issues/136 , a bit more aggressive refactor would be acceptable. To be specific , the various dirty* memeber variables that is currently managed by CClaimTrie could be managed by CClaimTrieDb. This would make it much clearer exactly what kind of things are being written/read to disk and how, and should also improve some DRY problems involved in the manipulation of dirty* member variables.
Please note though I think there is a slight misunderstanding you may have about CClaimTrieCache , this class is NOT a cache of the CClaimTrie. CClaimTrieCache actually contains a cache of changes that must be applied to CClaimTrie upon block increment (removing claims, adding claims, etc..). Thus CClaimTrieCache should be minimally affected by this refactor.
You're right about of CClaimTrieCache, i wrote the comment before i see the implementation, name lies to me, did you think it's proper name?
@ -771,2 +936,2 @@
if (hasChild && nNextHeight < Params().GetConsensus().nMaxTakeoverWorkaroundHeight) {
removalWorkaround.insert(name);
success = currentNode->removeClaim(outPoint, claim);
base->removeFromClaimIndex(claim);
I'll investigate to move dirty queues as well as read/write in CClaimTrieDb
@ -1,18 +1,25 @@
#include "claimtrie.h"
I've give a try to implement it on elegant way. Generic in claimtriedb as
I mean it can be used a hash to store generic queues (hash will be a key in db not char) so all definitions will gone away, but we should provide db migration, right?
I force push new version, but it looks like not updated here :)
I can't figure out why when i write to leveldb CClaimTrieNode is not empty, but when i reads it is. Did you have any advice? Test fails about that
Ok, Kay, can you review it? The architecture aims to provide queues by needs, for now they are maps, through CClaimTrieDb. I plan to refactor CClaimTrieCache too to benefit db cache and to be renamed, as we discuss. One test still fail, somehow, i can't see this is my fault. Hope we can merge it in refactor branch after you explain your remarks.
Regards
Did you have a time to take a look on this? After all i have one pending change, last implementation uses vector, under the hood in CClaimTrieDb, to not reorder cached queues.
Will start looking at this today
Kay i will change hash calculation since i check older version of boost, https://wandbox.org/ helped a lot, Boost.TypeIndex started at version 1.56, but the problem is around 1.64 it's change hash calculation, which is annoying. I will implement it in different way.
@ -308,0 +295,4 @@
typedef std::pair<std::string, claimQueueValueType> claimQueueEntryType;
typedef std::pair<std::string, supportQueueValueType> supportQueueEntryType;
This block of code (253 - 293) is a bit confusing, it's basically just to enable swap() right?
Maybe adding some comments about what its trying to achieve would be helpful.
@ -0,0 +61,4 @@
CClaimTrieDb(bool fMemory = false, bool fWipe = false);
~CClaimTrieDb();
Spelling : "durty" ->"dirty"
@ -308,0 +295,4 @@
typedef std::pair<std::string, claimQueueValueType> claimQueueEntryType;
typedef std::pair<std::string, supportQueueValueType> supportQueueEntryType;
From 262 to 284 - yes, it for std::swap to work. Helpers and typedefs are to make a different type when insert to db, so let's see an example
Helper is used only to make these 2 types different e.g. to stored in different keys in db. I'll commented it.
@ -0,0 +51,4 @@
virtual ~CCBase() {}
virtual void write(const size_t key, CClaimTrieDb *db) = 0;
};
Would like to see better description of CClaimTrieDb class, maybe something like this:
"This class implements key value storage for use by the CClaimTrie class. It allows for the storage of values of datatype V that can be retrieved using key datatype K. Changes to the key value storage is buffered until they are written to disk using writeQueues()."
@ -0,0 +89,4 @@
*/
template <typename K, typename V>
void updateQueueRow(const K &key, V &row);
Clarity and grammar for keyTypeEmpty() docstring:
"Check that there are no data stored under key datatype K and value datatype V. Checks both the buffer and disk."
@ -0,0 +61,4 @@
CClaimTrieDb(bool fMemory = false, bool fWipe = false);
~CClaimTrieDb();
Actually, we should probably stop using this "dirty" term as it is not very descriptive. Maybe:
"Write to disk the buffered changes to the key value storage"
@ -0,0 +70,4 @@
/**
* Gets a map representation of K type / V type stored by their hash
* @param[out] map key / value pairs readed from queues and disk
*/
"key / value pairs readed from queues and disk" - > "key / value pairs read from disk with changes from the buffer applied"
@ -0,0 +79,4 @@
* @param[in] key key to looking for in dirty queues and disk
* @param[out] row value which is found
*/
template <typename K, typename V>
"key to looking for in dirty queues and disk" -> "key to look for in buffer and disk"
@ -0,0 +86,4 @@
* Update value of type V by key of type K through their hash
* @param[in] key key to looking for in dirty queues and disk
* @param[in/out] row update value and gets its last value
*/
" key to looking for in dirty queues and disk" - > "key to look for in buffer and disk"
@ -0,0 +99,4 @@
/**
* Get a map representation of K type / V type stored by theirs hash
* @param[out] map key / value pairs readed only from disk
*/
"key / value pairs readed only from disk" -> "key/value pairs, read only from disk"
@ -0,0 +118,4 @@
/**
* Represents dirty queues before stored to disk
*/
"Represents buffer of changes"
@ -308,0 +295,4 @@
typedef std::pair<std::string, claimQueueValueType> claimQueueEntryType;
typedef std::pair<std::string, supportQueueValueType> supportQueueEntryType;
line 253: /// Helpers to separate queue types from each other
line 271: /// Make std::swap to work with custom types
line 288: /// Each type will be stored in database separately
@ -308,0 +295,4 @@
typedef std::pair<std::string, claimQueueValueType> claimQueueEntryType;
typedef std::pair<std::string, supportQueueValueType> supportQueueEntryType;
looks good
Updated doc strings in
53cf80d99c
,See branch: seperate_disk_claim
Regarding terminology, I tried to change things so that "queue" refers to a series of data that is used by CClaimTrie (a "queue" can be on disk or in memory). "buffer" refers to a set of changes to "queue" that is stored in memory. I removed mentions to "dirty" or "dirty queue" as I think its not descriptive enough.
It looks good to me.
@ -1439,0 +1899,4 @@
bool CClaimTrieCache::forkForExpirationChange(bool increment)
{
/*
If increment is True, we have forked to extend the expiration time, thus items in the expiration queue
Upon fresh start (with empty data directory), there won't be any nodes to load, so this will cause lbrycrdd to exit. See
https://github.com/lbryio/lbrycrd/blob/master/src/init.cpp#L1286
On previous ReadFromDisk() function, it will only fail if it somehow pcursor->GetValue(*node) returns False.
I found a problem, try syncing lbrycrdd from scratch (clear out data directory before starting lbrycrdd).
See comment above, should be a simple fix.
So maybe we should update test as well
claimtriebranching_tests.cpp:319
Yes, these changes look good, also add a docstring to seekByKey()
"Returns false if database read fails."
Can you make these changes on your branch,
Also, add the docstring commit
53cf80d99c
And than squash these commits (maybe one commit is ok?) to make it ready for merge to master.
Also notice that i have pending changes to that branch
https://github.com/lbryio/lbrycrd/issues/145
Closing as further work is continued in #160
Pull request closed