Memory improvements, claimtrie revamp #276
Labels
No labels
area: devops
area: discovery
area: docs
area: livestream
area: proposal
consider soon
Epic
good first issue
hacktoberfest
hard fork
help wanted
icebox
Invalid
level: 0
level: 1
level: 2
level: 3
level: 4
needs: exploration
needs: grooming
needs: priority
needs: repro
needs: tech design
on hold
priority: blocker
priority: high
priority: low
priority: medium
resilience
soft fork
Tom's Wishlist
type: bug
type: discussion
type: improvement
type: new feature
type: refactor
type: task
type: testing
unplanned
work in progress
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: LBRYCommunity/lbrycrd#276
Loading…
Reference in a new issue
No description provided.
Delete branch "memory_improvements"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
This PR addresses the following issues:
Fixes #106 (Reimplement claimtrie without so many empty nodes, save RAM)
Fixes #108 (Separate data structure and data operations)
Fixes #133 (Trie serialization failure)
Fixes #138 (Rename nCurrentHeight)
Fixes #145 (Rename CClaimTrieCache)
Fixes #158 (Properly delete CClaimTrieNode)
It also includes these enhancements:
Although the commits have been squashed, much of this work was done by @bvbfan .
We experimented with keeping empty nodes truly empty -- allocating claim space for them as needed. This saved less than 10MB of RAM and complicated the code a bit; we ditched it. We also experimented with a similar approach where empty node hashes were kept in a separate map. This also was too complicated to get right.
@lbrynaut Is this meant to be here?
I'm glad you brought this up. I'm the one that changed the default and checked it in. I found it frustrating during our last hard fork that none of the log files had IP addresses in them. I suppose this is there because some people may be uncomfortable exposing their internal IP addresses. I think that concern is generally a thing of the past, though, and they aren't forced to send me their logs. I'm certainly open to arguments the other way. I think we should enable IP logging on all instances that we manage.
@ -0,0 +305,4 @@
}
template <typename TKey, typename TData>
template <typename TDataUni>
TDataUni allows us to have both lvalue and rvalue data inputs on this method. The alternative is to have two overloads of this method.
@ -0,0 +333,4 @@
auto name = it.key();
name.insert(name.end(), key.begin(), key.end());
auto& node = insert(key, shared);
copy = iterator{name, node};
Not a true forward iterator -- that would require the full stack. This is a children-only iterator. It's true of all the places we create the iterator here. We are specifically choosing to not initialize them fully, and we're using the weak_ptr workaround for the place where we need the parent function but don't have it.
@ -0,0 +477,4 @@
using iterator = Trie::iterator;
using const_iterator = Trie::const_iterator;
template class CPrefixTrie<Key, Data>;
This and the lines below it are a c++ nuance. You have to instantiate a specific type of a template or you need the full definition in the header file. As a native C# programmer, I prefer the latter. @bvbfan prefers the former (as it is here).
without
dataadding true everywhere doesn't help me understand what is being passed in here. Shouldn't we just assign this to a variable like the others?
@ -36,3 +36,3 @@
//! Max memory allocated to block tree DB specific cache, if no -txindex (MiB)
static const int64_t nMaxBlockDBCache = 2;
static const int64_t nMaxBlockDBCache = 8;
//! Max memory allocated to block tree DB specific cache, if -txindex (MiB)
Going to 8GB max is an important note for the release that contains this. Should be in the change log. Also I am interested in the mindbcache purpose.
should this be higher for lbrycrd? blocks are just files on a disk at risk of malware manipulation. Bitcoin is 6 blocks but also a lot harder based on computational power requirements at 6. Maybe we should have something much more than 24 as a default? Just a thought since our hash power is minimal in comparison.
Agreed. If we find it useful we should enable it, but maybe disable it by default for privacy.
this is probably a very low priority attack vector, but 6 blocks with lbrycrd is not a lot to overwrite with malware. Most exchanges have min confirmations at 200 for LBC. Validating the top 200 blocks is probably a good default for right now. Again the likelihood that enough people catch the LBC malware that does this is probably near zero...just a thought though after researching the default value.
claimtrie.h::insertClaimIntoTrie utilizes a default parameter value even though it's a virtual method. This is non-standard c++, and whether or not inheritors use that default is undefined. I had intended to remove it, but I apparently only made it part way down that road.
@ -0,0 +477,4 @@
using iterator = Trie::iterator;
using const_iterator = Trie::const_iterator;
template class CPrefixTrie<Key, Data>;
In some project is used *.impl file ext (not *.cpp) and gets include where it's needed, i'm ok with that too. But all in header makes things to look redundant and hard to read.
The comment is old, it should be removed at all, we have data on all nodes so it's safe to be used range-based for loop.
@ -0,0 +333,4 @@
auto name = it.key();
name.insert(name.end(), key.begin(), key.end());
auto& node = insert(key, shared);
copy = iterator{name, node};
That's by design, iterator is forward through children, i mean this is not an exception. Thus we don't have ability to obtain parent, that's nodes stand for. We can go for full iterator but i'm afraid we will fill iterator stack for nothing i.e. we will not use it.
Possibly unrelated to this PR, but that's a good point. Bitcoin core did backtrack this from 288 to 6, but it's likely not appropriate for us. Reverting to 288 is probably more sane.
I'd leave it disabled as well.
I still feel like this method fits better on the base class.
This method also fits better on the base class.
24 is an hour -- same as bitcoin. Why would we need more than that?
I don't think time is the intention here. The intention is to confirm enough blocks such that it is statistically painful to overwrite like with a big reorg for a double spend attack. 6 blocks of bitcoin is a standard for confirmation not meant to be a time equivalence. Admittedly, it is nice that it evenly breaks down to an hour, however arbitrary, but should not be the correlation between cryptocurrencies. However, I could be misinterpreting it.
@ -78,0 +125,4 @@
{
sValue = HexStr(vvchParams[2].begin(), vvchParams[2].end());
return true;
}
not that I am well informed on whats going on here, but @BrannonKing wouldn't it be better to handle this sort of thing in the processing of the request rather than modifying so many lines of code? Seems like a DRY violation.