The main normalization branch that needs review pronto #235
No reviewers
Labels
No labels
area: devops
area: discovery
area: docs
area: livestream
area: proposal
consider soon
Epic
good first issue
hacktoberfest
hard fork
help wanted
icebox
Invalid
level: 0
level: 1
level: 2
level: 3
level: 4
needs: exploration
needs: grooming
needs: priority
needs: repro
needs: tech design
on hold
priority: blocker
priority: high
priority: low
priority: medium
resilience
soft fork
Tom's Wishlist
type: bug
type: discussion
type: improvement
type: new feature
type: refactor
type: task
type: testing
unplanned
work in progress
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: LBRYCommunity/lbrycrd#235
Loading…
Reference in a new issue
No description provided.
Delete branch "normalization_1107_2"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Some small commits on top of normalization PR #159.. The thread is getting quite lengthy so I thought I'd open a new PR... @BrannonKing can quickly review this here and move these commits in there when he's ready.
a0741b8
Added tests that checks the normalization only , without making claims. Contains tests for invalid utf8, some tests for NFD normalization in certain languages, and casings on non english languages.
c26f2c9
Just moved and clarified a comment on the normalizeClaimName function
2f61fb2
Added an RPC call checknormalization so user can check what the normalization function does from RPC/CLI
This looks fine to me. I'll include it on the other branch shortly.
@ -1420,2 +1282,4 @@
namesToCheckForTakeover.insert(name);
CClaimTrieNode* rootNode = getRoot();
return recursivePruneName(rootNode, 0, name);
I was trying to establish a boundary between the trie and the queues where we always normalize going into the trie. The same is true for the support map.
@bvbfan , this situation is for you to look at.
I think that we should commit to master all work before this changes, without non normalized names. Then we can figure out easily special case that we want to handle. I was thinking for simple solution, we can add only non normalized names to claim index and undo blocks, but i should check it.
@ -1692,3 +1556,3 @@
expirationQueueType::iterator CClaimTrieCache::getExpirationQueueCacheRow(int nHeight, bool createIfNotExists) const
expirationQueueType::iterator CClaimTrieCacheBase::getExpirationQueueCacheRow(int nHeight, bool createIfNotExists) const
{
Cache normalizeClaimName(name) before loop, we don't need to do it more that once.
Cache normalizeClaimName(name) before loop
@ -2017,7 +1876,7 @@ void CClaimTrieCache::removeSupportFromExpirationQueue(const std::string& name,
}
}
Cache normalizeClaimName(name) before loop
@ -2094,0 +1959,4 @@
// it will cause a double remove when rolling back -- something we don't handle correctly
// (same problem exists for the insertSupportUndo below)
//for (insertUndoType::reverse_iterator itInsertUndo = insertUndo.rbegin(); itInsertUndo != insertUndo.rend(); ++itInsertUndo)
// if (itInsertUndo->name == itEntry->first && itInsertUndo->outPoint == itEntry->second.outPoint)
Cache normalizeClaimName(name) before loop.
You should return or make function void
We have 2 major issues in last attempt to keep track of original name, they aren't handled by test case, cause reordering nodes after normalization:
I've thinking about to minimize normalization effort by lower-casing on demand, note
We can avoid lower-casing when we does not have real change.
@ -1692,3 +1556,3 @@
expirationQueueType::iterator CClaimTrieCache::getExpirationQueueCacheRow(int nHeight, bool createIfNotExists) const
expirationQueueType::iterator CClaimTrieCacheBase::getExpirationQueueCacheRow(int nHeight, bool createIfNotExists) const
{
I was expecting the outPoint to be reasonably unique, meaning that we would only call the normalization once or less on average.
Again here I was thinking that outPoint was unique.
I did write a new test case to ensure that the nodes are reordered going forwards or backwards. (But I didn't do anything to the test to ensure that they were out of order at any point during the test.)
I'm not sure what you're stating here exactly. We still return the best claim, but we now compare across a larger set of claims. It is true that previous best may come into competition with claims that it was not competing against before the fork.
Only if the new competition causes it to change. Otherwise it should hold on to its previous value.
This code doesn't make any sense to me. The goal was to make more names compete with each other. It's not just for normalization alone. We want to avoid having winners off by one letter casing, etc.
I mean exactly that.
It's true if you check that there isn't claims after normalization will have same names and takeover height will override another one.
Yeah, in that case lower case makes sense,
fixed
I just pushed in some minor changes and fixes. You will now see normalizedName in the RPC calls. (I haven't done a major RPC field name refactor yet.) I yanked all the code to return pending supports. I couldn't get the normalized name correct there, and the whole idea seemed wrong after further reflection. A support is not a support until the block increments. Same with claims. Please review it. Thanks.
I think I understand how this is happening now. I have some ideas on how to address it.
Is this still happen? When I correct unit test it does not throw.
It fails for me in latest with this test: claimtriebranching_tests/claimtriecache_normalization
Did you try that patch:
@lbrynaut & @kaykurokawa , this is the line where I wanted you to verify that getDelayForName would return the same as in addClaim (aka, the same going forward as going backward).
This needs to be 'normalized_name'. After looking through the lbry codebase today, I saw how much they really want PEP8 names.
I'm thinking to change this to always return the normalized name, not just after the fork. That we people can adjust to it sooner.
@ -738,1 +745,4 @@
AS_IF([test "x$ICU_PREFIX" != xauto], [
ICU_CPPFLAGS="-I$ICU_PREFIX/include"
ICU_LIBS="-L$ICU_PREFIX/lib -licui18n -licuuc -licudata -ldl"
ICU detection is still broken. What this is doing is preferring the system ICU over the built dependency ICU version, which is not what we want. Our build must require --with-icu and prefer that path. I'll update with something later and you can see if it works for you.
@ -738,1 +745,4 @@
AS_IF([test "x$ICU_PREFIX" != xauto], [
ICU_CPPFLAGS="-I$ICU_PREFIX/include"
ICU_LIBS="-L$ICU_PREFIX/lib -licui18n -licuuc -licudata -ldl"
I don't want it to require --with-icu. However, if --with-icu is provided it had darn well better use that one.