Add encoding for binary data returned from RPC, make upstream handle it and UTF8 #236
Labels
No labels
area: devops
area: discovery
area: docs
area: livestream
area: proposal
consider soon
Epic
good first issue
hacktoberfest
hard fork
help wanted
icebox
Invalid
level: 0
level: 1
level: 2
level: 3
level: 4
needs: exploration
needs: grooming
needs: priority
needs: repro
needs: tech design
on hold
priority: blocker
priority: high
priority: low
priority: medium
resilience
soft fork
Tom's Wishlist
type: bug
type: discussion
type: improvement
type: new feature
type: refactor
type: task
type: testing
unplanned
work in progress
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: LBRYCommunity/lbrycrd#236
Loading…
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Three RPC/CLI issues related to the Normalization hard fork PR #159
I've had some problems being able to input utf8 strings into the CLI interface .. I'm not sure what the problem is but it does not seem to be parsing the string properly, I made sure my command line supported utf8. I don't think there is any changes we need to make to lbrycrd but I could be wrong..
For RPC, input works fine (as long as your JSON RPC supports it, the popular one does https://github.com/jgarzik/python-bitcoinrpc ). However, I realized that the RPC output actually encodes unicode in this weird way: https://github.com/lbryio/lbrycrd/blob/master/src/univalue/lib/univalue_write.cpp#L15 , so this is actually a bit annoying here as we need to decode this properly, or perhaps its ok to remove this encoding? I presume its there because some command lines may support only ASCII...
I think an RPC call that lets the user know what normalization does would be good:
checknormalization(name) -> returns normalized name
This function was added here: https://github.com/lbryio/lbrycrd/pull/235, but will probably need some additional return arguments based on what we decide for https://github.com/lbryio/lbrycrd/issues/234 (i.e, are we returning a normalized string with the casing preserved?)
Further I think existing RPC calls need to be modified depending on what we decide here https://github.com/lbryio/lbrycrd/issues/234 , i.e. are we normalizing the input "name" for calls like getvalueforname() and getclaimsforname() or are we expecting that the input is already normalized?
We have relation one to many, normalization name to claims and supports so we expect non-normalized string, then we find node by normalized it, results should be filter by non-normalized string again to show exactly desired claims. So we should make changes to these RPC methods as well, good point. But, we don't need such a method like checknormalization, let's make an example:
let's we have 3 claims Dog, DOg, DoG
checknormalization DOg => dog
getclaimsforname dog => will return nothing, why? because we have 3 claims but none of them is exactly lowercase. The user desire is DOg so calling checknormalization will lie him, instead:
getclaimsforname DOg => will return only 1 claim, exactly named DOg
Normalization should be transparent and makes sense in trie not outside so if you have before normalization one claim DOg you will have same result afterwards.
Responses to @kaykurokawa :
Responses to @bvbfan :
@BrannonKing, you are right for current normalization implementation, but i got the impression we will change it. Note test before normalization
It returns only one claim - the exactly case sensitive name, after normalization claims size will be 2. Since we can keep non-normalization name we can keep same behavior.
Let's change this issue to just refer to the RPC output encoding. From our discussion yesterday, it sounded like people were in favor of holding off on changing the output encoding until the upstream merge is done. It doesn't have to be done as part of the normalization branch (and there are already many changes there). Part of that requires us to change how the value field is encoded. It will need to come out as hex or base64. Polls on that are ongoing.
I tried changing https://github.com/lbryio/lbrycrd/blob/master/src/univalue/lib/univalue_write.cpp#L15 to not encode unicode in such format, but this change was not sufficient to print out utf8 characters through the CLI. The below error was returned,
error: couldn't parse reply from server
@kaykurokawa , I did the same change originally and saw the same error. I realized that the univalue changes went all across the codebase. Hence, the only approach I had was to upgrade the whole package. That's what I did in the branch named "normalization_plus_base64". You're welcome to rebase that branch if needed.
Done in master. We now return hexadecimal for the value fields.