Case-insensitive claims #65
Reference: LBRYCommunity/lbrycrd#65
LBRY URLs ought to be case-insensitive, the way traditional domains are. Two claims whose names differ only in case should be considered to have the same name.
How would this affect already existing claims?
a) We need to have the claimtrie work with Unicode, not bytes. If we are attaching strict interpretations to what the bytes mean, are non-Unicode bytes now invalid? Which Unicode encoding are we using?
b) We need to define "case" here. English is of course easy, but it is kind of silly to deal with just English "cases" and ignore other languages. Is there a standard we can adopt or work off of here?
c) As aayani said, we need to determine what happens to the already existing claims when this change occurs (do claims dog and DOG get merged into the same name?) and implement this logic.
It is also possible to do this via soft fork by rejecting claims with cases in them after a certain block and letting already existing ones expire.
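As a rough illustration of the soft-fork idea above, here is a minimal Python sketch. The names `FORK_HEIGHT` and `accepts_claim` are hypothetical, the height value is arbitrary, and "has case" is assumed to mean "changes when lowercased"; none of this is taken from lbrycrd itself.

```python
# Hypothetical activation height, chosen only for illustration.
FORK_HEIGHT = 500


def accepts_claim(name: str, height: int) -> bool:
    """Accept any name before the fork; afterwards accept only names
    that are unchanged by lowercasing (an assumed definition of "no case")."""
    if height < FORK_HEIGHT:
        return True
    return name == name.lower()


print(accepts_claim("DOG", 10))    # before the fork height
print(accepts_claim("DOG", 1000))  # after the fork height
```

Under this scheme existing mixed-case claims are left alone until they expire, which is why it would not need a hard fork.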
a) I'm not sure of the answer here, but if some claims get invalidated that would be acceptable (though obviously not ideal). I don't have a problem with supporting non-Unicode or arbitrary byte claims, so long as when/if the claim is valid Unicode, we are considering case-sensitivity.
b) I believe Unicode already defines lower- and upper-case characters, as you can use regular expression categories like \p{Ll} and \p{Lu} to refer to lower- and upper-case characters.
c) Yes, these would get merged into the same name under this proposal.
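To make points (a) and (b) concrete, the sketch below uses Python's standard `unicodedata` module, which exposes the same general categories (Lu, Ll, Lt) that the regular expression classes refer to. It assumes claim names are UTF-8, which the thread has not settled, and the function names are hypothetical.

```python
import unicodedata
from typing import Optional


def decode_claim_name(raw: bytes) -> Optional[str]:
    """Return the name as text if the bytes are valid UTF-8 (assumed
    encoding), or None so the caller can treat it as an opaque byte claim."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def has_uppercase(name: str) -> bool:
    """True if any character is in category Lu (uppercase) or Lt (titlecase)."""
    return any(unicodedata.category(ch) in ("Lu", "Lt") for ch in name)


print(has_uppercase("eXistenZ"))
print(decode_claim_name(b"\xff\xfe"))
```

Names that fail to decode could then be compared byte-for-byte, while valid Unicode names get case-insensitive treatment.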
I'd prefer the solution here to be that claims retain the original provided capitalization, rather than preventing capitalization in claims. That is, we want someone to be able to claim lbry://eXistenZ and retain the capitalization, but still have lbry://existenz resolve to it.
So if two people own the same channel/claim name with different capitalizations, the resolve will default to the one with the higher bid, right?
Yes, that's the idea.
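A minimal sketch of that behavior, assuming a simplified model in which the winner is decided by bid amount alone (the real claimtrie rules may consider more than that). The `Claim` class and `resolve` function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Claim:
    name: str    # capitalization exactly as the claimant submitted it
    amount: int  # bid amount in an arbitrary unit


def resolve(claims: List[Claim], query: str) -> Optional[Claim]:
    """Compare names case-insensitively and return the highest bid,
    leaving the stored capitalization untouched."""
    key = query.casefold()
    matches = [c for c in claims if c.name.casefold() == key]
    return max(matches, key=lambda c: c.amount, default=None)


claims = [Claim("eXistenZ", 5), Claim("existenz", 3)]
winner = resolve(claims, "EXISTENZ")
print(winner.name if winner else None)
```

The lookup key is derived at comparison time, so the original spelling is what gets displayed.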
What is the status of this feature? I'd like to counter that by the argument of address space being tightly coupled to length of string and variance in characters. This added expense to hashing is equivalent to mandating everything be in lower-case and have a doubly long URL. It'd only deter a few of the thousands of fanboys who need the name Sephiroth and they'll add any number or iteration to get that name to show the world how much they love that name... It sounds like you found an elegant solution though.
This provides a standardized method of normalization for Unicode characters: http://unicode.org/reports/tr15/#Notation , and is currently being proposed as the method for implementing case-insensitive claims along with other normalizations.
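For a sense of what such normalization could look like, here is a small Python sketch. The choice of the NFD form followed by case folding is an assumption for illustration; the comment above does not say which of the TR15 forms is proposed, and `normalize_name` is a hypothetical helper.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Decompose to NFD (assumed form), then case-fold, so that visually
    equivalent spellings map to one comparison key."""
    return unicodedata.normalize("NFD", name).casefold()


# Precomposed "é" versus "E" followed by a combining acute accent.
print(normalize_name("Caf\u00e9") == normalize_name("CAFE\u0301"))
```

Without the normalization step, the two spellings above would be different code point sequences even after lowercasing.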
Another alternative is Punycode: https://en.wikipedia.org/wiki/Punycode , https://tools.ietf.org/html/rfc3492
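For reference, Python ships a `punycode` codec implementing RFC 3492, so the encoding can be tried directly. Lowercasing before encoding is an assumption added here, since Punycode by itself does not make names case-insensitive; `to_ascii_key` is a hypothetical helper.

```python
def to_ascii_key(name: str) -> str:
    """Lowercase the name (assumed step), then encode it as Punycode
    so the result is plain ASCII."""
    return name.lower().encode("punycode").decode("ascii")


encoded = to_ascii_key("M\u00fcnchen")
print(encoded)
# The encoding is reversible back to the lowercased name.
print(encoded.encode("ascii").decode("punycode"))
```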
Here is a Unicode vulnerability that was floating around as a phishing attack on Telegram recently:
https://krebsonsecurity.com/2011/09/right-to-left-override-aids-email-attacks/
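One way to guard against this class of spoofing is to refuse names containing bidirectional control characters. The sketch below is an assumed mitigation, not something proposed in the thread; the set covers the explicit embedding, override, and isolate controls (U+202A to U+202E and U+2066 to U+2069), and `contains_bidi_control` is a hypothetical name.

```python
# Explicit bidirectional formatting characters, including the
# right-to-left override U+202E used in the linked attack.
BIDI_CONTROLS = {chr(cp) for cp in range(0x202A, 0x202F)} | {
    chr(cp) for cp in range(0x2066, 0x206A)
}


def contains_bidi_control(name: str) -> bool:
    """True if the name contains any bidirectional control character."""
    return any(ch in BIDI_CONTROLS for ch in name)


# Renders with the tail reversed in bidi-aware displays.
print(contains_bidi_control("invoice\u202efdp.exe"))
print(contains_bidi_control("invoice.pdf"))
```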
Addressed by #159