Case-insensitive claims #65

Closed
opened 2017-06-30 22:30:12 +02:00 by kauffj · 9 comments
kauffj commented 2017-06-30 22:30:12 +02:00 (Migrated from github.com)

LBRY URLs ought to be case-insensitive, the way traditional domains are. Two claims whose names differ only in case should be considered to have the same name.
WaveringAna commented 2017-09-23 19:06:27 +02:00 (Migrated from github.com)

How would this affect already existing claims?
kaykurokawa commented 2017-10-31 18:50:50 +01:00 (Migrated from github.com)

a) We need to have the claimtrie work with Unicode rather than bytes if we are attaching strict interpretations to what the bytes mean. Are non-Unicode bytes now invalid? Which Unicode encoding are we using?

b) We need to define "case" here. English is of course easy, but it is kind of silly to deal with just English "cases" and ignore other languages. Is there a standard we can adopt or work off of here?

c) As aayani said, we need to determine what happens to the already existing claims when this change occurs (do the claims dog and DOG get merged into the same name?) and implement this logic.

It is also possible to do this via soft fork by rejecting claims with cases in them after a certain block and letting already existing ones expire.
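
Questions a) and b) can be made concrete with a small Python sketch. This is an illustration only, not lbrycrd code: `decode_claim_name` and `same_name` are hypothetical helpers, UTF-8 is assumed as the encoding, and Python's built-in `str.casefold` stands in for whatever case standard ends up being adopted.

```python
from typing import Optional


def decode_claim_name(raw: bytes) -> Optional[str]:
    """Return the name as text if the bytes are valid UTF-8, else None."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def same_name(a: bytes, b: bytes) -> bool:
    """Compare two raw claim names, ignoring case when both are valid UTF-8."""
    text_a, text_b = decode_claim_name(a), decode_claim_name(b)
    if text_a is None or text_b is None:
        # Assumption: names that are not valid UTF-8 are compared byte for byte.
        return a == b
    # casefold() applies Unicode case folding, so it is not limited to English.
    return text_a.casefold() == text_b.casefold()


print(same_name(b"dog", b"DOG"))
print(same_name("Straße".encode("utf-8"), "STRASSE".encode("utf-8")))
```

Whether invalid UTF-8 should instead be rejected outright is exactly the open question in a); the byte-for-byte fallback here is just one possible answer.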
kauffj commented 2017-10-31 19:17:57 +01:00 (Migrated from github.com)

a) I'm not sure of the answer here, but if some claims get invalidated that would be acceptable (though obviously not ideal). I don't have a problem with supporting non-Unicode or arbitrary byte claims, so long as when/if the claim is valid Unicode, we are considering case-sensitivity.

b) I believe Unicode already defines lower- and upper-case characters, as you can use regular expression categories like `\p{Ll}` and `\p{Lu}` to refer to lower- and upper-case characters respectively.

c) Yes, these would get merged into the same name under this proposal.

I'd prefer the solution here to be that claims retain the original provided capitalization, rather than preventing capitalization in claims. That is, we want someone to be able to claim lbry://eXistenZ and retain the capitalization, but still have lbry://existenz resolve to it.
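
A rough Python sketch of both points: Unicode general categories are available through the standard `unicodedata` module, and a lookup can be keyed on a case-folded name while keeping the spelling the claimant provided. `NameIndex` is a hypothetical structure for illustration, not how the claimtrie is actually organized.

```python
import unicodedata
from typing import Dict, Optional


def case_category(ch: str) -> str:
    """Return the Unicode general category, e.g. 'Lu' (upper) or 'Ll' (lower)."""
    return unicodedata.category(ch)


class NameIndex:
    """Hypothetical index: keyed by folded name, stores the original spelling."""

    def __init__(self) -> None:
        self._by_key: Dict[str, str] = {}

    def claim(self, name: str) -> None:
        # The key ignores case; the stored value keeps the capitalization.
        self._by_key[name.casefold()] = name

    def resolve(self, name: str) -> Optional[str]:
        return self._by_key.get(name.casefold())


index = NameIndex()
index.claim("eXistenZ")
print(index.resolve("existenz"))
print(case_category("X"), case_category("x"))
```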
tzarebczan commented 2017-10-31 19:43:04 +01:00 (Migrated from github.com)

So if two people own the same channel/claim name with different capitalizations, the resolve will default to the one with the higher bid, right?
kauffj commented 2017-10-31 20:07:00 +01:00 (Migrated from github.com)

Yes, that's the idea.
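
As a simplified sketch of that rule in Python: claims are grouped by case-folded name and the highest bid in each group wins. The `Claim` fields and the `winners` helper are hypothetical, ties are assumed to go to the earlier claim, and any other rules the real claimtrie applies are ignored.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Claim:
    name: str      # spelling as originally claimed
    owner: str
    amount: float  # bid backing the claim (hypothetical unit)


def winners(claims: Iterable[Claim]) -> Dict[str, Claim]:
    """Group claims by case-folded name and keep the highest bid in each group."""
    best: Dict[str, Claim] = {}
    for claim in claims:
        key = claim.name.casefold()
        # Strict '>' means an earlier claim keeps the name on a tie (assumption).
        if key not in best or claim.amount > best[key].amount:
            best[key] = claim
    return best


result = winners([Claim("dog", "alice", 1.0), Claim("DOG", "bob", 2.5)])
print(result["dog"].owner, result["dog"].name)
```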
etisdew commented 2017-11-25 00:43:53 +01:00 (Migrated from github.com)

What is the status of this feature? I'd like to counter it with the argument that address space is tightly coupled to string length and character variance. The added hashing expense is equivalent to mandating that everything be lower-case and have a URL twice as long. It would only deter a few of the thousands of fanboys who need the name Sephiroth; they'll add any number or variation to get that name and show the world how much they love it... It sounds like you found an elegant solution though.
kaykurokawa commented 2018-02-23 20:40:06 +01:00 (Migrated from github.com)

This provides a standardized method of normalization for Unicode characters: http://unicode.org/reports/tr15/#Notation . It is currently being proposed as the method for implementing case-insensitive claims, along with other normalizations.

Punycode is another alternative: https://en.wikipedia.org/wiki/Punycode , https://tools.ietf.org/html/rfc3492
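
To show roughly what the two options look like, here is a Python sketch using only the standard library. `normalize_name` and `to_punycode` are hypothetical helpers; the choice of NFD plus case folding is an assumption made for illustration, not a statement of what the final implementation does.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Canonical decomposition (NFD), case folding, then NFD again."""
    folded = unicodedata.normalize("NFD", name).casefold()
    # Case folding can produce text that is no longer normalized, so the
    # result is decomposed once more before comparison.
    return unicodedata.normalize("NFD", folded)


def to_punycode(name: str) -> str:
    """Encode a Unicode string as ASCII with Python's built-in Punycode codec."""
    return name.encode("punycode").decode("ascii")


# Precomposed "É" and "e" + combining acute accent normalize to the same name.
print(normalize_name("\u00c9") == normalize_name("e\u0301"))
print(to_punycode("bücher"))
```

Normalization makes equivalent spellings compare equal, while Punycode only changes how a name is represented in ASCII, so it would still need a separate rule for case.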
kaykurokawa commented 2018-02-23 20:54:50 +01:00 (Migrated from github.com)

Here is a Unicode vulnerability that was floating around as a phishing attack on Telegram recently:
https://krebsonsecurity.com/2011/09/right-to-left-override-aids-email-attacks/
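
One possible guard against this class of trick, sketched in Python: flag names that contain explicit bidirectional control characters such as U+202E (right-to-left override). `has_bidi_control` is a hypothetical helper, and whether such names should be rejected or merely flagged is left open.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def has_bidi_control(name: str) -> bool:
    """Return True if the name contains an explicit bidi control character."""
    return any(unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES for ch in name)


# U+202E makes the text after it render right to left, disguising the extension.
print(has_bidi_control("invoice\u202egpj.exe"))
print(has_bidi_control("eXistenZ"))
```

Ordinary right-to-left text such as Hebrew or Arabic letters is not flagged, since only the control characters themselves are checked.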
lbrynaut commented 2018-06-26 21:47:01 +02:00 (Migrated from github.com)

Addressed by #159
Reference: LBRYCommunity/lbrycrd#65