Search ranking improvement #58
Reference: LBRYCommunity/lighthouse.js#58
When searching @imineblocks you get two channels with the same name. The channel with only one item ranks higher than the other channel with 300 items and the vanity name claim.
The one with the vanity name and more items should rank on top. The search results should also show which channel has the vanity name claim when multiple channels with the same name are listed.
@m160 - Thanks for bringing this to our attention. This is an awkward situation, but it appears that the channel with 1 item ranks higher because it is the claim with the winning bid for the channel name "@imineblocks". Our search algorithm is expected to rank claims that hold the winning bid for a name higher than claims that do not, so this should explain the results of the search.
However, why would that be since it is clear that the channel with 300 items is much more relevant? I could even say that the intention appears to be to move content to the new claim.
#19d59f352adb3fde2eb9b448b343f2528b1a1313 (1 item)
EffectiveAmount: 1,000,000,000
LastTakeOver: 346,073
ValidAtHeight: 171,990
#bc5981833c596cb4ac0b51930827a827d5feb64c (300 items)
EffectiveAmount: 1,001,000,000
LastTakeOver:
ValidAtHeight: 346,073
Based on the above information, it is clear that the effective amount is greater for the second channel claim, so why does it lose the bidding war? You can see that it is not winning because it does not have a last take over. This just references at which block in the blockchain this channel claim becomes the winning bidder.
The other is winning because of some game theory the creators of LBRY put deep thought into. If you claim a channel, you would not want someone with more LBC to steal it from you. People could be very sneaky. So the score that decides who wins a bid also accounts for how long they have had the channel claimed, not just the effective amount. If you look at the valid at height, you can see this publisher has been around much longer than the second one. This does not prevent someone from taking the channel name; it just takes longer and more LBC to get it.
There are some really great articles on this topic:
Naming system - https://lbry.io/faq/naming
Claimtrie - https://lbry.io/faq/claimtrie-implementation
Wait, so the channel with 1 item has the winning bid?
Then I have another issue for you: the exact url (lbry://@imineblocks) takes you to the channel with 300 items. This is why I believed the channel with 300 items had the vanity name.
@tiger5226 think you misinterpreted the last take over value - that's when it was actually taken over by claim #2. Valid at height for claim #2 is 346073 and the current height is 35014, so it's the winning claim.
@tiger5226 does the current search implementation take updates to these claims into account? I.e. I could add support for the first claim right now and it would take back over. Just making sure we are taking that into account - IIRC, at one point, it may not have been.
@m160 Thanks for reopening this!
@tzarebczan On point 1, yes, I agree, I misinterpreted that. I actually queried lbrycrd and the sequence of claims led me to believe that the non-controlling one was the controlling one, when in reality the sequence means nothing. I immediately saw that the first one had been there for quite some time, so I thought this was a clear example of a minute takeover bid not taking the name due to the duration restrictions. Still curious why it was allowed to be taken so easily. It had the vanity name since block 172K, and 1/100th of the claim amount allowed it to be overtaken pretty quickly. Looking at the docs, it appears I could take over anyone's channel in the same manner and hope they do not notice my bid within 7 days; otherwise, I have technically "hijacked" their channel and I get all their traffic for a period of time, however short.
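For reference, the duration restriction discussed above can be sketched as follows. This assumes the rule as described in the claimtrie docs: the delay before a new claim can take over grows by one block for every 32 blocks since the name was last taken over, capped at 4032 blocks (about 7 days). The function name and the heights in the example are illustrative only, not read from the chain.

```python
MAX_DELAY_BLOCKS = 4032  # cap on the delay, roughly 7 days of blocks


def takeover_delay(claim_height: int, last_takeover_height: int) -> int:
    """Blocks a competing claim waits before it can become controlling.

    Assumed rule: one block of delay per 32 blocks since the last
    takeover of the name, capped at MAX_DELAY_BLOCKS.
    """
    held_for = claim_height - last_takeover_height
    return min(MAX_DELAY_BLOCKS, held_for // 32)


# A name taken over only 320 blocks ago: a competing claim waits 10 blocks.
short_delay = takeover_delay(claim_height=1_320, last_takeover_height=1_000)

# A name held since roughly block 172K: the delay hits the cap.
capped_delay = takeover_delay(claim_height=346_000, last_takeover_height=171_990)
```

Under this rule the delay stops growing once a name has been held for 4032 * 32 = 129,024 blocks, so a long-held name gets at most about a week of notice no matter how small the outbidding margin is.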
@tzarebczan On point 2, I delved a bit deeper. I performed the query for @imineblocks directly on Elasticsearch from Kibana. The result showed that these two claims were equal in weight, so there was no preference for one over the other. Digging further, the iscontrolling value received from lbrycrd was true for both of them in Elasticsearch. Looking at the sync logic, we are only adding, not updating, as you suggested. Once a claim is added, it is never updated in Elasticsearch. It seems to me that we need a refresh every so often, like a full sync that pushes everything. Elasticsearch will update the existing document if you send it in with the same claimid. So maybe once per day we do a bigger sync, since we don't really know whether there are updates or not. Thoughts?
How does chainquery fit into this? Would regularly syncing from chainquery be the solution?
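The full-resync idea relies on writes being keyed by claim id, so that re-sending a claim overwrites the stale copy instead of adding a second one. A minimal sketch of that behaviour, using a plain dict as a stand-in for the Elasticsearch index (the function and field names here are hypothetical, not lighthouse's actual code):

```python
from typing import Dict, Iterable

# Stand-in for a search index: document id -> document body.
Index = Dict[str, dict]


def upsert_claims(index: Index, claims: Iterable[dict]) -> int:
    """Write each claim under its claim id.

    Returns how many already-indexed documents were overwritten.
    """
    overwritten = 0
    for claim in claims:
        claim_id = claim["claim_id"]
        if claim_id in index:
            overwritten += 1
        index[claim_id] = dict(claim)  # same id replaces the old document
    return overwritten


# Both claims were indexed as controlling and never updated afterwards.
index: Index = {
    "19d59f35": {"claim_id": "19d59f35", "is_controlling": True},
    "bc598183": {"claim_id": "bc598183", "is_controlling": True},
}

# A later full sync pushes the current state from the chain.
fresh = [
    {"claim_id": "19d59f35", "is_controlling": False},
    {"claim_id": "bc598183", "is_controlling": True},
]
replaced = upsert_claims(index, fresh)
```

After the resync the index still holds two documents, but only one of them is marked as controlling.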
@lyoshenka Sorry for the delayed response here. Chainquery will allow lighthouse to query all claims updated since a specific date and time. Every time a claim is modified, we timestamp it in chainquery. So in theory this will be a much simpler process with chainquery. The only gap would be expired or spent claims, which we would then not be deleting. Since we want all claims to be searchable, this is a non-issue.
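As a rough sketch of that incremental approach: keep the timestamp of the last sync, pull only the claims modified after it, and advance the timestamp. The modified_at field and the helper below are assumptions for illustration, not chainquery's confirmed schema or API.

```python
from datetime import datetime
from typing import Iterable, List, Tuple


def claims_to_sync(
    claims: Iterable[dict], last_sync: datetime
) -> Tuple[List[dict], datetime]:
    """Return claims modified after last_sync and the new sync watermark."""
    changed = [c for c in claims if c["modified_at"] > last_sync]
    # Advance the watermark to the newest modification seen, if any.
    new_watermark = max((c["modified_at"] for c in changed), default=last_sync)
    return changed, new_watermark


rows = [
    {"claim_id": "a", "modified_at": datetime(2018, 5, 1, 12, 0)},
    {"claim_id": "b", "modified_at": datetime(2018, 5, 3, 9, 30)},
    {"claim_id": "c", "modified_at": datetime(2018, 5, 4, 18, 45)},
]
changed, watermark = claims_to_sync(rows, last_sync=datetime(2018, 5, 2))
changed_ids = [c["claim_id"] for c in changed]
```

Only claims b and c are returned here, and the next run would start from the timestamp of claim c.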
This will be solved once #71 is closed out as it has the new sync process where claims will get the updates.
OK, so now that claim updates are coming in and syncing properly with the latest release of lighthouse, we can add weight for controlling claims. This is a separate issue, so I split it out into #75.
This will be solved with the additions for controlling weights. https://github.com/lbryio/lighthouse/pull/81#event-1692597622
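One possible way to weight controlling claims is an Elasticsearch function_score query that multiplies the text-match score for documents flagged as controlling. This is only a sketch of the general technique: the field names (name, is_controlling), the weight value, and the builder function are assumptions, and the change tracked in #75 may work differently.

```python
def build_search_query(term: str, controlling_weight: float = 10.0) -> dict:
    """Build a query body that boosts controlling claims."""
    return {
        "query": {
            "function_score": {
                # Base relevance: plain text match on the claim name.
                "query": {"match": {"name": term}},
                # Apply the weight only to documents flagged as controlling.
                "functions": [
                    {
                        "filter": {"term": {"is_controlling": True}},
                        "weight": controlling_weight,
                    }
                ],
                "score_mode": "sum",
                # Multiply the base score by the function score.
                "boost_mode": "multiply",
            }
        }
    }


body = build_search_query("@imineblocks")
```

With a query shaped like this, two channels that match the name equally well would no longer tie: the controlling one gets its score multiplied by the weight, while the other keeps its base score.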
deployed