Search ranking improvement #58

Closed
opened 2018-04-04 22:43:42 +02:00 by ghost · 10 comments
ghost commented 2018-04-04 22:43:42 +02:00 (Migrated from github.com)

When searching for @imineblocks, you get two channels with the same name. The channel with only one item ranks higher than the other channel, which has 300 items and the vanity name claim.

The one with the vanity name and more items should rank on top. The search result should also show which channel has the vanity name claim when multiple channels of the same name are listed.

tiger5226 commented 2018-04-07 07:06:29 +02:00 (Migrated from github.com)

@m160 - Thanks for bringing this to our attention. This is an awkward situation, but it appears that the channel with 1 item ranks higher because it is the claim with the winning bid for the channel name "@imineblocks". Our search algorithm is expected to rank claims that hold the winning bid for a name higher than claims that do not, so this should explain the search results.

However, why would that be since it is clear that the channel with 300 items is much more relevant? I could even say that the intention appears to be to move content to the new claim.

#19d59f352adb3fde2eb9b448b343f2528b1a1313(1 item)

EffectiveAmount: 1,000,000,000
LastTakeOver: 346,073
ValidAtHeight: 171,990

#bc5981833c596cb4ac0b51930827a827d5feb64c(300 items)

EffectiveAmount: 1,001,000,000
LastTakeOver:
ValidAtHeight: 346,073

Based on the above information, the effective amount is clearly greater for the second channel claim, so why does it lose the bidding war? You can see that it is not winning because it does not have a last takeover value. That value records the block at which a channel claim becomes the winning bidder.

The other claim is winning because of some game theory the creators of LBRY put deep thought into. If you claim a channel, you would not want someone with more LBC to steal it from you. People could be very sneaky. So the score that decides who wins a bid also accounts for how long they have held the channel claim, not just the effective amount. If you look at the valid-at height, you can see this publisher has been around much longer than the second one. This does not prevent someone from taking the channel name; it just takes longer and costs more LBC.

There are some really great articles on this topic:

Naming system - https://lbry.io/faq/naming

Claimtrie - https://lbry.io/faq/claimtrie-implementation

ghost commented 2018-04-07 09:20:23 +02:00 (Migrated from github.com)

Wait, so the channel with 1 item has the winning bid?
Then I have another issue for you: the exact url (lbry://@imineblocks) takes you to the channel with 300 items. This is why I believed the channel with 300 items had the vanity name.

tzarebczan commented 2018-04-07 19:48:13 +02:00 (Migrated from github.com)

@tiger5226 I think you misinterpreted the last takeover value: that's when claim #2 actually took over. The valid-at height for claim #2 is 346073 and the current height is 350142, so it's the winning claim.

@tiger5226 does the current search implementation take updates to these claims into account? I.e., I could add support to the first claim right now and it would take back over. Just making sure we are taking that into account - IIRC, at one point, it may not have been.

lbrynet-cli claim_list @imineblocks
{
  "claims": [
    {
      "address": "baBPZJi5kATnvb4chjq3rQBsQ2FrzMAwK6",
      "amount": 10.0,
      "claim_id": "19d59f352adb3fde2eb9b448b343f2528b1a1313",
      "claim_sequence": 1,
      "decoded_claim": true,
      "depth": 178153,
      "effective_amount": 10.0,
      "has_signature": false,
      "height": 171990,
      "hex": "08011002225e0801100322583056301006072a8648ce3d020106052b8104000a03420004ff6e5865818f324fc726510844115e64def80229ad3091c2ba9431796fa9032c06d91329f19b3c9d6d8fde1cd7e914bc0e628bc30e8d10ec712cc417cd920b87",
      "name": "@imineblocks",
      "nout": 0,
      "permanent_url": "@imineblocks#19d59f352adb3fde2eb9b448b343f2528b1a1313",
      "supports": [],
      "txid": "6f3cc670fbf81d5517a47d8e5fa70961a938f40bee82425609049e412cf54eea",
      "valid_at_height": 171990,
      "value": {
        "certificate": {
          "keyType": "SECP256k1",
          "publicKey": "3056301006072a8648ce3d020106052b8104000a03420004ff6e5865818f324fc726510844115e64def80229ad3091c2ba9431796fa9032c06d91329f19b3c9d6d8fde1cd7e914bc0e628bc30e8d10ec712cc417cd920b87",
          "version": "_0_0_1"
        },
        "claimType": "certificateType",
        "version": "_0_0_1"
      }
    },
    {
      "address": "bZht5jLbncbzvaNBBVZMQbaMDnMdP5LVsg",
      "amount": 10.01,
      "claim_id": "bc5981833c596cb4ac0b51930827a827d5feb64c",
      "claim_sequence": 2,
      "decoded_claim": true,
      "depth": 8102,
      "effective_amount": 10.01,
      "has_signature": false,
      "height": 342041,
      "hex": "08011002225e0801100322583056301006072a8648ce3d020106052b8104000a034200045642c26f57fd1d7642f4c5f77be9085379f5e3d17dee12ec3416b1d36b1b9d4341ed48a87da43918331574ce7a94355d05124510b015ef1ca062cf2470e9a075",
      "name": "@imineblocks",
      "nout": 1,
      "permanent_url": "@imineblocks#bc5981833c596cb4ac0b51930827a827d5feb64c",
      "supports": [],
      "txid": "05d3a30877e43706a113b8a6688df1bc56a0d86b9ae56cdde1077ffb01a21311",
      "valid_at_height": 346073,
      "value": {
        "certificate": {
          "keyType": "SECP256k1",
          "publicKey": "3056301006072a8648ce3d020106052b8104000a034200045642c26f57fd1d7642f4c5f77be9085379f5e3d17dee12ec3416b1d36b1b9d4341ed48a87da43918331574ce7a94355d05124510b015ef1ca062cf2470e9a075",
          "version": "_0_0_1"
        },
        "claimType": "certificateType",
        "version": "_0_0_1"
      }
    }
  ],
  "last_takeover_height": 346073,
  "supports_without_claims": []
}
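As a rough illustration of the reasoning above, the sketch below picks the controlling claim from a trimmed copy of the `claim_list` output: a claim only competes once the chain has reached its `valid_at_height`, and the largest `effective_amount` among those wins. The function name `controlling_claim` and the tie-break rule are assumptions made for this example, and the real claimtrie logic has more cases than this.

```python
import json

# Trimmed copy of the claim_list output above; only the fields used here are kept.
CLAIM_LIST = json.loads("""
{
  "claims": [
    {"claim_id": "19d59f352adb3fde2eb9b448b343f2528b1a1313",
     "effective_amount": 10.0, "height": 171990, "valid_at_height": 171990},
    {"claim_id": "bc5981833c596cb4ac0b51930827a827d5feb64c",
     "effective_amount": 10.01, "height": 342041, "valid_at_height": 346073}
  ],
  "last_takeover_height": 346073
}
""")


def controlling_claim(claim_list, current_height):
    """Return the active claim with the largest effective amount, or None.

    Hypothetical helper: a simplified reading of the takeover rules,
    not the lbrycrd implementation.
    """
    # A claim only counts once the chain has reached its valid_at_height.
    active = [c for c in claim_list["claims"]
              if c["valid_at_height"] <= current_height]
    if not active:
        return None
    # Assumed tie-break: with equal amounts, the older claim (lower height) wins.
    return max(active, key=lambda c: (c["effective_amount"], -c["height"]))
```

With this data, any height from 346073 onward selects claim `bc59…` (10.01 LBC), while a height of 346072 still selects claim `19d5…`, because the newer claim has not become valid yet.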
tiger5226 commented 2018-04-08 05:31:20 +02:00 (Migrated from github.com)

@m160 Thanks for reopening this!

@tzarebczan On point 1, yes, I agree, I misinterpreted that. I queried lbrycrd directly, and the sequence of claims led me to believe that the non-controlling claim was the controlling one, when in reality the sequence means nothing. I immediately saw that the first claim had been there for quite some time, so I thought this was a clear example of a minute takeover bid failing to take the name because of the duration restrictions. I am still curious why it was allowed to be taken so easily. It had held the vanity name since block 172K, and a bid only 0.01 LBC higher allowed it to be overtaken pretty quickly. Looking at the docs, it appears I could take over anyone's channel in the same manner. Hopefully they notice my bid within 7 days; otherwise, I have technically "hijacked" their channel and I get all their traffic for a period of time, however short.
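The 7-day window can be checked against the numbers in this thread. The sketch below assumes the activation-delay rule as I understand it from the LBRY naming docs linked earlier (one block of delay per 32 blocks since the last takeover, capped at 4032 blocks), and it assumes the previous takeover was at height 171990, when the first claim was made. The function names are made up for this example.

```python
MAX_DELAY_BLOCKS = 4032  # assumed cap on the delay, about 7 days at ~2.5 minutes per block
DELAY_DIVISOR = 32       # assumed: one block of delay per 32 blocks the name has been held


def activation_delay(claim_height, last_takeover_height):
    """Blocks a competing claim waits before it becomes valid (assumed rule)."""
    held_for = claim_height - last_takeover_height
    return min(MAX_DELAY_BLOCKS, held_for // DELAY_DIVISOR)


def valid_at_height(claim_height, last_takeover_height):
    """Height at which the competing claim becomes valid and can take over."""
    return claim_height + activation_delay(claim_height, last_takeover_height)
```

For the second claim, `valid_at_height(342041, 171990)` gives 342041 + min(4032, 5314) = 346073, which matches both `valid_at_height` and `last_takeover_height` in the `claim_list` output. Under this rule the size of the outbid does not matter; the only protection is the delay, which is already at its cap for a name held this long.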

@tzarebczan On point 2, I delved a bit deeper. I ran the query for @imineblocks directly against Elasticsearch from Kibana. The result showed that the two claims had equal weight, so there was no preference for one over the other.

The iscontrolling value received from lbrycrd was true for both of them in Elasticsearch. Looking at the sync logic, we only add claims and never update them, as you suspected. Once a claim is added, it is never updated in Elasticsearch. It seems to me that we need a refresh every so often, like a full sync that pushes everything. Elasticsearch will update the document if you send it in with the same claimid. So maybe once per day we do a bigger sync, since we don't really know whether there are updates or not. Thoughts?
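To make the difference concrete, here is a minimal sketch that uses a plain dict in place of the Elasticsearch index. It is not the lighthouse sync code; the function names and the shortened claim ids are invented for illustration. It shows how an add-only sync leaves both claims flagged as controlling, while a sync that overwrites by claim id does not.

```python
def sync_add_only(index, claims):
    """Insert claims that are not indexed yet; never touch existing documents."""
    for claim in claims:
        index.setdefault(claim["claim_id"], dict(claim))


def sync_upsert(index, claims):
    """Insert or overwrite each claim, keyed by its claim id."""
    for claim in claims:
        index[claim["claim_id"]] = dict(claim)


def controlling_ids(index):
    """Claim ids that the index currently believes are controlling."""
    return sorted(cid for cid, doc in index.items() if doc["iscontrolling"])


# Chain state before and after the takeover (claim ids shortened for the example).
before = [{"claim_id": "19d5", "iscontrolling": True}]
after = [{"claim_id": "19d5", "iscontrolling": False},
         {"claim_id": "bc59", "iscontrolling": True}]

stale = {}
sync_add_only(stale, before)
sync_add_only(stale, after)   # the old document for 19d5 is left untouched

fresh = {}
sync_upsert(fresh, before)
sync_upsert(fresh, after)     # the old document for 19d5 is overwritten
```

Here `controlling_ids(stale)` returns both claims, which is the state seen in Kibana, and `controlling_ids(fresh)` returns only `bc59`.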

lyoshenka commented 2018-04-08 13:10:42 +02:00 (Migrated from github.com)

How does chainquery fit into this? Would regularly syncing from chainquery be the solution?

tiger5226 commented 2018-04-15 04:26:11 +02:00 (Migrated from github.com)

@lyoshenka Sorry for the delayed response here. Chainquery will allow lighthouse to query all claims updated since a specific date and time. Every time a claim is modified, we timestamp it in chainquery. So in theory this will be a much simpler process with chainquery. The only gap would be expired or spent claims, which we would then not be deleting. Since we want all claims to be searchable, this is a non-issue.
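A minimal sketch of that incremental approach follows, with an in-memory list standing in for chainquery and a dict standing in for the search index. The `modified_at` field and both function names are assumptions made for this example, not the chainquery schema or API.

```python
from datetime import datetime, timezone


def claims_modified_since(claims, since):
    """Claims whose (hypothetical) modified_at timestamp is newer than the cursor."""
    return [c for c in claims if c["modified_at"] > since]


def incremental_sync(index, claims, last_synced):
    """Upsert only the claims changed since last_synced and return the new cursor."""
    changed = claims_modified_since(claims, last_synced)
    for claim in changed:
        index[claim["claim_id"]] = dict(claim)
    # Advance the cursor to the newest timestamp seen, or keep it if nothing changed.
    return max((c["modified_at"] for c in changed), default=last_synced)
```

Each run only touches claims modified after the stored cursor, so no periodic full sync is needed. Expired or spent claims are never removed by this loop, which matches the gap described above.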

tiger5226 commented 2018-05-13 01:50:30 +02:00 (Migrated from github.com)

This will be solved once #71 is closed out as it has the new sync process where claims will get the updates.

tiger5226 commented 2018-05-17 23:59:17 +02:00 (Migrated from github.com)

OK, so now that claim updates are coming in and syncing properly with the latest release of lighthouse, we can add weight for controlling claims. This is a separate issue, so I split it out into #75.
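One way such a weight could work is sketched below: results that are otherwise equally relevant are reordered by multiplying the score of the controlling claim. The `relevance` field, the boost value, and the `rank` function are hypothetical and are not taken from lighthouse.

```python
CONTROLLING_BOOST = 2.0  # hypothetical multiplier, not a value used by lighthouse


def rank(results):
    """Sort results by relevance, boosting claims flagged as controlling."""
    def score(result):
        boost = CONTROLLING_BOOST if result["iscontrolling"] else 1.0
        return result["relevance"] * boost
    return sorted(results, key=score, reverse=True)


# Two channels with the same name and the same text relevance.
results = [
    {"claim_id": "19d5", "relevance": 1.0, "iscontrolling": False},
    {"claim_id": "bc59", "relevance": 1.0, "iscontrolling": True},
]
ranked = rank(results)
```

With equal relevance, the controlling claim `bc59` sorts first. A multiplier keeps text relevance in play, so a controlling claim that matches the query poorly can still rank below a much better match.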

tiger5226 commented 2018-06-21 04:37:27 +02:00 (Migrated from github.com)

This will be solved by the additions for controlling weights. https://github.com/lbryio/lighthouse/pull/81#event-1692597622

tiger5226 commented 2018-07-04 06:18:46 +02:00 (Migrated from github.com)

deployed

Reference: LBRYCommunity/lighthouse.js#58