Autocomplete Query is not returning the proper results #55
Reference: LBRYCommunity/lighthouse.js#55
The autocomplete query suffers from the same problems that search did previously. The value section of the Elasticsearch document is of type nested, which means the query also needs to be a nested query. As a result, it is actually only searching the name of a claim instead of searching for the best autocomplete term across the main fields of title, description and author. Below is a current example:
https://lighthouse.lbry.io/autocomplete?s=test%20a
Result:
Now if you look at the first result, make-a-test-tube-thunderstorm, and search that with https://lighthouse.lbry.io/search?s=test%20a
Result:
You can see that the claim name test is first, but then "name":"make-a-test-tube-thunderstorm" is second. So it is only searching the name field.
The internal server error should be a separate issue. We should not hit an error by entering a query. I will create another issue for this.
Lastly, since the Elasticsearch query needs to be modified and getting this right takes some time, I don't think this is a level 1, so I increased it to a level 2.
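To illustrate the nested-field point from the description, here is a rough sketch of what an Elasticsearch query body could look like when it covers both the top-level claim name and fields under the nested value object. The field paths (value.stream.metadata.title and so on) and the function name buildAutocompleteQuery are assumptions for illustration only; they are not taken from the lighthouse code and would need to be checked against the real index mapping.

```javascript
// Hypothetical field paths: adjust to the actual index mapping.
const NESTED_PATH = 'value';
const SUGGEST_FIELDS = [
  'value.stream.metadata.title',
  'value.stream.metadata.description',
  'value.stream.metadata.author',
];

// Build a query body that matches the input as a prefix against the
// claim name and against the fields inside the nested `value` object.
function buildAutocompleteQuery(input) {
  return {
    bool: {
      should: [
        // The claim name is a top-level field, so no nested wrapper is needed.
        { match_phrase_prefix: { name: input } },
        // Fields under a `nested` mapping are only searchable through a nested query.
        {
          nested: {
            path: NESTED_PATH,
            query: {
              bool: {
                should: SUGGEST_FIELDS.map((field) => ({
                  match_phrase_prefix: { [field]: input },
                })),
              },
            },
          },
        },
      ],
    },
  };
}

console.log(JSON.stringify(buildAutocompleteQuery('test a'), null, 2));
```

Without the nested wrapper, clauses on the value.* fields would not match anything, which would be consistent with only the name field appearing to be searched.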
@tzarebczan Do you know how autocomplete is even used in the application? I don't think it is being leveraged right now. The api is no longer broken and I cannot get what I would expect from an autocomplete feature even if the results are not the best.
We are pretty much just splitting any string typed into the search box and sending that to the API. This is where we do it: https://github.com/lbryio/lbry-redux/blob/master/src/redux/actions/search.js#L72
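For anyone following along, a minimal sketch of that idea is below. It is not the lbry-redux code; the helper name buildAutocompleteUrl and the whitespace normalization are assumptions, and only the endpoint URL comes from this thread.

```javascript
// Endpoint as used in the examples earlier in this issue.
const AUTOCOMPLETE_URL = 'https://lighthouse.lbry.io/autocomplete';

// Hypothetical helper: split the raw input on whitespace, drop empty
// pieces, and send the remaining terms as the `s` query parameter.
function buildAutocompleteUrl(rawInput) {
  const terms = rawInput.trim().split(/\s+/).filter(Boolean);
  return `${AUTOCOMPLETE_URL}?s=${encodeURIComponent(terms.join(' '))}`;
}

console.log(buildAutocompleteUrl('  test   a '));
// prints "https://lighthouse.lbry.io/autocomplete?s=test%20a"
```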
@tiger5226 talked to @seanyesmunt about this - believe he was trying to use it at one point but then disabled it when it was broken. He'll look into it.
How much effort is it to make it work similar to the search function?
Ugh, sorry I missed this conversation. My notifications are all working great now, so I won't miss another one like this.
So it was a good amount of work getting the query for search right. Posted link so you can see the complexity of the query. The good news is I did a lot of the work already and can reuse about 80% of it. Most of the work for autocomplete would be tweaking and testing. I could probably do it in a half day and have it ready for deployment.
@tiger5226 , also, this is the response when searching for "url":
["ucb-UrLej5w0hJI","Japanese 7B - 2015-03-05: Novel: No Longer Human","UCBerkeley"]
Japanese 7B - 2015-03-05: Novel: No Longer Human = title of 'ucb-UrLej5w0hJI' - do we want both the title and URL returning here? The idea of autocomplete is to give the user suggestions on URLs + suggested search terms (titles? - but based on search within titles?) that match their search input. I guess the question is do we want to return a title for a search term that matches the URL or should it be searching titles for search term suggestions?
@seanyesmunt do you need an identifier to tell if a result is a URL or not? This is related to https://github.com/lbryio/lbry-app/issues/1454. We are already doing this for channels, but they are easier to identify.
So the results are not good. Below is where it collates the results.
https://github.com/lbryio/lighthouse/blob/master/server/controllers/lighthouse.js#L261
The query itself is effectively only querying against the claim names, not the things you would think. The results are just adding the name, author and title to an array of options that are then returned.
The result that we return should have a match. If it is not a match we should not return it. I would be pretty weirded out by entering url and getting back UCBerkeley as the suggestion.
I always default to the most familiar search tool because that is what the vast majority of users will be interested in and will most likely be their intent.
So in looking at the logs of lighthouse, the autocomplete is being called quite often. So I presume this is from the app and it is in fact running.
@seanyesmunt - Does this sound right?
Either way, when testing in the app, typing 1 character at a time, I notice some lag. The logs show that the current query for autocomplete takes around 12-25ms. This is very fast, however the search query is what I think is causing the lag. So we need to make sure we are not doing the same query, because that would be too noticeable when typing into the search bar. I think we should stick with the prefix query and fix the problems with its current state where it is unintentionally only searching the claim name.
Additionally, we need to run a regex over each result to make sure we only send the ones that contain the query. Not sure of the performance implications of this. I think it is a requirement so maybe it doesn't matter and we would have to fix it no matter what if this is the case. I will certainly do testing around it.
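A minimal sketch of that filtering step, assuming the suggestions arrive as a flat array of strings like the "url" response quoted earlier. The function names are hypothetical and this is not the code that ended up in lighthouse. The input is escaped first so that characters typed by the user are not interpreted as regex syntax.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Keep only the suggestions that contain the typed input (case-insensitive).
function filterSuggestions(suggestions, input) {
  const trimmed = input.trim();
  if (trimmed === '') return []; // nothing typed, nothing to suggest
  const pattern = new RegExp(escapeRegExp(trimmed), 'i');
  return suggestions.filter((suggestion) => pattern.test(suggestion));
}

const response = [
  'ucb-UrLej5w0hJI',
  'Japanese 7B - 2015-03-05: Novel: No Longer Human',
  'UCBerkeley',
];
console.log(filterSuggestions(response, 'url'));
// prints [ 'ucb-UrLej5w0hJI' ]
```

The cost is one regex test per candidate string, so for a handful of suggestions per request it should be small next to the query itself, but that is something to confirm with testing.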
I confirmed this was the case. The redesign is now using the autocomplete.
I have made the referenced changes and reworked the autocomplete query. It now returns results that actually mean something.
Query Example - "Shi"
Old results:
New Results
What it was doing was duplicating information for a claim. So if the term matched the title it would pass the claim name even if it had nothing to do with the search. So we can get more results now.
Solved with https://github.com/lbryio/lighthouse/pull/116