Improve --text search timeouts #71
Reference: LBRYCommunity/hub#71
For example,
./lbrynet claim search --text="(\"silver\" + bitten)"
finally returns after the 5th try for me.
I modified scripts/test_claim_search.py and tested a few text queries against spvNN.lbry.com. The hubs that accept connections usually reply within the 10s timeout, but I did get one close call (9.9s) and one timeout. (The non-responsive spv11, 12, 13, 14, and 15 are omitted.)
Second query:
Third query:
Perhaps the thing to do here is spread the load around more. It looks like the SDK selects the hub with the lowest-latency SPVPong response. This could be misleading, as it doesn't account for Elasticsearch latency and other work that goes into servicing hub RPCs.
Also, hub performance could change with day of week or time of day. I don't see any provision for reacting to deteriorated performance, or to a claim_search timeout, by choosing a different hub.
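To make the failover idea concrete, here is a minimal sketch of trying hubs in order and moving on when one times out. It is not the SDK's actual code: `search_with_failover` and the `search_fn` callable are hypothetical names standing in for whatever issues the claim_search RPC.

```python
import asyncio

async def search_with_failover(hubs, query, search_fn, timeout_s=10.0):
    """Try each hub in order; return (hub, result) for the first timely reply.

    hubs:      hub addresses, e.g. already sorted by SPVPong latency
    search_fn: hypothetical coroutine function (hub, query) -> result
    """
    failures = []
    for hub in hubs:
        try:
            # Bound each attempt so one slow hub cannot consume the whole budget.
            result = await asyncio.wait_for(search_fn(hub, query), timeout=timeout_s)
            return hub, result
        except (asyncio.TimeoutError, ConnectionError) as exc:
            # Record the failure and fall through to the next hub.
            failures.append((hub, type(exc).__name__))
    raise RuntimeError(f"all hubs failed: {failures}")
```

A real implementation would probably also remember which hubs timed out recently, so that later searches skip them for a while.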
Other ideas from ES documentation:
https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html
https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html#search-as-few-fields-as-possible
Hard to say what effect this would have. But 6 fields are being searched currently:
34c5ab2e56/hub/common.py (L907)
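One way to measure the effect would be to build the same text query against different field lists and time each variant. The sketch below assumes a simple_query_string query. The helper name and the field names in the usage lines are placeholders, not necessarily the six fields the hub actually searches.

```python
def text_query(text, fields):
    """Build a simple_query_string search body restricted to `fields`."""
    return {
        "query": {
            "simple_query_string": {
                "query": text,
                # Fewer entries here means fewer fields for ES to scan.
                "fields": list(fields),
            }
        }
    }

# Placeholder field lists for an A/B timing comparison.
wide = text_query('"silver" + bitten', ["claim_name", "title", "description", "tags"])
narrow = text_query('"silver" + bitten', ["claim_name", "title"])
```

The ES tuning guide linked above also suggests copying several fields into a single combined field at index time with copy_to, which would require a mapping change and a reindex.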
Another observation: the --query_timeout_ms value (10s default) is passed into the AsyncElasticsearch() constructor:
35483fa0b1/hub/herald/search.py (L62)
However, there are API-level timeout params accepted for individual calls to ES:
https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/config.html#_api_and_server_timeouts
API-level timeout for search:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout
The allow_partial_search_results option (default true) means the search should never fail on hitting the (API-level) timeout, but return whatever it has available after the time budget is exhausted:
https://www.elastic.co/guide/en/elasticsearch/reference/8.3/search-search.html#search-search-api-query-params
Here's the search invocation (no timeout=X):
34c5ab2e56/hub/herald/search.py (L208)
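A rough sketch of what passing an API-level timeout could look like. It assumes a 7.x-style elasticsearch-py client, where search() accepts `body`, `timeout`, and a per-call `request_timeout`. The `search_with_budget` wrapper and the 2-second headroom are illustrative choices, not hub code.

```python
async def search_with_budget(client, index, body, budget_ms=10_000):
    """Run a search with an Elasticsearch-side time budget.

    client is assumed to be an elasticsearch.AsyncElasticsearch instance
    (7.x-style call signature). Returns (hits, partial).
    """
    response = await client.search(
        index=index,
        body=body,
        # Search-level budget: ES stops collecting hits when it expires.
        timeout=f"{budget_ms}ms",
        # Client-side transport timeout, with headroom so the (possibly
        # partial) response can still arrive before the client gives up.
        request_timeout=budget_ms / 1000 + 2,
    )
    # ES sets timed_out=true when the budget ran out and results are partial.
    partial = bool(response.get("timed_out"))
    return response["hits"]["hits"], partial
```

The caller can then decide whether partial results are acceptable or whether to retry, rather than receiving a transport-level timeout with nothing to show.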