Improve --text search timeouts #71

Open
opened 2022-08-12 22:47:48 +02:00 by jessopb · 5 comments
jessopb commented 2022-08-12 22:47:48 +02:00 (Migrated from github.com)

For example,
./lbrynet claim search --text="(\"silver\" + bitten)"
finally returns after the 5th try for me.

For example, `./lbrynet claim search --text="(\"silver\" + bitten)"` finally returns after the 5th try for me.
moodyjon commented 2022-08-16 20:38:54 +02:00 (Migrated from github.com)

Modified scripts/test_claim_search.py and tested a few text queries against spvNN.lbry.com. The hubs that are responsive to connection are usually replying within the 10s timeout. But I did get one close call (9.9s), and one timeout.

(Omitting the non-responsive spv11,12,13,14,15)

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 0.7056792500000002,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 0.20056029199999958,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 0.817374,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.3833993339999999,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(\"silver\" + bitten)"
    },
    "offset": 0,
    "total": 12,
    "blocked_total": 0
}

Second query:

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 1.4023516669999996,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 1.0191300420000005,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.3879024590000011,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 9.945044459000002,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "(cord + extension)"
    },
    "offset": 0,
    "total": 399,
    "blocked_total": 0
}

Third query:

(lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py
{
    "server": "('spv16.lbry.com', 50001)",
    "time": 3.2578703750000004,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
{
    "server": "('spv19.lbry.com', 50001)",
    "time": 0.6107135420000009,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
{
    "server": "('spv18.lbry.com', 50001)",
    "time": 1.7782999579999998,
    "error": null,
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": 0,
    "total": 1000,
    "blocked_total": 0
}
Wallet server (spv17.lbry.com:50001) returned an error. Code: -32000 Message: query timed out
{
    "server": "('spv17.lbry.com', 50001)",
    "time": 11.295952083,
    "error": "(-32000, 'query timed out')",
    "args": {
        "no_totals": false,
        "page_size": 100,
        "page": 1,
        "text": "foo + bar | baz"
    },
    "offset": null,
    "total": null,
    "blocked_total": null
}
Modified `scripts/test_claim_search.py` and tested a few text queries against spvNN.lbry.com. The hubs that are responsive to connection are usually replying within the 10s timeout. But I did get one close call (9.9s), and one timeout. (Omitting the non-responsive spv11,12,13,14,15) ``` (lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py { "server": "('spv16.lbry.com', 50001)", "time": 0.7056792500000002, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(\"silver\" + bitten)" }, "offset": 0, "total": 12, "blocked_total": 0 } { "server": "('spv17.lbry.com', 50001)", "time": 0.20056029199999958, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(\"silver\" + bitten)" }, "offset": 0, "total": 12, "blocked_total": 0 } { "server": "('spv18.lbry.com', 50001)", "time": 0.817374, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(\"silver\" + bitten)" }, "offset": 0, "total": 12, "blocked_total": 0 } { "server": "('spv19.lbry.com', 50001)", "time": 0.3833993339999999, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(\"silver\" + bitten)" }, "offset": 0, "total": 12, "blocked_total": 0 } ``` Second query: ``` (lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py { "server": "('spv16.lbry.com', 50001)", "time": 1.4023516669999996, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(cord + extension)" }, "offset": 0, "total": 399, "blocked_total": 0 } { "server": "('spv18.lbry.com', 50001)", "time": 1.0191300420000005, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(cord + extension)" }, "offset": 0, "total": 399, "blocked_total": 0 } { "server": "('spv19.lbry.com', 50001)", "time": 0.3879024590000011, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(cord + extension)" }, "offset": 0, "total": 399, "blocked_total": 0 } { "server": "('spv17.lbry.com', 50001)", "time": 9.945044459000002, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "(cord + extension)" }, "offset": 0, "total": 399, "blocked_total": 0 } ``` Third query: ``` (lbry-venv-3.9) swdev1@Jonathans-Mac-mini lbry-sdk % python3 ./scripts/test_claim_search.py { "server": "('spv16.lbry.com', 50001)", "time": 3.2578703750000004, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "foo + bar | baz" }, "offset": 0, "total": 1000, "blocked_total": 0 } { "server": "('spv19.lbry.com', 50001)", "time": 0.6107135420000009, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "foo + bar | baz" }, "offset": 0, "total": 1000, "blocked_total": 0 } { "server": "('spv18.lbry.com', 50001)", "time": 1.7782999579999998, "error": null, "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "foo + bar | baz" }, "offset": 0, "total": 1000, "blocked_total": 0 } Wallet server (spv17.lbry.com:50001) returned an error. Code: -32000 Message: query timed out { "server": "('spv17.lbry.com', 50001)", "time": 11.295952083, "error": "(-32000, 'query timed out')", "args": { "no_totals": false, "page_size": 100, "page": 1, "text": "foo + bar | baz" }, "offset": null, "total": null, "blocked_total": null } ```
moodyjon commented 2022-08-16 21:14:59 +02:00 (Migrated from github.com)

Perhaps the thing to be done here is spread the load around more. It looks like the SDK selects the one with the lowest latency SPVPong response. This could be misleading, as it doesn't account for elastic search latency and other things that might go into servicing hub RPCs.

Also, the hub performance could change with day of week, or time of day. I don't see a provision to react to deteriorated performance, or claim_search timeout by choosing a different hub.

Perhaps the thing to be done here is spread the load around more. It looks like the SDK selects the one with the lowest latency SPVPong response. This could be misleading, as it doesn't account for elastic search latency and other things that might go into servicing hub RPCs. Also, the hub performance could change with day of week, or time of day. I don't see a provision to react to deteriorated performance, or claim_search timeout by choosing a different hub.
moodyjon commented 2022-08-17 19:54:18 +02:00 (Migrated from github.com)
Other ideas from ES documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html https://www.elastic.co/guide/en/elasticsearch/reference/8.3/tune-for-search-speed.html#search-as-few-fields-as-possible Hard to say what effect this would have. But 6 fields are being searched currently: https://github.com/lbryio/hub/blob/34c5ab2e56c8f3234ac068ec0f40ac339a3f8708/hub/common.py#L907
moodyjon commented 2022-08-17 20:28:29 +02:00 (Migrated from github.com)

Another observation... The --query_timeout_ms (10s default) is passed into constructor AsyncElasticSearch()

35483fa0b1/hub/herald/search.py (L62)

However, there are API-level timeout params accepted for individual calls to ES:

https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/config.html#_api_and_server_timeouts

API-level timeout for search:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout

The allow_partial_search_results option (default true) means the search should never fail on hitting the (API-level) timeout, but return whatever it has available after the time-budget is exhausted:

https://www.elastic.co/guide/en/elasticsearch/reference/8.3/search-search.html#search-search-api-query-params

Another observation... The `--query_timeout_ms` (10s default) is passed into constructor AsyncElasticSearch() https://github.com/lbryio/hub/blob/35483fa0b1d3eb044659fc3c294317664dcaa431/hub/herald/search.py#L62 However, there are ***API-level*** timeout params accepted for individual calls to ES: https://www.elastic.co/guide/en/elasticsearch/client/python-api/current/config.html#_api_and_server_timeouts API-level timeout for search: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-your-data.html#search-timeout The `allow_partial_search_results` option (default true) means the search should never fail on hitting the (API-level) timeout, but return whatever it has available after the time-budget is exhausted: https://www.elastic.co/guide/en/elasticsearch/reference/8.3/search-search.html#search-search-api-query-params
moodyjon commented 2022-08-17 20:33:33 +02:00 (Migrated from github.com)

Here's the search invocation (no timeout=X):
34c5ab2e56/hub/herald/search.py (L208)

Here's the search invocation (no timeout=X): https://github.com/lbryio/hub/blob/34c5ab2e56c8f3234ac068ec0f40ac339a3f8708/hub/herald/search.py#L208
Sign in to join this conversation.
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: LBRYCommunity/hub#71
No description provided.