Improve scribe sync #109

Merged
jackrobison merged 37 commits from faster-sync into master 2022-11-10 04:06:24 +01:00
jackrobison commented 2022-10-24 21:36:23 +02:00 (Migrated from github.com)

This branch significantly improves scribe sync time and scaling by doing the following:

  • Previously, RevertableOps were verified as they were staged (checking that they don't delete something that doesn't exist or put onto an existing value unexpectedly). This meant many single get lookups against the db, potentially redundant when a key is touched multiple times. Now they are accumulated on a per-transaction basis and verified in bulk using the multi_get api, which can be improved upon further.
  • Adds the FutureEffectiveAmountPrefixRow column family to the db, which eliminates an intensive iterate scan for one of the values needed to calculate takeovers.
  • Fully switches scribe to batched lookups on EffectiveAmountPrefixRow and FutureEffectiveAmountPrefixRow for internal pending effective amount calculations, replacing looped iterate scans.

Bug fixes:

  • Fixes a number of takeover / activation related edge-case bugs that had not previously surfaced because they lacked test coverage and did not cause integrity errors. Once the effective amount indexes were added, these bugs caused the amounts to underflow (go negative), which should not be possible.
  • Fixes https://github.com/lbryio/hub/issues/97
  • Fixes https://github.com/lbryio/hub/issues/104

Deprecated:

  • Drops the --elastic_host, --elastic_port, --elastic_notifier_host, and --elastic_notifier_port arguments to herald

New features:

  • Adds failover functionality for elasticsearch and elastic sync notifiers to herald (https://github.com/lbryio/hub/issues/74) using a simplified argument --elastic_services, which is comma separated and defaults to --elastic_services=127.0.0.1:9200/127.0.0.1:19080.
  • Adds an optional --db_disable_integrity_checks flag to scribe to make initial sync much faster.
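The change from per-op verification to bulk verification can be illustrated with a minimal sketch. It is not the code from this branch: `Op`, `DictDB`, `IntegrityError`, and `verify_staged` are hypothetical names, the in-memory `DictDB` stands in for the real database, and it assumes that a delete op carries the value it expects to remove. The point is only that all keys touched in a transaction are read with one `multi_get` call, and ops are then checked in order against that snapshot.

```python
from typing import Dict, Iterable, List, NamedTuple, Optional


class Op(NamedTuple):
    """Hypothetical staged op: a put, or a delete of an expected value."""
    is_put: bool
    key: bytes
    value: bytes


class IntegrityError(Exception):
    pass


class DictDB:
    """In-memory stand-in for the database; counts bulk reads."""

    def __init__(self, data: Optional[Dict[bytes, bytes]] = None):
        self.data = dict(data or {})
        self.multi_get_calls = 0

    def multi_get(self, keys: Iterable[bytes]) -> List[Optional[bytes]]:
        self.multi_get_calls += 1
        return [self.data.get(k) for k in keys]


def verify_staged(db: DictDB, ops: List[Op]) -> None:
    """Verify all ops staged for one transaction using a single bulk read."""
    # Deduplicate keys while preserving order, so a key touched several
    # times in the transaction is only read from the db once.
    keys = list(dict.fromkeys(op.key for op in ops))
    current = dict(zip(keys, db.multi_get(keys)))

    # Replay the ops in order against the snapshot, tracking the value
    # each key would have after the preceding ops in the same transaction.
    for op in ops:
        existing = current[op.key]
        if op.is_put:
            if existing is not None:
                raise IntegrityError(f"put onto existing value: {op.key!r}")
            current[op.key] = op.value
        else:
            if existing is None:
                raise IntegrityError(f"delete of missing key: {op.key!r}")
            if existing != op.value:
                raise IntegrityError(f"delete of unexpected value: {op.key!r}")
            current[op.key] = None
```

In this sketch a transaction that deletes a key and then puts a new value onto it passes verification with one read of that key, whereas checking each op as it is staged would have read it twice.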
Reference: LBRYCommunity/hub#109