scribe "error while processing txs: int too large to convert" at mainnet block 1176266 #124
Using the latest published docker image for lbry/hub:master (lbry/hub@sha256:7ad1a2f570f061c9a8e2db65a9fe18f978c12df913f66ff93b3a49d87e674f6e), the mainnet sync stops with this backtrace:

Please let me know what other info I can provide (I'm reasonably experienced with Python and docker; I could set up a local build to dig into it if needed).
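For background, "int too large to convert" is the message CPython's OverflowError carries when an integer is serialized into a fixed number of bytes it does not fit. A minimal sketch of that failure mode, independent of the hub (the hub's own packing of amounts presumably goes through a fixed-width field, but that is an assumption here):

```python
# int.to_bytes() raises OverflowError("int too large to convert") when the
# value does not fit the requested width; packing an out-of-range amount
# into an 8-byte field would surface the same message.
value = 2**64  # one past the largest unsigned 64-bit integer

try:
    value.to_bytes(8, "big")
except OverflowError as e:
    print(e)  # int too large to convert
```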
Is this project abandoned?
Any progress is on a volunteer, unpaid basis at this point, so it's much slower.
@jackrobison
I can't locate 7ad1a2f570f... in the history of the hub repo. Also, are you starting from a snapshot obtained from somewhere, or starting from zero blocks? It might be a flaw in the snapshot OR the code.
I suppose it's the other one then? c4ebce1d42cc (I'm not well versed in repository internals; I presumed the repo digest was the one I should provide).

Is it worth setting up a local build and adding some prints / logging in there to see what exactly is too large?
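For what it's worth, the two identifiers are easy to list side by side; a sketch using the docker Python SDK (assuming the docker package is installed and the image has already been pulled; `docker image inspect lbry/hub:master` shows the same fields):

```python
# Print the local image ID (what `docker images` abbreviates, e.g. c4ebce1d42cc)
# and the registry repo digest (the lbry/hub@sha256:... form) for one tag.
import docker

client = docker.from_env()
image = client.images.get("lbry/hub:master")

print("image id:    ", image.id)                        # sha256:<local image ID>
print("repo digests:", image.attrs.get("RepoDigests"))  # ["lbry/hub@sha256:..."]
```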
Yes, you could gather logging about the claim hash involved and the other values. But be aware that debugging these things requires tracking the history of updates on the claim, perhaps going back 1000s of blocks. The first step is to identify the claim. Then add logging focusing on the history of that claim. The LBRY block explorer can also help you understand the history of the claim and locate the range of block numbers to focus on.
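As a sketch of that kind of targeted logging (the function and variable names here are hypothetical; it would have to be wired into wherever future_effective_amount is computed in hub/scribe/service.py, and the internal claim hash may be byte-reversed relative to the explorer's claim id):

```python
# Hypothetical helper: log every future_effective_amount update that touches
# one suspect claim, so its history can be followed across blocks.
import logging

log = logging.getLogger("scribe.debug")

SUSPECT_CLAIM_ID = "12c3847a9155b524cd9cc199eb406e79e90b761a"

def trace_future_effective_amount(claim_hash: bytes, current: int, delta: int, height: int) -> None:
    # Compare both byte orders, since explorers display claim ids reversed
    # relative to how some of the internals store the hash.
    if SUSPECT_CLAIM_ID in (claim_hash.hex(), claim_hash[::-1].hex()):
        log.warning("height=%d claim=%s current=%d delta=%d new=%d",
                    height, SUSPECT_CLAIM_ID, current, delta, current + delta)
```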
Another route would be to try to reproduce the problem starting from block 0 without any snapshot. This would be an attempt to eliminate the possibility that the snapshot is incompatible with the current code. This will take 2-3 weeks of runtime.
I had it print everything it's trying to put into future_effective_amount, and I have a strong suspicion that this is the culprit (the rest are all positive numbers):

I don't know enough about the system to figure out whether https://explorer.lbry.com/claims/12c3847a9155b524cd9cc199eb406e79e90b761a has anything wrong about it (the rest of the publisher's stuff looked odd at first, but never mind; I was confused by the explorer's presentation, they have readable titles and all).

As an update, that negative number is the result of a pre-existing amount and a negative delta that's larger than that amount. I took a snapshot (hopefully; I did a cp -al on both the es and rocksdb directories because I don't have space to waste on that machine) and forced the value to be positive in https://github.com/lbryio/hub/blob/master/hub/scribe/service.py#L1589 with max(<expression>, 0).
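In other words, the local patch clamps the computed value at zero before it gets written, roughly like this (a sketch of the workaround as described above, not the code as it appears upstream; it hides the underflow rather than fixing its cause):

```python
# Crude clamp: if accumulated deltas would drive the future effective amount
# negative, store zero instead so the value can still be packed as unsigned.
def clamped_future_effective_amount(current_amount: int, delta: int) -> int:
    return max(current_amount + delta, 0)

assert clamped_future_effective_amount(1_000_000, -1_500_000) == 0
```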
This hub is for personal use and experimentation; I'll restart it without a snapshot after I manage to make it work like this and get lbry-desktop to talk to it. I'll report back if it does the same when started from scratch.
Claim created at height 1174719 with a .01 amount attached:
https://explorer.lbry.com/tx/a99e454a0ff41a62e89ece450d0aed4fd78d368a066423c1e4ba04e0039986d8#output-0
The original .01 amount spent at height 1174802:
https://explorer.lbry.com/tx/bbabe06c83647746781b72a31de181dbba4d1c246f74245bf93bb71c5de6b5a0#input-127637079
I'm not sure what's special about height 1176266. But there could be support amounts being added/removed pertaining to the claim 12c3847a9155b524cd9cc199eb406e79e90b761a which lie in the range 1174719..1176266. It's just not obvious in the block explorer.
The channel @SkyDomeAtlantis has a lot of videos named according to a scheme "trim.UUID". So the LBRY URL on these will be something like: https://odysee.com/@SkyDomeAtlantis:4/trim.AB556F9F-D56F-41F7-9DE0-8C05839AC3CF:8. The videos have normal looking titles/descriptions but the claim name involving a UUID is a little odd. One can choose a meaningful claim name to appear in the URL (like the title), but in this case it's machine-generated.
The channel seems to frequently be shifting amounts between claims, and possibly revising/re-uploading their videos. Consider the name "trim.AB556F9F-D56F-41F7-9DE0-8C05839AC3CF:8". It looks like it starts with a UUID, yet the suffix ":8" means that the base name "trim.AB556F9F-D56F-41F7-9DE0-8C05839AC3CF" has been seen 7 times before. So maybe this one was the 8th revision based off the same file with the name "trim.AB556F9F-D56F-41F7-9DE0-8C05839AC3CF".
It's similar to scenarios where underflow problems were seen before -- frequent updates to claim amounts with multiple claims under the same base name. There's a feature of the LBRY blockchain called Activation Delay that complicates calculation of effective amount and future effective amount. The activation delay comes into play precisely in these competing claim scenarios. It's just that the channel owner seems to be competing with themselves for control over the name.
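To make the underflow concrete, here is a purely hypothetical sequence in dewies (1 LBC = 10^8 dewies, the integer unit amounts are stored in); the actual deltas applied at block 1176266 are not known from this thread:

```python
DEWIES_PER_LBC = 100_000_000

# Claim created at height 1174719 with 0.01 LBC attached.
stored_future_effective_amount = 1_000_000  # 0.01 LBC in dewies

# Hypothetical: because of the activation delay, the spend of the original
# 0.01 LBC and the removal of a not-yet-activated support both arrive as
# negative deltas against the same stored value.
deltas = [-1_000_000, -500_000]

new_value = stored_future_effective_amount + sum(deltas)
print(new_value)  # -500000: negative, so it cannot be packed as an unsigned amount
```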
@jackrobison
As an update: after it finished syncing with my crude fix, it kept following the chain for a while, then scribe crashed on another block. I started scribe from scratch (with the original code instead of my modifications), and it seems to be up to date now, so that snapshot was broken somehow.
Good to know... Do you know which snapshot you were starting from? The latest ones appear to be in subdirs named block_NNNNNNN. They were uploaded in Nov 2022, Jan 2023, and Feb 2023: https://snapshots.lbry.com/hub/
Based on the fact that the crash was observed at block 1176266, you must have been using an earlier one.