Claimtrie build taking too much RAM and causing a crash #104
Reference: LBRYCommunity/lbcd#104
I'm trying to run a full lbcd node. I synced up to height ~1 000 000, but now when I try to run lbcd again I can't get past height ~700 000 in the initial rebuild of the claimtrie: it eats all my 16GB of RAM + 16GB of swap and crashes.
The RAM usage goes up very rapidly from around height 600 000.
Considering the blockchain is 1 400 000 blocks high at this moment, how much RAM is needed to run a full lbcd node?
The readme says 8GB is needed, but it's already taking 32GB at height ~700 000, and I'm guessing the remaining blocks contain way more claims than the first ones.
I know the readme says that RAM usage may increase over time, but if this is the expected usage I think the baseline should be updated to reflect the current state of things more accurately.
It took me ~50 minutes to sync blocks 0 to 739,000, using 1.4GB of memory.
I remember that even at 1.3 million blocks back in January, the operational memory required was approximately 7GB.
Chances are your database is corrupted and the sync went rogue.
Remove ~/.lbcd, re-sync, and see if that changes anything.
You might also try tweaking the environment variables `GOGC` and `GOMEMLIMIT` to get behavior that is friendlier to your 16GB memory + 16GB swap environment. The default settings are `GOGC=100` and `GOMEMLIMIT=math.MaxInt64`. For example, `GOGC=100` means that starting from 1GiB of live memory, the program is allowed to allocate 100% more memory (doubling the heap to 2GiB) before the garbage collector scans the heap and releases unused memory. So out of your 32GB, half or more might be garbage.
The claimtrie build process makes lots of temporary allocations which become garbage immediately. The live memory needed to store the claimtrie is around 7GB, but you might observe the `lbcd` process using up to 14GB at a given moment (from the OS perspective).
See:
https://go.dev/doc/gc-guide#GOGC
https://go.dev/doc/gc-guide#Memory_limit
More on `GOMEMLIMIT`: https://pkg.go.dev/runtime/debug#SetMemoryLimit
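The doubling described above can be written out as arithmetic. The sketch below is a simplified form of the heap-target formula from the Go GC guide: it ignores GC roots, and `heapTarget` is a hypothetical helper name used only for illustration, not an lbcd or Go runtime function. It shows roughly how far the heap may grow before the next collection for a given live heap size and `GOGC` value.

```go
package main

import (
	"fmt"
	"math"
)

// heapTarget approximates the heap size at which the next GC cycle starts:
// live heap plus GOGC percent of the live heap.
// Assumption: GC roots (stacks, globals) are ignored for simplicity.
func heapTarget(liveBytes uint64, gogc int) uint64 {
	if gogc < 0 {
		// A negative value corresponds to GOGC=off: no GOGC-based target.
		return math.MaxUint64
	}
	return liveBytes + liveBytes*uint64(gogc)/100
}

func main() {
	const gib = 1 << 30
	live := uint64(7 * gib) // roughly the live claimtrie size quoted in the thread

	for _, gogc := range []int{100, 50} {
		target := heapTarget(live, gogc)
		fmt.Printf("GOGC=%d: %.1f GiB\n", gogc, float64(target)/gib)
	}
	// Prints:
	// GOGC=100: 14.0 GiB
	// GOGC=50: 10.5 GiB
}
```

With about 7GiB live, the default setting gives the 14GiB peak mentioned above, while `GOGC=50` would lower the target to about 10.5GiB at the cost of more frequent collections.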
Running `lbcd` with a command like `env GOGC=50 GOMEMLIMIT=16GiB ./lbcd ...` would make the garbage collector more aggressive, especially above 16GiB heap usage. The cost of this is CPU time spent scanning the heap more often. The CPU time is usually not a big deal as long as there is 1 extra CPU core idle/available.
Thank you.
I've tried running it with `env GOGC=50 GOMEMLIMIT=16GiB ./lbcd` but I get the same result.
I've tried deleting the .lbcd folder and am now re-syncing the whole blockchain with the latest release of lbcd. When I'm close to height 1 000 000, like I was before, I'll try building the claimtrie again and report my results.
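When the settings seem to make no difference, one thing that can be checked is whether they actually reached the Go runtime. The sketch below is a standalone illustration, not lbcd code, and `effectiveGCSettings` is a hypothetical helper name. It reads the current values back using `runtime/debug`: `SetGCPercent` returns the previous percentage, and `SetMemoryLimit` with a negative argument returns the current limit without changing it.

```go
package main

import (
	"fmt"
	"math"
	"runtime/debug"
)

// effectiveGCSettings reads back the GC percentage and memory limit that the
// runtime is currently using (from GOGC / GOMEMLIMIT or earlier Set* calls).
func effectiveGCSettings() (gcPercent int, memLimit int64) {
	// SetGCPercent returns the previous setting, so set a placeholder value
	// and immediately restore what was there before.
	gcPercent = debug.SetGCPercent(100)
	debug.SetGCPercent(gcPercent)

	// A negative input does not adjust the limit; it only reports it.
	memLimit = debug.SetMemoryLimit(-1)
	return gcPercent, memLimit
}

func main() {
	gcPercent, memLimit := effectiveGCSettings()
	fmt.Printf("GC percent: %d\n", gcPercent)
	if memLimit == math.MaxInt64 {
		fmt.Println("memory limit: none (default)")
	} else {
		fmt.Printf("memory limit: %d bytes\n", memLimit)
	}
}
```

Running this with `env GOGC=50 GOMEMLIMIT=16GiB go run main.go` should report the values from the environment; if the defaults appear instead, the variables were not passed to the process.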
FYI: an M2 Max MacBook Pro took about 20 hours to sync from height 0 to 1,388,729 (2023-07-09 11:25:55 -0700 PDT).
Just to be clear, my problem happened not when syncing the blockchain but at the "building the full claimtrie in RAM" point when restarting the node after it had been synced.
But it looks like you were right. My database must have been corrupted. I deleted the whole .lbcd folder, started with a fresh install of the latest version, and have now synced to the height I was at before.
When I restart lbcd now, the claimtrie build only takes about 7GB of RAM, like you told me.
Thanks everyone for your help.
Hey. Just to say that the same bug is happening again. When I build the claimtrie it takes all my 16GB of RAM + 16GB of swap. I'll delete the database again and resync, since that solved it last time, but it would be good to look into what's causing the issue.
I confirm the issue also exists on my Linux server with 16GB RAM.
This is the pprof map: