Reduce claimtrie temp allocs. #51
Unbundled from PR https://github.com/lbryio/lbcd/pull/43
See that PR for usage/testing.
Based on work by @BrannonKing in mem_pressure_try_2 and reduce_memory_pressure.
The commits 8482890759 and 787098e2a2 are newly added to claw back costs from the node cache and node.Clone().
Pull Request Test Coverage Report for Build 2450737946
💛 - Coveralls
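For illustration only (the real changes live in the commits above): the general pattern for clawing back temporary allocations in a Clone-style method is to reuse the destination's existing backing arrays instead of allocating fresh slices on every call. A minimal sketch, using hypothetical types rather than lbcd's actual claimtrie code:

```go
package claimtrie

// Hypothetical types for illustration; lbcd's real claimtrie node,
// claim, and support structs differ.
type claim struct{ amount int64 }
type support struct{ amount int64 }

type node struct {
	claims   []claim
	supports []support
}

// cloneInto copies src into dst, reusing dst's slice capacity when it
// is large enough, so repeated clones stop generating short-lived
// garbage for the GC to scan.
func (src *node) cloneInto(dst *node) {
	dst.claims = append(dst.claims[:0], src.claims...)
	dst.supports = append(dst.supports[:0], src.supports...)
}
```

The same idea applies to a node cache that hands out entries: resetting and reusing an existing entry is cheaper than allocating a new one per lookup.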
The tuning doesn't work out; both sync time and startup time got worse.
Sync time
Before (pr46):
After:
(Re)startup time
Before (pr46):
After:
Wow. 10 minutes for claimtrie build. :-( Something's not right... That's so bad that it makes me suspect there was some extra load on the machine at some point.
Can you rerun it with profile collection? Just the restart case at first, to see whether the 10 min is reproducible.
./lbcd --memprofile=./after.mem --cpuprofile=./after.cpu
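These flags follow the usual runtime/pprof collection pattern; a rough, generic sketch of what producing the two files involves (not lbcd's actual flag handling, and collectProfiles is a made-up helper):

```go
package main

import (
	"os"
	"runtime/pprof"
)

// collectProfiles writes a CPU profile covering run() and a heap
// profile taken afterwards, mirroring what --cpuprofile/--memprofile
// style flags typically do.
func collectProfiles(cpuPath, memPath string, run func()) error {
	cpuFile, err := os.Create(cpuPath)
	if err != nil {
		return err
	}
	defer cpuFile.Close()

	if err := pprof.StartCPUProfile(cpuFile); err != nil {
		return err
	}
	run()
	pprof.StopCPUProfile()

	memFile, err := os.Create(memPath)
	if err != nil {
		return err
	}
	defer memFile.Close()
	return pprof.WriteHeapProfile(memFile)
}

func main() {
	_ = collectProfiles("after.cpu", "after.mem", func() {
		// the workload being measured, e.g. the claimtrie build
	})
}
```

The resulting files can then be inspected with go tool pprof ./lbcd after.cpu.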
When I run it on an 8-core arm64 Mac mini, it continues to be about 3 min (before) and 2.2 min (after) for the claimtrie build.
Before:
After:
Hm... a re-run does perform closer to what it was, but no significant changes.
Attach the profiled data.
after.zip
Does it show improvement in the sync time in your environment?
Yes, it's showing a claimtrie build time improvement (2 min after, 3 min before), but not an overall sync time improvement AFAICT. It takes me many days to sync from a remote node.
Here is my run:
after2.zip
I tried to get a comparable run with a 317 s wall-clock runtime. The CPU profiles are very different. Yours shows a lot more CPU time consumed in GC and in makeNameHashNext func1 and func2. The descendants of sstable Reader.readBlock() are very different. Some things, like Sha256, do not even show up in mine under the default pprof options.
My hardware specs (16GB RAM 2TB SSD variant):
https://en.wikipedia.org/wiki/Mac_Mini#Technical_specifications_5
My guess is you've reached some hardware limitation (RAM throughput/latency?), so demonstrating an improvement becomes impossible, just as I can't see any improvement in overall sync time because fetching blocks from a remote node is the limiting factor.
I assume you are syncing blocks over LAN. What is your test hardware? SSD? CPUs? RAM type? What about isolation from other load? Is this a virtual machine?
Regarding the sha256 performance discrepancy:
https://github.com/golang/go/tree/master/src/crypto/sha256
The performance you get depends on the availability of special CPU instructions.
go tool pprof --svg --show_from sha256 XXX.cpu
Mine (arm64):
Yours (???):
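One quick cross-check (a sketch assuming golang.org/x/sys/cpu is available; not lbcd code) is to print whether the runtime sees the SHA-2 hardware instructions that crypto/sha256 can use on each machine:

```go
package main

import (
	"fmt"
	"runtime"

	"golang.org/x/sys/cpu"
)

func main() {
	fmt.Println("GOARCH:", runtime.GOARCH)
	switch runtime.GOARCH {
	case "amd64":
		// SHA-NI and AVX2 gate the fast assembly paths on x86-64.
		fmt.Println("SHA extensions:", cpu.X86.HasSHA)
		fmt.Println("AVX2:          ", cpu.X86.HasAVX2)
	case "arm64":
		// The arm64 assembly path uses the SHA2 instructions.
		fmt.Println("SHA2:", cpu.ARM64.HasSHA2)
	}
}
```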
Focusing on GC scanobject:
go tool pprof --svg --show_from scanobject XXX.cpu
Mine (arm64):
Yours (???):
I don't see anything obviously architecture-specific, but yours is registering a lot more time spent. The mem allocs report is ~28 GB for yours and ~26 GB for mine, so GC should be triggering about the same number of times.
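Another way to sanity-check the GC side, independent of pprof, is to run with GODEBUG=gctrace=1 or to dump the runtime's own counters at the end of the claimtrie build; a minimal sketch (not wired into lbcd):

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	// TotalAlloc is cumulative bytes allocated; NumGC is completed GC
	// cycles. Two runs allocating ~26-28 GB with similar heap goals
	// should show a similar NumGC, even if scanobject time differs.
	fmt.Printf("TotalAlloc: %.1f GB\n", float64(ms.TotalAlloc)/1e9)
	fmt.Printf("NumGC:      %d\n", ms.NumGC)
	fmt.Printf("GC pauses:  %v\n", time.Duration(ms.PauseTotalNs))
}
```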
Thanks for your contributions @moodyjon!
Can you send me an email at jeremy@lbry.com? You're welcome to stay anonymous if that's your preference.
Demoting to draft status, as this is a sensitive code area. There is an outstanding bug being investigated at this time:
https://github.com/lbryio/lbcd/issues/71