Concurrent RPC requests against the ClaimTrie blocked critical section lock #126
Summary
While syncing the ClaimTrie in LbryCRD with Chainquery, a bottleneck was identified: a mutex lock on the critical section blocks concurrent RPC connections from accessing the claimtrie through the same RPC call. The example is the getclaimsforname call. The mutex lock is taken in several places:
Affected Code
getclaimsintrie
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L36
getclaimtrie
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L108
getvalueforname
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L182
getclaimsforname
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L308
getclaimbyid
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L373
gettotalclaimednames
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L414
gettotalclaims
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L434
gettotalvalueofclaims
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L456
getclaimsfortx
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L495
getnameproof
https://github.com/lbryio/lbrycrd/blob/master/src/rpc/claimtrie.cpp#L720
Proposal
These locks take a critical section (cs_main) that is used in many areas of bitcoin. However, since these calls only deal with the claimtrie, they should not need the critical section lock. We should instead implement a shared mutex that allows for shared (read-only) and exclusive (read-write) access.
Processing a block should require exclusive access to the claimtrie, since it can add, delete, and update claimtrie nodes. The RPC calls only read the trie and never modify it, so they should take shared access, allowing concurrent reads of the claimtrie.
It appears this lock was simply copied from other bitcoin RPC calls.
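To make the intent concrete, here is a minimal sketch of that shared/exclusive split using a plain std::shared_mutex; cs_claimtrie and the functions below are illustrative names for this issue only, not existing lbrycrd symbols:

```cpp
#include <shared_mutex>
#include <string>
#include <vector>

// Illustrative stand-ins: a dedicated reader/writer lock for the claimtrie
// and a trivial container playing the role of the trie itself.
static std::shared_mutex cs_claimtrie;
static std::vector<std::string> claimtrieNodes;

// RPC-style readers take the shared side, so many of them can run concurrently.
std::vector<std::string> GetClaimsForNameExample()
{
    std::shared_lock<std::shared_mutex> readLock(cs_claimtrie);
    return claimtrieNodes; // read-only access to the trie
}

// Block processing mutates the trie, so it takes the exclusive side and
// blocks readers only while changes are actually being applied.
void ConnectBlockExample(const std::string& newNode)
{
    std::unique_lock<std::shared_mutex> writeLock(cs_claimtrie);
    claimtrieNodes.push_back(newNode);
}
```

With this split, concurrent getclaimsforname calls would only serialize behind block processing rather than behind each other.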
Use Case
The use case is processing claim-specific information: you can get the claims in the trie with getclaimsintrie, but subsequent calls to getclaimsforname are required to get certain fields like effective_amount or valid_at_height.
Reproduction 1 - Chainquery
chainquery.zip (linux)
run:
./chainquery serve
branch: https://github.com/lbryio/chainquery/tree/lbrycrdissue
code: https://github.com/lbryio/chainquery/blob/lbrycrdissue/daemon/jobs/claimtriesync.go#L21
This outputs the time per call while iterating through the claim names and getting the claims for each name from lbrycrd.
Reproduction 2 - Apache Benchmark
This is a great ticket. Thank you @tiger5226 !
I'm planning to work on this in the upcoming week.
Excellent! Let me know if you need any assistance with the reproduction! Happy to help!
@shyba also had a reproduction that did not make it into the issue; I wanted to get it from him for you in case you would like to use an Apache Benchmark run for the reproduction:
In looking at this, I see three approaches (in order of difficulty):
I'm planning to go with option 1. This will add overhead to each of those lock calls; it assumes that the majority of our calls are reader operations and that each lock protects hundreds to thousands of CPU instructions. Option 2 would require some restructuring, and I'm not sure it would perform any better than option 1. Option 3 would require more effort than I can commit to at this time.
As sub-options of option 1:
Again, I'm planning the first sub-option, given the code precedent and my time constraints. Let me know if you have thoughts to the contrary or other ideas. Thanks.
We probably want @lyoshenka and @kaykurokawa to chime in here, but I would think we want a separate lock for the claimtrie (option 2). I am not sure of the implications of option 1, as that lock is also used for the blockchain (blocks, transactions, etc.) RPC calls.
Other reasons I was leaning towards modifying cs_main: there are many places in main.cpp that could benefit from the reader/writer approach; it appears that most operations there are merely reading the present state, and very few places actually modify it. Hence, we would benefit at that level in addition to the RPC level. When the miner is running it does request the exclusive lock much more frequently, but that situation would be no worse than it is now. Second, many of the RPC methods use both pcoinsTip and pclaimTrie, while main.cpp uses a combination of pcoinsTip and chainActive. I'm obviously new to this codebase and don't have a firm grip on those relationships yet, but I was hesitant to add an additional lock to main.cpp: nested locks always increase the deadlock risk.
Excellent points. It sounds more risky to add a lock than to extend the current lock's features. I am not familiar enough with main.cpp to make a call. @lyoshenka and @kaykurokawa will respond soon. I'll assign it to them so it shows up in their queues.
@kaykurokawa @lbrynaut @shyba can you guys please take a look at this and give Brannon some guidance? thanks :-)
I'm going to rule out 1) modifying the cs_main lock to be shareable (reader/writer).
We don't want to make big changes to any components that we inherited from Bitcoin Core unless it's absolutely necessary. This is because a) Bitcoin Core has far more eyes on it than we do, and b) staying close to upstream makes it easier for us to pull in changes from Bitcoin Core.
I think 2) is a much better option. I'll have to think a bit about how best to approach 2), but I just wanted to chime in with this thought for now.
I discovered that approach 1) is more difficult than I first thought: the recursive lock mechanism doesn't work out of the box for shared locks. I've come to despise recursive locks (even though I've used them for years in C#). I've seen some recursive shared locks on the web; I think I'd have to massage one of those before I trusted it. The other angle would be to map out all the LOCK(cs_main) calls and eliminate the recursive usage; this is fairly invasive and should have been done upstream years ago.
As for approach 2), I started on that here: https://github.com/BrannonKing/lbrycrd/tree/shared_locks_claimtrie. It still has several deadlocks in it. It struggles with the lack of a recursive shared lock as well; I think I can map out the call tree for that, though, as there are far fewer calls. I've been pondering using a different type of critical section protection for that as well.
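For reference, a minimal sketch of the kind of recursive shared lock being discussed, written as an assumption for illustration rather than the code in either branch. It only makes the shared (reader) side re-entrant, and a thread that already holds the shared lock and then requests the exclusive lock will still deadlock, which is exactly the hazard described above:

```cpp
#include <shared_mutex>

// Illustration only: a shared mutex whose shared side tolerates recursive
// acquisition from the same thread. Because the depth counter is a per-thread
// static rather than per-instance state, this assumes a single global
// instance (as cs_main effectively is).
class RecursiveSharedMutex
{
public:
    void lock_shared()
    {
        if (readDepth == 0)
            mutex.lock_shared();   // first shared acquisition by this thread
        ++readDepth;               // nested acquisitions only bump the counter
    }
    void unlock_shared()
    {
        if (--readDepth == 0)
            mutex.unlock_shared(); // last release actually unlocks
    }
    // The exclusive side is left non-recursive; calling lock() while this
    // thread already holds the shared lock will deadlock.
    void lock()   { mutex.lock(); }
    void unlock() { mutex.unlock(); }

private:
    std::shared_mutex mutex;
    static thread_local int readDepth;
};

thread_local int RecursiveSharedMutex::readDepth = 0;
```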
@BrannonKing I didn't review in detail, but this approach looks like a dead-end. Boost upgrade/shared locks are the way to go, but I strongly agree with @kaykurokawa that there are many subtleties to playing with the cs_main lock, and it's not a quick fix to replace/modify the usage of it. Adding more locks in the meantime is not beneficial.
@kaykurokawa ruled out 1 (upgrade/shared locks), and it appears you are suggesting that approach while agreeing it is a big lift. So is the suggestion that this is not worth solving? Just want to make sure @BrannonKing has the guidance he needs.
@tiger5226 Compared to the implementation attempt of 2), I was commenting that upgrade locks are preferable to additional locks/critical sections.
I'm new to this specific issue, so I'll throw out some thoughts that may have been discussed before. I believe the existing subtleties of the locking are so nuanced, there's a very good reason they haven't been changed upstream -- it's not laziness. Personally, I don't have my head wrapped around this anymore, though I tried in the past and realized it was really nasty (admittedly years ago so it may have improved ... but probably not). When one has a full understanding of the implications of the locking, it's near trivial to decide on what solution makes the most sense. Perhaps it's even what's there. But I would estimate that's a difficult understanding to achieve in the first place (not impossible, just a trade-off of resources).
Since most claimtrie accesses are made under existing locks and avoid races, it seems to me that adding additional locks won't serve much purpose. If the goal instead is to remove/replace existing locks, see the above thoughts. So maybe I just don't understand what the issue is, or what is being attempted?
But if I do, my vote is to wait until upstream resolves this a bit more and instead explore alternative approaches in the meantime (like load-balancing across multiple daemons, caching, or whatever makes sense for what's really needed).
All clear! Thanks a ton for the added thoughts, very helpful. I was able to work around this bottleneck rather easily by making a fraction of the calls. I am not sure anyone else is suffering from this issue, hence I don't believe there is any urgency.
@lyoshenka This makes sense and we probably don't want to take on the risk. Thoughts?
I like the suggestion for load balancing daemons. I think a Docker Swarm or Kubernetes approach would work very well. Also, for the situation where you need to make multiple RPC calls for a single piece of data -- it makes a lot of sense to make a new RPC call that returns all the data that you need in a single pull. I think those are both great workarounds.
I did want to understand how recursive the cs_main mutex really was. I modified it to not be recursive and added the recursion checks at the individual locations. You can see my work here: https://github.com/BrannonKing/lbrycrd/tree/testNonRecursiveCsMain . I didn't finish the exercise; the daemon does not run, but it's close. Most of the tests run. The conclusion: almost all lock calls are recursive. As an aside, there is a unit test failure that seems legitimate; I left a comment in checkqueue.h about that. Also, see my fix in claimtrie_tests.cpp.
Through that exercise I discovered that most of the recursive lock calls come from the unit test scenarios rather than the actual code. At the same time, this library has no definitive public API. If any method is an entry point, who is responsible for the necessary preconditioning of each method?
I still intend to run an exercise using a shared recursive lock.
I did make a branch with a recursive shared mutex for cs_main: https://github.com/BrannonKing/lbrycrd/tree/use_recursive_shared_lock. It compiles and runs successfully using some code I acquired and extended from CodeReview. I've been trying to benchmark it using the Apache Benchmark (ab) call shown above. I'm not sure how much data I would have to put in to make the trie lookup take any significant time. Can I just run the CLI generate command for that? I've added 10k, but the lookup still takes hardly any time. Or does somebody with a lot of data want to run my branch?
I'm not sure we should merge this, but I believe @tiger5226 should at least try this branch (https://github.com/BrannonKing/lbrycrd/tree/use_recursive_shared_lock) out and let us know if he sees any improvements and whether it works properly for his chainquery application.
@BrannonKing I would not mind testing it out with the chainquery branch. However, there is another issue blocking me from building lbrycrd on my Mac Pro. If you can get me the binary, I can test it out.
@kaykurokawa maybe it's time to sort out that mac build issue?
In my view, the code written by @BrannonKing looks good; however, a recursive shared lock is a mess by design. I would protect only the shared resources, i.e. like this:
@bvbfan, if you look through the code, particularly in the main header, you'll see that many of the mutexes protect multiple data structures. That's quite unfortunate -- it makes it impossible to do as you suggest and make the protection explicit in code. I would prefer to take things a step farther: use concurrent or immutable data structures. In other words, fully encapsulate the mutexes into the structures that rely on them. However, it seems that the current goal is to keep the codebase close to the original bitcoin fork so that newer bitcoin codebase fixes can be merged in. (Hence, we're free to use better structures in the LBRY portion of the code, and should work to keep that as untangled from the bitcoin codebase as possible.)
The problem with readers/writers is that you can't be sure whether a given thread is a reader or a writer. Take CCoinsViewCache::AccessCoins: through CCoinsViewCache::FetchCoins it can potentially write to cacheCoins, and that can happen while another thread is reading it -> a data race, right?
The right approach would be to access the structure through a mutex and to access elements by copying.
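A minimal sketch of that pattern, using hypothetical names rather than lbrycrd types: the mutex is encapsulated in the structure, and readers receive a copy made while the lock is held, so a later write cannot race with their use of the value.

```cpp
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical types for illustration; not lbrycrd code.
struct Entry {
    long amount = 0;
    int validAtHeight = 0;
};

class GuardedCache
{
public:
    // Readers never receive a reference into the map; they get a copy made
    // while the mutex is held, so later writes cannot race with its use.
    std::optional<Entry> Get(const std::string& name) const
    {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = cache_.find(name);
        if (it == cache_.end())
            return std::nullopt;
        return it->second; // copied out under the lock
    }

    void Put(const std::string& name, const Entry& e)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        cache_[name] = e;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, Entry> cache_;
};
```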
Waiting on #174
While it's true that using a custom read/write recursive mutex was workable, it was a fairly large enhancement to existing code, and it wasn't clear that this was a maintainable path going forward. With much of the data now easily accessible via SQLite, this seems less valuable. It could be revisited if and when upstream no longer uses a recursive mutex for cs_main.