made two scripts to bring in many claims to regtest #198

Closed
BrannonKing wants to merge 294 commits from add_claim_import_scripts into master
BrannonKing commented 2018-08-27 23:48:32 +02:00 (Migrated from github.com)

Usage:

Scenario 1:

  1. With mainnet server running: ./lbrycrd-cli getclaimsintrie > intrie.txt
  2. With regtest server running: ../contrib/devtools/import_claims_from_claimsintrie_output.py ./lbrycrd/src/lbrycrd-cli < intrie.txt

Scenario 2:

  1. With regtest server running: ./import_claims_from_name_per_line.py ../../src/lbrycrd-cli < /usr/share/dict/american-english
lbrynaut (Migrated from github.com) reviewed 2018-08-27 23:48:32 +02:00
BrannonKing commented 2018-08-27 23:54:39 +02:00 (Migrated from github.com)

The proposed approach is not a quick copy. I spent quite a bit of time tracking through the performance issues associated with this but could see no obvious wins. See:

![screenshot from 2018-08-27 13-58-46](https://user-images.githubusercontent.com/1509322/44688351-3aa72a80-aa11-11e8-903b-ab8443b7a99b.png)

bvbfan commented 2018-08-28 07:01:36 +02:00 (Migrated from github.com)

To me, it looks like the slowdown functions are (based on your screenshot):
CCryptoKeyStore::HaveKey
isMine
CWallet::AvailableCoins
CWallet::CreateTransaction

bvbfan commented 2018-09-21 14:43:10 +02:00 (Migrated from github.com)

The problem is that in CWallet::AvailableCoins we can have potentially O(N*M) calls to IsMine, which calls into CBasicKeyStore::HaveKey, where we hit a recursive mutex. That slows look-ups down badly; furthermore, a recursive mutex is even slower than a normal one. We also call HaveKey in CWallet::GetKeyFromPool -> CWallet::ReserveKeyFromKeyPool. Since we don't already own the mutex, it is acquired and released on every call. One possible solution is to take cs_KeyStore earlier, before the first loop in AvailableCoins, with some kind of LOCK3 (which does not exist). With C++11, atomics would be a great improvement for plain variables, but still not for containers. We could use boost::shared_mutex for a multiple-readers / single-writer pattern.

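A minimal sketch of the multiple-readers / single-writer idea with boost::shared_mutex, assuming a simplified stand-in key store; the class, members, and HaveKey signature below are illustrative placeholders, not the actual CBasicKeyStore code.

```cpp
// Sketch only: a stand-in key store guarded by a shared (readers/writer) mutex.
// The real CBasicKeyStore uses a recursive mutex (cs_KeyStore); this illustrates
// the proposed alternative, not the existing lbrycrd implementation.
#include <boost/thread/shared_mutex.hpp>
#include <boost/thread/locks.hpp>
#include <set>
#include <string>

class SketchKeyStore
{
    mutable boost::shared_mutex m_mutex;  // many concurrent readers, one writer
    std::set<std::string> m_keys;         // placeholder for the real key map

public:
    bool HaveKey(const std::string& id) const
    {
        boost::shared_lock<boost::shared_mutex> lock(m_mutex);  // shared: readers don't block each other
        return m_keys.count(id) > 0;
    }

    void AddKey(const std::string& id)
    {
        boost::unique_lock<boost::shared_mutex> lock(m_mutex);  // exclusive: a writer blocks everyone
        m_keys.insert(id);
    }
};
```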
BrannonKing commented 2018-09-21 16:37:30 +02:00 (Migrated from github.com)

The LOCK2 is just two LOCK calls; it's okay to lock a third right after. If we call LOCK on a recursive mutex that is already owned, is that faster?

What can we do to reduce the HaveKey time? Can we cache the results (per LOCK of cs_main)? Is it getting called multiple times with the same input?

bvbfan commented 2018-09-21 18:49:27 +02:00 (Migrated from github.com)

> The LOCK2 is just two LOCK calls;

But it shouldn't be; the idea behind it is to lock more than one mutex atomically, to avoid deadlock:
https://en.cppreference.com/w/cpp/thread/lock

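For reference, a small sketch of what acquiring two mutexes atomically looks like with std::lock (as in the cppreference link above); the mutex names are illustrative stand-ins, not the wallet's actual cs_main/cs_wallet.

```cpp
// Sketch only: taking two mutexes atomically, which is what a LOCK2-style helper
// should do instead of two independent lock() calls. Mutex names are illustrative.
#include <mutex>

std::mutex mutexA;  // stand-ins for e.g. cs_main / cs_wallet
std::mutex mutexB;

void WorkUnderBothLocks()
{
    std::unique_lock<std::mutex> lockA(mutexA, std::defer_lock);
    std::unique_lock<std::mutex> lockB(mutexB, std::defer_lock);
    // std::lock uses a deadlock-avoidance algorithm to acquire both,
    // no matter what order other threads take them in.
    std::lock(lockA, lockB);

    // ... do work; both locks release when the guards go out of scope.
}
```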
> it's okay to lock a third right after.

You can try it.

> If we call LOCK on a recursive mutex that is already owned, is that faster?

Sure; the slow part is acquiring it in the first place.

> What can we do to reduce the HaveKey time? Can we cache the results (per LOCK of cs_main)? Is it getting called multiple times with the same input?

Yes, you can minimize the calls to HaveKey by building a map<key, result> in AvailableCoins before the first loop; it can even be a local variable not guarded by any mutex.

If you want help with the implementation, I can give it a try.

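A hedged sketch of that local-map idea: cache each key's ownership answer once per AvailableCoins pass so the IsMine -> HaveKey chain runs at most once per distinct key. ScriptId, CheckIsMine, and AvailableCoinsSketch below are placeholders, not the wallet's real types or APIs.

```cpp
// Sketch only: memoize per-script ownership answers inside one AvailableCoins pass.
// ScriptId and CheckIsMine are placeholders for the real script identifier and the
// IsMine -> HaveKey path; the cache is a local variable, so it needs no mutex.
#include <map>
#include <string>
#include <vector>

using ScriptId = std::string;            // placeholder for a script/key identifier

bool CheckIsMine(const ScriptId& id)     // placeholder for the real IsMine -> HaveKey path
{
    return !id.empty();                  // dummy answer; the real check queries the key store
}

void AvailableCoinsSketch(const std::vector<ScriptId>& outputs)
{
    std::map<ScriptId, bool> isMineCache;  // local and unguarded: lives only for this call

    for (const auto& id : outputs) {
        auto it = isMineCache.find(id);
        if (it == isMineCache.end())
            it = isMineCache.emplace(id, CheckIsMine(id)).first;  // only lookup that hits HaveKey

        if (it->second) {
            // ... treat this output as spendable
        }
    }
}
```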
BrannonKing commented 2018-09-27 17:52:27 +02:00 (Migrated from github.com)

I don't think this approach is going to be sufficient; it's just too slow. Last night I exported 350k claims from mainnet, broke the file into quarters, and ran import scripts on all four in parallel. Ten hours later we had 100k blocks and about 100k claims imported. However, the import rate had slowed to about 200/minute, which means the remaining 250k claims need roughly another 21 hours, and it will probably keep slowing as the tree gets more nodes.

BrannonKing commented 2019-02-12 23:17:54 +01:00 (Migrated from github.com)

Not only is this approach insufficient from a performance standpoint, it's also insufficient in its data: it needs to bring in the real values from mainnet. To do that, it has to parse the metadata on mainnet and replace the claimIds with updated inserts.


Pull request closed

Reference: LBRYCommunity/lbrycrd#198