cfaac2a60 Add build support for 'gprof' profiling. (murrayn)
Pull request description:
Support for profiling build: `./configure --enable-profiling`
Tree-SHA512: ea983cfce385f1893bb4ab7f94ac141b7d620951dc430da3bbc92ae1357fb05521eac689216e66dc87040171a8a57e76dd7ad98036e12a2896cfe5ab544347f0
13a399a46 depends: patch pthread_set_name_np out of zeromq (Cory Fields)
8f7922636 depends: zeromq 4.2.3 (fanquake)
Pull request description:
This is a followup to #9254 and #11981. Zeromq 4.2.3 was released just after #9254 was merged, and contains a years worth of improvements/bug fixes. See the release notes [here](https://github.com/zeromq/libzmq/releases/tag/v4.2.3).
Todo:
- [ ] Add zeromq-4.2.3.tar.gz to /depends-sources on bitcoincore.org
- [ ] Verify gitian builds are still OK
- [ ] Check: https://github.com/zeromq/libzmq/pull/2787
Tree-SHA512: 85e06f47be3e1fdedcee50ce90e3391d69df2ea1c167472ffc3126d8970d418eb75141b970e422eb2fda9a8cad00e6ba5b36afa53565171a9ebaa152a9dc9b60
937bf4335 Use std:🧵:hardware_concurrency, instead of Boost, to determine available cores (fanquake)
Pull request description:
Following discussion on IRC about replacing Boost usage for detecting available system cores, I've opened this to collect some benchmarks + further discussion.
The current method for detecting available cores was introduced in #6361.
Recap of the IRC chat:
```
21:14:08 fanquake: Since we seem to be giving Boost removal a good shot for 0.15, does anyone have suggestions for replacing GetNumCores?
21:14:26 fanquake: There is std:🧵:hardware_concurrency(), but that seems to count virtual cores, which I don't think we want.
21:14:51 BlueMatt: fanquake: I doubt we'll do boost removal for 0.15
21:14:58 BlueMatt: shit like BOOST_FOREACH, sure
21:15:07 BlueMatt: but all of boost? doubtful, there are still things we need
21:16:36 fanquake: Yea sorry, not the whole lot, but we can remove a decent chunk. Just looking into what else needs to be done to replace some of the less involved Boost usage.
21:16:43 BlueMatt: fair
21:17:14 wumpus: yes, it makes sense to plan ahead a bit, without immediately doing it
21:18:12 wumpus: right, don't count virtual cores, that used to be the case but it makes no sense for our usage
21:19:15 wumpus: it'd create a swarm of threads overwhelming any machine with hyperthreading (+accompanying thread stack overhead), for script validation, and there was no gain at all for that
21:20:03 sipa: BlueMatt: don't worry, there is no hurry
21:59:10 morcos: wumpus: i don't think that is correct
21:59:24 morcos: suppose you have 4 cores (8 virtual cores)
21:59:24 wumpus: fanquake: indeed seems that std has no equivalent to physical_concurrency, on any standard. That's annoying as it is non-trivial to implement
21:59:35 morcos: i think running par=8 (if it let you) would be notably faster
21:59:59 morcos: jeremyrubin and i discussed this at length a while back... i think i commented about it on irc at the time
22:00:21 wumpus: morcos: I think the conclusion at the time was that it made no difference, but sure would make sense to benchmark
22:00:39 morcos: perhaps historical testing on the virtual vs actual cores was polluted by concurrency issues that have now improved
22:00:47 wumpus: I think there are not more ALUs, so there is not really a point in having more threads
22:01:40 wumpus: hyperthreads are basically just a stored register state right?
22:02:23 sipa: wumpus: yes but it helps the scheduler
22:02:27 wumpus: in which case the only speedup using "number of cores" threads would give you is, possibly, excluding other software from running on the cores on the same time
22:02:37 morcos: well this is where i get out of my depth
22:02:50 sipa: if one of the threads is waiting on a read from ram, the other can use the arithmetic unit for example
22:02:54 morcos: wumpus: i'm pretty sure though that the speed up is considerably more than what you might expect from that
22:02:59 wumpus: sipa: ok, I back down, I didn't want to argue this at all
22:03:35 morcos: the reason i haven't tested it myself, is the machine i usually use has 16 cores... so not easy due to remaining concurrency issues to get much more speedup
22:03:36 wumpus: I'm fine with restoring it to number of virtual threads if that's faster
22:03:54 morcos: we should have somene with 4 cores (and  actually test it though, i agree
22:03:58 sipa: i would expect (but we should benchmark...) that if 8 scriot validation threads instead of 4 on a quadcore hyperthreading is not faster, it's due to lock contention
22:04:20 morcos: sipa: yeah thats my point, i think lock contention isn't that bad with 8 now
22:04:22 wumpus: on 64-bit systems the additional thread overhead wouldn't be important at least
22:04:23 gmaxwell: I previously benchmarked, a long time ago, it was faster.
22:04:33 gmaxwell: (to use the HT core count)
22:04:44 wumpus: why was this changed at all then?
22:04:47 wumpus: I'm confused
22:05:04 sipa: good question!
22:05:06 gmaxwell: I had no idea we changed it.
22:05:25 wumpus: sigh 
22:05:54 gmaxwell: What PR changed it?
22:06:51 gmaxwell: In any case, on 32-bit it's probably a good tradeoff... the extra ram overhead is worth avoiding.
22:07:22 wumpus: https://github.com/bitcoin/bitcoin/pull/6361
22:07:28 gmaxwell: PR 6461 btw.
22:07:37 gmaxwell: er lol at least you got it right.
22:07:45 wumpus: the complaint was that systems became unsuably slow when using that many thread
22:07:51 wumpus: so at least I got one thing right, woohoo
22:07:55 sipa: seems i even acked it!
22:07:57 BlueMatt: wumpus: there are more alus
22:08:38 BlueMatt: but we need to improve lock contention first
22:08:40 morcos: anywya, i think in the past the lock contention made 8 threads regardless of cores a bit dicey.. now that is much better (although more still to be done)
22:09:01 BlueMatt: or we can just merge #10192, thats fee
22:09:04 gribble: https://github.com/bitcoin/bitcoin/issues/10192 | Cache full script execution results in addition to signatures by TheBlueMatt · Pull Request #10192 · bitcoin/bitcoin · GitHub
22:09:11 BlueMatt: s/fee/free/
22:09:21 morcos: no, we do not need to improve lock contention first. but we should probably do that before we increase the max beyond 16
22:09:26 BlueMatt: then we can toss concurrency issues out the window and get more speedup anyway
22:09:35 gmaxwell: wumpus: yea, well in QT I thought we also diminished the count by 1 or something? but yes, if the motivation was to reduce how heavily the machine was used, thats fair.
22:09:56 sipa: the benefit of using HT cores is certainly not a factor 2
22:09:58 wumpus: gmaxwell: for the default I think this makes a lot of sense, yes
22:10:10 gmaxwell: morcos: right now on my 24/28 physical core hosts going beyond 16 still reduces performance.
22:10:11 wumpus: gmaxwell: do we also restrict the maximum par using this? that'd make less sense
22:10:51 wumpus: if someone *wants* to use the virtual cores they should be able to by setting -par=
22:10:51 sipa: *flies to US*
22:10:52 BlueMatt: sipa: sure, but the shared cache helps us get more out of it than some others, as morcos points out
22:11:30 BlueMatt: (because it means our thread contention issues are less)
22:12:05 morcos: gmaxwell: yeah i've been bogged down in fee estimation as well (and the rest of life) for a while now.. otherwise i would have put more effort into jeremy's checkqueue
22:12:36 BlueMatt: morcos: heh, well now you can do other stuff while the rest of us get bogged down in understanding fee estimation enough to review it 
22:12:37 wumpus: [to answer my own question: no, the limit for par is MAX_SCRIPTCHECK_THREADS, or 16]
22:12:54 morcos: but to me optimizing for more than 16 cores is pretty valuable as miners could use beefy machines and be less concerned by block validation time
22:14:38 BlueMatt: morcos: i think you may be surprised by the number of mining pools that are on VPSes that do not have 16 cores 
22:15:34 gmaxwell: I assume right now most of the time block validation is bogged in the parts that are not as concurrent. simple because caching makes the concurrent parts so fast. (and soon to hopefully increase with bluematt's patch)
22:17:55 gmaxwell: improving sha2 speed, or transaction malloc overhead are probably bigger wins now for connection at the tip than parallelism beyond 16 (though I'd like that too).
22:18:21 BlueMatt: sha2 speed is big
22:18:27 morcos: yeah lots of things to do actually...
22:18:57 gmaxwell: BlueMatt: might be a tiny bit less big if we didn't hash the block header 8 times for every block. 
22:21:27 BlueMatt: ehh, probably, but I'm less rushed there
22:21:43 BlueMatt: my new cache thing is about to add a bunch of hashing
22:21:50 BlueMatt: 1 sha round per tx
22:22:25 BlueMatt: and sigcache is obviously a ton
```
Tree-SHA512: a594430e2a77d8cc741ea8c664a2867b1e1693e5050a4bbc8511e8d66a2bffe241a9965f6dff1e7fbb99f21dd1fdeb95b826365da8bd8f9fab2d0ffd80d5059c
6058766de Remove deprecated PyZMQ call from Python ZMQ example (Michał Zabielski)
Pull request description:
PyZMQ 17.0.0 has deprecated and removed zmq.asyncio.install() call
with advice to use asyncio native run-loop instead of zmq specific.
This caused exception when running the contrib/zmq/zmq_sub*.py examples.
This commit simply follows the advice and fixes mentioned examples.
Tree-SHA512: af357aafa5eb9506cfa3f513f06979bbc49f6132fddc1e96fbcea175da4f8e2ea298be7c7055e7d3377f0814364e13bb88b5c195f6a07898cd28c341d23a93c5
e2b2e48b6 doc: SIGNER can contains space inside now. (Ken Lee)
Pull request description:
SIGNER can contains space inside now. mentioned in https://github.com/bitcoin/bitcoin/pull/12527#issuecomment-370506416
Tree-SHA512: 8da1e8146751457c351058c0142fa3d474a338fe7304a31ebed4726202202724aaca94441806458512259238a703601b87961196abc3fdd57b5eb0f062ff0c12
741f0177c Add DynamicMemoryUsage() to LevelDB (Evan Klitzke)
Pull request description:
This adds a new method `CDBWrapper::DynamicMemoryUsage()` similar to Bitcoin's existing methods of the same name. It's implemented by asking LevelDB for the information, and then parsing the string response. I've also added logging to `CDBWrapper::WriteBatch()` to track this information:
```
$ tail -f ~/.bitcoin/testnet3/debug.log | grep WriteBatch
2018-03-05 19:34:55 WriteBatch memory usage: db=chainstate, before=0.0MiB, after=0.0MiB
2018-03-05 19:35:17 WriteBatch memory usage: db=index, before=0.0MiB, after=0.0MiB
2018-03-05 19:35:17 WriteBatch memory usage: db=chainstate, before=0.0MiB, after=8.0MiB
2018-03-05 19:35:22 WriteBatch memory usage: db=index, before=0.0MiB, after=0.0MiB
2018-03-05 19:35:22 WriteBatch memory usage: db=chainstate, before=8.0MiB, after=17.0MiB
2018-03-05 19:35:26 WriteBatch memory usage: db=index, before=0.0MiB, after=0.0MiB
2018-03-05 19:35:27 WriteBatch memory usage: db=chainstate, before=9.0MiB, after=18.0MiB
2018-03-05 19:35:40 WriteBatch memory usage: db=index, before=0.0MiB, after=0.0MiB
2018-03-05 19:35:41 WriteBatch memory usage: db=chainstate, before=9.0MiB, after=7.0MiB
2018-03-05 19:35:52 WriteBatch memory usage: db=index, before=0.0MiB, after=0.0MiB
2018-03-05 19:35:52 WriteBatch memory usage: db=chainstate, before=7.0MiB, after=9.0MiB
^C
```
As LevelDB doesn't seem to provide a way to get the database name, I've also added a new `m_name` field to the `CDBWrapper`. This is necessary because we have multiple LevelDB databases (two now, and possibly more later, e.g. #11857).
I am using this information in other branches where I'm experimenting with changing LevelDB buffer sizes.
Tree-SHA512: 7ea8ff5484bb07ef806af17d000c74ccca27d2e0f6c3229e12d93818f00874553335d87428482bd8acbcae81ea35aef2a243326f9fccbfac25989323d24391b4
874e81808 Allow dustrelayfee to be set to zero (Luke Dashjr)
Pull request description:
I don't see and can't think of any rationale for forbidding this configuration.
Tree-SHA512: df09441f4aec63e79bea94838b7f8e336cebaeb0a22b5e58d27937bbeb1377f229921aeae43674e0b63fc40a39ae51a264d48aa1cdb4cbd0e3339d32856698bf
55f89da1a Don't test against the mempool min fee information in mempool_limit.py (Ben Woosley)
Pull request description:
Because the right-hand side of this comparison can be influenced
externally, e.g. via the -maxmempool argument, the existing mempool state,
host memory usage, etc.
Called out by @MarcoFalke here: https://github.com/bitcoin/bitcoin/pull/12356#discussion_r170094948
Tree-SHA512: 1644cb8046a6953fb93423a5e51af4f5c7d00a35f10389fddd6a823dae6f31ab367b53af70b3b69161adb9c48f57cf4772db7f4610fd7aadd9c0e9b3da17e9f8
2b1f79457 [Depends] Fix Qt build with Xcode 9.2 (fanquake)
Pull request description:
Building Qt in depends is currently broken with versions of Xcode > 9.0 (on both 10.12.x and 10.13.x). We'll bump our Clang/Qt/SDK again soon, however this fixes building in the interim.
Fixes#11461
Related upstream issues: https://bugreports.qt.io/browse/QTBUG-63401, https://bugreports.qt.io/browse/QTBUG-62266.
Tree-SHA512: b1aaa50580c5180bbae2d79f5618c145667edd824b66c20851c7346674a1700267d6df249ba559d781941def312f7867bc3950f673e9128ffd4beda2965ca639
9c5a4a6ed Stop special-casing phashBlock handling in validation for TBV (Matt Corallo)
Pull request description:
There is no reason to do this, really, we already have "ignore PoW" flags. Motivated by https://github.com/bitcoin/bitcoin/pull/11739#discussion_r155841721
Tree-SHA512: 37cb1ae5b11c9e8ed7a679bb07ad3b119a2a014744b26d197d67ba21beb19fe6815271df935e40f7c7bd5f2e4d7ae4dad7bd4d00fa230a8d789f37e9de31a769
0bc095efd [qt] Improved "custom fee" explanation in tooltip (Randolf Richardson)
Pull request description:
Thanks to @dooglus for asking about this tooltip in Issue 12500.
Reference: https://www.github.com/bitcoin/bitcoin/issues/12500
I would also appreciate it if someone can confirm that 1 kilobyte in this field indeed represents 1,000 bytes rather than 1,024 bytes (if it's supposed to be 1,024, then I'll gladly make the necessary changes to reflect this).
Tree-SHA512: da2fe0128411b5ef6f0a26382a80601efcf823c3f3591bdd83a7fe7e25777728e7eb89e2e8b175b991566e63838aca12d204792f981031b86e7b2ba28ca50021
9360f5032 Drop extra script variable in ProduceSignature (Russell Yanofsky)
Pull request description:
Was slightly confusing.
Tree-SHA512: 1d18f92c133772ffc8eb71826c8d778988839a14bcefc50f9c591111b0a5f81ebc12bca0f1ab25d5fdd02d3d50c2325c04cbfcbdcd18a7b80ca112d049c2327d
2736c9e05 Avoid unintentional unsigned integer wraparounds in tests (practicalswift)
Pull request description:
Avoid unintentional unsigned integer wraparounds in tests.
This is a subset of #11535 as suggested by @MarcoFalke :-)
Tree-SHA512: 4f4ee8a08870101a3f7451aefa77ae06aaf44e3c3b2f7555faa2b8a8503f97f34e34dffcf65154278f15767dc9823955f52d1aa7b39930b390e57cdf2b65e0f3
992f56876 depends: Only use D_DARWIN_C_SOURCE when building miniupnpc on darwin (fanquake)
Pull request description:
Only use D_DARWIN_C_SOURCE when building on darwin, so we don't inadvertently introduce issues elsewhere.
cc @theuni
Tree-SHA512: e49a8456ba2b9925c06e62c73e139152b6d63cc5a4cee66944e41c863ca9103e98ac81a5718eceb3d0885a677fc53ece34062b02c304a05c3280e094965e856a
87c4320df gitian-build.sh: fix signProg being recognized as two parameters (Ken Lee)
Pull request description:
On current build script, `gpg --detach-sign` will be recognized as two parameters of gsign
This PR fix it and can build successfully
Tree-SHA512: 32e2f9e8414658ea4145bcbccd9aaa3cdf61ea648ad9328246bad67957e11a8e496afec71cbd888f8d0d49bd7eaed35c971fe2dac43b80ee6e664b90ffd997a3
18307849b Consensus: Fix bug when compiler do not support __builtin_clz* (532479301)
Pull request description:
#ifdef is not correct since defination is defined to 0 or 1. Should change to #if
Tree-SHA512: ba13a591d28f4d7d6ebaab081be4304c43766a611226f8d2994c8db415dfcf318e82217d26a8c4af290760c68eded9503b39535b0e6e079ded912e6a8fca5b36
7eb665fc8 [Trivial] link mentioned scripted-diff-commit (Felix Wolfsteller)
Pull request description:
Make it easier for people who do not operate on a cloned repository to access the example mentioned.
Tree-SHA512: 1c06e551c68cad03e6bd541bf0e0076cdf0b48ef9b8b4e4a61435367c3435e2e4ccb934112e8dc29d3d70217d8834924704aaf839e25d1133312df86848ca1a1
4d14d06fc docs: clarified systemd installation instructions in init.md for Ubuntu users. (DaveFromBinary)
Pull request description:
Added a note to init.md to clarify the .service copy path for Ubuntu because it differs from the described copy path.
Also noted which version of Ubuntu switched to systemd for the default system init to clarify when the systemd installation steps should be used instead of the upstart installation steps for Ubuntu users.
Tree-SHA512: 1ac6143a177d0f3782ff641029d71eb1f3b3be0c1482e266154d3ca093251b58a10a5f037d1cc82dbfaeae058df2bb8e904833ccb88b032f1a59a151724f95e2
fa9461473 [doc] dev-notes: Members should be initialized (MarcoFalke)
Pull request description:
Also, remove mention of threads that were removed long ago.
Motivation:
Make it easier to spot bugs such as #11654 and #12426
Tree-SHA512: 8ca1cb54e830e9368803bd98a8b08c39bf2d46f079094ed7e070b32ae15a6e611ce98d7a614f897803309f4728575e6bc9357fab1157c53d2536417eb8271653
fa41d68a2 qa: Fix python TypeError in script.py (MarcoFalke)
Pull request description:
`__repr__` returns string, so don't mix it with byte strings.
This fixes
```
TypeError: %b requires a bytes-like object, or an object that implements __bytes__, not 'str'
Tree-SHA512: fac06e083f245209bc8a36102217580b0f6186842f4e52a686225111b0b96ff93c301640ff5e7ddef6a5b4f1689071b16a9a8dc80f28e2b060ddee29edd24ec7
ee041196fc Show a transaction's virtual size in its details dialog. (Chris Moore)
Pull request description:
#12501 looks like it is going to mention transaction's "virtual size" in the custom fee tooltip, so let's display the virtual size when the user double-clicks a transaction.
Tree-SHA512: c60ae23c9f86edfba086b840519941d8e8ee1be9da5987ffe6dee3255943ea5d215708ce57464f109a1d1c612c4c0eeb11f8f3e203d8a8cfc1f8ec753a8aac27
7ba2d57852 Fix ListCoins test failure due to unset g_wallet_allow_fallback_fee (Russell Yanofsky)
Pull request description:
New global variable was introduced in #11882 and not setting it causes:
```
wallet/test/wallet_tests.cpp(638): error in "ListCoins": check wallet->CreateTransaction({recipient}, wtx, reservekey, fee, changePos, error, dummy) failed
wallet/test/wallet_tests.cpp(679): error in "ListCoins": check list.begin()->second.size() == 2 failed [1 != 2]
wallet/test/wallet_tests.cpp(686): error in "ListCoins": check available.size() == 2 failed [1 != 2]
wallet/test/wallet_tests.cpp(705): error in "ListCoins": check list.begin()->second.size() == 2 failed [1 != 2]
```
It's possible to reproduce the failure reliably by running:
```
src/test/test_bitcoin --log_level=test_suite --run_test=wallet_tests/ListCoins
```
Failures happen nondeterministically because boost test framework doesn't run tests in a specified order, and tests that run previously can set the global variables and mask the bug.
This is similar to bugs #12150 and #12424. Example travis failures are:
https://travis-ci.org/bitcoin/bitcoin/jobs/348296805#L2676https://travis-ci.org/bitcoin/bitcoin/jobs/348362560#L2769https://travis-ci.org/bitcoin/bitcoin/jobs/348362563#L2824
Tree-SHA512: ca37b554a75c12ac2d534de62bf74eb9e0b29e4399ebf1fa10053a40887e55e9e7135f754a01e5a67499cc8677ae226542146b370b1e83d08bb63d79ff379073
PyZMQ 17.0.0 has deprecated and removed zmq.asyncio.install() call
with advice to use asyncio native run-loop instead of zmq specific.
This caused exception when running the contrib/zmq/zmq_sub*.py examples.
This commit simply follows the advice.
New global variables were introduced in #11882 and not setting them causes:
wallet/test/wallet_tests.cpp(638): error in "ListCoins": check wallet->CreateTransaction({recipient}, wtx, reservekey, fee, changePos, error, dummy) failed
wallet/test/wallet_tests.cpp(679): error in "ListCoins": check list.begin()->second.size() == 2 failed [1 != 2]
wallet/test/wallet_tests.cpp(686): error in "ListCoins": check available.size() == 2 failed [1 != 2]
wallet/test/wallet_tests.cpp(705): error in "ListCoins": check list.begin()->second.size() == 2 failed [1 != 2]
It's possible to reproduce the failure reliably by running:
src/test/test_bitcoin --log_level=test_suite --run_test=wallet_tests/ListCoins
Failures happen nondeterministically because boost test framework doesn't run
tests in a specified order, and tests that run previously can set the global
variables and mask the bug.
b22c289 [docs] Minor improvements to Compatibility Notes (Randolf Richardson)
Pull request description:
Progressed from Windows Vista to Windows 7 for OS testing statement.
Tree-SHA512: b8027c3d284d6f7197b434f29fdff9da9b9b1daa2b05cd900eb6d216df4930d9cfa6280e23e9f493aa6bcc874cb32ea377034d64783f535986e8c069acf8319d
3f592b8 [QA] add wallet-rbf test (Jonas Schnelli)
8222e05 Disable wallet fallbackfee by default on mainnet (Jonas Schnelli)
Pull request description:
Removes the default fallback fee on mainnet (but keeps it on testnet/regtest).
Transactions using the fallbackfee in case the fallback fee has not been set are getting rejected.
Tree-SHA512: e54d2594b7f954e640cc513a18b0bfbe189f15e15bdeed4fe02b7677f939bca1731fef781b073127ffd4ce08a595fb118259b8826cdaa077ff7d5ae9495810db
eb91835 Add setter for g_initial_block_download_completed (Jonas Schnelli)
3f56df5 [QA] add NODE_NETWORK_LIMITED address relay and sync test (Jonas Schnelli)
158e1a6 [QA] fix mininode CAddress ser/deser (Jonas Schnelli)
fa999af [QA] Allow addrman loopback tests (add debug option -addrmantest) (Jonas Schnelli)
6fe57bd Connect to peers signaling NODE_NETWORK_LIMITED when out-of-IBD (Jonas Schnelli)
31c45a9 Accept addresses with NODE_NETWORK_LIMITED flag (Jonas Schnelli)
Pull request description:
Eventually connect to peers signalling NODE_NETWORK_LIMITED if we are out of IBD.
Accept and relay NODE_NETWORK_LIMITED peers in addrman.
Tree-SHA512: 8a238fc97f767f81cae1866d6cc061390f23a72af4a711d2f7158c77f876017986abb371d213d1c84019eef7be4ca951e8e6f83fda36769c4e1a1d763f787037
e7d9fc5 [qt] navigate to transaction history page after send (Sjors Provoost)
Pull request description:
Before this change QT just remained on the Send tab, which I found confusing. Now it switches to the Transactions tab. This makes it more clear to the user that the send actually succeeded, and here they can monitor progress.
Ideally I would like to highlight the transaction, e.g. by refactoring `TransactionView::focusTransaction(const QModelIndex &idx)` to accept a transaction hash, but I'm not sure how to do that.
Tree-SHA512: 8aa93e03874de8434e18951f8aec47377814c0bcaf7eda4766fc41d5a4e32806346e12e4139e4d45468dfdf0b786f5a7faa393a31b8cd6c65ccac21fb3782c33
5aad635 Use memset() to optimize prevector::resize() (Evan Klitzke)
e46be25 Reduce redundant code of prevector and speed it up (Akio Nakamura)
f0e7aa7 Add new prevector benchmarks. (Evan Klitzke)
Pull request description:
This branch optimizes various `prevector` operations, especially resizing vectors. While profiling the `loadblk` thread I noticed that a lot of time was being spent in `prevector::resize()` which led to this work. I have some data here indicating that it takes up **37%** of the time in `ReadBlockFromDisk()`: https://monad.io/readblockfromdisk.svg
This branch improves things significantly. For trivial types, the new results for the prevector benchmark are:
* `PrevectorClearTrivial` which tests `prevector::clear()` becomes 24.6x faster
* `PrevectorDestructorTrivial` which tests `prevector::~prevector()` becomes 20.5x faster
* `PrevectorResizeTrivial` which tests `prevector::resize()` becomes 20.3x faster
Note that in practice it looks like the prevector is only used to contain `unsigned char` types, which is a trivial type. The benchmarks are testing a bit of an extreme case, but the changes here are motivated by the profiling data for `ReadBlockFromDisk()` I linked to above.
The pull request here consists of a series of three commits:
* The first adds new benchmarks but does not change the prevector code.
* The second is from @AkioNak , and merges some prevector optimizations he submitted in #11988
* The third optimizes `prevector::resize()` to use `memset()` when the prevector contains trivially constructible types
Tree-SHA512: 28f7cbb91a19f9f43b6a5942781d7eb2e3197389186b666f086b69df12bee37773140f765426d715bfb8ebff79cb27a5f1206d0325b54b4aa65598b50fb18368