getblocktemplate corrupted claimtrie database #71

Open
opened 2022-07-17 20:29:13 +02:00 by roylee17 · 7 comments
roylee17 commented 2022-07-17 20:29:13 +02:00 (Migrated from github.com)

`Failed to create new block template: in reset height: unable to restore the hash at height 1193913`

![unknown](https://user-images.githubusercontent.com/97372/179419559-9c816128-a95d-4121-bd4c-483ac1dced7b.jpeg)

Waiting for feedback on whether the database is recoverable using `reconsiderblock blockhash`.
BrannonKing commented 2022-07-18 21:27:18 +02:00 (Migrated from github.com)

What version? It would be interesting to know if this is related to the node cache or not. I thought that I had it so that it wouldn't modify the database from getblocktemplate (in current code).

roylee17 commented 2022-07-18 21:48:40 +02:00 (Migrated from github.com)

Database "corruption" might not be accurate. It may just be that the block was marked as invalid in the database.

It happened on v0.22.102, which is a pretty recent commit (2022-07-06).

It was recoverable with:

```
# block 1193914
lbcctl reconsiderblock 2c3cd68739a5f2ef4c8dda71024e608f0898a67ef1210962f3454ce011671bc4
```

The log below shows that the computed ClaimTrie root does not match the hash in the block header.
I'm not sure whether the claimtrie was fully rebuilt (in `claimtrie.ResetHeight`) in this case, though.

```
2022-07-17 05:30:28.145 [INF] SYNC: Processed 1 block in the last 1m25.01s (26 transactions, height 1193912, 2022-07-17 05:30:13 -0400 EDT)
2022-07-17 05:32:45.918 [INF] SYNC: Processed 1 block in the last 2m17.77s (197 transactions, height 1193913, 2022-07-17 05:32:32 -0400 EDT)
2022-07-17 05:36:59.997 [ERR] RPCS: Failed to create new block template: in reset height: unable to restore the hash at height 1193913
2022-07-17 05:37:21.195 [ERR] RPCS: Failed to create new block template: in reset height: unable to restore the hash at height 1193913
2022-07-17 05:37:22.017 [ERR] RPCS: Failed to create new block template: in reset height: unable to restore the hash at height 1193913
2022-07-17 05:38:24.264 [ERR] RPCS: Failed to create new block template: in reset height: unable to restore the hash at height 1193913
2022-07-17 05:39:26.103 [ERR] RPCS: Failed to create new block template: in reset height: unable to restore the hash at height 1193913
2022-07-17 05:39:55.804 [INF] MAIN: RAM: using 6.0 GB with 18.3 available, DISK: using 127.3 GB with 89.8 available
2022-07-17 05:40:28.039 [ERR] RPCS: Failed to create new block template: in reset height: unable to restore the hash at height 1193913
2022-07-17 05:41:29.656 [ERR] RPCS: Failed to create new block template: in reset height: unable to restore the hash at height 1193913
2022-07-17 05:41:58.625 [INF] SYNC: Rejected block 2c3cd68739a5f2ef4c8dda71024e608f0898a67ef1210962f3454ce011671bc4 from 147.135.15.197:9246 (outbound): height: 1193914, computed hash: 6a274f874d2ceb112fbeec50e649919edb1b10c291c83ec29b8723036bbab5e3 != header's ClaimTrie: 5152b64862af9eefd161ddccb0f0eb663b9c65d4c5a44728371ea43bf17cac26
2022-07-17 05:43:05.475 [INF] SYNC: Rejected block 9c007d3d7cc47d0dc4aac3fdce6d5b11e46050fbaac1b72f2da0a95c98246ed3 from 147.135.15.197:9246 (outbound): previous block 2c3cd68739a5f2ef4c8dda71024e608f0898a67ef1210962f3454ce011671bc4 is known to be invalid
2022-07-17 05:43:05.531 [INF] SYNC: Rejected block 9c007d3d7cc47d0dc4aac3fdce6d5b11e46050fbaac1b72f2da0a95c98246ed3 from 51.81.34.141:9246 (outbound): previous block 2c3cd68739a5f2ef4c8dda71024e608f0898a67ef1210962f3454ce011671bc4 is known to be invalid
2022-07-17 05:43:06.090 [INF] SYNC: Rejected block 9c007d3d7cc47d0dc4aac3fdce6d5b11e46050fbaac1b72f2da0a95c98246ed3 from 47.52.140.89:9246 (outbound): previous block 2c3cd68739a5f2ef4c8dda71024e608f0898a67ef1210962f3454ce011671bc4 is known to be invalid
2022-07-17 05:43:06.624 [INF] SYNC: Rejected block 9c007d3d7cc47d0dc4aac3fdce6d5b11e46050fbaac1b72f2da0a95c98246ed3 from 47.91.158.151:9246 (outbound): previous block 2c3cd68739a5f2ef4c8dda71024e608f0898a67ef1210962f3454ce011671bc4 is known to be invalid
2022-07-17 05:43:07.441 [INF] SYNC: Rejected block 9c007d3d7cc47d0dc4aac3fdce6d5b11e46050fbaac1b72f2da0a95c98246ed3 from 47.75.9.191:9246 (outbound): previous block 2c3cd68739a5f2ef4c8dda71024e608f0898a67ef1210962f3454ce011671bc4 is known to be invalid
2022-07-17 05:45:00.050 [INF] CHAN: Adding orphan block 021a6c35dc168ce004da137c0366ad96dfb42ee66d7bc5fdc2a56ff5f8460df0 with parent 9c007d3d7cc47d0dc4aac3fdce6d5b11e46050fbaac1b72f2da0a95c98246ed3
```
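For anyone triaging a similar log, here is a minimal sketch that pulls out the block rejected for a ClaimTrie root mismatch and prints the matching `lbcctl reconsiderblock` line. The regular expression is inferred from the single "Rejected block" line in the log above, so the layout may differ in other versions. The names `find_mismatches` and `reconsider_command` are hypothetical helpers, not part of lbcd.

```python
import re
from typing import Iterable, List, NamedTuple

# Assumption: the line layout is taken from the one mismatch line in the log
# above; other lbcd versions may format this message differently.
MISMATCH = re.compile(
    r"Rejected block (?P<block>[0-9a-f]+) from \S+ \(\w+\): "
    r"height: (?P<height>\d+), computed hash: (?P<computed>[0-9a-f]+) "
    r"!= header's ClaimTrie: (?P<header>[0-9a-f]+)"
)


class Mismatch(NamedTuple):
    block: str     # hash of the rejected block
    height: int    # height reported for the rejected block
    computed: str  # ClaimTrie root computed by the local node
    header: str    # ClaimTrie root stored in the block header


def find_mismatches(lines: Iterable[str]) -> List[Mismatch]:
    """Return one Mismatch per log line that reports a ClaimTrie root mismatch."""
    found = []
    for line in lines:
        m = MISMATCH.search(line)
        if m:
            found.append(
                Mismatch(m["block"], int(m["height"]), m["computed"], m["header"])
            )
    return found


def reconsider_command(mismatch: Mismatch) -> str:
    """Build the lbcctl command used in this thread to recover the block."""
    return f"lbcctl reconsiderblock {mismatch.block}"
```

Run over the log above, `find_mismatches` matches only the line for height 1193914. The later "previous block ... is known to be invalid" rejections are skipped, because they are consequences of the first rejection rather than separate mismatches.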
roylee17 commented 2022-07-20 21:34:29 +02:00 (Migrated from github.com)

An update:

It just happened again on block 1194911 (07/20/2022).
We're restarting the node with v0.22.104 so that, if it happens again, we can see whether the RamTrie was fully or partially rebuilt.
If it happens again before we have a fix, we'll revert to v0.22.100-rc2 to exclude the node cache optimization and see if that helps narrow down the scope.

BrannonKing commented 2022-08-19 15:53:18 +02:00 (Migrated from github.com)

The node-cache optimization was a substantial speedup on the getblocktemplate call. It might be worth it to test the single-threaded version (with the node cache).

roylee17 commented 2022-08-20 02:28:53 +02:00 (Migrated from github.com)

We'll set up a small long-running node dedicated to `getblocktemplate`-related testing, including this issue (single-threaded version).

roylee17 commented 2022-08-20 10:02:41 +02:00 (Migrated from github.com)

Can you rebase that change onto the current master and test it in your environment or in CI?

The following was my attempt at a rebase, but it takes twice as long to rebuild the RamTrie (3 min vs 6 min) on my machine. I may have dropped something during the merge.

https://github.com/lbryio/lbcd/tree/single_thread_cache-rebased

BrannonKing commented 2022-08-22 17:12:59 +02:00 (Migrated from github.com)

Your rebase looks okay; the RamTrie rebuild is slower in the single-threaded case. That's expected.

Reference: LBRYCommunity/lbcd#71