The Chainquery sync needs to be done in batches #142

Closed
opened 2019-01-22 06:29:21 +01:00 by tiger5226 · 4 comments
tiger5226 commented 2019-01-22 06:29:21 +01:00 (Migrated from github.com)

Right now, the sync, if starting from scratch, will grab all claims from chainquery. This transfers a huge amount of data. From recent events we know this will crash lighthouse with the volume of claims we currently have in the chain. There are a few open PRs that require a full resync of the claims, so this needs to be done first.

As part of the process, we need to do a full sync on another machine @nikooo777 and then move DNS to that new machine, but only after this batch processing is done.

kauffj commented 2019-01-23 04:04:15 +01:00 (Migrated from github.com)

Should lighthouse just be using its own independent chainquery instance?

tzarebczan commented 2019-01-23 04:12:43 +01:00 (Migrated from github.com)

That might be a good idea, but it's not what's causing the initial sync issue Mark mentioned here.

tiger5226 commented 2019-01-25 02:32:29 +01:00 (Migrated from github.com)

> Should lighthouse just be using its own independent chainquery instance?

No, I think that would be a waste. We are barely using chainquery right now (as far as capacity goes). This problem is just a volume issue for lighthouse. When we grab claims we need to use pagination so Node doesn't grab too much information at once and crash with an out-of-memory exception.
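
Roughly what I mean, as a sketch (the chainquery endpoint, query, and field names below are placeholders for illustration, not the real lighthouse code; assumes Node 18+ with built-in `fetch`):

```typescript
// Pull claims from chainquery in fixed-size pages instead of all at once.
const BATCH_SIZE = 5000; // hypothetical page size

async function fetchClaimsPage(offset: number): Promise<any[]> {
  const query =
    `SELECT claim_id, name, bid_state FROM claim ` +
    `ORDER BY id LIMIT ${BATCH_SIZE} OFFSET ${offset}`;
  const res = await fetch(
    `https://chainquery.example.com/api/sql?query=${encodeURIComponent(query)}`
  );
  const body = await res.json();
  return body.data ?? [];
}

async function syncAllClaims(handlePage: (claims: any[]) => Promise<void>) {
  let offset = 0;
  while (true) {
    const claims = await fetchClaimsPage(offset);
    if (claims.length === 0) break;        // nothing left to sync
    await handlePage(claims);              // e.g. index this page, then drop it
    offset += claims.length;
    if (claims.length < BATCH_SIZE) break; // last partial page
  }
}
```

Only one page of claims is ever held in memory, so the sync no longer scales its memory use with the total number of claims in the chain.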

tiger5226 commented 2019-02-24 23:00:06 +01:00 (Migrated from github.com)

I added a batching process to only grab 5000 claims at a time. This should make things much easier on Node and allow @nikooo777 to continue rebuilding the lighthouse machine so we can merge the other PR too. Solved with 81986315cb.

![screen shot 2019-02-24 at 4 57 21 pm](https://user-images.githubusercontent.com/3402064/53305972-b6195580-3855-11e9-90d6-097063413197.png)
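
For illustration, a hedged sketch of what the per-page consumer could look like (assumes the v7 `@elastic/elasticsearch` client; the index name and document shape are made up, not the actual lighthouse schema):

```typescript
import { Client } from '@elastic/elasticsearch';

const es = new Client({ node: 'http://localhost:9200' });

// Bulk-index each 5000-claim page so only one page is in memory at a time.
async function indexClaimsPage(claims: Array<{ claim_id: string; name: string }>) {
  // One action line plus one document line per claim, as the bulk API expects.
  const body = claims.flatMap(claim => [
    { index: { _index: 'claims', _id: claim.claim_id } },
    { claimId: claim.claim_id, name: claim.name },
  ]);
  const { body: result } = await es.bulk({ refresh: false, body });
  if (result.errors) {
    console.error('some claims in this page failed to index');
  }
}
```

Wiring a handler like this into the paginated loop keeps memory bounded regardless of how many claims are in the chain.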
Reference: LBRYCommunity/lighthouse.js#142