remove redisDB dependency #25

Merged
nikooo777 merged 12 commits from use_video_statuses into master 2018-08-20 13:56:26 +02:00
nikooo777 commented 2018-07-31 19:43:34 +02:00 (Migrated from github.com)

This PR aims to remove the local redisdb dependency, which makes it impossible to redistribute work across different servers.

Further work has been put in; here's a summary:

  • Use channel IDs while publishing instead of channel names
  • Rename API endpoints and associated functions
  • Fix indexes in information sent to slack (those +1s that you see in the loop)
  • Slightly change how the number of "published" claims is counted (making use of the new synced videos list)
  • Add 5% slack on UTXOs to avoid waiting a full block even if just a few transactions were made
  • Speed up startup by eliminating unnecessary block waits
  • Remove waiting for UTXOs by listing UTXOs (instead we just wait for a new block)
  • Small bug fixes
lyoshenka (Migrated from github.com) requested changes 2018-08-09 16:56:47 +02:00
@ -152,1 +162,3 @@
var response apiSyncUpdateResponse
var response struct {
Success bool `json:"success"`
Error null.String `json:"error"`
lyoshenka (Migrated from github.com) commented 2018-08-09 16:21:42 +02:00

you don't have to create a named struct for this if it's only used in one place. you can inline it as

var response struct {
	Success bool        `json:"success"`
	Error   null.String `json:"error"`
	Data    null.String `json:"data"`
}
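For context, here is a minimal sketch of decoding a response into such an inline struct. It uses *string in place of null.String so the example stands alone, and the response body is made up:

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	body := []byte(`{"success": true, "error": null, "data": "ok"}`)

	// Inline (anonymous) struct: no named type needed when the shape is
	// only used in this one place. *string stands in for null.String here
	// to keep the sketch dependency-free.
	var response struct {
		Success bool    `json:"success"`
		Error   *string `json:"error"`
		Data    *string `json:"data"`
	}
	if err := json.Unmarshal(body, &response); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(response.Success, response.Error == nil, *response.Data)
}
```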
@ -40,3 +40,4 @@
}
numOnSource = int(n)
}
log.Debugf("Source channel has %d videos", numOnSource)
lyoshenka (Migrated from github.com) commented 2018-08-09 16:47:21 +02:00

why are all these cast to floats? they can be ints here

@ -65,3 +64,1 @@
claimAddress string
videoDirectory string
db *redisdb.DB
daemon *jsonrpc.Client
lyoshenka (Migrated from github.com) commented 2018-08-09 16:31:56 +02:00

if you have multiple mutexes, you can't call one of them just mux. it's not clear what that's for. it could be something like walletSetupMux. though once you name it that, it becomes clear that something might be wrong with wrapping the whole wallet setup in a single mutex. does the whole thing actually need to be locked?

the mutexes should be pointers, so new copies are not created if the Sync struct is copied

it's clearer if you put the mutex right above the thing the mutex is locking, and leave newlines on either side. so in this example, put videosMapMux right above syncedVideos. and then maybe rename it to syncedVideosMux, like so:

...other vars...

syncedVideosMux *sync.Mutex
syncedVideos    map[string]syncedVideo

...other vars...
lyoshenka (Migrated from github.com) commented 2018-08-09 16:32:30 +02:00

this should also be a pointer. but more importantly, you don't need this if you already have a stop.Group. stop.Group works as a combo WaitGroup + channel that will be closed to indicate stopping (and can be closed safely multiple times). so use grp.Add() and grp.Wait() instead

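A hedged sketch of the stop.Group pattern described above; the import path and the Ch()/Stop() calls are assumptions based on how the comment describes the package, so the real API may differ slightly:

```go
package main

import (
	"log"
	"time"

	"github.com/lbryio/lbry.go/stop" // assumed import path
)

func main() {
	grp := stop.New()

	for i := 0; i < 3; i++ {
		grp.Add(1) // WaitGroup-style counting, as suggested in the comment
		go func(workerNum int) {
			defer grp.Done()
			for {
				select {
				case <-grp.Ch(): // assumed: channel closed when the group is stopped
					log.Printf("worker %d stopping", workerNum)
					return
				case <-time.After(100 * time.Millisecond):
					// ... process one unit of work ...
				}
			}
		}(i)
	}

	time.Sleep(300 * time.Millisecond)
	grp.Stop() // signal every worker to wind down
	grp.Wait() // wait for them all, replacing a separate sync.WaitGroup
}
```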
lyoshenka (Migrated from github.com) commented 2018-08-09 16:38:35 +02:00

it won't let me comment below, so I'm commenting here:

you call Add() and Done() inside startWorker(workerNum int), but the correct pattern is to make those calls outside the function. startWorker() doesn't know if it's being run asynchronously or not. you only need a waitgroup if it is. there are also subtle concurrency issues with calling it inside the function. so the right way to go is to remove Add and Done from inside startWorker, and do this:

	for i := 0; i < s.ConcurrentVideos; i++ {
		s.wg.Add(1)
		go func(workerNum int) { // pass i in so each goroutine gets its own copy of the loop variable
			defer s.wg.Done()
			s.startWorker(workerNum)
		}(i)
	}
lyoshenka (Migrated from github.com) commented 2018-08-09 16:45:11 +02:00

you can use a read/write lock to lock the map for reading when you read it, and writing when you write to it. this lets multiple threads read the data at once, which is safe and blocks less. use *sync.RWMutex instead, and call mux.RLock() and mux.RUnlock() when you're only reading from the variable. leave mux.Lock() for writing

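A minimal sketch of that read/write-lock pattern, reusing the hypothetical syncedVideosMux/syncedVideos names suggested earlier in this review:

```go
package main

import (
	"fmt"
	"sync"
)

type syncedVideo struct {
	published bool
}

type Sync struct {
	syncedVideosMux *sync.RWMutex
	syncedVideos    map[string]syncedVideo
}

// getVideo only reads the map, so RLock lets other readers run concurrently.
func (s *Sync) getVideo(id string) (syncedVideo, bool) {
	s.syncedVideosMux.RLock()
	defer s.syncedVideosMux.RUnlock()
	v, ok := s.syncedVideos[id]
	return v, ok
}

// setVideo mutates the map, so it takes the exclusive write lock.
func (s *Sync) setVideo(id string, v syncedVideo) {
	s.syncedVideosMux.Lock()
	defer s.syncedVideosMux.Unlock()
	s.syncedVideos[id] = v
}

func main() {
	s := &Sync{
		syncedVideosMux: &sync.RWMutex{},
		syncedVideos:    map[string]syncedVideo{},
	}
	s.setVideo("abc123", syncedVideo{published: true})
	v, ok := s.getVideo("abc123")
	fmt.Println(v.published, ok) // true true
}
```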
lyoshenka (Migrated from github.com) commented 2018-08-09 16:42:20 +02:00

why is this code all the way down here, so far from line 125 where syncedVideos is first declared?

lyoshenka (Migrated from github.com) commented 2018-08-09 16:55:30 +02:00

how often does this happen, and do you know why? is it within our control?

nikooo777 (Migrated from github.com) reviewed 2018-08-09 18:23:27 +02:00
nikooo777 (Migrated from github.com) commented 2018-08-09 18:23:27 +02:00

the first happens due to YouTube restrictions (region, age, or deleted videos);
the second happens when we hit the defined size limit (it's in our control).

While we can't do anything about the first, we can raise the limits for the second, but as per several discussions we'll keep it at 2GB or less.

nikooo777 (Migrated from github.com) reviewed 2018-08-09 18:25:03 +02:00
nikooo777 (Migrated from github.com) commented 2018-08-09 18:25:02 +02:00

out of 100k published videos we have about 1% that failed for those two (leading) reasons: https://scrn.storni.info/2018-08-09_12-24-24-339707393.png

nikooo777 (Migrated from github.com) reviewed 2018-08-09 18:26:59 +02:00
@ -40,3 +40,4 @@
}
numOnSource = int(n)
}
log.Debugf("Source channel has %d videos", numOnSource)
nikooo777 (Migrated from github.com) commented 2018-08-09 18:26:59 +02:00

no clue. but you're completely right. Will fix

nikooo777 (Migrated from github.com) reviewed 2018-08-09 18:29:12 +02:00
nikooo777 (Migrated from github.com) commented 2018-08-09 18:29:12 +02:00

the only reason I put it there is because there were a bunch of assignments there and I wanted to keep them grouped, but I agree it's not very readable this way. I'll move them up

nikooo777 (Migrated from github.com) reviewed 2018-08-09 18:30:21 +02:00
@ -65,3 +64,1 @@
claimAddress string
videoDirectory string
db *redisdb.DB
daemon *jsonrpc.Client
nikooo777 (Migrated from github.com) commented 2018-08-09 18:30:21 +02:00

I need to look into that; this is from your original code and I don't remember ever changing it. Thanks for the pointers there.

nikooo777 (Migrated from github.com) reviewed 2018-08-09 18:32:46 +02:00
@ -65,3 +64,1 @@
claimAddress string
videoDirectory string
db *redisdb.DB
daemon *jsonrpc.Client
nikooo777 (Migrated from github.com) commented 2018-08-09 18:32:46 +02:00

Will rename and move the mutexes.
The reason I didn't use read/write locks is that I don't want the application to read while the data is being written to; plus, the locks are held for a very short time, so there would be no noticeable improvement, only a higher risk of race conditions during wallet refills.

nikooo777 (Migrated from github.com) reviewed 2018-08-09 18:35:17 +02:00
@ -152,1 +162,3 @@
var response apiSyncUpdateResponse
var response struct {
Success bool `json:"success"`
Error null.String `json:"error"`
nikooo777 (Migrated from github.com) commented 2018-08-09 18:35:17 +02:00

ok, will look into that

nikooo777 (Migrated from github.com) reviewed 2018-08-14 16:37:24 +02:00
@ -65,3 +64,1 @@
claimAddress string
videoDirectory string
db *redisdb.DB
daemon *jsonrpc.Client
nikooo777 (Migrated from github.com) commented 2018-08-14 16:37:24 +02:00

I looked more into why we need to lock the whole walletSetup function. The reason I'm doing that is to prevent multiple threads from refilling the wallet concurrently. I don't think I can easily break up the function to lock fewer lines of code. I think it's fair to leave it like that.

I removed the wait group in favor of the stop group

nikooo777 commented 2018-08-14 17:21:30 +02:00 (Migrated from github.com)

addressed reviews

tiger5226 (Migrated from github.com) reviewed 2018-08-17 00:38:15 +02:00
tiger5226 (Migrated from github.com) commented 2018-08-17 00:38:15 +02:00

why are we not using the aligned code in the api package? I would make sure we are making things DRY. api.Response

tiger5226 (Migrated from github.com) requested changes 2018-08-17 01:32:20 +02:00
@ -19,2 +17,4 @@
s.walletMux.Lock()
defer s.walletMux.Unlock()
err := s.ensureChannelOwnership()
if err != nil {
tiger5226 (Migrated from github.com) commented 2018-08-17 01:05:32 +02:00

why are we using a mutex lock? What is wrong with concurrent execution? A mutex is not the standard way to handle synchronization. It can cause deadlocks. Ideally, if you want things to run synchronously from different goroutines, you use channels in Go.

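For reference, one channel-based way to get the same mutual exclusion a mutex gives here: a buffered channel of capacity 1 used as a token, so only one goroutine can run the wallet setup at a time (names below are hypothetical, not the PR's actual code):

```go
package main

import (
	"fmt"
	"sync"
)

// Sync holds a capacity-1 channel that acts as a token: whoever holds the
// token is the only goroutine allowed to run walletSetup.
type Sync struct {
	walletSetupToken chan struct{}
}

func newSync() *Sync {
	return &Sync{walletSetupToken: make(chan struct{}, 1)}
}

func (s *Sync) walletSetup(worker int) {
	s.walletSetupToken <- struct{}{}        // acquire: blocks while another goroutine holds the token
	defer func() { <-s.walletSetupToken }() // release

	fmt.Println("wallet setup running for worker", worker)
	// ... ensureChannelOwnership, UTXO checks, refills ...
}

func main() {
	s := newSync()
	var wg sync.WaitGroup
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			s.walletSetup(i)
		}(i)
	}
	wg.Wait()
}
```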
@ -41,4 +42,4 @@
}
log.Debugf("Source channel has %d videos", numOnSource)
if numOnSource == 0 {
return nil
tiger5226 (Migrated from github.com) commented 2018-08-17 01:06:12 +02:00

why change this to an int? It's better to be more specific than more general.

@ -43,4 +44,4 @@
if numOnSource == 0 {
return nil
}
tiger5226 (Migrated from github.com) commented 2018-08-17 01:07:02 +02:00

what is this?! not part of the PR but this is not good to have in the code base.

tiger5226 (Migrated from github.com) commented 2018-08-17 01:10:58 +02:00

so is this ok? If there is a problem it would be an infinite loop, no? Maybe have a maxit?

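For illustration, a generic bounded-wait helper along the lines of the maxit suggestion (hypothetical names; not the actual loop from the PR):

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// waitFor polls check() until it reports done, giving up after maxIterations
// polls so a stuck condition cannot turn into an infinite loop.
func waitFor(check func() (bool, error), interval time.Duration, maxIterations int) error {
	for i := 0; i < maxIterations; i++ {
		done, err := check()
		if err != nil {
			return err
		}
		if done {
			return nil
		}
		time.Sleep(interval)
	}
	return errors.New("gave up waiting: hit the iteration cap")
}

func main() {
	polls := 0
	err := waitFor(func() (bool, error) {
		polls++
		return polls >= 3, nil // e.g. "a new block has appeared"
	}, 100*time.Millisecond, 10)
	fmt.Println("done after", polls, "polls, err:", err)
}
```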
@ -187,4 +185,4 @@
return publishAndRetryExistingNames(daemon, v.title, v.getFilename(), amount, options)
}
tiger5226 (Migrated from github.com) commented 2018-08-17 01:13:22 +02:00

Why is the author UC Berkley? should this function be called publishUCBerkley?

tiger5226 (Migrated from github.com) commented 2018-08-17 01:14:27 +02:00

ahhh, ucbVideo. Seems odd to have a struct type for a specific author.

@ -74,3 +84,4 @@
}
}
// SendErrorToSlack Sends an error message to the default channel and to the process log.
tiger5226 (Migrated from github.com) commented 2018-08-17 01:21:55 +02:00

I don't understand why we have a mutex lock here. Why would this get called twice in different goroutines on the same Sync instance?

tiger5226 (Migrated from github.com) commented 2018-08-17 01:23:54 +02:00

we should not do this here. Please use the stopper pattern grin built into the stop package.

tiger5226 (Migrated from github.com) commented 2018-08-17 01:25:01 +02:00

Last comment on mutex locks. We should discuss the pattern you are using here, as I am almost positive using channels is a better and more robust solution design.

lyoshenka (Migrated from github.com) reviewed 2018-08-18 15:28:20 +02:00
@ -65,3 +64,1 @@
claimAddress string
videoDirectory string
db *redisdb.DB
daemon *jsonrpc.Client
lyoshenka (Migrated from github.com) commented 2018-08-18 15:28:20 +02:00

a read/write lock does not allow reading during writing. that's the point of every lock. what it does allow is multiple concurrent reads. when you do RLock, others can RLock at the same time. when you Lock (for writing), no one's allowed to Lock or RLock at the same time.

lyoshenka (Migrated from github.com) approved these changes 2018-08-18 15:33:09 +02:00
lyoshenka (Migrated from github.com) left a comment

i agree with most of beamer's comments, and none are major blockers, so I'm fine with merging this now and cleaning up as you continue to work with the code.

in general I'm trying to be less strict on PRs when the code works but has little nits in it.

nikooo777 (Migrated from github.com) reviewed 2018-08-20 13:03:09 +02:00
@ -65,3 +64,1 @@
claimAddress string
videoDirectory string
db *redisdb.DB
daemon *jsonrpc.Client
nikooo777 (Migrated from github.com) commented 2018-08-20 13:03:09 +02:00

Oh yes, you're right, not sure what was going on in my mind. I'll swap that

nikooo777 (Migrated from github.com) reviewed 2018-08-20 13:45:23 +02:00
@ -19,2 +17,4 @@
s.walletMux.Lock()
defer s.walletMux.Unlock()
err := s.ensureChannelOwnership()
if err != nil {
nikooo777 (Migrated from github.com) commented 2018-08-20 13:45:23 +02:00

this is the only place the lock is used, and no multiple locks are held that could cause a deadlock here.
However, I would like to discuss channels with you to understand how they could be used here (not sure they can)

nikooo777 (Migrated from github.com) reviewed 2018-08-20 13:46:43 +02:00
@ -41,4 +42,4 @@
}
log.Debugf("Source channel has %d videos", numOnSource)
if numOnSource == 0 {
return nil
nikooo777 (Migrated from github.com) commented 2018-08-20 13:46:43 +02:00

Grin's previous review pointed out a mess of casts here and there just to do simple math.
I changed everything to int as that's reasonable for the values they represent.

nikooo777 (Migrated from github.com) reviewed 2018-08-20 13:52:54 +02:00
@ -43,4 +44,4 @@
if numOnSource == 0 {
return nil
}
nikooo777 (Migrated from github.com) commented 2018-08-20 13:52:54 +02:00

I can explain the Berkeley stuff to you via DM; it's all good as it will eventually be removed from here. Not worth changing now

nikooo777 (Migrated from github.com) reviewed 2018-08-20 13:55:01 +02:00
@ -74,3 +84,4 @@
}
}
// SendErrorToSlack Sends an error message to the default channel and to the process log.
nikooo777 (Migrated from github.com) commented 2018-08-20 13:55:01 +02:00

videos end concurrently, that's why
