LBRY daemon does not always launch when the app runs #19

Closed
opened 2017-03-19 15:28:51 +01:00 by kauffj · 13 comments
kauffj commented 2017-03-19 15:28:51 +01:00 (Migrated from github.com)

The Issue

I've heard of this issue on Reddit and from several people on Slack, but I have not experienced it myself.

Several people report that the daemon does not launch when the app is run and that they have to launch it manually to get it to work.

Steps to reproduce

I can't :(

Expected behaviour

App launches daemon.

Actual behaviour

Sometimes it doesn't, apparently.

drakythe commented 2017-03-22 19:10:02 +01:00 (Migrated from github.com)

I can vouch that it doesn't work for me.

macOS 10.12.3

Hey, stupid question: with the command-line version of the daemon I see Python. What version of Python is required for all this to run properly?

Note: The command-line version doesn't work for me either.

kauffj commented 2017-03-22 21:12:30 +01:00 (Migrated from github.com)

@drakythe installing the daemon from source is (relatively) painless: https://github.com/lbryio/lbry/blob/master/INSTALL.md

If you're also not seeing the daemon run properly when run directly, that may explain why it is not launching when the browser runs either.

Could you share the output it prints when it fails? Thank you!

drakythe commented 2017-03-22 21:35:16 +01:00 (Migrated from github.com)

Sure, output I get when I try to run the daemon is this:

lbrynet-daemon-v0.9.1-macos $ ./lbrynet-daemon
2017-03-22 15:28:59,201 INFO     lbrynet.lbrynet_daemon.DaemonControl:92: Starting lbrynet-daemon from command line
2017-03-22 15:28:59,804 INFO     lbrynet.lbrynet_daemon.DaemonControl:126: Making attempt 1 / 5 to startup
2017-03-22 15:28:59,807 INFO     lbrynet.lbrynet_daemon.DaemonServer:56: Using non-authenticated API
2017-03-22 15:28:59,808 INFO     lbrynet.lbrynet_daemon.DaemonServer:40: Daemon already running, exiting app
Unhandled error in Deferred:
2017-03-22 15:28:59,808 CRITICAL twisted:154: Unhandled error in Deferred:

2017-03-22 15:28:59,810 CRITICAL twisted:154:
Traceback (most recent call last):
  File "site-packages/twisted/internet/defer.py", line 1299, in _inlineCallbacks
  File "site-packages/lbrynet/lbrynet_daemon/DaemonServer.py", line 47, in start
  File "site-packages/lbrynet/lbrynet_daemon/DaemonServer.py", line 41, in _setup_server
SystemExit: 1

And it just hangs there. If I force-exit the script with Ctrl+C, I get this:

^CUnhandled Error
Traceback (most recent call last):
  File "daemon.py", line 4, in <module>

  File "site-packages/lbrynet/lbrynet_daemon/DaemonControl.py", line 97, in start

  File "site-packages/twisted/internet/base.py", line 1199, in run

  File "site-packages/twisted/internet/base.py", line 1208, in mainLoop

--- <exception caught here> ---
  File "site-packages/twisted/internet/base.py", line 801, in runUntilCurrent

  File "site-packages/twisted/internet/base.py", line 584, in stop

twisted.internet.error.ReactorNotRunning: Can't stop reactor that isn't running.

2017-03-22 15:31:22,319 CRITICAL twisted:154: Unhandled Error
Traceback (most recent call last):
  File "daemon.py", line 4, in <module>

  File "site-packages/lbrynet/lbrynet_daemon/DaemonControl.py", line 97, in start

  File "site-packages/twisted/internet/base.py", line 1199, in run

  File "site-packages/twisted/internet/base.py", line 1208, in mainLoop

--- <exception caught here> ---
  File "site-packages/twisted/internet/base.py", line 801, in runUntilCurrent

  File "site-packages/twisted/internet/base.py", line 584, in stop

twisted.internet.error.ReactorNotRunning: Can't stop reactor that isn't running.

2017-03-22 15:31:22,321 INFO     lbrynet.lbrynet_daemon.Daemon:523: Closing lbrynet session
2017-03-22 15:31:22,322 INFO     lbrynet.lbrynet_daemon.Daemon:524: Status at time of shutdown: initializing
2017-03-22 15:31:22,323 INFO     lbrynet.lbrynet_daemon.ExchangeRateManager:163: Stopping exchange rate manager

Further, if I leave it running for a bit and then hit Ctrl+C, I get:

^C2017-03-22 15:31:02,036 INFO     lbrynet.lbrynet_daemon.Daemon:523: Closing lbrynet session
2017-03-22 15:31:02,037 INFO     lbrynet.lbrynet_daemon.Daemon:524: Status at time of shutdown: initializing
2017-03-22 15:31:02,037 INFO     lbrynet.lbrynet_daemon.ExchangeRateManager:163: Stopping exchange rate manager

That seems to indicate it got further than the first message suggests, but it never goes anywhere or gives any further output. No matter how long I wait, a curl against localhost/127.0.0.1 returns nothing.

This is all on my iMac. Out of curiosity I installed the app on my Air, and it worked without issue. I have not yet tried just the daemon, but I imagine it works just fine.

kauffj commented 2017-03-22 23:48:08 +01:00 (Migrated from github.com)

@drakythe Can you check your task/process manager and ensure there is not already a frozen copy running? The output seems to indicate that. Though it certainly didn't handle it cleanly.
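For anyone who wants to run this check from a terminal rather than a GUI process manager, here is a minimal sketch. It assumes the process command line contains the string `lbrynet` (as in the `./lbrynet-daemon` binary shown above); the function names are hypothetical and not part of LBRY.

```python
import subprocess


def matching_processes(ps_output, needle="lbrynet"):
    """Return (pid, command) pairs from `ps` output whose command contains needle."""
    matches = []
    for line in ps_output.splitlines():
        # Each row is "PID COMMAND"; split off the PID and keep the rest intact.
        parts = line.strip().split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and needle in parts[1]:
            matches.append((int(parts[0]), parts[1]))
    return matches


def list_daemon_processes(needle="lbrynet"):
    """Run `ps` and return the processes that look like a daemon instance."""
    # `pid=,command=` suppresses the header row on both macOS and Linux.
    output = subprocess.run(
        ["ps", "-axo", "pid=,command="],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return matching_processes(output, needle)
```

Calling `list_daemon_processes()` returns an empty list when nothing matches. If the daemon runs under a different name on your system, pass that name as `needle`.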

drakythe commented 2017-03-23 13:52:17 +01:00 (Migrated from github.com)

@kauffj I'm not 100% sure what I'm looking for, but reviewing my Activity Monitor does not seem to indicate a running lbry process, either under that name, a Daemon name, or a python process. Can you give me the exact task I should be looking for?

kauffj commented 2017-03-23 22:08:53 +01:00 (Migrated from github.com)

@drakythe Sounds like probably a bug on our end. The way the app is spawning and killing the daemon process is being changed (literally by the guy sitting next to me as I type this), so let's wait until the next release and see if this is cleaned up.

lyoshenka commented 2017-04-29 00:11:09 +02:00 (Migrated from github.com)

@alexliebowitz is this fixed?

alexliebowitz commented 2017-05-04 09:52:54 +02:00 (Migrated from github.com)

Hmm, I guess I'd call it "mostly fixed." It's really more of a category of problems that all cause the same symptom. If memory serves, we decided that it usually happens because there's a hung daemon process running and the app latches onto that instead of starting a new one.

The main thing we've done to reduce that issue is that we now stop the daemon by issuing a daemon_stop API call instead of sending SIGTERM. We don't have numbers on this, but IIRC closing with SIGTERM would very often leave a hung daemon process behind, so it's possible that this change alone already reduced the issue by 80%+.
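As a rough illustration of stopping the daemon through its API rather than a signal, here is a sketch of a JSON-RPC request for `daemon_stop`. Only the method name comes from this thread. The endpoint URL, port, and payload shape are assumptions, and the helper names are hypothetical; this is not the app's actual code.

```python
import http.client
import json
import urllib.request

# Assumed endpoint; adjust to wherever the daemon's API actually listens.
API_URL = "http://localhost:5279/lbryapi"


def build_stop_request(url=API_URL):
    """Build a JSON-RPC POST asking the daemon to shut itself down."""
    payload = {"jsonrpc": "2.0", "method": "daemon_stop", "params": [], "id": 1}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stop_daemon(url=API_URL, timeout=5.0):
    """Return True if the daemon answered the stop request, False otherwise."""
    try:
        with urllib.request.urlopen(build_stop_request(url), timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, or a malformed reply: no clean acknowledgement.
        return False
```

A `False` result is the case a caller would still have to handle, for example by checking for a leftover process.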

Things we can do in the future:

  • Continue to improve general daemon stability to reduce spontaneous hangs
  • Add fallback logic so even hung daemons can respond to the 'stop' call and try to do a clean shutdown. I could even imagine a mini event loop that runs if the main one crashes.
  • On startup, use heuristics to detect if the current daemon process is hung (maybe make an RPC call that should throw an error if the daemon is hung?). Then either give the user the option to force kill it through the interface (with a warning about the risks), or tell them they need to kill it manually. Most of the infrastructure for this is already there from when I revamped the upgrade system.
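The startup heuristic in the last bullet could look something like the sketch below. It is one possible approach, not what the app does. The idea is to tell apart "nothing is listening" (safe to start a new daemon) from "something accepted the connection but never answered" (likely hung). The `status` method name, the `/lbryapi` path, and the function name are assumptions.

```python
import json
import socket


def probe_daemon(host, port, timeout=2.0):
    """Classify a daemon endpoint as 'not_running', 'hung', or 'responding'."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        return "hung"
    except OSError:
        # Includes connection refused: no process is listening on the port.
        return "not_running"
    with conn:
        conn.settimeout(timeout)
        body = json.dumps({"jsonrpc": "2.0", "method": "status", "params": [], "id": 1})
        request = (
            "POST /lbryapi HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Content-Type: application/json\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n\r\n" + body
        )
        try:
            conn.sendall(request.encode("ascii"))
            reply = conn.recv(1024)
        except socket.timeout:
            # The port is open but nothing answered in time.
            return "hung"
        except OSError:
            return "not_running"
    # An empty reply means the peer closed the connection without answering.
    return "responding" if reply else "hung"
```

Treating any reply bytes as "responding" is deliberately crude. A real check would parse the response and might retry before offering to kill anything.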

We could close this issue now since it seems like it's improved a lot already, and getting it down to 0 will be a gradual process. Or we could leave it open until we're confident that we've got it down to an acceptable level.

tzarebczan commented 2017-05-30 02:13:53 +02:00 (Migrated from github.com)

@kauffj is this related to the same issue Niko and I have been seeing on Linux with daemon start up? I have to retry it many times before it gets past "starting daemon....".

kauffj commented 2017-05-30 20:30:19 +02:00 (Migrated from github.com)

Yes, this sounds like the same issue.

sleepdefic1t commented 2017-05-30 23:47:43 +02:00 (Migrated from github.com)

I ran into some issues as well.
I resolved them by prepending "python" to the install commands, which forces them to use macOS's installed version of Python instead of the build version.

Working on a script to see if it helps.

finer9 commented 2017-05-30 23:49:53 +02:00 (Migrated from github.com)

Logfile from a Linux user with daemon startup issues

lbrynet (3).log.txt: https://github.com/lbryio/lbry-app/files/1039786/lbrynet.3.log.txt
sleepdefic1t commented 2017-05-31 00:47:42 +02:00 (Migrated from github.com)

https://github.com/sleepdefic1t/lbryBuilder

Let me know if this helps.

Reference: LBRYCommunity/lbry-desktop#19