docs: update redis storage docs

Leo Balduf 2019-10-17 14:59:59 +09:00
parent 728ec0c623
commit a9a2d37f11


# Redis Storage
This storage implementation separates Chihaya from its storage service.
Chihaya achieves HA by storing all peer data in Redis.
Multiple instances of Chihaya can use the same Redis instance concurrently.
The storage service can achieve HA by clustering.
If one instance of Chihaya goes down, peer data will still be available in Redis.
The HA of the storage service is not considered here.
If Redis runs as a single node, peer data will be unavailable if that node goes down.
You should consider setting up a Redis cluster for Chihaya in production.
This storage implementation is currently orders of magnitude slower than the in-memory implementation.
## Use Case
When one instance of Chihaya is down, other instances can continue serving peers from Redis.
## Configuration
name: redis
config:
# The frequency at which stale peers are removed.
# This balances between
# - collecting garbage more often, potentially using more CPU time, but potentially using less memory (lower value)
# - collecting garbage less often, saving CPU time, but keeping stale peers around longer, thus using more memory (higher value).
gc_interval: 3m
# The interval at which metrics about the number of infohashes and peers
# are collected and posted to Prometheus.
prometheus_reporting_interval: 1s
# The amount of time until a peer is considered stale.
# To avoid churn, keep this slightly larger than `announce_interval`
peer_lifetime: 31m
# The address of redis storage.
redis_broker: "redis://pwd@127.0.0.1:6379/0"
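Put together, the storage section of the configuration could look like the sketch below.
It assumes the `chihaya:`/`storage:` nesting used by the example config and only lists the options shown above; any options not shown here are omitted.
```
chihaya:
  storage:
    name: redis
    config:
      gc_interval: 3m
      prometheus_reporting_interval: 1s
      peer_lifetime: 31m
      redis_broker: "redis://pwd@127.0.0.1:6379/0"
```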
## Implementation
Seeders and Leechers for a particular InfoHash are stored within a redis hash.
The InfoHash is used as the key, _peer keys_ are the fields, and last modified times are the values.
Peer keys are derived from peers and contain Peer ID, IP, and Port.
All the InfoHashes (swarms) are also stored in a redis hash, with the IP family as the key, infohashes as fields, and last modified times as values.
Here is an example:
```
- IPv4
```
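As a rough illustration with placeholder names (the key format shown is an assumption for readability, not the implementation's literal scheme), two IPv4 swarms with three seeders and one leecher could be laid out like this:
```
- IPv4                      # per-family hash: infohash -> last modified time
  - <infohash 1>: <last modified time>
  - <infohash 2>: <last modified time>
- <hash for infohash 1>     # swarm hash: peer key -> last modified time
  - <seeder 1 peer key>: <last modified time>
  - <seeder 2 peer key>: <last modified time>
  - <leecher 1 peer key>: <last modified time>
- <hash for infohash 2>
  - <seeder 3 peer key>: <last modified time>
```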
In this case, Prometheus would record two swarms, three seeders, and one leecher.
These three keys per address family are used to record the count of swarms, seeders, and leechers:
```
- IPv4_infohash_count: 2
- IPv4_S_count: 3
- IPv4_L_count: 1
```
Note: IPv4_infohash_count has a different meaning compared to the `memory` storage:
it represents the number of infohashes reported by seeders, meaning that infohashes without seeders are not counted.