diff --git a/docs/api/interfaces.rst b/docs/api/interfaces.rst
index f178ac6..7e05719 100644
--- a/docs/api/interfaces.rst
+++ b/docs/api/interfaces.rst
@@ -208,3 +208,47 @@ FilterPolicy
         passed to methods of this type.
 
         :rtype: ``bytes``
+
+
+SliceTransform
+==============
+
+.. py:class:: rocksdb.interfaces.SliceTransform
+
+    SliceTransform is currently used to implement the 'prefix-API' of rocksdb.
+    https://github.com/facebook/rocksdb/wiki/Proposal-for-prefix-API
+
+    .. py:method:: transform(src)
+
+        :param bytes src: Full key to extract the prefix from.
+
+        :returns: A tuple of two integers ``(offset, size)``,
+                  where the first integer is the offset within ``src``
+                  and the second the size of the prefix after the offset.
+                  That is, the prefix is ``src[offset:offset+size]``.
+
+        :rtype: ``(int, int)``
+
+
+    .. py:method:: in_domain(src)
+
+        Decide if a prefix can be extracted from ``src``.
+        :py:meth:`transform` is called only if this method
+        returns ``True``.
+
+        :param bytes src: Full key to check.
+        :rtype: ``bool``
+
+    .. py:method:: in_range(prefix)
+
+        Checks if ``prefix`` is a valid prefix.
+
+        :param bytes prefix: Prefix to check.
+        :returns: ``True`` if ``prefix`` is a valid prefix.
+        :rtype: ``bool``
+
+    .. py:method:: name()
+
+        Return the name of this transformation.
+
+        :rtype: ``bytes``
diff --git a/docs/api/options.rst b/docs/api/options.rst
index 4593250..e5c0c1b 100644
--- a/docs/api/options.rst
+++ b/docs/api/options.rst
@@ -693,6 +693,28 @@ Options object
 
         *Default:* ``None``
 
+    .. py:attribute:: prefix_extractor
+
+        If not ``None``, use the specified function to determine the
+        prefixes for keys. These prefixes will be placed in the filter.
+        Depending on the workload, this can reduce the read-IOP
+        cost for scans when a prefix is passed to the calls generating an
+        iterator (:py:meth:`rocksdb.DB.iterkeys` ...).
+
+        A Python prefix_extractor must implement the
+        :py:class:`rocksdb.interfaces.SliceTransform` interface.
+
+        For prefix filtering to work properly, "prefix_extractor" and "comparator"
+        must be such that the following properties hold:
+
+        1. ``key.starts_with(prefix(key))``
+        2. ``compare(prefix(key), key) <= 0``
+        3. ``If compare(k1, k2) <= 0, then compare(prefix(k1), prefix(k2)) <= 0``
+        4. ``prefix(prefix(key)) == prefix(key)``
+
+        *Default:* ``None``
+
+
 CompressionTypes
 ================
 
diff --git a/docs/index.rst b/docs/index.rst
index 1ece8ec..883d653 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -20,10 +20,19 @@ Tested with python2.7 and python3.3
     API 
 
 
+Contributing
+------------
+
+Source can be found on `github `_.
+Feel free to fork and send pull-requests or create issues on the
+`github issue tracker `_.
+
 RoadMap/TODO
 ------------
 
-* support prefix API
+* wrap Backup/Restore https://github.com/facebook/rocksdb/wiki/How-to-backup-RocksDB%3F
+* wrap DestroyDB
+* wrap RepairDB
 * Links from tutorial to API pages (for example merge operator)
 
 Indices and tables
diff --git a/docs/tutorial/index.rst b/docs/tutorial/index.rst
index d6fa410..c272436 100644
--- a/docs/tutorial/index.rst
+++ b/docs/tutorial/index.rst
@@ -190,4 +190,47 @@ The simple Associative merge ::
 
     # prints b'2'
     print db.get(b"a")
 
+PrefixExtractor
+===============
+According to `Prefix API `_
+a prefix_extractor can reduce IO for scans within a prefix range.
+The following example presents a prefix extractor of a static size: the
+first 5 bytes are always used as the prefix ::
+
+    class StaticPrefix(rocksdb.interfaces.SliceTransform):
+        def name(self):
+            return b'static'
+
+        def transform(self, src):
+            return (0, 5)
+
+        def in_domain(self, src):
+            return len(src) >= 5
+
+        def in_range(self, dst):
+            return len(dst) == 5
+
+    opts = rocksdb.Options()
+    opts.create_if_missing = True
+    opts.prefix_extractor = StaticPrefix()
+
+    db = rocksdb.DB('test.db', opts)
+
+    db.put(b'00001.x', b'x')
+    db.put(b'00001.y', b'y')
+    db.put(b'00001.z', b'z')
+
+    db.put(b'00002.x', b'x')
+    db.put(b'00002.y', b'y')
+    db.put(b'00002.z', b'z')
+
+    db.put(b'00003.x', b'x')
+    db.put(b'00003.y', b'y')
+    db.put(b'00003.z', b'z')
+
+    it = db.iteritems(prefix=b'00002')
+    it.seek(b'00002')
+
+    # prints {b'00002.z': b'z', b'00002.y': b'y', b'00002.x': b'x'}
+    print dict(it)
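
Note on the patch above: the four properties listed under ``prefix_extractor`` can be sanity-checked without a database. The following is only a minimal sketch, not part of the patch or of the pyrocksdb API. It assumes the default bytewise key ordering (approximated here by Python's ``bytes`` comparison), uses a plain-Python stand-in for ``StaticPrefix`` so that it runs without ``rocksdb`` installed, and the helpers ``extract_prefix`` and ``check_properties`` are hypothetical names introduced for illustration.

```python
class StaticPrefix(object):
    """Plain-Python stand-in mirroring the tutorial's StaticPrefix.

    It does not subclass rocksdb.interfaces.SliceTransform, so the sketch
    runs without the rocksdb package.
    """

    def name(self):
        return b'static'

    def transform(self, src):
        # (offset, size): the prefix is src[offset:offset + size]
        return (0, 5)

    def in_domain(self, src):
        return len(src) >= 5

    def in_range(self, dst):
        return len(dst) == 5


def extract_prefix(extractor, key):
    """Hypothetical helper: apply the documented (offset, size) contract.

    Returns None when the key is outside the extractor's domain.
    """
    if not extractor.in_domain(key):
        return None
    offset, size = extractor.transform(key)
    return key[offset:offset + size]


def check_properties(extractor, keys):
    """Hypothetical helper: assert the four documented properties.

    Assumes bytewise ordering, i.e. compare(a, b) <= 0 iff a <= b.
    Keys outside the extractor's domain are skipped.
    """
    in_domain = sorted(k for k in keys if extractor.in_domain(k))
    prefixes = [extract_prefix(extractor, k) for k in in_domain]

    for key, prefix in zip(in_domain, prefixes):
        assert key.startswith(prefix)                       # property 1
        assert prefix <= key                                # property 2
        assert extract_prefix(extractor, prefix) == prefix  # property 4
        assert extractor.in_range(prefix)

    # Property 3: keys are sorted, so their prefixes must be non-decreasing.
    assert all(a <= b for a, b in zip(prefixes, prefixes[1:]))
    return True
```

For example, ``extract_prefix(StaticPrefix(), b'00002.y')`` yields ``b'00002'``, while a key shorter than 5 bytes such as ``b'abc'`` is outside the domain and yields ``None``. Passing the tutorial's keys to ``check_properties`` exercises all four properties at once.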