Add docs for prefix_extractor

This commit is contained in:
hofmockel 2014-01-21 17:28:38 +01:00
parent 36eb7024d3
commit 3afcb98657
4 changed files with 119 additions and 1 deletions


@ -208,3 +208,47 @@ FilterPolicy
passed to methods of this type.
:rtype: ``bytes``
SliceTransform
==============
.. py:class:: rocksdb.interfaces.SliceTransform
SliceTransform is currently used to implement the 'prefix-API' of rocksdb.
https://github.com/facebook/rocksdb/wiki/Proposal-for-prefix-API
.. py:method:: transform(src)
:param bytes src: Full key to extract the prefix from.
:returns: A tuple of two integers ``(offset, size)``.
The first integer is the offset within ``src`` and the second is
the size of the prefix after the offset, so the prefix is
generated by ``src[offset:offset+size]``.
:rtype: ``(int, int)``
.. py:method:: in_domain(src)
Decide if a prefix can be extracted from ``src``.
:py:meth:`transform` is called only if this method returns ``True``.
:param bytes src: Full key to check.
:rtype: ``bool``
.. py:method:: in_range(prefix)
Check whether ``prefix`` is a valid prefix.
:param bytes prefix: Prefix to check.
:returns: ``True`` if ``prefix`` is a valid prefix.
:rtype: ``bool``
.. py:method:: name()
Return the name of this transformation.
:rtype: ``bytes``
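
As a rough illustration of the interface described above, the sketch below implements the four methods on a plain Python class. It deliberately does not import ``rocksdb``, so the base class is omitted; a real extractor would subclass :py:class:`rocksdb.interfaces.SliceTransform`. The names ``FixedPrefix`` and ``extract_prefix`` are hypothetical and exist only to show how the ``(offset, size)`` tuple is meant to be interpreted.

```python
class FixedPrefix(object):
    """Hypothetical extractor: the first `size` bytes of a key form its prefix."""

    def __init__(self, size):
        self.size = size

    def name(self):
        return b'fixed'

    def transform(self, src):
        # Return (offset, size); the prefix is src[offset:offset + size].
        return (0, self.size)

    def in_domain(self, src):
        # A prefix can only be extracted from keys that are long enough.
        return len(src) >= self.size

    def in_range(self, prefix):
        # A valid prefix has exactly the configured length.
        return len(prefix) == self.size


def extract_prefix(extractor, key):
    """Hypothetical helper: apply the interface the way the docs describe it."""
    if not extractor.in_domain(key):
        return None
    offset, size = extractor.transform(key)
    return key[offset:offset + size]
```

Under these assumptions, ``extract_prefix(FixedPrefix(5), b'00001.x')`` slices out the first five bytes of the key, while a key shorter than five bytes is rejected by ``in_domain`` before ``transform`` is consulted.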


@ -693,6 +693,28 @@ Options object
*Default:* ``None``
.. py:attribute:: prefix_extractor
If not ``None``, use the specified function to determine the
prefixes for keys. These prefixes will be placed in the filter.
Depending on the workload, this can reduce the read-IOP cost of
scans when a prefix is passed to the calls generating an
iterator (:py:meth:`rocksdb.DB.iterkeys` ...).
A python prefix_extractor must implement the
:py:class:`rocksdb.interfaces.SliceTransform` interface.
For prefix filtering to work properly, "prefix_extractor" and "comparator"
must be such that the following properties hold:
1. ``key.starts_with(prefix(key))``
2. ``compare(prefix(key), key) <= 0``
3. ``If compare(k1, k2) <= 0, then compare(prefix(k1), prefix(k2)) <= 0``
4. ``prefix(prefix(key)) == prefix(key)``
*Default:* ``None``
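
The four properties listed above can be spot-checked in plain Python before wiring an extractor into the database. The sketch below assumes a bytewise comparator (modelled here with Python's lexicographic ``bytes`` ordering) and a fixed five-byte prefix; ``compare``, ``prefix`` and ``check_properties`` are hypothetical helper names, not part of pyrocksdb.

```python
def compare(a, b):
    # Assumed bytewise comparator: negative, zero or positive like a C comparator.
    return (a > b) - (a < b)


def prefix(key):
    # Assumed extractor: the first 5 bytes of the key.
    return key[:5]


def check_properties(keys):
    """Raise AssertionError if any of the four properties fails for `keys`."""
    for key in keys:
        p = prefix(key)
        assert key.startswith(p)          # property 1
        assert compare(p, key) <= 0       # property 2
        assert prefix(p) == p             # property 4
    for k1 in keys:
        for k2 in keys:
            if compare(k1, k2) <= 0:      # property 3
                assert compare(prefix(k1), prefix(k2)) <= 0
    return True
```

A check like this only covers the sample keys it is given, so it is a sanity test rather than a proof that an extractor and comparator are compatible.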
CompressionTypes
================


@ -20,10 +20,19 @@ Tested with python2.7 and python3.3
API <api/index>
Contributing
------------
Source can be found on `github <https://github.com/stephan-hof/pyrocksdb>`_.
Feel free to fork and send pull-requests or create issues on the
`github issue tracker <https://github.com/stephan-hof/pyrocksdb/issues>`_.
RoadMap/TODO
------------
* support prefix API * wrap Backup/Restore https://github.com/facebook/rocksdb/wiki/How-to-backup-RocksDB%3F
* wrap DestroyDB
* wrap RepairDB
* Links from tutorial to API pages (for example merge operator)
Indices and tables


@ -190,4 +190,47 @@ The simple Associative merge ::
# prints b'2'
print db.get(b"a")
PrefixExtractor
===============
According to the `Prefix API <https://github.com/facebook/rocksdb/wiki/Proposal-for-prefix-API>`_,
a prefix_extractor can reduce IO for scans within a prefix range.
The following example presents a prefix extractor of a fixed size: the first
5 bytes of each key are always used as the prefix ::

    class StaticPrefix(rocksdb.interfaces.SliceTransform):
        def name(self):
            return b'static'

        def transform(self, src):
            return (0, 5)

        def in_domain(self, src):
            return len(src) >= 5

        def in_range(self, dst):
            return len(dst) == 5

    opts = rocksdb.Options()
    opts.create_if_missing = True
    opts.prefix_extractor = StaticPrefix()

    db = rocksdb.DB('test.db', opts)

    db.put(b'00001.x', b'x')
    db.put(b'00001.y', b'y')
    db.put(b'00001.z', b'z')

    db.put(b'00002.x', b'x')
    db.put(b'00002.y', b'y')
    db.put(b'00002.z', b'z')

    db.put(b'00003.x', b'x')
    db.put(b'00003.y', b'y')
    db.put(b'00003.z', b'z')

    it = db.iteritems(prefix=b'00002')
    it.seek(b'00002')

    # prints {b'00002.z': b'z', b'00002.y': b'y', b'00002.x': b'x'}
    print dict(it)
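
To make the expected result of the scan above easier to reason about, here is a minimal pure-Python model of a prefix scan. It treats the database as a sorted list of keys, seeks to the first key that is not smaller than the prefix, and stops at the first key that no longer starts with it. This only models the observable result; it does not describe how RocksDB implements prefix filtering internally, and ``prefix_scan`` is a hypothetical helper name.

```python
import bisect


def prefix_scan(items, prefix):
    """Return all entries of the dict `items` whose key starts with `prefix`."""
    keys = sorted(items)
    # "Seek": position of the first key >= prefix in bytewise order.
    start = bisect.bisect_left(keys, prefix)
    result = {}
    for key in keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        result[key] = items[key]
    return result


# The same nine entries as in the example above.
items = {}
for group in (b'00001', b'00002', b'00003'):
    for suffix in (b'x', b'y', b'z'):
        items[group + b'.' + suffix] = suffix
```

With these entries, ``prefix_scan(items, b'00002')`` yields the three ``00002.*`` keys, matching the dictionary shown in the comment of the example above.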