Skip to content

Commit

Permalink
Reverb: Expand the documentation of Table.
Browse files Browse the repository at this point in the history
PiperOrigin-RevId: 314094641
Change-Id: I5930c638316c6e2b95b684bd7bc8b7694a3d500e
  • Loading branch information
acassirer authored and copybara-github committed Jun 1, 2020
1 parent a320389 commit 4244827
Show file tree
Hide file tree
Showing 2 changed files with 65 additions and 19 deletions.
38 changes: 21 additions & 17 deletions reverb/cc/table.h
Original file line number Diff line number Diff line change
Expand Up @@ -48,29 +48,33 @@ struct TableItem {
std::vector<std::shared_ptr<ChunkStore::Chunk>> chunks;
};

// A Table is a structure for storing `PriorityItem` objects. The Table uses two
// instances of ItemSelectorInterface, one for sampling (sampler) and another
// for removing (remover). PriorityItems are registered with both the sampler
// and remover when inserted in the `Table`. The `Table` uses the sampler to
// determine which items it should return when `Table::Sample()` is called.
// Similarly, the remover is used to determine which items should be deleted to
// ensure capacity.
// A `Table` is a structure for storing `TableItem` objects. The table uses two
// instances of `ItemSelectorInterface`, one for sampling (`sampler`) and
// another for removing (`remover`). All item operations (insert/update/delete)
// on the table are propagated to the sampler and remover with the original
// operation on the table. The `Table` uses the sampler to determine which items
// it should return when `Table::Sample()` is called. Similarly, the remover is
// used to determine which items should be deleted to ensure capacity.
//
// A `RateLimiter` is used to set the ratio of inserted to sampled
// items. This means that calls to `Table::InsertOrAssign()` and
// `Table::Sample()` may be blocked by the `RateLimiter` as it enforces this
// ratio.
//
// Please note that the removing implementation only limits the number of items
// in the table, not the number of timesteps (or actual memory) on this
// server. When we delete an item of a table, the reference counts for
// its chunks decreases and we can maybe delete the chunks. However, this is not
// guaranteed, as other tables might still hold references to the
// chunks in which case no memory is freed up. This means you must be careful
// when choosing the remover strategy. A dangerous example would be using a FIFO
// remover for one table and then introducing another with table with a
// LIFO remover. In this scenario, the two tables would not share any
// chunks and would this require twice the amount of storage.
// Please note that the remover is only used to limit the number of items in
// the table, not the number of data elements nor the memory used. Each item
// references one or more chunks, each chunk holds one or more data elements and
// consumes and "unknown" amount of memory. Each chunk can be referenced by any
// number of items across all tables on the server. Deleting a single item from
// one table simply decrements the reference count of the chunks it references
// and only when a chunk is referenced by zero items is it destroyed and its
// memory deallocated.
//
// This means you must be careful when choosing the remover strategy. A
// dangerous example would be using a FIFO remover for one table and then
// introducing another with table with a LIFO remover. In this scenario, the
// two tables would not share any chunks and would this require twice the
// amount of memory compared to two tables with the same type of remover.
//
class Table {
public:
Expand Down
46 changes: 44 additions & 2 deletions reverb/server.py
Original file line number Diff line number Diff line change
Expand Up @@ -51,8 +51,50 @@ def build_internal_extensions(


class Table:
# TODO(b/157149247): Improve docstring.
"""Table defines how items are selected for sampling and removal."""
"""Item collection with configurable strategies for insertion and sampling.
A `Table` is the structure used to interact with the data stored on a server.
Each table can contain a limited number of "items" that can be retrieved
according to the strategy defined by the `sampler`. The size of a table, in
terms of number of items, is limited to `max_size`. When items are inserted
into an already full table the `remover` is used to decide which item should
be removed.
In addition to the selection strategies used to select items for retrieval and
removal the flow of data is controlled by a `RateLimiter`. A rate limiter
controlls high level relations between inserts and samples by defining a
target ratio between the two and what level of deviations from the target is
acceptable. This is particularily useful when scaling up from single machine
use cases to distributed systems as the same "logical" throughput can be kept
constant even though the scale has changed by orders of magnitude.
It is important to note that "data elements" and "items" are related but
distinct types of entities.
Data element:
- The actual data written using `Writer.append`.
- Immutable once written.
- Is not stored in a `Table`.
- Can be referenced by items from one or more distinct `Table`.
- Cannot be retrieved in any other way than as a part of an item.
Item:
- The entity stored in a `Table`.
- Inserted using `Writer.create_item`.
- References one or more data elements, creating a "sequence".
The fact that data elements can be referenced by more than one item from one
or multiple tables means thats one has to be careful not to equate the size of
a table (in terms of items) with the amount of data it references. The data
will remain in memory on the server until the last item that references it is
removed from its table. Removing an item from a table does therefore not
neccesarily result in any (significant) change in memory usage and one must be
careful when selecting remover strategies for a multi table server. Consider
for example a server with two tables. One has a FIFO remover and the other
LIFO remover. In this scenario, the two tables would not share any chunks and
would eventually consume twice the amount of memory compared a similar setup
where the two tables share the same type of removal strategy.
"""

def __init__(self,
name: str,
Expand Down

0 comments on commit 4244827

Please sign in to comment.