Incrementally snapshotable data structures
A fixed-length array (of length N) supports two O(1) operations (list A):
- Retrieve the element at index i (where 0 ≤ i < N).
- Set the element at index i (where 0 ≤ i < N) to some new value.
An incrementally snapshotable fixed-length array (of length N) supports two more O(1) operations (list A, continued):
- Take a snapshot of all N elements.
- Drain some of the snapshot to a sink.
Given the O(1) requirement, it shouldn't be surprising that operation A4 only drains some of the snapshot. It needs to be called O(N) times to fully drain the snapshot to the sink, but this is fine, as the entire snapshot of N elements is taken atomically as operation A3, and other operations can be performed between draining steps.
The only non-obvious part is how to implement operation A3 in O(1) time, as a naïve implementation would take O(N) time. To make A3 possible in O(1) time, some of the work needs to be pushed elsewhere (to A2 and A4), and a number of requirements need to be imposed (list B):
- At most one snapshot can exist at any time.
- Once a snapshot has been taken, it must be fully drained before the next snapshot is taken.
- The underlying data structure must be incrementally iterable (this is trivial for arrays).
- Given the iterator cursor from B3, and an element being mutated by operation A2, it needs to be possible to determine on which side of the cursor the element falls (also trivial for arrays).
- The sink must be willing to receive the snapshot contents in any order.
- The sink must be willing to receive snapshot contents at any time.
- An additional one bit of mutable storage must be available for each element in the data structure.
Some of these requirements (B1, B2, B5, and B6) can be relaxed slightly (at a cost elsewhere), but we'll go with the full set of requirements for now, as practical use-cases exist which are compatible with all of these requirements. Given all these requirements, the iteration cursor from B3 splits the underlying data structure into two pieces, which I'll call visited and pending. The additional one bit per element from B7 I'll call already-visited, and impose the invariant that already-visited is false for elements in the visited part of the data structure (whereas it can be either true or false for elements in the pending part). The list A operations are then implemented as (list C):
- Retrieve the element at index i (where 0 ≤ i < N): exactly the same as a fixed-length array without snapshots.
- Set the element at index i (where 0 ≤ i < N) to some new value: if element i is in the pending part of the data structure, and already-visited is false, then before setting the element, send the previous value of the element to the sink and set already-visited to true. After this, proceed exactly the same as a fixed-length array without snapshots.
- Take a snapshot of all N elements: Set the B3 iteration cursor to one end of the data structure, such that the entire structure is pending.
- Drain some of the snapshot to a sink: Advance the B3 iteration cursor by one element. If that element was already-visited, then set already-visited to false. Otherwise, send the element to the sink.
There is a choice in C3 as to whether to set the cursor to the very start of the structure, or to the very of end the structure. This then impacts C4: the iteration is either forwards (if starting from the start) or backwards (if starting from the end). Forwards iteration is more natural, but for data structures whose mutations often happen at the end, backwards iteration can reduce the number of already-visited elements, which can improve efficiency. Requirement B2 (once a snapshot has been taken, it must be fully drained before the next snapshot is taken) ensures that the cursor can be set to either of these points without having any simultaneously visited and already-visited elements, as C4 ensures that already-visited is false everywhere after the previous draining is complete.
C++ happens to give the name operator[]
to both operation A1 (retrieve) and A2 (set), albeit with different const
specifiers. This point of confusion aside, the implementation of a fixed-length array without snapshots is trivial:
template <typename T, size_t N>
struct fixed_length_array {
T _elems[N];
const T& operator[](size_t i) const { // A1
assert(("index should be within bounds", i < N));
return _elems[i];
}
T& operator[](size_t i) { // A2
assert(("index should be within bounds", i < N));
return _elems[i];
}
};
It can be extended to an incrementally snapshotable fixed-length array like so:
template <typename T, size_t N, typename Sink>
struct fixed_length_array_with_snapshot : fixed_length_array<T, N> {
bool _already_visited[N] = {};
Sink _sink = nullptr;
size_t _cursor = 0;
const T& operator[](size_t i) const { // A1
return fixed_length_array<T, N>::operator[](i);
}
T& operator[](size_t i) { // A2
if (i < _cursor && !_already_visited[i]) {
_already_visited[i] = true;
(*_sink)[i] = this->_elems[i];
}
return fixed_length_array<T, N>::operator[](i);
}
void take_snapshot(Sink sink) { // A3
assert(("requirement B2", _sink == nullptr));
assert(("sink should not be null", sink != nullptr));
_sink = sink;
_cursor = N;
}
bool drain_snapshot() { // A4
assert(("snapshot not yet taken", _sink != nullptr));
if (_cursor == 0) {
_sink = nullptr;
return true; // finished draining
} else if (_already_visited[--_cursor]) {
_already_visited[_cursor] = false;
return false;
} else {
(*_sink)[_cursor] = this->_elems[_cursor];
return false;
}
}
};
An example usage is:
fixed_length_array<int, 3> sink;
fixed_length_array_with_snapshot<int, 3, decltype(&sink)> arr;
arr[0] = 10;
arr[1] = 20;
arr[2] = 30;
arr.take_snapshot(&sink);
arr[1] = 25;
do {} while (arr.drain_snapshot() == false);
std::cout << arr[0] << " " << arr[1] << " " << arr[2] << "\n";
std::cout << sink[0] << " " << sink[1] << " " << sink[2] << "\n";
Requirement B6 (the sink must be willing to receive snapshot contents at any time) can be relaxed by introducing a buffer, with operation A2 pushing into this buffer when the sink is not willing to receive contents, and operation A4 popping from the buffer (if non-empty) rather than advancing the cursor. This gives a slightly different implementation:
template <typename T, size_t N>
struct fixed_length_array_with_snapshot : fixed_length_array<T, N> {
bool _already_visited[N] = {};
std::vector<std::pair<size_t, T>> _buffer;
size_t _cursor = 0;
const T& operator[](size_t i) const { // A1
return fixed_length_array<T, N>::operator[](i);
}
T& operator[](size_t i) { // A2
if (i < _cursor && !_already_visited[i]) {
_already_visited[i] = true;
_buffer.emplace_back(i, this->_elems[i]);
}
return fixed_length_array<T, N>::operator[](i);
}
void take_snapshot() { // A3
_cursor = N;
}
template <typename Sink>
bool drain_snapshot(Sink sink) { // A4
if (!_buffer.empty()) {
(*sink)[_buffer.back().first] = _buffer.back().second;
_buffer.pop_back();
return false;
} else if (_cursor == 0) {
return true; // finished draining
} else if (_already_visited[--_cursor]) {
_already_visited[_cursor] = false;
return false;
} else {
(*sink)[_cursor] = this->_elems[_cursor];
return false;
}
}
};
And slightly different example usage:
fixed_length_array<int, 3> sink;
fixed_length_array_with_snapshot<int, 3> arr;
arr[0] = 10;
arr[1] = 20;
arr[2] = 30;
arr.take_snapshot();
arr[1] = 25;
do {} while (!arr.drain_snapshot(&sink));
std::cout << arr[0] << " " << arr[1] << " " << arr[2] << "\n";
std::cout << sink[0] << " " << sink[1] << " " << sink[2] << "\n";
From here, requirement B5 (the sink must be willing to receive the snapshot contents in any order) can be relaxed by changing the data structure of the buffer: rather than a first-in-last-out stack, use a tree (as in binary tree or btree) data structure, and have A4 perform dual iteration (using the same cursor) of the underlying data structure and the buffer.
To explore a different area of the implementation space, we can reinstate requirements B5 and B6, and instead relax requirement B2 (once a snapshot has been taken, it must be fully drained before the next snapshot is taken). For this, one option is to replace the one bit of mutable storage per element with a sequence number per element. The implementation looks like:
template <typename T, size_t N, typename Sink>
struct fixed_length_array_with_snapshot : fixed_length_array<T, N> {
uint64_t _sequence[N] = {};
Sink _sink = nullptr;
uint64_t _snap_sequence = 0;
size_t _cursor = 0;
const T& operator[](size_t i) const { // A1
return fixed_length_array<T, N>::operator[](i);
}
T& operator[](size_t i) { // A2
if (i < _cursor && _sequence[i] < _snap_sequence) {
_sequence[i] = _snap_sequence;
(*_sink)[i] = this->_elems[i];
}
return fixed_length_array<T, N>::operator[](i);
}
void take_snapshot(Sink sink) { // A3
_sink = sink;
_snap_sequence += 1;
_cursor = N;
}
bool drain_snapshot() { // A4
if (_cursor == 0) {
return true; // finished draining
} else if (_sequence[--_cursor] < _snap_sequence) {
(*_sink)[_cursor] = this->_elems[_cursor];
}
return false;
}
};
One advantage of this implementation is that drain_snapshot
no longer needs to mutate anything except the cursor, which can be useful when mutation is expensive. It also provides a path to relaxing requirement B1 (at most one snapshot can exist at any time) slightly: to support a handful of snapshots at a time, each one needs its own _cursor
, its own _snap_sequence
, and its own _sink
, and mutating operator[]
needs to check against every snapshot and possibly send the element to every sink.
To explore yet another different area, we can reinstate requirements B1 and B2, and instead think about different data structures. A fixed-length array of T
is arguably the easiest data structure to consider, but the exact same methodology can be used to add incremental snapshots to any kind of array, or any kind of tree (including binary trees and btrees), or any kind of hash table: a cursor needs to be maintained, mutating operations (including element insertion and removal) need to check against that cursor, and some extra state (either one bit or a snapshot sequence number) needs to be maintained per element. If the underlying data structure already needs to maintain some state per element, then the state for snapshots might fit in very easily. For an example of this, consider a fixed-length array of Maybe[T]
. That is, an array where every element is either Nothing
or Something(T)
. A simple implementation without snapshots might look like:
template <typename T, size_t N>
struct fixed_length_maybe {
T _elems[N];
bool _has[N] = {};
bool has(size_t i) const {
return i < N && _has[i];
}
void emplace(size_t i, const T& value) {
assert(("index should be within bounds", i < N));
assert(("element already exists", !_has[i]));
_has[i] = true;
_elems[i] = value;
}
void remove(size_t i) {
assert(has(i));
_has[i] = false;
}
const T& get(size_t i) const {
assert(has(i));
return _elems[i];
}
void set(size_t i, const T& value) {
assert(has(i));
_elems[i] = value;
}
};
Note that the above implementation assumes that T
is default-constructible. It should be clear how to remove this assumption (change the element type of the _elems
array from T
to std::aligned_storage_t<sizeof(T), alignof(T)>
, use placement-new in emplace
, invoke destructor in remove
, write copy/move constructors/operators and destructor as appropriate), but the resultant boilerplate isn't interesting to the topic of this blog post.
The implementation can then be extended with support for incremental snapshots:
template <typename T, size_t N, typename Sink>
struct fixed_length_maybe_with_snapshot : fixed_length_maybe<T, N> {
bool _already_visited[N] = {};
Sink _sink = nullptr;
size_t _cursor = 0;
void emplace(size_t i, const T& value) {
_already_visited[i] = (i < _cursor);
return fixed_length_maybe<T, N>::emplace(i, value);
}
void remove(size_t i) {
if (i < _cursor && !_already_visited[i]) {
_sink->emplace(i, this->_elems[i]);
}
_already_visited[i] = false;
return fixed_length_maybe<T, N>::remove(i);
}
void set(size_t i, const T& value) {
if (i < _cursor && !_already_visited[i]) {
_already_visited[i] = true;
_sink->emplace(i, this->_elems[i]);
}
return fixed_length_maybe<T, N>::set(i, value);
}
void take_snapshot(Sink sink) {
assert(("requirement B2", _sink == nullptr));
assert(("sink should not be null", sink != nullptr));
_sink = sink;
_cursor = N;
}
bool drain_snapshot() {
assert(("snapshot not yet taken", _sink != nullptr));
if (_cursor == 0) {
_sink = nullptr;
return true; // finished draining
} else if (_already_visited[--_cursor]) {
_already_visited[_cursor] = false;
} else if (this->_has[_cursor]) {
_sink->emplace(_cursor, this->_elems[_cursor]);
}
return false;
}
};
A better implementation would note that the _has
array and the _already_visited
array can be combined into a single array, with every element being in one of three possible states:
Nothing
Something(T)
Something(T)
already-visited
A fixed-length array of Maybe[T]
might feel slightly contrived, but an open addressing hash table from K
to V
is basically just a fixed-length array of Maybe[Pair[K,V]]
. Depending on how deletion is implemented, a fourth element state (Tombstone
) might be added. It is also common for the Something
states to contain a fingerprint of the element hash (in the otherwise spare bits). The only non-obvious part is how rehashing (to grow the table) interacts with the snapshot state, but this turns out to be easy: at the end of rehashing, set the snapshot cursor such that the entire new table is pending, and when moving elements from the old table to the new table, set them as already-visited if they were already-visited to start with or they were in the visited part of the old table.