Data Structures

Bloom Filter

A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not. Elements can be added to the set, but not removed; the more items added, the larger the probability of false positives.
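
The behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, and the choice of salted SHA-256 as the hash family are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus several hash positions per item."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with a seed byte
        # (an illustrative choice; real filters use faster hashes).
        data = item.encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter()
bf.add("apple")
print("apple" in bf)  # True: an added item is never reported absent
```

There is no `remove` method because clearing a bit could also erase evidence of other items that share it, which is why elements cannot be removed.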

HyperLogLog

HyperLogLog is an algorithm for the count-distinct problem, approximating the number of distinct elements in a multiset. Calculating the exact cardinality of the distinct elements of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. Probabilistic cardinality estimators, such as the HyperLogLog algorithm, use significantly less memory than this, but can only approximate the cardinality. The HyperLogLog algorithm is able to estimate cardinalities of more than 10^9 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory.
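
A rough sketch of the idea, assuming 2^11 = 2048 registers (at 6 bits each that is about 1.5 kB, and the expected standard error of roughly 1.04/sqrt(2048) is a little over 2%). The class name `HyperLogLog`, the parameter `p`, and the use of SHA-256 as the hash are assumptions for illustration. Only the basic small-range correction is included, and the registers are stored as a plain Python list rather than packed bits.

```python
import hashlib
import math


class HyperLogLog:
    def __init__(self, p=11):
        self.p = p                  # number of index bits
        self.m = 1 << p             # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        idx = h >> (64 - self.p)                     # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)        # bias constant, valid for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog()
for i in range(50_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")  # duplicates never raise a register further
print(round(hll.count()))  # an estimate close to 50000, not an exact count
```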

B-Tree

A B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree generalizes the binary search tree, allowing for nodes with more than two children. Unlike self-balancing binary search trees, the B-tree is well suited for storage systems that read and write relatively large blocks of data, such as databases and file systems, because each node holds many keys and can be sized to match one block.
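
As a sketch of how search, insertion, and sequential access fit together, here is a small in-memory B-tree using the textbook "minimum degree" formulation, where each node holds at most 2t - 1 keys. The names `BTree`, `BTreeNode`, and `t` are assumptions for the example, deletion is omitted for brevity, and a real storage engine would keep each node in a disk block rather than a Python object.

```python
import bisect


class BTreeNode:
    def __init__(self, leaf=True):
        self.keys = []        # sorted keys stored in this node
        self.children = []    # len(keys) + 1 children when not a leaf
        self.leaf = leaf


class BTree:
    def __init__(self, t=2):
        self.t = t            # minimum degree: at most 2t - 1 keys per node
        self.root = BTreeNode()

    def search(self, key, node=None):
        node = self.root if node is None else node
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if node.leaf:
            return False
        return self.search(key, node.children[i])

    def insert(self, key):
        if len(self.root.keys) == 2 * self.t - 1:
            # A full root is split first; this is the only way the tree grows taller.
            new_root = BTreeNode(leaf=False)
            new_root.children.append(self.root)
            self._split_child(new_root, 0)
            self.root = new_root
        self._insert_nonfull(self.root, key)

    def _split_child(self, parent, i):
        t = self.t
        child = parent.children[i]
        right = BTreeNode(leaf=child.leaf)
        median = child.keys[t - 1]
        right.keys = child.keys[t:]
        child.keys = child.keys[:t - 1]
        if not child.leaf:
            right.children = child.children[t:]
            child.children = child.children[:t]
        parent.keys.insert(i, median)          # median key moves up
        parent.children.insert(i + 1, right)

    def _insert_nonfull(self, node, key):
        i = bisect.bisect_right(node.keys, key)
        if node.leaf:
            node.keys.insert(i, key)
            return
        if len(node.children[i].keys) == 2 * self.t - 1:
            self._split_child(node, i)
            if key > node.keys[i]:
                i += 1
        self._insert_nonfull(node.children[i], key)

    def in_order(self, node=None):
        # Sequential access: yields all keys in sorted order.
        node = self.root if node is None else node
        for i, key in enumerate(node.keys):
            if not node.leaf:
                yield from self.in_order(node.children[i])
            yield key
        if not node.leaf:
            yield from self.in_order(node.children[-1])


tree = BTree(t=2)
for k in [10, 20, 5, 6, 12, 30, 7, 17]:
    tree.insert(k)
print(list(tree.in_order()))  # [5, 6, 7, 10, 12, 17, 20, 30]
print(tree.search(12), tree.search(13))  # True False
```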

k-d tree

A k-d tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space, that is, a space with exactly k orthogonal axes, where k can be any number of dimensions. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbor searches) or creating point clouds. k-d trees are a special case of binary space partitioning trees.
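
A minimal sketch of building a k-d tree and running a nearest neighbor search on it. The function names `build_kdtree`, `sq_dist`, and `nearest`, the dict-based node layout, and the rule of cycling through the axes by depth and splitting at the median are assumptions chosen for the example; other splitting rules exist.

```python
def build_kdtree(points, depth=0):
    """Recursively split the points at the median along one axis per level."""
    if not points:
        return None
    k = len(points[0])
    axis = depth % k                      # cycle through the k axes
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node, target, best=None):
    """Return the stored point closest to target."""
    if node is None:
        return best
    if best is None or sq_dist(node["point"], target) < sq_dist(best, target):
        best = node["point"]
    axis = node["axis"]
    diff = target[axis] - node["point"][axis]
    near, far = (
        (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    )
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(best, target):
        best = nearest(far, target, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
kd = build_kdtree(points)
print(nearest(kd, (9, 2)))  # (8, 1)
```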

Merkle tree

A Merkle tree, also called a hash tree, is a tree in which every "leaf" node is labelled with the cryptographic hash of a data block, and every node that is not a leaf (called a branch, inner node, or inode) is labelled with the cryptographic hash of the labels of its child nodes. A hash tree allows efficient and secure verification of the contents of a large data structure. A hash tree is a generalization of a hash list and a hash chain.
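
The labelling rule can be sketched as a function that computes the root hash of a binary Merkle tree. The names `sha256` and `merkle_root` are assumptions for the example, as is the handling of odd-sized levels: duplicating the last hash is one common convention, and other schemes handle this case differently.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Return the root label of a binary Merkle tree over the given data blocks."""
    if not blocks:
        raise ValueError("at least one data block is required")
    # Leaf labels: the hash of each data block.
    level = [sha256(block) for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            # Assumed convention: pair an unmatched last node with itself.
            level.append(level[-1])
        # Inner labels: the hash of the concatenated child labels.
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


original = merkle_root([b"block-0", b"block-1", b"block-2", b"block-3"])
tampered = merkle_root([b"block-0", b"block-1", b"block-X", b"block-3"])
print(original == tampered)  # False: changing any block changes the root
```

Comparing two root hashes is enough to tell whether two large data sets are identical, which is what makes verification efficient.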

Radix Tree

Ropes

Rope science

CRDT (Conflict-free Replicated Data Type)

An Interactive Intro to CRDTs - CRDT = Conflict-free Replicated Data Type
cola: a text CRDT for real-time collaborative editing - lightweight CRDT implementation in Rust for plain text documents.
Collaborative Editing in ProseMirror
CRDTs Turned Inside Out #crdt #text-editing
Ink and Switch - Articles on collaborative peer-to-peer editing #crdt #text-editing
yjs - a shared editing framework
- Y-Sweet - an open-source Yjs sync server

Resources

6.851: Advanced Data Structures