Reservoir Sampling is a randomized algorithm used to select k items uniformly at random from a stream of unknown or very large size. The key challenge is that you cannot store the entire dataset in memory, yet every element must have an equal probability of being chosen. Reservoir Sampling solves this by maintaining a small "reservoir" of elements and updating it probabilistically as new elements arrive.
This technique frequently appears in coding interviews when working with data streams, large datasets, or unknown input sizes. Companies use it to test your understanding of randomness, probability, and memory-efficient algorithms. Instead of loading millions of elements into memory, Reservoir Sampling lets you process the stream in a single pass with O(k) space. Many classic interview problems—like randomly picking an index from an array or sampling elements from a linked list—use this pattern.
To understand Reservoir Sampling deeply, it helps to be comfortable with randomized algorithms and the fundamentals of probability and statistics. These concepts explain why replacing elements with probability k/i guarantees that every element in the stream has the same chance of being selected.
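The single-pass algorithm described above can be sketched in a few lines of Python (the function name `reservoir_sample` is illustrative): the first k elements fill the reservoir, and each later element i replaces a random slot with probability k/i.

```python
import random

def reservoir_sample(stream, k):
    """Select k items uniformly at random from an iterable of unknown length."""
    reservoir = []
    for i, item in enumerate(stream, start=1):
        if i <= k:
            reservoir.append(item)      # fill the reservoir with the first k items
        else:
            j = random.randrange(i)     # uniform index in [0, i)
            if j < k:                   # happens with probability k/i
                reservoir[j] = item     # evict a uniformly chosen slot
    return reservoir
```

Note that the stream is consumed exactly once and only k items are ever held in memory, which is what makes the technique suitable for inputs too large to store.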
Common variations of Reservoir Sampling appear in problems involving data streams, arrays, linked structures, and weighted random selection.
In coding interviews, the most common pattern is k = 1, where you randomly select a single element while iterating once through the data. More advanced variations extend the idea to selecting multiple elements, weighted sampling, or adapting the algorithm to different data structures.
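The k = 1 case reduces to keeping a single candidate and replacing it with the i-th element with probability 1/i. A minimal sketch (`pick_one` is an illustrative name):

```python
import random

def pick_one(stream):
    """Return one element chosen uniformly at random from an iterable, in one pass."""
    choice = None
    for i, item in enumerate(stream, start=1):
        if random.randrange(i) == 0:   # replace the candidate with probability 1/i
            choice = item
    return choice
```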
On FleetCode, you can practice carefully selected Reservoir Sampling problems that reinforce the probability intuition behind the algorithm and prepare you for real interview scenarios. By solving these problems, you'll learn how to implement the algorithm efficiently and recognize when this elegant technique is the best tool for streaming or large-input problems.
Basic mathematical reasoning helps prove why the replacement probability k/i guarantees uniform sampling across the entire dataset.
Many interview problems implement Reservoir Sampling over arrays where you randomly choose indices or elements while scanning the array once.
The algorithm is a classic randomized technique. Learning randomized algorithms helps you reason about expected behavior and correctness proofs for sampling methods.
Reservoir Sampling is commonly used when processing continuous or unbounded streams where storing all elements is impossible.
Reservoir Sampling relies on probability reasoning to ensure every element in a stream has equal selection chance. Understanding conditional probability and uniform distribution is essential.
These topics frequently appear alongside Reservoir Sampling.
Common questions about Reservoir Sampling.
Reservoir Sampling occasionally appears in FAANG and other top tech interviews, especially for problems involving data streams or random selection. While not as common as arrays or dynamic programming, it is a valuable pattern because it tests understanding of probability, randomness, and space-efficient algorithms.
Reservoir Sampling runs in O(n) time because each element in the stream is processed exactly once. The space complexity is O(k), where k is the number of elements stored in the reservoir. This makes it ideal for very large datasets or streaming systems.
Reservoir Sampling is a randomized algorithm used to select k elements uniformly from a stream of n items when n is unknown or too large to store. It processes the stream in one pass and uses O(k) memory. The algorithm updates the reservoir with probability k/i for the i-th element to maintain equal selection probability.
Typical patterns include randomly selecting one element from a stream (k=1), selecting k elements from a large dataset, sampling nodes from linked structures, and performing weighted random selection. These problems often combine randomized algorithms with probability reasoning.
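For the weighted variant, one well-known extension is the Efraimidis–Spirakis method (often called A-Res): assign each item with weight w the key u^(1/w), where u is uniform in [0, 1), and keep the k items with the largest keys. A sketch under the assumption that all weights are positive:

```python
import heapq
import random

def weighted_reservoir_sample(stream, k):
    """One-pass weighted sampling (Efraimidis-Spirakis A-Res).

    `stream` yields (item, weight) pairs with weight > 0; items with larger
    weights are proportionally more likely to be kept.
    """
    heap = []  # min-heap of (key, item); the smallest key is evicted first
    for item, weight in stream:
        key = random.random() ** (1.0 / weight)
        if len(heap) < k:
            heapq.heappush(heap, (key, item))
        elif key > heap[0][0]:
            heapq.heapreplace(heap, (key, item))
    return [item for _, item in heap]
```

The heap keeps the running top-k keys in O(log k) time per element, so the one-pass, O(k)-space character of the unweighted algorithm is preserved.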
Common interview problems include Random Pick Index, Linked List Random Node, Random Pick with Weight variations, and sampling elements from a data stream. These problems test your ability to maintain uniform randomness while scanning data once. Practicing 3–5 variations usually covers the most common interview patterns.
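Linked List Random Node, for example, is exactly the k = 1 case applied to a list traversal: return a uniformly random node value without knowing the list's length in advance. A sketch, with `ListNode` as a hypothetical stand-in for the problem's node type:

```python
import random

class ListNode:
    def __init__(self, val, next=None):
        self.val = val
        self.next = next

def get_random(head):
    """Return the value of a uniformly random node in one pass and O(1) space."""
    result, i, node = None, 0, head
    while node:
        i += 1
        if random.randrange(i) == 0:   # keep this node's value with probability 1/i
            result = node.val
        node = node.next
    return result
```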
Most candidates can understand the pattern after solving about 4–6 focused problems. Start with the single-element version (k=1), then move to k-element sampling and weighted variations. Practicing multiple implementations helps reinforce the probability reasoning behind the algorithm.