Thrill
0.1
|
Fast exact implementation of reservoir sampling using skip values.
Algorithm L from Kim-Hung Li: Reservoir Sampling Algorithms of Time Complexity O(n(1+log(N/n))), ACM TOMS 1994. The reservoir size is fixed, new items replace old ones such that all items in the stream are sampled with the same uniform probability.
Definition at line 85 of file reservoir_sampling.hpp.
#include <reservoir_sampling.hpp>
Public Member Functions | |
ReservoirSamplingFast (size_t size, std::vector< Type > &samples, RNG &rng) | |
initialize reservoir sampler More... | |
void | add (const Type &item) |
visit item, maybe add it to the sample. More... | |
size_t | count () const |
number of items seen More... | |
const std::vector< Type > & | samples () const |
access to samples More... | |
size_t | size () const |
size of reservoir More... | |
Private Member Functions | |
void | calc_next_gap () |
draw gap size from geometric distribution with p = size_ / count_ More... | |
Private Attributes | |
size_t | count_ = 0 |
number of items seen More... | |
size_t | gap_ |
number of items to skip until next sample More... | |
RNG & | rng_ |
source of randomness More... | |
std::vector< Type > & | samples_ |
reservoir More... | |
size_t | size_ |
size of reservoir More... | |
std::uniform_real_distribution< double > | uniform |
uniform [0.0, 1.0) distribution More... | |
double | W_ |
|
inline |
initialize reservoir sampler
Definition at line 89 of file reservoir_sampling.hpp.
|
inline |
visit item, maybe add it to the sample.
Definition at line 98 of file reservoir_sampling.hpp.
Referenced by SampleNode< ValueType >::SampleNode().
|
inlineprivate |
draw gap size from geometric distribution with p = size_ / count_
Definition at line 156 of file reservoir_sampling.hpp.
|
inline |
number of items seen
Definition at line 133 of file reservoir_sampling.hpp.
Referenced by SampleNode< ValueType >::Execute().
|
inline |
access to samples
Definition at line 136 of file reservoir_sampling.hpp.
|
inline |
size of reservoir
Definition at line 130 of file reservoir_sampling.hpp.
|
private |
number of items seen
Definition at line 142 of file reservoir_sampling.hpp.
|
private |
number of items to skip until next sample
Definition at line 144 of file reservoir_sampling.hpp.
|
private |
source of randomness
Definition at line 151 of file reservoir_sampling.hpp.
|
private |
reservoir
Definition at line 149 of file reservoir_sampling.hpp.
|
private |
size of reservoir
Definition at line 140 of file reservoir_sampling.hpp.
|
private |
uniform [0.0, 1.0) distribution
Definition at line 153 of file reservoir_sampling.hpp.
|
private |
random value for gap calculation, distribution: largest value in a sample of Uniform(0, old_W) of size size_, where old_W is 1 initially
Definition at line 147 of file reservoir_sampling.hpp.