Thrill
0.1
A data structure which takes an arbitrary value and extracts a key from that value using a key extractor function.
A key may also be provided up front as part of a key/value pair, in which case no key needs to be extracted.
Afterwards, the key is hashed and the hash is used to assign the key/value pair to a bucket. A bucket can have one or more slots to store items. There are max_num_items_per_table_per_bucket slots in each bucket.
If a slot already holds a key/value pair whose key is the same as the key of the pair to be inserted, the two values are reduced according to the reduce function. No new key/value pair is added to the bucket.
If the keys are different, the next slot (moving down) is considered. If that slot is occupied, the same procedure is repeated. This procedure can be viewed as linear probing within the scope of a bucket.
Finally, the key/value pair to be inserted may either:
1.) Be reduced with some other key/value pair sharing the same key.
2.) Be inserted at a free slot in the bucket.
3.) Trigger a resize of the data structure in case there are no more free slots in the bucket.
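The three outcomes above can be sketched with a toy model. This is not Thrill's implementation: `ToyBucketTable`, `InsertResult`, and `kSlotsPerBucket` are hypothetical names, each bucket is a plain `std::vector` instead of a chain of bucket blocks, and a full bucket is only reported to the caller rather than triggering a resize.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical item type: the key is stored alongside the value.
using Item = std::pair<std::string, int>;

enum class InsertResult { Reduced, Inserted, BucketFull };

struct ToyBucketTable {
    static constexpr std::size_t kSlotsPerBucket = 4;  // assumed slot limit
    std::vector<std::vector<Item>> buckets;

    explicit ToyBucketTable(std::size_t num_buckets) : buckets(num_buckets) {
        assert(num_buckets > 0);
    }

    template <typename Reduce>
    InsertResult Insert(const Item& kv, Reduce reduce) {
        // Hash the key to select a bucket.
        std::size_t idx = std::hash<std::string>{}(kv.first) % buckets.size();
        std::vector<Item>& bucket = buckets[idx];
        // Probe the slots of this bucket from top to bottom.
        for (Item& slot : bucket) {
            if (slot.first == kv.first) {
                // Outcome 1: same key, so reduce in place.
                slot.second = reduce(slot.second, kv.second);
                return InsertResult::Reduced;
            }
        }
        if (bucket.size() < kSlotsPerBucket) {
            // Outcome 2: a free slot is available.
            bucket.push_back(kv);
            return InsertResult::Inserted;
        }
        // Outcome 3: no free slot; a real table would resize or spill here.
        return InsertResult::BucketFull;
    }
};
```

With a sum as the reduce function, inserting `{"a", 1}` and then `{"a", 2}` leaves a single slot holding `{"a", 3}`.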
The following illustration shows the general structure of the data structure. There are several buckets, each containing one or more slots. Each slot may store an item. In order to optimize I/O, slots are organized in bucket blocks. Bucket blocks are connected by pointers. Key/value pairs are stored directly in a bucket block; no pointers are required here.
 Partition 0 Partition 1 Partition 2 Partition 3 Partition 4
 B00 B01 B02 B10 B11 B12 B20 B21 B22 B30 B31 B32 B40 B41 B42
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
||  |   |   ||  |   |   ||  |   |   ||  |   |   ||  |   |  ||
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
  |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
  V   V   V   V   V   V   V   V   V   V   V   V   V   V   >
+---+       +---+
|   |       |   |
+---+       +---+         ...
|   |       |   |
+---+       +---+
  |           |
  V           V
+---+       +---+
|   |       |   |
+---+       +---+         ...
|   |       |   |
+---+       +---+
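A chain of bucket blocks like the one drawn above could look roughly as follows. The member names `size`, `next`, and `items` mirror the `BucketBlock` members referenced on this page; the exact layout, the template parameters, and the helper `CountItems` are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for BucketBlock: items live directly inside the
// block, and blocks of the same bucket are linked through `next`.
template <typename TableItem, std::size_t BlockSize>
struct ToyBucketBlock {
    std::size_t size = 0;            // number of occupied slots in this block
    ToyBucketBlock* next = nullptr;  // next block of the same bucket, if any
    TableItem items[BlockSize];      // key/value pairs stored inline
};

// Walk one bucket's chain and count the items stored in it.
template <typename TableItem, std::size_t BlockSize>
std::size_t CountItems(const ToyBucketBlock<TableItem, BlockSize>* head) {
    std::size_t count = 0;
    for (const auto* block = head; block != nullptr; block = block->next) {
        assert(block->size <= BlockSize);
        count += block->size;
    }
    return count;
}
```

Only the link between blocks is a pointer; looking at an item inside a block needs no further indirection.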
Definition at line 89 of file reduce_bucket_hash_table.hpp.
#include <reduce_bucket_hash_table.hpp>
Classes
struct | BucketBlock |
Block holding reduce key/value pairs.
class | BucketBlockPool |
BucketBlockPool to stack allocated BucketBlocks.
Public Types
using | BucketBlockIterator = typename std::vector< BucketBlock * >::iterator |
Public Types inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >
using | MakeTableItem = ReduceMakeTableItem< Value, TableItem, VolatileKey > |
using | ReduceConfig = ReduceConfig |
using | TableItem = typename std::conditional< VolatileKey, std::pair< Key, Value >, Value >::type |
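To make the `TableItem` alias concrete, the sketch below reproduces the same `std::conditional` selection in isolation. `ToyTableItem` is a hypothetical name; the choice of `std::string` and `int` for key and value is arbitrary.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Same selection as the TableItem alias: with a volatile key the table
// stores (key, value) pairs, otherwise it stores the plain value.
template <typename Key, typename Value, bool VolatileKey>
using ToyTableItem = typename std::conditional<
    VolatileKey, std::pair<Key, Value>, Value>::type;

static_assert(
    std::is_same<ToyTableItem<std::string, int, true>,
                 std::pair<std::string, int>>::value,
    "VolatileKey = true stores the key next to the value");
static_assert(
    std::is_same<ToyTableItem<std::string, int, false>, int>::value,
    "VolatileKey = false stores only the value");
```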
Public Member Functions
ReduceBucketHashTable (Context &ctx, size_t dia_id, const KeyExtractor &key_extractor, const ReduceFunction &reduce_function, Emitter &emitter, size_t num_partitions, const ReduceConfig &config=ReduceConfig(), bool immediate_flush=false, const IndexFunction &index_function=IndexFunction(), const KeyEqualFunction &key_equal_function=KeyEqualFunction()) |
ReduceBucketHashTable (const ReduceBucketHashTable &)=delete |
non-copyable: delete copy-constructor
~ReduceBucketHashTable () |
void | Dispose () |
Deallocate memory.
void | Initialize (size_t limit_memory_bytes) |
Construct the hash table itself and fill it with sentinels.
bool | Insert (const TableItem &kv) |
Inserts a value into the table, potentially reducing it in case both the key of the value already in the table and the key of the value to be inserted are the same.
ReduceBucketHashTable & | operator= (const ReduceBucketHashTable &)=delete |
non-copyable: delete assignment operator
Spilling Mechanisms to External Memory Files
void | SpillAnyPartition () |
Spill all items of an arbitrary partition into an external memory File.
void | SpillPartition (size_t partition_id) |
Spill all items of a partition into an external memory File.
void | SpillLargestPartition () |
Spill all items of the largest partition into an external memory File.
void | SpillSmallestPartition () |
Spill all items of the smallest non-empty partition into an external memory File.
Flushing Mechanisms to Next Stage or Phase
template<typename Emit > |
void | FlushPartitionEmit (size_t partition_id, bool consume, bool, Emit emit) |
void | FlushPartition (size_t partition_id, bool consume, bool grow) |
void | FlushAll () |
Accessors
size_t | num_blocks () const |
Returns the number of blocks in the table.
Public Member Functions inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >
ReduceTable (Context &ctx, size_t dia_id, const KeyExtractor &key_extractor, const ReduceFunction &reduce_function, Emitter &emitter, size_t num_partitions, const ReduceConfig &config, bool immediate_flush, const IndexFunction &index_function, const KeyEqualFunction &key_equal_function) |
ReduceTable (const ReduceTable &)=delete |
non-copyable: delete copy-constructor
void | Dispose () |
Deallocate memory.
void | InitializeSkip () |
Initialize table for SkipPreReducePhase.
ReduceTable & | operator= (const ReduceTable &)=delete |
non-copyable: delete assignment operator
Context & | ctx () const |
Returns the context.
size_t | dia_id () const |
Returns dia_id_.
const KeyExtractor & | key_extractor () const |
Returns the key_extractor.
const ReduceFunction & | reduce_function () const |
Returns the reduce_function.
const Emitter & | emitter () const |
Returns emitter_.
const IndexFunction & | index_function () const |
Returns index_function_.
IndexFunction & | index_function () |
Returns index_function_ (mutable).
const KeyEqualFunction & | key_equal_function () const |
Returns key_equal_function_.
std::vector< data::File > & | partition_files () |
Returns the vector of partition files.
size_t | num_partitions () |
Returns the number of partitions.
size_t | num_buckets () const |
Returns num_buckets_.
size_t | num_buckets_per_partition () const |
Returns num_buckets_per_partition_.
size_t | limit_memory_bytes () const |
Returns limit_memory_bytes_.
size_t | limit_items_per_partition () const |
Returns limit_items_per_partition_.
size_t | items_per_partition (size_t id) const |
Returns items_per_partition_.
size_t | num_items () const |
Returns the total number of items in the table.
size_t | num_items_calc () const |
Returns the total number of items in the table.
common::Range | key_range (size_t partition_id) |
calculate key range for the given output partition
bool | has_spilled_data () const |
Returns whether any partition has spilled data into external memory.
bool | has_spilled_data_on_partition (size_t partition_id) |
Key | key (const TableItem &t) const |
TableItem | reduce (const TableItem &a, const TableItem &b) const |
IndexFunction::Result | calculate_index (const TableItem &kv) const |
Static Public Attributes
static constexpr size_t | block_size_ |
Static Public Attributes inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >
static constexpr bool | debug |
Private Types
using | Super = ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction > |
Private Attributes
BucketBlockPool | block_pool_ |
Bucket block pool.
std::vector< BucketBlock * > | buckets_ |
Storing the items.
Fixed Operational Parameters
size_t | limit_blocks_ |
Number of blocks in the table before some items are spilled.
size_t | max_items_per_partition_ |
Maximal number of items per partition.
size_t | max_blocks_per_partition_ |
Maximal number of blocks per partition.
Current Statistical Parameters
size_t | num_blocks_ = 0 |
Total number of blocks in the table.
Static Private Attributes
static constexpr size_t | bucket_block_size = ReduceConfig::bucket_block_size_ |
target number of bytes in a BucketBlock.
static constexpr bool | debug_items = false |
Additional Inherited Members
Protected Attributes inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >
Context & | ctx_ |
Context.
size_t | dia_id_ |
Associated DIA id.
Emitter & | emitter_ |
Emitter object to receive items output to the next phase.
IndexFunction | index_function_ |
Index Calculation functions: Hash or ByIndex.
KeyEqualFunction | key_equal_function_ |
Comparator function for keys.
KeyExtractor | key_extractor_ |
Key extractor function for extracting a key from a value.
std::vector< data::File > | partition_files_ |
Store the files for partitions.
ReduceFunction | reduce_function_ |
Reduce function for reducing two values.
const size_t | num_partitions_ |
Number of partitions.
ReduceConfig | config_ |
config of reduce table
size_t | num_buckets_ |
size_t | num_buckets_per_partition_ |
Partition size, the number of buckets per partition.
size_t | limit_memory_bytes_ |
Size of the table in bytes.
size_t | limit_items_per_partition_ |
Number of items in a partition before the partition is spilled.
bool | immediate_flush_ |
size_t | num_items_ |
Current number of items.
std::vector< size_t > | items_per_partition_ |
Current number of items per partition.
using BucketBlockIterator = typename std::vector<BucketBlock*>::iterator |
Definition at line 133 of file reduce_bucket_hash_table.hpp.
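using Super = ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction > |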
private |
Definition at line 98 of file reduce_bucket_hash_table.hpp.
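ReduceBucketHashTable (Context &ctx, size_t dia_id, const KeyExtractor &key_extractor, const ReduceFunction &reduce_function, Emitter &emitter, size_t num_partitions, const ReduceConfig &config=ReduceConfig(), bool immediate_flush=false, const IndexFunction &index_function=IndexFunction(), const KeyEqualFunction &key_equal_function=KeyEqualFunction()) |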
inline |
ReduceBucketHashTable (const ReduceBucketHashTable &) |
delete |
non-copyable: delete copy-constructor
~ReduceBucketHashTable () |
inline |
void Dispose () |
inline |
Deallocate memory.
Definition at line 334 of file reduce_bucket_hash_table.hpp.
References ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::block_pool_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::buckets_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlockPool::Destroy(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::destroy_items(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Dispose(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::next, and tlx::vector_free().
void FlushAll () |
inline |
Definition at line 528 of file reduce_bucket_hash_table.hpp.
References ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::FlushPartition(), and ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_partitions_.
void FlushPartition (size_t partition_id, bool consume, bool grow) |
inline |
Definition at line 520 of file reduce_bucket_hash_table.hpp.
References ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::emitter_, and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::FlushPartitionEmit().
Referenced by ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::FlushAll(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillPartition().
void FlushPartitionEmit (size_t partition_id, bool consume, bool, Emit emit) |
inline |
Definition at line 468 of file reduce_bucket_hash_table.hpp.
References ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::block_pool_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::buckets_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlockPool::Deallocate(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::items, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::items_per_partition_, LOG, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::next, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_blocks_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_items_calc(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::size.
void Initialize (size_t limit_memory_bytes) |
inline |
Construct the hash table itself and fill it with sentinels.
Definition at line 155 of file reduce_bucket_hash_table.hpp.
References ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::block_size_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::buckets_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::config_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::limit_blocks_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::limit_items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::limit_memory_bytes(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::limit_memory_bytes_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::max_blocks_per_partition_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::max_items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_buckets_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_partitions_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, 
ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::operator=(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::ReduceBucketHashTable(), and sLOG.
bool Insert (const TableItem &kv) |
inline |
Inserts a value into the table, potentially reducing it in case both the key of the value already in the table and the key of the value to be inserted are the same.
An insert may trigger a partial flush of the partition with the most items if the maximal number of items in the table (max_items_per_table_table) is reached.
Alternatively, it may trigger a resize of the table in case the maximal number of items per bucket is reached.
Parameters
kv | Value to be inserted into the table. |
Returns
true if a new key was inserted into the table
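The interaction between inserting and spilling can be pictured with a bookkeeping-only sketch. The names `items_per_partition_` and `limit_items_per_partition_` are taken from this page; the class `ToySpillCounter`, the exact comparison used, and the reset that stands in for `SpillPartition()` are assumptions, and no items or files are involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model that only tracks per-partition item counts.
class ToySpillCounter {
public:
    ToySpillCounter(std::size_t num_partitions,
                    std::size_t limit_items_per_partition)
        : items_per_partition_(num_partitions, 0),
          limit_items_per_partition_(limit_items_per_partition) { }

    // Record that a new key landed in partition_id. Returns true if the
    // partition exceeded its limit and a spill was simulated.
    bool RecordNewKey(std::size_t partition_id) {
        assert(partition_id < items_per_partition_.size());
        ++items_per_partition_[partition_id];
        if (items_per_partition_[partition_id] > limit_items_per_partition_) {
            // Stand-in for SpillPartition(partition_id): the items would be
            // written to the partition's File and removed from memory.
            items_per_partition_[partition_id] = 0;
            ++num_spills_;
            return true;
        }
        return false;
    }

    std::size_t items(std::size_t partition_id) const {
        return items_per_partition_[partition_id];
    }
    std::size_t num_spills() const { return num_spills_; }

private:
    std::vector<std::size_t> items_per_partition_;
    std::size_t limit_items_per_partition_;
    std::size_t num_spills_ = 0;
};
```

Reductions into an existing key would not call `RecordNewKey()` in this model, since they do not add an item to the partition.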
Definition at line 259 of file reduce_bucket_hash_table.hpp.
References ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::block_pool_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::block_size_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::buckets_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::calculate_index(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::debug_items, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlockPool::GetBlock(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::items, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::key(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::key_equal_function_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::limit_blocks_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::limit_items_per_partition_, LOGC, thrill::mem::memory_exceeded, ReduceBucketHashTable< 
TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::next, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_blocks_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_partitions_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::reduce(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::size, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillAnyPartition(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillPartition(), and TLX_UNLIKELY.
size_t num_blocks () const |
inline |
Returns the number of blocks in the table.
Definition at line 544 of file reduce_bucket_hash_table.hpp.
ReduceBucketHashTable & operator= (const ReduceBucketHashTable &) |
delete |
non-copyable: delete assignment operator
void SpillAnyPartition () |
inline |
Spill all items of an arbitrary partition into an external memory File.
Definition at line 360 of file reduce_bucket_hash_table.hpp.
void SpillLargestPartition () |
inline |
Spill all items of the largest partition into an external memory File.
Definition at line 419 of file reduce_bucket_hash_table.hpp.
References ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_partitions_, and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillPartition().
void SpillPartition (size_t partition_id) |
inline |
Spill all items of a partition into an external memory File.
Definition at line 366 of file reduce_bucket_hash_table.hpp.
References ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::block_pool_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::buckets_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlockPool::Deallocate(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::FlushPartition(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::immediate_flush_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::items, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::items_per_partition_, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::next, ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_blocks_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_items_calc(), ReduceTable< TableItem, Key, Value, KeyExtractor, 
ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::partition_files_, BlockWriter< BlockSink >::Put(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::BucketBlock::size, and sLOG.
Referenced by ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Insert(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillLargestPartition(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillSmallestPartition().
void SpillSmallestPartition () |
inline |
Spill all items of the smallest non-empty partition into an external memory File.
Definition at line 441 of file reduce_bucket_hash_table.hpp.
References ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::items_per_partition_, max(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_partitions_, and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillPartition().
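Selecting the smallest non-empty partition from the per-partition counts could look like the sketch below. The function name `SmallestNonEmptyPartition` and the convention of returning the number of partitions when all are empty are assumptions for illustration; only the idea of scanning `items_per_partition_` is taken from the references above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the partition with the fewest items among those
// that hold at least one item, or items_per_partition.size() if every
// partition is empty. Ties go to the lowest index.
std::size_t SmallestNonEmptyPartition(
    const std::vector<std::size_t>& items_per_partition) {
    const std::size_t none = items_per_partition.size();
    std::size_t best = none;
    for (std::size_t i = 0; i < items_per_partition.size(); ++i) {
        const std::size_t count = items_per_partition[i];
        if (count == 0) continue;  // empty partitions have nothing to spill
        if (best == none || count < items_per_partition[best]) best = i;
    }
    assert(best == none || items_per_partition[best] != 0);
    return best;
}
```

Skipping empty partitions matters here: spilling a partition with zero items would free no memory.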
BucketBlockPool block_pool_ |
private |
Bucket block pool.
Definition at line 629 of file reduce_bucket_hash_table.hpp.
Referenced by ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Dispose(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::FlushPartitionEmit(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Insert(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillPartition().
constexpr size_t block_size_ |
static |
Calculate the number of items such that each BucketBlock is about 1 MiB in size, with a minimum of 8 items.
Definition at line 110 of file reduce_bucket_hash_table.hpp.
Referenced by ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Initialize(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Insert().
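One way to read the description of `block_size_` is as a byte budget divided by the item size, clamped from below at 8. The sketch below follows that reading; `ToyBlockSize` and `ToyLargeItem` are hypothetical, and the actual expression in reduce_bucket_hash_table.hpp may differ.

```cpp
#include <cstddef>
#include <cstdint>

// Number of items per block for a given byte budget: as many items as fit
// into bucket_block_bytes, but never fewer than 8.
template <typename TableItem>
constexpr std::size_t ToyBlockSize(std::size_t bucket_block_bytes) {
    return bucket_block_bytes / sizeof(TableItem) < 8
           ? 8
           : bucket_block_bytes / sizeof(TableItem);
}

// Hypothetical oversized item: only two of these fit into 1 MiB.
struct ToyLargeItem { char payload[512 * 1024]; };

static_assert(ToyBlockSize<std::uint64_t>(1024 * 1024) == 131072,
              "8-byte items: 1 MiB / 8 B = 131072 items per block");
static_assert(ToyBlockSize<ToyLargeItem>(1024 * 1024) == 8,
              "large items are clamped to the minimum of 8 per block");
```

The lower bound means that blocks of very large items exceed the byte target instead of degenerating into one or two slots per block.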
constexpr size_t bucket_block_size = ReduceConfig::bucket_block_size_ |
static, private |
target number of bytes in a BucketBlock.
Definition at line 105 of file reduce_bucket_hash_table.hpp.
std::vector< BucketBlock * > buckets_ |
private |
Storing the items.
Definition at line 626 of file reduce_bucket_hash_table.hpp.
Referenced by ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Dispose(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::FlushPartitionEmit(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Initialize(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Insert(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillPartition().
constexpr bool debug_items = false |
static, private |
size_t limit_blocks_ |
private |
Number of blocks in the table before some items are spilled.
Definition at line 635 of file reduce_bucket_hash_table.hpp.
Referenced by ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Initialize(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Insert().
size_t max_blocks_per_partition_ |
private |
Maximal number of blocks per partition.
Definition at line 641 of file reduce_bucket_hash_table.hpp.
size_t max_items_per_partition_ |
private |
Maximal number of items per partition.
Definition at line 638 of file reduce_bucket_hash_table.hpp.
size_t num_blocks_ = 0 |
private |
Total number of blocks in the table.
Definition at line 649 of file reduce_bucket_hash_table.hpp.
Referenced by ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::FlushPartitionEmit(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::Insert(), ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::num_blocks(), and ReduceBucketHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig, IndexFunction, KeyEqualFunction >::SpillPartition().