Thrill  0.1
ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction > Class Template Reference

Detailed Description

template<typename TableItem, typename Key, typename Value, typename KeyExtractor, typename ReduceFunction, typename Emitter, const bool VolatileKey, typename ReduceConfig_, typename IndexFunction, typename KeyEqualFunction = std::equal_to<Key>>
class thrill::core::ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >

A data structure which takes an arbitrary value and extracts a key using a key extractor function from that value.

A key may also be provided initially as part of a key/value pair, not requiring to extract a key.

Afterwards, the key is hashed and the hash is used to assign that key/value pair to some slot.

In case a slot already has a key/value pair and the key of that value and the key of the value to be inserted are them same, the values are reduced according to some reduce function. No key/value is added to the data structure.

If the keys are different, the next slot (moving to the right) is considered. If the slot is occupied, the same procedure happens again (know as linear probing.)

Finally, the key/value pair to be inserted may either:

1.) Be reduced with some other key/value pair, sharing the same key. 2.) Inserted at a free slot. 3.) Trigger a resize of the data structure in case there are no more free slots in the data structure.

The following illustrations shows the general structure of the data structure. The set of slots is divided into 1..n partitions. Each key is hashed into exactly one partition.

Partition 0 Partition 1 Partition 2 Partition 3 Partition 4
P00 P01 P02 P10 P11 P12 P20 P21 P22 P30 P31 P32 P40 P41 P42

+—+—+—+—+—+—+—+—+—+—+—+—+—+—+—+ || | | || | | || | | || | | || | | || +—+—+—+—+—+—+—+—+—+—+—+—+—+—+—+ <- LI -> LI..Local Index <- GI -> GI..Global Index PI 0 PI 1 PI 2 PI 3 PI 4 PI..Partition ID

Definition at line 77 of file reduce_old_probing_hash_table.hpp.

+ Inheritance diagram for ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >:
+ Collaboration diagram for ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >:

#include <reduce_old_probing_hash_table.hpp>

Public Types

using ReduceConfig = ReduceConfig_
 
using TableItemIterator = typename std::vector< TableItem >::iterator
 
- Public Types inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >
using MakeTableItem = ReduceMakeTableItem< Value, TableItem, VolatileKey >
 
using ReduceConfig = ReduceConfig_
 
using TableItem = typename std::conditional< VolatileKey, std::pair< Key, Value >, Value >::type
 

Public Member Functions

 ReduceOldProbingHashTable (Context &ctx, size_t dia_id, const KeyExtractor &key_extractor, const ReduceFunction &reduce_function, Emitter &emitter, size_t num_partitions, const ReduceConfig &config=ReduceConfig(), bool immediate_flush=false, const IndexFunction &index_function=IndexFunction(), const KeyEqualFunction &key_equal_function=KeyEqualFunction())
 
void Dispose ()
 Deallocate memory. More...
 
void Initialize (size_t limit_memory_bytes)
 
bool Insert (const TableItem &kv)
 Inserts a value into the table, potentially reducing it in case both the key of the value already in the table and the key of the value to be inserted are the same. More...
 
Spilling Mechanisms to External Memory Files
void SpillPartition (size_t partition_id)
 Spill all items of a partition into an external memory File. More...
 
void SpillAnyPartition ()
 Spill all items of an arbitrary partition into an external memory File. More...
 
void SpillLargestPartition ()
 Spill all items of the largest partition into an external memory File. More...
 
Flushing Mechanisms to Next Stage or Phase
template<typename Emit >
void FlushPartitionEmit (size_t partition_id, bool consume, bool, Emit emit)
 
void FlushPartition (size_t partition_id, bool consume, bool grow)
 
void FlushAll ()
 
- Public Member Functions inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >
 ReduceTable (Context &ctx, size_t dia_id, const KeyExtractor &key_extractor, const ReduceFunction &reduce_function, Emitter &emitter, size_t num_partitions, const ReduceConfig &config, bool immediate_flush, const IndexFunction &index_function, const KeyEqualFunction &key_equal_function)
 
 ReduceTable (const ReduceTable &)=delete
 non-copyable: delete copy-constructor More...
 
void Dispose ()
 Deallocate memory. More...
 
void InitializeSkip ()
 Initialize table for SkipPreReducePhase. More...
 
ReduceTableoperator= (const ReduceTable &)=delete
 non-copyable: delete assignment operator More...
 
Contextctx () const
 Returns the context. More...
 
size_t dia_id () const
 Returns dia_id_. More...
 
const KeyExtractor & key_extractor () const
 Returns the key_extractor. More...
 
const ReduceFunction & reduce_function () const
 Returns the reduce_function. More...
 
const Emitter & emitter () const
 Returns emitter_. More...
 
const IndexFunction & index_function () const
 Returns index_function_. More...
 
IndexFunction & index_function ()
 Returns index_function_ (mutable) More...
 
const KeyEqualFunction & key_equal_function () const
 Returns key_equal_function_. More...
 
std::vector< data::File > & partition_files ()
 Returns the vector of partition files. More...
 
size_t num_partitions ()
 Returns the number of partitions. More...
 
size_t num_buckets () const
 Returns num_buckets_. More...
 
size_t num_buckets_per_partition () const
 Returns num_buckets_per_partition_. More...
 
size_t limit_memory_bytes () const
 Returns limit_memory_bytes_. More...
 
size_t limit_items_per_partition () const
 Returns limit_items_per_partition_. More...
 
size_t items_per_partition (size_t id) const
 Returns items_per_partition_. More...
 
size_t num_items () const
 Returns the total num of items in the table. More...
 
size_t num_items_calc () const
 Returns the total num of items in the table. More...
 
common::Range key_range (size_t partition_id)
 calculate key range for the given output partition More...
 
bool has_spilled_data () const
 returns whether and partition has spilled data into external memory. More...
 
bool has_spilled_data_on_partition (size_t partition_id)
 
Key key (const TableItem &t) const
 
TableItem reduce (const TableItem &a, const TableItem &b) const
 
IndexFunction::Result calculate_index (const TableItem &kv) const
 

Private Types

using Super = ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >
 

Private Attributes

std::vector< TableItemitems_
 Storing the actual hash table. More...
 
size_t sentinel_partition_ = invalid_partition_
 

Static Private Attributes

static constexpr bool debug_items = false
 
static constexpr size_t invalid_partition_ = size_t(-1)
 sentinel for invalid partition or no sentinel. More...
 

Additional Inherited Members

- Static Public Attributes inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >
static constexpr bool debug
 
- Protected Attributes inherited from ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >
Contextctx_
 Context. More...
 
size_t dia_id_
 Associated DIA id. More...
 
Emitter & emitter_
 Emitter object to receive items outputted to next phase. More...
 
IndexFunction index_function_
 Index Calculation functions: Hash or ByIndex. More...
 
KeyEqualFunction key_equal_function_
 Comparator function for keys. More...
 
KeyExtractor key_extractor_
 Key extractor function for extracting a key from a value. More...
 
std::vector< data::Filepartition_files_
 Store the files for partitions. More...
 
ReduceFunction reduce_function_
 Reduce function for reducing two values. More...
 
const size_t num_partitions_
 Number of partitions. More...
 
ReduceConfig config_
 config of reduce table More...
 
size_t num_buckets_
 
size_t num_buckets_per_partition_
 Partition size, the number of buckets per partition. More...
 
size_t limit_memory_bytes_
 Size of the table in bytes. More...
 
size_t limit_items_per_partition_
 Number of items in a partition before the partition is spilled. More...
 
bool immediate_flush_
 
size_t num_items_
 Current number of items. More...
 
std::vector< size_t > items_per_partition_
 Current number of items per partition. More...
 

Member Typedef Documentation

◆ ReduceConfig

using ReduceConfig = ReduceConfig_

Definition at line 91 of file reduce_old_probing_hash_table.hpp.

◆ Super

using Super = ReduceTable<TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction>
private

Definition at line 86 of file reduce_old_probing_hash_table.hpp.

◆ TableItemIterator

using TableItemIterator = typename std::vector<TableItem>::iterator

Definition at line 92 of file reduce_old_probing_hash_table.hpp.

Constructor & Destructor Documentation

◆ ReduceOldProbingHashTable()

ReduceOldProbingHashTable ( Context ctx,
size_t  dia_id,
const KeyExtractor &  key_extractor,
const ReduceFunction &  reduce_function,
Emitter &  emitter,
size_t  num_partitions,
const ReduceConfig config = ReduceConfig(),
bool  immediate_flush = false,
const IndexFunction &  index_function = IndexFunction(),
const KeyEqualFunction &  key_equal_function = KeyEqualFunction() 
)
inline

Definition at line 94 of file reduce_old_probing_hash_table.hpp.

Member Function Documentation

◆ Dispose()

◆ FlushAll()

void FlushAll ( )
inline

Definition at line 385 of file reduce_old_probing_hash_table.hpp.

References ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::calculate_index(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::config_, ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::FlushPartition(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::immediate_flush_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::index_function_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key_equal_function_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::limit_items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::limit_memory_bytes_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_partitions_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::partition_files_, and ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::reduce().

◆ FlushPartition()

◆ FlushPartitionEmit()

void FlushPartitionEmit ( size_t  partition_id,
bool  consume,
bool  ,
Emit  emit 
)
inline

Definition at line 338 of file reduce_old_probing_hash_table.hpp.

References ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::invalid_partition_, ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key_equal_function_, LOG, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_items_calc(), and ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::sentinel_partition_.

Referenced by ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::FlushPartition().

◆ Initialize()

void Initialize ( size_t  limit_memory_bytes)
inline

◆ Insert()

bool Insert ( const TableItem kv)
inline

Inserts a value into the table, potentially reducing it in case both the key of the value already in the table and the key of the value to be inserted are the same.

An insert may trigger a partial flush of the partition with the most items if the maximal number of items in the table (max_num_items_table) is reached.

Alternatively, it may trigger a resize of the table in case the maximal fill ratio per partition is reached.

Parameters
kvValue to be inserted into the table.
\return true if a new key was inserted to the table

Definition at line 171 of file reduce_old_probing_hash_table.hpp.

References ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::calculate_index(), ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::invalid_partition_, ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key_equal_function_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::limit_items_per_partition_, thrill::mem::memory_exceeded, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_partitions_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::reduce(), thrill::mem::sentinel, ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::sentinel_partition_, ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::SpillAnyPartition(), ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::SpillPartition(), and TLX_UNLIKELY.

◆ SpillAnyPartition()

◆ SpillLargestPartition()

◆ SpillPartition()

void SpillPartition ( size_t  partition_id)
inline

Spill all items of a partition into an external memory File.

Definition at line 262 of file reduce_old_probing_hash_table.hpp.

References ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::FlushPartition(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::immediate_flush_, ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::invalid_partition_, ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::items_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::key_equal_function_, LOG, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_buckets_per_partition_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_items_, ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::num_items_calc(), ReduceTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::partition_files_, BlockWriter< BlockSink >::Put(), and ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::sentinel_partition_.

Referenced by ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::Insert(), and ReduceOldProbingHashTable< TableItem, Key, Value, KeyExtractor, ReduceFunction, Emitter, VolatileKey, ReduceConfig_, IndexFunction, KeyEqualFunction >::SpillLargestPartition().

Member Data Documentation

◆ debug_items

constexpr bool debug_items = false
staticprivate

Definition at line 88 of file reduce_old_probing_hash_table.hpp.

◆ invalid_partition_

◆ items_

◆ sentinel_partition_


The documentation for this class was generated from the following file: