Thrill  0.1
examples::k_means Namespace Reference

Classes

struct  CentroidAccumulated
 A point which contains "count" accumulated vectors.
 
struct  ClosestCentroid
 Assignment of a point to a cluster, which is the input to.
 
class  KMeansModel
 Model returned by KMeans algorithm containing results.
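
To illustrate what "a point which contains count accumulated vectors" can look like in practice, here is a minimal standalone sketch. It is not the Thrill struct: the name AccumulatedSketch, the helper functions Merge and Mean, and the std::array stand-in for the point type are all assumptions made for illustration. Only the member name p (listed in the references of KMeans()) and the notion of a count come from this page.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a fixed-dimensional point (not thrill::common::Vector).
template <std::size_t D>
using PointSketch = std::array<double, D>;

// Running sum of `count` points. Only `p` and the idea of a count are taken
// from the reference page; the rest is an assumption.
template <std::size_t D>
struct AccumulatedSketch {
    PointSketch<D> p;
    std::size_t count;
};

// Combine two partial sums. The operation is associative, so it can serve
// as the combining step of a reduction.
template <std::size_t D>
AccumulatedSketch<D> Merge(const AccumulatedSketch<D>& a,
                           const AccumulatedSketch<D>& b) {
    AccumulatedSketch<D> r{};
    for (std::size_t i = 0; i < D; ++i) r.p[i] = a.p[i] + b.p[i];
    r.count = a.count + b.count;
    return r;
}

// Turn an accumulated sum into a centroid by dividing by the count.
template <std::size_t D>
PointSketch<D> Mean(const AccumulatedSketch<D>& acc) {
    assert(acc.count > 0);
    PointSketch<D> m{};
    for (std::size_t i = 0; i < D; ++i)
        m[i] = acc.p[i] / static_cast<double>(acc.count);
    return m;
}
```

Keeping the sum and the count separate until the end is what lets partial results from different workers be merged in any order before the mean is taken.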
 

Typedefs

template<size_t D>
using Point = thrill::common::Vector< D, double >
 Compile-Time Fixed-Dimensional Points.
 
template<typename Point >
using PointClusterId = std::pair< Point, size_t >
 
using VPoint = thrill::common::VVector< double >
 A variable D-dimensional point with double precision.
 

Functions

template<typename Point , typename InStack >
auto BisecKMeans (const DIA< Point, InStack > &input_points, size_t dimensions, size_t num_clusters, size_t iterations, double epsilon)
 Calculate k-Means using bisecting method.
 
template<typename Point , typename InStack >
auto KMeans (const DIA< Point, InStack > &input_points, size_t dimensions, size_t num_clusters, size_t iterations, double epsilon=0.0)
 Calculate k-Means using Lloyd's Algorithm.
 

Typedef Documentation

◆ Point

using Point = thrill::common::Vector<D, double>

Compile-Time Fixed-Dimensional Points.

◆ PointClusterId

using PointClusterId = std::pair<Point, size_t>

Definition at line 45 of file k-means.hpp.
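
As a rough illustration of how a point/cluster-id pair can be produced, the sketch below tags a point with the index of its nearest centroid. It is standalone C++ and does not use Thrill: Point2, PointClusterIdSketch, DistanceSquare, and Classify are hypothetical names chosen for this example, and a std::array stands in for the real point type.

```cpp
#include <array>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical 2-D point used only in this sketch.
using Point2 = std::array<double, 2>;
// Same shape as the PointClusterId alias: a point plus a cluster index.
using PointClusterIdSketch = std::pair<Point2, std::size_t>;

// Squared Euclidean distance between two points.
double DistanceSquare(const Point2& a, const Point2& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Pair a point with the index of the closest centroid.
// Returns index 0 if `centroids` is empty (an arbitrary choice for the sketch).
PointClusterIdSketch Classify(const Point2& p,
                              const std::vector<Point2>& centroids) {
    std::size_t best = 0;
    double best_dist = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        const double d = DistanceSquare(p, centroids[c]);
        if (d < best_dist) {
            best_dist = d;
            best = c;
        }
    }
    return {p, best};
}
```

The second element of the pair is an index into the centroid list, so the pair stays valid only as long as the centroid order does not change.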

◆ VPoint

using VPoint = thrill::common::VVector<double>

A variable D-dimensional point with double precision.

Definition at line 42 of file k-means.hpp.

Function Documentation

◆ BisecKMeans()

auto examples::k_means::BisecKMeans ( const DIA< Point, InStack > &  input_points,
size_t  dimensions,
size_t  num_clusters,
size_t  iterations,
double  epsilon 
)

Calculate k-Means using bisecting method.

The implementation keeps track of an initial cluster size and of a model that is steadily updated and returned to the calling function.

Definition at line 258 of file k-means.hpp.

References ClosestCentroid< Point >::center, ClosestCentroid< Point >::cluster_id, and KMeans().

Referenced by RunKMeansFile(), and RunKMeansGenerated().
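
For readers unfamiliar with the bisecting method, the following is a minimal sequential sketch of the general technique: start with one cluster and repeatedly split a cluster in two with a small 2-means run until the requested number of clusters is reached. It is not the Thrill implementation. Choosing the largest cluster to split, seeding the split with the first and last point, and all names (BisectingSketch, SplitInTwo, Mean, Cluster) are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using VPointSketch = std::vector<double>;    // stand-in for a variable-dimension point
using Cluster = std::vector<VPointSketch>;

double DistanceSquare(const VPointSketch& a, const VPointSketch& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Component-wise mean of a non-empty cluster.
VPointSketch Mean(const Cluster& c) {
    VPointSketch m(c.front().size(), 0.0);
    for (const VPointSketch& p : c)
        for (std::size_t i = 0; i < m.size(); ++i) m[i] += p[i];
    for (double& x : m) x /= static_cast<double>(c.size());
    return m;
}

// Split one cluster with a 2-means run. Seeding with the first and last
// point is an assumption of this sketch.
std::pair<Cluster, Cluster> SplitInTwo(const Cluster& c, std::size_t iterations) {
    VPointSketch a = c.front(), b = c.back();
    Cluster left, right;
    for (std::size_t it = 0; it < iterations; ++it) {
        left.clear();
        right.clear();
        for (const VPointSketch& p : c)
            (DistanceSquare(p, a) <= DistanceSquare(p, b) ? left : right).push_back(p);
        if (left.empty() || right.empty()) break;  // degenerate split
        a = Mean(left);
        b = Mean(right);
    }
    return {left, right};
}

// Returns one centroid per cluster; may return fewer than num_clusters
// if no cluster can be split further.
std::vector<VPointSketch> BisectingSketch(const Cluster& points,
                                          std::size_t num_clusters,
                                          std::size_t iterations) {
    assert(!points.empty() && num_clusters >= 1 && iterations >= 1);
    std::vector<Cluster> clusters;
    clusters.push_back(points);
    while (clusters.size() < num_clusters) {
        // Assumption: always bisect the cluster with the most points.
        auto largest = std::max_element(
            clusters.begin(), clusters.end(),
            [](const Cluster& x, const Cluster& y) { return x.size() < y.size(); });
        if (largest->size() < 2) break;
        std::pair<Cluster, Cluster> halves = SplitInTwo(*largest, iterations);
        if (halves.first.empty() || halves.second.empty()) break;
        *largest = halves.first;
        clusters.push_back(halves.second);
    }
    std::vector<VPointSketch> centroids;
    for (const Cluster& c : clusters) centroids.push_back(Mean(c));
    return centroids;
}
```

Other selection rules, such as splitting the cluster with the largest within-cluster error, are equally common; the largest-cluster rule is used here only because it keeps the sketch short.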

◆ KMeans()

auto examples::k_means::KMeans ( const DIA< Point, InStack > &  input_points,
size_t  dimensions,
size_t  num_clusters,
size_t  iterations,
double  epsilon = 0.0 
)

Calculate k-Means using Lloyd's Algorithm. The DIA centroids is both an input and an output parameter. For each input point, the method returns a std::pair<Point2D, size_t> (Point2DClusterId) whose second element is an index into the centroids.

Definition at line 175 of file k-means.hpp.

References DIA< ValueType_, Stack_ >::Cache(), ClosestCentroid< Point >::center, ClosestCentroid< Point >::cluster_id, Vector< D, Type >::DistanceSquare(), and CentroidAccumulated< Point >::p.

Referenced by BisecKMeans(), RunKMeansFile(), and RunKMeansGenerated().
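
As a reminder of what Lloyd's Algorithm does, here is a minimal sequential sketch: assign every point to its nearest centroid, recompute each centroid as the mean of its assigned points, and repeat. It is standalone C++ and not the Thrill code, which operates on DIAs. The parameter names mirror the signature above, but several details are assumptions of this sketch: seeding with the first num_clusters points, treating epsilon as an early-stop threshold on centroid movement (with 0.0 disabling the early stop), and the names LloydSketch and Nearest.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using VPointSketch = std::vector<double>;  // stand-in for a variable-dimension point

double DistanceSquare(const VPointSketch& a, const VPointSketch& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Index of the centroid closest to p.
std::size_t Nearest(const VPointSketch& p,
                    const std::vector<VPointSketch>& centroids) {
    std::size_t best = 0;
    double best_dist = std::numeric_limits<double>::max();
    for (std::size_t c = 0; c < centroids.size(); ++c) {
        const double d = DistanceSquare(p, centroids[c]);
        if (d < best_dist) {
            best_dist = d;
            best = c;
        }
    }
    return best;
}

std::vector<VPointSketch> LloydSketch(const std::vector<VPointSketch>& points,
                                      std::size_t dimensions,
                                      std::size_t num_clusters,
                                      std::size_t iterations,
                                      double epsilon = 0.0) {
    assert(num_clusters > 0 && points.size() >= num_clusters);
    // Assumption: seed with the first num_clusters points.
    std::vector<VPointSketch> centroids(points.begin(),
                                        points.begin() + num_clusters);
    for (std::size_t it = 0; it < iterations; ++it) {
        // Assignment step: accumulate per-cluster sums and counts.
        std::vector<VPointSketch> sums(num_clusters, VPointSketch(dimensions, 0.0));
        std::vector<std::size_t> counts(num_clusters, 0);
        for (const VPointSketch& p : points) {
            const std::size_t c = Nearest(p, centroids);
            for (std::size_t d = 0; d < dimensions; ++d) sums[c][d] += p[d];
            ++counts[c];
        }
        // Update step: move each centroid to the mean of its points.
        double max_shift = 0.0;
        for (std::size_t c = 0; c < num_clusters; ++c) {
            if (counts[c] == 0) continue;  // empty cluster keeps its centroid
            for (std::size_t d = 0; d < dimensions; ++d)
                sums[c][d] /= static_cast<double>(counts[c]);
            max_shift = std::max(max_shift, DistanceSquare(sums[c], centroids[c]));
            centroids[c] = sums[c];
        }
        // Assumption: epsilon bounds the centroid movement; 0.0 never stops early.
        if (epsilon > 0.0 && max_shift < epsilon * epsilon) break;
    }
    return centroids;
}
```

The per-cluster sum and count in the assignment step play the same role as an accumulated centroid: they can be built independently per partition and merged before the division.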