mlpack::det::DTree Class Reference

A density estimation tree is similar to both a decision tree and a space partitioning tree (like a kd-tree). More...

Collaboration diagram for mlpack::det::DTree:
Collaboration graph
[legend]

List of all members.

Public Member Functions

 DTree (const arma::vec &maxVals, const arma::vec &minVals, const size_t totalPoints, const size_t start, const size_t end)
 Create a child node of a density estimation tree given the bounding box specified by maxVals and minVals, using the size given in start and end, and calculating the error with the total number of points given.
 DTree (const arma::vec &maxVals, const arma::vec &minVals, const size_t start, const size_t end, const double logNegError)
 Create a child node of a density estimation tree given the bounding box specified by maxVals and minVals, using the size given in start and end and the specified error.
 DTree (arma::mat &data)
 Create a density estimation tree on the given data.
 DTree (const arma::vec &maxVals, const arma::vec &minVals, const size_t totalPoints)
 Create a density estimation tree with the given bounds and the given number of total points.
 DTree ()
 Create an empty density estimation tree.
 ~DTree ()
 Clean up memory allocated by the tree.
double AlphaUpper () const
 Return the upper part of the alpha sum.
double ComputeValue (const arma::vec &query) const
 Compute the logarithm of the density estimate of a given query point.
void ComputeVariableImportance (arma::vec &importances) const
 Compute the variable importance of each dimension in the learned tree.
size_t End () const
 Return the first index of a point not contained in this node.
int FindBucket (const arma::vec &query) const
 Return the tag of the leaf containing the query.
double Grow (arma::mat &data, arma::Col< size_t > &oldFromNew, const bool useVolReg=false, const size_t maxLeafSize=10, const size_t minLeafSize=5)
 Greedily expand the tree.
DTreeLeft () const
 Return the left child.
double LogNegativeError (const size_t totalPoints) const
 Compute the log-negative-error for this point, given the total number of points in the dataset.
double LogNegError () const
 Return the log negative error of this node.
double LogVolume () const
 Return the inverse of the volume of this node.
arma::vec & MaxVals ()
 Modify the maximum values.
const arma::vec & MaxVals () const
 Return the maximum values.
arma::vec & MinVals ()
 Modify the minimum values.
const arma::vec & MinVals () const
 Return the minimum values.
double PruneAndUpdate (const double oldAlpha, const size_t points, const bool useVolReg=false)
 Perform alpha pruning on a tree.
double Ratio () const
 Return the ratio of points in this node to the points in the whole dataset.
DTreeRight () const
 Return the right child.
bool Root () const
 Return whether or not this is the root of the tree.
size_t SplitDim () const
 Return the split dimension of this node.
double SplitValue () const
 Return the split value of this node.
size_t Start () const
 Return the starting index of points contained in this node.
size_t SubtreeLeaves () const
 Return the number of leaves which are descendants of this node.
double SubtreeLeavesLogNegError () const
 Return the log negative error of all descendants of this node.
int TagTree (const int tag=0)
 Index the buckets for possible usage later; this results in every leaf in the tree having a specific tag (accessible with BucketTag()).
bool WithinRange (const arma::vec &query) const
 Return whether a query point is within the range of this node.
void WriteTree (FILE *fp, const size_t level=0) const
 Print the tree in a depth-first manner (this function is called recursively).

Private Member Functions

bool FindSplit (const arma::mat &data, size_t &splitDim, double &splitValue, double &leftError, double &rightError, const size_t maxLeafSize=10, const size_t minLeafSize=5) const
 Find the dimension to split on.
size_t SplitData (arma::mat &data, const size_t splitDim, const double splitValue, arma::Col< size_t > &oldFromNew) const
 Split the data, returning the number of points left of the split.

Private Attributes

double alphaUpper
 Upper part of alpha sum; used for pruning.
int bucketTag
 The tag for the leaf, used for hashing points.
size_t end
 The index of the last point in the dataset contained in this node (and its children).
DTreeleft
 The left child.
double logNegError
 log-negative-L2-error of the node.
double logVolume
 The logarithm of the volume of the node.
arma::vec maxVals
 Upper half of bounding box for this node.
arma::vec minVals
 Lower half of bounding box for this node.
double ratio
 Ratio of the number of points in the node to the total number of points.
DTreeright
 The right child.
bool root
 If true, this node is the root of the tree.
size_t splitDim
 The splitting dimension for this node.
double splitValue
 The split value on the splitting dimension for this node.
size_t start
 The index of the first point in the dataset contained in this node (and its children).
size_t subtreeLeaves
 Number of leaves of the subtree.
double subtreeLeavesLogNegError
 Sum of the error of the leaves of the subtree.

Detailed Description

A density estimation tree is similar to both a decision tree and a space partitioning tree (like a kd-tree).

Each leaf represents a constant-density hyper-rectangle. The tree is constructed in such a way as to minimize the integrated square error between the probability distribution of the tree and the observed probability distribution of the data. Because the tree is similar to a decision tree, the density estimation tree can provide very fast density estimates for a given point.

For more information, see the following paper:

 @incollection{ram2011,
   author = {Ram, Parikshit and Gray, Alexander G.},
   title = {Density estimation trees},
   booktitle = {{Proceedings of the 17th ACM SIGKDD International Conference
       on Knowledge Discovery and Data Mining}},
   series = {KDD '11},
   year = {2011},
   pages = {627--635}
 }

Definition at line 54 of file dtree.hpp.


Constructor & Destructor Documentation

mlpack::det::DTree::DTree (  ) 

Create an empty density estimation tree.

mlpack::det::DTree::DTree ( const arma::vec &  maxVals,
const arma::vec &  minVals,
const size_t  totalPoints 
)

Create a density estimation tree with the given bounds and the given number of total points.

Children will not be created.

Parameters:
maxVals Maximum values of the bounding box.
minVals Minimum values of the bounding box.
totalPoints Total number of points in the dataset.
mlpack::det::DTree::DTree ( arma::mat &  data  ) 

Create a density estimation tree on the given data.

Children will be created following the procedure outlined in the paper. The data will be modified; it will be reordered similar to the way BinarySpaceTree modifies datasets.

Parameters:
data Dataset to build tree on.
mlpack::det::DTree::DTree ( const arma::vec &  maxVals,
const arma::vec &  minVals,
const size_t  start,
const size_t  end,
const double  logNegError 
)

Create a child node of a density estimation tree given the bounding box specified by maxVals and minVals, using the size given in start and end and the specified error.

Children of this node will not be created recursively.

Parameters:
maxVals Upper bound of bounding box.
minVals Lower bound of bounding box.
start Start of points represented by this node in the data matrix.
end End of points represented by this node in the data matrix.
error log-negative error of this node.
mlpack::det::DTree::DTree ( const arma::vec &  maxVals,
const arma::vec &  minVals,
const size_t  totalPoints,
const size_t  start,
const size_t  end 
)

Create a child node of a density estimation tree given the bounding box specified by maxVals and minVals, using the size given in start and end, and calculating the error with the total number of points given.

Children of this node will not be created recursively.

Parameters:
maxVals Upper bound of bounding box.
minVals Lower bound of bounding box.
start Start of points represented by this node in the data matrix.
end End of points represented by this node in the data matrix.
mlpack::det::DTree::~DTree (  ) 

Clean up memory allocated by the tree.


Member Function Documentation

double mlpack::det::DTree::AlphaUpper (  )  const [inline]

Return the upper part of the alpha sum.

Definition at line 284 of file dtree.hpp.

References alphaUpper.

double mlpack::det::DTree::ComputeValue ( const arma::vec &  query  )  const

Compute the logarithm of the density estimate of a given query point.

Parameters:
query Point to estimate density of.
void mlpack::det::DTree::ComputeVariableImportance ( arma::vec &  importances  )  const

Compute the variable importance of each dimension in the learned tree.

Parameters:
importances Vector to store the calculated importances in.
size_t mlpack::det::DTree::End (  )  const [inline]

Return the first index of a point not contained in this node.

Definition at line 261 of file dtree.hpp.

int mlpack::det::DTree::FindBucket ( const arma::vec &  query  )  const

Return the tag of the leaf containing the query.

This is useful for generating class memberships.

Parameters:
query Query to search for.
bool mlpack::det::DTree::FindSplit ( const arma::mat &  data,
size_t &  splitDim,
double &  splitValue,
double &  leftError,
double &  rightError,
const size_t  maxLeafSize = 10,
const size_t  minLeafSize = 5 
) const [private]

Find the dimension to split on.

double mlpack::det::DTree::Grow ( arma::mat &  data,
arma::Col< size_t > &  oldFromNew,
const bool  useVolReg = false,
const size_t  maxLeafSize = 10,
const size_t  minLeafSize = 5 
)

Greedily expand the tree.

The points in the dataset will be reordered during tree growth.

Parameters:
data Dataset to build tree on.
oldFromNew Mappings from old points to new points.
useVolReg If true, volume regularization is used.
maxLeafSize Maximum size of a leaf.
minLeafSize Minimum size of a leaf.
DTree* mlpack::det::DTree::Left (  )  const [inline]

Return the left child.

Definition at line 278 of file dtree.hpp.

References left.

double mlpack::det::DTree::LogNegativeError ( const size_t  totalPoints  )  const

Compute the log-negative-error for this point, given the total number of points in the dataset.

Parameters:
totalPoints Total number of points in the dataset.
double mlpack::det::DTree::LogNegError (  )  const [inline]

Return the log negative error of this node.

Definition at line 267 of file dtree.hpp.

double mlpack::det::DTree::LogVolume (  )  const [inline]

Return the inverse of the volume of this node.

Definition at line 276 of file dtree.hpp.

References logVolume.

arma::vec& mlpack::det::DTree::MaxVals (  )  [inline]

Modify the maximum values.

Definition at line 289 of file dtree.hpp.

const arma::vec& mlpack::det::DTree::MaxVals (  )  const [inline]

Return the maximum values.

Definition at line 287 of file dtree.hpp.

arma::vec& mlpack::det::DTree::MinVals (  )  [inline]

Modify the minimum values.

Definition at line 294 of file dtree.hpp.

const arma::vec& mlpack::det::DTree::MinVals (  )  const [inline]

Return the minimum values.

Definition at line 292 of file dtree.hpp.

double mlpack::det::DTree::PruneAndUpdate ( const double  oldAlpha,
const size_t  points,
const bool  useVolReg = false 
)

Perform alpha pruning on a tree.

Returns the new value of alpha.

Parameters:
oldAlpha Old value of alpha.
points Total number of points in dataset.
useVolReg If true, volume regularization is used.
Returns:
New value of alpha.
double mlpack::det::DTree::Ratio (  )  const [inline]

Return the ratio of points in this node to the points in the whole dataset.

Definition at line 274 of file dtree.hpp.

References ratio.

DTree* mlpack::det::DTree::Right (  )  const [inline]

Return the right child.

Definition at line 280 of file dtree.hpp.

References right.

bool mlpack::det::DTree::Root (  )  const [inline]

Return whether or not this is the root of the tree.

Definition at line 282 of file dtree.hpp.

References root.

size_t mlpack::det::DTree::SplitData ( arma::mat &  data,
const size_t  splitDim,
const double  splitValue,
arma::Col< size_t > &  oldFromNew 
) const [private]

Split the data, returning the number of points left of the split.

size_t mlpack::det::DTree::SplitDim (  )  const [inline]

Return the split dimension of this node.

Definition at line 263 of file dtree.hpp.

References splitDim.

double mlpack::det::DTree::SplitValue (  )  const [inline]

Return the split value of this node.

Definition at line 265 of file dtree.hpp.

References splitValue.

size_t mlpack::det::DTree::Start (  )  const [inline]

Return the starting index of points contained in this node.

Definition at line 259 of file dtree.hpp.

size_t mlpack::det::DTree::SubtreeLeaves (  )  const [inline]

Return the number of leaves which are descendants of this node.

Definition at line 271 of file dtree.hpp.

References subtreeLeaves.

double mlpack::det::DTree::SubtreeLeavesLogNegError (  )  const [inline]

Return the log negative error of all descendants of this node.

Definition at line 269 of file dtree.hpp.

References subtreeLeavesLogNegError.

int mlpack::det::DTree::TagTree ( const int  tag = 0  ) 

Index the buckets for possible usage later; this results in every leaf in the tree having a specific tag (accessible with BucketTag()).

This function calls itself recursively.

Parameters:
tag Tag for the next leaf; leave at 0 for the initial call.
bool mlpack::det::DTree::WithinRange ( const arma::vec &  query  )  const

Return whether a query point is within the range of this node.

void mlpack::det::DTree::WriteTree ( FILE *  fp,
const size_t  level = 0 
) const

Print the tree in a depth-first manner (this function is called recursively).

Parameters:
fp File to write the tree to.
level Level of the tree (should start at 0).

Member Data Documentation

Upper part of alpha sum; used for pruning.

Definition at line 250 of file dtree.hpp.

Referenced by AlphaUpper().

The tag for the leaf, used for hashing points.

Definition at line 247 of file dtree.hpp.

size_t mlpack::det::DTree::end [private]

The index of the last point in the dataset contained in this node (and its children).

Definition at line 215 of file dtree.hpp.

The left child.

Definition at line 253 of file dtree.hpp.

Referenced by Left().

log-negative-L2-error of the node.

Definition at line 229 of file dtree.hpp.

The logarithm of the volume of the node.

Definition at line 244 of file dtree.hpp.

Referenced by LogVolume().

arma::vec mlpack::det::DTree::maxVals [private]

Upper half of bounding box for this node.

Definition at line 218 of file dtree.hpp.

arma::vec mlpack::det::DTree::minVals [private]

Lower half of bounding box for this node.

Definition at line 220 of file dtree.hpp.

double mlpack::det::DTree::ratio [private]

Ratio of the number of points in the node to the total number of points.

Definition at line 241 of file dtree.hpp.

Referenced by Ratio().

The right child.

Definition at line 255 of file dtree.hpp.

Referenced by Right().

bool mlpack::det::DTree::root [private]

If true, this node is the root of the tree.

Definition at line 238 of file dtree.hpp.

Referenced by Root().

size_t mlpack::det::DTree::splitDim [private]

The splitting dimension for this node.

Definition at line 223 of file dtree.hpp.

Referenced by SplitDim().

The split value on the splitting dimension for this node.

Definition at line 226 of file dtree.hpp.

Referenced by SplitValue().

size_t mlpack::det::DTree::start [private]

The index of the first point in the dataset contained in this node (and its children).

Definition at line 212 of file dtree.hpp.

Number of leaves of the subtree.

Definition at line 235 of file dtree.hpp.

Referenced by SubtreeLeaves().

Sum of the error of the leaves of the subtree.

Definition at line 232 of file dtree.hpp.

Referenced by SubtreeLeavesLogNegError().


The documentation for this class was generated from the following file:

Generated on 13 Aug 2014 for MLPACK by  doxygen 1.6.1