Decision Trees
This library provides support for creating and applying decision trees.
Basic Usage
For an example of how to use this library, see the Gradient Boosted Decision Trees tutorial.
Interface
This library includes support both for decision trees with integer features and
labels (DecisionTree) and for decision trees with real-valued features and
labels (DecisionTreeReal).
Types
type label
- Labels (the decisions that a decision tree makes) are either int or real.
type feature
- Individual features are either int * int or int * real, where the left-hand side is the feature ID and the right-hand side is the feature value.
type features
- Feature vectors are either an int array or a real array, where the index into the array is the feature ID.
type t
- DecisionTree.t and DecisionTreeReal.t are the types of the decision trees themselves.
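As a rough illustration of how these types relate in the integer variant, the following sketch declares them locally. It is a hypothetical model, not the library's source: the bindings ft, f, and valueInVector are made up for the example, and only the shapes of label, feature, and features are taken from the descriptions above.

```sml
(* Hypothetical model of the integer-variant types; not the library's source. *)
type label = int
type feature = int * int          (* (feature ID, feature value) *)
type features = int array         (* the array index is the feature ID *)

(* A feature vector holding values for feature IDs 0, 1 and 2. *)
val ft : features = Array.fromList [5, 12, 7]

(* A single feature: feature ID 1 paired with the value 10. *)
val f : feature = (1, 10)

(* Look up the value the vector holds for the feature's ID. *)
val valueInVector = Array.sub (ft, #1 f)   (* 12 *)
```

For the real-valued variant, the same sketch would presumably use real in place of int for labels and feature values, while feature IDs stay integers.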
Methods
val lessFeature: features * (int * label) -> bool
val forward: t * features -> label
val makeLf: label -> t
val makeNd: t * feature * t -> t
val toString: t -> string
val split: (features * label) list * (int * label) -> ((features * label) list) * ((features * label) list)
val recordsToString: (features * label) list -> string
val leafNum: t -> int
Method Overview
lessFeature (ft, (featureId, featureValue))
- Determines whether the value of the given feature is less than the feature value in the feature vector ft.
forward (dt, ft)
- Runs the decision tree dt to produce a label for the feature vector ft.
makeLf label
- Creates a decision tree that is just a leaf node and always returns the same label, label.
makeNd (dt1, feature, dt2)
- Creates a decision tree that uses the left tree (dt1) to get a label if the feature vector being processed is less than feature, and uses the right tree (dt2) otherwise.
toString dt
- Converts a decision tree to a string.
split (data, (featureId, featureValue))
- Splits a list of input data by a feature. The input data is a collection of feature vectors and labels.
recordsToString data
- Converts a list of feature vectors and their labels to a string.
leafNum dt
- Returns the number of leaves in a decision tree.
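The behaviour described above can be modelled with a small self-contained sketch. This is not the library's implementation: the structure name TreeSketch, the datatype representation, and the helper goesLeft are all hypothetical. The sketch assumes that a node sends a vector to the left tree when the vector's value for the feature ID is strictly below the feature value, and that split returns the records passing that test first. Neither detail is confirmed by the text above.

```sml
(* Hypothetical model of the integer variant. Function names mirror the
   interface above, but the representation is an assumption. *)
structure TreeSketch =
struct
  type label = int
  type feature = int * int
  type features = int array

  datatype t = Lf of label
             | Nd of t * feature * t

  fun makeLf (l : label) : t = Lf l
  fun makeNd (left : t, f : feature, right : t) : t = Nd (left, f, right)

  (* Assumed test: the vector's value for the feature ID is below the
     feature value. *)
  fun goesLeft (ft : features, (id, v) : feature) : bool =
    Array.sub (ft, id) < v

  (* Walk from the root to a leaf and return that leaf's label. *)
  fun forward (Lf l, _) = l
    | forward (Nd (left, f, right), ft) =
        if goesLeft (ft, f) then forward (left, ft) else forward (right, ft)

  fun leafNum (Lf _) = 1
    | leafNum (Nd (left, _, right)) = leafNum left + leafNum right

  (* Assumed ordering: records that pass the test come first. *)
  fun split (data : (features * label) list, f : feature) =
    List.partition (fn (ft, _) => goesLeft (ft, f)) data
end

(* One decision node on feature 1 with threshold 10, and two leaves. *)
val tree = TreeSketch.makeNd (TreeSketch.makeLf 0, (1, 10), TreeSketch.makeLf 1)

val a = TreeSketch.forward (tree, Array.fromList [5, 3, 7])    (* 0: 3 < 10 *)
val b = TreeSketch.forward (tree, Array.fromList [5, 12, 7])   (* 1: 12 >= 10 *)
val n = TreeSketch.leafNum tree                                (* 2 *)

val records = [(Array.fromList [5, 3, 7], 0), (Array.fromList [5, 12, 7], 1)]
val (below, rest) = TreeSketch.split (records, (1, 10))
```

With the real library, the equivalent calls would presumably go through DecisionTree or DecisionTreeReal rather than a local structure.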