GestureRecognitionToolkit
Version: 0.1.0
The Gesture Recognition Toolkit (GRT) is a cross-platform, open-source, C++ machine learning library for real-time gesture recognition.
#include <ClassificationData.h>
Public Member Functions | |
ClassificationData (UINT numDimensions=0, std::string datasetName="NOT_SET", std::string infoText="") | |
ClassificationData (const ClassificationData &rhs) | |
virtual | ~ClassificationData () |
ClassificationData & | operator= (const ClassificationData &rhs) |
ClassificationSample & | operator[] (const UINT &i) |
const ClassificationSample & | operator[] (const UINT &i) const |
void | clear () |
bool | setNumDimensions (UINT numDimensions) |
bool | setDatasetName (std::string datasetName) |
bool | setInfoText (std::string infoText) |
bool | setClassNameForCorrespondingClassLabel (std::string className, UINT classLabel) |
bool | setAllowNullGestureClass (bool allowNullGestureClass) |
bool | addSample (UINT classLabel, const VectorFloat &sample) |
bool | removeSample (const UINT index) |
bool | removeLastSample () |
bool | reserve (const UINT N) |
bool | addClass (const UINT classLabel, const std::string className="NOT_SET") |
UINT | removeClass (const UINT classLabel) |
UINT | eraseAllSamplesWithClassLabel (const UINT classLabel) |
bool | relabelAllSamplesWithClassLabel (const UINT oldClassLabel, const UINT newClassLabel) |
bool | setExternalRanges (const Vector< MinMax > &externalRanges, const bool useExternalRanges=false) |
bool | enableExternalRangeScaling (const bool useExternalRanges) |
bool | scale (const Float minTarget, const Float maxTarget) |
bool | scale (const Vector< MinMax > &ranges, const Float minTarget, const Float maxTarget) |
bool | save (const std::string &filename) const |
bool | load (const std::string &filename) |
bool | saveDatasetToFile (const std::string &filename) const |
bool | loadDatasetFromFile (const std::string &filename) |
bool | saveDatasetToCSVFile (const std::string &filename) const |
bool | loadDatasetFromCSVFile (const std::string &filename, const UINT classLabelColumnIndex=0) |
bool | printStats () const |
bool | sortClassLabels () |
bool | merge (const ClassificationData &data) |
ClassificationData | partition (const UINT partitionPercentage, const bool useStratifiedSampling=false) |
bool | spiltDataIntoKFolds (const UINT K, const bool useStratifiedSampling=false) |
ClassificationData | getTrainingFoldData (const UINT foldIndex) const |
ClassificationData | getTestFoldData (const UINT foldIndex) const |
ClassificationData | getClassData (const UINT classLabel) const |
ClassificationData | getBootstrappedDataset (UINT numSamples=0, bool balanceDataset=false) const |
RegressionData | reformatAsRegressionData () const |
UnlabelledData | reformatAsUnlabelledData () const |
std::string | getDatasetName () const |
std::string | getInfoText () const |
std::string | getStatsAsString () const |
UINT | getNumDimensions () const |
UINT | getNumSamples () const |
UINT | getNumClasses () const |
UINT | getMinimumClassLabel () const |
UINT | getMaximumClassLabel () const |
UINT | getClassLabelIndexValue (const UINT classLabel) const |
std::string | getClassNameForCorrespondingClassLabel (const UINT classLabel) const |
Vector< MinMax > | getRanges () const |
Vector< UINT > | getClassLabels () const |
Vector< UINT > | getNumSamplesPerClass () const |
Vector< ClassTracker > | getClassTracker () const |
MatrixFloat | getClassHistogramData (const UINT classLabel, const UINT numBins) const |
Vector< MatrixFloat > | getHistogramData (const UINT numBins) const |
Vector< ClassificationSample > | getClassificationData () const |
VectorFloat | getClassProbabilities () const |
VectorFloat | getClassProbabilities (const Vector< UINT > &classLabels) const |
VectorFloat | getMean () const |
VectorFloat | getStdDev () const |
MatrixFloat | getClassMean () const |
MatrixFloat | getClassStdDev () const |
MatrixFloat | getCovarianceMatrix () const |
Vector< UINT > | getClassDataIndexes (const UINT classLabel) const |
MatrixDouble | getDataAsMatrixDouble () const |
MatrixFloat | getDataAsMatrixFloat () const |
Public Member Functions inherited from GRTBase | |
GRTBase (void) | |
virtual | ~GRTBase (void) |
bool | copyGRTBaseVariables (const GRTBase *GRTBase) |
std::string | getClassType () const |
std::string | getLastWarningMessage () const |
std::string | getLastErrorMessage () const |
std::string | getLastInfoMessage () const |
bool | setInfoLoggingEnabled (const bool loggingEnabled) |
bool | setWarningLoggingEnabled (const bool loggingEnabled) |
bool | setErrorLoggingEnabled (const bool loggingEnabled) |
GRTBase * | getGRTBasePointer () |
const GRTBase * | getGRTBasePointer () const |
Static Public Member Functions | |
static bool | generateGaussDataset (const std::string filename, const UINT numSamples=10000, const UINT numClasses=10, const UINT numDimensions=3, const Float range=10, const Float sigma=1) |
Static Public Member Functions inherited from GRTBase | |
static std::string | getGRTVersion (bool returnRevision=true) |
static std::string | getGRTRevison () |
Additional Inherited Members | |
Protected Member Functions inherited from GRTBase | |
Float | SQR (const Float &x) const |
Protected Attributes inherited from GRTBase | |
std::string | classType |
DebugLog | debugLog |
ErrorLog | errorLog |
InfoLog | infoLog |
TrainingLog | trainingLog |
TestingLog | testingLog |
WarningLog | warningLog |
GRT MIT License Copyright (c) <2012> <Nicholas Gillian, Media Lab, MIT>
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Definition at line 43 of file ClassificationData.h.
ClassificationData::ClassificationData(UINT numDimensions = 0, std::string datasetName = "NOT_SET", std::string infoText = "")
Constructor, sets the name of the dataset and the number of dimensions of the training data. The name of the dataset should not contain any spaces.
numDimensions | the number of dimensions of the training data, should be an unsigned integer greater than 0 |
datasetName | the name of the dataset, should not contain any spaces |
infoText | some info about the data in this dataset, this can contain spaces |
Definition at line 25 of file ClassificationData.cpp.
ClassificationData::ClassificationData(const ClassificationData &rhs)
Copy constructor; copies the ClassificationData from the rhs instance to this instance.
rhs | another instance of the ClassificationData class from which the data will be copied to this instance |
Definition at line 40 of file ClassificationData.cpp.
virtual ClassificationData::~ClassificationData()
Default Destructor
Definition at line 44 of file ClassificationData.cpp.
bool ClassificationData::addClass(const UINT classLabel, const std::string className = "NOT_SET")
This function adds the class with the classLabel to the class tracker. If the class tracker already contains the classLabel then the function will return false.
classLabel | the class label you want to add to the class tracker |
className | the name associated with the new class |
Definition at line 235 of file ClassificationData.cpp.
bool ClassificationData::addSample(UINT classLabel, const VectorFloat &sample)
Adds a new labelled sample to the dataset. The dimensionality of the sample should match the number of dimensions in the ClassificationData. The class label should be greater than zero (as zero is used as the default null rejection class label).
classLabel | the class label of the corresponding sample |
sample | the new sample you want to add to the dataset. The dimensionality of this sample should match the number of dimensions in the ClassificationData |
Definition at line 132 of file ClassificationData.cpp.
void ClassificationData::clear()
Clears any previous training data and counters.
Definition at line 69 of file ClassificationData.cpp.
bool ClassificationData::enableExternalRangeScaling(const bool useExternalRanges)
Sets if the dataset should be scaled using an external range (if useExternalRanges == true) or the ranges of the dataset (if false). The external ranges must be set before calling this function, otherwise it will return false.
useExternalRanges | sets if these ranges should be used to scale the dataset |
Definition at line 346 of file ClassificationData.cpp.
UINT ClassificationData::eraseAllSamplesWithClassLabel(const UINT classLabel)
Deletes from the dataset all the samples with a specific class label.
classLabel | the class label of the samples you wish to delete from the dataset |
Definition at line 231 of file ClassificationData.cpp.
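static bool ClassificationData::generateGaussDataset(const std::string filename, const UINT numSamples = 10000, const UINT numClasses = 10, const UINT numDimensions = 3, const Float range = 10, const Float sigma = 1)
Generates a labelled dataset that can be used for basic training/testing/validation for ClassificationData.
Samples in the dataset will be generated based on K randomly selected models, with Gaussian noise. K is set by the numClasses argument.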
The range of each dimension will be [-range range]. Sigma controls the amount of Gaussian noise added.
The dataset will be saved to the file specified by filename.
filename | the name of the file the dataset will be saved to |
numSamples | the total number of samples in the dataset |
numClasses | the number of classes in the dataset |
numDimensions | the number of dimensions in the dataset |
range | the range the data will be sampled from, range will be [-range range] for each dimension |
sigma | the amount of Gaussian noise |
Definition at line 1490 of file ClassificationData.cpp.
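The generation scheme described above can be sketched without the library. This is a minimal, standalone illustration and not GRT's implementation: the names LabelledSample and makeGaussSamples are hypothetical, the seed parameter is added for repeatability, and cycling through the classes in order is an assumption (the library may assign classes differently).

```cpp
#include <cassert>
#include <random>
#include <vector>

// Hypothetical stand-in for one labelled sample; not a GRT type.
struct LabelledSample {
    unsigned int classLabel;
    std::vector<double> values;
};

// Draws one centre per class uniformly from [-range range] in every dimension,
// then emits samples as centre + Gaussian noise with standard deviation sigma.
std::vector<LabelledSample> makeGaussSamples(unsigned int numSamples,
                                             unsigned int numClasses,
                                             unsigned int numDimensions,
                                             double range, double sigma,
                                             unsigned int seed) {
    assert(numClasses > 0 && numDimensions > 0 && range > 0.0 && sigma > 0.0);
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> centreDist(-range, range);
    std::normal_distribution<double> noise(0.0, sigma);

    // One randomly selected model (a centre point) per class.
    std::vector<std::vector<double>> centres(numClasses,
                                             std::vector<double>(numDimensions));
    for (auto &centre : centres)
        for (auto &c : centre) c = centreDist(rng);

    std::vector<LabelledSample> out;
    out.reserve(numSamples);
    for (unsigned int i = 0; i < numSamples; ++i) {
        const unsigned int k = i % numClasses;  // assumption: cycle through classes
        LabelledSample s{k + 1, centres[k]};    // labels start at 1; 0 is the null class
        for (auto &v : s.values) v += noise(rng);
        out.push_back(s);
    }
    return out;
}
```

A larger sigma relative to range makes the classes overlap more, which is a convenient way to vary the difficulty of a test problem.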
ClassificationData ClassificationData::getBootstrappedDataset(UINT numSamples = 0, bool balanceDataset = false) const
Gets a bootstrapped dataset from the current dataset. If the numSamples parameter is set to zero, then the size of the bootstrapped dataset will match the size of the current dataset, otherwise the size of the bootstrapped dataset will match the numSamples parameter.
numSamples | the size of the bootstrapped dataset |
balanceDataset | if true will use stratified sampling to balance the dataset returned, otherwise will use random sampling |
Definition at line 1029 of file ClassificationData.cpp.
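Bootstrapping in the unbalanced case amounts to drawing sample indexes with replacement. The sketch below shows only that idea, including the rule that numSamples == 0 means "match the source size". The function name bootstrapIndexes and the seed parameter are hypothetical, and the stratified (balanceDataset == true) case is not covered.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Returns indexes drawn uniformly, with replacement, from [0 datasetSize-1].
// numSamples == 0 means "use the size of the source dataset".
std::vector<std::size_t> bootstrapIndexes(std::size_t datasetSize,
                                          std::size_t numSamples,
                                          unsigned int seed) {
    assert(datasetSize > 0);
    if (numSamples == 0) numSamples = datasetSize;
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, datasetSize - 1);
    std::vector<std::size_t> indexes(numSamples);
    for (auto &idx : indexes) idx = pick(rng);  // the same index may repeat
    return indexes;
}
```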
ClassificationData ClassificationData::getClassData(const UINT classLabel) const
Returns all the data with the class label set by classLabel. The classLabel should be a valid classLabel, otherwise the dataset returned will be empty.
classLabel | the class label of the class you want the data for |
Definition at line 1006 of file ClassificationData.cpp.
Vector< UINT > ClassificationData::getClassDataIndexes(const UINT classLabel) const
Gets the indexes for all the samples in the current dataset belonging to the classLabel.
classLabel | the class label of the class you want the indexes for |
Definition at line 1436 of file ClassificationData.cpp.
MatrixFloat ClassificationData::getClassHistogramData(const UINT classLabel, const UINT numBins) const
Computes a histogram for a specific class.
classLabel | the class label of the class you want to compute the histogram data for |
numBins | the number of bins in the histogram |
Definition at line 1284 of file ClassificationData.cpp.
inline Vector< ClassificationSample > ClassificationData::getClassificationData() const
Gets the classification data.
Definition at line 534 of file ClassificationData.h.
UINT ClassificationData::getClassLabelIndexValue(const UINT classLabel) const
Gets the index of the class label from the class tracker.
Definition at line 1164 of file ClassificationData.cpp.
Vector< UINT > ClassificationData::getClassLabels() const
Gets the class labels in the dataset, where element i is the class label associated with class[i].
Definition at line 1231 of file ClassificationData.cpp.
MatrixFloat ClassificationData::getClassMean() const
Gets the mean values for each class in the dataset. This is returned in a [K N] matrix, where K is the number of classes in the dataset and N is the number of dimensions in the dataset.
Definition at line 1330 of file ClassificationData.cpp.
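To make the [K N] layout concrete, the following standalone sketch computes one row of per-dimension means per class. It uses plain std::vector rather than GRT types, and the function name classMeans and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns a K x N matrix: row k holds the per-dimension mean of all samples
// whose label equals classLabels[k]. Rows for classes with no samples stay zero.
Matrix classMeans(const std::vector<unsigned int> &sampleLabels,
                  const Matrix &samples,
                  const std::vector<unsigned int> &classLabels,
                  std::size_t numDimensions) {
    assert(sampleLabels.size() == samples.size());
    Matrix means(classLabels.size(), std::vector<double>(numDimensions, 0.0));
    for (std::size_t k = 0; k < classLabels.size(); ++k) {
        std::size_t count = 0;
        for (std::size_t i = 0; i < samples.size(); ++i) {
            if (sampleLabels[i] != classLabels[k]) continue;
            assert(samples[i].size() == numDimensions);
            for (std::size_t j = 0; j < numDimensions; ++j)
                means[k][j] += samples[i][j];
            ++count;
        }
        if (count > 0)
            for (auto &m : means[k]) m /= static_cast<double>(count);
    }
    return means;
}
```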
std::string ClassificationData::getClassNameForCorrespondingClassLabel(const UINT classLabel) const
Gets the name of the class with a given class label. If the class label does not exist then the string "CLASS_LABEL_NOT_FOUND" will be returned.
Definition at line 1174 of file ClassificationData.cpp.
MatrixFloat ClassificationData::getClassStdDev() const
Gets the standard deviation values for each class in the dataset. This is returned in a [K N] matrix, where K is the number of classes in the dataset and N is the number of dimensions in the dataset.
Definition at line 1354 of file ClassificationData.cpp.
inline Vector< ClassTracker > ClassificationData::getClassTracker() const
Gets the class tracker for each class in the dataset.
Definition at line 509 of file ClassificationData.h.
MatrixFloat ClassificationData::getCovarianceMatrix() const
Gets the covariance matrix across all the classes in the dataset. This is returned in an [N N] matrix, where N is the number of dimensions in the dataset.
Definition at line 1379 of file ClassificationData.cpp.
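As an illustration of the [N N] result shape, here is a standalone covariance sketch over an M by N data matrix. It is not taken from the library: the function name covarianceMatrix is hypothetical, and normalising by M-1 (the sample covariance) is an assumption, since the normalisation used by GRT is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Returns the N x N covariance of an M x N data matrix (one sample per row).
// Assumption: normalises by M-1; the library may normalise differently.
Matrix covarianceMatrix(const Matrix &data) {
    assert(data.size() >= 2);
    const std::size_t M = data.size();
    const std::size_t N = data[0].size();

    // Mean of each dimension across all samples.
    std::vector<double> mean(N, 0.0);
    for (const auto &row : data) {
        assert(row.size() == N);
        for (std::size_t j = 0; j < N; ++j) mean[j] += row[j];
    }
    for (auto &m : mean) m /= static_cast<double>(M);

    // Accumulate products of deviations, then normalise.
    Matrix cov(N, std::vector<double>(N, 0.0));
    for (const auto &row : data)
        for (std::size_t a = 0; a < N; ++a)
            for (std::size_t b = 0; b < N; ++b)
                cov[a][b] += (row[a] - mean[a]) * (row[b] - mean[b]);
    for (auto &r : cov)
        for (auto &c : r) c /= static_cast<double>(M - 1);
    return cov;
}
```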
MatrixDouble ClassificationData::getDataAsMatrixDouble() const
Gets the data as a MatrixDouble. This returns just the data, not the labels. This will be an M by N MatrixDouble, where M is the number of samples and N is the number of dimensions.
Definition at line 1461 of file ClassificationData.cpp.
MatrixFloat ClassificationData::getDataAsMatrixFloat() const
Gets the data as a MatrixFloat. This returns just the data, not the labels. This will be an M by N MatrixFloat, where M is the number of samples and N is the number of dimensions.
Definition at line 1476 of file ClassificationData.cpp.
inline std::string ClassificationData::getDatasetName() const
Gets the name of the dataset.
Definition at line 418 of file ClassificationData.h.
Vector< MatrixFloat > ClassificationData::getHistogramData(const UINT numBins) const
Computes a histogram for each class in the dataset.
numBins | the number of bins in the histogram |
Definition at line 1396 of file ClassificationData.cpp.
inline std::string ClassificationData::getInfoText() const
Gets the info text for the dataset.
Definition at line 425 of file ClassificationData.h.
UINT ClassificationData::getMaximumClassLabel() const
Gets the maximum class label in the dataset. If there are no values in the dataset then the value 0 will be returned.
Definition at line 1152 of file ClassificationData.cpp.
VectorFloat ClassificationData::getMean() const
Gets the mean values across all classes in the dataset.
Definition at line 1255 of file ClassificationData.cpp.
UINT ClassificationData::getMinimumClassLabel() const
Gets the minimum class label in the dataset. If there are no values in the dataset then the value 99999 will be returned.
Definition at line 1139 of file ClassificationData.cpp.
inline UINT ClassificationData::getNumClasses() const
Gets the number of classes.
Definition at line 453 of file ClassificationData.h.
inline UINT ClassificationData::getNumDimensions() const
Gets the number of dimensions of the labelled classification data.
Definition at line 439 of file ClassificationData.h.
inline UINT ClassificationData::getNumSamples() const
Gets the number of samples in the classification data across all the classes.
Definition at line 446 of file ClassificationData.h.
Vector< UINT > ClassificationData::getNumSamplesPerClass() const
Gets the number of samples in each class.
Definition at line 1243 of file ClassificationData.cpp.
Vector< MinMax > ClassificationData::getRanges() const
Gets the ranges of the classification data.
Definition at line 1210 of file ClassificationData.cpp.
std::string ClassificationData::getStatsAsString() const
Gets the stats of the dataset as a string.
Definition at line 1185 of file ClassificationData.cpp.
VectorFloat ClassificationData::getStdDev() const
Gets the standard deviation values across all classes in the dataset.
Definition at line 1269 of file ClassificationData.cpp.
ClassificationData ClassificationData::getTestFoldData(const UINT foldIndex) const
Returns the test dataset for the k-th fold for cross validation. The spiltDataIntoKFolds(UINT K) function should have been called once before using this function. The foldIndex should be in the range [0 K-1], where K is the number of folds the data was split into.
foldIndex | the index of the fold you want the test data for, this should be in the range [0 K-1], where K is the number of folds the data was split into |
Definition at line 975 of file ClassificationData.cpp.
ClassificationData ClassificationData::getTrainingFoldData(const UINT foldIndex) const
Returns the training dataset for the k-th fold for cross validation. The spiltDataIntoKFolds(UINT K) function should have been called once before using this function. The foldIndex should be in the range [0 K-1], where K is the number of folds the data was split into.
foldIndex | the index of the fold you want the training data for, this should be in the range [0 K-1], where K is the number of folds the data was split into |
Definition at line 939 of file ClassificationData.cpp.
bool ClassificationData::load(const std::string &filename)
Loads the classification data from a file. If the filename ends in '.csv' then the function will try to load the data from a CSV format. If this fails then it will try to load the data as a custom GRT file.
filename | the name of the file the data will be loaded from |
Definition at line 383 of file ClassificationData.cpp.
bool ClassificationData::loadDatasetFromCSVFile(const std::string &filename, const UINT classLabelColumnIndex = 0)
Loads the labelled classification data from a CSV file. This assumes the data is formatted with each row representing a sample. The class label should be the first column followed by the sample data as the following N columns, where N is the number of dimensions in the data. If the class label is not the first column, you should set the 2nd argument (UINT classLabelColumnIndex) to the column index that contains the class label.
filename | the name of the file the data will be loaded from |
classLabelColumnIndex | the index of the column containing the class label. Default value = 0 |
Definition at line 597 of file ClassificationData.cpp.
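The row layout described above (one sample per row, with the class label in a configurable column) can be illustrated with a small standalone parser. This is only a sketch of the layout, not the library's loader: ParsedRow and parseCsvRow are hypothetical names, and the sketch assumes well-formed numeric cells with no quoting or header row.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type; not a GRT type.
struct ParsedRow {
    unsigned int classLabel;
    std::vector<double> values;
};

// Splits one CSV row on commas. The cell at classLabelColumnIndex becomes the
// class label; every other cell becomes one dimension of the sample.
ParsedRow parseCsvRow(const std::string &line,
                      std::size_t classLabelColumnIndex = 0) {
    std::vector<std::string> cells;
    std::stringstream stream(line);
    std::string cell;
    while (std::getline(stream, cell, ',')) cells.push_back(cell);
    assert(classLabelColumnIndex < cells.size());

    ParsedRow row;
    row.classLabel = 0;
    for (std::size_t c = 0; c < cells.size(); ++c) {
        if (c == classLabelColumnIndex)
            row.classLabel = static_cast<unsigned int>(std::stoul(cells[c]));
        else
            row.values.push_back(std::stod(cells[c]));
    }
    return row;
}
```

With the default index, a row such as "2,0.5,1.5" yields label 2 and a two-dimensional sample; passing 2 as the index reads the label from the last column of a three-column row instead.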
bool ClassificationData::loadDatasetFromFile(const std::string &filename)
Loads the labelled classification data from a custom file format.
filename | the name of the file the data will be loaded from |
Definition at line 437 of file ClassificationData.cpp.
bool ClassificationData::merge(const ClassificationData &data)
Adds the data to the current instance of the ClassificationData. The number of dimensions in both datasets must match. The names of the classes from the data will be added to the current instance.
data | the dataset to add to this dataset |
Definition at line 803 of file ClassificationData.cpp.
ClassificationData & ClassificationData::operator=(const ClassificationData &rhs)
Assignment operator; copies the data from the rhs instance to this instance.
rhs | another instance of the ClassificationData class from which the data will be copied to this instance |
Definition at line 47 of file ClassificationData.cpp.
inline ClassificationSample & ClassificationData::operator[](const UINT &i)
Array subscript operator, returns the ClassificationSample at index i. It is up to the user to ensure that i is within the range of [0 totalNumSamples-1].
i | the index of the training sample you want to access. Must be within the range of [0 totalNumSamples-1] |
Definition at line 82 of file ClassificationData.h.
inline const ClassificationSample & ClassificationData::operator[](const UINT &i) const
Const array subscript operator, returns the ClassificationSample at index i. It is up to the user to ensure that i is within the range of [0 totalNumSamples-1].
i | the index of the training sample you want to access. Must be within the range of [0 totalNumSamples-1] |
Definition at line 93 of file ClassificationData.h.
ClassificationData ClassificationData::partition(const UINT partitionPercentage, const bool useStratifiedSampling = false)
Partitions the dataset into a training dataset (which is kept by this instance of the ClassificationData) and a testing/validation dataset (which is returned as a new instance of a ClassificationData).
partitionPercentage | sets the percentage of data which remains in this instance, the remaining percentage of data is then returned as the testing/validation dataset |
useStratifiedSampling | sets if the dataset should be broken into homogeneous groups first before randomly being split, default value is false |
Definition at line 701 of file ClassificationData.cpp.
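The key point of partition is that it modifies the calling instance: the stated percentage stays behind and the remainder is returned. A standalone sketch of the non-stratified case follows. The name partitionInPlace and the seed parameter are hypothetical, and rounding the kept count down is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Shuffles data, keeps the first partitionPercentage percent in place, and
// returns the remaining samples as the held-out (test/validation) set.
template <typename T>
std::vector<T> partitionInPlace(std::vector<T> &data,
                                unsigned int partitionPercentage,
                                unsigned int seed) {
    assert(partitionPercentage <= 100);
    std::mt19937 rng(seed);
    std::shuffle(data.begin(), data.end(), rng);

    // Assumption: the number of kept samples is rounded down.
    const std::size_t numKept = data.size() * partitionPercentage / 100;
    const auto splitPoint =
        data.begin() + static_cast<typename std::vector<T>::difference_type>(numKept);
    std::vector<T> heldOut(splitPoint, data.end());
    data.erase(splitPoint, data.end());
    return heldOut;
}
```

Because the source container shrinks, a caller who still needs the full dataset afterwards should copy it before partitioning.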
bool ClassificationData::printStats() const
Prints the dataset info (such as its name and infoText) and the stats (such as the number of examples, number of dimensions, number of classes, etc.) to standard output.
Definition at line 687 of file ClassificationData.cpp.
RegressionData ClassificationData::reformatAsRegressionData() const
Reformats the ClassificationData as RegressionData to enable regression algorithms like the MLP to be used as a classifier. This sets the number of targets in the regression data equal to the number of classes in the classification data. The target output of each regression sample will therefore be all zeros, except for the index matching the class label, which will be 1. For this to work, the labelled classification data cannot have any samples with a class label of 0!
Definition at line 1087 of file ClassificationData.cpp.
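The target encoding described above is a one-hot vector. The sketch below is a standalone illustration with a hypothetical function name, oneHotTarget. It assumes class labels are contiguous in [1 K] and that label L maps to index L-1, which is one reading of "the index matching the class label" and also shows why a label of 0 cannot be encoded.

```cpp
#include <cassert>
#include <vector>

// Builds a target vector of length numClasses that is all zeros except for a
// single 1 at the position of the class label.
// Assumption: labels run from 1 to numClasses and label L maps to index L-1.
std::vector<double> oneHotTarget(unsigned int classLabel,
                                 unsigned int numClasses) {
    assert(classLabel >= 1 && classLabel <= numClasses);  // label 0 has no slot
    std::vector<double> target(numClasses, 0.0);
    target[classLabel - 1] = 1.0;
    return target;
}
```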
UnlabelledData ClassificationData::reformatAsUnlabelledData() const
Reformats the ClassificationData as UnlabelledData so the data can be used to train unsupervised learning algorithms such as K-Means Clustering and Gaussian Mixture Models.
Definition at line 1122 of file ClassificationData.cpp.
bool ClassificationData::relabelAllSamplesWithClassLabel(const UINT oldClassLabel, const UINT newClassLabel)
Relabels all the samples that have the class label oldClassLabel with the new class label newClassLabel.
oldClassLabel | the class label of the samples you want to relabel |
newClassLabel | the class label the samples will be relabelled with |
Definition at line 288 of file ClassificationData.cpp.
UINT ClassificationData::removeClass(const UINT classLabel)
Deletes from the dataset all the samples with a specific class label.
classLabel | the class label of the samples you wish to delete from the dataset |
Definition at line 254 of file ClassificationData.cpp.
bool ClassificationData::removeLastSample()
Removes the last training sample added to the dataset.
Definition at line 212 of file ClassificationData.cpp.
bool ClassificationData::removeSample(const UINT index)
Removes the training sample at the specified index from the dataset.
Definition at line 177 of file ClassificationData.cpp.
bool ClassificationData::reserve(const UINT N)
Requests that the Vector capacity be at least enough to contain N elements.
If N is greater than the current Vector capacity, the function causes the container to reallocate its storage, increasing its capacity to N (or greater).
N | the new memory size |
Definition at line 222 of file ClassificationData.cpp.
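The capacity wording above matches the behaviour of std::vector::reserve. Assuming the underlying container behaves like std::vector, the effect can be sketched as follows. The helper name reserveCapacity is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grows the capacity to at least N without changing the number of stored
// elements; returns true if the capacity requirement is met afterwards.
bool reserveCapacity(std::vector<double> &buffer, std::size_t N) {
    const std::size_t sizeBefore = buffer.size();
    buffer.reserve(N);                    // may reallocate, never shrinks
    assert(buffer.size() == sizeBefore);  // reserve does not add elements
    return buffer.capacity() >= N;
}
```

Reserving once before adding many samples avoids repeated reallocation as the container grows.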
bool ClassificationData::save(const std::string &filename) const
Saves the classification data to a file. If the filename ends in '.csv' then the data will be saved as comma-separated values, otherwise it will be saved to a custom GRT file (which contains the CSV data with an additional header).
filename | the name of the file the data will be saved to |
Definition at line 372 of file ClassificationData.cpp.
bool ClassificationData::saveDatasetToCSVFile(const std::string &filename) const
Saves the labelled classification data to a CSV file. This will save the class label as the first column and the sample data as the following N columns, where N is the number of dimensions in the data. Each row will represent a sample.
filename | the name of the file the data will be saved to |
Definition at line 574 of file ClassificationData.cpp.
bool ClassificationData::saveDatasetToFile(const std::string &filename) const
Saves the labelled classification data to a custom file format.
filename | the name of the file the data will be saved to |
Definition at line 394 of file ClassificationData.cpp.
bool ClassificationData::scale(const Float minTarget, const Float maxTarget)
Scales the dataset to the new target range.
Definition at line 354 of file ClassificationData.cpp.
bool ClassificationData::scale(const Vector< MinMax > &ranges, const Float minTarget, const Float maxTarget)
Scales the dataset to the new target range, using the Vector of ranges as the min and max source ranges.
Definition at line 359 of file ClassificationData.cpp.
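Scaling to a target range is commonly done with a per-dimension min-max mapping. The following standalone sketch shows that mapping using the ranges found in the data itself. The names scaleValue and scaleDataset are hypothetical, and returning minTarget for a dimension with zero range is an assumption made to avoid division by zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Linearly maps x from [minSource maxSource] to [minTarget maxTarget].
double scaleValue(double x, double minSource, double maxSource,
                  double minTarget, double maxTarget) {
    if (maxSource == minSource) return minTarget;  // assumption: degenerate range
    return (x - minSource) / (maxSource - minSource) * (maxTarget - minTarget)
           + minTarget;
}

// Scales every dimension (column) of the samples using that column's own
// minimum and maximum as the source range.
void scaleDataset(std::vector<std::vector<double>> &samples,
                  double minTarget, double maxTarget) {
    if (samples.empty()) return;
    const std::size_t N = samples[0].size();
    for (std::size_t j = 0; j < N; ++j) {
        double lo = samples[0][j];
        double hi = samples[0][j];
        for (const auto &s : samples) {
            assert(s.size() == N);
            lo = std::min(lo, s[j]);
            hi = std::max(hi, s[j]);
        }
        for (auto &s : samples)
            s[j] = scaleValue(s[j], lo, hi, minTarget, maxTarget);
    }
}
```

Supplying externally known ranges instead of the data's own minimum and maximum keeps training and live data on the same scale, which is the purpose of the ranges overload.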
bool ClassificationData::setAllowNullGestureClass(bool allowNullGestureClass)
Sets if the user can add samples to the dataset with the label matching the GRT_DEFAULT_NULL_CLASS_LABEL. If the allowNullGestureClass is set to true, then the user can add labels matching the default null class label (which is normally 0). If the allowNullGestureClass is set to false, then the user will not be able to add samples that have a class label matching the default null class label.
Definition at line 127 of file ClassificationData.cpp.
bool ClassificationData::setClassNameForCorrespondingClassLabel(std::string className, UINT classLabel)
Sets the name of the class with the given class label. There should not be any spaces in the className. Will return true if the name is set, or false if the class label does not exist.
Definition at line 114 of file ClassificationData.cpp.
bool ClassificationData::setDatasetName(std::string datasetName)
Sets the name of the dataset. There should not be any spaces in the name. Will return true if the name is set, or false otherwise.
Definition at line 97 of file ClassificationData.cpp.
bool ClassificationData::setExternalRanges(const Vector< MinMax > &externalRanges, const bool useExternalRanges = false)
Sets the external ranges of the dataset, also sets if the dataset should be scaled using these values. The dimensionality of the externalRanges Vector should match the number of dimensions of this dataset.
externalRanges | an N dimensional Vector containing the min and max values of the expected ranges of the dataset. |
useExternalRanges | sets if these ranges should be used to scale the dataset, default value is false. |
Definition at line 336 of file ClassificationData.cpp.
bool ClassificationData::setInfoText(std::string infoText)
Sets the info string. This can be any string with information about the dataset, for example how the training data was recorded.
infoText | the infoText |
Definition at line 109 of file ClassificationData.cpp.
bool ClassificationData::setNumDimensions(UINT numDimensions)
Sets the number of dimensions in the training data. This should be an unsigned integer greater than zero. This will clear any previous training data and counters. This function needs to be called before any new samples can be added to the dataset, unless the numDimensions variable was set in the constructor or some data was already loaded from a file.
numDimensions | the number of dimensions of the training data. Must be an unsigned integer greater than zero |
Definition at line 77 of file ClassificationData.cpp.
bool ClassificationData::sortClassLabels()
Sorts the class labels (in the class tracker) in ascending order.
Definition at line 694 of file ClassificationData.cpp.
bool ClassificationData::spiltDataIntoKFolds(const UINT K, const bool useStratifiedSampling = false)
This function prepares the dataset for k-fold cross validation and should be called prior to calling the getTrainingFoldData(UINT foldIndex) or getTestFoldData(UINT foldIndex) functions. It will split the dataset into K folds, as long as K < M, where M is the number of samples in the dataset.
K | the number of folds the dataset will be split into, K should be less than the number of samples in the dataset |
useStratifiedSampling | sets if the dataset should be broken into homogeneous groups first before randomly being split, default value is false |
Definition at line 834 of file ClassificationData.cpp.
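The relationship between the K folds and the per-fold training and test sets can be sketched on sample indexes alone. This standalone illustration is not the library's algorithm: the function names are hypothetical, and the round-robin assignment with no shuffling or stratification is a simplifying assumption. It shows that the test set for fold k is fold k itself, while the training set is the union of all other folds.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Folds = std::vector<std::vector<std::size_t>>;

// Assigns sample indexes [0 numSamples-1] to K folds in round-robin order.
Folds splitIntoKFolds(std::size_t numSamples, std::size_t K) {
    assert(K > 0 && K < numSamples);  // K must be less than the sample count
    Folds folds(K);
    for (std::size_t i = 0; i < numSamples; ++i) folds[i % K].push_back(i);
    return folds;
}

// The test set for a fold is simply the indexes stored in that fold.
std::vector<std::size_t> testFoldIndexes(const Folds &folds,
                                         std::size_t foldIndex) {
    assert(foldIndex < folds.size());
    return folds[foldIndex];
}

// The training set for a fold is every index that is not in that fold.
std::vector<std::size_t> trainingFoldIndexes(const Folds &folds,
                                             std::size_t foldIndex) {
    assert(foldIndex < folds.size());
    std::vector<std::size_t> out;
    for (std::size_t f = 0; f < folds.size(); ++f) {
        if (f == foldIndex) continue;
        out.insert(out.end(), folds[f].begin(), folds[f].end());
    }
    return out;
}
```

Looping foldIndex over [0 K-1] and averaging the K test scores gives the usual cross-validation estimate.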