GestureRecognitionToolkit  Version: 0.1.0
The Gesture Recognition Toolkit (GRT) is a cross-platform, open-source, C++ machine learning library for real-time gesture recognition.
RegressionData Class Reference

#include <RegressionData.h>

Public Member Functions

 RegressionData (const UINT numInputDimensions=0, const UINT numTargetDimensions=0, const std::string datasetName="NOT_SET", const std::string infoText="")
 
 RegressionData (const RegressionData &rhs)
 
 ~RegressionData ()
 
RegressionData & operator= (const RegressionData &rhs)

RegressionSample & operator[] (const UINT &i)

const RegressionSample & operator[] (const UINT &i) const
 
void clear ()
 
bool setInputAndTargetDimensions (const UINT numInputDimensions, const UINT numTargetDimensions)
 
bool setDatasetName (const std::string &datasetName)
 
bool setInfoText (const std::string &infoText)
 
bool addSample (const VectorFloat &inputVector, const VectorFloat &targetVector)
 
bool removeLastSample ()
 
bool reserve (const UINT N)
 
bool setExternalRanges (const Vector< MinMax > &externalInputRanges, const Vector< MinMax > &externalTargetRanges, const bool useExternalRanges)
 
bool enableExternalRangeScaling (const bool useExternalRanges)
 
bool scale (const Float minTarget, const Float maxTarget)
 
bool scale (const Vector< MinMax > &inputVectorRanges, const Vector< MinMax > &targetVectorRanges, const Float minTarget, const Float maxTarget)
 
bool save (const std::string &filename) const
 
bool load (const std::string &filename)
 
bool saveDatasetToFile (const std::string &filename) const
 
bool loadDatasetFromFile (const std::string &filename)
 
bool saveDatasetToCSVFile (const std::string &filename) const
 
bool loadDatasetFromCSVFile (const std::string &filename, const UINT numInputDimensions, const UINT numTargetDimensions)
 
bool printStats () const
 
bool merge (const RegressionData &regressionData)
 
RegressionData partition (const UINT trainingSizePercentage)
 
bool spiltDataIntoKFolds (const UINT K)
 
RegressionData getTrainingFoldData (const UINT foldIndex) const
 
RegressionData getTestFoldData (const UINT foldIndex) const
 
UINT removeDuplicateSamples ()
 
std::string getDatasetName () const
 
std::string getInfoText () const
 
std::string getStatsAsString () const
 
UINT getNumInputDimensions () const
 
UINT getNumTargetDimensions () const
 
UINT getNumSamples () const
 
Vector< MinMax > getInputRanges () const

Vector< MinMax > getTargetRanges () const

Vector< RegressionSample > getData () const
 

Detailed Description

GRT MIT License Copyright (c) <2012> <Nicholas Gillian, Media Lab, MIT>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Definition at line 41 of file RegressionData.h.

Constructor & Destructor Documentation

RegressionData::RegressionData ( const UINT  numInputDimensions = 0,
const UINT  numTargetDimensions = 0,
const std::string  datasetName = "NOT_SET",
const std::string  infoText = "" 
)

Constructor. Sets the number of input dimensions, the number of target dimensions, the dataset name and the info text for the dataset. The name of the dataset should not contain any spaces.

Parameters
numInputDimensions  the number of input dimensions of the training data, should be an unsigned integer greater than 0
numTargetDimensions  the number of target dimensions of the training data, should be an unsigned integer greater than 0
datasetName  the name of the dataset, should not contain any spaces
infoText  some info about the data in this dataset, this can contain spaces

Definition at line 25 of file RegressionData.cpp.

RegressionData::RegressionData ( const RegressionData &  rhs)

Copy constructor. Copies the data from the rhs instance to this instance.

Parameters
rhs  another instance of the RegressionData class from which the data will be copied to this instance

Definition at line 38 of file RegressionData.cpp.

RegressionData::~RegressionData ( )

Default Destructor

Definition at line 42 of file RegressionData.cpp.

Member Function Documentation

bool RegressionData::addSample ( const VectorFloat &  inputVector,
const VectorFloat &  targetVector 
)

Adds a new labelled sample to the dataset. The input and target dimensionality of the sample should match that of the dataset.

Parameters
inputVector  the new input Vector you want to add to the dataset. The dimensionality of this sample should match the number of input dimensions in the dataset
targetVector  the new target Vector you want to add to the dataset. The dimensionality of this sample should match the number of target dimensions in the dataset
Returns
true if the sample was correctly added to the dataset, false otherwise

Definition at line 106 of file RegressionData.cpp.
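The dimensionality check described for addSample can be illustrated with a minimal standard-library sketch. ToyRegressionData is a hypothetical stand-in written for this example and is not the GRT class; it only mirrors the documented behaviour of rejecting samples whose sizes do not match the dataset.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a regression dataset (not the GRT class).
struct ToyRegressionData {
    std::size_t numInputDimensions;
    std::size_t numTargetDimensions;
    std::vector<std::vector<double>> inputs;
    std::vector<std::vector<double>> targets;

    // Returns false, and stores nothing, when either vector has the wrong size.
    bool addSample(const std::vector<double>& input,
                   const std::vector<double>& target) {
        if (input.size() != numInputDimensions) return false;
        if (target.size() != numTargetDimensions) return false;
        inputs.push_back(input);
        targets.push_back(target);
        return true;
    }
};
```

With a dataset declared as two inputs and one target, a call passing a one-element input vector returns false and leaves the stored samples unchanged.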

void RegressionData::clear ( )

Clears any previous training data and counters

Definition at line 65 of file RegressionData.cpp.

bool RegressionData::enableExternalRangeScaling ( const bool  useExternalRanges)

Sets if the dataset should be scaled using an external range (if useExternalRanges == true) or the ranges of the dataset (if false). The external ranges need to be set FIRST before calling this function, otherwise it will return false.

Parameters
useExternalRanges  sets if these ranges should be used to scale the dataset
Returns
returns true if the useExternalRanges variable was set, false otherwise

Definition at line 156 of file RegressionData.cpp.

Vector< RegressionSample > RegressionData::getData ( ) const
inline

Gets the regression data.

Returns
a Vector of RegressionSample

Definition at line 358 of file RegressionData.h.

std::string RegressionData::getDatasetName ( ) const
inline

Gets the name of the dataset.

Returns
returns the name of the dataset

Definition at line 307 of file RegressionData.h.

std::string RegressionData::getInfoText ( ) const
inline

Gets the infotext for the dataset

Returns
returns the infotext of the dataset

Definition at line 314 of file RegressionData.h.

Vector< MinMax > RegressionData::getInputRanges ( ) const

Gets the input ranges of the dataset.

Returns
a Vector of minimum and maximum values for each input dimension of the data

Definition at line 194 of file RegressionData.cpp.

UINT RegressionData::getNumInputDimensions ( ) const
inline

Gets the number of input dimensions of the labelled regression data.

Returns
an unsigned int representing the number of input dimensions in the dataset

Definition at line 323 of file RegressionData.h.

UINT RegressionData::getNumSamples ( ) const
inline

Gets the number of samples in the regression data.

Returns
an unsigned int representing the total number of samples in the regression data

Definition at line 337 of file RegressionData.h.

UINT RegressionData::getNumTargetDimensions ( ) const
inline

Gets the number of target dimensions of the labelled regression data.

Returns
an unsigned int representing the number of target dimensions in the dataset

Definition at line 330 of file RegressionData.h.

Vector< MinMax > RegressionData::getTargetRanges ( ) const

Gets the target ranges of the dataset.

Returns
a Vector of minimum and maximum values for each target dimension of the data

Definition at line 213 of file RegressionData.cpp.

RegressionData RegressionData::getTestFoldData ( const UINT  foldIndex) const

Returns the test dataset for the k-th fold for cross validation. The spiltDataIntoKFolds(UINT K) function should have been called once before using this function. The foldIndex should be in the range [0, K-1], where K is the number of folds the data was split into.

Parameters
foldIndex  the index of the fold you want the test data for, this should be in the range [0, K-1], where K is the number of folds the data was split into
Returns
returns a test dataset

Definition at line 413 of file RegressionData.cpp.

RegressionData RegressionData::getTrainingFoldData ( const UINT  foldIndex) const

Returns the training dataset for the k-th fold for cross validation. The spiltDataIntoKFolds(UINT K) function should have been called once before using this function. The foldIndex should be in the range [0, K-1], where K is the number of folds the data was split into.

Parameters
foldIndex  the index of the fold you want the training data for, this should be in the range [0, K-1], where K is the number of folds the data was split into
Returns
returns a training dataset

Definition at line 386 of file RegressionData.cpp.

bool RegressionData::load ( const std::string &  filename)

Loads the data from a file. If the filename ends in '.csv' then the function will try to load the data from a CSV format. If this fails then it will try to load the data as a custom GRT file.

Parameters
filename  the name of the file the data will be loaded from
Returns
true if the data was loaded successfully, false otherwise

Definition at line 489 of file RegressionData.cpp.

bool RegressionData::loadDatasetFromCSVFile ( const std::string &  filename,
const UINT  numInputDimensions,
const UINT  numTargetDimensions 
)

Loads the labelled regression data from a CSV file. Each row represents a sample: the first N columns should represent the input Vector data, with the remaining T columns representing the target Vector. The user must specify the length of the input Vector (N) and the length of the target Vector (T).

Parameters
filename  the name of the file the data will be loaded from
numInputDimensions  the length of an input Vector
numTargetDimensions  the length of a target Vector
Returns
true if the data was loaded successfully, false otherwise

Definition at line 692 of file RegressionData.cpp.
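The row layout described above (N input columns followed by T target columns) can be sketched with the standard library alone. parseRow is a hypothetical helper written for this example; it is not part of GRT and makes no claim about how GRT handles headers, whitespace or malformed cells.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <sstream>
#include <string>
#include <vector>

// Splits one CSV row into an input vector (first N columns) and a target
// vector (remaining T columns). Returns false if the row does not contain
// exactly N + T columns, or if a cell does not start with a number.
bool parseRow(const std::string& row, std::size_t N, std::size_t T,
              std::vector<double>& input, std::vector<double>& target) {
    std::vector<double> values;
    std::stringstream stream(row);
    std::string cell;
    while (std::getline(stream, cell, ',')) {
        try {
            values.push_back(std::stod(cell));
        } catch (const std::exception&) {
            return false;  // std::stod could not convert this cell
        }
    }
    if (values.size() != N + T) return false;
    const auto split = values.begin() + static_cast<std::ptrdiff_t>(N);
    input.assign(values.begin(), split);
    target.assign(split, values.end());
    return true;
}
```

For a row such as "1.0,2.0,3.0" with N = 2 and T = 1, the first two values become the input vector and the last value becomes the target vector.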

bool RegressionData::loadDatasetFromFile ( const std::string &  filename)

Loads the labelled regression data from a custom file format.

Parameters
filename  the name of the file the data will be loaded from
Returns
true if the data was loaded successfully, false otherwise

Definition at line 544 of file RegressionData.cpp.

bool RegressionData::merge ( const RegressionData &  regressionData)

Adds the data in the regressionData set to the current instance of the RegressionData. The number of dimensions in both datasets must match.

Parameters
regressionData  the dataset to add to this dataset
Returns
returns true if the datasets were merged, false otherwise

Definition at line 303 of file RegressionData.cpp.

RegressionData & RegressionData::operator= ( const RegressionData &  rhs)

Assignment operator, copies the data from the rhs instance to this instance

Parameters
rhs  another instance of the RegressionData class from which the data will be copied to this instance
Returns
a reference to this instance of RegressionData

Definition at line 44 of file RegressionData.cpp.

RegressionSample& RegressionData::operator[] ( const UINT &  i)
inline

Array Subscript Operator, returns the RegressionSample at index i. It is up to the user to ensure that i is within the range of [0, totalNumSamples-1]

Parameters
i  the index of the training sample you want to access. Must be within the range of [0, totalNumSamples-1]
Returns
a reference to the i'th RegressionSample

Definition at line 82 of file RegressionData.h.

const RegressionSample& RegressionData::operator[] ( const UINT &  i) const
inline

Const Array Subscript Operator, returns the RegressionSample at index i. It is up to the user to ensure that i is within the range of [0, totalNumSamples-1]

Parameters
i  the index of the training sample you want to access. Must be within the range of [0, totalNumSamples-1]
Returns
a const reference to the i'th RegressionSample

Definition at line 93 of file RegressionData.h.

RegressionData RegressionData::partition ( const UINT  trainingSizePercentage)

Partitions the dataset into a training dataset (which is kept by this instance of the RegressionData) and a testing/validation dataset (which is returned as a new instance of a RegressionData).

Parameters
trainingSizePercentage  sets the percentage of data which remains in this instance, the remaining percentage of data is then returned as the testing/validation dataset
Returns
a new RegressionData instance, containing the remaining data not kept by this instance

Definition at line 262 of file RegressionData.cpp.
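The percentage split described for partition can be sketched as follows. splitByPercentage is a hypothetical helper for illustration only; it works on a plain vector, keeps samples in their original order, and clamps percentages above 100. Whether GRT shuffles the data before splitting, and how it rounds, is not stated here, so those details are assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Keeps the first trainingSizePercentage percent of `data` in place and
// returns the remaining samples. Percentages above 100 are clamped to 100.
std::vector<double> splitByPercentage(std::vector<double>& data,
                                      unsigned trainingSizePercentage) {
    if (trainingSizePercentage > 100) trainingSizePercentage = 100;
    // Integer division rounds the training size down.
    const std::size_t numTraining = data.size() * trainingSizePercentage / 100;
    std::vector<double> rest(
        data.begin() + static_cast<std::ptrdiff_t>(numTraining), data.end());
    data.resize(numTraining);
    return rest;
}
```

With ten samples and a percentage of 80, eight samples stay in the original container and two are returned.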

bool RegressionData::removeLastSample ( )

Removes the last training sample added to the dataset.

Returns
true if the last sample was removed, false otherwise

Definition at line 120 of file RegressionData.cpp.

bool RegressionData::reserve ( const UINT  N)

Requests that the Vector capacity be at least enough to contain N elements.

If N is greater than the current Vector capacity, the function causes the container to reallocate its storage, increasing its capacity to N (or greater).

Parameters
N  the new memory size
Returns
true if the memory was reserved successfully, false otherwise

Definition at line 135 of file RegressionData.cpp.

bool RegressionData::save ( const std::string &  filename) const

Saves the data to a file. If the filename ends in '.csv' then the data will be saved as comma-separated values, otherwise it will be saved to a custom GRT file (which contains the CSV data with an additional header).

Parameters
filename  the name of the file the data will be saved to
Returns
true if the data was saved successfully, false otherwise

Definition at line 478 of file RegressionData.cpp.
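The choice between the CSV format and the custom GRT format depends on the filename suffix. A minimal sketch of such a suffix test is given below; endsWith is a hypothetical helper, and the case-sensitive comparison is an assumption, since the text does not say how GRT compares the extension.

```cpp
#include <cassert>
#include <string>

// Returns true when `filename` ends with `suffix` (case-sensitive).
bool endsWith(const std::string& filename, const std::string& suffix) {
    if (suffix.size() > filename.size()) return false;
    return filename.compare(filename.size() - suffix.size(),
                            suffix.size(), suffix) == 0;
}
```

A caller could use endsWith(filename, ".csv") to decide between a CSV writer and a writer for the custom format.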

bool RegressionData::saveDatasetToCSVFile ( const std::string &  filename) const

Saves the labelled regression data to a CSV file. This will save the input Vector as the first N columns and the target data as the following T columns. Each row will represent a sample.

Parameters
filename  the name of the file the data will be saved to
Returns
true if the data was saved successfully, false otherwise

Definition at line 665 of file RegressionData.cpp.

bool RegressionData::saveDatasetToFile ( const std::string &  filename) const

Saves the labelled regression data to a custom file format.

Parameters
filename  the name of the file the data will be saved to
Returns
true if the data was saved successfully, false otherwise

Definition at line 500 of file RegressionData.cpp.

bool RegressionData::scale ( const Float  minTarget,
const Float  maxTarget 
)

Scales the dataset to the new target range.

Parameters
minTarget  the minimum target the dataset will be scaled to
maxTarget  the maximum target the dataset will be scaled to
Returns
true if the data was scaled correctly, false otherwise

Definition at line 164 of file RegressionData.cpp.
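Scaling to a target range is commonly done with a linear min-max mapping. The sketch below shows that mapping using only the standard library. scaleValue and scaleColumn are hypothetical names, and mapping a zero-width source range to minTarget is an assumption made to avoid dividing by zero; the text does not say how GRT treats that case.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Linearly maps x from [srcMin, srcMax] to [minTarget, maxTarget].
double scaleValue(double x, double srcMin, double srcMax,
                  double minTarget, double maxTarget) {
    // Assumption: a zero-width source range maps to minTarget.
    if (srcMax == srcMin) return minTarget;
    return (x - srcMin) / (srcMax - srcMin) * (maxTarget - minTarget) + minTarget;
}

// Scales one dimension in place, using its own observed minimum and maximum.
void scaleColumn(std::vector<double>& column, double minTarget, double maxTarget) {
    if (column.empty()) return;
    const auto range = std::minmax_element(column.begin(), column.end());
    // Copy the bounds before modifying the column.
    const double srcMin = *range.first;
    const double srcMax = *range.second;
    for (double& x : column) {
        x = scaleValue(x, srcMin, srcMax, minTarget, maxTarget);
    }
}
```

The overload of scale that takes explicit ranges corresponds to calling scaleValue with externally supplied srcMin and srcMax instead of the observed ones.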

bool RegressionData::scale ( const Vector< MinMax > &  inputVectorRanges,
const Vector< MinMax > &  targetVectorRanges,
const Float  minTarget,
const Float  maxTarget 
)

Scales the dataset to the new target range, using the Vector of ranges as the min and max source ranges.

Returns
true if the data was scaled correctly, false otherwise

Definition at line 170 of file RegressionData.cpp.

bool RegressionData::setDatasetName ( const std::string &  datasetName)

Sets the name of the dataset. There should not be any spaces in the name. Will return true if the name is set, or false otherwise.

Parameters
datasetName  the new dataset name (must not include any spaces)
Returns
returns true if the name is set, or false otherwise

Definition at line 89 of file RegressionData.cpp.

bool RegressionData::setExternalRanges ( const Vector< MinMax > &  externalInputRanges,
const Vector< MinMax > &  externalTargetRanges,
const bool  useExternalRanges 
)

Sets the external input and target ranges of the dataset, also sets if the dataset should be scaled using these values. The dimensionality of the externalInputRanges and externalTargetRanges Vectors should match the numInputDimensions and numTargetDimensions of this dataset, respectively.

Parameters
externalInputRanges  an N dimensional Vector (N = numInputDimensions) containing the min and max values of the expected input ranges of the dataset
externalTargetRanges  a T dimensional Vector (T = numTargetDimensions) containing the min and max values of the expected target ranges of the dataset
useExternalRanges  sets if these ranges should be used to scale the dataset, default value is false
Returns
returns true if the external ranges were set, false otherwise

Definition at line 144 of file RegressionData.cpp.

bool RegressionData::setInfoText ( const std::string &  infoText)

Sets the info string. This can be any string with information about how the training data was recorded for example.

Parameters
infoText  the infoText
Returns
true if the infoText was correctly updated, false otherwise

Definition at line 101 of file RegressionData.cpp.

bool RegressionData::setInputAndTargetDimensions ( const UINT  numInputDimensions,
const UINT  numTargetDimensions 
)

Sets the number of input and target dimensions in the training data. These should be unsigned integers greater than zero. This will clear any previous training data and counters. This function needs to be called before any new samples can be added to the dataset, unless the numInputDimensions and numTargetDimensions variables were set in the constructor or some data was already loaded from a file.

Parameters
numInputDimensions  the number of input dimensions of the training data. Must be an unsigned integer greater than zero
numTargetDimensions  the number of target dimensions of the training data. Must be an unsigned integer greater than zero
Returns
true if the number of input and target dimensions was correctly updated, false otherwise

Definition at line 73 of file RegressionData.cpp.

bool RegressionData::spiltDataIntoKFolds ( const UINT  K)

This function prepares the dataset for k-fold cross validation and should be called prior to calling the getTrainingFoldData(UINT foldIndex) or getTestFoldData(UINT foldIndex) functions. It will split the dataset into K folds, as long as K < M, where M is the number of samples in the dataset.

Parameters
K  the number of folds the dataset will be split into, K should be less than the number of samples in the dataset
Returns
returns true if the dataset was split correctly, false otherwise

Definition at line 327 of file RegressionData.cpp.
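The relationship between the folds and the training and test sets can be sketched with sample indices. splitIntoKFolds and trainingIndices are hypothetical helpers, not GRT functions. The round-robin assignment is an assumption, and the rule that training data is everything outside the test fold is the usual k-fold convention rather than something stated above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Fold = std::vector<std::size_t>;

// Assigns sample indices 0..M-1 to K folds in round-robin order.
// Returns no folds unless 0 < K < M, following the condition stated above.
std::vector<Fold> splitIntoKFolds(std::size_t M, std::size_t K) {
    std::vector<Fold> folds;
    if (K == 0 || K >= M) return folds;
    folds.resize(K);
    for (std::size_t i = 0; i < M; ++i) {
        folds[i % K].push_back(i);
    }
    return folds;
}

// The test indices for foldIndex are folds[foldIndex] itself; the training
// indices are every index that belongs to one of the other folds.
Fold trainingIndices(const std::vector<Fold>& folds, std::size_t foldIndex) {
    Fold result;
    for (std::size_t k = 0; k < folds.size(); ++k) {
        if (k == foldIndex) continue;
        result.insert(result.end(), folds[k].begin(), folds[k].end());
    }
    return result;
}
```

For ten samples and K = 3, the folds hold four, three and three indices, so fold 0 gives four test samples and six training samples.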


The documentation for this class was generated from the following files: