 

Network QC


 


Quality Control

Quality control is based on data the network has never seen before. A subset of the whole dataset, called the test set, is set aside for this purpose. The error calculated on the test data is a measure of the quality of the created model. The test error is independent of all model-building operations and should not be used for any further decisions.
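The idea of a held-out test set can be sketched in a few lines of Python. This is only an illustration of the principle, not cVision's internal procedure: the function names `split_dataset` and `mean_squared_error`, the 20 % test fraction, the squared-error measure and the stand-in model are all assumptions made for the example.

```python
import random

def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the samples and split off a held-out test set.

    Hypothetical helper: the fraction and the seed are illustrative choices.
    Returns (model_building_samples, test_samples).
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = max(1, int(round(len(shuffled) * test_fraction)))
    return shuffled[n_test:], shuffled[:n_test]

def mean_squared_error(model, samples):
    """Average squared difference between model output and target."""
    return sum((model(x) - y) ** 2 for x, y in samples) / len(samples)

# Toy data: targets are exactly twice the input.
data = [(x, 2.0 * x) for x in range(10)]
train, test = split_dataset(data)

# Stand-in for a trained network; a real model would be fitted on `train` only.
model = lambda x: 2.0 * x

# The test error is computed once, on samples never used for model building.
test_error = mean_squared_error(model, test)
```

The important design point is that `test` is touched only once, after all model building is finished; any decision that feeds back into training would make the test error an optimistic estimate.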

cVision can monitor all errors during a training run. The errors are written to dedicated worksheets:

  • The @cxBuffer worksheet contains the data of the cluster expert buffer (cxBuffer), which holds very detailed information about the training process.
  • The @exBuffer worksheet contains the data of the ensemble expert buffer (exBuffer). The exBuffer data can be considered a summary of the cxBuffer.
  • The @gxBuffer worksheet contains the data of the generation expert buffer (gxBuffer). The gxBuffer data can be considered a summary of the exBuffer. The gxBuffer is of interest if multi-fold cross-validation has been selected.

On request, cVision automatically creates charts from all of the buffered data and stores those charts in a separate worksheet, the @BufferChart worksheet.

cVision offers a special feature to monitor not only errors but also classification quality when categorical output channels are used. It creates so-called global confusion matrices and stores them in the @GCMatrix worksheet.

The information in the @GCMatrix worksheet is organized in three parts. Each part relates to one data subset and contains the same type of classification information. The part for the learning data subset is placed at the top of the worksheet, the part for the validation data subset below it, and the part for the test data subset at the bottom.

The first line of each part contains the channel name; the cells are highlighted in different colours for clarity. The learning subset is denoted by the postfix ":L" and a green background colour, the validation subset by the postfix ":V" and an orange background colour, and the test subset by the postfix ":T" and a blue background colour. The figure below shows the meaning of the individual items in detail.

 

The name of the categorical output channel is "iris"; the data originate from the learning data subset of the 5th multi-fold cross-validation ensemble. The correct classification rate is the number of correctly classified samples divided by the total number of data samples in the subset:

(31+28+30) / (31+29+30) × 100 % = 89 / 90 × 100 % ≈ 98.9 %

The second line of each part contains the category names that appear in the iris channel; all data below relate to those categories. The categorical classification rate is the correct classification rate per category. The confusion matrix shows, for misclassified samples, which category they were assigned to. In the example above, one of the 29 versicolor samples is misclassified: it is classified as virginica although it should be versicolor.
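The two rates described above can be reproduced from a confusion matrix with a short Python sketch. The counts are taken from the iris example; the orientation (rows are the true category, columns the assigned category) and the function names `overall_rate` and `per_category_rates` are assumptions for illustration and do not describe the actual layout of the @GCMatrix worksheet.

```python
# Category order used for both rows and columns.
categories = ["setosa", "versicolor", "virginica"]

# Assumed orientation: confusion[i][j] = number of samples whose true
# category is i and which were assigned to category j.
confusion = [
    [31, 0, 0],   # setosa: all 31 correct
    [0, 28, 1],   # versicolor: 28 correct, 1 assigned to virginica
    [0, 0, 30],   # virginica: all 30 correct
]

def overall_rate(matrix):
    """Correct classification rate in percent: diagonal sum over total."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

def per_category_rates(matrix):
    """Correct classification rate in percent for each true category."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]

overall = overall_rate(confusion)
rates = dict(zip(categories, per_category_rates(confusion)))
```

With these counts, `overall` is 89/90 × 100 %, about 98.9 %, and `rates["versicolor"]` is 28/29 × 100 %, about 96.6 %; the other two categories reach 100 %.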




Last Update: 08.05.09

2004-2009 © NGS - Neuro Genetic Solutions GmbH