Quality Control
Quality control is based on data the network has never seen before. A subset of the whole dataset, called the test set, is set aside for this purpose. The error calculated on that test data is a measure of the quality of the created model. The test error is independent of all model building operations and should not be used for any further decisions about the model.
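
The idea of setting a test set aside can be sketched in a few lines of Python. This is only an illustration: how cVision actually selects the test subset is not described here, and the function name `split_off_test_set`, the `test_fraction` parameter and the fixed seed are assumptions made for the example.

```python
import random


def split_off_test_set(samples, test_fraction=0.2, seed=0):
    """Return (model_building_samples, test_samples).

    Hypothetical helper, not part of cVision. The test samples are
    removed before any model building takes place.
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(150))  # stand-in for 150 data records
build, test = split_off_test_set(samples)
```

The two returned lists are disjoint, so an error computed on `test` is not influenced by anything learned from `build`.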

cVision can monitor all errors during a training run. The errors are written to dedicated worksheets. On request, cVision automatically creates charts from all of the buffered data and stores those charts in a separate worksheet, the @BufferChart worksheet.

cVision also offers a special feature that monitors not only errors but also classification quality when categorical output channels are used. It creates so-called global confusion matrices and stores them in the @GCMatrix worksheet.
The information in the @GCMatrix worksheet is organized in three parts. Each part relates to one data subset and contains the same type of classification information. The first part, for the learning data subset, is placed at the top of the worksheet. The second part, for the validation data subset, is placed below it, and the third part, for the test data subset, is placed below the second.

The first line of each part contains the channel name. For better illustration, the cells are highlighted in different colours. The learning subset is denoted by the postfix ":L" and a green background colour, the validation subset by the postfix ":V" and an orange background colour, and the test subset by the postfix ":T" and a blue background colour. The figure below shows the meaning of the individual items in detail.

The name of the categorical output channel is "iris"; the data originate from the learning data subset of the 5th multi-fold cross-validation ensemble. The correct classification rate is the number of correctly classified samples divided by the total number of data samples in the subset:
(31+28+30) / (31+29+30) × 100 % ≈ 98.9 %
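
The calculation above can be reproduced with a short Python sketch. The function name `correct_classification_rate` is hypothetical and chosen for this example only; the counts are the ones from the formula.

```python
def correct_classification_rate(correct_per_category, total_per_category):
    """Overall correct classification rate in percent (illustrative helper)."""
    correct = sum(correct_per_category)
    total = sum(total_per_category)
    if total == 0:
        raise ValueError("the subset contains no samples")
    return 100.0 * correct / total


# Correctly classified samples and sample totals per category,
# taken from the formula above.
rate = correct_classification_rate([31, 28, 30], [31, 29, 30])
print(f"{rate:.1f} %")
```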
The second line of each part contains the category names that appear in the iris channel; all data below relate to those categories. The categorical classification rate shows the correct classification rate per category. The confusion matrix shows, for misclassified samples, which category they were assigned to. In the example above, one of the 29 versicolor samples is misclassified: it is classified as virginica, but it should be versicolor.
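
As a rough sketch, the categorical classification rates and the misclassified samples can be derived from a confusion matrix as follows. Several things here are assumptions made for illustration: rows are taken to be the true category and columns the assigned category, the category order is setosa, versicolor, virginica, and the counts 31 and 30 are attributed to setosa and virginica respectively. The text only states the versicolor row. The function names are hypothetical and not part of cVision.

```python
categories = ["setosa", "versicolor", "virginica"]

# Assumed layout: rows = true category, columns = assigned category.
confusion = [
    [31, 0, 0],   # setosa (assumed count)
    [0, 28, 1],   # versicolor: one sample assigned to virginica
    [0, 0, 30],   # virginica (assumed count)
]


def categorical_rates(matrix):
    """Correct classification rate in percent for each true category."""
    rates = []
    for i, row in enumerate(matrix):
        total = sum(row)
        # The diagonal cell holds the correctly classified samples.
        rates.append(100.0 * row[i] / total if total else float("nan"))
    return rates


def misclassifications(matrix, names):
    """List (true category, assigned category, count) for off-diagonal cells."""
    return [
        (names[i], names[j], count)
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if i != j and count > 0
    ]


rates = categorical_rates(confusion)
errors = misclassifications(confusion, categories)
```

With these numbers, `errors` contains a single entry for the versicolor sample that was assigned to virginica.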