TensorFlow evaluation metrics and summary statistics

Evaluation metrics

Metrics are used in evaluation to assess the quality of a model. Most are "streaming" ops: they create variables that accumulate a running total, and they return two tensors, an update tensor that updates those variables and a value tensor that reads the accumulated value. Example:

value, update_op = metrics.streaming_mean_squared_error(predictions, targets, weight)
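The accumulate-then-read pattern can be illustrated without TensorFlow. The following is a minimal plain-Python sketch, not the library's implementation: the class name StreamingMeanSquaredError and its update and value methods are hypothetical stand-ins for the update tensor and value tensor described above.

```python
class StreamingMeanSquaredError:
    """Hypothetical stand-in for a streaming metric (not a TensorFlow API)."""

    def __init__(self):
        # State that persists across batches, like the variables a
        # streaming op creates.
        self.total = 0.0
        self.count = 0

    def update(self, predictions, targets):
        # Plays the role of the update tensor: fold one batch into the totals.
        for p, t in zip(predictions, targets):
            self.total += (p - t) ** 2
            self.count += 1

    def value(self):
        # Plays the role of the value tensor: read the accumulated result.
        return self.total / self.count if self.count else 0.0


metric = StreamingMeanSquaredError()
metric.update([1.0, 2.0], [1.0, 4.0])  # squared errors: 0.0 and 4.0
metric.update([3.0], [1.0])            # squared error: 4.0
print(metric.value())                  # mean over all three samples, 8.0 / 3
```

Reading the value does not change the accumulated state, so it can be read after any number of updates.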

Most metric functions take a pair of tensors: predictions and ground-truth targets. (streaming_mean is an exception; it takes a single value tensor, usually a loss.) Both tensors are assumed to have shape [batch_size, d1, ... dN], where batch_size is the number of samples in the batch and d1 ... dN are the remaining dimensions.

The weight parameter can be used to adjust the relative weight of samples within the batch. The result of each metric is a scalar average over all samples with non-zero weights.
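One way to read the weighting is as a weighted average, where each per-sample value is scaled by its weight and the sum is divided by the total weight, so zero-weight samples drop out entirely. The sketch below assumes that reading; the helper weighted_mean is hypothetical and is not confirmed to match the library's exact computation.

```python
def weighted_mean(values, weights):
    """Hypothetical helper: weighted average of per-sample values."""
    total = sum(v * w for v, w in zip(values, weights))
    weight_sum = sum(weights)
    # Guard against an all-zero weight vector.
    return total / weight_sum if weight_sum else 0.0


per_sample_errors = [1.0, 4.0, 9.0]

# The second sample has zero weight, so it does not affect the result.
print(weighted_mean(per_sample_errors, [1.0, 0.0, 1.0]))  # (1.0 + 9.0) / 2.0 = 5.0
```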

Each metric returns two tensors, which should be used as follows in each evaluation run:

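predictions = ...
labels = ...
value, update_op = some_metric(predictions, labels)

# Run the update op once per batch to accumulate the running totals.
for step_num in range(max_steps):
  update_op.run()

# Read the accumulated metric after all batches have been processed.
print("evaluation score: %s" % value.eval())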