METEOR Score

  • METEOR = Metric for Evaluation of Translation with Explicit Ordering
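  • METEOR can be thought of as the harmonic mean of a BLEU-like score (precision) and a ROUGE-N-like score (recall)
    • More precisely, it is a weighted harmonic mean of unigram precision and unigram recall (recall weighted 9 times more than precision), scaled down by a penalty for fragmented matches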

[!def] METEOR Score
$$
Precision = \frac{\text{\# of unigrams matched}}{\text{\# of unigrams in generation}}
$$
$$
Recall = \frac{\text{\# of unigrams matched}}{\text{\# of unigrams in reference}}
$$
$$
F_{mean} = \frac{10 * Precision * Recall}{Recall + 9 * Precision}
$$
$$
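Penalty = 0.5 * \left(\frac{\text{minimum \# of chunks covering the matched unigrams}}{\text{\# of unigrams matched}}\right)^3
$$
Here a chunk is a run of matched unigrams that are adjacent and in the same order in both the generation and the reference. Fewer, longer chunks mean a smaller penalty.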
$$
METEOR = F_{mean} * (1 - Penalty)
$$
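
The formulas above can be sketched in a few lines of Python. This is a simplified illustration, not the reference implementation: it assumes whitespace tokenization, lowercase exact matching only (no stemming or synonym matching), and a greedy left-to-right alignment, which does not always find the alignment with the fewest chunks. The function name `meteor_exact` is made up for this note.

```python
def meteor_exact(generation: str, reference: str) -> float:
    """Simplified METEOR: exact unigram matches with a greedy alignment (assumption)."""
    gen = generation.lower().split()
    ref = reference.lower().split()

    # Greedy alignment: each generated token takes the first unused
    # identical token in the reference. Each token is matched at most once.
    used = [False] * len(ref)
    alignment = []  # (generation_index, reference_index) pairs, in generation order
    for i, token in enumerate(gen):
        for j, ref_token in enumerate(ref):
            if not used[j] and token == ref_token:
                used[j] = True
                alignment.append((i, j))
                break

    matches = len(alignment)
    if matches == 0:
        return 0.0  # no overlap; also covers empty inputs

    precision = matches / len(gen)
    recall = matches / len(ref)
    f_mean = 10 * precision * recall / (recall + 9 * precision)

    # A new chunk starts whenever consecutive matches are not adjacent
    # in both the generation and the reference.
    chunks = 1
    for (i_prev, j_prev), (i, j) in zip(alignment, alignment[1:]):
        if i != i_prev + 1 or j != j_prev + 1:
            chunks += 1

    penalty = 0.5 * (chunks / matches) ** 3
    return f_mean * (1 - penalty)
```

As a worked check, take the generation "the cat sat on mat" against the reference "the cat sat on the mat". All 5 generated unigrams match, so Precision = 1 and Recall = 5/6, giving F_mean = 50/59 ≈ 0.847. The matches form 2 chunks ("the cat sat on" and "mat"), so Penalty = 0.5 * (2/5)^3 = 0.032 and METEOR ≈ 0.820.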