ROUGE-N Score
- ROUGE = Recall-Oriented Understudy for Gisting Evaluation
- ROUGE checks how much of the reference (target) text is covered by the generated text, so it is a recall-oriented metric
- The ROUGE-N score is the recall of n-grams of size N (e.g., ROUGE-1 for unigrams, ROUGE-2 for bigrams)
- Heavily used in text summarization; also commonly used in machine translation alongside the BLEU score
[!def] ROUGE Score
$$
\text{ROUGE-N} = \text{Recall}_N
$$
$$
\text{Recall}_n = \frac{\text{number of n-grams matched in both the generation and the reference}}{\text{number of n-grams in the reference}}
$$
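The recall formula can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes lowercased whitespace tokenization and a single reference, and it clips matches so that an n-gram counts at most as often as it appears in the generation. The names `ngrams` and `rouge_n_recall` are made up for this example.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count every contiguous n-gram in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall for one candidate and one reference string.

    Assumption: tokens are lowercased whitespace-separated words.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0  # reference is shorter than n tokens
    # Clipped overlap: a reference n-gram is matched at most as many
    # times as it occurs in the candidate.
    matched = sum(min(count, cand[gram]) for gram, count in ref.items())
    return matched / total

reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))  # 5 of 6 unigrams matched
print(rouge_n_recall(candidate, reference, n=2))  # 3 of 5 bigrams matched
```

Changing one word costs one unigram but two bigrams, which is why ROUGE-2 drops faster than ROUGE-1 on the same pair.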
Problems with ROUGE Score
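- Doesn't consider semantic meaning; only surface n-gram overlap is counted
- Scores are hard to compare across different tokenizers, because the n-grams themselves change
- Doesn't consider synonyms; a correct paraphrase is scored as a mismatch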
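These limitations are easy to reproduce with a toy unigram recall. The sketch below is illustrative only: the sentences are invented, `unigram_recall` is a hypothetical helper, and the "character tokenizer" simply splits a string into letters to stand in for a different tokenization scheme.

```python
from collections import Counter

def unigram_recall(candidate_tokens, reference_tokens):
    """Clipped unigram recall over two token lists."""
    ref = Counter(reference_tokens)
    cand = Counter(candidate_tokens)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, cand[tok]) for tok, count in ref.items())
    return matched / total

reference = "the film was great"
paraphrase = "the movie was excellent"  # same meaning, different words
opposite = "the film was not great"     # opposite meaning, same words

# Synonyms are penalized, while a negated sentence gets a perfect score.
paraphrase_score = unigram_recall(paraphrase.split(), reference.split())
opposite_score = unigram_recall(opposite.split(), reference.split())
print(paraphrase_score)  # 0.5
print(opposite_score)    # 1.0

# The same sentence pair scores differently under another tokenization.
other = "the films were great"
word_score = unigram_recall(other.split(), reference.split())
char_score = unigram_recall(list(other.replace(" ", "")),
                            list(reference.replace(" ", "")))
print(word_score, char_score)
```

The paraphrase scores lower than the sentence with the opposite meaning, and the word-level and character-level scores for the same pair are not comparable with each other.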