ROUGE-LSUM Score

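  • ROUGE-LSUM is the average of the LCS (longest common subsequence) scores computed at the sentence level.
    • Where ROUGE-L ignores newlines and treats the whole text as one sequence, ROUGE-LSUM splits the text into sentences and takes the LCS for each sentence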
  • ROUGE = Recall-Oriented Understudy for Gisting Evaluation
  • Because ROUGE measures how much of ALL the target (reference) sentences the candidate covers, it is usually interpreted as a recall-style metric
  • Heavily used in text summarization; also often used in machine translation alongside the BLEU score
  • ROUGE-LSUM is preferred in cases where sentence-level similarity matters
    • E.g. extractive summarization

[!def] ROUGE-LSUM Score
$$
\text{ROUGE-LSUM} = \frac{1}{\text{Number of sentences}} \sum_i \frac{\text{Length of LCS of sentence}_i \text{ in Candidate and Reference}}{\text{Length of sentence}_i \text{ in Reference}}
$$
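
The definition above can be sketched in a few lines of plain Python. This is only an illustration of the formula as written in these notes, under several assumptions that are not part of the original: sentences are separated by newlines, tokens by whitespace, text is lowercased, and sentence i of the candidate is paired with sentence i of the reference. The names `lcs_length` and `rouge_lsum_sketch` are hypothetical. Library implementations (for example Google's `rouge-score` package) aggregate a union LCS over all candidate sentences and also report precision and F-measure, so their numbers will generally differ from this sketch.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_lsum_sketch(candidate, reference):
    """Average per-sentence LCS recall, following the formula in these notes.

    Assumptions (not from the notes): newline-separated sentences,
    whitespace tokens, and index-wise pairing of sentences.
    """
    cand_sents = [s.lower().split() for s in candidate.split("\n") if s.strip()]
    ref_sents = [s.lower().split() for s in reference.split("\n") if s.strip()]
    if not ref_sents:
        return 0.0
    total = 0.0
    for i, ref in enumerate(ref_sents):
        # A missing candidate sentence contributes an LCS of 0.
        cand = cand_sents[i] if i < len(cand_sents) else []
        total += lcs_length(cand, ref) / len(ref)
    return total / len(ref_sents)


reference = "the cat sat on the mat\nit was a sunny day"
candidate = "the cat lay on the mat\nit was sunny"
score = rouge_lsum_sketch(candidate, reference)
print(round(score, 4))  # (5/6 + 3/5) / 2
```

Sentence 1 shares 5 of the 6 reference tokens in order, sentence 2 shares 3 of 5, so the score is the mean of 5/6 and 3/5.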

Problems with ROUGE Score

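  1. Scores are hard to compare across different tokenizers, since the token boundaries change both the LCS and the reference length
  2. Doesn't consider synonyms: only exact token matches count toward the LCS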
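
Both problems can be seen on a tiny example. The sketch below is illustrative only: it scores a single sentence pair with LCS recall, and the two tokenizers (a whitespace split and a simple regex that separates punctuation) as well as the names `lcs_length`, `lcs_recall`, and `regex_tokenize` are assumptions chosen for the demonstration, not part of any particular ROUGE library.

```python
import re


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_recall(candidate_tokens, reference_tokens):
    """LCS length divided by the number of reference tokens."""
    return lcs_length(candidate_tokens, reference_tokens) / len(reference_tokens)


def regex_tokenize(text):
    """Hypothetical tokenizer: word runs and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


reference = "The movie wasn't good."
candidate = "The film wasn't good."  # same meaning, one synonym swapped

# Problem 1: the same sentence pair scores differently per tokenizer.
score_whitespace = lcs_recall(candidate.split(), reference.split())
score_regex = lcs_recall(regex_tokenize(candidate), regex_tokenize(reference))

# Problem 2: "film" vs "movie" counts as a plain mismatch.
score_identical = lcs_recall(reference.split(), reference.split())

print(score_whitespace, score_regex, score_identical)
```

With whitespace tokens the pair matches 3 of 4 reference tokens; with the regex tokenizer it matches 6 of 7, even though the texts are unchanged. In both cases the synonym is penalized exactly like an unrelated word.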