lingvo.core.scorers module
Helper classes for computing scores.
- lingvo.core.scorers.NGrams(lst, order)
Generator that yields all n-grams of the given order present in lst.
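As an illustration of what such a generator produces, here is a minimal sketch. It is not lingvo's implementation: the name `ngrams` is hypothetical, and yielding tuples (rather than list slices) is an assumption made so the results are hashable.

```python
from typing import Iterator, Sequence, Tuple


def ngrams(tokens: Sequence[str], order: int) -> Iterator[Tuple[str, ...]]:
    """Yield every contiguous n-gram of length `order` found in `tokens`.

    Hypothetical stand-in for NGrams(lst, order); tuples are an assumption.
    """
    # A sequence of length L contains L - order + 1 n-grams (none if L < order).
    for start in range(len(tokens) - order + 1):
        yield tuple(tokens[start:start + order])


bigrams = list(ngrams("the cat sat".split(), 2))
```

A sequence shorter than `order` simply yields nothing, which keeps callers free of special cases.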
- class lingvo.core.scorers.Unsegmenter(separator_type=None)
Bases: object
Un-segments (merges) segmented strings.
Used to recover the original surface form of strings encoded with byte-pair encoding (BPE), word-piece models (WPM), or sentence-piece models (SPM).
- _BPE_SEPARATOR = '@@ '
- _WPM_SEPARATOR = '▁'
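The two separators suggest how merging might work. The sketch below is an assumption, not the class's actual code: the function name `unsegment` and the `separator_type` values `"bpe"`, `"wpm"`, and `"spm"` are hypothetical, as is the choice to return the input unchanged when no type is given.

```python
BPE_SEPARATOR = "@@ "      # marks a BPE piece that continues into the next token
WPM_SEPARATOR = "\u2581"   # '▁', marks the start of a word in WPM/SPM output


def unsegment(line: str, separator_type=None) -> str:
    """Merge a segmented string back into its surface form (illustrative only)."""
    if separator_type == "bpe":
        # Dropping the continuation marker glues the pieces together.
        return line.replace(BPE_SEPARATOR, "").strip()
    if separator_type in ("wpm", "spm"):
        # Spaces only separate pieces; the '▁' marker carries the real word breaks.
        return line.replace(" ", "").replace(WPM_SEPARATOR, " ").strip()
    # Assumed behavior: with no separator type, leave the string as-is.
    return line


bpe_text = unsegment("trans@@ lation qual@@ ity", "bpe")
wpm_text = unsegment("\u2581trans lation \u2581qual ity", "wpm")
```

Under these assumptions both calls produce the same surface string, so scores computed on unsegmented text would be comparable across segmentation schemes.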
- class lingvo.core.scorers.BleuScorer(max_ngram=4, separator_type=None)
Bases: object
Scorer to compute BLEU scores to measure translation quality.
The BLEU score is the geometric mean of the token n-gram precisions for orders 1 through max_ngram, with counts pooled across all sentences.
Successive calls to AddSentence() accumulate statistics which are converted to an overall score on calls to ComputeOverallScore().
Example usage:
>>> scorer = BleuScorer(max_ngram=4)
>>> scorer.AddSentence("hyp matches ref str", "hyp matches ref str")
>>> scorer.AddSentence("almost right", "almost write")
>>> print(scorer.ComputeOverallScore())
0.6687…
- property unsegmenter
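The accumulate-then-score pattern described for BleuScorer can be sketched as follows. This is a simplified stand-in, not lingvo's implementation: the class `SimpleBleu` and its method names are hypothetical, whitespace tokenization is assumed, and the brevity penalty follows the standard BLEU definition rather than anything stated on this page. It is not claimed to reproduce the score shown in the example usage.

```python
import math
from collections import Counter


def ngram_counts(tokens, order):
    """Count the contiguous n-grams of length `order` in `tokens`."""
    return Counter(
        tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)
    )


class SimpleBleu:
    """Corpus-level BLEU with statistics pooled over sentences (illustrative)."""

    def __init__(self, max_ngram=4):
        self.max_ngram = max_ngram
        self.matched = [0] * max_ngram  # clipped n-gram matches, per order
        self.total = [0] * max_ngram    # hypothesis n-grams, per order
        self.ref_len = 0
        self.hyp_len = 0

    def add_sentence(self, ref, hyp):
        ref_tokens, hyp_tokens = ref.split(), hyp.split()
        self.ref_len += len(ref_tokens)
        self.hyp_len += len(hyp_tokens)
        for n in range(1, self.max_ngram + 1):
            ref_counts = ngram_counts(ref_tokens, n)
            hyp_counts = ngram_counts(hyp_tokens, n)
            # Counter intersection clips each match at the reference count.
            self.matched[n - 1] += sum((hyp_counts & ref_counts).values())
            self.total[n - 1] += sum(hyp_counts.values())

    def overall_score(self):
        # Any order with zero matches drives the geometric mean to zero.
        if self.hyp_len == 0 or min(self.matched) == 0:
            return 0.0
        log_precision = sum(
            math.log(m / t) for m, t in zip(self.matched, self.total)
        ) / self.max_ngram
        # Standard brevity penalty: only hypotheses shorter than the
        # reference are penalized.
        log_brevity = min(0.0, 1.0 - self.ref_len / self.hyp_len)
        return math.exp(log_precision + log_brevity)


scorer = SimpleBleu(max_ngram=4)
scorer.add_sentence("a b c d e f", "a b c d e x")
score = scorer.overall_score()
```

Pooling the matched and total counts before taking logarithms is what makes this a corpus-level score rather than an average of per-sentence scores.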