lingvo.tasks.asr.tools.simple_wer_v2 module

The new version of the script for evaluating the word error rate (WER) of ASR tasks.

TensorFlow and Lingvo are not required to run this script.

Example usage:

  1. python simple_wer_v2.py file_hypothesis file_reference

  2. python simple_wer_v2.py file_hypothesis file_reference file_keyphrases

where file_hypothesis is the filename for hypothesis text, file_reference is the filename for reference text, and file_keyphrases is the optional filename for important phrases (one phrase per line).

Note that the program also generates an HTML file for diagnosing the errors; its filename is {$file_hypothesis}_diagnosis.html.

Alternatively, use this file as a stand-alone library through the SimpleWER class and the functions documented below:

lingvo.tasks.asr.tools.simple_wer_v2.TxtPreprocess(txt)[source]

Preprocess text before WER calculation.

lingvo.tasks.asr.tools.simple_wer_v2.RemoveCommentTxtPreprocess(txt)[source]

Preprocess text and remove comments enclosed in square brackets, such as [comments].
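The comment removal described above can be illustrated with a regular expression. This is a minimal sketch, not the module's implementation: the function name remove_bracketed_comments and the whitespace-collapsing step are assumptions made for the example.

```python
import re


def remove_bracketed_comments(txt):
    """Hypothetical helper: drop [bracketed] comments, then tidy whitespace."""
    # Replace each [ ... ] span with a space so neighbouring words stay apart.
    without_comments = re.sub(r'\[[^\]]*\]', ' ', txt)
    # Collapse runs of whitespace left behind by the removal (an assumption).
    return ' '.join(without_comments.split())


print(remove_bracketed_comments('hello [noise] world [laughter]'))  # hello world
```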

lingvo.tasks.asr.tools.simple_wer_v2.HighlightAlignedHtml(hyp, ref, err_type)[source]

Generate an HTML element that highlights the differences between hyp and ref.

Parameters
  • hyp – Hypothesis string.

  • ref – Reference string.

  • err_type – one of ‘none’, ‘sub’, ‘del’, ‘ins’.

Returns

An HTML string in which disagreements are highlighted.

Note that hyp is highlighted in green and marked with <del> </del>, while ref is highlighted in yellow. If you want HTML with other styles, consider writing your own function.

Raises

ValueError – if err_type is not one of [‘none’, ‘sub’, ‘del’, ‘ins’], or if err_type == ‘none’ and hyp != ref.
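A custom highlighting function with the same contract could look like the sketch below. The function name highlight_aligned_sketch and the exact markup (span elements with inline background colors) are assumptions for illustration; only the green/<del> and yellow conventions and the ValueError conditions come from the description above.

```python
import html


def highlight_aligned_sketch(hyp, ref, err_type):
    """Hypothetical highlighter following the documented contract."""
    if err_type == 'none':
        if hyp != ref:
            raise ValueError('hyp must equal ref when err_type is "none"')
        return html.escape(hyp)
    if err_type not in ('sub', 'del', 'ins'):
        raise ValueError('unknown err_type: %r' % (err_type,))
    parts = []
    if hyp:
        # Hypothesis word: green background, struck through with <del>.
        parts.append('<span style="background-color: lightgreen">'
                     '<del>%s</del></span>' % html.escape(hyp))
    if ref:
        # Reference word: yellow background.
        parts.append('<span style="background-color: yellow">%s</span>'
                     % html.escape(ref))
    return ' '.join(parts)


print(highlight_aligned_sketch('cat', 'cat', 'none'))  # cat
print(highlight_aligned_sketch('cat', 'hat', 'sub'))
```

Any function with this (hyp, ref, err_type) signature should be usable as the highlight_fn argument of HighlightAlignedHtmlHandler, judging by that constructor's default value.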

lingvo.tasks.asr.tools.simple_wer_v2.ComputeEditDistanceMatrix(hyp_words, ref_words)[source]

Compute the edit distance between two lists of strings.

Parameters
  • hyp_words – the list of words in the hypothesis sentence

  • ref_words – the list of words in the reference sentence

Returns

Edit distance matrix (in the format of list of lists), where the first index is the reference and the second index is the hypothesis.
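To make the matrix layout concrete, here is a minimal sketch of a standard word-level Levenshtein table with the reference on the first index and the hypothesis on the second. The function name edit_distance_matrix and the unit costs for substitution, insertion, and deletion are assumptions; the module's own cost scheme may differ.

```python
def edit_distance_matrix(hyp_words, ref_words):
    """Hypothetical Levenshtein table: dist[ref_index][hyp_index]."""
    rows = len(ref_words) + 1
    cols = len(hyp_words) + 1
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # i reference words against an empty hypothesis
    for j in range(cols):
        dist[0][j] = j  # j hypothesis words against an empty reference
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if ref_words[i - 1] == hyp_words[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match or substitution
                             dist[i - 1][j] + 1,         # reference word missing from hyp
                             dist[i][j - 1] + 1)         # extra word in hyp
    return dist


matrix = edit_distance_matrix('the cat sat'.split(), 'the cat sat down'.split())
print(matrix[-1][-1])  # 1
```

The bottom-right cell holds the total edit distance; tracing back from it to the top-left cell recovers the word alignment.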

lingvo.tasks.asr.tools.simple_wer_v2.RemoveTags(txt)[source]

Remove angle-bracket enclosed tags, such as <tag>.
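A one-line regular expression is enough to illustrate the idea. The name remove_tags_sketch and the pattern are assumptions for this example, not the module's code; in particular, this sketch leaves the surrounding whitespace untouched.

```python
import re


def remove_tags_sketch(txt):
    """Hypothetical helper: delete every <...> span from txt."""
    return re.sub(r'<[^>]*>', '', txt)


print(remove_tags_sketch('hello <unk>world'))  # hello world
```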

class lingvo.tasks.asr.tools.simple_wer_v2.HtmlHandler[source]

Bases: object

Template class for HtmlHandler children.

Each handler needs to implement the Render method, which incrementally writes the HTML for the current word. It has access to various relevant variables from kwargs, such as the current hyp and ref words, the current hyp and ref positions, and the error type.

A handler can optionally implement the Setup method, which is run once at the beginning.

Setup(hypothesis, reference)[source]

Set up the handler state prior to looping through the transcript.

Render(**kwargs)[source]

Render the html for each word as we loop through the transcript.

Parameters

**kwargs – dict of arguments needed for the handler to render correctly

Returns

Html string for word
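The sketch below shows the shape of a custom handler. To stay runnable without Lingvo it uses a local stand-in base class rather than the real HtmlHandler, and the subclass ErrorTagHandler is hypothetical. The keyword names hyp_word, ref_word, and err_type are taken from the HighlightAlignedHtmlHandler.Render signature documented on this page; other kwargs are ignored.

```python
import html


class HtmlHandlerStandIn:
    """Local stand-in that mirrors the documented HtmlHandler interface."""

    def Setup(self, hypothesis, reference):
        pass

    def Render(self, **kwargs):
        raise NotImplementedError


class ErrorTagHandler(HtmlHandlerStandIn):
    """Hypothetical handler that labels each mismatched word with its error type."""

    def Setup(self, hypothesis, reference):
        # Runs once before the word loop; reset per-transcript state here.
        self.num_rendered = 0

    def Render(self, hyp_word=None, ref_word=None, err_type=None, **kwargs):
        self.num_rendered += 1
        if err_type == 'none':
            return html.escape(hyp_word)
        return '<b>[%s hyp=%s ref=%s]</b>' % (
            err_type, html.escape(hyp_word or ''), html.escape(ref_word or ''))


handler = ErrorTagHandler()
handler.Setup('the cat', 'the hat')
print(handler.Render(hyp_word='the', ref_word='the', err_type='none'))  # the
print(handler.Render(hyp_word='cat', ref_word='hat', err_type='sub'))
# <b>[sub hyp=cat ref=hat]</b>
```

With Lingvo available, the subclass would derive from HtmlHandler instead, and an instance would be passed as the html_handler argument of SimpleWER.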

class lingvo.tasks.asr.tools.simple_wer_v2.HighlightAlignedHtmlHandler(highlight_fn=<function HighlightAlignedHtml>)[source]

Bases: HtmlHandler

Handler for HighlightAlignedHtml.

Render(hyp_word=None, ref_word=None, err_type=None, **kwargs)[source]

Render the html for each word as we loop through the transcript.

Parameters

**kwargs – dict of arguments needed for the handler to render correctly

Returns

Html string for word

class lingvo.tasks.asr.tools.simple_wer_v2.SimpleWER(key_phrases=None, html_handler=<lingvo.tasks.asr.tools.simple_wer_v2.HighlightAlignedHtmlHandler object>, preprocess_handler=<function RemoveCommentTxtPreprocess>)[source]

Bases: object

Compute word error rates after the alignment.

key_phrases

list of important phrases.

aligned_htmls

list of diagnosis HTML strings, each corresponding to one pair of hypothesis and reference.

hyp_keyphrase_counts

dict. hyp_keyphrase_counts[w] counts how often key phrase w appears in the hypotheses.

ref_keyphrase_counts

dict. ref_keyphrase_counts[w] counts how often key phrase w appears in the references.

matched_keyphrase_counts

dict. matched_keyphrase_counts[w] counts how often key phrase w appears in the aligned transcripts where the reference and hypothesis key phrases match.

wer_info

dict with four keys: ‘sub’ (substitution errors), ‘ins’ (insertion errors), ‘del’ (deletion errors), ‘nw’ (number of words). We can use wer_info to compute the word error rate (WER) as (wer_info[‘sub’]+wer_info[‘ins’]+wer_info[‘del’])*100.0/wer_info[‘nw’].
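The formula can be checked by hand on a small dictionary. The helper name wer_from_info and the sample counts are made up for this sketch, and raising on an empty reference is a choice of the sketch rather than documented behavior.

```python
def wer_from_info(wer_info):
    """Hypothetical helper applying the WER formula to a wer_info-style dict."""
    if wer_info['nw'] == 0:
        # Assumption of this sketch: refuse to divide by zero.
        raise ValueError('reference contains no words')
    errors = wer_info['sub'] + wer_info['ins'] + wer_info['del']
    return errors * 100.0 / wer_info['nw']


print(wer_from_info({'sub': 2, 'ins': 1, 'del': 1, 'nw': 20}))  # 20.0
# Insertions are not bounded by the reference length, so WER can exceed 100.
print(wer_from_info({'sub': 0, 'ins': 3, 'del': 0, 'nw': 2}))   # 150.0
```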

AddHypRef(hypothesis, reference)[source]

Update WER when adding one pair of strings: (hypothesis, reference).

Parameters
  • hypothesis – Hypothesis string.

  • reference – Reference string.

Raises

ValueError – when the program fails to parse the edit distance matrix.

GetWER()[source]

Compute Word Error Rate (WER).

Note that WER can be larger than 100.0, especially when there are many insertion errors.

Returns

WER as a percentage, usually between 0.0 and 100.0.

GetBreakdownWER()[source]

Compute breakdown WER.

Returns

A dictionary with ‘del’, ‘ins’, and ‘sub’ as keys and the corresponding error rates, in percent, as values.
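As an illustration, the per-type rates can be derived from the same counts that wer_info holds. The helper breakdown_from_info and the sample numbers are hypothetical, and the sketch assumes that each rate is normalized by the number of reference words, so the three rates add up to the overall WER.

```python
def breakdown_from_info(wer_info):
    """Hypothetical helper: per-type error rates in percent (assumes nw > 0)."""
    nw = float(wer_info['nw'])
    return {key: wer_info[key] * 100.0 / nw for key in ('del', 'ins', 'sub')}


rates = breakdown_from_info({'sub': 2, 'ins': 1, 'del': 1, 'nw': 20})
print(rates)                # {'del': 5.0, 'ins': 5.0, 'sub': 10.0}
print(sum(rates.values()))  # 20.0
```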

GetKeyPhraseStats()[source]

Measure the Jaccard similarity of key phrases between hyps and refs.

Returns
  • jaccard_similarity – Jaccard similarity, between 0.0 and 1.0.

  • F1_keyphrase – F1 score (=2/(1/prec + 1/recall)), between 0.0 and 1.0.

  • matched_keyphrases – number of matched key phrases.

  • ref_keyphrases – number of key phrases in the reference strings.

  • hyp_keyphrases – number of key phrases in the hypothesis strings.
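The two scores can be worked through from three totals. In this sketch the function keyphrase_stats_sketch is hypothetical, the Jaccard similarity is assumed to be the usual intersection over union, matched / (ref + hyp - matched), and the values returned for empty inputs are conventions chosen for the example. The F1 expression is the one quoted above, with precision = matched / hyp and recall = matched / ref.

```python
def keyphrase_stats_sketch(matched, ref_total, hyp_total):
    """Hypothetical helper: (jaccard, f1) from key phrase totals."""
    union = ref_total + hyp_total - matched
    # Assumption: with no key phrases on either side, treat them as identical.
    jaccard = matched / union if union > 0 else 1.0
    precision = matched / hyp_total if hyp_total else 0.0
    recall = matched / ref_total if ref_total else 0.0
    if precision > 0 and recall > 0:
        f1 = 2 / (1 / precision + 1 / recall)
    else:
        f1 = 0.0  # Assumption: no matches means an F1 of zero.
    return jaccard, f1


jaccard, f1 = keyphrase_stats_sketch(matched=3, ref_total=4, hyp_total=5)
print(jaccard)       # 0.5
print(round(f1, 3))  # 0.667
```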

GetSummaries()[source]

Generate strings to summarize word errors and key phrase errors.

Returns
  • str_sum – string summarizing total errors, total words, and WER.

  • str_details – string breaking down the three error types: del, ins, sub.

  • str_str_keyphrases_info – string summarizing key phrase information.

lingvo.tasks.asr.tools.simple_wer_v2.main(argv)[source]