lingvo.tasks.asr.tools.custom_html_handlers module

HTML handler customizations for simple_wer_v2.

class lingvo.tasks.asr.tools.custom_html_handlers.ChainOfHtmlHandlers(html_handlers=None)

Bases: HtmlHandler

A handler that is a chain of handlers.

All Setup and Render functions are called in sequential order.

Setup(hypothesis, reference)

Set up the handler state prior to looping through the transcript.

Render(**kwargs)

Render the html for each word as we loop through the transcript.

Parameters

**kwargs – dict of arguments needed for the handler to render correctly

Returns

Html string for word
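The chaining behavior described above can be illustrated with a minimal, self-contained sketch. It does not import lingvo: `HtmlHandlerSketch`, `ChainSketch`, `UpperCaseWord`, and `LineBreak` are hypothetical stand-ins, and concatenating the per-handler output in `Render` is an assumption about how the results are combined.

```python
class HtmlHandlerSketch:
    """Hypothetical stand-in for the HtmlHandler base class."""

    def Setup(self, hypothesis, reference):
        pass

    def Render(self, **kwargs):
        return ''


class ChainSketch(HtmlHandlerSketch):
    """Calls every wrapped handler in list order."""

    def __init__(self, html_handlers=None):
        self._handlers = list(html_handlers or [])

    def Setup(self, hypothesis, reference):
        for handler in self._handlers:
            handler.Setup(hypothesis, reference)

    def Render(self, **kwargs):
        # Assumption: the chain's output is the concatenation of each
        # handler's output, in the order the handlers were given.
        return ''.join(handler.Render(**kwargs) for handler in self._handlers)


class UpperCaseWord(HtmlHandlerSketch):
    """Hypothetical handler that renders the current word in upper case."""

    def Render(self, word='', **kwargs):
        return word.upper()


class LineBreak(HtmlHandlerSketch):
    """Hypothetical handler that always emits a line break."""

    def Render(self, **kwargs):
        return '<br>'


chain = ChainSketch([UpperCaseWord(), LineBreak()])
chain.Setup('hello world', 'hello word')
rendered = chain.Render(word='hello')
print(rendered)
```

Because each handler accepts `**kwargs`, the chain can forward one shared set of arguments and let every handler pick out only the ones it needs.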

lingvo.tasks.asr.tools.custom_html_handlers.FindTags(hyp_words)

Find the tags in the hypothesis.

Tags are words enclosed in angle brackets, such as <tag>. Tags are meant to be visible in the output without affecting WER.

Parameters

hyp_words – List of words in the hypothesis sentence.

Returns

A list of tags in the hypothesis. Each tag is a 2-element tuple: the first element is the position, the second is the tag string.
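As a rough illustration of the return format, the sketch below scans a word list for angle-bracketed words. `find_tags_sketch` is a hypothetical helper, not the library function, and it assumes that the position is simply the word's index in `hyp_words`; the real function may count positions differently.

```python
def find_tags_sketch(hyp_words):
    """Returns (position, tag) tuples for words that look like <tag>."""
    tags = []
    for position, word in enumerate(hyp_words):
        # A tag needs at least one character between the angle brackets.
        if len(word) > 2 and word.startswith('<') and word.endswith('>'):
            tags.append((position, word))
    return tags


words = 'hello <laugh> world <noise>'.split()
tags = find_tags_sketch(words)
print(tags)
```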

class lingvo.tasks.asr.tools.custom_html_handlers.TagHtmlHandler

Bases: HtmlHandler

Handler to cache and add tags back to original positions in transcript.

Setup(hypothesis, reference)

Set up the handler state prior to looping through the transcript.

Render(pos_hyp=None, **kwargs)

Show tags in the output html.

Parameters
  • pos_hyp – current word position in the hypothesis

  • **kwargs – unused

Returns

Tag strings
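One way to picture the cache-then-restore idea is sketched below. `TagCacheSketch` is hypothetical and makes two assumptions that the documentation does not confirm: that `pos_hyp` indexes the hypothesis with tags removed, and that tags are HTML-escaped and followed by a space when rendered.

```python
import html
from collections import defaultdict


class TagCacheSketch:
    """Caches tags in Setup and replays them by word position in Render."""

    def __init__(self):
        self._tags_at = defaultdict(list)

    def Setup(self, hypothesis, reference):
        self._tags_at.clear()
        num_plain_words = 0
        for word in hypothesis.split():
            if len(word) > 2 and word.startswith('<') and word.endswith('>'):
                # Assumption: key each tag by the number of non-tag words
                # that precede it in the hypothesis.
                self._tags_at[num_plain_words].append(word)
            else:
                num_plain_words += 1

    def Render(self, pos_hyp=None, **kwargs):
        if pos_hyp is None:
            return ''
        cached = self._tags_at.get(pos_hyp, [])
        # Escape the angle brackets so the tag stays visible in the HTML.
        return ''.join(html.escape(tag) + ' ' for tag in cached)


handler = TagCacheSketch()
handler.Setup('hello <laugh> world', 'hello world')
outputs = [handler.Render(pos_hyp=i) for i in range(2)]
print(outputs)
```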

class lingvo.tasks.asr.tools.custom_html_handlers.NewlineHtmlHandler(num_words_per_line=-1)

Bases: HtmlHandler

Handler to insert newlines into the HTML at fixed reference-word intervals.

Useful for side-by-side comparisons of long transcripts.

Setup(hypothesis, reference)

Set up the handler state prior to looping through the transcript.

Render(err_type=None, **kwargs)

Set number of ref words to display per line.

Parameters
  • err_type – error type

  • **kwargs – unused

Returns

Newline string when number of words reaches num_words_per_line
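The fixed-interval line breaking can be sketched as follows. `NewlineSketch` is a hypothetical stand-in, and several details are assumptions: the error-type labels are plain strings, insertions are skipped because they consume no reference word, a non-positive `num_words_per_line` disables line breaks, and the newline is rendered as `<br>`.

```python
class NewlineSketch:
    """Emits a line break after every num_words_per_line reference words."""

    def __init__(self, num_words_per_line=-1):
        self._num_words_per_line = num_words_per_line
        self._num_ref_words = 0

    def Setup(self, hypothesis, reference):
        # Reset the counter before each new transcript.
        self._num_ref_words = 0

    def Render(self, err_type=None, **kwargs):
        if self._num_words_per_line <= 0:
            return ''  # Assumption: the default of -1 means "never break".
        if err_type == 'insertion':
            return ''  # Assumption: insertions consume no reference word.
        self._num_ref_words += 1
        if self._num_ref_words % self._num_words_per_line == 0:
            return '<br>'
        return ''


handler = NewlineSketch(num_words_per_line=2)
handler.Setup('a b x c d', 'a b c d')
err_types = ['correct', 'correct', 'insertion', 'correct', 'correct']
pieces = [handler.Render(err_type=e) for e in err_types]
print(pieces)
```

Counting reference words rather than hypothesis words keeps the line breaks aligned when two hypotheses for the same reference are shown side by side.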