Evaluating Phonetic Search KWS

A complexity reduction algorithm should be evaluated carefully to assess its performance and usability. A basic evaluation measures two aspects in comparison to the exhaustive search: the relative decrease in computational complexity, and the relative change (increase or decrease) in recognition performance in the reduced computational complexity mode. In addition, the phonetic search itself is evaluated using textual data that is transformed into a sequence of phonemes, simulating 100% phoneme recognition. The next section describes the performance metrics, evaluation process, and evaluation databases, while the following section describes the evaluation results.
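The two relative measures mentioned above can be illustrated with a small sketch. The function names, the choice of cost unit, and the use of percentages are assumptions made for illustration; the text does not prescribe a specific formula.

```python
def relative_decrease(exhaustive_cost: float, reduced_cost: float) -> float:
    """Percentage decrease in cost relative to the exhaustive search.

    The cost unit (operations, CPU seconds, ...) is an assumption; any
    unit works as long as both values use the same one.
    """
    return 100.0 * (exhaustive_cost - reduced_cost) / exhaustive_cost


def relative_change(exhaustive_score: float, reduced_score: float) -> float:
    """Signed percentage change in a recognition score (e.g. DR).

    Positive means the reduced-complexity mode scored higher than the
    exhaustive search, negative means it scored lower.
    """
    return 100.0 * (reduced_score - exhaustive_score) / exhaustive_score


if __name__ == "__main__":
    # Hypothetical numbers, not results from the text.
    print(relative_decrease(1000.0, 250.0))  # cost dropped from 1000 to 250
    print(relative_change(80.0, 76.0))       # score dropped from 80 to 76
```

Reporting both numbers side by side makes the trade-off explicit: a large decrease in cost is only useful if the accompanying change in recognition performance is acceptable.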

5.1 Performance Metrics

The success of a keyword spotting task is measured on the basis of two main metrics: the percentage of words correctly recognized, referred to as the Detection Rate (DR), and the number of false alarms per hour per vocabulary word, referred to as the False Alarm Rate (FAR). To better explain these metrics, consider a speech signal (or database) and a given KWS vocabulary, and define:

  W = {Wi}: the sequence of words describing the content of the speech signal
  DS: the duration in hours of the speech signal
  KWS-VOC-SIZE: the size of the KWS vocabulary
  KW-NUM: the number of actual keywords included in the speech signal (with all their repetitions)

A. Moyal et al., Phonetic Search Methods for Large Speech Databases, SpringerBriefs in Speech Technology, DOI 10.1007/978-1-4614-6489-1_5, © The Author(s) 2013


Now, consider the output of a KWS engine, which is a list of recognized keywords, and define:

  KW-OUT: the total number of keywords detected by the engine
  KW-OUT-CORRECT: the number of keywords in the speech signal correctly recognized
  KW-OUT-NOT: the number of keywords in the speech signal that were not recognized
  KW-OUT-FA: the number of keywords in the output that represent false alarms

Based on these, the following metrics can be defined:

  Detection Rate:    DR = (KW-OUT-CORRECT / KW-NUM) * 100
  Missed Detection:  MD = (KW-OUT-NOT / KW-NUM) * 100
  False Alarm:       FA = (KW-OUT-FA / KW-NUM) * 100
  False Alarm Rate:  FAR = KW-OUT-FA / (DS * KWS-VOC-SIZE)
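As a minimal sketch, the four metrics can be computed directly from the counts defined above. The class and field names below are illustrative Python spellings of the quantities in the text (for example, `kw_out_fa` for KW-OUT-FA), and the numbers in the usage example are hypothetical, not results from the evaluation.

```python
from dataclasses import dataclass


@dataclass
class KwsCounts:
    """Counts from the text; field names are illustrative spellings."""
    kw_num: int          # KW-NUM: keyword occurrences in the speech signal
    kw_out_correct: int  # KW-OUT-CORRECT: keywords correctly recognized
    kw_out_not: int      # KW-OUT-NOT: keywords that were not recognized
    kw_out_fa: int       # KW-OUT-FA: false alarms in the engine output
    ds_hours: float      # DS: duration of the speech signal in hours
    voc_size: int        # KWS-VOC-SIZE: size of the KWS vocabulary


def kws_metrics(c: KwsCounts) -> dict:
    """Return DR, MD, FA (percent) and FAR (false alarms/hour/keyword)."""
    return {
        "DR": 100.0 * c.kw_out_correct / c.kw_num,
        "MD": 100.0 * c.kw_out_not / c.kw_num,
        "FA": 100.0 * c.kw_out_fa / c.kw_num,
        "FAR": c.kw_out_fa / (c.ds_hours * c.voc_size),
    }


if __name__ == "__main__":
    # Hypothetical counts: 200 keyword occurrences in 2 hours of speech,
    # searched with a 50-word vocabulary.
    counts = KwsCounts(kw_num=200, kw_out_correct=160, kw_out_not=40,
                       kw_out_fa=30, ds_hours=2.0, voc_size=50)
    for name, value in kws_metrics(counts).items():
        print(f"{name}: {value:.2f}")
```

Note that FA is normalized by the number of keyword occurrences, while FAR is normalized by signal duration and vocabulary size, so FAR stays comparable across databases of different lengths and vocabularies.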

Fig. 12 FAR and DR as a function of threshold (horizontal axis: Threshold; vertical axes: DR [%] and FAR; marked points P1, P2, and P3)

The DR and FAR values are used to determine the working point for the KWS engine. The working point performance is usually based on a distance threshold that controls the tradeoff between the DR and FAR. The KWS engine generates KWS hypotheses, each with a distance value. A threshold for the distance value allows the system to determine whether a given keyword hypothesis will be labeled as a spotted word or rejected. Plotting the DR and FAR values as a function of the distance level on a graph enables an analysis of the KWS engine performance a