Support System for Lecture Captioning Using Keyword Detection by Automatic Speech Recognition


1 Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Japan
[email protected], {matumoto,kudo,ohnishi}@is.nagoya-u.ac.jp
2 Department of Information Systems, School of Informatics, Daido University, 10-3 Takiharu-cho, Minami-ku, Nagoya 457-8530, Japan
[email protected]

Abstract. We propose a support system for lecture captioning. The system detects the keywords of a lecture and presents them to captionists. Captionists can thus follow what the instructor said even when they do not recognize the keywords themselves, and can input a keyword rapidly by pressing the corresponding function key. The system detects the keywords by automatic speech recognition (ASR). To improve the keyword detection rate, we adapt the ASR language model using web documents. We collected 2,700 web documents, which include 1.2 million words and 5,800 sentences. In an experiment on keyword detection in a real lecture, the system achieved an F-measure of 0.957, higher than the 0.871 obtained with a base language model.

Keywords: Speech recognition · Language model · Lectures · Keyword detection · Hearing-impaired
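As a rough illustration of how an F-measure for keyword detection can be computed, the following Python sketch matches a recognized word sequence against a keyword list and scores the result. It is a minimal sketch under stated assumptions, not the evaluation procedure of the paper: the function names `detect_keywords` and `f_measure`, the example words, and the choice to score distinct keywords rather than individual occurrences are all hypothetical.

```python
def detect_keywords(recognized_words, keyword_list):
    """Return keywords found in the ASR output, in order of first appearance.

    Assumption: exact string matching on already-tokenized ASR output.
    """
    vocabulary = set(keyword_list)
    found = []
    for word in recognized_words:
        if word in vocabulary and word not in found:
            found.append(word)
    return found


def f_measure(detected, reference):
    """Return (precision, recall, F-measure) over distinct keywords."""
    detected, reference = set(detected), set(reference)
    true_positives = len(detected & reference)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: the instructor said three keywords,
# but the recognizer output contains only two of them.
keyword_list = ["ngram", "smoothing", "perplexity"]
recognized = ["the", "language", "model", "uses", "ngram", "smoothing"]
spoken_keywords = ["ngram", "smoothing", "perplexity"]

detected = detect_keywords(recognized, keyword_list)
precision, recall, f = f_measure(detected, spoken_keywords)
print(detected, precision, recall, f)
```

In this toy case every detected keyword is correct but one spoken keyword is missed, so precision is perfect while recall, and therefore the F-measure, is lower.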

1 Introduction

Hearing-impaired students often need complementary technologies, such as sign-language interpretation and PC captioning, to enable them to fully understand college lectures. PC captioning is a method in which supporters transcribe an instructor's speech by typing on a keyboard in real time. Figure 1 shows captions being presented by PC captioning during a lecture. The caption is displayed next to the lecture slide. In this lecture, captionists in a remote location type in real time while watching a lecture video sent from the classroom [1]. To enable high-speed input, captionists usually type captions in pairs. In addition, captionists obtain lecture materials such as presentation slides in advance and prepare for captioning. For example, the predictive conversion of kana and kanji (Japanese characters) is trained by inputting texts from the lecture materials. Such preparation is necessary for good PC captioning, which is already being used at several universities, and various groups have done work in this area.

© Springer International Publishing Switzerland 2016. K. Miesenberger et al. (Eds.): ICCHP 2016, Part II, LNCS 9759, pp. 377–383, 2016. DOI: 10.1007/978-3-319-41267-2_53

N. Ikeda et al.

Fig. 1. State of caption presentation by PC captioning

University lectures often deal with technical content, so it is desirable that captionists understand the lecture content. However, it is sometimes difficult to secure such captionists, and there are many cases in which out-of-field volunteers perform PC captioning. For these volunteers, it is very difficult to listen to such lectures and then accurately input unfamiliar words such as technical terms or proper nouns. Kato et al. proposed a system for presenting keywords in a lecture [2]. They concluded that presenting such keywords to captionists is effective for improving captioning; however, this system requir
