Segmentation Strategies for Passage Retrieval from Internet Video using Speech Transcripts

We compare the effect of different segmentation strategies on passage retrieval from user-generated internet video. We consider retrieval of passages for rather abstract and complex queries that go beyond finding a certain object or constellation of objects in the visual channel. Hence, the retrieval methods have to rely heavily on the recognized speech. Passage retrieval has mainly been studied as a means to improve document retrieval and to enable question answering. In these domains, the best results were obtained using passages defined by the paragraph structure of the source documents or using arbitrary overlapping passages. For the retrieval of relevant passages in a video, no author-defined paragraph structure is available. We compare retrieval results for five different types of segments: segments defined by shot boundaries, prosodic segments, fixed-length segments, a sliding window, and semantically coherent segments based on speech transcripts. We evaluated the methods on the corpus of the MediaEval 2011 Rich Speech Retrieval task. Our main conclusions are (1) that fixed-length and coherent segments are clearly superior to segments based on speaker turns or shot boundaries; (2) that the retrieval results depend strongly on the right choice of segment length; and (3) that results using the segmentation into semantically coherent parts depend much less on the segment length. In particular, the quality of fixed-length and sliding-window segmentation drops quickly as the segment length increases, while the quality of the semantically coherent segments is much more stable. Thus, if coherent segments are defined, longer segments can be used and consequently fewer segments have to be considered at retrieval time.
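
The difference between two of the strategies named in the abstract, fixed-length segments and a sliding window, can be illustrated with a minimal sketch. This is not the article's implementation: the function names, the use of whitespace tokens as the unit of length, and the `step` parameter are assumptions made for illustration only, and the abstract does not say whether segment length was measured in words or in time.

```python
from typing import List


def fixed_length_segments(tokens: List[str], length: int) -> List[List[str]]:
    """Consecutive, non-overlapping segments of `length` tokens (hypothetical helper)."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [tokens[i:i + length] for i in range(0, len(tokens), length)]


def sliding_window_segments(tokens: List[str], length: int, step: int) -> List[List[str]]:
    """Overlapping segments of `length` tokens, starting every `step` tokens.

    The last window may be shorter so that the end of the transcript is covered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if not tokens:
        return []
    segments = []
    start = 0
    while True:
        segments.append(tokens[start:start + length])
        if start + length >= len(tokens):
            break  # this window already reaches the end of the transcript
        start += step
    return segments


# Toy transcript; a real one would come from speech recognition output.
transcript = "a b c d e f g".split()
fixed = fixed_length_segments(transcript, 3)
windows = sliding_window_segments(transcript, 4, 2)
```

With a step smaller than the window length, the sliding window yields more candidate passages than the fixed-length split for the same transcript, which is the kind of trade-off between segment count and segment length that the abstract's last sentence refers to.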

Metadata
Author:Christian Wartena
URN:urn:nbn:de:bsz:960-opus4-11126
ISSN:0972-7272
Parent Title (English):Journal of Digital Information Management
Document Type:Article
Language:English
Year of Completion:2013
Release Date:2017/07/10
Tag:Multimedia Retrieval; Text Segmentation; User Generated Content; Video Segmentation
Volume:2013
Issue:11(6)
First Page:400
Last Page:408
Institutes:Fakultät III - Medien, Information und Design
Dewey Decimal Classification:020 Library and information sciences
Licence (German):Copyright notice