(Conference Paper)

Metadata for Integrating Chinese Text and Speech Documents in a Multimedia Retrieval System

Yue-Shi Lee and Hsin-Hsi Chen

Department of Computer Science and Information Engineering

National Taiwan University

Taipei, Taiwan, R.O.C.


Multimedia documents place new requirements on conventional text retrieval systems. This paper presents a multimedia retrieval system that uses a content-based strategy to retrieve both text and speech documents. Its input can be either a sequence of spoken words, given as digitized waveforms, or a sequence of characters; its output is a ranked list of text and/or speech documents. The system introduces a new form of metadata designed for both text and speech documents. The metadata is generated automatically, with special consideration of the characteristics of Chinese. The approach is easy to implement, and preliminary tests give encouraging results.