国际中文语言资源联盟
Chinese Corpus Consortium
About | Membership | Corpora | Resources | Links

Welcome to the homepage of the Chinese Corpus Consortium! You can return to this page at any time by clicking the CCC logo at the top of each page.

Speech and text databases are increasingly important for the research and development of automatic speech recognition (ASR), text-to-speech (TTS), natural language processing (NLP), and other language-related technologies. Several existing consortia and associations are devoted to collecting and distributing such data resources, including the Linguistic Data Consortium, the European Language Resources Association, and the Gengo Shigen Kyouyuukikou (GSK) Language Resource Consortium.

The aim of the CCC is to provide corpora for Chinese ASR, TTS, NLP, perception analysis, phonetic analysis, linguistic analysis, and other related tasks. The corpora may be speech- or text-based, read or spontaneous, wideband or narrowband, standard or dialectal Chinese, clean or noisy, or of any other kind deemed helpful for these purposes.

Numerous corpora are currently available from the CCC.

 

Copyright (C) Chinese Corpus Consortium. All Rights Reserved.