기초 어휘 선정을 위한 형태소 분석기의 오류 유형에 대하여

On the Error Types of Morpheme Analyzer for the Decision of Basic Vocabulary


This paper investigates the error types of the morpheme analyzer ‘Utagger,’ which can differentiate homonyms, with a view to selecting basic vocabulary. The samples analyzed consist of a 10,000-word corpus of written Korean and a 6,000-word corpus of spoken Korean. The analysis showed that the error rate for spoken Korean is higher than that for written Korean. The range of error types in spoken Korean was also shown to be much wider than in written Korean. These error types deserve special attention because they may significantly affect both the selection and the classification of basic vocabulary.
