Name:
DANSK DS/ISO 24614-2 PDF
Published Date:
09/12/2011
Status:
Active
Publisher:
Dansk Standard
The basic concepts and general principles of word segmentation defined in Part 1 are applied to Chinese, Japanese and Korean (CJK). The objective of word segmentation is to meet the requirements of computational applications of language resources, of natural language processing, and of other specific applications such as information retrieval (IR) and machine translation (MT). Part 2 is restricted to the particular task of word segmentation, which is distinct from morphological or syntactic analysis per se, although word segmentation depends heavily on morpho-syntactic analysis. The main task of Part 2 is to define the word segmentation unit (WSU) for Chinese, Japanese and Korean. Although the three languages are related to each other at the lexical level, each has distinct structural characteristics, and these differences have to be reflected in the definition of word segmentation and in its practical guidelines. Because the three languages share similarities in words composed of Chinese characters, the general rules for identifying WSUs in Chinese text can also be applied, to some extent, to the processing of Japanese and Korean.
| Edition : | 11 |
| File Size : | 1 file, 640 KB |
| Number of Pages : | 54 |
| Product Code(s) : | DS-051 |
| Published : | 09/12/2011 |