PAN localization : a study on collation of languages from developing Asia

Date

2008

Publisher

Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences, Lahore, PK

Abstract

The collation of every written language is defined in its dictionaries, developed over centuries, and is thus highly representative of cultural tradition. However, although collation is well understood within these cultures, it is not always thoroughly documented or well understood in the context of existing character encodings, especially Unicode. This volume addresses the complex algorithms needed to sort words into sequence for a small but diverse set of scripts and languages chosen from the developing Asian region. The set was chosen for the variety it exhibits and for the challenges it poses for collation. This work should be taken as an initial step towards addressing the collation of languages in the region: more remains to be said about the collation of these languages, and many more languages still need to be documented. The data on the different languages were obtained from dictionaries published in these languages and through interaction with the PAN Localization project teams in the relevant countries.

Description

Copublished with and copyrighted by the International Development Research Centre

Keywords

ASIAN LANGUAGE COMPUTING, LOCALIZATION, COLLATION, SORTING, UNICODE COLLATION ALGORITHM, ACCESS TO INFORMATION, INTERNET, ASIAN LANGUAGES, COMPUTER PROGRAMS, SOFTWARE ENGINEERING, INFORMATION SOCIETY, ASIA
