Hindi-Urdu Transliteration
Written on 08 July 2016

Hindi-Urdu transliteration has always been treated as a special case among Indic script transliterations. It has received a lot of attention from the NLP research community of South Asia. It is seen as a way to break the script barrier that makes the two look different, although they are facets of the same language.

K-Best Transliterations
Written on 28 June 2016

No matter how good the ML approach used to model a problem, it can never guarantee 100% accuracy, or even come close. The accuracy of the model mostly depends on the number of training samples, how strongly those samples bear on the variable being predicted, and the ML algorithm selected to train the model. We can only approach the best possible results by selecting a proper ML technique, which itself depends on the problem definition.

Roman-Indic Transliteration
Written on 14 June 2016

Since I am using an ML approach to develop the Roman-Indic transliteration systems, I first need to create the training data. As mentioned in the proposal, I use the sentence-aligned ILCI parallel corpora and Indo-wordnet synsets to extract the transliteration pairs. Initially, the parallel corpus is word-aligned using GIZA++, and the alignments are refined using the grow-diag-final-and heuristic. I then extract all word pairs that occur as 1-to-1 alignments in the word-aligned corpus as potential transliteration equivalents.

Transliteration: Background & Approaches
Written on

Transliteration simply means conversion of a text from one script to another. Machine transliteration is the automated process of transcribing a character or word from one script to another. It can play an important role in natural language applications such as information retrieval, machine translation, data mining and other cross-language applications, especially for handling proper nouns and technical terms.

I feel privileged to have been given the opportunity to work for the Libindic organization under Google Summer of Code 2016. Libindic is an open source library that supports many utilities for text processing of Indian languages. I will be contributing towards automatic script transliteration between the scheduled languages of India, including English. For the project, I will be mentored by Riyaz Ahmad Bhat and Santhosh Thottingal.

During the community bonding period I explored some of the existing tools for Indic transliteration. I created different test cases to evaluate these existing systems and to gauge their strengths and weaknesses.
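
To make the idea of test cases for existing transliteration systems more concrete, here is a minimal sketch of how such an evaluation could be organised. It is not the harness used in the project: the function names (`evaluate`, `toy_system`), the category labels, and the sample word pairs are all hypothetical, and exact-match accuracy is just one simple metric assumed for illustration. Any real system could be plugged in as a function from a source word to its transliteration.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

# (source word, expected transliteration, category) -- hypothetical test cases
TestCase = Tuple[str, str, str]

TEST_CASES: List[TestCase] = [
    ("namaste", "नमस्ते", "common"),
    ("bharat", "भारत", "named_entity"),
    ("dilli", "दिल्ली", "named_entity"),
]


def evaluate(system: Callable[[str], str],
             cases: List[TestCase]) -> Dict[str, float]:
    """Return exact-match accuracy per category for one system."""
    total: Dict[str, int] = defaultdict(int)
    correct: Dict[str, int] = defaultdict(int)
    for source, expected, category in cases:
        total[category] += 1
        if system(source) == expected:
            correct[category] += 1
    return {category: correct[category] / total[category] for category in total}


# Stand-in for a real transliteration tool: a lookup table that leaves
# unknown words unchanged. Purely illustrative.
TOY_LEXICON = {"namaste": "नमस्ते", "bharat": "भारत"}


def toy_system(word: str) -> str:
    return TOY_LEXICON.get(word, word)


if __name__ == "__main__":
    for category, accuracy in evaluate(toy_system, TEST_CASES).items():
        print(f"{category}: {accuracy:.2f}")
```

Grouping the cases by category is what exposes strengths and weaknesses: a system may do well on common words yet fail on named entities, and a single overall score would hide that.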