research-article

Word Prediction System for Text Entry in Hindi

Published: 01 June 2014

Abstract

Word prediction is widely regarded as an effective technique for increasing the text entry rate. Existing word prediction systems predict a word only when the user correctly enters its first few characters; if the user makes an error in that initial input, prediction fails. There is therefore a need for a word prediction system that predicts the desired word while coping with errors in the initial entries. This requirement is especially relevant to text entry in Indian languages, which have large alphabets, words with complex characters and inflections, and sets of phonetically similar characters. Text composition in Indian languages consequently involves frequent spelling errors, which makes developing an efficient word prediction system challenging. In this article, we address this problem and propose a novel word prediction system. The proposed approach has been evaluated with Hindi, an official language of India. Experiments with users show 43.77% keystroke savings, a 92.49% hit rate, and 95.82% prediction utilization with the proposed system. The system also reduces spelling errors by 89.75%.
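The abstract reports keystroke savings and hit rate without defining them. The sketch below shows one common way such percentages are computed; the formulas, function names, and sample counts are illustrative assumptions, not the definitions or data used in the article, which may differ.

```python
def keystroke_savings(keys_without: int, keys_with: int) -> float:
    """Percentage of keystrokes saved relative to typing every character.

    Assumed definition: 100 * (baseline - actual) / baseline.
    """
    if keys_without <= 0:
        raise ValueError("keys_without must be positive")
    return 100.0 * (keys_without - keys_with) / keys_without


def hit_rate(words_hit: int, words_total: int) -> float:
    """Percentage of words for which the desired word appeared in the
    prediction list (assumed definition)."""
    if words_total <= 0:
        raise ValueError("words_total must be positive")
    return 100.0 * words_hit / words_total


# Hypothetical session counts, not taken from the article.
savings = keystroke_savings(keys_without=1000, keys_with=600)
hits = hit_rate(words_hit=90, words_total=100)
print(f"keystroke savings: {savings:.2f}%")
print(f"hit rate: {hits:.2f}%")
```

Under these assumed definitions, a session that needs 600 keystrokes instead of 1000 yields 40% keystroke savings.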

      • Published in

        ACM Transactions on Asian Language Information Processing, Volume 13, Issue 2
        June 2014
        99 pages
        ISSN: 1530-0226
        EISSN: 1558-3430
        DOI: 10.1145/2636326

        Copyright © 2014 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 1 June 2014
        • Accepted: 1 February 2014
        • Revised: 1 January 2014
        • Received: 1 May 2013

        Qualifiers

        • research-article
        • Research
        • Refereed
