In:
World Englishes, Wiley
Abstract:
It is well‐known that Outer Circle English has undergone extensive contact‐induced lexical and grammatical restructuring. Is it possible to use common NLP tools developed for Inner Circle English to process Outer Circle English texts? Here, we report our experience of using the Stanford PoS tagger to tag the Singaporean component of the International Corpus of English (ICE‐SIN). We isolate two major contact‐related causes of tagging errors: (1) lexical and grammatical loans directly borrowed from the local languages; and (2) English‐origin words with new grammatical meanings acquired from the local languages. While the first type may be easy to overcome, the latter type is intractable, creating an extra layer of morphosyntactic complexity. We achieved comparable accuracy rates in the more formal registers, and a lower but still decent 88% in the informal register of private conversations. A tagged ICE‐SIN allows us to investigate lexical and grammatical restructuring at unprecedented levels of detail.
Type of Medium:
Online Resource
ISSN:
0883-2919, 1467-971X
Language:
English
Publisher:
Wiley
Publication Date:
2022
ZDB-ID:
1495564-7, 2271256-2
SSG:
5,3