The 4th Named Entities Workshop (NEWS) 2012
- Shared Task on Transliteration
An ACL 2012 Workshop, 12 July 2012, Jeju, Republic of Korea

Latest workshop updates

12 Jun 2012

Workshop Program

Workshop Program is now available.

 

9 May 2012

Accepted Paper List

Accepted Paper List is now available.

 

28 Mar 2012

Submission of Results closes

Submission of Results has been closed.

 

20 Mar 2012

Submission of results extended

The deadline for submission of results has been extended to 25 March 2012, 23:59 Singapore Time.

 

20 Mar 2012

Shared task results are available

Shared task results are now available.

 

14 Mar 2012

Result Submission Portal

Result Submission Portal is now open.

 

12 Mar 2012

Test data for transliteration generation tasks is available

Test data is available for the transliteration generation tasks. It can be downloaded from the corpora page. You can download test data only for the tasks for which you have registered by downloading or requesting the training and development data.

 

11 Mar 2012

Registration Closed

Registration has been closed.

 

16 Feb 2012

English to Persian (EnPe) Corpus

The mappings of Persian characters to the codes used in the NEWS 2012 evaluation are:

Persian_Char_Unicode.txt

Persian_Char_Unicode.pdf

 

3 Feb 2012

Persian data

Note: If you use Persian data, please cite the following paper:

S. Karimi, A. Turpin, F. Scholer, Corpus Effects on the Evaluation of Automated Transliteration Systems, The 45th Annual Meeting of the Association for Computational Linguistics (ACL'07), pages 640-647, Prague, Czech Republic, June 2007

 

27 Jan 2012

Notes: EnJa, EnKo, JnJk, and ArEn Corpora from CJK Institute

Participants are required to obtain a license for the training and development data for the EnJa, EnKo, JnJk, and ArEn language pairs directly from CJK Institute, at a nominal fee of USD 300.00, after filling in the online request forms.

If you are interested in any of the above four subtasks, a flat licensing fee of USD 300.00 applies for up to four datasets.

In other words, you still need to pay the USD 300.00 licensing fee even if you are interested in only one dataset.

 

20 Jan 2012

Training and development corpora for each language pair are now available

Training and development corpora for each language pair are now available. The English-Japanese (EnJa, JnJk), English-Arabic (ArEn) and English-Korean (EnKo) corpora can be obtained on request from CJK Institute; other corpora can be downloaded from this website after agreeing to the license terms and conditions of the data owners.

Please make sure you submit your final runs after you download the NEWS 2012 datasets!

Please use the password we emailed to you to extract the corpus datasets downloaded online!

 

19 Jan 2012

Whitepaper on Machine Transliteration

The Whitepaper on Machine Transliteration is available.

 

18 Jan 2012

Registration Opens

Registration is now open.

Please provide a valid email address to receive the password for extracting the corpus datasets!

Registration for the task is required!

Please make sure you submit your final runs after you download the NEWS 2012 datasets!

 

18 Jan 2012

Paper Submission Portal

Paper Submission Portal is now available.

 

4 Jan 2012

Important dates

Important dates for research papers, the shared task, and task papers are now available.

 

3 Jan 2012

Accepted by ACL 2012

The 4th Named Entities Workshop (NEWS) 2012 - Shared Task on Transliteration has been accepted as an ACL 2012 workshop and is scheduled for 12 July in Jeju, Republic of Korea.

 

6 Dec 2011

Website Launched

The NEWS group website has been launched.