Johann Roturier | NortonLifeLock

Johann Roturier
Researcher

Johann Roturier's current research interests lie at the intersection of natural language processing, localization, and human factors in security. He completed his Ph.D. in 2007 with a thesis investigating the impact of controlled language rules on various characteristics of machine-translated documentation. Since then, he has transferred some of his research findings into production processes, co-authored several papers and patents, and worked with multiple product teams.

During that time, he has also authored a book on localization (published by Routledge), taken part in standardization activities, served on numerous program committees for top-tier conferences (e.g., ACL, NAACL, EMNLP), co-supervised several Ph.D. students in Computer Science and Applied Language, and acted as the scientific representative of the FP7 ACCEPT collaborative research project.

Selected Academic Papers

Foreebank: Syntactic Analysis of Customer Support Forums

Rasoul Kaljahi, Jennifer Foster, Johann Roturier, Corentin Ribeyre, Teresa Lynn, Joseph Le Roux

In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP 2015)
We present a new treebank of English and French technical forum content which has been annotated for grammatical errors and phrase structure. This double annotation allows us to empirically measure the effect of errors on parsing performance. While it is slightly easier to parse the corrected versions of the forum sentences, the errors are not the main factor in making this kind of text hard to parse.
This paper introduces the Foreebank data set, a data set created for training user-generated content parsers.

Evaluation of Machine-Translated User Generated Content: A pilot study based on User Ratings

Linda Mitchell, Johann Roturier

In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT 2012)

Examining the Adoption and Abandonment of Security, Privacy, and Identity Theft Protection Practices

Yixin Zou, Kevin A. Roundy, Acar Tamersoy, Saurabh Shintre, Johann Roturier, Florian Schaub

In Proceedings of ACM CHI Conference on Human Factors in Computing Systems (CHI 2020) (Honorable Mention Award)
Our online survey of 902 individuals studies the reasons for which users struggle to adhere to expert-recommended security, privacy, and identity-protection practices. We examined 30 of these practices, finding that gender, education, technical background, and prior negative experiences correlate with practice adoption levels. We found that practices were abandoned when they were perceived as low-value, inconvenient, or when overridden by subjective judgment. We discuss how tools and expert recommendations can better align to user needs.

DCU-Symantec Submission for the WMT 2012 Quality Estimation Task

Raphaël Rubino, Johann Roturier, Rasoul Samad Zadeh Kaljahi, Fred Hollowood, Jennifer Foster, Joachim Wagner

In Proceedings of the 7th Workshop on Statistical Machine Translation (NAACL 2012)

Localizing Apps: A practical guide for translators and translation students

Johann Roturier

Published by Routledge

A Detailed Analysis of Phrase-based and Syntax-based Machine Translation: The Search for Systematic Differences

Rasoul Samad Zadeh Kaljahi, Raphael Rubino, Johann Roturier, Jennifer Foster

In Proceedings of the 10th Conference of the Association for Machine Translation in the Americas (AMTA 2012)

DCU-Symantec at the WMT 2013 Quality Estimation Shared Task

Raphaël Rubino, Johann Roturier, Rasoul Samad Zadeh Kaljahi, Fred Hollowood, Jennifer Foster, Joachim Wagner

In Proceedings of the 8th Workshop on Statistical Machine Translation (ACL 2013)

Using Automatic Machine Translation Metrics to Analyze the Impact of Source Reformulations

Johann Roturier, Linda Mitchell, Robert Grabowski, Melanie Siegel

In Proceedings of the 10th Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2012)

Syntax and Semantics in Quality Estimation of Machine Translation

Rasoul Kaljahi, Jennifer Foster, Johann Roturier

In Proceedings of the 8th Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8)

Evaluation of MT systems to translate user generated content

Johann Roturier, Anthony Bensadoun

In Proceedings of the 13th Machine Translation Summit (MT Summit XIII)

Community-based post-editing of machine-translated content: monolingual vs. bilingual

Linda Mitchell, Johann Roturier, Sharon O’Brien

In Proceedings of the 2nd MT Summit XIV Workshop on Post-editing Technology and Practice (WPTP 2013)

The ACCEPT Post-Editing environment: a flexible and customisable online tool to perform and analyse machine translation post-editing

Johann Roturier, Linda Mitchell, David Silva

In Proceedings of the 14th Machine Translation Summit (MT Summit 2013)

Domain adaptation in statistical machine translation of user-forum data using component-level mixture modeling

Pratyush Banerjee, Sudip Kumar Naskar, Johann Roturier, Andy Way, Josef Van Genabith

In Proceedings of the 13th Machine Translation Summit (MT Summit XIII)

Translation Quality-Based Supplementary Data Selection by Incremental Update of Translation Models

Pratyush Banerjee, Sudip Kumar Naskar, Andy Way, Josef van Genabith, Johann Roturier

In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012)

Domain Adaptation in SMT of User-Generated Forum Content Guided by OOV Word Reduction: Normalization and/or Supplementary Data?

Pratyush Banerjee, Sudip Kumar Naskar, Andy Way, Josef van Genabith, Johann Roturier

In Proceedings of the 16th Annual Conference of the European Association for Machine Translation (EAMT 2012)

Quality Estimation of English-French Machine Translation: A Detailed Study of the Role of Syntax

Rasoul Kaljahi, Jennifer Foster, Johann Roturier, Raphael Rubino

In Proceedings of the 25th International Conference on Computational Linguistics (COLING 2014)

Bootstrapping a Natural Language Interface to a Cyber Security Event Collection System using a Hybrid Translation Approach

Johann Roturier, Brian Schlatter, David Silva

In Proceedings of the 17th Machine Translation Summit (MT Summit XVII)
We present a system that can be used to generate Elasticsearch (database) query strings for English-speaking cyberthreat hunters, security analysts or responders (agents) using a natural language interface.

Quality Estimation-guided Data Selection for Domain Adaptation of SMT

Pratyush Banerjee, Raphael Rubino, Johann Roturier, Josef van Genabith

In Proceedings of the 14th Machine Translation Summit (MT Summit 2013)

Who Knows I Like Jelly Beans? An Investigation Into Search Privacy

Daniel Kats, David Luz Silva, and Johann Roturier

In Proceedings of the 22nd Privacy Enhancing Technologies Symposium (PETS 2022)