CamemBERT: a Tasty French Language Model

We release CamemBERT, a Tasty French Language Model. CamemBERT is trained on 138 GB of French text. It establishes a new state of the art in POS tagging, dependency parsing and NER, and achieves strong results in NLI. Bon appétit!

ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations

To simplify a sentence, human editors perform multiple rewriting transformations: they split it into several shorter sentences, paraphrase words (i.e., replace complex words or phrases with simpler synonyms), reorder components, and/or delete …

CamemBERT Contextual Language Models for French: Impact of the Size and Heterogeneity of the Training Data

We explore the impact of the training data size and heterogeneity on French language modeling.

Controllable Sentence Simplification

Text simplification aims at making a text easier to read and understand by simplifying grammar and structure while keeping the underlying information identical. It is often considered an all-purpose generic task where the same simplification is …

EASSE: Easier Automatic Sentence Simplification Evaluation

We introduce EASSE, a Python package aiming to facilitate and standardise automatic evaluation and comparison of Sentence Simplification (SS) systems. EASSE provides a single access point to a broad range of evaluation resources: standard automatic …

Reference-less Quality Estimation of Text Simplification Systems

ELMoLex: Connecting ELMo and Lexicon Features for Dependency Parsing

In this paper, we present the details of the neural dependency parser and the neural tagger submitted by our team ParisNLP to the CoNLL 2018 Shared Task on parsing from raw text to Universal Dependencies. We augment the deep Biaffine (BiAF) parser …