TL;DR:

In this paper, we described our submission to the BabyLM Challenge and investigated sample-efficient pretraining strategies. On the data side, we focused on improving data utilization, specifically through different batching strategies for training. Our findings indicated that the formatting of the input data can significantly improve downstream task performance. On the modelling side, we proposed part-of-speech augmentation to enrich the training signals derived from the datasets, and we showed that inducing structural biases in the model through part-of-speech trees yields modest benefits.
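As a rough illustration of what the part-of-speech augmentation mentioned above could look like in practice, here is a minimal sketch that interleaves each token with its POS tag before the text is passed on for pretraining. The tagger (NLTK), the token/TAG output format, and the function name are assumptions for illustration, not the paper's actual pipeline.

```python
# Hypothetical sketch: annotate pretraining text with POS tags.
# Requires NLTK plus its tokenizer/tagger resources, e.g.
#   pip install nltk
#   python -m nltk.downloader punkt averaged_perceptron_tagger
# (resource names can differ slightly across NLTK versions).
import nltk


def augment_with_pos(sentence: str) -> str:
    """Interleave each token with its POS tag, e.g. 'The/DT cat/NN sat/VBD'."""
    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)  # [(token, tag), ...]
    return " ".join(f"{tok}/{tag}" for tok, tag in tagged)


if __name__ == "__main__":
    print(augment_with_pos("The child saw the dog in the park."))
    # Expected output along the lines of:
    # The/DT child/NN saw/VBD the/DT dog/NN in/IN the/DT park/NN ./.
```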

Poster: