Application of Transformer-based Language Models to Detect Hate Speech in Social Media
Keywords:transformer, hate speech, social media, RoBERTa, XLNet, fine-tuning, natural language processing
Detecting and removing hateful speech in various online social media is a challenging task. Researchers tried to solve this problem by using both classical and deep learning methods, which are found to have limitations in terms of the requirement of extensive hand-crafted features, model architecture design, and pretrained embeddings, that are not very proficient in capturing semantic relations between words. Therefore, in this paper, we tackle the problem using Transformer-based pretrained language models which are specially designed to produce contextual embeddings of text sequences. We have evaluated two such models—RoBERTa and XLNet—using four publicly available datasets from different social media platforms and compared them to the existing baselines. Our investigation shows that the Transformer-based models either surpass or match all of the existing baseline scores by significant margins obtained by previously used models such as 1-dimensional convolutional neural network (1D-CNN) and long short-term memory (LSTM). The Transformer-based models proved to be more robust by achieving native performance when trained and tested on two different datasets. Our investigation also revealed that variations in the characteristics of the data produce significantly different results with the same model. From the experimental observations, we are able to establish that Transformer-based language models exhibit superior performance than their conventional counterparts at a fraction of the computation cost and minimal need for complex model engineering.