Hate Me Not: Detecting Hate Inducing Memes in Code Switched Languages


Abstract

The growth of social media has been accompanied by a surge in hateful content posted online. In multilingual countries such as India, these posts are often written not in a single language but in a blend of languages known as a code-switched language. In India, Hinglish (a mix of Hindi and English) is the most popular such blend, and hate speech detection in it is especially difficult because the language has no fixed spelling, grammar, or semantics. Hate speech is also frequently combined with images to form memes, which leave a long-lasting impression on the human mind. This poses a substantial threat to users, who often become victims of derogatory speech and abuse on such platforms. In this paper, we take up the task of hate and offense detection in multimodal data, i.e., images (memes) that contain text in code-switched languages. We first present a novel, triply annotated Indian Political Memes (IPM) dataset, which comprises memes about Indian political events that have taken place since independence, classified into three distinct categories. We also propose a two-channel CNN-cum-LSTM model in which the images are processed by a CNN, the text extracted from the images is processed by an LSTM, and the two outputs are then recombined to produce the final result. This model outperforms all other models, giving state-of-the-art results in offensive image (meme) classification for the code-switched language Hinglish. We also release the code, model, and IPM dataset for research purposes.
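The recombination step of the two-channel design can be illustrated with a small sketch. The abstract does not state how the two channels are merged, so the example below assumes one common choice: late fusion by concatenating the two feature vectors and applying a single linear layer followed by a softmax over three classes. The function names, feature sizes, and random weights are hypothetical, and the CNN and LSTM outputs are replaced by stand-in vectors; this is not the authors' released implementation.

```python
import math
import random


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(image_features, text_features, weights, biases):
    """Late fusion (an assumption): concatenate both channels, then one linear layer.

    image_features: stand-in for the CNN channel's output vector.
    text_features:  stand-in for the LSTM channel's output vector.
    weights:        one row per class, each as long as the fused vector.
    """
    fused = list(image_features) + list(text_features)
    scores = []
    for row, bias in zip(weights, biases):
        if len(row) != len(fused):
            raise ValueError("weight row length must match fused vector length")
        scores.append(sum(w * x for w, x in zip(row, fused)) + bias)
    return softmax(scores)


# Hypothetical sizes: 4 image features, 3 text features, 3 unnamed classes.
NUM_CLASSES = 3
rng = random.Random(0)
image_features = [rng.uniform(-1, 1) for _ in range(4)]
text_features = [rng.uniform(-1, 1) for _ in range(3)]
weights = [[rng.uniform(-1, 1) for _ in range(7)] for _ in range(NUM_CLASSES)]
biases = [0.0] * NUM_CLASSES

probabilities = fuse_and_classify(image_features, text_features, weights, biases)
predicted_class = max(range(NUM_CLASSES), key=lambda i: probabilities[i])
print(probabilities, predicted_class)
```

In a trained model the weights would be learned jointly with both channels; here they are random, so the predicted class index carries no meaning beyond showing the data flow.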