Cranfield Online Research Data (CORD)

Evaluation of Cyberbullying using Optimized Multi-Stage ML Framework and NLP

conference contribution
posted on 2021-12-11, 19:02 authored by LIDA KETSBAIA, Biju Issac, Xiaomin Chen
As technology evolves, online hate is increasing, particularly on social media. It harms individuals, and victims can suffer long-lasting mental and psychological issues. Because cyberhate is a growing social problem, researchers have tried to tackle it, and one of the main approaches is to use Machine Learning to classify whether a piece of text constitutes cyberbullying. This research therefore employs a multi-stage optimized Machine Learning framework that combines two data balancing methods, Random Under-Sampling (RUS) and the Synthetic Minority Over-sampling Technique (SMOTE), with the feature selection method Principal Component Analysis (PCA) and the bio-inspired metaheuristic optimization techniques Particle Swarm Optimization (PSO) and Genetic Algorithm (GA). The framework increases the performance of the Logistic Regression classifier in detecting instances of cyberbullying. The paper also shows the potential of NLP models such as RoBERTa, XLNet and DistilBERT, with the aim of finding the most suitable model for the textual analysis of cyberhate.
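The balancing, PCA and Logistic Regression stages named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic numeric features stand in for vectorised text, the helper names (`random_undersample`, `smote_like`, `balance_and_fit`), the class-size target, the number of PCA components and the neighbour count are all assumptions chosen for illustration, and the PSO/GA optimization stage is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def random_undersample(X, y, majority, n_keep, rng):
    """RUS: keep a random subset of n_keep majority-class rows."""
    maj_idx = np.flatnonzero(y == majority)
    other_idx = np.flatnonzero(y != majority)
    kept = rng.choice(maj_idx, size=n_keep, replace=False)
    idx = np.concatenate([kept, other_idx])
    return X[idx], y[idx]


def smote_like(X, y, minority, n_new, k, rng):
    """SMOTE-style oversampling: each synthetic row is interpolated between
    a minority row and one of its k nearest minority-class neighbours."""
    X_min = X[y == minority]
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a row is never its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    synthetic = X_min[base] + gap * (X_min[picked] - X_min[base])
    X_out = np.vstack([X, synthetic])
    y_out = np.concatenate([y, np.full(n_new, minority, dtype=y.dtype)])
    return X_out, y_out


def balance_and_fit(seed=0):
    """Balance an imbalanced toy dataset, then fit PCA + Logistic Regression."""
    rng = np.random.default_rng(seed)
    # Assumption: synthetic features replace real vectorised text.
    X, y = make_classification(n_samples=2000, n_features=30,
                               n_informative=10, weights=[0.9, 0.1],
                               random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, stratify=y, test_size=0.25, random_state=seed)

    n_min = int((y_tr == 1).sum())
    target = 2 * n_min  # assumed per-class size after balancing

    # Stage 1: shrink the majority class, then grow the minority class.
    X_bal, y_bal = random_undersample(X_tr, y_tr, 0, target, rng)
    X_bal, y_bal = smote_like(X_bal, y_bal, 1, target - n_min, 5, rng)

    # Stage 2: PCA followed by the Logistic Regression classifier.
    model = make_pipeline(StandardScaler(), PCA(n_components=10),
                          LogisticRegression(max_iter=1000))
    model.fit(X_bal, y_bal)
    f1 = f1_score(y_te, model.predict(X_te))
    return y_bal, target, f1


if __name__ == "__main__":
    y_bal, target, f1 = balance_and_fit()
    print("class counts after balancing:", np.bincount(y_bal))
    print("F1 on the untouched test split:", round(f1, 3))
```

Balancing is applied to the training split only, so the test split keeps the original class imbalance. In a fuller version, a metaheuristic such as PSO or GA could search over settings like the number of PCA components or the regularisation strength; how the paper configures that search is not stated in the abstract.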

