Analysis of Hate Speech in 2024 Elections on Social Media Platforms Using Natural Language Processing (NLP) Methods
DOI:
https://doi.org/10.55123/ijisit.v1i2.5
Keywords:
Hate Speech Detection, NLP, Lexicon, Naive Bayes, Bi-LSTM
Abstract
The surge in digital activity during the 2024 General Election in Indonesia has triggered the spread of hate speech on social media, potentially disrupting democratic stability. This study analyzes the effectiveness of hate speech detection on the X platform using Natural Language Processing (NLP) methods. The technical approach combines Regular Expressions for data pre-processing with a Lexicon-Based method that identifies offensive words from a predefined dictionary. Data were collected using the Uninvolved Conversation Observation Technique (TSBLC), yielding 100 election-related comment samples. Test results indicate that this rule-based program can detect the presence of hate speech in the data samples. However, comparative analysis shows that the lexicon method yields lower accuracy than the Machine Learning models used in previous studies, such as Naive Bayes (93%) and Bi-LSTM (96.9%). In conclusion, while the lexicon approach offers structured basic detection, integrating machine learning models is highly recommended in future research to improve accuracy and achieve deeper contextual understanding.
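The pipeline described in the abstract (regular-expression pre-processing followed by a dictionary lookup) can be illustrated with a minimal Python sketch. The abstract does not give the study's dictionary, cleaning rules, or code, so everything here is an assumption made for illustration: the three-word `LEXICON`, the regex patterns, the function names, and the two sample comments are all hypothetical. The `accuracy` helper only shows how such a detector could be scored against manual labels; it does not reproduce any figure from the study.

```python
import re

# Hypothetical mini-lexicon for illustration only; the study's actual
# dictionary of offensive words is not listed in the abstract.
LEXICON = {"bodoh", "tolol", "pembohong"}

# Assumed cleaning rules: strip URLs, mentions/hashtags, and non-letters.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"[@#]\w+")
NON_ALPHA_RE = re.compile(r"[^a-z\s]")
WHITESPACE_RE = re.compile(r"\s+")


def preprocess(text):
    """Lowercase the comment and remove noise with regular expressions."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip()


def find_lexicon_hits(text, lexicon=LEXICON):
    """Return the tokens of the cleaned comment that appear in the lexicon."""
    return [token for token in preprocess(text).split() if token in lexicon]


def is_flagged(text, lexicon=LEXICON):
    """Flag a comment when it contains at least one lexicon word."""
    return bool(find_lexicon_hits(text, lexicon))


def accuracy(predictions, labels):
    """Share of comments where the prediction matches the manual label."""
    matches = sum(p == l for p, l in zip(predictions, labels))
    return matches / len(labels)


# Hypothetical sample comments and manual labels (not the study's data).
comments = [
    "Calon itu BODOH dan pembohong!!! https://example.com @akun",
    "Ayo gunakan hak pilih kita #pemilu2024",
]
labels = [True, False]

predictions = [is_flagged(c) for c in comments]
for comment, flagged in zip(comments, predictions):
    print(flagged, find_lexicon_hits(comment))
print("accuracy on toy labels:", accuracy(predictions, labels))
```

A lookup like this only matches exact tokens, so it misses misspellings, slang variants, and insults that depend on context, which is consistent with the abstract's observation that the lexicon method trails the machine learning models.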
License
Copyright (c) 2024 Richardo Johan Tanujaya, Farhan Mohammed, Gabriela Callista Halim, Cindy Thalia, Shane Michael Colyn

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish their manuscripts in the International Journal of Computer Science and Information Technology agree to the following terms:
Copyright: Authors fully retain copyright on any article in the International Journal of Computer Science and Information Technology under the Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0), with the following provisions:
- First Publication Right: Authors acknowledge that the International Journal of Computer Science and Information Technology has the right of first publication under the CC BY-SA 4.0 license.
- Non-Exclusive Distribution: Authors may enter into separate arrangements for the non-exclusive distribution of the manuscript published in this journal in other versions (e.g., depositing it in the author's institutional repository or publishing it in a book), provided they acknowledge that the manuscript was first published in the International Journal of Computer Science and Information Technology.
- Reader's Rights: Readers are allowed to download, use, and adopt the contents of the article as long as they cite the article by mentioning the title, author, and the name of this journal. Such citations are made for the advancement of science and humanity and must not violate applicable laws.










