Improving Performance Sentiment Analysis Movie Review Film using Random Forest with Feature Selection Information Gain

(1) Vinsent Brilian Adiguna (Department of Faculty Informatics Engineering, University Dian Nuswantoro, Semarang, Indonesia)
(2) Muslihul Aqqad (Department of Faculty Informatics Engineering, University Dian Nuswantoro, Semarang, Indonesia)
(3) * Purwanto Purwanto (Department of Faculty Informatics Engineering, University Dian Nuswantoro, Semarang, Indonesia)
(4) Jaluanto Sunu Jaluanto Sunu (Department of Faculty Economics and Business, University 17 August 1945 Semarang, Semarang, Indonesia)
(5) Honorata Ratnawati Honorata Ratnawati (Department of Faculty Economics and Business, University 17 August 1945 Semarang, Semarang, Indonesia)
*corresponding author

Abstract


Sentiment analysis of film reviews is an important task for understanding audience opinion of a cinematic work. However, the complexity and subjectivity of the language in film reviews make it challenging. This research explores the application of the Random Forest algorithm, an ensemble learning method, to sentiment classification of film reviews. A Random Forest is built from a set of decision trees, each of which provides a prediction, and the final result is obtained by majority voting. This approach has the advantage of reducing overfitting. The research uses a dataset of 500 reviews labeled with positive or negative sentiment. The review text is represented with TF-IDF features that model the weight of each word, and Information Gain is used for feature selection. The Random Forest model is then trained on these features to predict sentiment labels. Model performance is evaluated using accuracy, precision, recall, and F1-score. The experimental results show that Random Forest achieves 95.20% accuracy in sentiment classification of film reviews, surpassing the Support Vector Machine classification algorithm, which achieved only 92% in previous studies. These findings provide a new perspective on the benefits of ensemble learning in sentiment analysis and its potential application in other domains such as marketing and public opinion analysis.
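As an illustration of two ideas named in the abstract, the following is a minimal sketch of how Information Gain can rank candidate terms and how majority voting combines per-tree predictions. It is not the authors' implementation: the toy reviews, the labels, and the function names `entropy`, `information_gain`, and `majority_vote` are all hypothetical, terms are treated as simple present/absent features rather than TF-IDF weights, and the decision trees themselves are not built.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """Information Gain of the binary feature 'term occurs in the review'.

    docs is a list of sets of tokens; labels is the matching list of classes.
    """
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def majority_vote(predictions):
    """Return the most common label among per-tree predictions.

    Ties are resolved in favour of the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy corpus: each review is reduced to a set of tokens.
docs = [
    {"great", "film"},
    {"great", "acting"},
    {"boring", "film"},
    {"boring", "plot"},
]
labels = ["pos", "pos", "neg", "neg"]

# Rank the vocabulary by Information Gain and keep the top two terms.
vocabulary = {term for doc in docs for term in doc}
ranked = sorted(
    vocabulary,
    key=lambda term: information_gain(docs, labels, term),
    reverse=True,
)
selected = ranked[:2]

# Combine three hypothetical tree outputs into one forest prediction.
forest_prediction = majority_vote(["pos", "neg", "pos"])
```

In this toy corpus, "great" and "boring" separate the two classes perfectly and therefore score highest, while "film" appears equally in both classes and contributes no gain. A real pipeline would compute these scores over the full vocabulary of the review dataset and train the trees only on the retained terms.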


Keywords


random forest, information gain, feature selection, sentiment analysis.

   

DOI

https://doi.org/10.29099/ijair.v8i1.1.1227
      


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

________________________________________________________

The International Journal of Artificial Intelligence Research

Organized by: Departemen Teknik Informatika
Published by: STMIK Dharma Wacana
Jl. Kenanga No.03 Mulyojati 16C Metro Barat Kota Metro Lampung

Email: jurnal.ijair@gmail.com
