Sentiment Analysis With Sarcasm Detection On Politician’s Instagram

Sarcasm is one of the problems that affect the results of sentiment analysis. According to Maynard and Greenwood (2014), the performance of sentiment analysis can be improved when sarcasm is also identified. Several studies have used the Naïve Bayes and Random Forest methods for sentiment analysis. In the research of Salles et al. (2018), Random Forest in some cases outperformed the Support Vector Machine, which is known as a superior method. In this research, we performed sentiment analysis on the comment sections of the Instagram accounts of Indonesian politicians. We compare the accuracy of sentiment analysis with and without sarcasm detection, using Naïve Bayes and Random Forest for sentiment analysis and Random Forest for sarcasm detection. Without sarcasm detection, sentiment analysis reached an accuracy of 61% with Naïve Bayes and 72% with Random Forest. With sarcasm detection, the accuracy was 60% using the Naïve Bayes – Random Forest combination and 71% using the Random Forest – Random Forest combination.


INTRODUCTION
Analysis from the Public Connection Survey shows that media consumption, along with demographic, trust, success and social capital measures, influences public connection and political participation (Couldry et al., 2007). Social media is a tool that has a major influence on political activities. Lopez-Lopez et al. (2014) state that residents, in their roles as supporters or consumers, mostly visit the social media of an organization (e.g. government, political parties) to complain (the shitstorm phenomenon) or to express support and share good experiences (the candystorm phenomenon).
Instagram is one of the largest social media platforms on which people express opinions and share thoughts and reports in real time. There were 62,230,000 active Instagram users in Indonesia in January 2020, accounting for 22.7% of the entire population (NapoleonCat.com, 2020). Instagram is also a platform often used by Indonesian social media users aged 16 to 64 years, 23% higher than Twitter (Hootsuite, 2020). The amount of Instagram data grows with its popularity. Instagram comments and captions can contain up to 2200 characters, far more than Twitter's limit of 280 characters. The very large volume of data and the higher character limit make the text of Instagram comments more complex to analyze.
Complaints or support on the Instagram accounts of politicians who represent the government in Indonesia need to be analyzed to assist in choosing the next policy. Various data mining techniques are applied to understand public opinion. One popular technique is sentiment analysis, in which opinions are classified as positive, negative or neutral (Pang and Lee, 2008). In some circumstances sentiment analysis has significant drawbacks, one of which arises when the text contains sarcasm: a sarcastic text that actually mocks a politician may be detected as a positive opinion. Antonakaki et al. (2017) found that 11% of Twitter users active on the topic of the United States presidential election expressed opinions in the form of sarcasm. Sarcasm, as a special type of communication in which the explicit meaning differs from the implicit meaning, cannot be identified effectively with conventional data mining techniques such as sentiment analysis (Yee Liau and Pei Tan, 2014). Maynard and Greenwood (2014) say that the performance of sentiment analysis can be improved when sarcasm can be identified.
In a previous study, Yunitasari et al. (2019) detected sarcasm using the Random Forest method with four feature sets, namely unigram, sentiment-related features, punctuation-related features, and lexical and syntactic features. Using these four feature sets, the accuracy of sentiment analysis increased from 75% to 80%. The research of Alita and Rahman (2020) succeeded in increasing the accuracy of sentiment analysis by 16.61% by detecting sarcasm in tweets about public services. There were 69 sarcastic tweets among the 122 tweets predicted as positive about "Jokowi", and 82 sarcastic tweets among the 100 tweets predicted as positive about "Ahok".
In a study by Bouazizi and Otsuki (2016), four classification methods were compared for sarcasm detection, and the highest accuracy, 83%, was obtained by the Random Forest algorithm. In addition, in the research of Salles et al. (2018), Random Forest in some cases outperformed the Support Vector Machine, which is known to be superior. These studies show that an algorithm can achieve high accuracy on particular datasets. To find the best method for improving the accuracy of sentiment analysis, this study compares machine learning algorithms for sentiment analysis with sarcasm detection, using a modification of the research flow of Yunitasari et al. (2019). Accordingly, this study uses the Naïve Bayes and Random Forest algorithms with a dataset of Instagram comments from Indonesian politicians' accounts.

METHODS
This study focuses on detecting sarcasm in Instagram comments on posts by politicians in Indonesia using the Naïve Bayes and Random Forest methods. The flow of this research is shown in Figure 1. The steps are: data collection, data labeling, preprocessing, feature extraction, sentiment classification, evaluation of the sentiment classification results, sarcasm classification, evaluation of the sarcasm classification results, sentiment reversal for comments detected as sarcasm, and evaluation after the sentiment class reversal.
Comments were obtained by scraping with Selenium from posts on accounts such as those of Puan Maharani, Joko Widodo and the DPR RI. After the dataset is collected, the data are labeled.
The next stage is data preprocessing, which applies several methods: data cleaning, case folding, tokenization, stopword removal, conversion of emoticons into strings, conversion of slang words into standard words, and stemming. Feature extraction follows. The features used in the model are unigram, bi-gram, sentiment-related features, punctuation-related features and lexical features.
The sentiment of the data is then classified using the Naïve Bayes and Random Forest classifiers, and sarcasm is classified using the Random Forest method. Data labeled as sarcasm are then changed to the negative class. After that, each model is evaluated. The evaluation is based on accuracy, precision, recall, and F1-score.

Data Collecting and Labeling
Data were collected from Instagram using Selenium. A total of 3140 comments were collected. For sentiment, 815 comments were labeled negative, 521 positive and 1804 neutral. For sarcasm, 2528 comments were labeled non-sarcasm and 612 sarcasm. The results of data collection and labeling can be seen in Table 1.

Table 1 Data collection result
The class distribution of the data can be seen in Figure 2. The green bars show the number of non-sarcasm comments and the blue bars show the number of sarcasm comments.

Data Preprocessing
After the data are labeled, the data preprocessing stage is carried out. Preprocessing consists of converting emoji into strings, deleting unimportant parts of the text, separating sentences into words called tokens (tokenization), converting slang words into standard words, deleting words that appear often but carry no important meaning (stopword removal), and reducing words to their base forms (stemming). The result of data preprocessing can be seen in Table 2.
The conversion of Indonesian slang words into standard words uses a dictionary from Louis Owen's GitHub repository (https://github.com/louisowen6/NLP_Bahasa_resources/blob/master/combined_slang_words.txt). The sentiment of each word in the text is obtained from the InSet Lexicon dictionary, which contains 3609 positive words and 6609 negative words with scores ranging from -5 to +5, from Fajri Koto's GitHub repository (https://github.com/fajri91/InSet).
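The preprocessing steps described above can be illustrated with a minimal sketch. It is not the study's implementation: the tiny `SLANG`, `STOPWORDS` and `EMOJI` tables are hypothetical stand-ins for the much larger resources, the cleaning rules (removing URLs, mentions, digits and punctuation) are assumptions, and stemming is omitted because it needs an Indonesian stemmer.

```python
import re
from typing import List

# Hypothetical miniature resources, for illustration only.
SLANG = {"gk": "tidak", "bgt": "banget", "yg": "yang"}
STOPWORDS = {"yang", "dan", "di", "ini"}
EMOJI = {"😂": " face_with_tears_of_joy ", "👍": " thumbs_up "}


def preprocess(comment: str) -> List[str]:
    """Return cleaned tokens for one comment (stemming is omitted here)."""
    # Convert emoji into strings before other symbols are stripped.
    for symbol, name in EMOJI.items():
        comment = comment.replace(symbol, name)
    text = comment.lower()                              # case folding
    text = re.sub(r"https?://\S+|@\w+", " ", text)      # assumed cleaning: URLs, mentions
    text = re.sub(r"[^a-z_\s]", " ", text)              # assumed cleaning: digits, punctuation
    tokens = text.split()                               # tokenization
    tokens = [SLANG.get(tok, tok) for tok in tokens]    # slang normalization
    return [tok for tok in tokens if tok not in STOPWORDS]  # stopword removal


example_tokens = preprocess("Bagus bgt yg ini 👍 @user http://x.co")
```

In this sketch the slang lookup runs before stopword removal, so a slang form such as "yg" is first normalized to "yang" and then dropped as a stopword.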

Features Extraction
The next step after preprocessing the data is feature extraction. The features used are TF-IDF unigram, sentiment-related features, punctuation-related features and lexical features.

TF-IDF
TF-IDF weighting combines the term frequency (TF) and inverse document frequency (IDF) models. The first element, TF, counts the occurrences of a term (word) in a document. The TF of term t is calculated as follows:
$$\mathrm{tf}(t) = \frac{n_t}{n_d}$$
where $n_t$ is the number of occurrences of t in the document and $n_d$ is the number of terms in the document. The second element, IDF, measures the importance of a term. The IDF is calculated as follows:
$$\mathrm{idf}(t) = \log \frac{N}{df_t}$$
where $N$ is the number of documents and $df_t$ is the number of documents containing term t. The final TF-IDF weight is the product of TF and IDF.
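As a worked illustration of this weighting, the following sketch computes TF-IDF for tokenized documents directly from the definitions above. The function name `tf_idf` and the use of the natural logarithm are assumptions; library implementations often apply a smoothed variant of the IDF, so their values may differ.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # df[t]: number of documents that contain term t at least once.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)   # n_t for every term in this document
        n_d = len(doc)          # number of terms in this document
        weights.append({
            term: (n_t / n_d) * math.log(n_docs / df[term])
            for term, n_t in counts.items()
        })
    return weights


docs = [["bagus", "sekali"], ["bagus", "buruk"]]
weights = tf_idf(docs)
```

Here "bagus" occurs in both documents, so its IDF is log(2/2) = 0 and its weight vanishes, while "sekali" occurs in only one document and receives the weight 0.5 × log 2.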

Sentiment-related Features
The sentiment-related feature set contains four features: the sentiment value of the emoji, the sentiment contrast value of the emoji, the sentiment value of the words and the sentiment contrast value of the words.

Punctuation-related Features
The punctuation-related feature set counts the occurrences of exclamation marks, question marks, periods, quotation marks and capital letters in words, as well as the number of repetitions of letters within a word.
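A minimal sketch of how such counts might be extracted is given below. The feature names and the rule that a word is "elongated" when it repeats a letter three or more times in a row are assumptions, not the study's definitions. These counts are only meaningful on the raw comment, before case folding and punctuation removal.

```python
import re
from typing import Dict


def punctuation_features(text: str) -> Dict[str, int]:
    """Count punctuation-related cues in a raw (unprocessed) comment."""
    return {
        "exclamation": text.count("!"),
        "question": text.count("?"),
        "period": text.count("."),
        "quote": text.count('"') + text.count("'"),
        "capital_letters": sum(ch.isupper() for ch in text),
        # Assumed rule: a word is elongated if one letter repeats 3+ times in a row.
        "elongated_words": sum(
            1 for word in text.split()
            if re.search(r"([A-Za-z])\1{2,}", word)
        ),
    }


features = punctuation_features('MANTAAAP pak!!! "hebat" ya?')
```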

Lexical Features
This feature counts the number of repetitions of laughter in the text.
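The laughter count can be sketched with a regular expression. The patterns below ("wkwk", "haha", "hehe", "hihi" and their longer forms) are assumed examples of laughter in Indonesian comments; the source does not list the expressions it matches.

```python
import re

# Assumed laughter patterns; each requires at least two repeated syllables.
LAUGHTER = re.compile(
    r"\b(?:(?:wk){2,}w?|(?:ha){2,}h?|(?:he){2,}h?|(?:hi){2,}h?)\b",
    re.IGNORECASE,
)


def count_laughter(text: str) -> int:
    """Return the number of laughter expressions found in the text."""
    return len(LAUGHTER.findall(text))


laughs = count_laughter("wkwkwk lucu banget hahaha")
```

The word boundaries keep ordinary words such as "halo" or "hari" from being counted, because a single "ha" syllable does not satisfy the repetition requirement.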

Splitting Data and Oversampling
The dataset that has gone through preprocessing and feature extraction is then divided into training data and testing data in a ratio of 8:2.
The data obtained from scraping Instagram comments are very diverse, so the classes in the dataset are not balanced. For sentiment analysis, the three classes are unevenly represented; by the label counts reported above, neutral comments outnumber both negative and positive ones. For sarcasm prediction, the non-sarcasm class is much larger than the sarcasm class. To overcome this imbalance, SMOTE oversampling is used, which generates synthetic samples for the minority class.
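As an illustration of the 8:2 division, the sketch below shuffles the sample indices and cuts them at 80%. The function name `split_8_2`, the fixed seed and the plain random shuffle are assumptions; the source does not say whether the split was stratified by class.

```python
import random
from typing import List, Sequence, Tuple


def split_8_2(samples: Sequence, labels: Sequence, seed: int = 42
              ) -> Tuple[List[tuple], List[tuple]]:
    """Shuffle (sample, label) pairs and split them 80% / 20%."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)     # reproducible shuffle
    cut = len(indices) * 8 // 10             # integer arithmetic avoids rounding issues
    train = [(samples[i], labels[i]) for i in indices[:cut]]
    test = [(samples[i], labels[i]) for i in indices[cut:]]
    return train, test


# With the 3140 comments reported in the paper this gives 2512 / 628 items.
train, test = split_8_2(list(range(3140)), [0] * 3140)
```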
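The idea behind SMOTE can be sketched in a few lines: a new minority sample is placed at a random position on the segment between an existing minority sample and one of its nearest minority neighbours. The function below is a simplified illustration under assumed defaults (`k`, `seed`), not the library implementation used in the study.

```python
import random
from typing import List

Vector = List[float]


def smote_like(minority: List[Vector], n_new: int, k: int = 2,
               seed: int = 0) -> List[Vector]:
    """Create n_new synthetic minority samples by interpolation."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like(minority, n_new=5)
```

Because every synthetic point lies between two real minority samples, the new points stay inside the region already covered by the minority class. The sketch requires at least two minority samples.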

Sentiment Analysis
Sentiment analysis is a text analysis technique to detect the polarity of a text in a document, paragraph, sentence, or clause. Sentiment analysis is often used to detect sentiment in social data, measure brand reputation, and understand customers.
In this research, sentiment analysis is conducted with the Naïve Bayes and Random Forest classifiers.

Naïve Bayes Model
Naïve Bayes is a fast algorithm with highly scalable model building and scoring. It can be used for binary and multiclass classification, and it is lightweight to train because it does not need complicated optimization (Oracle, 2021). In Bayes' theorem, the conditional probability is expressed as:
$$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)}$$
where X is the evidence, H is the hypothesis, P(H|X) is the probability that hypothesis H is true given evidence X, P(X|H) is the probability of evidence X given that hypothesis H is true, P(H) is the prior probability of hypothesis H, and P(X) is the prior probability of evidence X.
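To show how Bayes' theorem becomes a text classifier, the following is a minimal multinomial Naïve Bayes sketch. The class name `TinyMultinomialNB`, the toy training comments and the default `alpha` are hypothetical. P(X) is left out of the score because it is the same for every class and does not change which class has the highest posterior.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List


class TinyMultinomialNB:
    """Illustrative multinomial Naive Bayes with additive smoothing."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha  # smoothing strength

    def fit(self, docs: List[List[str]], labels: List[str]) -> "TinyMultinomialNB":
        self.vocab = {tok for doc in docs for tok in doc}
        self.class_counts = Counter(labels)
        self.token_counts: Dict[str, Counter] = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc)
        return self

    def predict(self, doc: List[str]) -> str:
        n_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_label in self.class_counts.items():
            total = sum(self.token_counts[label].values())
            denom = total + self.alpha * len(self.vocab)
            score = math.log(n_label / n_docs)            # log P(H)
            for tok in doc:
                if tok in self.vocab:                     # skip unseen tokens
                    count = self.token_counts[label][tok]
                    score += math.log((count + self.alpha) / denom)  # log P(x|H)
            scores[label] = score
        return max(scores, key=scores.get)


model = TinyMultinomialNB().fit(
    [["bagus", "hebat"], ["buruk", "jelek"], ["bagus", "mantap"]],
    ["positive", "negative", "positive"],
)
```

The "naïve" assumption is visible in the inner loop: each token contributes its own log P(x|H) term independently of the other tokens in the comment.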

Random Forest Model
Random Forest is an ensemble method (Figure 3) that combines k learning models with the aim of creating an improved classification model. The ensemble method

Sarcasm Detection and Changing Sentiment Label
In this research, sarcasm detection is done with the Random Forest algorithm. After the sarcasm label is obtained, the sentiment label and the sarcasm label are checked together. If a comment is sarcasm and its sentiment is positive or neutral, the sentiment is changed to negative.
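The label-changing rule can be written as a short function. This is a sketch: the function name is hypothetical, and the numeric encoding (-1 negative, 0 neutral, 1 positive) follows the one given in the results section of the paper.

```python
from typing import List

NEGATIVE, NEUTRAL, POSITIVE = -1, 0, 1


def apply_sarcasm_reversal(sentiments: List[int],
                           sarcasm_flags: List[bool]) -> List[int]:
    """Set sarcastic comments with neutral or positive sentiment to negative."""
    return [
        NEGATIVE if is_sarcasm and sentiment in (NEUTRAL, POSITIVE) else sentiment
        for sentiment, is_sarcasm in zip(sentiments, sarcasm_flags)
    ]


adjusted = apply_sarcasm_reversal([1, 0, -1, 1], [True, True, True, False])
```

Comments that are already negative, or that are not flagged as sarcasm, keep their predicted sentiment.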

Model Evaluation
The evaluation results can be represented in a 2x2 matrix called the confusion matrix (Table 3), whose cells count the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN). Accuracy is the number of correct classifications divided by the total number of data:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Precision is the level of agreement between the information requested and the answer given by the system:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$
Recall is the success rate of the system in retrieving information:
$$\mathrm{Recall} = \frac{TP}{TP + FN}$$
The F1-score combines precision and recall:
$$F_1 = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

RESULTS AND DISCUSSION
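The four evaluation measures from the Model Evaluation section, which are the ones reported in the results that follow, can be computed from confusion-matrix counts as in this minimal sketch. The counts in the example are hypothetical and are not taken from the study.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=20, tn=30)
```

With these counts the accuracy is 70/100 and the precision is 40/50, while the recall is 40/60, which shows how a model can look precise and still miss a third of the positive cases.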

Sentiment Analysis With Random Forest
In this Random Forest model, hyperparameter tuning is performed to find the parameters that give the best model. The resulting parameters are 'n_estimators' = 1400, 'min_samples_split' = 2, 'min_samples_leaf' = 1, 'max_features' = 'sqrt', 'max_depth' = 80 and 'bootstrap' = False. Training, prediction on the testing data and evaluation took 21772.37 seconds. The accuracy of the Random Forest model with these parameters is 72%.

Sentiment Analysis With Naïve Bayes
In this Multinomial Naive Bayes model, hyperparameter tuning is performed to find the parameters that give the best model. The resulting parameter is 'alpha' = 0.00001. Training, prediction on the testing data and evaluation took 14.48 seconds. The accuracy of the Multinomial Naive Bayes model with this parameter is 61%.

Sarcasm Detection With Random Forest
In this Random Forest model, hyperparameter tuning is performed to find the parameters that give the best model. The resulting parameters are 'n_estimators' = 1400, 'min_samples_split' = 2, 'min_samples_leaf' = 1, 'max_features' = 'sqrt', 'max_depth' = 80 and 'bootstrap' = False. Training, prediction on the testing data and evaluation took 19183.17 seconds. The accuracy of the Random Forest model with these parameters is 83%.

Sentiment Label Changed Results
The sentiment labels predicted by Random Forest and Naive Bayes are then checked against the sarcasm prediction for the same text. If the text is predicted to be sarcasm and its sentiment is neutral (0) or positive (1), the sentiment value is changed to negative (-1). A comparison of the Random Forest and Naive Bayes sentiment analysis models after the labels were changed can be seen in Table 4.

CONCLUSION
After all the research steps were carried out, the following conclusions can be drawn. With sarcasm detection, the best sentiment analysis accuracy is obtained using Random Forest, at 71%. The accuracy of the sarcasm detection model with Random Forest is 83%. Sentiment analysis without sarcasm detection obtained better results for both models, Random Forest and Naive Bayes: its accuracy is one percentage point higher than with sarcasm detection.