Journal of Data Science and Its Applications <p>Journal of Data Science and Its Applications (JDSA)</p> en-US (Dr. Kemas Muslim L.) (Said Al Faraby) Fri, 15 Nov 2019 13:59:50 +0000
Visualizing Language Lexical Similarity Clusters: A Case Study of Indonesian Ethnic Languages
<p>Language similarity clusters are useful for computational linguistics research that relies on language similarity or cognate recognition. The existing language similarity clustering approach, which uses hierarchical clustering and k-means clustering, has difficulty creating clusters with a middle range of language similarity. Moreover, it lacks an interactive visualization that users can explore. To address these issues, we formalize a graph-based approach to creating and visualizing language lexical similarity clusters: we use the ASJP database to generate the language similarity matrix, then represent the data as an undirected graph. To create the clusters, we apply a connected components algorithm with a threshold on the language similarity range. Our interactive online tool allows a user to dynamically create new clusters by changing the threshold of the language similarity range and to explore the data by language similarity range and number of speakers. We provide an example implementation of our approach for 119 Indonesian ethnic languages. The experimental results show that under a low system execution load, performance was quite stable. Under a high system execution load, performance fluctuated, but response times remained below 25 seconds, which is considered acceptable.</p>
Arbi Haza Nasution, Yohei Murakami Fri, 15 Nov 2019 00:00:00 +0000
Understanding Public Attitude towards Political Candidate through Conversational Network in West Java Regional Election
<p>Social media is a very important part of political campaign strategy.
Information about various policies, together with public opinion, provides rich insight into political issues during elections. The open question is how political attitudes expressed on social media relate to election results. In this paper, we propose a social network analysis methodology to measure conversational network activity. As a case study, we select the Pilkada Jawa Barat 2018 (the 2018 West Java regional election) because West Java is the most populous province in Indonesia. We gathered conversations from the online social network service <em>Twitter</em>, collecting 70335 tweets from June 20 to June 26, 2018. Our findings indicate that the network properties of each candidate are consistent with the real count, and that the candidate who appears most often is <em>"@ridwankamil"</em>, the winner of the Pilkada Jawa Barat 2018. We summarize all the conversations about each candidate, and our results show a high correlation with the election results: the greater the conversational network activity around a candidate, the greater the likelihood of winning the election.</p>
Rimba Pratama Putra, Hanif Fakhrurroja, Andry Alamsyah Fri, 15 Nov 2019 00:00:00 +0000
Public Response Analysis toward Poverty Reduction Program in Indonesia 2014-2018 through Twitter Data
<p><em>Poverty reduction is the main priority of national development; it aims to improve the welfare of the poor so that they can share in increasingly high-quality economic growth. However, every government program needs to be evaluated, including how the public responds to it. Social media monitoring can be used to understand and provide real-time feedback on policy reform. The government is therefore expected to consider using such technology in its current evaluation and operational framework, and it needs tools to analyze the public response to poverty reduction programs in Indonesia using Big Data in the form of Twitter data.
This research is expected to make the public response on Twitter a source of critique and suggestions for the government in establishing and implementing future poverty reduction policies. The research uses Twitter data on poverty reduction policies in Indonesia, such as the “Bedah Kemiskinan Rakyat Sejahtera (BEKERJA)” Program, the “Padat Karya Tunai” Program, and the “Program Keluarga Harapan”. Data are sourced from all community tweets related to the programs and are analyzed using text mining. The results indicate that, in general, the public accepted and supported the three poverty reduction programs in Indonesia during the 2014-2018 period. Public response can therefore serve as evaluation and input for the government.</em></p>
Mohammad Ilham Nur Rohman, Siti Mariyah Fri, 15 Nov 2019 13:50:47 +0000
Sentiment Analysis of Movie Review using Naïve Bayes Method with Gini Index Feature Selection
<p>Movie reviews contain information that indicates whether a movie is good or bad. Sentiment analysis processes this information to determine the polarity of a sentence. Reviews are unstructured and have many attributes, so classifying them requires considerable time and computational capacity, which becomes a problem in the classification process. When processing large amounts of data, feature selection offers a way to reduce dimensionality, which accelerates classification and reduces misclassification. Gini Index Text feature selection was originally applied to document classification, where it improved classifier performance. Multinomial Naïve Bayes (MNNB) is a popular classifier for document classification; the question is whether Gini Index Text feature selection can also improve MNNB classification performance.
This study therefore uses Gini Index Text (GIT) feature selection with an MNNB classifier to classify movie reviews into positive and negative classes. The data come from the IMDB dataset of English-language reviews, split into 90% training data and 10% test data. The test results show that Gini Index feature selection can increase accuracy: accuracy is 56% without feature selection and 59.54% with it, an increase of 3.54 percentage points.</p>
Riko Bintang Purnomoputra, Adiwijaya Adiwijaya, Untari Novia Wisesty Fri, 15 Nov 2019 00:00:00 +0000
Tourism Recommender System Using Item-Based Hybrid Clustering Method (Case Study: Bandung Raya Region)
<p><span lang="IN">A recommender system recommends the items it ranks highest, so that the recommended information matches what the user needs. Such a system can help tourists decide on their travel choices, especially tourists in the city of Bandung. Two methods are commonly used in recommender systems: collaborative filtering and content-based filtering. Both have drawbacks. Content-based filtering cannot recommend a diverse range of items. Collaborative filtering cannot recommend items that have not been rated at all (the cold start problem), and it cannot make recommendations to new users because they have no history. To address the shortcomings of both, the item-based clustering hybrid method (ICHM) is proposed to combine the two methods. The analysis compares the Mean Absolute Error (MAE) across several tests. With a cluster setting of 30 and a c coefficient of 0.9, the average MAE obtained is 0.2459 for cold start problems and 0.2488 for non-cold-start problems.
A smaller MAE value indicates a higher level of accuracy.</span></p>
Qisti R Arvianti, Z. K. Abdurahman Baizal, Dede Tarwidi Fri, 15 Nov 2019 13:57:56 +0000
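
The MAE criterion mentioned in the tourism recommender abstract can be illustrated with a minimal sketch. The function name `mean_absolute_error` and the sample ratings are hypothetical and are not taken from the paper; the sketch only shows the standard definition of MAE, the mean of the absolute differences between actual and predicted ratings, and does not reproduce the ICHM method itself.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Return the mean of |actual - predicted| over paired ratings."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one rating pair is required")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical ratings on a 1-5 scale (illustrative only, not data from the paper).
actual_ratings = [4.0, 3.0, 5.0, 2.0]
predicted_ratings = [3.5, 3.0, 4.0, 2.5]

mae = mean_absolute_error(actual_ratings, predicted_ratings)
print(f"MAE = {mae:.2f}")  # absolute errors 0.5, 0.0, 1.0, 0.5 average to 0.50
```

Because MAE averages the size of the prediction errors, two configurations evaluated on the same test ratings can be compared directly: the one with the lower value predicts ratings more closely.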