An Evaluation of Predictive Accuracy of Common Machine Learning Algorithms on Data Stream
Keywords: Data Stream, Classifier, Machine Learning Algorithms, Predictive Accuracy, WEKA, Multilayer Perceptron, K-Nearest Neighbour (KNN), Random Forest.
Streamed data are a potentially infinite sequence of incoming data arriving at very high speed, and they may evolve over time. This poses several challenges for large-scale, high-speed data streams in real-time applications, and the field has therefore attracted considerable attention from researchers in recent years. Data are now generated periodically or grow continuously in many areas of human endeavour, for instance capital markets, social networks, meteorological data, e-commerce, and online gaming and betting platforms. This research applied common machine learning algorithms to data streams and random data sets to determine which kind of data yields more accurate predictions. Data samples were obtained from social media platforms such as Twitter and Instagram between 4 and 15 February 2019, for a total of 510,738 samples, a volume that reflects the sheer size of these platforms. The random forest, multilayer perceptron and k-nearest neighbour algorithms were used to model the data streams in the WEKA and RapidMiner data mining programs. The results show that the multilayer perceptron achieved the highest accuracy in both programs when compared with the other algorithms used in the research. The findings will be relevant to researchers who wish to develop machine learning tools for evaluating predictive accuracy on data streams from social media platforms and related fields.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.