Monte Carlo Simulation-Based Regression Tree Algorithm for Predicting Energy Consumption from Scarce Dataset

Authors

  • Tony Darmanto, Department of Information System, Widya Dharma Pontianak University, Indonesia
  • Jimmy Tjen, Department of Informatics, Widya Dharma Pontianak University, Indonesia
  • Genrawan Hoendarto, Department of Informatics, Widya Dharma Pontianak University, Indonesia

DOI:

https://doi.org/10.47852/bonviewJDSIS42022395

Keywords:

Monte Carlo simulation, regression tree, power consumption, scarce dataset

Abstract

Most data-driven techniques depend on the availability of data, so an algorithm may not work as intended when the data provided are insufficient. It is therefore important to be able to predict the dynamics of the data even when few samples are available, that is, when the dataset is scarce. This study aims to predict the power consumption of a building from a scarce dataset using a novel Monte Carlo simulation-based Regression Tree (MCRT) algorithm. The main idea is to run a Monte Carlo simulation on each leaf generated by the regression tree algorithm, so that the prediction no longer depends on the average of the samples contained in the leaf but on the probability of the samples. The proposed algorithm was validated on two datasets, obtained from Universitas Widya Dharma Pontianak (UWDP), Indonesia, and the Trapeznikov Institute of Control Sciences (TICS), Russia. To test whether the MCRT algorithm outperforms the regression tree (RT) algorithm, a two-tailed hypothesis test was carried out. The experiments were run in Python on a machine with 16 GB of RAM and a 7th Gen Core i7 processor, using 50 datasets randomly generated from the UWDP electrical data. They indicate that the MCRT algorithm performs better than the RT algorithm previously used to model scarce datasets (P-value = 0.000319). Furthermore, the proposed algorithm improves the predictive accuracy of the RT algorithm by up to 2%.
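
The contrast between a leaf-average prediction and a Monte Carlo prediction can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the sampling scheme, so the sketch assumes that each leaf's training targets are resampled with replacement and that the simulated draws are averaged. The one-split tree and the names `fit_stump`, `predict_rt`, and `predict_mc` are hypothetical and chosen for illustration only.

```python
import random
import statistics


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = statistics.fmean(values)
    return sum((v - mean) ** 2 for v in values)


def fit_stump(xs, ys):
    """Fit a one-split regression tree on a single feature.

    Assumes xs contains at least two distinct values.
    """
    points = sorted(set(xs))
    best = None
    for lo, hi in zip(points, points[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, threshold, left, right)
    _, threshold, left, right = best
    # Each leaf keeps its training targets so they can be resampled later.
    return {"threshold": threshold, "left": left, "right": right}


def leaf_of(stump, x):
    """Return the training targets stored in the leaf that x falls into."""
    return stump["left"] if x <= stump["threshold"] else stump["right"]


def predict_rt(stump, x):
    """Plain regression tree prediction: the average of the leaf samples."""
    return statistics.fmean(leaf_of(stump, x))


def predict_mc(stump, x, n_draws=1000, rng=None):
    """Monte Carlo prediction (assumed scheme): resample the leaf with
    replacement and average the simulated draws."""
    rng = rng or random.Random()
    draws = rng.choices(leaf_of(stump, x), k=n_draws)
    return statistics.fmean(draws)


if __name__ == "__main__":
    # Made-up toy values, not data from the study.
    xs = [1, 2, 3, 10, 11, 12]
    ys = [10.0, 12.0, 11.0, 30.0, 31.0, 29.0]
    stump = fit_stump(xs, ys)
    print(predict_rt(stump, 2), predict_mc(stump, 2, rng=random.Random(0)))
```

With this particular resampling choice the Monte Carlo estimate converges to the leaf average as `n_draws` grows, so the sketch only shows where the simulation step plugs into the tree; a different summary of the draws, or a fitted distribution per leaf, would give predictions that differ from the plain RT output.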

Received: 30 December 2023 | Revised: 21 March 2024 | Accepted: 8 April 2024

Conflicts of Interest

The authors declare that they have no conflicts of interest to this work.

Data Availability Statement

The data that support the findings of this study are openly available on Google Drive at https://docs.google.com/spreadsheets/d/1o8sawOaOcX1kEm-dIdkcCUZhKoBduTAz/edit?usp=drive_link&ouid=115962907255429746256&rtpof=true&sd=true

Author Contribution Statement

Tony Darmanto: Conceptualization, Validation, Investigation, Resources, Data curation, Writing - original draft, Writing - review & editing, Visualization, Supervision, Project administration. Jimmy Tjen: Conceptualization, Methodology, Software, Formal analysis, Writing - original draft, Writing - review & editing, Visualization. Genrawan Hoendarto: Validation, Investigation, Resources, Data curation, Writing - original draft, Writing - review & editing.


Published

2024-04-15

Section

Research Articles

How to Cite

Darmanto, T., Tjen, J., & Hoendarto, G. (2024). Monte Carlo Simulation-Based Regression Tree Algorithm for Predicting Energy Consumption from Scarce Dataset. Journal of Data Science and Intelligent Systems. https://doi.org/10.47852/bonviewJDSIS42022395