Solid waste management has become a challenge for most countries because of rising waste generation, the associated environmental issues, and the costs of handling the waste produced. Effective waste management systems at different geographic levels require accurate forecasts of future waste generation. In this work, we investigate how open-access data, such as that provided by the Organisation for Economic Co-operation and Development (OECD), can be used to analyze waste data. The main idea of this study is to find the links between the socioeconomic and demographic variables that determine the amounts of each type of solid waste produced by countries. These links would make it possible to predict waste production accurately at the country level and to determine the requirements for developing effective waste management strategies. In particular, we use several machine learning regression models (Support Vector, Gradient Boosting, and Random Forest) to predict waste production for OECD countries over the years, and a clustering model (k-means) to group these countries by similar characteristics. The main contributions of our work are: (1) a waste analysis at the OECD country level that compares and clusters countries according to similar predicted waste features; (2) the detection of the most relevant features for the prediction models; and (3) a comparison of several regression models with respect to prediction accuracy. The coefficient of determination (R²), Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Mean Absolute Percentage Error (MAPE) are used as measures of the performance of the developed models.
Our experiments have shown that pre-processing the OECD data is an essential stage of the analysis; that the Random Forest Regressor (RFR) produced the best prediction results on the dataset; and that these results are highly influenced by the quality of the available socioeconomic data. In particular, the RFR model exhibited the highest prediction accuracy for most waste types. For example, on the test set it produced R² = 1 and MAPE = 4.31 for “municipal” waste, and R² = 1 and MAPE = 3.03 for “household” waste. Our results indicate that all the considered models, and RFR especially, are effective in predicting the amount of waste produced from the input data for the considered countries.
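The four evaluation measures named in this abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, and the formulas are the standard textbook definitions, with MAPE expressed in percent as an assumption.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE, RMSE, and MAPE (in percent) for paired values.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when an observed value is zero")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n

    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        "mape": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Made-up observed and predicted waste amounts (not OECD data).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(observed, predicted)
```

With these made-up values, R² is 0.992 while MAPE is about 5.2 percent. An R² that rounds to 1 can therefore coexist with a relative error of a few percent, because R² measures explained variance across all observations while MAPE averages the per-observation relative error.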
In rural areas, mobility needs are difficult to meet through local public transport. How this gap in provision can be reduced by combined transport concepts for passengers and goods is being investigated with a prototype agent-based simulation application built in the simulation software AnyLogic. The prototype, however, does not take real mobility data into account.
The aim of this thesis is to improve the prototype's data basis with the help of machine learning. Following the Design Science Research approach, ML models were developed along the CRISP-DM framework. These models process the available mobility data and, once integrated into the prototype, can be used to parameterize it. In the course of the work, suitable parameters are identified, and the mobility data are acquired and extensively transformed for model training in H2O Driverless AI. The best ML model is integrated into the prototype, and the adjustments needed to enable the parameterization are made. The subsequent evaluation of the simulation application shows a data-driven and more realistic simulation of the simultaneous and combined transport of passengers and goods.