MALAYSIAN JOURNAL OF CHEMISTRY (MJChem)

MJChem is a double-blind peer-reviewed journal published by the Malaysian Institute of Chemistry (Institut Kimia Malaysia). E-ISSN: 2550-1658

Investigation of the Clustering of High Quality Stingless Bee Honeys using Unsupervised Machine Learning Models

Yusnaini Md Yusoff
Universiti Kebangsaan Malaysia
Nalinah Poongavanam
Infrastructure University Kuala Lumpur
Jalifah Latip
Universiti Kebangsaan Malaysia
Mohd Razif Mamat
National Institutes of Biotechnology Malaysia
Lim Seng Joe
Universiti Kebangsaan Malaysia
Wardah Mustafa Din
Universiti Kebangsaan Malaysia
Dian Indrayani Jambari
Universiti Kebangsaan Malaysia

DOI: https://doi.org/10.55373/mjchem.v27i1.110

Keywords: Stingless bee honey; Malaysia; quality; unsupervised machine learning; clustering

Abstract

Honey quality and authenticity are crucial because of honey's health benefits and rising demand, yet challenges such as environmental factors and adulteration persist. This study evaluated 106 honey samples for quality, bee species distribution, and underlying patterns using machine learning models. Unsupervised clustering techniques, including K-Means, Agglomerative, Hierarchical Clustering, and DBSCAN, were applied. Component plane analysis of the Self-Organizing Map (SOM) highlighted key clustering factors. Hierarchical clustering (unscaled dendrogram) outperformed the other methods, with a Silhouette score of 0.351, a Davies-Bouldin Index of 0.977, and a Cophenetic Correlation Coefficient of 0.709. Quality was assessed based on pH, moisture content, sugar levels, and 5-hydroxymethylfurfural (HMF) content against the Malaysian Standard for stingless bee honey (MS 2683:2017) and Codex Alimentarius guidelines. All samples met the quality standards, indicating freshness and high quality. Four distinct clusters emerged, each with unique physicochemical properties and species distributions. The application of several unsupervised clustering techniques (e.g., K-Means, Hierarchical Clustering, DBSCAN) together with a SOM to analyze honey quality and bee species distribution is innovative: while honey quality assessments are common, using advanced data analytics to uncover patterns and relationships among samples is relatively novel.
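
The abstract compares clusterings by their Silhouette score (higher is better) and Davies-Bouldin Index (lower is better). As a minimal sketch of how these two indices are computed, the following pure-Python example evaluates a labelling of a few invented two-feature samples. The data values, feature choice, and function names are illustrative assumptions; they are not the study's data, and the abstract does not state which software the authors used.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        a = sum(euclidean(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(euclidean(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        total += (b - a) / max(a, b)
    return total / len(points)

def davies_bouldin_index(points, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centroids = {
        lab: [sum(col) / len(members) for col in zip(*members)]
        for lab, members in clusters.items()
    }
    # scatter: mean distance of a cluster's members to its centroid
    scatter = {
        lab: sum(euclidean(p, centroids[lab]) for p in members) / len(members)
        for lab, members in clusters.items()
    }
    worst_ratios = [
        max(
            (scatter[i] + scatter[j]) / euclidean(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst_ratios) / len(worst_ratios)

# Hypothetical samples: (pH, moisture content in %). Invented for illustration only.
points = [(3.2, 28.0), (3.3, 29.0), (3.1, 28.5),
          (3.8, 33.0), (3.9, 34.0), (3.7, 33.5)]
good_labels = [0, 0, 0, 1, 1, 1]   # labelling that follows the two visible groups
bad_labels = [0, 1, 0, 1, 0, 1]    # labelling that mixes the two groups

sil = silhouette_score(points, good_labels)
dbi = davies_bouldin_index(points, good_labels)
print(f"Silhouette: {sil:.3f}, Davies-Bouldin: {dbi:.3f}")
```

On this toy data the labelling that follows the two groups scores a higher Silhouette value than the mixed labelling, which is the kind of comparison used to rank clustering methods. The features here are left unscaled, so the feature with the larger numeric range (moisture) dominates the distances.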

Published 24 February 2025


Issue Vol 27 No 1 (2025): Malaysian Journal of Chemistry

Section