Abstract:
Milk adulteration is a significant global problem because milk is one of the most widely consumed and essential food products. Monitoring milk quality is therefore necessary for protecting human health. A Machine Learning (ML) based non-destructive system was developed to identify water adulteration in milk using Near Infrared (NIR) spectroscopy. A database was created by mixing water into milk in varying proportions (0-40% v/v) and capturing spectra with a compact TI DLP NIRscan Nano spectrometer in the 900-1700 nm range. The captured spectra were pre-processed with the Savitzky-Golay (SG) filter, Multiplicative Scatter Correction (MSC), and the Standard Normal Variate (SNV) method. The most informative wavelengths were selected using a wavelength (feature) selection technique, and the dimensionality of the selected features was reduced using Principal Component Analysis (PCA). Various ML models were then employed to predict the water concentration in milk, and both classification and regression methods were applied to assess the system's performance. In the regression analysis, k-Nearest Neighbour (KNN) performed best, achieving an R² of 0.999, a Root Mean Square Error (RMSE) of 0.399 mL (% v/v), a Standard Error of Prediction (SEP) of 0.096 mL (% v/v), a Mean Absolute Error (MAE) of 0.227 mL (% v/v), a Ratio of Performance to Deviation (RPD) of 33.005, a Leave-One-Out Cross-Validation (LOOCV) R² of 0.999, and a LOOCV RMSE of 0.353 mL (% v/v). In the classification analysis, Random Forest (RF) achieved 100% on both accuracy and the Matthews Correlation Coefficient (MCC). The impact of the proposed portable system could be transformative: it has the potential to reshape practices, set new standards for food quality assurance, and offer solutions to critical challenges in the dairy industry.
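To make two of the steps named in the abstract concrete, the following minimal Python sketch shows SNV pre-processing of a single spectrum and the regression metrics listed above. It is an illustration under stated assumptions, not the authors' implementation. The function names `snv` and `regression_metrics` are hypothetical, and the sample values are invented for demonstration and are not data from the study. SEP is taken here as the bias-corrected standard deviation of the residuals and RPD as the standard deviation of the reference values divided by SEP; these are common chemometric conventions, and the definitions used in the paper may differ.

```python
import math


def snv(spectrum):
    """Standard Normal Variate: centre one spectrum on its own mean
    and scale it by its own sample standard deviation."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE, SEP and RPD for reference vs. predicted values.

    Assumed definitions (conventions vary between papers):
      SEP = sqrt(sum((e_i - bias)^2) / (n - 1)), with e_i = y_pred_i - y_true_i
      RPD = SD(y_true) / SEP
    """
    n = len(y_true)
    residuals = [p - t for t, p in zip(y_true, y_pred)]
    bias = sum(residuals) / n

    ss_res = sum(e * e for e in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)

    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in residuals) / n
    sep = math.sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    sd_ref = math.sqrt(ss_tot / (n - 1))

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "mae": mae,
        "sep": sep,
        "rpd": sd_ref / sep,
    }


# Invented example: water content in % v/v (reference) and model predictions.
reference = [0.0, 10.0, 20.0, 30.0, 40.0]
predicted = [1.0, 9.0, 21.0, 29.0, 41.0]
metrics = regression_metrics(reference, predicted)
print(metrics)
```

Because SEP removes the mean prediction error (bias) before measuring spread, it can be smaller than RMSE when predictions are systematically offset, which is one reason the two are often reported together.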