Accessing Imbalance Learning Using Dynamic Selection Approach in Water Quality Anomaly Detection

Dogo, Eustace M.; Nwulu, Nnamdi; Twala, Bhekisipho; Aigbavboa, Clinton

Please use this identifier to cite or link to this item: http://repository.futminna.edu.ng:8080/jspui/handle/123456789/6689

Full metadata record

DC Field	Value	Language
dc.contributor.author	Dogo, Eustace M.	-
dc.contributor.author	Nwulu, Nnamdi	-
dc.contributor.author	Twala, Bhekisipho	-
dc.contributor.author	Aigbavboa, Clinton	-
dc.date.accessioned	2021-07-06T08:25:43Z	-
dc.date.available	2021-07-06T08:25:43Z	-
dc.date.issued	2021-05	-
dc.identifier.other	https://doi.org/10.3390/sym13050818	-
dc.identifier.uri	http://repository.futminna.edu.ng:8080/jspui/handle/123456789/6689	-
dc.description.abstract	Automatic anomaly detection monitoring plays a vital role in water utilities’ distribution systems to reduce the risk posed by unclean water to consumers. One of the major problems with anomaly detection is imbalanced datasets. Dynamic selection techniques combined with ensemble models have proven to be effective for imbalanced dataset classification tasks. In this paper, water quality anomaly detection is formulated as a classification problem in the presence of class imbalance. To tackle this problem, considering the asymmetric dataset distribution between the majority and minority classes, the performance of sixteen previously proposed single and static ensemble classification methods embedded with resampling strategies is first optimised and compared. After that, six dynamic selection techniques, namely, Modified Class Rank (Rank), Local Class Accuracy (LCA), Overall-Local Accuracy (OLA), K-Nearest Oracles Eliminate (KNORA-E), K-Nearest Oracles Union (KNORA-U) and Meta-Learning for Dynamic Ensemble Selection (META-DES), in combination with homogeneous and heterogeneous ensemble models, three SMOTE-based resampling algorithms (SMOTE, SMOTE+ENN and SMOTE+Tomek Links) and one missing data method (missForest), are proposed and evaluated. A binary real-world drinking-water quality anomaly detection dataset is utilised to evaluate the models. The experimental results reveal that all the models benefit from the combined optimisation of both the classifiers and resampling methods. Considering the three performance measures (balanced accuracy, F-score and G-mean), the results also show that the dynamic classifier selection (DCS) techniques, in particular the missForest+SMOTE+RANK and missForest+SMOTE+OLA models based on homogeneous ensemble-bagging with decision tree as the base classifier, exhibited better performance in terms of balanced accuracy and G-mean, while the Bg+mF+SMENN+LCA model based on homogeneous ensemble-bagging with random forest has a better overall F1-measure in comparison to the other models.	en_US
dc.language.iso	en	en_US
dc.publisher	Symmetry MDPI	en_US
dc.subject	classification	en_US
dc.subject	imbalance learning	en_US
dc.subject	Dynamic selection	en_US
dc.subject	missing data	en_US
dc.subject	anomaly detection	en_US
dc.subject	Water quality	en_US
dc.title	Accessing Imbalance Learning Using Dynamic Selection Approach in Water Quality Anomaly Detection	en_US
dc.type	Article	en_US
Appears in Collections:	Computer Engineering
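
The abstract compares models on balanced accuracy, F-score and G-mean rather than plain accuracy. The following is a minimal sketch of how these three measures are commonly computed for a binary problem, not the authors' evaluation code. It assumes the anomaly (minority) class is labelled 1, takes the F-score to be F1, and uses a made-up toy label vector; the function name imbalance_metrics is hypothetical.

```python
import math


def imbalance_metrics(y_true, y_pred, positive=1):
    """Compute balanced accuracy, F1 and G-mean for binary labels.

    Assumption: `positive` marks the minority (anomaly) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Recall on the positive class (sensitivity) and on the negative class
    # (specificity); guard against empty classes.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    balanced_accuracy = (recall + specificity) / 2
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    g_mean = math.sqrt(recall * specificity)

    return {
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
        "g_mean": g_mean,
    }


# Toy data (illustrative only): 8 normal samples, 2 anomalies.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

scores = imbalance_metrics(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy, scores)
```

On this toy vector the plain accuracy is higher than the balanced accuracy because the majority class dominates the count, which is the usual reason imbalance-aware measures are reported.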

Files in This Item:

File	Description	Size	Format
symmetry-13-00818 (4).pdf		3.12 MB	Adobe PDF	View/Open
