This research presents viable solutions for prediction modelling of schistosomiasis disease based on vector density. The novel training models proposed in this work address several aspects of interest in the artificial intelligence applications domain. Topics discussed include data imputation, semi-supervised learning and synthetic instance simulation when using sparse training data. This research applies Remote Sensing and Earth Observation sample data provided by European Space Agency satellites, as well as environment feature characteristics extracted by research partners at The Academy of Opto-Electronics in China. Innovative semi-supervised ensemble learning paradigms are proposed which focus on labelling threshold selection and the stringency of classification confidence levels. A Regression-Correlation Combination (RCC) imputation method is also introduced for handling partially complete training data. Results presented in this work show that the proposed RCC method improves data imputation precision over benchmark value replacement. The proposed Incremental Transductive models have provided interesting findings based on threshold constraints that can be applied to alternative environment-based epidemic disease domains. The Synthetic Minority Over-Sampling Technique (SMOTE) Equilibrium approach has yielded subtle classification performance increases, which can be further interrogated to assess the relationships between classification performance, efficiency and synthetic instance generation.
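For readers unfamiliar with synthetic instance generation, the following is a minimal sketch of the standard SMOTE interpolation step: a new minority-class point is placed at a random position on the line segment between an existing minority point and one of its nearest minority neighbours. It is not the SMOTE Equilibrium variant proposed in this work, whose details are not given in the abstract. The function name `smote_samples`, its parameters and the toy data are assumptions made purely for illustration.

```python
import math
import random


def smote_samples(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority neighbours.

    Hypothetical helper illustrating plain SMOTE, not the paper's method.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        # New point lies on the segment between base and target
        synthetic.append(tuple(b + gap * (t - b) for b, t in zip(base, target)))
    return synthetic


# Toy minority-class feature vectors (illustrative values only)
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_samples(minority, n_new=4)
```

Because each synthetic point is a convex combination of two real minority points, it always falls within the region already spanned by the minority class. In practice a maintained library implementation would normally be preferred over a hand-written one.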
|Number of pages||31|
|Journal||International Journal of Machine Learning and Cybernetics|
|Publication status||Accepted/In press - 28 Oct 2019|
- Disease Prediction Modelling
- Data Imputation
- Synthetic Data Simulation