HIV analysis using computational intelligence
Leke Betechuoh, Brain
This study proposes a new method for analyzing HIV that combines autoencoder networks with genetic algorithms. The method is tested on a set of demographic properties of individuals obtained from the South African antenatal survey. The autoencoder model is compared with a conventional feedforward neural network model and yields a classification accuracy of 92%, compared with 84% for the feedforward model. The autoencoder model is then used in a new method for approximating missing entries in the HIV database using ant colony optimization. This method estimates missing inputs to an accuracy of 80%. The estimated values are then used to analyze HIV: in the presence of missing input values, the autoencoder network classifier yields a classification accuracy of 81% and the feedforward neural network classifier yields 82%. A control mechanism, based on inverse neural networks and on autoencoder networks combined with genetic algorithms, is proposed to assess the effect of demographic properties on the HIV status of individuals. This control mechanism is aimed at understanding whether HIV susceptibility can be controlled by modifying some of the demographic properties. The inverse neural network control model predicts the educational level of individuals and gravidity with accuracies of 77% and 82%, respectively, while the genetic algorithm model achieves 77% and 92%. HIV modelling using neuro-fuzzy models is then investigated, and rules are extracted that provide further insight. The neuro-fuzzy model obtains a classification accuracy of 86%. A rough set approximation is then investigated for rule extraction, and the extracted rules are found to present simple, understandable relationships describing how the demographic properties affect HIV risk.
The study concludes by investigating a model for automatic relevance determination to identify which of the demographic properties are important for HIV modelling. HIV classification using the full input data set is compared with classification using only the input parameters selected by this technique. Age of the individual, gravidity, province, region, reported pregnancy and educational level were among the input parameters selected as relevant for classifying an individual’s HIV risk. The study thus proposes models that can be used to understand HIV dynamics and that can help policy-makers understand more effectively the demographic influences driving HIV infection.
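The idea of estimating a missing entry by searching for the value that a trained autoencoder reconstructs best can be illustrated with a minimal sketch. It is not the study's implementation, and it makes several assumptions. The `reconstruct` function is a hand-written stand-in for a trained autoencoder (a projection onto the line x2 = 2*x1). The search uses a small genetic algorithm, whereas the abstract names ant colony optimization for the missing-data step. The two-feature record and all function names and parameters are hypothetical.

```python
import random


def reconstruct(x):
    """Hypothetical stand-in for a trained autoencoder.

    Projects a 2-feature record onto the line x2 = 2 * x1, so records
    that follow this relationship are reconstructed exactly.
    """
    scale = (x[0] + 2.0 * x[1]) / 5.0
    return [scale, 2.0 * scale]


def reconstruction_error(x):
    """Squared distance between a record and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, reconstruct(x)))


def estimate_missing(record, missing_idx, rng, pop_size=30, generations=60):
    """Search for the missing value that minimises reconstruction error.

    Assumes features are normalised to [0, 1]. Uses a simple genetic
    algorithm: keep the fittest half, then refill the population with
    blended and mutated children.
    """
    def fitness(value):
        candidate = list(record)
        candidate[missing_idx] = value
        return reconstruction_error(candidate)

    population = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # fittest half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Blend crossover followed by a small Gaussian mutation.
            child = 0.5 * (a + b) + rng.gauss(0.0, 0.02)
            children.append(min(1.0, max(0.0, child)))
        population = parents + children
    return min(population, key=fitness)


if __name__ == "__main__":
    rng = random.Random(0)
    # The second feature is missing; the first is known to be 0.3.
    estimate = estimate_missing([0.3, None], missing_idx=1, rng=rng)
    print(f"estimated missing value: {estimate:.3f}")
```

In this toy setting the reconstruction error is zero only when the record lies on the line, so the search should settle near 0.6 for a known first feature of 0.3. With a real autoencoder the error surface would be learned from complete records, and the same loop could in principle treat any input, including the class label, as the unknown.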