Towards a robust, universal predictor of gas hydrate equilibria through the means of a deep learning regression

Landgrebe, M. K. B.
Gas hydrate equilibria of natural gas mixtures have proven to be a highly non-linear, multimodal phenomenon, and extensive investment has been made over decades to understand and accurately predict natural gas hydrate equilibrium conditions. Most models applied industrially to predict gas hydrate equilibria are computerised thermodynamic models based on intrinsic molecular behaviour. These approaches are often limited in their capability to predict actual phenomena over a wide range of conditions, owing to the high degree of non-linearity and to complexities arising from other factors that are difficult to model explicitly. In this research, an artificial neural network was developed using publicly available experimental gas hydrate equilibrium data. A regression was achieved by means of a deep-learning multi-layer perceptron consisting of three hidden layers with a high neuron count, and an output layer consisting of a single neuron corresponding to the predicted equilibrium pressure. Nine model features are present in the input layer: the temperature and the molar fractions of methane, ethane, propane, iso-butane, n-butane, carbon dioxide, nitrogen and a lumped fraction of organic molecules containing at least five carbon atoms. Models have been evaluated according to their ability to predict a wide range of data, their multicomponent prediction accuracy, and their dependency on individual sources of data. A total of 670 multicomponent experimental equilibrium data samples have been obtained from literature. Due to the limited amount of multicomponent equilibrium data published, the incorporation of pure and binary methane mixtures into a second dataset that also includes the multicomponent data has proven imperative to achieve the best possible model. The complete dataset consists of 1209 equilibrium data samples.
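The network layout described above can be illustrated with a minimal forward-pass sketch. It is not the author's implementation. The abstract states only that there are nine input features, three hidden layers with a high neuron count, and one output neuron. The hidden width of 64, the ReLU activation, the He-style weight initialisation, the feature names, and the sample composition are all assumptions made for illustration, and the network below is untrained, so its output carries no physical meaning.

```python
import random

# The nine input features named in the abstract (identifier names are illustrative).
FEATURES = [
    "temperature", "methane", "ethane", "propane", "iso_butane",
    "n_butane", "carbon_dioxide", "nitrogen", "c5_plus",
]

def init_mlp(layer_sizes, seed=0):
    """Create random weights and zero biases for a fully connected network."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        scale = (2.0 / n_in) ** 0.5  # He-style scaling (assumption)
        weights = [[rng.gauss(0.0, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params

def forward(params, x):
    """Propagate one sample; ReLU on hidden layers, linear output neuron."""
    activation = list(x)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        activation = z if i == last else [max(0.0, v) for v in z]
    return activation[0]  # single neuron: predicted equilibrium pressure

HIDDEN_WIDTH = 64  # assumption: the abstract gives no neuron count
layer_sizes = [len(FEATURES), HIDDEN_WIDTH, HIDDEN_WIDTH, HIDDEN_WIDTH, 1]
params = init_mlp(layer_sizes)

# Hypothetical sample: temperature followed by mole fractions summing to one.
sample = [280.0, 0.90, 0.05, 0.02, 0.005, 0.005, 0.01, 0.01, 0.0]
prediction = forward(params, sample)
```

In practice the features and target would likely be scaled before training, and the weights would be fitted to the experimental data. Neither step is shown here.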
To ensure that multicomponent data are accurately modelled, several models have been developed using both datasets to demonstrate that the pure- and binary-inclusive dataset models do not simply inflate results through the inclusion of easily predicted data. Regression performance was assessed using the coefficient of determination, the R² score. Cross-validation and hold-out validation have been employed in conjunction to assess the model's ability to predict unseen data, while facilitating parameter optimization and yielding the bias and variance associated with the model. Cross-validation has been implemented by means of 10-fold validation, with a randomized 70%-30% train-test split performed to determine the test indices for each fold. Hold-out validation has been achieved by means of a 10% stratified split, whereby the proportion of data from each independent source is held approximately constant across the training and hold-out validation sets, with the purpose of ensuring that a wide range of conditions is tested. A cross-validation R² score of 0.9860 is achieved with a standard deviation of 0.0035. Hold-out validation yields an R² of 0.9926. Results indicate that a sufficiently accurate model has been achieved, with a low enough variance to consider the model universal over the range of equilibrium data included in this investigation. The dependency on individual experimental data sources is of concern due to the limited amount of multicomponent equilibrium data available, the age of the equilibrium measurement practices used by many sources, and the time frames associated with hydrate equilibrium measurements. However, the inclusion of pure methane and binary methane mixtures does assist in reducing the susceptibility of the model to these errors. Dependency on individual data sources has been assessed by means of grouped cross-validation performed on the neural network models.
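The validation scheme above can be sketched as follows. This is one possible reading of the description, not the author's code. The 10-fold procedure is interpreted here as ten independent random 70%-30% splits of the training data, and the hold-out set is drawn by taking roughly 10% of the samples from each source. The function names, the source labels "A", "B" and "C", and the sample counts are hypothetical.

```python
import random
from collections import defaultdict

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def stratified_holdout(sources, fraction=0.10, seed=0):
    """Hold out about `fraction` of the samples from every data source."""
    rng = random.Random(seed)
    by_source = defaultdict(list)
    for idx, src in enumerate(sources):
        by_source[src].append(idx)
    holdout = []
    for src in sorted(by_source):
        indices = by_source[src]
        rng.shuffle(indices)
        holdout.extend(indices[:round(fraction * len(indices))])
    holdout_set = set(holdout)
    train = [i for i in range(len(sources)) if i not in holdout_set]
    return train, sorted(holdout)

def shuffle_splits(indices, n_splits=10, test_fraction=0.30, seed=0):
    """Yield n_splits random (train, test) partitions of `indices`."""
    rng = random.Random(seed)
    n_test = round(test_fraction * len(indices))
    for _ in range(n_splits):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[n_test:], shuffled[:n_test]

# Hypothetical source labels, one per equilibrium data sample.
sources = ["A"] * 100 + ["B"] * 60 + ["C"] * 40
train_idx, holdout_idx = stratified_holdout(sources)
folds = list(shuffle_splits(train_idx))
```

Under this reading, the reported cross-validation score would be the mean of the ten per-split R² values and the reported standard deviation their spread, while the hold-out R² comes from a single evaluation on the samples that were never used for training or tuning.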
Grouping results do indicate a lack of independently obtained data covering certain ranges of conditions; however, binary-inclusive models are shown to dampen the effect of experimental or measurement error on the model at large. Due to a lack of independent experimental studies covering a wide range of conditions, hydrogen sulphide could not be included as a feature in model development. As such, the developed model is noted to be applicable to sweet natural gas flow systems, where hydrate structures I or II are exhibited.
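Grouped cross-validation, as used above to probe dependency on individual data sources, keeps every sample from a given source on the same side of each split, so the model is always tested on sources it has not seen. The sketch below is a generic illustration of that idea and is not taken from the report. The greedy fold assignment, the function name `group_k_fold`, the number of folds, and the source labels are all assumptions.

```python
from collections import defaultdict

def group_k_fold(groups, n_splits):
    """Yield (train, test) index lists in which no group spans both sides."""
    by_group = defaultdict(list)
    for idx, g in enumerate(groups):
        by_group[g].append(idx)
    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest samples.
    fold_members = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda k: (-len(by_group[k]), k)):
        smallest = min(range(n_splits), key=lambda f: len(fold_members[f]))
        fold_members[smallest].extend(by_group[g])

    for members in fold_members:
        test = sorted(members)
        test_set = set(test)
        train = [i for i in range(len(groups)) if i not in test_set]
        yield train, test

# Hypothetical source labels, one per equilibrium data sample.
groups = ["A"] * 5 + ["B"] * 3 + ["C"] * 3 + ["D"] * 2
splits = list(group_k_fold(groups, n_splits=3))
```

A marked drop in R² on a fold whose test sources cover conditions absent from the training sources would point to the kind of gap in independently obtained data that the grouping results describe.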
A research report submitted to the Faculty of Engineering and the Built Environment, University of the Witwatersrand, Johannesburg, in partial fulfilment of the requirements for the degree of Master of Science in Engineering
Landgrebe, Michael Konrad Bernhard (2019) Towards a robust, universal predictor of gas hydrate equilibria through the means of a deep learning regression, University of the Witwatersrand, Johannesburg