Empirical statistical modelling for crop yields predictions: bayesian and uncertainty approaches

Includes bibliographical references

Bibliographic Details
Main Author: Adeyemi, Rasheed Alani
Other Authors: Guo, Renkuan
Format: Thesis
Language: English
Published: Department of Statistical Sciences 2015
Subjects: Mathematical Statistics
_version_ 1867613344160022529
access_status_str Open Access
author Adeyemi, Rasheed Alani
author2 Guo, Renkuan
author_browse Adeyemi, Rasheed Alani
Guo, Renkuan
author_facet Guo, Renkuan
Adeyemi, Rasheed Alani
author_sort Adeyemi, Rasheed Alani
collection Thesis
description Includes bibliographical references
format Thesis
id oai:open.uct.ac.za:11427/15533
institution University of Cape Town (South Africa)
language eng
last_indexed 2026-06-10T12:34:39.078Z
license_str Not specified — see source repository
provenance_str_mv Harvested via OAI-PMH from UCTD — University of Cape Town Open Access Repository
publishDate 2015
publishDateRange 2015
publishDateSort 2015
publisher Department of Statistical Sciences
publisherStr Department of Statistical Sciences
record_format dspace
source_str UCTD — University of Cape Town Open Access Repository
spelling oai:open.uct.ac.za:11427/15533 Empirical statistical modelling for crop yields predictions: bayesian and uncertainty approaches Adeyemi, Rasheed Alani Guo, Renkuan Dunne, Tim Mathematical Statistics Includes bibliographical references This thesis explores uncertainty statistics to model agricultural crop yields in a situation where there are neither sampling observations nor historical records. The Bayesian approach to a linear regression model is useful for prediction of crop yield when there are data quantity issues, when the model structure is uncertain, and when the regression model involves a large number of explanatory variables. Data quantity issues might occur when a farmer is cultivating a new crop variety, moving to a new farming location or introducing a new farming technology, where the situation may warrant a change in the current farming practice. The first part of this thesis involved the collection of data from the experts' domain and the elicitation of the probability distributions. Uncertainty statistics, the foundation of uncertainty theory and the data gathering procedures were discussed in detail. We proposed a procedure for estimating uncertainty distributions. The procedure was then implemented on agricultural data to fit some uncertainty distributions to five cereal crop yields. A Delphi method was introduced and used to fit uncertainty distributions for multiple experts' data of sesame seed yield. The thesis defined an uncertainty distance and derived a distance for the difference between two uncertainty distributions. We lastly estimated the distance between a hypothesized distribution and an uncertainty normal distribution. Although the applicability of uncertainty statistics is limited to the one-sample model, it provides a fast way to establish a standard for process parameters.
Where no sampling observations exist or they are very expensive to acquire, the approach provides an opportunity to engage experts and come up with a model for guiding decision making. In the second part, we fitted a full dataset obtained from an agricultural survey of small-scale farmers to a linear regression model using direct Markov Chain Monte Carlo (MCMC), Bayesian estimation (with uniform prior) and maximum likelihood estimation (MLE) methods. The three procedures yielded similar mean estimates, but the Bayesian credible intervals were narrower than the MLE confidence intervals. The predictive outcome of the estimated model was then assessed using simulated data for a set of covariates. The dataset was then randomly split into two halves. The informative prior was estimated from one half, called the "old data", using the Ordinary Least Squares (OLS) method. Three models were then fitted to the second half, called the "new data": General Linear Model (GLM) (M1), Bayesian model with a non-informative prior (M2) and Bayesian model with informative prior (M3). A leave-one-out cross-validation (LOOCV) method was used to compare the predictive performance of these models. It was found that the Bayesian models showed better predictive performance than M1. M3 (with a prior) had moderate average cross-validation (CV) error and CV standard error. The GLM performed worst, with the least average CV error and the highest CV standard error among the models. In model M3 (expert prior), the predictor variables were found to be significant at 95% credible intervals. In contrast, most variables were not significant under models M1 and M2. Also, the model with the informative prior had narrower credible intervals than the non-informative prior and GLM models.
The results indicated that variability and uncertainty in the data were reasonably reduced by incorporating the expert (informative) prior. We lastly investigated the residual plots of these models to assess their prediction performance. Bayesian Model Averaging (BMA) was later introduced to address the model structure uncertainty of a single model. BMA allows the computation of a weighted average over possible model combinations of predictors. An approximate AIC weight was then proposed for model selection instead of frequentist alternative hypothesis testing (or model comparison in a set of competing candidate models). The method is flexible and easier to interpret than raw AIC or the Bayesian information criterion (BIC), which approximates the Bayes factor. Zellner's g-prior was considered appropriate as it has been widely used in linear models. It preserves the correlation structure among predictors in its prior covariance. The method also yields closed-form marginal likelihoods, which lead to huge computational savings by avoiding sampling in the parameter space as in BMA. We lastly determined a single optimal model from all possible combinations of models and also computed the log-likelihood of each model. 2015-12-03T14:08:13Z 2015-12-03T14:08:13Z 2015 Master Thesis Masters MSc http://hdl.handle.net/11427/15533 eng application/pdf Department of Statistical Sciences Faculty of Science University of Cape Town
spellingShingle Mathematical Statistics
Adeyemi, Rasheed Alani
Empirical statistical modelling for crop yields predictions: bayesian and uncertainty approaches
thesis_degree_str Master's
title Empirical statistical modelling for crop yields predictions: bayesian and uncertainty approaches
title_sort empirical statistical modelling for crop yields predictions bayesian and uncertainty approaches
topic Mathematical Statistics
url http://hdl.handle.net/11427/15533
work_keys_str_mv AT adeyemirasheedalani empiricalstatisticalmodellingforcropyieldspredictionsbayesiananduncertaintyapproaches