The size of a fish is usually measured by its body length. However, regulating fishing activity to prevent overfishing and stock depletion also requires estimates of body weight. In particular, weight is often needed to estimate average stock biomass. Average length information is available for many stocks, so estimating weight from length can be crucial for fisheries management. The weight of a species is related to its length through parameters that depend on the species' growth rate and body shape. A recent study by Froese et al. (2014)[1] estimated these parameters for 1,821 species, based on 5,150 length-weight relationship studies. The Bayesian model they produced combined the data with prior expert knowledge and with statistics on species of similar shape. Unfortunately, the model required 20 days of computational time to run on the complete data set, which made it very hard to produce and update the estimates.
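The relationship between length and weight is conventionally written as W = a * L^b, where W is weight, L is length, and a and b are the species-specific parameters. The sketch below is only an illustration of that formula. It is not the Bayesian model of Froese et al.: it uses a plain least-squares fit on log-transformed values instead, and the parameter values (a = 0.01, b = 3.0), the units, and the function names are assumptions chosen for the example.

```python
import math


def weight_from_length(length_cm, a, b):
    """Predict weight (g) from length (cm) using W = a * L**b."""
    return a * length_cm ** b


def fit_log_log(lengths_cm, weights_g):
    """Estimate a and b by ordinary least squares on
    log10(W) = log10(a) + b * log10(L).
    This is a simple stand-in, not the Bayesian approach of the paper."""
    xs = [math.log10(length) for length in lengths_cm]
    ys = [math.log10(weight) for weight in weights_g]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope of the log-log line
    a = 10 ** (mean_y - b * mean_x)    # intercept, back-transformed
    return a, b


# Illustrative parameter values only (assumed, not taken from the paper).
A_EXAMPLE, B_EXAMPLE = 0.01, 3.0

# Synthetic, noise-free observations generated from the assumed parameters.
lengths = [10.0, 20.0, 30.0, 40.0]
weights = [weight_from_length(length, A_EXAMPLE, B_EXAMPLE) for length in lengths]

# Fitting the synthetic data should recover the assumed parameters.
a_hat, b_hat = fit_log_log(lengths, weights)
```

With real measurements the fit would be noisy, which is one reason an approach that borrows information from prior knowledge and from similar species can be attractive.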
To speed the process up, the authors asked to run the script on D4Science, the e-Infrastructure that BlueBRIDGE relies upon. The e-Infrastructure staff ported the R script implementing the Bayesian model to the e-Infrastructure by adapting its execution to a Cloud computing approach. Without altering the code provided by the scientists, the Cloud computing platform split the input into several chunks and ran the script on them in parallel[2]. Using 20 machines on average (including European Grid Infrastructure resources), the execution time dropped to 11 hours (a reduction of roughly 97.7% from the original 20 days). The resulting data allowed the authors to finalise and publish the paper, and have been used in high-level European Commission decisions on fisheries regulation. The algorithm has also been published as a service in the e-Infrastructure and is hosted by BlueBRIDGE behind a standard interface (Web Processing Service), so that it can be invoked by other e-Infrastructures and users (e.g. ENVRI, MyExperiment, Biodiversity Catalogue).
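The chunk-and-parallelise idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the D4Science implementation: a local thread pool stands in for the pool of Cloud machines, `run_unmodified_script` is a hypothetical placeholder for the scientists' unchanged R script, and the species list is a small example input.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into at most n_chunks contiguous, near-equal parts."""
    n_chunks = max(1, min(n_chunks, len(items)))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def run_unmodified_script(species_chunk):
    """Hypothetical stand-in for the per-chunk run of the original script.
    Here it only returns a dummy value per species."""
    return {name: len(name) for name in species_chunk}


def run_in_parallel(species, n_workers):
    """Run the script on each chunk concurrently and merge the outputs."""
    chunks = split_into_chunks(species, n_workers)
    merged = {}
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        # map() preserves chunk order, so merging is deterministic.
        for partial in pool.map(run_unmodified_script, chunks):
            merged.update(partial)
    return merged


# Example input (assumed for illustration).
species = [
    "Gadus morhua",
    "Thunnus thynnus",
    "Sardina pilchardus",
    "Engraulis encrasicolus",
    "Merluccius merluccius",
]
results = run_in_parallel(species, n_workers=3)
```

The approach works without touching the script itself because each species (or group of species) can be processed independently, so the only added logic is splitting the input and merging the per-chunk outputs.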
[1] Froese, R., Thorson, J. T., & Reyes, R. B. (2014). A Bayesian approach for estimating length‐weight relationships in fishes. Journal of Applied Ichthyology, 30(1), 78-85.
[2] Coro, G., Candela, L., Pagano, P., Italiano, A., & Liccardo, L. (2014). Parallelizing the execution of native data mining algorithms for computational biology. Concurrency and Computation: Practice and Experience.
