
    Comparison of estimators of effective sample size for catch-at-age and catch-at-length data using simulated data from the Dirichlet-multinomial Distribution

    S.G. Candy (Australia)

    The incorporation of “effective sample size” (ESS) in integrated assessments is an approximate but simple way of modelling the distribution of catch-at-age or catch-at-length frequencies using a multinomial likelihood when there is extra-multinomial heterogeneity in age-class or length-class frequencies. The ESS applied within the definition of the negative log-likelihood contribution to the objective function in CASAL determines the implicit weight given to the commercial catch-at-age or catch-at-length frequency data relative to the other types of data used in integrated assessments of toothfish stocks. An appropriate and accurate estimate of the ESS for catch frequency data for each fishery and fishing year is therefore important for such assessments, and this issue is studied here using simulation. Between-haul heterogeneity within a fishing year was simulated using samples from the Dirichlet-multinomial (D-M) distribution, with marginal class probabilities generated using a simple age-structured model incorporating fishing selectivity. Between-year error in these probabilities, either “process” or “systematic”, was also generated by varying one of the selectivity function parameters across years randomly or linearly, respectively. Five alternative methods of estimating ESS were compared using this simulation model. Two existing methods are based on the lack-of-fit of predictions of class probabilities using aggregate year-level frequencies. The other three estimators use the haul-level frequencies, including a method based on a conditional profile maximum likelihood estimate of the D-M dispersion parameter. This last method generally gave the best estimator of an ESS that is appropriate for haul-level heterogeneity, with another of the haul-level methods giving similar estimates.
The year-level methods gave very inaccurate estimates of this ESS when process error variance was set to zero, with relative mean square error an order of magnitude worse than that of the best two haul-level methods. When process error was incorporated, one of the year-level methods gave reduced estimates of ESS. An appropriate distributional model that incorporates process error in addition to haul-level heterogeneity, while giving a marginal variance relationship that allows an ESS to be defined, does not appear to be available, so heuristic arguments and simulation results are used to discuss the issue of estimating ESS in the presence of process error. It is shown that care should be taken to avoid mistaking year-to-year model lack-of-fit, caused by systematic deviations of observed from predicted class frequencies, for process error and using it to reduce the ESS inappropriately.
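
To make the link between D-M haul-level heterogeneity and ESS concrete, the following is a minimal sketch rather than any of the five estimators compared in the document. It assumes a hypothetical parameterisation in which haul-level class probabilities are drawn from a Dirichlet distribution with parameters `alpha0 * pi`, and it uses the standard D-M variance inflation factor `(n + alpha0) / (1 + alpha0)` to obtain a per-haul ESS of `n * (1 + alpha0) / (n + alpha0)`. The class probabilities, dispersion value, haul size, and function names are all illustrative assumptions; the simple moment ratio used as a check is not taken from the source.

```python
import numpy as np


def simulate_dm_hauls(pi, alpha0, n_per_haul, n_hauls, rng):
    """Draw haul-level class counts from a Dirichlet-multinomial distribution.

    Each haul gets its own class probabilities from Dirichlet(alpha0 * pi),
    then its counts from a multinomial with those probabilities.
    """
    pi = np.asarray(pi, dtype=float)
    haul_probs = rng.dirichlet(alpha0 * pi, size=n_hauls)
    return np.array([rng.multinomial(n_per_haul, p) for p in haul_probs])


def theoretical_haul_ess(n_per_haul, alpha0):
    """Per-haul ESS implied by the D-M variance inflation (n + a0) / (1 + a0)."""
    return n_per_haul * (1.0 + alpha0) / (n_per_haul + alpha0)


def moment_haul_ess(counts, pi):
    """Illustrative moment ratio: multinomial variance for a sample of size one,
    summed over classes, divided by the observed variance of haul proportions."""
    pi = np.asarray(pi, dtype=float)
    props = counts / counts.sum(axis=1, keepdims=True)
    unit_multinomial_var = np.sum(pi * (1.0 - pi))
    observed_var = np.sum(props.var(axis=0, ddof=1))
    return unit_multinomial_var / observed_var


# Hypothetical settings, chosen only for illustration.
rng = np.random.default_rng(2024)
pi = [0.10, 0.20, 0.30, 0.25, 0.15]   # marginal class probabilities
alpha0 = 20.0                          # D-M dispersion (larger = less heterogeneity)
n_per_haul = 100                       # fish sampled per haul
n_hauls = 5000

counts = simulate_dm_hauls(pi, alpha0, n_per_haul, n_hauls, rng)
ess_theory = theoretical_haul_ess(n_per_haul, alpha0)
ess_moment = moment_haul_ess(counts, pi)
print(f"theoretical per-haul ESS: {ess_theory:.2f}")
print(f"moment-ratio per-haul ESS: {ess_moment:.2f}")
```

Under these assumed settings each haul of 100 fish carries the information of far fewer independent multinomial observations, which is the sense in which the ESS down-weights the frequency data relative to the nominal sample size.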