As described in the introduction of my (draft) "Reply to 'Reply to Whitehead'", I suspect that I used the incorrect confidence intervals when analyzing the Desvousges, Mathews and Train (2015) data. Park, Loomis and Creel (1991) introduced the Krinsky-Robb approach for estimating confidence intervals for willingness to pay estimates from dichotomous choice contingent valuation models. Cameron (1991) introduced the Delta Method approach. As indicated by their Google Scholar citations, 461 and 229 respectively, both have been used extensively in the applied CVM literature. Hole (2007) compares the two approaches (along with the Fieller and bootstrap approaches) and finds little difference among them for well-behaved simulated data. However, Hole (2007) points out that the Delta Method requires that willingness to pay be normally distributed for the confidence interval to be accurate. He states that "... it is likely that WTP is approximately normally distributed when the model is estimated using a large sample and the estimate of the coefficient for the cost attribute is sufficiently precise." (p. 830) In Whitehead (2020) I used the Delta Method confidence intervals in my statistical tests. This is very likely an inappropriate approach due to the imprecision of the estimate of the parameter on the cost amount.
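To make the Delta Method concrete, here is a minimal Python sketch for the simplest case, WTP = -a/b, where a is the constant and b is the cost coefficient. The function name, the coefficient values and the variance-covariance matrix are all hypothetical and chosen only for illustration. They are not estimates from any of the studies discussed here, and this is not the NLogit implementation.

```python
import numpy as np

def delta_method_ci(a, b, cov, z=1.96):
    """Delta Method interval for WTP = -a / b (illustrative helper).

    a, b : constant and cost coefficient from a logit model
    cov  : 2x2 variance-covariance matrix ordered as (a, b)
    """
    wtp = -a / b
    # Gradient of -a/b with respect to (a, b)
    grad = np.array([-1.0 / b, a / b**2])
    # First-order (linear) approximation to the standard error of WTP
    se = float(np.sqrt(grad @ np.asarray(cov) @ grad))
    return wtp, wtp - z * se, wtp + z * se

# Hypothetical values: se(b) = 0.0008, so the t-statistic on b is -2.5
a_hat, b_hat = 1.0, -0.002
cov_hat = [[0.04, 0.0], [0.0, 6.4e-7]]

wtp, lo, hi = delta_method_ci(a_hat, b_hat, cov_hat)
print(f"WTP = {wtp:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

The interval is symmetric around the point estimate by construction, which is why its accuracy depends on WTP being approximately normally distributed.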
When working on Whitehead (2020) I used NLogit (www.limdep.com) software to estimate the confidence intervals. NLogit allows for both the Delta Method and Krinsky-Robb approaches to be used. But the Krinsky-Robb confidence intervals may require the assumption of normality. Hole (2007): "The [Krinsky-Robb] confidence interval could also be derived by using the draws to calculate the variance of WTP ..., but this approach, like the delta method confidence interval, hinges on the assumption that WTP is symmetrically distributed." (p. 831) Almost all of the Krinsky-Robb confidence intervals estimated by NLogit "blew up" when using the DMT (2015) data; in other words, the upper and lower limits were in the 10s and 100s of thousands (positive and negative). This made little sense to me at the time, but my guess now is that the NLogit software cannot handle the estimation when the WTP normality assumption is violated. Typically, Delta Method and Krinsky-Robb confidence intervals are not very different when estimated in NLogit (as shown below).
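The following Python sketch illustrates the point in the Hole (2007) quote. It builds a symmetric interval from the variance of simulated WTP draws and compares it with a percentile interval from the same draws. The coefficient values are hypothetical, with a t-statistic of -2.5 on the cost coefficient, and WTP is taken to be -a/b. This is only an illustration of why a variance-based interval can explode when the cost coefficient is imprecise. It is not a claim about how NLogit computes its intervals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical logit estimates (constant a, cost coefficient b); not DMT data
mean = np.array([1.0, -0.002])
cov = np.array([[0.04, 0.0], [0.0, 6.4e-7]])  # t-statistic on b is -2.5

# Simulate coefficient vectors and the implied WTP = -a / b
draws = rng.multivariate_normal(mean, cov, size=100_000)
wtp = -draws[:, 0] / draws[:, 1]

# Symmetric interval built from the variance of the simulated WTP values
center = -mean[0] / mean[1]
sd = wtp.std(ddof=1)
sym = (center - 1.96 * sd, center + 1.96 * sd)

# Percentile interval from the same draws
pct = tuple(float(q) for q in np.percentile(wtp, [2.5, 97.5]))

print("variance-based:", sym)
print("percentile:    ", pct)
```

Draws of the cost coefficient that fall close to zero produce enormous positive and negative WTP values. These dominate the variance and stretch the symmetric interval, while the percentile interval simply trims them.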
Following my reading of Desvousges, Mathews and Train (forthcoming) I thought through the above (obviously, I should have thought it through before) and estimated WTP with Krinsky-Robb intervals in SAS (my program is available upon request). My Krinsky-Robb intervals are akin to what Hole (2007) calls the Monte Carlo percentile approach. I take one million draws of the coefficients using the estimated variance-covariance matrix and trim the α/2 share of highest and lowest WTP values, where α=0.05. Hole's (2007) Krinsky-Robb intervals are based on a resampling approach, but he finds little difference between the resampling and Monte Carlo Krinsky-Robb intervals.
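The Monte Carlo percentile procedure can be sketched in Python as follows. This is not the SAS program mentioned above. The function name and the input values are hypothetical. The sketch assumes the coefficients are drawn from a multivariate normal distribution centered on the point estimates, and it uses the simple WTP = -a/b rather than a version restricted to be positive.

```python
import numpy as np

def krinsky_robb_percentile(coef, cov, n_draws=1_000_000, alpha=0.05, seed=0):
    """Monte Carlo percentile interval for WTP = -a / b (illustrative helper).

    coef : point estimates ordered as (constant a, cost coefficient b)
    cov  : 2x2 estimated variance-covariance matrix in the same order
    """
    rng = np.random.default_rng(seed)
    # Draw coefficient vectors from the estimated sampling distribution
    draws = rng.multivariate_normal(coef, cov, size=n_draws)
    wtp = -draws[:, 0] / draws[:, 1]
    # Trim the alpha/2 share of draws in each tail
    lower, upper = np.percentile(wtp, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)

# Hypothetical estimates; the t-statistic on the cost coefficient is -2.5
coef_hat = [1.0, -0.002]
cov_hat = [[0.04, 0.0], [0.0, 6.4e-7]]

lo, hi = krinsky_robb_percentile(coef_hat, cov_hat, n_draws=200_000)
point = -coef_hat[0] / coef_hat[1]
print(f"WTP = {point:.2f}, 95% percentile interval = [{lo:.2f}, {hi:.2f}]")
```

Because the interval is read directly off the simulated distribution, it does not have to be symmetric around the point estimate.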
For this analysis I am only using the whole scenario from DMT (2015) since this is sufficient to show that WTP for the whole cannot be statistically distinguished from WTP for the sum of the parts with the Krinsky-Robb Monte Carlo percentile intervals. The logit models are presented below for the full sample (n=172), the sample with observations with missing demographics deleted (n=163) and the Chapman et al. (2009) data. In each model the constant and the coefficient on the cost amount are statistically different from zero. But the precision of the cost coefficients with the DMT (2015) data is low relative to other CVM studies. Combined with the small samples, this imprecision means the Desvousges, Mathews and Train WTP estimate may not be normally distributed. The Chapman et al. (2009) study, on the other hand, has a large sample size and a precisely estimated coefficient on the cost amount.
The WTP estimates (restricting WTP to be positive) and confidence intervals are presented below. The Delta Method confidence intervals are estimated in NLogit and the Krinsky-Robb percentile intervals are estimated in SAS. The appropriateness of the Delta Method with the DMT (2015) data is questionable. First, the Delta Method lower bound on the DMT (2015) full sample (n=172) estimate is less than 50% of the Krinsky-Robb lower bound. Second, the Krinsky-Robb upper bound is 269% larger than the Delta Method upper bound. The imprecision of the coefficient on the cost amount is driving the asymmetry. The cost coefficient in the smaller sample (n=163) is estimated even more imprecisely, and the Krinsky-Robb confidence interval for that sample includes zero.
These results should be considered in contrast to the WTP estimate from the Chapman et al. (2009) data. The Delta Method and Krinsky-Robb intervals are very close. The symmetric Krinsky-Robb confidence interval estimated in NLogit is [236.90, 320.37], which is also very close to the Delta Method interval. One benchmark for symmetric confidence intervals in CVM studies, therefore, is a sample size greater than 1000 and a t-statistic on the cost coefficient of -9.5. Of course, sensitivity around these benchmarks should be assessed since not many CVM studies have these characteristics. (Note: I'll do some of this sort of work when I go back to my past and analyze some of the CVM data from the old days that Desvousges, Mathews and Train (forthcoming) assert is just as bad as their own data.)
The point estimate of the sum of the WTP parts for the full sample is $1114.36. This value is within the Krinsky-Robb 95% interval for the whole, so we cannot reject the hypothesis that WTP for the whole is equal to WTP for the sum of the parts at the p<0.05 level. The 90% interval is [264.84, 1314.14], which also contains the sum of the parts, so the hypothesis cannot be rejected at the p<0.10 level either. The point estimate of the sum of the WTP parts for the trimmed sample is $1079.73. Again, the WTP for the sum of the parts is within the Krinsky-Robb 95% interval for the whole. Note that the WTP for the whole is not different from zero with this sample, so any statistical inference makes less sense than if WTP were different from zero. These results are consistent with my (erroneous) conclusion (Whitehead 2020) that the data in Desvousges, Mathews and Train (2015) are not sufficient to conclude that contingent valuation does not pass the adding up test.
References
Cameron, Trudy Ann. "Interval estimates of non-market resource values from referendum contingent valuation surveys." Land Economics 67, no. 4 (1991): 413-421.
Hole, Arne Risa. "A comparison of approaches to estimating confidence intervals for willingness to pay measures." Health Economics 16, no. 8 (2007): 827-840.
Park, Timothy, John B. Loomis, and Michael Creel. "Confidence intervals for evaluating benefits estimates from dichotomous choice contingent valuation studies." Land Economics 67, no. 1 (1991): 64-73.