Bill Greene, author of Econometric Analysis and developer of NLOGIT, describes the differences between robust, cluster and panel estimators on the Limdep listserv (reprinted with permission):

(1) "Robust covariance matrix." Computed using inv(-H) * G'G * inv(-H) where H is the second derivatives matrix and G is the matrix whose rows are the first derivatives of logL, one row per observation. Rarely clear what this matrix is robust to. In its favor, it rarely matters much in practice. If the theory of the model is correct, this matrix estimates the same thing as inv(-H), so it is harmless. Important point, this matrix treats the data as if it were a cross section. It is making no use of any panel aspects of the sample, or correlation across observations.

(2) "Cluster correction." Computed using inv(-H) * Gc'Gc * inv(-H) where H is as before, and Gc has rows that are each equal to the within-group sums of the first derivatives. This is implicitly attempting to account for correlations across observations of the score vectors. Resembles corrections such as Newey-West. Rarely clear what the source of the correlation is. Generally, if the data are actually a panel and the panel aspect of the data is ignored by the estimator, this cluster estimator will pick up something. In the same situation, (2) is usually larger than (1), often much larger. In a panel data situation, fitting a probit model or a Poisson model by pooling the data instead of using a panel data estimator, I have seen standard errors rise by a factor of 2 or 4.

(3) "Panel estimator," Computed using inv(-H) or inv(G'G) where the hessians or first derivatives are appropriately computed using the likelihood for the panel data model. (3) often resembles (2), but frequently not because the estimator in (3) is the FIML panel estimator and the one in (2) is typically a "pooled" estimator that does not account for the panel nature of the data. ...
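For readers who want to see the first two formulas in action, here is a minimal sketch in Python with NumPy. It is not Greene's or NLOGIT's code. It assumes a pooled Poisson model fit by Newton's method to simulated panel data, and the function names, sample sizes, and parameter values are all made up for illustration. It does not attempt the panel estimator in (3), and it omits the small-sample adjustments that some packages apply to the cluster correction.

```python
import numpy as np


def fit_pooled_poisson(X, y, max_iter=100, tol=1e-10):
    """Pooled Poisson MLE by Newton's method (hypothetical helper).

    Assumes column 0 of X is a constant. Returns the estimate, the Hessian H
    of logL, and G, whose rows are each observation's first derivatives.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start the intercept at the log of the mean count
    for _ in range(max_iter):
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)            # gradient of logL
        H = -(X * mu[:, None]).T @ X      # Hessian of logL (negative definite)
        step = np.linalg.solve(H, score)
        beta = beta - step                # Newton update
        if np.max(np.abs(step)) < tol:
            break
    mu = np.exp(X @ beta)
    H = -(X * mu[:, None]).T @ X
    G = X * (y - mu)[:, None]             # one score vector per observation
    return beta, H, G


def sandwich_covariances(H, G, groups):
    """Return inv(-H), the 'robust' matrix (1), and the 'cluster' matrix (2)."""
    V_hess = np.linalg.inv(-H)
    V_robust = V_hess @ (G.T @ G) @ V_hess
    # Gc: sum the score vectors within each group, one row per group
    _, idx = np.unique(groups, return_inverse=True)
    Gc = np.zeros((idx.max() + 1, G.shape[1]))
    np.add.at(Gc, idx, G)
    V_cluster = V_hess @ (Gc.T @ Gc) @ V_hess
    return V_hess, V_robust, V_cluster


# Simulated panel: 200 respondents, 5 observations each (assumed values).
rng = np.random.default_rng(42)
n_groups, T = 200, 5
groups = np.repeat(np.arange(n_groups), T)
u = rng.normal(scale=0.7, size=n_groups)[groups]    # respondent-level effect
x = rng.normal(size=n_groups)[groups] + rng.normal(size=n_groups * T)
X = np.column_stack([np.ones(n_groups * T), x])
y = rng.poisson(np.exp(0.2 + 0.5 * x + u))

beta, H, G = fit_pooled_poisson(X, y)
V_hess, V_robust, V_cluster = sandwich_covariances(H, G, groups)
for name, V in [("inv(-H)", V_hess), ("robust", V_robust), ("cluster", V_cluster)]:
    print(name, "standard errors:", np.sqrt(np.diag(V)))
```

The only difference between the robust and cluster matrices is the middle term: G'G treats every row as independent, while Gc'Gc first adds up each respondent's rows. Changing the scale of `u` is a quick way to see how much within-respondent correlation it takes for the two sets of standard errors to diverge.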

Dr. Greene's post was in response to a question I posed while trying to satisfy a referee who wanted us to use clustered standard errors on top of a random effects panel estimator. His argument is that there is no need to cluster when the random effects panel estimator is used. This is reflected in NLOGIT, where a cluster correction is not an option with the panel estimators. Apparently, and note that I don't use it so I don't know, Stata allows clustering on top of the panel estimator. It seems like overkill to me but I'm no econometrician (in case you are wondering, Tim is an [applied?] econometrician, I certainly asked him about it, and he agreed that the random effects model was adequately addressing the issue).

To illustrate the differences between the clustered and random effects models, here is a supplemental table developed for our now-published EARE paper.

In the pooled model where we treat each observation as a different consumer, 21 out of 24 coefficient estimates are statistically significant. In the clustered model that accounts for correlation among respondents but not the panel nature of the data, only four coefficients are statistically significant. In the random effects panel model 15 out of 24 coefficients are statistically significant. In the random parameters model with only a constant random parameter, which I'm told is equivalent to the random effects model, 17 coefficients are statistically significant. I don't know what the results would be if we clustered the random effects panel model since I don't have Stata, but would be happy to share the data if anyone wants to try it out.

Consumer surplus per meal is economically different in the clustered and random effects models, $39 and $27, respectively. The clustered model finds no evidence of hypothetical bias (i.e., the coefficient on the stated preference data [SP] is not statistically significant) while there is evidence of hypothetical bias in the random effects model. Looking at the raw data, I'm suspicious of that non-result.

I'll let readers judge if our standard errors are too small. Any comments that would help me understand this issue would be great, since it seems like journal referees who use Stata hate the idea of statistically significant coefficient estimates. :)