Mark Thoma:

There's a version of this in econometrics, i.e. you know the model is correct, you are just having trouble finding evidence for it. It goes as follows. You are testing a theory you came up with, but the data are uncooperative and say you are wrong. But instead of accepting that, you tell yourself "My theory is right, I just haven't found the right econometric specification yet. I need to add variables, remove variables, take a log, add an interaction, square a term, do a different correction for misspecification, try a different sample period, etc., etc., etc." Then, after finally digging out that one specification of the econometric model that confirms your hypothesis, you declare victory, write it up, and send it off (somehow never mentioning the intense specification mining that produced the result).

Too much econometric work proceeds along these lines. Not quite this blatantly, but that is, in effect, what happens in too many cases. I think it is often best to think of econometric results as the best case the researcher could make for a particular theory rather than a true test of the model.
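Thoma's point is easy to make concrete with a small simulation. This is a minimal sketch of my own (not from either post): the outcome is pure noise by construction, yet searching over enough candidate "specifications" reliably turns up a significant-looking t-statistic.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_specs = 100, 40  # sample size; number of candidate "specifications"

# The outcome is pure noise: the theory being "tested" is false by construction.
y = rng.normal(size=n)

def t_stat(y, x):
    """OLS t-statistic on x in a regression of y on a constant and x."""
    x = x - x.mean()
    b = (x @ y) / (x @ x)
    resid = y - y.mean() - b * x
    se = np.sqrt((resid @ resid) / (len(y) - 2) / (x @ x))
    return b / se

# Each fresh regressor stands in for one more specification tweak
# (add a variable, take a log, change the sample period, ...).
best = max(abs(t_stat(y, rng.normal(size=n))) for _ in range(n_specs))
print(f"best |t| over {n_specs} specifications: {best:.2f}")
```

With 40 independent tries there is roughly an 87% chance of crossing |t| = 1.96 at least once even though nothing is there, which is exactly the specification mining Thoma describes.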

via economistsview.typepad.com

... cast the first stone.

I apologize in advance for the rant ahead. The rant is motivated by daily interactions with my colleagues, who mostly have PhDs in statistics, computer science and mathematics. This gives me a different and less-economisty view of model building and validation in general, so bear with me.

When you sit down and read your average economics paper you generally know what's coming in terms of statistics. You're going to see a regression (often a variation of least squares), there's going to be some fancy talk about the error term and the author will likely base most of the conclusions on what the coefficients look like. This is actually true of many social science papers and it has been the primary means of social science modeling for quite a while.

However, when I see some of the cool new tools coming out of statistics or computer science, I ask myself "why can't economists use these?" Those who follow finance relatively closely are aware of neural nets and some boosting techniques for time series forecasting, but as a general rule many economists seem to turn a blind eye to these cutting-edge techniques. I'm sure I'll get some comments mentioning Hal Varian's paper on machine learning and economics or Matthew Jackson's work with networks, but these seem to be exceptions rather than a trend. I've come up with a list of reasons why this may be, along with some comments:

1) *Economists are obsessed with error terms.* Anecdotally true - I've heard it's hard to get a paper accepted at a decent journal that doesn't report standard errors for its coefficients. With many machine learning models, you're generally concerned with some sort of cross-validation error rather than coefficient standard errors.
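For contrast, here is a minimal sketch (simulated data, numpy only, my own construction) of the kind of quantity a machine-learning analysis reports: a k-fold cross-validation error, with no standard errors on coefficients in sight.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 5
X = rng.normal(size=(n, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

def cv_mse(X, y, k):
    """k-fold cross-validated mean squared error of an OLS fit."""
    folds = np.array_split(rng.permutation(len(y)), k)
    errs = []
    for hold in folds:
        train = np.setdiff1d(np.arange(len(y)), hold)
        Xt = np.column_stack([np.ones(len(train)), X[train]])
        beta, *_ = np.linalg.lstsq(Xt, y[train], rcond=None)
        Xh = np.column_stack([np.ones(len(hold)), X[hold]])
        errs.append(np.mean((y[hold] - Xh @ beta) ** 2))
    return np.mean(errs)

# The model is judged by how well it predicts held-out data,
# not by t-statistics on its coefficients.
mse = cv_mse(X, y, k)
print(f"{k}-fold CV mean squared error: {mse:.2f}")
```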

2) *There isn't much interaction between economists and statisticians/mathematicians/computer scientists.* This also is (sadly) a factor. I've seen a decent amount of interaction between econometricians and statisticians, but it's generally only frequentist statisticians - not the new-fangled statisticians who are interested in some interesting Bayesian classifiers or machine learning techniques. Additionally, I rarely see economists publishing with authors from any of these three fields. There's a lot we could learn from them (and them from us!) so I see no reason why this should be binding.

3) *It's harder to explain results from models more complicated than regression techniques.* I can see this being an issue, particularly when some of these new techniques are first introduced and economists are not used to them. I can imagine that early seminars for papers with no standard errors will draw a lot of grumbling from the audience. However, I've seen economists explain some fairly dense material in a user-friendly way. It will take some practice, but I see nothing wrong with economists getting better at explaining models - it will add to our audience!

4) *Economists don't have access to large enough data sets to use these techniques.* I'm not sure about this one. After spending time in the tech field, it seems like everyone has "big data," but I realize that for many important topics that's just not true. Some of it is due to not having the data "all in one place," but that is what research assistants are for! Particularly for revealed preference data, it seems like there should be some low-hanging fruit in terms of larger data sets where these new modeling methods could be used.

5) *The math is too hard.* No! Well, I guess it's *different* math, but that's nothing to be afraid of. For many methods, such as trees, boosting and bagging, the math involved is not too different from existing econometrics.
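To illustrate how close the machinery is to familiar econometrics, here is a minimal sketch (simulated data; my own construction, not from the post) of gradient boosting with depth-one regression trees: each round just fits a very simple model to the current residuals.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x = rng.uniform(-3, 3, n)
y = np.sin(x) + rng.normal(0, 0.3, n)  # nonlinear signal plus noise

def fit_stump(x, y):
    """Depth-1 regression tree: choose the split minimizing squared error."""
    best = None
    for s in np.quantile(x, np.linspace(0.05, 0.95, 19)):
        left, right = y[x <= s], y[x > s]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    _, s, lm, rm = best
    return lambda z: np.where(z <= s, lm, rm)

# Boosting: repeatedly fit a simple model to the current residuals and
# add a shrunken copy of it -- conceptually not far from partialling-out.
pred, lr = np.zeros(n), 0.1
for _ in range(100):
    stump = fit_stump(x, y - pred)
    pred += lr * stump(x)

mse = np.mean((y - pred) ** 2)
print(f"in-sample MSE after boosting: {mse:.3f}")
```

Nothing here is beyond the least-squares algebra econometricians already use; the novelty is in the iteration, not the math.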

6) *Economists aren't always trying to predict something.* Yes and no - sometimes we're trying to describe a system and sometimes we're trying to predict what the system will do. Decision trees are a perfect example of a method that lends itself relatively well to descriptive studies. However, I can understand why neural nets might not be as useful. It's a fair scientific question: what are our models capable of and what should they be used for? I don't think changing economics to a discipline that produces only predictive studies would be a good progression (and then we'd get even more comparisons to weathermen!) but I do think that we should take the time to evaluate whether existing models or new models can be used for the purposes we intend.

I'd love to hear comments - I would be happy to be proven wrong about this trend.

"Type I" and "Type II" errors, names first given by Jerzy Neyman and Egon Pearson to describe rejecting a null hypothesis when it's true and accepting one when it's not, are too vague for stat newcomers (and in general). This is better. [via]

via flowingdata.com

To this day I must think hard to figure out Type I and II errors.
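A small simulation may help fix the ideas (simulated data, my own sketch): a Type I error is rejecting a true null, i.e. a false positive, and a Type II error is failing to reject a false null, i.e. a false negative.

```python
import numpy as np

rng = np.random.default_rng(2)
reps, n = 10_000, 30

# Type I error: the null (mean 0) is true, but |z| > 1.96 -> a false positive.
z_true_null = rng.normal(0.0, 1.0, (reps, n)).mean(axis=1) * np.sqrt(n)
type1 = np.mean(np.abs(z_true_null) > 1.96)

# Type II error: the null is false (true mean 0.5), but we fail to reject
# -> a false negative.
z_false_null = rng.normal(0.5, 1.0, (reps, n)).mean(axis=1) * np.sqrt(n)
type2 = np.mean(np.abs(z_false_null) <= 1.96)

print(f"Type I (false positive) rate:  {type1:.3f}")  # near 0.05 by design
print(f"Type II (false negative) rate: {type2:.3f}")
```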

Hat tip: Jayjit Roy

Last April I posted some results supplementing a recently published paper comparing approaches to handle panel data in Limdep. A referee asked for clustered standard errors, which Limdep doesn't do on top of a random effects panel Poisson estimator. Bill Greene provided some explanation for why on the Limdep listserv.

Eric Duquette (who, I seem to recall, won our NCAA tournament one year) left some good comments and via email offered to estimate some comparison models with Stata (thanks Eric!). His results are below (note that I've deleted 12 coefficients that are not statistically significant in any of the models). The random effects Poisson results are in column (3) and the random effects Poisson with clustered standard errors results are in column (4). The biggest difference between the two is the standard error on the MISSICK coefficient, which is 72% higher in column (4). The other standard errors are, on average, 27% lower in column (4) compared to column (3).

The more troubling thing is the difference in the Limdep and Stata coefficient estimates. The consumer surplus for each seafood meal is about $27 (1/.0372) with Limdep and $1.72 (1/.580) with Stata. This would seem to have policy implications. Any ideas on why the coefficient estimates differ so much?

**Update: I deleted "Poisson" from the title. Eric reports that the results below are from the continuous dependent variable regression. The RE Poisson results are the same in both Limdep and Stata (whew!). Also, it appears that Stata does not have an option to cluster standard errors in a RE Poisson, so referees who suggest that are mistaken. I've edited the post to reflect this (underlines are added and strikethroughs are cut).**

Note the bold-underlined phrases in the following:

US scientists found that even small changes in temperature or rainfall correlated with a rise in assaults, rapes and murders, as well as group conflicts and war.

The team says with the current projected levels of climate change, the world is likely to become a more violent place.

The study is published in Science.

Marshall Burke, from the University of California, Berkeley, said: "This is a relationship we observe across time and across all major continents around the world. The relationship we find between these climate variables and conflict outcomes are often very large."

The researchers looked at 60 studies from around the world, with data spanning hundreds of years.

They report a **"substantial" correlation** between climate and conflict. Their examples include an increase in domestic violence in India during recent droughts, and a spike in assaults, rapes and murders during heatwaves in the US.

The report also suggests rising temperatures correlated with larger conflicts, including ethnic clashes in Europe and civil wars in Africa.

Mr Burke said: "We want to be careful, you don't want to attribute any single event to climate in particular, but there are some really interesting results."

The researchers say they are now trying to understand why this **causal relationship** exists. "The literature offers a couple of different hints," explained Mr Burke.

"One of the main mechanisms that seems to be at play is changes in economic conditions. We know that climate affects economic conditions around the world, particularly agrarian parts of the world.

via www.bbc.co.uk

Looks like the confusion between correlation and causation is one of bad reporting. From the Science article itself (page 2):

Reliably measuring an effect of climatic conditions on human conflict is complicated by the inherent complexity of social systems. In particular, a central concern is whether statistical relationships can be interpreted causally or if they are confounded by omitted variables. To address this concern, we restrict our attention to studies with research designs that are a scientific experiment or that approximate one...

Bill Greene, author of Econometric Analysis and developer of NLOGIT, describes the differences between robust, cluster and panel estimators on the Limdep listserv (reprinted with permission):

(1) "Robust covariance matrix." Computed using inv(-H) * G'G * inv(-H), where H is the second derivatives matrix and G is the matrix whose rows are the first derivatives of logL. It is rarely clear what this matrix is robust to. In its favor, it rarely matters much in practice. If the theory of the model is correct, this matrix estimates the same thing as inv(-H), so it is harmless. Important point: this matrix treats the data as if it were a cross section. It makes no use of any panel aspects of the sample, or of correlation across observations.

(2) "Cluster correction." Computed using inv(-H) * Gc'Gc * inv(-H), where H is as before and Gc has rows that are each equal to the within-group sums of the first derivatives. This is implicitly attempting to account for correlations across observations of the score vectors. It resembles corrections such as Newey-West. It is rarely clear what the source of the correlation is. Generally, if the data are actually a panel and the panel aspect of the data is ignored by the estimator, this cluster estimator will pick up something. In the same situation, (2) is usually larger than (1), often much larger. In a panel data situation, fitting a probit model or a Poisson model by pooling the data instead of using a panel data estimator, I have seen standard errors rise by a factor of 2 or 4.

(3) "Panel estimator." Computed using inv(-H) or inv(G'G), where the Hessians or first derivatives are appropriately computed using the likelihood for the panel data model. (3) often resembles (2), but frequently not, because the estimator in (3) is the FIML panel estimator and the one in (2) is typically a "pooled" estimator that does not account for the panel nature of the data. ...
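Greene's formulas (1) and (2) are straightforward to compute directly. Here is an illustrative Python sketch (simulated clustered Poisson data, my own construction, not from the listserv): the "robust" sandwich uses per-observation scores, while the "cluster" sandwich uses within-group sums of scores, and both bread slices are inv(-H).

```python
import numpy as np

rng = np.random.default_rng(4)
n_groups, per = 50, 5
n = n_groups * per
g = np.repeat(np.arange(n_groups), per)        # cluster (group) labels
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# A group effect makes scores correlated within groups.
u = np.repeat(rng.normal(0.0, 0.5, n_groups), per)
y = rng.poisson(np.exp(0.5 * X[:, 1] + u))

# Pooled Poisson MLE via Newton's method.
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    H = -(X * mu[:, None]).T @ X               # Hessian of the log-likelihood
    beta -= np.linalg.solve(H, X.T @ (y - mu)) # Newton step

mu = np.exp(X @ beta)
G = X * (y - mu)[:, None]                      # rows: per-observation scores
Gc = np.array([G[g == j].sum(axis=0) for j in range(n_groups)])  # within-group sums
Hinv = np.linalg.inv((X * mu[:, None]).T @ X)  # inv(-H)

se_classical = np.sqrt(np.diag(Hinv))                      # inv(-H) alone
se_robust = np.sqrt(np.diag(Hinv @ (G.T @ G) @ Hinv))      # (1) "robust"
se_cluster = np.sqrt(np.diag(Hinv @ (Gc.T @ Gc) @ Hinv))   # (2) "cluster"
print(se_classical[1], se_robust[1], se_cluster[1])
```

On clustered data like this, the cluster standard error typically comes out well above the classical one, matching Greene's remark that (2) is usually larger than (1) when the pooled estimator ignores the panel structure.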

This was in response to a question I posed while trying to satisfy a referee who wanted us to use clustered standard errors on top of a random effects panel estimator. Dr. Greene's argument is that there is no need to cluster when the random effects panel estimator is used. This is reflected in NLOGIT, where a cluster correction is not an option with the panel estimators. Apparently, and note that I don't use it so I don't know, Stata allows clustering on top of the panel estimator. It seems like overkill to me but I'm no econometrician (in case you are wondering, Tim is an [applied?] econometrician, I certainly asked him about it, and he agreed that the random effects model was adequately addressing the issue).

To illustrate the differences between clustered and random effects here is a supplemental table developed for our, now published, EARE paper.

In the pooled model where we treat each observation as a different consumer, 21 out of 24 coefficient estimates are statistically significant. In the clustered model that accounts for correlation among respondents but not the panel nature of the data, only four coefficients are statistically significant. In the random effects panel model 15 out of 24 coefficients are statistically significant. In the random parameters model with only a constant random parameter, which I'm told is equivalent to the random effects model, 17 coefficients are statistically significant. I don't know what the results would be if we clustered the random effects panel model since I don't have Stata, but would be happy to share the data if anyone wants to try it out.

Consumer surplus per meal is economically different in the clustered and random effects models, $39 and $27, respectively. The clustered model finds no evidence of hypothetical bias (i.e., the coefficient on the stated preference data [SP] is not statistically significant) while there is evidence of hypothetical bias in the random effects model. Looking at the raw data, I'm suspicious of that non-result.

I'll let readers judge if our standard errors are too small. Any comments that would help me understand this issue would be great, since it seems like journal referees who use Stata hate the idea of statistically significant coefficient estimates. :)

An example for your econometrics course (source: Chronicle of Higher Education):

If one would like to create a spreadsheet for one's use, here are links to the president and faculty salary data:

My publisher has notified me that I can purchase hard copies of my Climatopolis book for $2.26 each. This isn't good news in terms of my expected future royalties but demand curves do slope down. I am purchasing 200 copies and giving them away for free to my UCLA students. So, this should provide you with a lower bound on how much I care about my students! The only good news here is that Climatopolis continues to be talked about in unusual places such as the National Review and random Chinese blogs.

On the broad topic of climate change adaptation, I plan to do two things:

1. I will be writing a historical migration paper with Leah Boustan and Paul Rhode on how U.S. migrants responded to past disasters.

2. I plan to write a short overview paper listing the open microeconomic research agenda (both in terms of reduced form slop and fancier structural work) on how to pin down the costs of climate adaptation.

A few comments:

- Dang. I paid full price for Climatopolis.
- Reduced form slop? Yikes, a rare elitist slip.
- Happy birthday eve Matt!

Here is how an entire day can get away from me. While I was motivating myself to shake loose from the covers I scanned my Google Reader and found Jennifer Imazeki's post on the ASSA panel on teaching with blogs:

... I thought it was a bit odd that the first thing Levitt said was, "I had never thought about using blogs to teach economics until I was asked to be a part of this panel" - on the one hand, I suppose that might be the position many audience members were in (i.e., they were at the session in part to find out more) but on the other hand, it immediately raised the question (at least in my head): then why was he there (other than to draw people in with the big name)? Another thing that seemed a bit odd was when he said that he thought requiring students to read economics blogs is "misguided" because so much of what gets published on blogs is not "right". That is, bloggers often write off the top of their head and so at least some of what they write ends up being incorrect. That could be confusing for students and it would be better to have them read more developed ideas, such as peer-reviewed work. ...

The last thing you want undergraduate students at 98% of the universities in the U.S. to do is hit the peer-reviewed journals in most courses.

Levitt also mentioned that in the course of writing his textbook, he went back through the archives of the Freakonomics blog, expecting to find lots of posts he could use to highlight various concepts. But what he found is that there weren't that many posts that were useful for talking about cost curves or any of the other abstract models that are so common in intermediate micro texts. One might think this would lead to some questioning about the relevance of those models but instead, he seemed to see that as another reason not to have students read blogs; i.e., if blogs aren't really connected to what students are learning in class, students don't need to be reading them. It didn't seem to occur to him that the problem might be with what students are learning in class... (Peter Dorman, who offers a less flattering view of the panel, notes the oddity of this as well).

Intermediate micro is probably the last course in the economics curriculum that could benefit from reading blogs. On the other hand, intermediate macro students could gain a lot from reading the debates between those guys that call each other out over their macro models.

After reading the rest of the post (read all of Jennifer's post [and her ASSA presentation post] if you'd like a nice discussion of the potential contribution of blogs to teaching) I couldn't help but click the "less flattering" link. Peter Dorman at Econospeak:

... There was an amusing moment in which Leavitt, noting the disconnect between the arguments economists make on the web when they discuss current issues and the parade of models in the textbooks, considered the possibility that the textbooks might be irrelevant. That moment lasted no more than ten seconds; he dismissed the heresy and recommended that teachers spend more time on the textbooks and less on the blogs.

I congratulate myself for not getting cranky. I made a comment which was intended to be entirely constructive. One point was that none of the panelists had mentioned Mark Thoma’s Economist’s View, which is an essential aggregator. I considered mentioning that one of the virtues of Mark’s site is that he links to noneconomists that economists ought to be interested in, like Andrew Gelman, the Bayesian statistician, but decided not to in order to spare the feelings of Leavitt.

So, I just had to click "the feelings of Leavitt" link, how could I resist (warning: changing the subject). That took me to an Andrew Gelman piece on the problems with SuperFreakonomics where I followed another link to an American Scientist article by Gelman and Kaiser Fung titled "Freakonomics: What Went Wrong?":

As the authors of statistics-themed books for general audiences, we can attest that Levitt and Dubner’s success is not easily attained. And as teachers of statistics, we recognize the challenge of creating interest in the subject without resorting to clichéd examples such as baseball averages, movie grosses and political polls. The other side of this challenge, though, is presenting ideas in interesting ways without oversimplifying them or misleading readers. We and others have noted a discouraging tendency in the Freakonomics body of work to present speculative or even erroneous claims with an air of certainty. Considering such problems yields useful lessons for those who wish to popularize statistical ideas. ...

In our analysis of the Freakonomics approach, we encountered a range of avoidable mistakes, from back-of-the-envelope analyses gone wrong to unexamined assumptions to an uncritical reliance on the work of Levitt’s friends and colleagues. This turns accessibility on its head: Readers must work to discern which conclusions are fully quantitative, which are somewhat data driven and which are purely speculative. ...

The risks of driving a car: In SuperFreakonomics, Levitt and Dubner use a back-of-the-envelope calculation to make the contrarian claim that driving drunk is safer than walking drunk, an oversimplified argument that was picked apart by bloggers. The problem with this argument, and others like it, lies in the assumption that the driver and the walker are the same type of person, making the same kinds of choices, except for their choice of transportation. Such all-else-equal thinking is a common statistical fallacy. In fact, driver and walker are likely to differ in many ways other than their mode of travel. What seem like natural calculations are stymied by the impracticality, in real life, of changing one variable while leaving all other variables constant.

Here is my concern about the drunk walking assertion from 2009. And this reminded me that the drunk walking stuff in SuperFreakonomics is still getting positive play at the Freakonomics blog. So, while I enjoyed Freakonomics, SuperFreakonomics and the blog, and applaud the authors for their positive effect on the economics profession, I'm nodding my head while reading "What Went Wrong?" and the other Gelman posts from last month and this month. Am I too critical of the Freakonomics guys (mostly SuperFreakonomics)? I guess I must agree with Gelman's disclaimer from 2010:

I’ve been picking on Freakonomics a lot recently, but really this is the result of selection bias: when Freakonomics has material of its usual high quality, I don’t have much to add, and when there’s material of more questionable value, I notice and sometimes comment on it. Those of us who’ve contributed to the burgeoning “what went wrong with Freakonomics 2?” literature are doing so only because we believe its authors could do better, if they were to put in the effort.

And that is how an entire day can get away from me.