A while back I was looking for recreation data that I could match to water quality data, particularly in the Gulf. The National Park Service provides park visitation statistics. These are pretty cool data, reported by park with a lot of detail and background. Here's an example for the Gulf Islands National Seashore:
This is great, right? You could pair the visitation data for a bunch of sites with water quality, temperature, hurricane, and other data and that might be pretty cool in a count data model.
The NPS site also has a link for each park with comments on each month's data collection. Let's see if there's anything interesting there.
Well, crap. A lot of the data is estimated and some portion of the data is missing due to non-response. And although I only posted a snippet, there appears to be some sort of comment for every month.
Kudos to NPS for posting such detailed information about their data. Each park's statistics also include "Visitor Use Counting Procedures," a series of PDF files describing the counting methods. NPS suggests you check these to see if a change in procedures could have affected counts.
I wanted to circle back to a theme that's come up in this blog before: stated preference (SP) vs. revealed preference data. Many economists would dismiss SP data about the Gulf but accept a table of numbers from the government. But this data is really, really messy. I'm not even sure how you would handle it--you'd need to separately control for any changes in count procedures and also for which months were estimated, site by site. Maybe beg for the daily data, if those exist?
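To make the "separately control for" idea a bit more concrete, here is a minimal sketch of what the covariates might look like before they go into a count data model. Everything in it is an assumption for illustration: the `MonthlyCount` fields, the `build_design` helper, and the toy numbers are made up and are not NPS data or an NPS format. The idea is just a site dummy, a 0/1 flag for months whose count was estimated, and a dummy for each counting procedure after a site's first one.

```python
from dataclasses import dataclass


@dataclass
class MonthlyCount:
    # Hypothetical record layout; NPS does not publish data in this shape.
    site: str
    month: str            # "YYYY-MM"
    visits: int
    estimated: bool       # monthly comment says the count was estimated
    procedure_start: str  # "YYYY-MM" when the procedure in force took effect


def build_design(records):
    """Return one covariate dict per record for use in a count data model."""
    sites = sorted({r.site for r in records})
    baseline_site = sites[0]  # omitted category for the site dummies

    # Earliest procedure at each site is that site's baseline regime.
    first_regime = {}
    for r in records:
        cur = first_regime.get(r.site)
        if cur is None or r.procedure_start < cur:
            first_regime[r.site] = r.procedure_start

    # Every later (site, procedure) pair gets its own dummy.
    later = sorted({(r.site, r.procedure_start) for r in records
                    if r.procedure_start != first_regime[r.site]})

    rows = []
    for r in records:
        row = {"visits": r.visits, "estimated": int(r.estimated)}
        for s in sites:
            if s != baseline_site:
                row[f"site_{s}"] = int(r.site == s)
        for site, start in later:
            row[f"proc_{site}_{start}"] = int(
                (r.site, r.procedure_start) == (site, start))
        rows.append(row)
    return rows


# Toy values only, invented to show the shape of the output.
toy = [
    MonthlyCount("A", "2010-01", 100, False, "2005-01"),
    MonthlyCount("A", "2010-02", 120, True, "2010-02"),
    MonthlyCount("B", "2010-01", 50, True, "2008-06"),
]
rows = build_design(toy)
```

The resulting rows could be handed to whatever Poisson or negative binomial routine you like, with `visits` as the outcome. Whether a simple dummy is enough to soak up an estimated month or a procedure change is exactly the open question, so treat this as a starting point rather than a fix.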
Note: We didn't use the data, not because it's messy, but because there aren't enough sites across the Gulf.
This work is not a product of the United States Government or the United States Environmental Protection Agency, and the author is not doing this work in any governmental capacity. The views expressed are those of the author only and do not necessarily represent those of the United States or the US EPA.