Blogs.worldbank.org

Can you measure flows over short periods? Aka why Justin Wolfers might (NOT) want to reconsider that parenting study

2015-04-02

[EDIT: I POSTED TOO HASTILY HERE, SEE AN ADDENDUM BELOW WHERE I AGREE WITH JUSTIN AFTER ALL]

Today in the Upshot, Justin Wolfers heavily criticizes a recent study that has received lots of media attention claiming that child outcomes are barely correlated with the time that parents spend with their children. He writes:
“This nonfinding largely reflects the failure of the authors to accurately measure parental input… it measures how much time each parent spends with children on only two particular days — one a weekday and the other a weekend day….The result is that whether you are categorized as an intensive or a distant parent depends largely on which days of the week you happened to be surveyed. For instance, I began this week by taking a couple of days off to travel with the children to Disneyworld. A survey asking about Sunday or Monday would categorize me as a very intense parent who spent every waking moment engaged with my children. But today, I’m back at work and am unlikely to see them until late. And so a survey asking instead about today would categorize me as an absentee parent. The reality is that neither is accurate. …Trying to get a sense of the time you spend parenting from a single day’s diary is a bit like trying to measure your income from a single day. If yesterday was payday, you look rich, but if it’s not, you would be reported as dead broke. You get a clearer picture only by looking at your income — or your parenting time — over a more meaningful period.”

Measuring flows over short periods

What is needed depends on the question of interest. Clearly if one wants to measure accurately the parenting of Justin Wolfers, then a 2-day sample will give a much less accurate picture than measuring over a week or a month. But the study is not interested in individual people, but in averages, or groups of people. So what if I want to measure the parenting time inputs of Economics Professors? With a fixed budget, perhaps I can survey 1000 of them for 2 days each, or survey only 100 of them but go back to them every 2 days for a month. Can I get unbiased estimates using either approach? If so, which is better?

Luckily I have a paper on this (ungated version, see section 3.2). [EDIT: ACTUALLY MY PAPER IS ON THE RELATED TOPIC OF MEASURING FLOWS AS AN OUTCOME VARIABLE] Let us assume that the sampling covers the population of interest. In the parental time input case, that means they randomly choose whether to ask you about Saturday or Sunday, which weekday to ask you about, and which week to ask you about. Clearly if they only measure Saturdays and Mondays, and do this during a week where Monday is a public holiday, this won’t give an accurate measure of average parental inputs over a longer term. Then either approach will give an unbiased estimate of the average parental time input of Economics Professors. In the case where we survey 1000 for 2 days each, we happen to get some Justins in Disneyworld, some Freds travelling for a conference, some Marias teaching a triple section and working late, some Pierres who are staying home because their kid was sick, and so on. So perhaps we get incredibly noisy estimates of the long-term time inputs of any one of them, but still an unbiased estimate of the mean.
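To make the unbiasedness claim concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the population mean of 3 hours per day, the spread across parents, the size of the day-to-day noise, the normal distributions (which can even produce negative hours), and the equal budget of 2000 diary days for both designs. It is not drawn from the study or from my paper.

```python
import numpy as np

rng = np.random.default_rng(0)

TRUE_MEAN = 3.0    # hypothetical population mean, hours per day
SD_BETWEEN = 1.0   # assumed spread of long-run averages across parents
SD_WITHIN = 2.0    # assumed day-to-day noise around each parent's own average


def survey_estimate(n_people, n_days):
    """Draw one survey and return its estimate of the population mean."""
    long_run = rng.normal(TRUE_MEAN, SD_BETWEEN, size=n_people)
    noise = rng.normal(0.0, SD_WITHIN, size=(n_people, n_days))
    diaries = long_run[:, None] + noise  # one row per parent, one column per day
    return diaries.mean()


def replicate(n_people, n_days, reps=2000):
    """Repeat the survey many times to see how the estimate behaves."""
    return np.array([survey_estimate(n_people, n_days) for _ in range(reps)])


wide = replicate(n_people=1000, n_days=2)   # many parents, two days each
deep = replicate(n_people=100, n_days=20)   # few parents, many days each

print("wide design: mean of estimates", wide.mean(), "spread", wide.std())
print("deep design: mean of estimates", deep.mean(), "spread", deep.std())
```

Under these assumptions both designs centre on the true mean even though any single 2-day diary is a poor guide to that parent's long-run input. The spreads differ, but this sketch only estimates a simple mean, so it should not be read as a verdict on which design is more precise in general.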

Then my paper shows the choice of whether to prefer a larger cross-section or longer time series depends on the estimation method used, and then, if the most powerful (Ancova) method is being used, on the autocorrelation of the outcome of interest. The bottom line is then the following:

“when the autocorrelation is high it is better to do a larger cross-section and fewer survey rounds, whereas when the autocorrelation is low, it is better to do relatively more survey rounds and a smaller cross-section”

So if we are surveying outcomes like profits, consumption, and some types of income, there is more power to be had from surveying multiple times/measuring over longer time periods. But for highly autocorrelated outcomes like test scores, it is better to have a larger cross-section. I don’t know what the autocorrelation of parental time inputs is, but my hunch would be that it is likely to be somewhere in between, in which case it may not make that much difference which approach is used.
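The trade-off can be sketched with a few lines of arithmetic. The setup below is an assumption on my part for illustration, not a quotation from the paper: a two-arm comparison with equal arms, m baseline rounds and r follow-up rounds, an outcome with variance sigma2 and the same correlation rho between any two rounds for the same person, and a fixed budget counted in total interviews. Under those assumptions the variance of the ANCOVA estimate is (2*sigma2/n) * [ (1+(r-1)*rho)/r - m*rho^2/(1+(m-1)*rho) ]. The function names and the budget of 2000 interviews are hypothetical.

```python
def ancova_variance(n_per_arm, m, r, rho, sigma2=1.0):
    """Variance of the ANCOVA estimate under equal correlation rho across rounds."""
    post_term = (1 + (r - 1) * rho) / r               # variance of the follow-up average
    pre_term = m * rho**2 / (1 + (m - 1) * rho) if m > 0 else 0.0  # gain from baseline
    return 2 * sigma2 / n_per_arm * (post_term - pre_term)


def best_followups(rho, budget=2000, m=1, max_r=10):
    """Find the number of follow-up rounds that minimises variance for a fixed budget.

    budget is the total number of interviews across both arms, so more rounds
    per person means fewer people (n_per_arm is left fractional for simplicity).
    """
    variances = {}
    for r in range(1, max_r + 1):
        n_per_arm = budget / (2 * (m + r))
        variances[r] = ancova_variance(n_per_arm, m, r, rho)
    best_r = min(variances, key=variances.get)
    return best_r, variances


for rho in (0.1, 0.5, 0.9):
    best_r, variances = best_followups(rho)
    print(f"rho={rho}: variance-minimising follow-up rounds = {best_r}")
```

With a high rho the budget is better spent on more people and a single follow-up, while with a low rho several follow-up rounds on a smaller sample do better. For intermediate values the variances across nearby designs are close, which is why the choice may matter little there.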

Now there are lots of other reasons to be skeptical of a cross-sectional regression study in which the assumption is parental time inputs are exogenous conditional on observables, but I don’t think measurement is the key problem here.

ADDENDUM

I can justly be accused of seeing the world like a nail for my “more T paper” hammer here. I apologize for posting too quickly and not being clear enough. Let me clarify.

My paper and post above are about measuring short-term flows when you want to estimate a mean or use them as an outcome (LHS variable) in a regression.

However, the paper Justin commented on runs the following types of regressions:

Current Child outcome (e.g. reading score) = a + b1*Current Parental Time input + controls + e (1)

Here Current Parental Time input is calculated as 2*weekend day measure + 5*weekday measure

b1 here then gives the association between this particular measure of parental time input and the child outcome.

I now understand that Justin’s point was that we really think the relationship of interest should be instead something like:

Current Child outcome (e.g. reading score) = a + b2*Long-term Average Parental Time input + controls + u (2)

Note here that b2 is then measuring a different association from b1.

We can run regression (1) and get an estimate of b1, which tells us the association between current parental time input and current child outcomes. If we measured parental time input over a longer term, we could run regression (2) and get a different parameter b2.

The problem arises if we run regression (1) but think we are estimating b2. Then we face an errors-in-variables problem: current parental time input is only a noisy proxy for long-term parental time input, so the classic OLS attenuation bias comes into play and the estimated coefficient is pulled towards zero. This is what I now understand to be Justin’s point.
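A small simulation sketch shows how large the attenuation can be. All the numbers are assumptions for illustration: a true b2 of 0.5, long-run daily inputs with standard deviation 1 hour, day-to-day noise with standard deviation 2 hours, no controls, and no systematic difference between weekend and weekday levels. The current measure is built as 2*weekend day + 5*weekday, as in regression (1).

```python
import numpy as np

rng = np.random.default_rng(1)

N = 200_000     # hypothetical number of families
B2 = 0.5        # assumed true coefficient on long-term weekly input
SD_LONG = 1.0   # assumed spread of long-run daily input across parents
SD_DAY = 2.0    # assumed day-to-day noise in a single diary day

long_run_daily = rng.normal(3.0, SD_LONG, size=N)
long_term_weekly = 7 * long_run_daily

# Two diary days per parent, each a noisy reading of the long-run daily input.
weekend_diary = long_run_daily + rng.normal(0.0, SD_DAY, size=N)
weekday_diary = long_run_daily + rng.normal(0.0, SD_DAY, size=N)
current_weekly = 2 * weekend_diary + 5 * weekday_diary

# The child outcome depends on the long-term input, as in regression (2).
outcome = 10.0 + B2 * long_term_weekly + rng.normal(0.0, 1.0, size=N)


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


slope_long = ols_slope(long_term_weekly, outcome)     # recovers b2
slope_current = ols_slope(current_weekly, outcome)    # attenuated towards zero

# Share of the variance in the current measure that is true long-term signal.
signal_var = 49 * SD_LONG**2
noise_var = (2**2 + 5**2) * SD_DAY**2
reliability = signal_var / (signal_var + noise_var)

print("slope on long-term input:", slope_long)
print("slope on current input:  ", slope_current)
print("B2 * reliability:        ", B2 * reliability)
```

In this setup the slope on the two-day measure lands close to b2 times the reliability ratio, well below the slope on the long-term measure. A small estimated coefficient in regression (1) is therefore compatible with a much larger b2.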

So bottom lines are:

You can measure flows over short-term periods when using them as an outcome variable, but as my paper shows, for outcomes with low autocorrelations it is good to take multiple measures.

If you are using them as an RHS variable, then you have to be clear what parameter it is you are trying to estimate, and if you are trying to estimate a long-term relationship with short-term data, the classic errors-in-variables problem comes into play.

Thanks Justin, Sendhil, Aprajit and others on twitter for helping clarify where I had gone down the wrong track here.