Archive for Stimulus

Recovery Review Is Fully Live

Seriously, I’m going to get back to posting regularly now that this project is complete.

On Monday, I submitted Recovery Review for the Sunlight Foundation’s Design for America contest. (The project requires the installation of Silverlight if you don’t already have it installed.)

Recovery Review allows users to search and visualize stimulus data. It also allows users to flag data that they think is inaccurate. I think it’s a pretty cool little project, although I have a small list of things I’d like to improve about it. (The list isn’t comprehensive, but it’s a start.)

I’ve also started a blog for Recovery Review to go along with the project. Right now, the blog has some discussion of design decisions and the stimulus data.

One of my biggest frustrations is the fact that the data was updated on the Recovery.gov website when I was right in the middle of my project. As a result, I’m running the project (at least until the judging is complete) on the old data, which represents everything up to 2009, Q4.

What is strange to me is that it looks like the data updates are a little frustrating. Anything about a particular project can change in the updates, from the amount of money awarded to the project to the date the project was started to the number of jobs the project “saved or created”. Sometimes these changes make sense. Sometimes they make no damn sense at all.

It looks like I picked a hell of a complex data set to work with.

Introducing Recovery Review (Alpha)

I’d like to apologize for something and then give a good reason for it.

The Sunlight Foundation is a fantastic organization that pushes for government transparency, and every once in a while they run a contest. This year, the contest is “Design for America”. It started in early March and is meant to be a 10-week design contest with several categories for entries.

When the contest started, I didn’t think I had time to build the project I wanted to build because I had a major professional conference in April. But after the conference ended, I decided I might still have time to hack something together. And hack I did.

My project is called “Recovery Review” and is meant to be a way to crowd-source the task of checking the stimulus data.

Users can search through the stimulus data given a couple key variables and get a report of the stimulus projects that match their search. They can then expand the item to a full view (the “+” icon in the top right corner) and then flag the item if it has any inaccurate or questionable data. They can also add a link if there is a news article or blog post discussing that specific project.

So, please, be my test users if you have some time. Head over there and look through the data. If anything in the data seems inaccurate to you, flag it and add to our database of what items are accurate and what items are inaccurate.

And let me know here if you run into any errors. I’m still working on refining parts of the project, but anything that breaks the project is going to be of the highest importance to me. Thanks!

Debunking the “Republican Congress Creates Jobs” Chart Or “How To Make Numbers Say Anything You Want”

This is a companion piece to the previous post, so please read both of them. Here I’m going to lay out the script I had written for debunking the chart I created that asked the question “Does a Republican Congress Create More Jobs?” and then implied with a chart that this was indeed the case. I’ll walk through some process for creating charts and then talk about why I would create a chart that I was just going to debunk.

I apologize for the similarity to the post where I debunk the Obama stimulus chart. These two scripts were meant to be together.

<Start Script>

How to Make Numbers Say Anything You Want

Do you want to convince people that your side is right with only the flimsiest proof? Does the idea of tricking people with numbers make you all happy inside? Then come join us as we walk through “How To Use Charts To Say Anything”

Step 1: Massaging the Data

The first step is to grab the data that makes your point the best. Let’s use it to prove that a Democratic Congress is bad for jobs.

“How can we do such a thing” you ask?

In the first case, the raw jobs data looks like this

but the final chart looks like this.

How did they do that? Was it magic?

Nope, we simply smoothed the data. The raw data is a little too chaotic and has too many data points to tell the straightforward story that we want. So instead, we’ll average the monthly data so that we have quarterly data. There… now we have some nice, smooth, straightforward data.

Step 2: Pick colors that make you look good

Next, we pick some colors. Let’s make the Democrats’ blue dark and bold, and give it a bit of an angry feel. This is our way of getting the audience to look at the Democrats in a harsh way. We could try to soften up on the Republicans more, but too soft of a red would look pink and we don’t want that.

Let’s compare our colors to the Excel defaults:

Step 3: Do NOT give any context!

Finally, and this is the most important part, only give information that is helpful.

Let everyone know that we saw 8 million jobs added to the economy while the Republicans were in charge and make a point to show that we lost 8 million jobs while the Democrats were in charge. But don’t mention that the Republicans took Congress only a year after 9/11 at a time when the job market was particularly low. Otherwise people will think it’s a “Well, they can’t fall off the floor” thing.

And make sure you don’t mention anything about the real estate market and how the bubble drove the labor market in a way that was clearly unsustainable. We don’t want the viewers to be confused with all these relevant details. We want them to say “Republicans good, Democrats bad”.

<End Script>
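For the curious, the smoothing trick from Step 1 of the script is nothing more than averaging each group of three months. Here is a minimal Python sketch of that idea. The function name `quarterly_average` and the monthly figures are made up for illustration; they are not the BLS numbers behind the chart.

```python
from statistics import mean


def quarterly_average(monthly):
    """Average consecutive groups of three monthly values into quarterly values.

    A trailing group with fewer than three months is averaged over
    whatever months it has.
    """
    quarters = []
    for start in range(0, len(monthly), 3):
        chunk = monthly[start:start + 3]
        quarters.append(mean(chunk))
    return quarters


# Made-up monthly employment levels (millions of jobs), for illustration only.
monthly_jobs = [130.0, 133.0, 130.0, 131.0, 135.0, 130.0]

# Six bumpy monthly points collapse into two calm quarterly points.
print(quarterly_average(monthly_jobs))
```

The bumps inside each quarter vanish from the output, which is exactly why the smoothed line tells a tidier story than the raw one.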

Everyone here was incredibly kind to put up with my bullshit chart for as long as I left it up without explanation. I’d like to say unequivocally: My chart is propaganda… just like the Obama administration’s chart. I was trying to use my chart as a visual talking point that said:

If you have no ethical qualms, data visualizations can be manipulated to say exactly what you want them to say.

My chart implies that the Republicans were responsible for the jobs growth between 2003 and 2007 and that Democrats were responsible for the drastic decline from 2007 to the present. Let me state plainly, I do not think that is the case.

But if we just play around with the data the right way, we get what seems to be a clear picture that portrays a correlation and gets on its hands and knees and begs us to draw causation from it. Most people will do exactly that.

I can spend hours walking patiently through what is wrong with the Obama administration’s chart. Let me recap the high points here:

  • If you look at the data with the context of what President Obama’s team was hoping the stimulus would do, the power of the chart disappears.
  • If you look at the data with the understanding that they’re charting a first derivative, you realize that we haven’t gained jobs, we’re just losing them more slowly and the power of the chart disappears.
  • If you look at the data with the understanding that they didn’t even start spending the stimulus until the job loss had started slowing down, the power of the chart disappears.
  • If you look at the data in the context of other recessions, you’ll realize that, far from showing a drastic improvement, the numbers represent a devastatingly slow jobs recovery compared to other recoveries and the power of the chart disappears.

But this kind of explanatory rebuttal would only interest those already convinced. The chart I made had a power that a calm explanatory video wouldn’t have. Quite frankly, I hate that this is the case. Like President Obama’s chart, my chart doesn’t teach people anything about economics or lead people to learn important things about unemployment.

The only valuable thing my chart teaches is that charts can portray accurate data and still be manipulated to coach people along to poor conclusions. The only reason I even put my chart up is because it is the graphical equivalent of drawing out the Obama administration’s argument to its logical conclusion. My chart works with the same data, the same assumptions, and the same implications. And it leads to a completely different conclusion.

I’ve heard people describe President Obama’s chart as “powerful” and “brilliant”. The popular information visualization blog Flowing Data even tossed it up for public discussion among info viz professionals.

My point here is that it isn’t brilliant. It’s juvenile. It’s the chart equivalent of a crass political cartoon with a Snidely Whiplash mustache drawn on the bad guys. It’s a design trick imagined by cynical, self-congratulatory children fresh out of graduate school who pat themselves on the back for their ability to fool people who they think are too stupid to know the difference. They think they are special because they can get powerful people to flatter them for their ability to lie.

But they aren’t special. I can play that same childish game in my free time. The difference is that I want people to know that it’s a trick. They would rather see people fooled.

Debunking the Obama Stimulus Chart Or “How To Make Numbers Say Anything You Want”

I’ve been trying to find the time to make a video for this, but the fact of the matter is that I’m simply too slammed with all my work (I have a huge conference in two days). And I’m really kind of sick of my chart that I put up with basically no explanation. I basically created my chart as a rebuttal to this chart put out by the Obama administration. In this post, I debunk the Obama chart. In the next one, I debunk my own.

I’m basically just going to dump the script that I had written. Imagine my voice with some happy visuals that I don’t have time to make. I’ll add some additional comments at the end. Imagine a sing-song snake-oil salesman. That was what I was going for.

<Start Script>

How To Use Charts To Say Anything

Do you want to convince people that your side is right with only the flimsiest proof? Does the idea of tricking people with numbers make you all happy inside? Then come join us as we walk through “How To Use Charts To Say Anything”.

Step 1: Massaging the Data

The first step is to grab the data that makes your point the best. Let’s use it to prove that a Democratic president is good for jobs.

“How can we do such a thing” you ask?

Let’s grab some raw jobs data. We’re going to take this data

and make it look like this:

How did we do that? Was it magic?

Nope, it’s called the first derivative. It works like this. Instead of worrying about how high the line is, we’re only going to worry about how steep the line is. That way, the number will look good even if we keep losing jobs. Instead of charting how many jobs there are, we’re charting how many jobs we’re still losing.

That turns the first chart (which looks bad) into the second chart (which looks good).

Step 2: Pick colors that make you look good

Next, we pick some colors. We could pick the default colors that Excel gives us when we chart two different kinds of numbers. But that’s too neutral. By way of comparison:

As you can see, we’ve taken the default red (for George Bush) and made it darker and richer. This is like drawing a Snidely Whiplash mustache on him so that we know he’s the bad guy. Then, we’ll make the President Obama blue lighter and softer so we know he’s the good guy.

Step 3: Do NOT give any context!

Finally, and this is the most important part, only give information that is helpful. And by helpful, I mean favorable to your side.

It’s OK to mention that President Obama signed the stimulus bill into law in the first quarter of 2009.

It’s not OK to mention that the initial stimulus reports from the first and second quarter were totally blank, which means that they didn’t really start spending the money until July.

Also, you should forget to mention that as of December, we’ve only spent 10% of the stimulus money.

If you give all of this unhelpful information, people might draw the conclusion that the stimulus didn’t really help very much.

And that would be bad.

Remember, we’re not interested in helping people understand the complexities of the economy. We just want them to look at the chart and say, “Bush bad. Obama good.”

<End Script>
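The first-derivative move in Step 1 of the script can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `first_difference` and the job levels are invented, not the data behind either chart.

```python
def first_difference(levels):
    """Return the period-to-period change for a series of levels."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]


# Made-up total employment levels (thousands of jobs), for illustration only.
job_levels = [138000, 137200, 136500, 136000, 135700, 135600]

changes = first_difference(job_levels)
print(changes)

# The level falls in every period, so jobs are still being lost...
still_losing = all(change < 0 for change in changes)
# ...but each loss is smaller than the last, so the chart of changes climbs.
improving = changes == sorted(changes)
print(still_losing, improving)
```

Both things are true at once: the number of jobs never goes up, yet the charted series rises the whole way.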

I got my numbers for the last part of this from the stimulus reports on recovery.gov. Since I started looking at the data back in late 2009, they’ve changed the way they organize the data. Until a little over a month ago, the reports for 2009, Q1 and 2009, Q2 were blank. Zero data. Nothing. In the 2009 Q3 data they reported giving out about 4% of the stimulus money. By the end of 2009 Q4, they had reported giving out 10% of the stimulus money.

Since then, they took the empty Q1 and Q2 data and the actual Q3 data and relabeled the file so that the Q3 data now says “February 17 – September 30, 2009”. There is no way to tell for certain when the money was sent out, but the amount of money marked as “received” ran on a curve that was about 4 months off. (Example: Most of the money that was marked as “received” was applied for in March, April and May. Very few places that applied for money after May marked it as received by the end of September.) So… we see job losses slowing even before the money was making it out the door.

OK. Now to talk about my rebuttal chart and a well deserved explanation. I have the greatest readers of all time and many of you have pointed out that my rebuttal chart (seen here) commits many of the same fallacies that the Obama chart has.

My response to that would be “Yes it does. It was meant to.” I created that chart as the visual equivalent of saying “If your logic is correct, then you would be forced to accept this other conclusion as well since it uses the same logic.”

Both charts use jobs data taken from the same place, displayed the same way, stripped of context and used to push an ideological point using an implicit “correlation means causation” line of argumentation.

Let me be clear: I do not think that a Republican Congress is the driving factor behind 8 million jobs created and I would NEVER say that. But I would say “Your chart implies that Obama is responsible for the slowing of job loss. If that is your argument, I would like to use the same chart logic to say that we need a Republican Congress to regain those jobs. By your own argument, you should be voting Republican this November.” I meant my chart to be a sort of visual rhetorical trick to be played in the context of the Obama stimulus chart to show that the numbers can be spun in either direction.

Does a Republican Congress Create More Jobs?

UPDATE: I discuss this chart in detail in my new posts, “How To Make Numbers Say Anything You Want” Part 1 and Part 2

For your consideration.

Download the large version
Download the small version

Data gathered from the US Bureau of Labor Statistics. Employment numbers are averaged by quarter and charted from 2003 to the present. (2010 Q1 is just January, 2010) Republicans took control of both houses of Congress in January 2003. Democrats took control of both houses of Congress in January 2007.

I’ve more to say, but it can wait till later.

President Obama, I Fixed Your Chart For You

You may have recently seen the new chart put out by the Obama administration pushing the idea that the President’s policies are responsible for the decrease in newly unemployed. It looks something exactly like this:

Now… as a piece of visual political propaganda, this is brilliant. The colors draw sharp contrast, the symmetry is appealing. And the numbers are right.

But keep in mind how carefully I phrased the units being used: “decrease in newly unemployed”. This isn’t an increase in jobs or a decrease in unemployment. It just means that we’re losing jobs more slowly than we were before.

Make no mistake… this is good news. And we can bicker back and forth as to whether President Obama’s policies are responsible for this slowdown in newly lost jobs. He would say yes and point to the stimulus.

But in order to point effectively to the stimulus, we would have to take a look at the expectations of the stimulus. Everyone expected that we would come out of the recession eventually and that job loss would slow. The question was how quickly that would happen.

To help us visualize the expectations of the stimulus against the reality of it, I’ve added that piece of context to the graph. See if you can spot it.

I got these numbers by multiplying the labor force by the expected unemployment rate with the stimulus (per this chart) and then subtracting that number from the labor force times the actual unemployment rate.
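Spelled out as code, the calculation just described looks something like the sketch below. The labor force and the two unemployment rates are placeholder values I picked for illustration, not the figures behind my graph, and the function name `unemployed_gap` is invented.

```python
def unemployed_gap(labor_force, expected_rate, actual_rate):
    """Actual unemployed minus the unemployed count the projection implied.

    Rates are fractions (0.08 means 8%). A positive result means more
    people are out of work than the with-stimulus projection expected.
    """
    expected_unemployed = labor_force * expected_rate
    actual_unemployed = labor_force * actual_rate
    return actual_unemployed - expected_unemployed


# Placeholder inputs for illustration only.
labor_force = 154_000_000
expected_rate = 0.08  # hypothetical projected rate with the stimulus
actual_rate = 0.10    # hypothetical observed rate

gap = unemployed_gap(labor_force, expected_rate, actual_rate)
print(f"{gap:,.0f} more people unemployed than projected")
```

Running this once per quarter, with that quarter’s labor force and rates, gives one point on the added line.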

One may say that this is unfair. I would actually kind of agree. Economic predictions are pretty hard to make. But the original chart is similarly unfair. Keep in mind that it took a few months to get the stimulus money out the door. In fact, they didn’t even release any data on the stimulus funds for second quarter 2009 (the first stimulus report was for third quarter 2009).

Side Note: This data has actually been scrubbed from the website. They’ve re-compiled the data into new categories. But I’m wary about trusting the data since it looks like, according to the official data, about $12 billion of the stimulus was spent before the stimulus was signed with projects being approved as early as 2000.

So the first several months of decline don’t even reflect the impact of the stimulus. The decline in new job losses seems to be just a happy coincidence that looks good on a chart.

What Does the Federal Budget Freeze Look Like?

The first part of this post is just an overview of the data I used to make this video, so if you don’t care about that, you can skip over it to the part where I talk about what the budget freeze means.

First, I’ve got a new video up called “What Does The Federal Budget Freeze Look Like?”

Here is the data summary of this video:

I got the budget numbers (budget, discretionary, mandatory) from the overview of the 2010 budget which includes projections for 2011. I did this because the 2011 budget is not available yet (although I understand that those projections are a bit low and the real budget will be bigger than the projection).

That gives us the following numbers:

  • 2011 Federal Budget – $3.7 trillion
  • Mandatory portion of federal budget – $2.322 trillion
  • Discretionary portion of federal budget – $1.380 trillion

I’ve seen it consistently reported that the freeze will affect $447 billion of the budget, although I imagine that number is subject to change. The amount saved from this freeze has been consistently reported as $15 billion in the first year and $250 billion over 10 years.

The stimulus funds as reported by recovery.gov at the time of this post are:

That leaves:

  • $195 billion in tax cuts that have not been applied
  • $202 billion in contracts, grants and loans that haven’t been spent
  • $121 billion in entitlements (what a creepy name) that haven’t been spent

If we left the tax cuts in place, but canceled the rest of the spending, we’d save $323 billion… which is a shade less than what I said in the video. Apparently, that is the result of some rounding errors in my spreadsheet, but the $4 billion comes out to about one and a half teaspoons, which isn’t enough to make a difference in the visualization.

As for the water part of it… If we assume that the budget is 192 ounces of water that we’ve split into 4 oz cups, then all the math in the video works out. I actually under-counted the unspent stimulus (it would be 17 ounces instead of 16). I measured my ice cube tray and found that each ice cube was 1.5 ounces and I used 1 and a half tablespoons of water to measure out the .75 ounces that would be equivalent to $15 billion.
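If you want to check the water math yourself, here is a rough Python sketch of the conversion. It assumes the $3.7 trillion budget maps onto the 192 ounces described above and that one US fluid ounce is two tablespoons. The names are mine, invented for the example.

```python
BUDGET_BILLIONS = 3700  # the $3.7 trillion budget, in billions of dollars
BUDGET_OUNCES = 192     # water standing in for the whole budget
CUP_OUNCES = 4          # size of each cup


def billions_to_ounces(billions):
    """Convert a dollar amount (in billions) to ounces of water."""
    return billions * BUDGET_OUNCES / BUDGET_BILLIONS


cups = BUDGET_OUNCES // CUP_OUNCES          # cups needed for the whole budget
freeze_oz = billions_to_ounces(15)          # first-year savings from the freeze
unspent_oz = billions_to_ounces(202 + 121)  # unspent stimulus, excluding tax cuts
freeze_tbsp = freeze_oz * 2                 # 1 US fluid ounce = 2 tablespoons

print(f"cups: {cups}")
print(f"freeze: {freeze_oz:.2f} oz, about {freeze_tbsp:.1f} tablespoons")
print(f"unspent stimulus: {unspent_oz:.1f} oz")
```

The freeze comes out a hair above the .75 ounces I measured, and the unspent stimulus rounds to 17 ounces rather than 16.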

<End of Boring Math Things>

OK… now to comment on what I think about the budget freeze to anyone who cares what I actually think.

First of all, I hate the “we’re saving $250 billion over 10 years” line. It is a piece of crass political rhetoric and I’m disappointed that the administration would use it. If they actually implement a three year freeze on the portion of the budget they’re talking about (which is a big if, but let’s assume the best), why measure the effects in the space of 10 years?

The answer is “To make the freeze look bigger”. They’re basically just basing the extended savings off of projected interest payments and “savings” due to the fact that the baseline on that portion of the budget hasn’t moved. It is setting a dangerous data precedent where politicians realize that all they have to do is calculate a projection out as far as they need in order to get the numbers they want. It would be like giving an employee a $5,000 bonus, but saying that you gave them an $8,000 bonus based on a 5% return on that investment over the course of 10 years. They might as well say that they’re saving a trillion dollars over the next 25 years or a hundred trillion over the next 300 years. It is a data statement designed to trick people.
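The bonus example is plain compound growth, and a quick Python sketch shows how the headline number balloons as the horizon stretches. I am assuming annual compounding here; the function name `value_after` is made up.

```python
def value_after(amount, annual_rate, years):
    """Value of an amount after compounding annually for a number of years."""
    return amount * (1 + annual_rate) ** years


bonus = 5000

# The same $5,000 "becomes" a bigger bonus the further out you project it.
for years in (0, 10, 25):
    projected = value_after(bonus, 0.05, years)
    print(f"{years:>2} years: ${projected:,.0f}")
```

At 10 years the figure lands a little above $8,100, in line with the $8,000 in my example, and nothing but the choice of horizon stops it from growing further.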

Second, I hate the “We’re saving all this money by not spending it” line because it is similarly political. If a future politician wants to play this stupid numbers game, all they have to do is “project” that they will spend like a crazy person next year and when the next year comes, they decide to spend like a half crazy person. Then they can claim that they have “saved” all this money because they “reduced” their projected spending.

As a slapdash example, a politician could project that they will increase spending by 5% next year and then decide at the last moment to increase it by 3%. They could then spin that decision to increase by a smaller amount as a decision to “cut” their spending (which wasn’t real spending, only projected spending) by 2%.
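To put numbers on the slapdash example, here is a tiny Python sketch. The $1,000 billion starting budget is a made-up round figure and the function name `spin` is invented. Only the 5% and 3% come from the example above.

```python
def spin(last_year, projected_growth_pct, actual_growth_pct):
    """Compare the real change in spending with the 'cut' a politician can claim."""
    projected = last_year * (100 + projected_growth_pct) / 100
    actual = last_year * (100 + actual_growth_pct) / 100
    return {
        "real_change": actual - last_year,  # what actually happened to spending
        "claimed_cut": projected - actual,  # the "savings" against the projection
    }


# Hypothetical budget of $1,000 billion, projected to grow 5% but grown 3%.
result = spin(1000, 5, 3)
print(result)
```

Spending goes up by 30 and the press release still gets to announce a cut of 20.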

Last, my attempt to visualize the scale of the budget freeze does not mean I don’t support it. I really like to see cuts to the budget and I personally think this is not an insignificant one. I think it is worth our energy to do exactly what the Obama administration seems to be doing… freezing increases and looking around for crappy programs to cut.

Keep in mind the hypocrisy on both sides of the aisle. The Republicans are hypocrites for claiming that this is a totally inconsequential budget cut. In 2005, George W. Bush proposed a 1% cut (not a freeze, a cut) in discretionary spending that wasn’t Department of Defense or Homeland Security. Translated to today, Bush’s cuts would have “saved” $33 billion using the calculation metric for the current freeze; more than twice the amount that this freeze would save us. At the time, John McCain called it a “very austere budget” and Dick Cheney went out pushing their credentials as cost cutters. I find it strange that they were ecstatic about saving the equivalent of $33 billion but think that $15 billion is a drop in a bucket.

Of course the Democrats blasted Bush’s cuts as a gimmick too small to make a difference, but seem to have lost much of their skepticism over these new, smaller “cuts”.

Overall, it looks like both sides are more interested in political gain than in having a frank discussion about the numbers and what they mean. This should surprise no one, but I confess to finding myself somewhat dismayed that the Obama administration, for all their hype about being pro-science and pro-data, has no problem spinning the numbers in a way that decreases clear comprehension in order to increase message potency.

Why Take Math? So Your Ignorance Isn’t Broadcast Nationwide on the AP Wire

This is pretty funny. Or horrifying. Depends on how you want to look at it.

Several days ago, I noted on Twitter that there were a lot of “saved” jobs that weren’t saved at all but were actually cost-of-living increases. About 24 hours after I noted this, there was an Associated Press article about that very phenomenon.

Coincidence? Almost certainly. But I’ll flatter myself anyway.

But the laugh riot comes several paragraphs into the article as they look into why Southwest Georgia Community Action Council was able to save 935 jobs with a cost of living increase for only 508 people. The director of the action council said:

“she followed the guidelines the Obama administration provided. She said she multiplied the 508 employees by 1.84 — the percentage pay raise they received — and came up with 935 jobs saved.

“I would say it’s confusing at best,” she said. “But we followed the instructions we were given.”

“Confusing at best”? The multiplication of percentages is “confusing at best”? It seems obvious to me she should have multiplied 508 people by the amount of the increase (0.0184) and gotten 9.3. But she forgot that you have to divide the percentage by 100 before you multiply.

The fact that she had “saved” more jobs than there were people in the organization should have been a tip-off. But this is a pretty common problem with people who don’t have a very good grasp on mathematics… they don’t recognize obvious mathematical errors, they just plug in the numbers and go with whatever comes out.
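Here is the arithmetic from the article as a short Python sketch, along with the kind of sanity check that would have caught the mistake. The variable names and the `plausible` helper are mine, invented for illustration.

```python
employees = 508
raise_percent = 1.84  # the pay raise, expressed as a percentage

# What was reported: the percentage used as if it were a plain multiplier.
reported = employees * raise_percent

# What the multiplication should have been: divide the percentage by 100 first.
corrected = employees * raise_percent / 100


def plausible(jobs_saved, headcount):
    """A raise cannot save more jobs than there are people receiving it."""
    return jobs_saved <= headcount


print(round(reported), plausible(reported, employees))
print(round(corrected, 1), plausible(corrected, employees))
```

The reported figure fails the check immediately, since it is nearly double the headcount.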

And this, children, is why you pay attention at school. So you don’t get in the national news for doing something really stupid and then blame it on the instruction manual.

Dirty Stimulus Jobs Data Exaggerates Stimulus Impact

One of the key talking points for the stimulus that was passed earlier this year was that it would “save or create” jobs. Lots of jobs. Oodles of jobs. Jobs piled so high, we’ll have to hire people to dig us out of all the jobs we will have.

Or, more specifically, the Obama administration stated that they would “save or create” 4 million jobs.

This led to a great deal of mockery over the “save or create” turn of phrase, but the administration set out to actually measure the number of jobs that were saved or created by having recipients of the stimulus funds fill out a form in which they indicate how many jobs that particular chunk of the stimulus created (that form can be found here).

Now, if you look at recovery.gov, you’ll see that the stimulus has “saved or created” 640,000 jobs. That is only 16% of the promised jobs, but it’s still a pretty big number. I was curious how they got it, so I downloaded the raw data and started sifting through it. This is what I found:

  • Over 6,500 of all the “created or saved” jobs are cost-of-living adjustments (COLA), which is really just a raise of about 2% for 6,500 people. That’s not a job saved, no matter how you calculate it.
  • Over 6,000 of the jobs are federal work study jobs, which are part time jobs for needy students. As such, they’re not really “jobs” in the sense that most other federal agencies report job statistics (We don’t count full time college students as “unemployed” in the statistics.)
  • About half of the jobs (over 300,000) fall under the “State Fiscal Stabilization Fund”, which can be described like so: Your state (perhaps it rhymes with Balicornia) can’t afford all the programs it has running, but when the state government tries to raise taxes, people yell and scream and threaten to move. The federal government comes in with stimulus funds and subsidizes the state programs. Consider this a “reach-around” tax in which the state can’t raise taxes on its citizens any more, but the federal government can. So the federal government just gives the state the money to keep running programs they can’t afford on their own.
  • There are, scattered hither and yon, contracts and grants that state in no unclear language that “This project has no jobs created or retained” but list dozens, if not hundreds, of jobs that have been “saved or created” by the project. It makes no sense whatsoever.

Finally, there is a statistical problem to the data here that I’ve not heard discussed at all, the problem of job duration.

Because there is no guidance in the forms on the proper way to measure “a job”, recipients are left to themselves to figure out what counts as a job. Some of them fill it out by calculating “man-weeks” and assume one “job-year” to be the measurement of a single job. Others fulfill contracts that only require two weeks, but they count every person they hire for every job to be a separate job created.

As an illustration: Let’s say you have a highway construction project in the Salt Lake City area that takes one month. A foreman is hired for the project and he brings on 20 guys he likes to work with to fill out his crew. That is 21 jobs “saved or created”. While that job is being completed, the funding is being secured for another highway construction project. By the time that funding goes through, the first project is done and they decide to just move the whole crew over to the next project. That is another 21 jobs “saved or created”.

If this happens four more times, on paper it looks like 126 jobs have been “saved or created” (21 jobs on each of six projects) when in reality 21 people have been fully employed for six months. But if you judge jobs through a “man-weeks”/“job-years” lens, you have 10.5 jobs.
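The three ways of counting the same highway crew can be laid side by side in a short Python sketch. It assumes six one-month projects and the 21-person crew from the illustration. The variable names are mine.

```python
crew_size = 21                       # one foreman plus 20 crew members
project_months = [1, 1, 1, 1, 1, 1]  # six back-to-back one-month projects

# Counting every person on every project as a new job "saved or created".
jobs_on_paper = crew_size * len(project_months)

# Counting distinct people who actually had work.
people_employed = crew_size

# Counting job-years: total person-months of work divided by 12.
job_years = crew_size * sum(project_months) / 12

print(jobs_on_paper, people_employed, job_years)
```

Same crew, same six months of work, and the answer ranges from 10.5 to 126 (21 × 6) depending on which definition of “a job” the recipient happens to pick.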

This is how the Blooming Grove Housing Authority in San Antonio, Texas can run a project titled “Stemules Grant” to create 450 roofing jobs for only $42 per job. My educated guess is that they hired day-laborers, paid them minimum wage or below and only worked them for a single day. Each new day brought new workers which meant more jobs “created”. Either that or they simply lied on the form. (UPDATE: USA Today interviewed the owner here. He says that he used only 5 people on the roofing jobs but that a federal official told him that his original number wasn’t right, so he adjusted it to count the number of hours worked, not the number of jobs created.)

Rational people can see that this kind of behavior skews the data upward. How much upward? It’s hard to say, although it is a safe bet that any project that manages to create a job for less than $20,000 is probably telling you some kind of fib.
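That $20,000 rule of thumb is easy to turn into a filter. The Python sketch below is a rough illustration, not the way I actually sifted the data: the record layout, the function name `suspicious` and the highway project are invented, and the roofing award is simply 450 jobs times the $42 per job mentioned above.

```python
THRESHOLD = 20_000  # dollars per job below which a report looks fishy


def suspicious(projects, threshold=THRESHOLD):
    """Return (name, cost per job) for projects claiming jobs below the threshold."""
    flagged = []
    for name, award, jobs in projects:
        if jobs > 0 and award / jobs < threshold:
            flagged.append((name, award / jobs))
    return flagged


# (name, award in dollars, jobs reported)
projects = [
    ("Roofing grant, 450 jobs at $42 each", 450 * 42, 450),
    ("Hypothetical highway project", 2_500_000, 21),
    ("Hypothetical grant reporting no jobs", 100_000, 0),
]

flagged = suspicious(projects)
for name, cost_per_job in flagged:
    print(f"{name}: ${cost_per_job:,.0f} per job")
```

A filter like this only tells you where to start asking questions; it can’t tell a day-labor count from a typo.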

My ultimate conclusion from looking at the jobs data is that:

  • The jobs numbers reported on recovery.gov are heavily exaggerated
  • The jobs numbers reported are not subjected to any scrutiny or auditing whatsoever; they are a simple data dump and should therefore be viewed with heavy skepticism
  • The jobs numbers are a laudable transparency effort. I’m impressed that so much work has gone into trying to measure the results of the stimulus funding. Normally, these kinds of numbers would be shrouded in mystery and a normal Joe like myself would be unable to investigate them. Kudos to the Obama administration for implementing this data gathering and display initiative. However, they put too much faith in the data and statements like “The stimulus has saved or created 640,000 jobs” are uttered with a profound ignorance of the nitty-gritty details of what the data actually says.

For more interesting stimulus jobs data, you can see Paul Krugman getting angry about it here and Greg Mankiw responding to that anger here and Brad DeLong calling Allan Meltzer a shameless partisan hack about the topic over here and a story of how $900 worth of boots became 9 jobs over here. Or you can just download the jobs data and look through it yourself. There are lots of interesting stories in there.