UNIT 11: CAUSAL ANALYSIS, continued.

UNTANGLING THE EFFECTS OF EDUCATION AND OCCUPATION ON EARNINGS

 

In unit 11, we saw how the Fishnet graph tells us many things about the effects, separately and in combination, of Education and Occupation on Earnings.

In this unit we are going to ask how much of the effect of education on earnings (that is, the more education you have, the more you earn) is due to education itself, and how much is due to the fact that more education gets you a better job, and a better job brings higher earnings.

We will be working with the datafile WORK9Y.DAT. This is the datafile we made by Modifying WORK9X.DAT in various ways, and then saving the result as a new datafile. I will put big bold numbers on the various pieces of your output, so we can refer to these pieces in your printout. First comes the computing, then we will do things with the output.

Start CHIP. In the File menu, pick Log and name your output file. Now open WORK9Y.DAT.

ITEM 1: In the Command Menu, do Info.

ITEM 2: In the Command Menu, do Marginals on OCC.

ITEM 3: Crosstab EDUC / EARNING, pct Across.

ITEM 4: Still in Crosstab EDUC / EARNING, in Options menu control OCC, No more, and pct Across. This will result in four "subtables," one for each category of OCC, and each a table of EDUC / EARNING for all those in one of the occupation types.

ITEM 5: Still with crosstab EDUC / EARNING, and control OCC, in the Standard menu at the top of the screen, pick Standardize. The screen will show what looks like Info output, with the variables in the following order: the Control variable (OCC), then the independent variable (EDUC), then the dependent variable (EARNING), then the other variables. In the box below, it says: The Data have been standardized.

ITEM 6: Run Percent Across.

ITEM 7: In the Standard menu, pick Restore. Run Percent Across. This will give you the same table as Item 3, but I find it convenient to put this table right after the standardized version in Item 6, where it is easy to find.

That's all the computing for now. In Options, Exit the crosstab; in File, End Log. Then get a printout of the Log.

Now, with your printout on the desk or table, here is what to do. Make a slope graph (no other kind) titled: "Fig. 11.1: Percent earning $25K or more, by education level. Full-time workers, aged 25-54. Source: US Census 1990, <WORK9Y.DAT." This is a simple slope graph showing the effect of education on earnings. Use the percents from the Item 3 table.

Graph Fig. 11.2, of a generous size, titled: "Percent earning $25K or more, by education level and occupation. Full-time workers, aged 25-54. Source: US Census, 1990. <WORK9Y.DAT." This is a fishnet graph, identical to the one you did in the last unit. If you have it, you may copy it. If not, make a graph frame with vertical scale from 80 or 90 -- or even 100 -- down. Horizontal scale the four categories of EDUC.

In the frame there will be four slopes, one from each OCC subtable. Write the top minus bottom percent differences at the end of each slope, and label the slope. Fill in the numbers from Item 4, rounding off to the nearest percent.

Now what do these two figures mean? We are asking what part of the effect (top minus bottom percent difference) of education on earnings is due to education itself, and what part is due to the fact that higher education leads to higher jobs, and higher jobs lead, as we saw last time, to higher pay.

The subtables show us the effect of education on earnings when there is no occupation difference. In each subtable, everybody has the same occupation, so the only effect in that subtable is that of education on earnings, apart from occupation differences. If the subtable d's (the top minus bottom percent differences) are lower than the d in Fig. 11.1, then occupation does account for part of the effect of education on earnings. We can tell because when we remove the effect of occupation by "holding it constant," or "controlling" it, the effect of education goes down, and it does so in four different cases.

How about making an average, so we can see an overall picture of what is left of the effect of education when the effect of occupation is removed by holding it constant?

So in Fig. 11.2, we could just add up the d's from each of the four slopes, and divide by 4. We could, but there is a problem. As Item 2 shows, the four groups are not the same size. It would not be fair to give Service, with its 9% of the people, the same weight in the average as Otr-Wc, with its 31% of the people.

So the right thing to do is make a Weighted Average. Each job type will count in the average according to its relative size. Here's how:

OCC type    Marginal       Percent diff         Weighted diff
            (decimal)      (to tenth, nn.n)     (to 3 decimals)

Service       .231    x        24.6       =         5.683
Otr-Wc        .146    x        40.5       =         5.913
Lab+Farm      .086    x        33.8       =         2.907
Top-Wc        .537    x         5.6       =         3.007

SUM of last column, which is the weighted average = 17.510

I've put in some made-up numbers. Make a table like this, and put in the right ones. In this illustration, the Marginal pct for Service was 23.1, translate to the decimal .231, etc. Multiply by the percent diff in the corresponding subtable, this time using percents to the 10th of a percent, and record the result in the last column, to 3 decimal places.

The result you get in the lower right corner is the weighted average. It tells us the effect as a percentage difference of education on earnings, when the additional effect of occupation is removed.
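If you would rather let a computer do the multiplying, the weighted average can be sketched in a few lines of Python. This is only an illustration and not part of CHIP: the names occ_rows, weighted_average and simple_average are invented for this sketch, and the numbers are the made-up ones from the illustration, not real WORK9Y.DAT output.

```python
# Made-up illustration numbers, NOT real WORK9Y.DAT output.
# Each entry: OCC type -> (marginal as a decimal, percent difference in that subtable)
occ_rows = {
    "Service":  (0.231, 24.6),
    "Otr-Wc":   (0.146, 40.5),
    "Lab+Farm": (0.086, 33.8),
    "Top-Wc":   (0.537, 5.6),
}


def weighted_average(rows):
    """Sum of marginal x percent difference; the marginals should add up to 1."""
    return sum(marginal * diff for marginal, diff in rows.values())


def simple_average(rows):
    """Plain average that gives every OCC type the same weight."""
    return sum(diff for _, diff in rows.values()) / len(rows)


weighted = weighted_average(occ_rows)
simple = simple_average(occ_rows)

print(f"weighted average: {weighted:.2f}")
print(f"simple average:   {simple:.1f}")
```

With the made-up numbers this prints a weighted average of about 17.51, against a simple average of about 26.1. The gap shows how much the equal-weight version would be pulled around by the smaller groups.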

Now take a look at the percent difference in Item 6. If you have done your arithmetic right, it should come out very close indeed to the weighted average. What that means is that you don't have to get out your calculator, or your spreadsheet, and copy in all those numbers, and multiply, and, and, and. Just use Standardize in the Standard menu. What's more, Standardize gives you not only a weighted average of the percent difference you are interested in, it gives a weighted average of the whole table! And you can standardize on more than one control variable at a time, whereas with Control subtables that soon gets unmanageable -- too many subtables, with too little in them.
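To see why a weighted average of the whole table gives the same percent difference as the weighted average of the d's, here is a small Python sketch. It assumes the weights are the OCC marginals; how CHIP does this internally is not shown in this handout, so treat it as one reading of the idea, not as CHIP's procedure. Every number and the EDUC labels are hypothetical, chosen only so that the subtable d's match the made-up illustration (24.6, 40.5, 33.8, 5.6).

```python
# Hypothetical numbers throughout; none of this is WORK9Y.DAT output.
occ_weights = {"Service": 0.231, "Otr-Wc": 0.146, "Lab+Farm": 0.086, "Top-Wc": 0.537}
educ_levels = ["lowest", "lower-mid", "upper-mid", "highest"]  # hypothetical labels

# Percent earning $25K or more in each OCC subtable, one value per EDUC level.
subtable_pct = {
    "Service":  [10.0, 18.0, 26.0, 34.6],
    "Otr-Wc":   [20.0, 33.0, 47.0, 60.5],
    "Lab+Farm": [15.0, 26.0, 37.0, 48.8],
    "Top-Wc":   [70.0, 72.0, 74.0, 75.6],
}


def standardized_column(weights, pct_by_occ, n_levels):
    """For each EDUC level, average the subtable percents using the OCC weights."""
    return [
        sum(weights[occ] * pct_by_occ[occ][i] for occ in weights)
        for i in range(n_levels)
    ]


std_pct = standardized_column(occ_weights, subtable_pct, len(educ_levels))

# Top minus bottom difference of the standardized column.
std_diff = std_pct[-1] - std_pct[0]

# Weighted average of the four subtable differences, computed the long way.
avg_of_diffs = sum(
    w * (subtable_pct[occ][-1] - subtable_pct[occ][0])
    for occ, w in occ_weights.items()
)

for level, pct in zip(educ_levels, std_pct):
    print(f"{level:10s} {pct:5.1f}")
print(f"standardized d: {std_diff:.2f}   weighted average of d's: {avg_of_diffs:.2f}")
```

The two figures on the last line agree because weighting each percent and then subtracting is the same arithmetic as subtracting first and then weighting.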

Now make Figure 11.3, with the title: "Percent making $25K or more, by education, raw and standardizing Occupation. Full-time workers, aged 25-54. Source: Census 1990, <WORK9Y.DAT." Education categories go across the bottom. Make a slope graph of the 25Kup column of percents from Item 3 -- or Item 7. Write in the pct numbers, rounded to the nearest percent. (I had you do the weighted average including the tenths just to show that Standardize and the weighted average give almost exactly the same result.) Label this line Raw, and make it a solid line. Make another slope, dotted line or ----, from Item 6, and label it Std Occ.

Underneath this figure, make a little table. We will be doing this often.

Using my made-up numbers in the table above, and making up another difference, 28 pct, as the raw difference, here is what the little table looks like.

 

  Raw diff:    28    educ -> earnings
- Std diff:    18    educ -> earnings, minus Occ.
  raw - std:   10    Part of raw due to Occ.

10 / 28 x 100 = 36% of Educ effect is due to occup,
18 / 28 x 100 = 64% to educ net of occupation
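The arithmetic of the little table can also be sketched in Python. The function name decompose and its parameter names are invented for this illustration, and the inputs 28 and 18 are the made-up differences used above, not real results.

```python
def decompose(raw_diff, std_diff):
    """Split a raw percent difference into the part due to the control
    variable and the part that remains after standardizing on it."""
    if raw_diff == 0:
        raise ValueError("raw_diff must be non-zero to compute shares")
    due_to_control = raw_diff - std_diff            # raw - std
    pct_control = 100 * due_to_control / raw_diff   # share due to the control variable
    pct_net = 100 * std_diff / raw_diff             # share net of the control variable
    return due_to_control, pct_control, pct_net


# Made-up numbers from the little table: raw diff 28, standardized diff 18.
due_to_occ, pct_occ, pct_educ = decompose(raw_diff=28, std_diff=18)

print(f"raw - std: {due_to_occ}")
print(f"due to occupation: {pct_occ:.0f}%")
print(f"education net of occupation: {pct_educ:.0f}%")
```

With these inputs the sketch reports 10 points due to occupation, which is 36% of the raw difference, leaving 64% for education net of occupation.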