Education
by Theodore Fuller

In this module, we will be looking at educational issues, particularly children who are having problems with the educational process.
Open the file "tool_us.xls" (click on "Enable Macros"; SSDAN is a reliable source) and click on "Chart". Then make "Percent of 4th Graders Scoring below the basic reading level" the X variable and make "Percent of teens who are high school drop outs" (DROPOUTS) the Y variable. (Do this by clicking on the X variable box and the Y variable box, respectively, and scrolling down to select the specified variable.) For both variables, select the most recent data available; for 4th grade reading level the most recent data are from 1998, and for DROPOUTS the most recent data are from 1999.

After you select the two variables and the years, "tool_us.xls" will automatically display a scatter plot showing the relationship between the two variables. A scatter plot is a simple two-dimensional graph that shows the relationship between two variables by plotting one point for each case (in this situation, each case is a state or the District of Columbia). Looking at a scatter plot, we can get an idea about whether the two variables are related and, if so, how strong the relationship is.

I would expect that geographic areas with a higher percent of 4th graders who score below the basic reading level will have more teenagers who drop out of high school. (If more kids are having reading problems in 4th grade, more will drop out of high school.)

One measure of the strength of the relationship between two variables is called a "correlation coefficient". The correlation coefficient can vary from -1.0 to 1.0. A value of 1.0 means the two variables are perfectly related to each other in a positive direction; in other words, if one variable increases, the other one increases by a corresponding amount. A value of -1.0 also means the two variables are perfectly related to each other, but in a negative direction; if one variable increases, the other one decreases by a corresponding amount.
0.0 means that the two variables are not related; a change in one variable is not predictably related to a change in the other variable. In practice, correlations are usually not close to -1.0 or 1.0. A correlation of 0.2 is usually considered a weak relationship; a correlation of 0.6 is strong; a correlation of 0.8 is extremely strong. "tool_us.xls" automatically reports the correlation between the two variables in the scatter plot.

What is the correlation between the percent of 4th graders who have reading problems and the percent of teenagers who drop out of high school?

There are several "outliers". "Outliers" are points that are not close to the other points. (Each point corresponds to a state or the District of Columbia.) Several outliers have a very low percent of 4th graders with low reading scores, compared to the other states. One outlier has a much higher percent of 4th graders who have low reading scores.

What geographic area is the outlier that has a much higher percent of 4th graders with low reading scores? (You can find out by putting the cursor on top of a point. After a second, the name of the geographic area will appear, as well as the X and Y values for that area.)

In general, we see from these data that the more 4th graders have reading problems, the more teenagers are likely to drop out of high school. It is not a perfectly predictable relationship, but it is fairly strong.

Next, make "Dropouts" the X variable and make "Percent of teens not attending school and not working" ("Idle") the Y variable. (Select "1999" for both variables.) I would expect that states that have a larger percent of drop outs have a larger percent of teens who are "idle".

What is the correlation between these two variables?

The results in the scatter plot are consistent with the expectation that "dropouts" is related to "idle". In fact, the relationship is fairly strong.
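For readers who want to see where a correlation coefficient comes from, here is a minimal Python sketch of the standard Pearson formula. It assumes that the figure "tool_us.xls" reports is the ordinary Pearson correlation, which the module does not state explicitly. The function name `pearson_r` and the numbers in the two lists are hypothetical, made up only to illustrate the calculation; they are not taken from the spreadsheet.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # How far each case sits from the average on each variable.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    # Sum of cross-products: positive when the variables rise together.
    cross = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    spread_x = sum(dx * dx for dx in dev_x)
    spread_y = sum(dy * dy for dy in dev_y)

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")

    return cross / math.sqrt(spread_x * spread_y)


# Hypothetical values for five imaginary areas (NOT real state data).
pct_low_reading = [30, 35, 40, 45, 50]  # X: percent below basic reading level
pct_dropouts = [6, 8, 9, 11, 12]        # Y: percent of teens who dropped out

print(round(pearson_r(pct_low_reading, pct_dropouts), 2))
```

In this made-up example the Y values rise almost in step with the X values, so the result lands close to 1.0. Reversing the order of one list would flip the sign, and values with no consistent pattern would give a result near 0.0.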