Refresher - Linear Regression

Linear regression for two variables is based on a linear equation with one independent
variable. It has the form:

y'=a+bx

where a and b are constant numbers.

x is the independent variable, and y is the dependent variable. Typically, you choose a value to substitute for the independent variable and then solve for the dependent variable.

For the linear equation y'=a+bx, b=slope and a=y-intercept.

From algebra, recall that the slope is a number that describes the steepness of a line, and the y-intercept is the y-coordinate of the point (0, a) where the line crosses the y-axis.
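As a quick check of these two definitions, the short sketch below evaluates a line with hypothetical constants a = 2 and b = 3. The values and the function name `line` are chosen only for illustration and do not come from the text.

```python
def line(x, a=2.0, b=3.0):
    """Return y' = a + b*x for hypothetical constants a (y-intercept) and b (slope)."""
    return a + b * x

# At x = 0 the line crosses the y-axis, so y' equals the y-intercept a.
y_at_zero = line(0)

# Increasing x by one unit changes y' by exactly the slope b.
rise_per_unit = line(5) - line(4)

print(y_at_zero, rise_per_unit)
```

The first value printed is the y-intercept and the second is the slope, whichever pair of constants you substitute.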

 

Example 1

Sharon tutors to make extra money for college. For each tutoring session, she charges a one-time fee of $25 plus $15 per hour of tutoring. A linear equation that expresses the total amount of money Sharon earns for each session she tutors is y=25+15x.

 

Problem 1
What are the independent and dependent variables?

Solution:
The independent variable (x) is the number of hours Sharon tutors each session.  The dependent variable (y) is the amount, in dollars, Sharon earns for each session.

 

Problem 2
What is the y-intercept and what is the slope? Interpret them using complete sentences.

Solution:
The y-intercept is 25 (a = 25). At the start of the tutoring session, Sharon charges a one-time fee of $25 (this is when x = 0).
The slope is 15 (b = 15). For each session, Sharon earns $15 for each hour she tutors.
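To see the intercept and slope at work, the equation from Example 1 can be evaluated for a few session lengths. This is only an illustrative sketch. The names `FEE`, `RATE`, and `session_earnings` are made up here, and the session lengths are arbitrary.

```python
FEE = 25   # one-time fee in dollars (the y-intercept a)
RATE = 15  # dollars per hour of tutoring (the slope b)

def session_earnings(hours):
    """Total dollars Sharon earns for one session lasting `hours` hours."""
    return FEE + RATE * hours

# Print the earnings for a few arbitrary session lengths.
for hours in (0, 1, 2, 3):
    print(hours, session_earnings(hours))
```

At 0 hours the result is the $25 fee alone, and each additional hour adds $15.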

 

Scatter Plots

A scatter plot shows the direction and strength of a relationship between the variables. A clear direction happens when there is either:

  • High values of one variable occurring with high values of the other variable or low values of one variable occurring with low values of the other variable.
  • High values of one variable occurring with low values of the other variable.

You can determine the strength of the relationship by looking at the scatter plot and seeing how close the points are to a line, a power function, an exponential function, or to some other type of function.

When you look at a scatter plot, you want to notice the overall pattern and any deviations from the pattern. The following scatter plot examples illustrate these concepts.

 

We are interested in scatter plots that show a linear pattern. Linear patterns are quite common. The linear relationship is strong if the points are close to a straight line. If we think that the points show a linear relationship, we would like to draw a line on the scatter plot. This line can be calculated through a process called linear regression.
However, we only calculate a regression line if one of the variables helps to explain or predict the other variable. If x is the independent variable and y the dependent variable, then we can use a regression line to predict y for a given value of x.

Data rarely fit a straight line exactly. Usually, you must be satisfied with rough predictions. Typically, you have a set of data whose scatter plot appears to "fit" a straight line. The line that fits such data best is called the Line of Best Fit or Least Squares Line.
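One way to compute such a line is sketched below. It uses the standard least-squares formulas for the slope and intercept, b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄, which are not stated in this refresher. The data points and the function name `least_squares_line` are hypothetical and chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the line y' = a + b*x that minimizes the sum of squared vertical errors."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b = sxy / sxx            # slope
    a = y_mean - b * x_mean  # y-intercept
    return a, b

# Hypothetical data that lie near, but not exactly on, a straight line.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 10]

a, b = least_squares_line(xs, ys)
predicted = a + b * 5  # rough prediction of y for x = 5
print(round(a, 2), round(b, 2), round(predicted, 2))
```

For these made-up points the fitted line is approximately y' = 0.5 + 2.3x. None of the four points lies exactly on it, which is why predictions from a regression line are only rough.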
