As the number of aspirin taken increases from 1 to 5 aspirin, relief increases. However, after 5 aspirin, adding more aspirin doesn’t increase relief; it decreases it. There is not a linear relation between aspirin and relief. Taking 9 aspirin is NOT better than taking 4 aspirin, as the graph above indicates.
Slope: a measure of the direction and steepness of a line. For every one-unit increase in \(x\), the predicted value of \(y\) changes by the value of the slope. In the population, the \(y\)-intercept is denoted as \(\beta_0\) and the slope is denoted as \(\beta_1\). This example uses the ‘StudentSurvey’ dataset from the Lock5 textbook. The data were collected from a sample of 362 college students. If \(p \leq \alpha\), reject the null hypothesis: there is evidence of a relationship in the population.
However, a cause-and-effect relationship between the independent variable and the dependent variable will result in a high coefficient of determination. Same for all of the values of the independent variables. Correlation analysis requires that both variables be measured at least at the interval level. There are other procedures to measure relationships with nominal and ordinal data. So, what do you do if you detect a curvilinear relation?
The correlation coefficient (r) is a statistic that tells you the strength and direction of that relationship. It is expressed as a positive or negative number between -1 and 1. The value of the number indicates the strength of the relationship: r = 0 means there is no correlation.
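As a rough illustration of how r behaves, the sketch below computes the Pearson correlation coefficient by hand. The function name `pearson_r` and all of the data values are made up for this example; they do not come from any dataset mentioned in the text.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data, chosen only to show the sign and size of r
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # y goes up as x goes up
falling = [10, 8, 6, 4, 2]  # y goes down as x goes up
hump = [1, 2, 3, 2, 1]      # y rises and then falls

print(pearson_r(x, rising))   # perfect positive relationship
print(pearson_r(x, falling))  # perfect negative relationship
print(pearson_r(x, hump))     # close to 0, although y clearly depends on x
```

The last case is worth noting: a value of r near 0 rules out a linear relationship, not every kind of relationship.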
Recall from Lesson 3, regression uses one or more explanatory variables (\(x\)) to predict one response variable (\(y\)). In this lesson we will be learning specifically about simple linear regression. The “simple” part is that we will be using only one explanatory variable. If there are two or more explanatory variables, then multiple linear regression is necessary. The “linear” part is that we will be using a straight line to predict the response variable using the explanatory variable. The residual values in a regression analysis are the differences between the observed values in the dataset and the estimated values calculated with the regression equation.
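A minimal sketch of simple linear regression and its residuals, assuming the usual least-squares estimates for the intercept and slope. The helper names `fit_line` and `residuals` and the five data points are hypothetical and used only for illustration.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line for one explanatory variable."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x must take at least two distinct values")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(xs, ys, intercept, slope):
    """Observed y minus the y predicted by the regression equation."""
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical data: one explanatory variable x, one response variable y
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_line(x, y)
res = residuals(x, y, b0, b1)

print("intercept:", b0)
print("slope:", b1)
print("residuals:", res)
```

For a least-squares line that includes an intercept, the residuals sum to zero (up to rounding), which is a quick sanity check on the fit.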
It would not be appropriate to use this regression model to predict the height of a child. For one, children are a different population and were not included in the sample that was used to construct this model. And second, the height of a child will likely not fall within the range of heights used to construct this regression model. If we wanted to use height to predict weight in children, we would need to obtain a sample of children and construct a new model.
Each Y’ can be considered to be the average Y value that can be predicted for all of the cases in the distribution with a corresponding X value. Know how to interpret a correlation coefficient in terms of percent of variance accounted for. If either of the variables has a restricted range, the correlation will be spuriously low. This is because error will be a larger proportion of the variance in a restricted range. If the change in Y values is consistent as you move to the right, the relationship is linear. If the change in Y values is inconsistent as you move to the right, the relationship is non-linear.
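The idea of checking whether the change in Y is consistent can be sketched with first differences. This assumes the X values are sorted and equally spaced and that the data contain no noise; with real data the changes will never be exactly equal, so treat this as an illustration rather than a test of linearity. The function names and data are invented for the example.

```python
def successive_changes(ys):
    """Change in Y between each pair of neighbouring points (X sorted, equally spaced)."""
    return [later - earlier for earlier, later in zip(ys, ys[1:])]


def has_constant_change(ys, tol=1e-9):
    """True when every step in Y matches the first step to within tol."""
    changes = successive_changes(ys)
    return all(abs(c - changes[0]) <= tol for c in changes)


# Hypothetical Y values observed at X = 1, 2, 3, ...
straight = [3, 5, 7, 9, 11]            # Y rises by the same amount each step
curved = [1, 2, 3, 4, 5, 4, 3, 2, 1]   # Y rises, then falls

print(successive_changes(straight), has_constant_change(straight))
print(successive_changes(curved), has_constant_change(curved))
```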
R-squared (\(R^2\)) is a statistical measure that represents the proportion of the variance for a dependent variable that's explained by an independent variable in a regression model.
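One common way to compute this quantity is one minus the ratio of unexplained variation to total variation. The sketch below assumes that definition. The function name `r_squared`, the data, and the line coefficients 2.2 and 0.6 (the least-squares line for this made-up data) are illustrative assumptions.

```python
def r_squared(observed, predicted):
    """Proportion of the variance in the observed values explained by the predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    # Variation left unexplained by the model (sum of squared residuals)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total variation of the observed values around their mean
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    if ss_total == 0:
        raise ValueError("R-squared is undefined when the observed values are constant")
    return 1 - ss_residual / ss_total


# Hypothetical data and the least-squares line fitted to it
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
predicted = [2.2 + 0.6 * xi for xi in x]

r2 = r_squared(y, predicted)
print(r2)  # about 0.6: roughly 60% of the variance in y is explained by x
```

In simple linear regression this value equals the square of the correlation coefficient r, which is why r can be interpreted in terms of percent of variance accounted for.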