Coefficient of Determination Calculator (R²)

Calculate R-squared to measure the goodness of fit for linear regression models

Example Calculation

Sample Data

Points: (0,1), (2,4), (4,4)

Mean of Y: ȳ = 3

Regression line: y = 0.75x + 1.5

Sum of Squares

SST = 6 (total variation)

SSR = 4.5 (explained variation)

SSE = 1.5 (unexplained variation)

Result

R² = SSR/SST = 4.5/6 = 0.75

75% of the variance in Y is explained by the regression line
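
The example calculation can be reproduced in a few lines of Python. This is a minimal sketch that uses only the standard library; the variable names are illustrative and are not taken from the calculator itself.

```python
# Worked example: least-squares line and R² for the sample points.
points = [(0, 1), (2, 4), (4, 4)]
xs = [x for x, _ in points]
ys = [y for _, y in points]
n = len(points)

x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Least-squares slope and intercept for a single predictor.
sxy = sum((x - x_mean) * (y - y_mean) for x, y in points)
sxx = sum((x - x_mean) ** 2 for x in xs)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Fitted values on the regression line.
predicted = [slope * x + intercept for x in xs]

sst = sum((y - y_mean) ** 2 for y in ys)                 # total variation
ssr = sum((p - y_mean) ** 2 for p in predicted)          # explained variation
sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))   # unexplained variation
r_squared = ssr / sst

print(slope, intercept)  # 0.75 1.5
print(sst, ssr, sse)     # 6.0 4.5 1.5
print(r_squared)         # 0.75
```

The slope and R² are both 0.75 here only by coincidence of the sample data; in general they are different quantities.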

R² Interpretation Guide

0.9 - 1.0: Excellent (very strong relationship)
0.7 - 0.9: Good (strong relationship)
0.5 - 0.7: Moderate (moderate relationship)
0.3 - 0.5: Weak (weak relationship)
0.0 - 0.3: Very Weak (little to no relationship)
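
The guide can be expressed as a small lookup function. This is a sketch under one assumption: the guide does not say which band owns a shared boundary such as 0.9 or 0.7, so the hypothetical `interpret_r_squared` below assigns each boundary to the higher band.

```python
def interpret_r_squared(r2):
    """Map an R² value in [0, 1] to the label from the interpretation guide.

    Assumption: each band includes its lower bound (0.9 is "Excellent",
    0.7 is "Good", and so on). The guide itself leaves this unspecified.
    """
    if not 0.0 <= r2 <= 1.0:
        raise ValueError("the guide only covers R² values between 0 and 1")
    bands = [
        (0.9, "Excellent"),
        (0.7, "Good"),
        (0.5, "Moderate"),
        (0.3, "Weak"),
    ]
    # Bands are ordered from highest to lowest, so the first match wins.
    for lower_bound, label in bands:
        if r2 >= lower_bound:
            return label
    return "Very Weak"


print(interpret_r_squared(0.75))  # Good
```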

Key Formulas

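R² = SSR / SST
Basic R-squared formula
R² = 1 - SSE / SST
Alternative form (equal to the first for a least-squares fit with an intercept, where SST = SSR + SSE)
R² = r²
Square of the correlation coefficient r between x and y (simple linear regression only)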
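
As a quick numerical check of the last formula, the sketch below computes the Pearson correlation coefficient r for the sample points (0,1), (2,4), (4,4) and compares r² with R² from the least-squares line. It assumes simple linear regression (one predictor, with an intercept); the variable names are illustrative.

```python
import math

xs = [0, 2, 4]
ys = [1, 4, 4]
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
sxx = sum((x - x_mean) ** 2 for x in xs)
syy = sum((y - y_mean) ** 2 for y in ys)  # same quantity as SST

# Pearson correlation coefficient between x and y.
r = sxy / math.sqrt(sxx * syy)

# Least-squares line and its error sum of squares.
slope = sxy / sxx
intercept = y_mean - slope * x_mean
sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
r_squared = 1 - sse / syy

# Compare with a tolerance because r involves a square root.
print(math.isclose(r ** 2, r_squared))  # True
```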

Understanding the Coefficient of Determination

What is R-squared?

The coefficient of determination (R²) is a statistical measure of the proportion of variance in a dependent variable that can be predicted from the independent variable(s). It indicates goodness of fit and, when computed on held-out data, how well unseen samples are likely to be predicted by the model.

Key Characteristics

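  • Values range from 0 to 1 (0% to 100%) for a least-squares fit with an intercept
  • Higher values indicate better model fit
  • R² = 1 means perfect fit (every point lies on the regression line)
  • R² = 0 means no explanatory power (the model does no better than predicting ȳ)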

Sum of Squares Breakdown

Total Sum of Squares (SST)

SST = Σ(yᵢ - ȳ)²

Measures total variation in the data

Regression Sum of Squares (SSR)

SSR = Σ(ŷᵢ - ȳ)²

Measures explained variation

Error Sum of Squares (SSE)

SSE = Σ(yᵢ - ŷᵢ)²

Measures unexplained variation
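
The three definitions translate directly into code. The hypothetical helper `sums_of_squares` below is a minimal sketch that takes observed and predicted values and returns all three sums. It also illustrates that SST = SSR + SSE holds for the least-squares line but not for an arbitrary line, which is why SSR / SST and 1 - SSE / SST agree only for a least-squares fit.

```python
def sums_of_squares(observed, predicted):
    """Return (SST, SSR, SSE) for paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and the same length")
    y_mean = sum(observed) / len(observed)
    sst = sum((y - y_mean) ** 2 for y in observed)                 # total
    ssr = sum((p - y_mean) ** 2 for p in predicted)                # explained
    sse = sum((y - p) ** 2 for y, p in zip(observed, predicted))   # unexplained
    return sst, ssr, sse


observed = [1, 4, 4]  # y values of the sample points at x = 0, 2, 4

# Predictions from the least-squares line y = 0.75x + 1.5.
sst, ssr, sse = sums_of_squares(observed, [1.5, 3.0, 4.5])
print(sst, ssr, sse)        # 6.0 4.5 1.5
print(ssr + sse == sst)     # True

# Predictions from a different line, y = x + 1 (not the least-squares fit).
sst2, ssr2, sse2 = sums_of_squares(observed, [1.0, 3.0, 5.0])
print(sst2, ssr2, sse2)     # 6.0 8.0 2.0
print(ssr2 + sse2 == sst2)  # False
```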

Applications

  • Linear regression analysis
  • Model comparison and selection
  • Predictive modeling evaluation
  • Scientific research validation

Advantages

  • Easy to interpret (0-100%)
  • Widely recognized measure
  • Unitless, so values are comparable across models of the same dependent variable
  • Shows proportion of explained variance

Limitations

  • Never decreases when variables are added, so it can be inflated by irrelevant predictors
  • Doesn't indicate causation
  • Sensitive to outliers
  • Assumes a linear relationship; a low R² can hide a strong nonlinear one
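
The sensitivity to outliers is easy to demonstrate. The sketch below uses made-up data (not from the calculator) in which five points lie exactly on y = 2x, then replaces one observation with an outlier and refits the line. The helper name `r_squared_of_best_fit` is hypothetical.

```python
def r_squared_of_best_fit(xs, ys):
    """Fit a least-squares line to (xs, ys) and return its R²."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    sst = sum((y - y_mean) ** 2 for y in ys)
    sse = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return 1 - sse / sst


xs = [1, 2, 3, 4, 5]
clean = [2, 4, 6, 8, 10]        # exactly y = 2x
with_outlier = [2, 4, 6, 8, 0]  # last observation replaced by an outlier

print(r_squared_of_best_fit(xs, clean))         # 1.0
print(r_squared_of_best_fit(xs, with_outlier))  # 0.0
```

In this deliberately extreme case a single point moves R² from a perfect fit to no explanatory power, even though the other four points still lie on a straight line.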