[Interactive scatter plot: Feature (X) vs. Label (Y)]

OLS Model parameters

Drag any point in the scatter box above. The model will instantly update the best-fit line and calculate MSE loss metrics.

MSE (Loss): 25.4
R-squared (R²): 0.988
Pearson R: -0.994
Cov(X, Y): -2940.0
Var(X): 4240.0
Slope (m): 0.693

Manage Dataset (5 Points)
Point P1 (60, 50)
Point P2 (100, 90)
Point P3 (150, 110)
Point P4 (200, 150)
Point P5 (240, 180)

Linear Regression & Statistics

Linear Regression models the relationship between a dependent variable Y and an independent variable X. Ordinary Least Squares (OLS) calculates the best-fit line by minimizing the sum of squared residuals (the vertical distances from the points to the line). In AI, this is the foundation for training simple models that predict an outcome from features.

$$y = mx + c$$
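
To make "minimizing the sum of squared residuals" concrete, here is a minimal sketch in plain Python. It uses the five displayed points from the dataset panel; the helper names (`sum_squared_residuals`, `ols_fit`) and the nudge sizes are illustrative assumptions, not part of the original page. It fits the line in closed form and then checks that a few slightly perturbed lines all have a larger sum of squared residuals.

```python
# Displayed (Cartesian) points P1..P5 from the dataset panel.
POINTS = [(60, 50), (100, 90), (150, 110), (200, 150), (240, 180)]


def sum_squared_residuals(m, c, points):
    """Sum of squared vertical distances from each point to y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in points)


def ols_fit(points):
    """Closed-form OLS slope and intercept for a single feature."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    cov_xy = sum((x - x_mean) * (y - y_mean) for x, y in points) / n
    var_x = sum((x - x_mean) ** 2 for x, _ in points) / n
    m = cov_xy / var_x
    return m, y_mean - m * x_mean


m, c = ols_fit(POINTS)
best = sum_squared_residuals(m, c, POINTS)

# Hypothetical perturbations of the fitted line: each should fit worse.
nudges = [(0.05, 0.0), (-0.05, 0.0), (0.0, 5.0), (0.0, -5.0)]
worse = [sum_squared_residuals(m + dm, c + dc, POINTS) for dm, dc in nudges]

print(f"slope = {m:.3f}, sum of squared residuals = {best:.2f}")
print("every nudged line is worse:", all(ssr > best for ssr in worse))
```

Because the displayed Y values are used directly, the slope comes out positive, which agrees with the Slope (m) value shown in the parameters panel.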

Whiteboard Solver Steps

Step 1

Calculate Centroid (Means), Variance, and Covariance

Step-by-Step Calculation:
1. **Centroid**: Sum the coordinates of all scatter points and divide by the number of points ($N = 5$) to get the center of mass: $(\bar{X}, \bar{Y}) = (150.00, 124.00)$. Y is measured here in the plot's SVG coordinates, which increase downwards (see Step 2), so $\bar{Y}$ differs from the mean of the displayed Y values ($116$).
2. **Variance of X (Var)**: Measures the dispersion of the X data points. Sum the squared deviations $(X_i - \bar{X})^2$ and divide by $N$ to get $4240.00$.
3. **Covariance of X,Y (Cov)**: Measures how X and Y vary together. Sum the cross-multiplied deviations $(X_i - \bar{X})(Y_i - \bar{Y})$ and divide by $N$ to get $-2940.00$.

$$
\begin{aligned}
\bar{X} &= \frac{1}{N}\sum X_i = 150.00 \\
\bar{Y} &= \frac{1}{N}\sum Y_i = 124.00 \\
\text{Var}(X) &= \frac{\sum(X_i - \bar{X})^2}{N} = 4240.00 \\
\text{Cov}(X, Y) &= \frac{\sum(X_i - \bar{X})(Y_i - \bar{Y})}{N} = -2940.00
\end{aligned}
$$

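
The Step 1 quantities can be reproduced with a short sketch. One assumption is needed: the displayed Y values are converted to SVG coordinates as `SVG_HEIGHT - y` with `SVG_HEIGHT = 240`. That height is not stated on the page; it is inferred here because it is the value that makes the mean of Y come out to 124.00. The function names are illustrative.

```python
SVG_HEIGHT = 240  # assumption: inferred from the stated mean of 124.00, not given in the text

# Displayed points P1..P5 from the dataset panel.
DISPLAYED = [(60, 50), (100, 90), (150, 110), (200, 150), (240, 180)]

xs = [x for x, _ in DISPLAYED]
ys = [SVG_HEIGHT - y for _, y in DISPLAYED]  # SVG Y axis grows downwards


def mean(values):
    return sum(values) / len(values)


def population_variance(values):
    """Mean squared deviation, dividing by N as in the text (not N - 1)."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)


def population_covariance(a, b):
    """Mean cross-multiplied deviation, dividing by N as in the text."""
    mu_a, mu_b = mean(a), mean(b)
    return sum((p - mu_a) * (q - mu_b) for p, q in zip(a, b)) / len(a)


x_bar, y_bar = mean(xs), mean(ys)
var_x = population_variance(xs)
cov_xy = population_covariance(xs, ys)

print(f"centroid = ({x_bar:.2f}, {y_bar:.2f})")
print(f"Var(X) = {var_x:.2f}, Cov(X, Y) = {cov_xy:.2f}")
```

Dividing by $N$ gives the population variance and covariance used on this page; sample estimators divide by $N - 1$ instead and would give different numbers.
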
Step 2

Ordinary Least Squares (OLS) Slope & Intercept

Step-by-Step Calculation:
1. **Slope ($m$)**: Divide the Covariance of X and Y by the Variance of X. This represents the rate of change. Here, $m \approx -0.6934$ (since SVG coordinates have Y increasing downwards, we negate this for standard Cartesian coordinates: $m_{cartesian} \approx 0.6934$).
2. **Intercept ($c$)**: Plug the centroid coordinates and the computed slope into $y = mx + c$ and solve for $c = \bar{Y} - m\bar{X}$, giving $c \approx 228.01$.

$$
\begin{aligned}
m &= \frac{\text{Cov}(X,Y)}{\text{Var}(X)} = \frac{-2940.00}{4240.00} \approx -0.6934 \\
c &= \bar{Y} - m\bar{X} = 124.00 - (-0.6934) \times 150.00 \approx 228.01
\end{aligned}
$$

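
The arithmetic of Step 2 can be checked directly from the summary statistics quoted in Step 1. This is a minimal sketch; `ols_line` is a hypothetical helper name, and the zero-variance guard is an added safety check rather than something the page describes.

```python
# Summary statistics as stated in Step 1 (SVG coordinates).
x_bar, y_bar = 150.00, 124.00
var_x = 4240.00
cov_xy = -2940.00


def ols_line(x_bar, y_bar, var_x, cov_xy):
    """Return (slope, intercept) of the OLS line from summary statistics."""
    if var_x == 0:
        raise ValueError("all X values are identical; the slope is undefined")
    m = cov_xy / var_x
    c = y_bar - m * x_bar  # forces the line through the centroid
    return m, c


m_svg, c_svg = ols_line(x_bar, y_bar, var_x, cov_xy)
m_cartesian = -m_svg  # flip the sign because the SVG Y axis points downwards

print(f"m (SVG) = {m_svg:.4f}, c = {c_svg:.2f}, m (Cartesian) = {m_cartesian:.4f}")
```

The intercept formula guarantees that the fitted line passes through the centroid $(\bar{X}, \bar{Y})$, which is a quick sanity check on any OLS result.
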
Step 3

Loss Evaluation: Residuals & Mean Squared Error (MSE)

Step-by-Step Loss Minimization:
1. **Residuals**: For each point, calculate the prediction error (actual Y minus predicted Y: $Y_i - \hat{Y}_i$).
2. **Mean Squared Error (MSE)**: Square each residual, sum the squares, and divide by the dataset size $N$ to find the average squared error: $MSE \approx 25.42$.
3. **Coefficient of Determination ($R^2$)**: The squared correlation coefficient $R^2 \approx 0.9877$ shows that $98.8\%$ of the variance in Y is explained by the regression line.

Machine Learning Context:
- In AI, we call MSE the Loss Function. An optimization algorithm (like Gradient Descent) iteratively adjusts the slope $m$ and intercept $c$ (the model weights) to find the global minimum of the MSE loss surface.
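
The Gradient Descent remark can be illustrated with a small batch gradient descent loop. This is a sketch under stated assumptions, not the page's own implementation: the SVG height of 240 is inferred from the stated mean of Y, the feature is standardized first so that a fixed learning rate converges quickly, and the learning rate and step count are illustrative choices.

```python
import math

SVG_HEIGHT = 240  # assumption: inferred from the stated mean of 124.00
DISPLAYED = [(60, 50), (100, 90), (150, 110), (200, 150), (240, 180)]
xs = [x for x, _ in DISPLAYED]
ys = [SVG_HEIGHT - y for _, y in DISPLAYED]


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))
    return [(v - mu) / sigma for v in values], mu, sigma


def gradient_descent(zs, ys, learning_rate=0.1, steps=500):
    """Minimize the MSE of y = w*z + b with batch gradient descent."""
    n = len(zs)
    w, b = 0.0, 0.0
    for _ in range(steps):
        errors = [w * z + b - y for z, y in zip(zs, ys)]
        grad_w = 2 / n * sum(e * z for e, z in zip(errors, zs))
        grad_b = 2 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


zs, x_mu, x_sigma = standardize(xs)
w, b = gradient_descent(zs, ys)

# Undo the standardization: y = w*(x - mu)/sigma + b = m*x + c.
m = w / x_sigma
c = b - m * x_mu

print(f"gradient descent: m = {m:.4f}, c = {c:.2f}")
```

Because the MSE of a linear model is a convex function of the slope and intercept, the iterates settle on the same line that the closed-form OLS formulas give.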

$$
\begin{aligned}
y_{predicted} &= -0.693x + 228.01 \\
MSE &= \frac{1}{N} \sum_{i=1}^{N} (Y_i - Y_{predicted})^2 \approx 25.42 \\
R^2 &= R \times R \approx 0.9877
\end{aligned}
$$
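
The loss evaluation above can be sketched in plain Python as follows. As before, `SVG_HEIGHT = 240` is an assumption inferred from the stated mean of Y, and the function names are illustrative. The sketch fits the line, averages the squared residuals, and squares the Pearson correlation to obtain $R^2$.

```python
import math

SVG_HEIGHT = 240  # assumption: inferred from the stated mean of 124.00
DISPLAYED = [(60, 50), (100, 90), (150, 110), (200, 150), (240, 180)]
xs = [x for x, _ in DISPLAYED]
ys = [SVG_HEIGHT - y for _, y in DISPLAYED]


def mean(values):
    return sum(values) / len(values)


def ols_fit(xs, ys):
    """Closed-form slope and intercept: m = Cov/Var, c = mean(y) - m*mean(x)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx  # the 1/N factors cancel
    return m, y_bar - m * x_bar


def mean_squared_error(xs, ys, m, c):
    """Average of the squared residuals Y_i - (m*X_i + c)."""
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
    return sum(r ** 2 for r in residuals) / len(residuals)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


m, c = ols_fit(xs, ys)
mse = mean_squared_error(xs, ys, m, c)
r = pearson_r(xs, ys)
r_squared = r * r

print(f"MSE = {mse:.2f}, R = {r:.3f}, R^2 = {r_squared:.4f}")
```

For a one-feature OLS fit, $R^2$ computed this way equals $1 - MSE / \text{Var}(Y)$, so the two metrics in the panel carry the same information once the variance of Y is known.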