Square (자승)
Overview
Square (자승, square) refers to the operation of multiplying a number by itself in mathematics, and in statistics, it denotes the squared difference between an observed value and the expected value (or mean). This concept plays a key role in calculating variance and standard deviation, serving as a fundamental tool for measuring data variability. Squaring converts the magnitude of errors into positive values, allowing them to be summed, and has the advantage of restoring the original unit via the square root.
Main Content
1. Definition and Mathematical Meaning of Square
A square is defined as x² = x × x for any number x. For example, the square of 3 is 9, and the square of 5 is 25. In statistics, the square is primarily used as the square of a deviation. A deviation is the difference between each data value and the mean; squaring this value makes it positive, solving the problem of cancellation when summing all deviations.
2. Sum of Squares (SS)
The sum of squares is the sum of the squared differences between each observed value and the mean. The formula is as follows:
SS = Σ(xᵢ - μ)² (population) or SS = Σ(xᵢ - x̄)² (sample)
Here, xᵢ is each observed value, μ is the population mean, and x̄ is the sample mean. The sum of squares is an intermediate step in calculating variance and represents the total variation in the data.
3. Variance and Standard Deviation
Variance is the sum of squares divided by the number of data points. Population variance (σ²) = SS/N, sample variance (s²) = SS/(n-1). Standard deviation is the square root of variance, having the same unit as the original data, making interpretation easier. For example, if the standard deviation of test scores is 10 points, it means that scores are, on average, about 10 points away from the mean.
4. Applications of Squares
- Regression Analysis: The ordinary least squares (OLS) method, which minimizes the sum of squared residuals, is a basic method for estimating regression coefficients. A smaller residual sum of squares (RSS) indicates a better model fit.
- ANOVA (Analysis of Variance): Decomposes total variation into between-group and within-group sums of squares to test the significance of differences between groups.
- Chi-Square Test: Squares the difference between observed and expected frequencies to compute the test statistic.
- Machine Learning: Mean squared error (MSE), the average of squared differences between predicted and actual values, is widely used to evaluate regression model performance.
5. Limitations and Alternatives of Squares
Squares are sensitive to outliers. Large deviations have a greater impact because they are squared. To address this, robust methods such as mean absolute error (MAE) or Huber loss are used. Additionally, since squaring changes the unit, interpretation may not be intuitive, so standard deviation is more commonly used.
Recent Trends
As of 2024–2025, the concept of squares is becoming increasingly important in artificial intelligence and big data analysis. Mean squared error (MSE) remains widely used as a loss function in deep learning models, but the use of robust alternatives (e.g., mean absolute error, Huber loss) is on the rise. Additionally, sum-of-squares-based variance analysis is applied in high-dimensional data for estimating variability in fields such as genomics, financial risk management, and climate modeling. In particular, in 2024, distributed computing techniques for efficiently calculating sums of squares have advanced, enabling real-time analysis of large-scale datasets. In education, the proliferation of statistical software (R, Python) has spread tools for visually understanding the concept of squares.
Related Topics
- [[Variance]]
- [[Standard deviation]]
- [[Least squares method]]
- [[Mean squared error]]
- [[Regression analysis]]
---
AI-generated document · Improved by the community