Standardized coefficient

From Wikipedia, the free encyclopedia

In statistics, standardized (regression) coefficients, also called beta coefficients or beta weights, are the estimates resulting from a regression analysis where the underlying data have been standardized so that the variances of dependent and independent variables are equal to 1.[1] Therefore, standardized coefficients are unitless and refer to how many standard deviations a dependent variable will change, per standard deviation increase in the predictor variable.

Usage

Coefficients are usually standardized to determine which of the independent variables has the greater effect on the dependent variable in a multiple regression analysis when the variables are measured in different units (for example, income measured in dollars and family size measured in number of individuals). Standardized coefficients may also be regarded as a general measure of effect size, quantifying the "magnitude" of the effect of one variable on another. In simple linear regression, and in multiple regression with mutually orthogonal (uncorrelated) predictors, the standardized regression coefficient equals the correlation between the independent and dependent variables.
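The equality with the correlation can be checked numerically for the single-predictor case. The following is a minimal sketch using only the Python standard library; the data values and the helper names (ols_slope, pearson_r, zscore) are hypothetical and chosen purely for illustration.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def zscore(v):
    """Standardize v to mean 0 and sample standard deviation 1."""
    m, s = mean(v), stdev(v)
    return [(a - m) / s for a in v]

# Hypothetical illustration data (not from any real study)
x = [1.0, 2.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 4.5, 4.0, 6.0, 7.5, 7.0]

beta_std = ols_slope(zscore(x), zscore(y))  # slope fitted on standardized data
r = pearson_r(x, y)                         # correlation on the raw data

print(beta_std, r)  # the two values agree up to floating-point error
```

With more than one predictor the equality generally fails unless the predictors are uncorrelated with each other, which this single-predictor sketch does not attempt to show.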

Implementation

A regression carried out on original (unstandardized) variables produces unstandardized coefficients. A regression carried out on standardized variables produces standardized coefficients. Values for standardized and unstandardized coefficients can also be re-scaled to one another subsequent to either type of analysis. Suppose that $\beta$ is the regression coefficient resulting from a linear regression (predicting $y$ by $x$). The standardized coefficient simply results as $\beta^* = \frac{s_x}{s_y}\beta$, where $s_x$ and $s_y$ are the (estimated) standard deviations of $x$ and $y$, respectively.[1]
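The two routes to a standardized coefficient can be compared directly: multiplying the unstandardized slope by the ratio of the standard deviation of the predictor to that of the outcome should give the same number as refitting on standardized variables. The sketch below assumes a single predictor; the data and the names ols_slope and zscore are hypothetical.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def zscore(v):
    """Standardize v to mean 0 and sample standard deviation 1."""
    m, s = mean(v), stdev(v)
    return [(a - m) / s for a in v]

# Hypothetical data: x on a large scale, y on a small scale
x = [1200.0, 1500.0, 1700.0, 2100.0, 2600.0, 3000.0]
y = [2.0, 3.0, 3.0, 4.0, 6.0, 5.0]

b = ols_slope(x, y)               # unstandardized slope, in units of y per unit of x
s_x, s_y = stdev(x), stdev(y)     # sample standard deviations

beta_rescaled = b * s_x / s_y                   # route 1: rescale after fitting
beta_refit = ols_slope(zscore(x), zscore(y))    # route 2: fit on standardized data

b_recovered = beta_refit * s_y / s_x            # and back to the original units

print(beta_rescaled, beta_refit, b, b_recovered)
```

Both routes give the same standardized value, and the reverse rescaling recovers the original slope, so in this setting the choice between them is a matter of convenience.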

Sometimes, standardization is done only with respect to the standard deviation of the regressor (the independent variable $x$).[2][3]
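One way to read this partial standardization, sketched here under the assumption that only the regressor is scaled while the outcome stays in its original units: the resulting coefficient is the unstandardized slope multiplied by the standard deviation of the regressor, and it is expressed in units of the outcome per standard deviation of the regressor. The data and names are hypothetical.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data
x = [10.0, 12.0, 15.0, 19.0, 24.0, 30.0]
y = [1.1, 1.4, 1.3, 1.9, 2.4, 2.6]

b = ols_slope(x, y)
m_x, s_x = mean(x), stdev(x)

# Standardize the regressor only; y keeps its original units
zx = [(a - m_x) / s_x for a in x]
b_x_only = ols_slope(zx, y)

print(b_x_only, b * s_x)  # equal up to floating-point error
```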

Advantages and disadvantages

Advocates of standardized coefficients note that the coefficients do not depend on the units of measurement of the variables involved (i.e., standardized coefficients are unitless), which makes comparisons easy.[3]

Critics voice concerns that such a standardization can be very misleading.[2][4] Due to the re-scaling based on sample standard deviations, any effect apparent in the standardized coefficient may be due to confounding with the particularities (especially: variability) of the involved data sample(s). Also, the interpretation or meaning of a "one standard deviation change" in the regressor may vary markedly between non-normal distributions (e.g., when skewed, asymmetric or multimodal).
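The dependence on sample variability can be illustrated with a constructed example. In the sketch below, two hypothetical samples follow the same relationship (a slope of exactly 2, with the same residuals), but the predictor is spread more widely in one sample than in the other. The data are artificial and were chosen so that the residuals are exactly uncorrelated with the predictor in both samples.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def standardized_slope(x, y):
    """Unstandardized slope rescaled by the ratio of sample standard deviations."""
    return ols_slope(x, y) * stdev(x) / stdev(y)

# Same residuals in both samples (sum to zero, uncorrelated with x by construction)
noise = [1.0, -2.0, 1.0, -1.0, 2.0, -1.0]

x_narrow = [4.0, 5.0, 6.0, 4.0, 5.0, 6.0]    # predictor varies little
x_wide = [0.0, 5.0, 10.0, 0.0, 5.0, 10.0]    # predictor varies a lot

y_narrow = [2.0 * a + e for a, e in zip(x_narrow, noise)]
y_wide = [2.0 * a + e for a, e in zip(x_wide, noise)]

b_narrow = ols_slope(x_narrow, y_narrow)     # 2.0 up to rounding
b_wide = ols_slope(x_wide, y_wide)           # 2.0 up to rounding

beta_narrow = standardized_slope(x_narrow, y_narrow)  # about 0.76
beta_wide = standardized_slope(x_wide, y_wide)        # about 0.99

print(b_narrow, b_wide, beta_narrow, beta_wide)
```

The unstandardized slopes agree, while the standardized coefficients differ only because the predictor's standard deviation differs between the samples, which is the kind of sample dependence the critics describe.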

Terminology

Some statistical software packages, such as PSPP, SPSS and SYSTAT, label the standardized regression coefficients "Beta" and the unstandardized coefficients "B". Others, such as DAP/SAS, label them "Standardized Coefficient". Sometimes the unstandardized coefficients are also labeled "b".

References

  1. Menard, S. (2004). "Standardized regression coefficients". In Lewis-Beck, M. S.; Bryman, A.; Liao, T. F. (eds.). The Sage Encyclopedia of Social Science Research Methods. Thousand Oaks, CA, USA: Sage Publications. pp. 1069–1070. doi:10.4135/9781412950589.n959. ISBN 9780761923633.
  2. Greenland, S.; Schlesselman, J. J.; Criqui, M. H. (1986). "The fallacy of employing standardized regression coefficients and correlations as measures of effect". American Journal of Epidemiology. 123 (2): 203–208. doi:10.1093/oxfordjournals.aje.a114229. PMID 3946370.
  3. Newman, T. B.; Browner, W. S. (1991). "In defense of standardized regression coefficients". Epidemiology. 2 (5): 383–386. doi:10.1097/00001648-199109000-00014. PMID 1742391.
  4. Greenland, S.; Maclure, M.; Schlesselman, J. J.; Poole, C.; Morgenstern, H. (1991). "Standardized regression coefficients: A further critique and review of some alternatives". Epidemiology. 2 (5): 387–392. doi:10.1097/00001648-199109000-00016. PMID 1742393.

Further reading

  • Schroeder, Larry D.; Sjoquist, David L.; Stephan, Paula E. (1986). Understanding Regression Analysis. Sage Publications. pp. 31–32. ISBN 0-8039-2758-4.
  • Vittinghoff, Eric; Glidden, David V.; Shiboski, Stephen C.; McCulloch, Charles E. (2005). Regression Methods in Biostatistics: Linear, Logistic, Survival, and Repeated Measures Models. Springer. pp. 75–76. ISBN 0-387-20275-7.
  • Neter, J.; Kutner, M. H.; Nachtsheim, C.J.; Wasserman, W. (1996). "7.5 Standardized multiple regression model". Applied Linear Statistical Models (4th ed.). McGraw-Hill. pp. 281–284. ISBN 0-256-11736-5.