Changes

m
Line 10: Line 10:  
}}
 
}}
   −
[[File:Correlation-and-regression-analysis fig1.png|thumb|{{figure number|1}}Linear regression of x-on-y. Note the negative slope corresponding to a negative correlation. The regression line is determined so as to minimize the sum of squared deviations: <math>\sum_i{e_i^2}</math>]]
+
[[File:Correlation-and-regression-analysis fig1.png|300px|thumb|{{figure number|1}}Linear regression of x-on-y. Note the negative slope corresponding to a negative correlation. The regression line is determined so as to minimize the sum of squared deviations: <math>\sum_i{e_i^2}</math>]]
    
Correlation analysis, and its cousin, regression analysis, are well-known statistical approaches used in the study of relationships among multiple physical properties. The investigation of [[permeability]]-[[porosity]] relationships is a typical example of the use of correlation in geology.
 
Line 16: Line 16:  
The term ''correlation'' most often refers to the linear association between two quantities or variables, that is, the tendency for one variable, x, to increase or decrease as the other, y, increases or decreases, in a straight-line trend or relationship.<ref name=Draper_etal_1966>Draper, N. R., and H. Smith, 1966, Applied regression analysis, 2nd ed.: New York, John Wiley, 709 p.</ref> <ref name=Snedecor_etal_1967>Snedecor, G. W., and W. G. Cochran, 1967, Statistical methods, 6th ed.: Ames, Iowa State Univ. Press, 593 p.</ref> The ''correlation coefficient'' (also called the Pearson correlation coefficient), r, is a dimensionless numerical index of the strength of that relationship. The sample value of r, which can range from -1 to +1, is computed using the following formula:
 
   −
:<math>r = \frac{\displaystyle \sum_{i} (x_{i}-\bar{x})(y_{i}-\bar{y})}{\sqrt{\displaystyle \sum_i (x_{i}-\bar{x})^2 \cdot \displaystyle \sum_i (y_{i}-\bar{y})^2}}</math>
+
:<math>r = \frac{\displaystyle \sum_{i} (x_{i}-\bar{x})(y_{i}-\bar{y})}{\sqrt{\displaystyle \sum_i (x_{i}-\bar{x})^2 \times \displaystyle \sum_i (y_{i}-\bar{y})^2}}</math>
    
where the summation is made over the n sample values available and where
 
Line 68: Line 68:  
==Multiple and multivariate regression==
 
   −
The most important extension of the two-variable case is to situations involving more than two variables. When there is still one dependent variable but many predictor variables, the fitting technique is called ''multiple linear regression.'' When there is also more than one dependent variable, the approach is called ''multivariate regression'' (see [[Multivariate data analysis]]). The methods of simple bivariate regression extend directly to these multivariate situations. A typical geological application of multiple regression is the prediction of fold thickness from various geometric attributes, as given by the following equation:
+
The most important extension of the two-variable case is to situations involving more than two variables. When there is still one dependent variable but many predictor variables, the fitting technique is called ''multiple linear regression.'' When there is also more than one dependent variable, the approach is called ''multivariate regression'' (see [[Multivariate data analysis]]). The methods of simple bivariate regression extend directly to these multivariate situations. A typical geological application of multiple regression is the prediction of [[fold]] thickness from various geometric attributes, as given by the following equation:
    
 
:<math>\text{Thickness } = a+b~(\text{attitude}) + c~(\text{tightness}) + d~(\text{asymmetry}) </math>
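As an illustration only, the thickness equation above can be fitted by ordinary least squares. The following is a minimal sketch in plain Python, assuming the coefficients are found by solving the normal equations with Gauss-Jordan elimination. The function name <code>fit_multiple_regression</code> and the five sample rows are hypothetical: the data are synthetic, generated from a = 2, b = 0.5, c = 3, d = -1, and are not measurements from the article.

```python
def fit_multiple_regression(rows, y):
    """Least-squares fit of y = a + b1*x1 + ... + bk*xk (hypothetical helper)."""
    n = len(rows)
    # Design matrix: a leading 1 in each row carries the intercept a.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(X[0])
    # Augmented normal equations [X^T X | X^T y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))]
         for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [u - factor * v for u, v in zip(A[r], A[col])]
    # The matrix is now diagonal; divide through to read off the coefficients.
    return [A[r][p] / A[r][r] for r in range(p)]


# Synthetic, made-up predictors: (attitude, tightness, asymmetry) per fold.
predictors = [(10, 1, 0), (20, 2, 1), (30, 1, 2), (40, 3, 1), (50, 2, 3)]
# Thickness generated exactly from 2 + 0.5*attitude + 3*tightness - 1*asymmetry.
thickness = [10.0, 17.0, 18.0, 30.0, 30.0]

coeffs = fit_multiple_regression(predictors, thickness)
print([round(c, 6) for c in coeffs])  # a, b, c, d in that order
```

Because the synthetic thicknesses contain no noise, the fit should recover the generating coefficients to within rounding error. With real measurements the residuals would be nonzero, and a numerical library would normally be preferred over hand-written elimination.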
Line 85: Line 85:     
[[Category:Geological methods]] [[Category:Test content]][[Category:Pages with unformatted equations]]
 
 +
[[Category:Methods in Exploration 10]]
