Correlation coefficient - characteristic of the correlation model

A correlation model (CM) is a computational procedure that yields a mathematical equation in which the resultant indicator is expressed quantitatively as a function of one or more factor indicators.

$$y_x = a_0 + a_1 x_1$$

where: $y_x$ is the resultant indicator, which depends on the factor $x_1$;

$x_1$ is the factor indicator;

$a_1$ is the CM parameter that shows by how much the resultant indicator y changes when the factor $x_1$ changes by one unit, provided that all other factors affecting y remain unchanged;

$a_0$ is the CM parameter that reflects the influence on the resultant indicator y of all factors other than $x_1$.
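The text does not say how the parameters of the model are estimated. As a minimal sketch, assuming ordinary least squares (an assumption, not something the source states), the two parameters of the one-factor model could be computed as follows; the function name `fit_one_factor` and the sample data are hypothetical.

```python
def fit_one_factor(x, y):
    """Estimate a0 and a1 of y_x = a0 + a1*x.

    Assumption: ordinary least squares is used; the source does not
    name the estimation method.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # a1 is the covariance of x and y divided by the variance of x
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    a1 = cov_xy / var_x
    # a0 makes the fitted line pass through the point of means
    a0 = mean_y - a1 * mean_x
    return a0, a1


# Hypothetical data lying exactly on y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
a0, a1 = fit_one_factor(x, y)
print(a0, a1)  # prints 1.0 2.0
```

Here a1 = 2 means that the resultant indicator grows by two units for each unit increase of the factor.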

When choosing the resultant and factor indicators, it is necessary to take into account that the resultant indicator stands at a higher level in the chain of cause-and-effect relations than the factor indicators.

Characteristics of the correlation model

After calculating the parameters of the correlation model, the correlation coefficient is calculated.

p is the pair correlation coefficient, -1 ≤ p ≤ 1; it shows the strength and direction of the influence of the factor indicator on the resultant one. The closer its absolute value is to 1, the stronger the connection; the closer it is to 0, the weaker the connection. If the correlation coefficient is positive, the connection is direct; if it is negative, the connection is inverse.

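The correlation coefficient is calculated by the formula:

$$p_{xy} = \frac{\overline{xy} - \bar{x}\,\bar{y}}{\sigma_x \sigma_y}$$

$$\sigma_x = \sqrt{\overline{x^2} - (\bar{x})^2}; \quad \sigma_y = \sqrt{\overline{y^2} - (\bar{y})^2}$$

where a bar denotes the mean over the observations.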
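The pair correlation formula can be sketched in a few lines of Python. This is an illustration only: the function name `pair_correlation` and the sample data are hypothetical, and the means are taken over all observations (population form).

```python
from math import sqrt


def pair_correlation(x, y):
    """Pair correlation: (mean(xy) - mean(x)*mean(y)) / (sigma_x * sigma_y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    mean_xy = sum(xi * yi for xi, yi in zip(x, y)) / n
    # Standard deviations from the mean of squares minus the squared mean
    sigma_x = sqrt(sum(xi * xi for xi in x) / n - mean_x ** 2)
    sigma_y = sqrt(sum(yi * yi for yi in y) / n - mean_y ** 2)
    return (mean_xy - mean_x * mean_y) / (sigma_x * sigma_y)


# Hypothetical observations
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
p = pair_correlation(x, y)
print(round(p, 3))  # about 0.775: a direct and fairly strong connection
```

Reversing the order of y so that it falls as x grows gives a negative coefficient, that is, an inverse connection.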

If the CM is linear and multifactor, of the form:

$$y_x = a_0 + a_1 x_1 + a_2 x_2 + \dots + a_n x_n$$

then the multiple correlation coefficient P is calculated for it.

It lies in the range 0 ≤ P ≤ 1 and shows the strength of the combined influence of all the factor indicators on the resultant one.

$$P = \sqrt{1 - \frac{\sum (y_x - y_i)^2}{\sum (y_i - \bar{y})^2}}$$

where: $y_x$ is the calculated value of the resultant indicator;

$y_i$ is the actual value;

$\bar{y}$ is the average of the actual values.

The calculated value $y_x$ is obtained by substituting the actual values of $x_1$, $x_2$, etc. into the correlation model.
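A minimal sketch of this calculation, assuming the multiple correlation coefficient is the square root of one minus the ratio of the residual sum of squares to the total sum of squares. The two-factor model, its parameters and the data below are hypothetical and serve only to show the substitution step.

```python
from math import sqrt


def multiple_correlation(actual, calculated):
    """Multiple correlation coefficient from actual and calculated values."""
    mean_y = sum(actual) / len(actual)
    residual = sum((yx - yi) ** 2 for yx, yi in zip(calculated, actual))
    total = sum((yi - mean_y) ** 2 for yi in actual)
    # For a poorly fitting model the expression under the root can become
    # negative, in which case sqrt raises ValueError.
    return sqrt(1 - residual / total)


def model(x1, x2):
    """Hypothetical two-factor model y_x = 1 + 2*x1 + 0.5*x2."""
    return 1.0 + 2.0 * x1 + 0.5 * x2


x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
actual = [4.1, 5.4, 9.2, 10.3, 14.0]

# Calculated values: substitute the actual factor values into the model
calculated = [model(a, b) for a, b in zip(x1, x2)]
P = multiple_correlation(actual, calculated)
print(round(P, 4))
```

With these numbers the calculated values stay close to the actual ones, so P comes out close to 1.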

For one-factor and multifactor non-linear models, the correlation ratio m is calculated; it lies in the range:

0 ≤ m ≤ 1

The relationship between the resultant indicator and the factors included in the model is considered weak if the tightness of the connection (m) lies within 0-0.3, average if it lies within 0.3-0.7, and strong if it is above 0.7 and up to 1.
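The scale above can be written as a small helper. This is a sketch under two assumptions that the text leaves open: the boundary value 0.3 is assigned to the "average" band, and the absolute value is taken so that the same scale can be applied to a negative pair coefficient. The function name `tightness` is hypothetical.

```python
def tightness(coefficient):
    """Classify the tightness of a connection as weak, average or strong."""
    value = abs(coefficient)  # assumption: the sign only gives the direction
    if value > 1:
        raise ValueError("a tightness coefficient cannot exceed 1 in absolute value")
    if value < 0.3:
        return "weak"
    if value <= 0.7:  # assumption: 0.3 and 0.7 both fall in the middle band
        return "average"
    return "strong"


for m in (0.25, 0.5, -0.85):
    print(m, tightness(m))
```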

Since the pair correlation coefficient p, the multiple correlation coefficient P and the correlation ratio m are probabilistic quantities, significance coefficients are calculated for them and compared with tabulated values. If a calculated significance coefficient is greater than its tabulated value, the corresponding tightness coefficient is significant. If the significance coefficient is less than the tabulated value, or if the tightness coefficient itself is less than 0.7, then the model does not include all the factor indicators that significantly affect the result.

The coefficient of determination shows what percentage of the formation of the resultant indicator is determined by the factors included in the model.

$$D = P^2 \cdot 100\%$$

$$D = p^2 \cdot 100\%$$

$$D = m^2 \cdot 100\%$$

If the coefficient of determination is greater than 50%, the model adequately describes the process under investigation; if it is less than 50%, it is necessary to return to the first stage of model construction and review the selection of factor indicators for inclusion in the model.
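As an illustration, the coefficient of determination and the 50% rule can be expressed as follows. The function names and the sample coefficients are hypothetical; the same function applies to p, P or m.

```python
def determination_percent(coefficient):
    """Coefficient of determination D = coefficient**2 * 100, in percent."""
    return coefficient ** 2 * 100


def is_adequate(coefficient, threshold=50.0):
    """Apply the rule from the text: D above 50% means an adequate model."""
    return determination_percent(coefficient) > threshold


for value in (0.8, 0.6):
    d = determination_percent(value)
    print(value, round(d, 1), is_adequate(value))
```

A coefficient of 0.8 gives D of about 64%, so the model is treated as adequate, while 0.6 gives about 36% and sends the analyst back to the selection of factors.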

The Fisher coefficient, or Fisher test, characterizes the effectiveness of the model as a whole. If the calculated value of the coefficient exceeds the tabulated value, the constructed model is suitable for analysis as well as for planning indicators and making forecasts. The tabulated value is approximately 1.5. If the calculated value is less than the tabulated value, the model must be rebuilt from the beginning so that it includes the factors that significantly influence the result.

In addition to the effectiveness of the model as a whole, the significance of each regression coefficient is assessed. If the calculated value of this significance coefficient is higher than the tabulated value, the regression coefficient is significant; if it is lower, the factor for which the coefficient was calculated is removed from the model and the calculations are started again without this factor.
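The text does not give a formula for the Fisher coefficient. The sketch below assumes the standard overall F statistic of a linear regression, computed from the multiple correlation coefficient P, the number of observations n and the number of factors k. The function names and sample numbers are hypothetical, and the tabulated value is passed in by the caller: 1.5 is the approximate figure quoted above, whereas in practice it is read from an F-distribution table for the chosen significance level and degrees of freedom.

```python
def fisher_statistic(P, n, k):
    """Overall F statistic (assumed form): (P^2 / k) / ((1 - P^2) / (n - k - 1))."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters")
    r2 = P * P
    if r2 >= 1:
        raise ValueError("P must be strictly between -1 and 1")
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def model_is_suitable(P, n, k, f_table):
    """Compare the calculated F value with a tabulated value supplied by the caller."""
    return fisher_statistic(P, n, k) > f_table


# Hypothetical case: P = 0.9 from 20 observations and 2 factors
F = fisher_statistic(0.9, n=20, k=2)
print(round(F, 2), model_is_suitable(0.9, n=20, k=2, f_table=1.5))
```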