# How do I interpret correlation, R?

Cost: 2.42, 1.14, 1.34, 1.24, 1.59, 1.18, 3.11, 2.16, 2.76, 2.31

Content: 560, 1560, 1740, 1600, 1520, 1570, 690, 640, 610, 660

r = ???

It's all so very confusing to me!

### 2 Answers

- vekkus4 (Lv 6), 9 years ago, Favorite Answer
Follow the formulas!

http://mathworld.wolfram.com/CorrelationCoefficien...

http://en.wikipedia.org/wiki/Pearson_product-momen...

First compute these three things called

ss_(xy)

ss_(xx)

ss_(yy)

I think "ss" means sum of squares here.

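Then do (ss_(xy))^2 / (ss_(xx) ss_(yy) ) = r^2

Then take the square root of it and give r the same sign as ss_(xy), because the square root by itself loses the sign. That is the same as r = ss_(xy) / sqrt( ss_(xx) ss_(yy) ).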

Count how many points there are. There are 10 Cost numbers and 10 Content numbers. It doesn't even matter which ones are the x and which ones are the y to compute the correlation. Let x1, x2, ... x10 be the Cost numbers.

Find the average: add up all the Costs and divide by 10. Call that X. Then add up the squares of the differences from that average:

ssxx = (x1 - X)^2 + (x2 - X)^2 + ... + (x10-X)^2.

Then do the same process with the 10 Content numbers - find the mean, call it Y, then do the sum of squares of the differences from the mean. That gives you ssyy.

Next do the cross terms ssxy. To do this you have to keep the numbers in order in the list. If they have gotten shuffled around then it will give the wrong answer.

It will look like this

(x1 - X)(y1 - Y) + (x2 - X) (y2 - Y) + (x3 - X) (y3 - Y) + .... + (x10 - X) (y10 - Y)
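The steps so far can be sketched in a few lines of Python. This is only an illustration of the hand calculation, not a library routine; the names `cost`, `content` and `pearson_r` are made up for the example. It divides ss_xy by sqrt(ss_xx * ss_yy) directly, which keeps the sign of the correlation.

```python
from math import sqrt

# Data from the question, kept in the original order so the pairs stay matched.
cost = [2.42, 1.14, 1.34, 1.24, 1.59, 1.18, 3.11, 2.16, 2.76, 2.31]
content = [560, 1560, 1740, 1600, 1520, 1570, 690, 640, 610, 660]

def pearson_r(xs, ys):
    """Correlation from the three sums ss_xx, ss_yy and ss_xy (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n  # X in the text
    y_bar = sum(ys) / n  # Y in the text
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_yy = sum((y - y_bar) ** 2 for y in ys)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Dividing ss_xy by the square root keeps the sign of ss_xy.
    return ss_xy / sqrt(ss_xx * ss_yy)

r = pearson_r(cost, content)
print(round(r, 3))  # roughly -0.91 for this data
```

Swapping the two lists gives the same value, which matches the remark that it does not matter which one is x and which one is y.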

If you get bored too fast doing this yourself on paper with a calculator it might be sweeter to find a calculator that does statistics. Find the manual and figure out how to enter data points for (x,y) and how to use the stats button. Or get a spreadsheet program like Open Office Calc or Microsoft Excel. But if you just want to use regular paper with lines then you would just have about 7 columns of numbers

xi | yi | xi-X | yi-Y | (xi-X)^2 | (yi-Y)^2 | (xi-X)(yi-Y)

Then at the end of the first column you put the sum of the xi and the mean, and at the end of the yi column you put the sum of the yi and the mean.

If doing it without machinery then you can round off the mean to 2 decimal places (3 or more is safer if you need it to be very accurate). Hopefully you have wide paper or you can write small.

The three answers you need are the total for the 5th column, the total for the 6th column, and the total for the 7th column.
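If you would rather let a program fill in the seven-column sheet, here is a rough sketch in Python. The function name `worksheet` and the print layout are just choices made for this example; the columns follow the paper layout described above.

```python
# Data from the question, in the original order.
cost = [2.42, 1.14, 1.34, 1.24, 1.59, 1.18, 3.11, 2.16, 2.76, 2.31]
content = [560, 1560, 1740, 1600, 1520, 1570, 690, 640, 610, 660]

def worksheet(xs, ys):
    """Build the seven columns row by row and total each column (hypothetical helper)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    rows = []
    for x, y in zip(xs, ys):
        dx, dy = x - x_bar, y - y_bar
        # xi, yi, xi-X, yi-Y, (xi-X)^2, (yi-Y)^2, (xi-X)(yi-Y)
        rows.append((x, y, dx, dy, dx * dx, dy * dy, dx * dy))
    totals = [sum(col) for col in zip(*rows)]
    return rows, totals

rows, totals = worksheet(cost, content)
for row in rows:
    print("  ".join(f"{v:12.4f}" for v in row))

# Columns 5, 6 and 7 are the three totals needed: ss_xx, ss_yy, ss_xy.
print("ss_xx =", totals[4], " ss_yy =", totals[5], " ss_xy =", totals[6])
```

The totals of the third and fourth columns should come out as zero (up to rounding), which is a handy check that the means were computed correctly.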

Looking at these numbers I think the correlation should be less than zero: when the cost is low (the 1.something values) the content is high (around 1500 to 1700), but when the cost is higher (2.something or 3.something) the content is lower (in the 500s or 600s).

Maybe that is what you are asking and I didn't have to go to all that trouble to talk about how it is computed :-)


- papke (Lv 4), 4 years ago
You are testing the null hypothesis that the population correlation coefficient is 0 against the alternative hypothesis that it is > 0. If you have a table of critical values, you will find that with a sample of size 5 the upper 0.05 critical value of r is 0.8054, and because 0.845 exceeds this, you reject the null in favour of the alternative and accept that there is sufficient evidence of a positive correlation between X and Y. However, it seems odd to be assuming in advance that the relation is positive. Note that with this method, had r been (say) 0.645, there would be insufficient evidence to support the claim.
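The table lookup in this answer can be sketched in Python. One assumption is built in: the one-tailed 5% critical value of Student's t with 3 degrees of freedom is taken as 2.3534 from a standard t table, and it is converted to a critical r with r = t / sqrt(t^2 + df). The names `critical_r` and `reject_null` are hypothetical and only used here.

```python
from math import sqrt

def critical_r(t_crit, n):
    """Convert a critical t value into a critical r for a sample of size n."""
    df = n - 2
    return t_crit / sqrt(t_crit ** 2 + df)

def reject_null(r, r_crit):
    """One-sided test of 'population correlation is 0' against '> 0'."""
    return r > r_crit

# Assumption: one-tailed 5% t value for df = 3, read from a standard table.
T_CRIT_DF3 = 2.3534

r_crit = critical_r(T_CRIT_DF3, n=5)
print(round(r_crit, 3))            # close to the tabled 0.8054
print(reject_null(0.845, r_crit))  # the case described in the answer
print(reject_null(0.645, r_crit))  # the "had r been 0.645" case
```

With a different sample size the t value has to be looked up again for n - 2 degrees of freedom, so the 0.8054 cutoff only applies to samples of 5.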
