
Compute the correlation matrix between all columns of a matrix or data frame.

Usage

correlation(x, ...)

Correlation(x, ...)

# S3 method for formula
correlation(formula, data = NULL, subset, na.action, ...)

# S3 method for default
correlation(
  x,
  y = NULL,
  use = "everything",
  method = c("pearson", "kendall", "spearman"),
  ...
)

is.Correlation(x)

is.correlation(x)

as.Correlation(x)

as.correlation(x)

# S3 method for Correlation
print(x, digits = 3, cutoff = 0, ...)

# S3 method for Correlation
summary(
  object,
  cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95),
  symbols = c(" ", ".", ",", "+", "*", "B"),
  ...
)

# S3 method for summary.Correlation
print(x, ...)

# S3 method for Correlation
plot(
  x,
  y = NULL,
  outline = TRUE,
  cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95),
  palette = rwb.colors,
  col = NULL,
  numbers = TRUE,
  digits = 2,
  type = c("full", "lower", "upper"),
  diag = (type == "full"),
  cex.lab = par("cex.lab"),
  cex = 0.75 * par("cex"),
  ...
)

# S3 method for Correlation
lines(
  x,
  choices = 1L:2L,
  col = par("col"),
  lty = 2,
  ar.length = 0.1,
  pos = NULL,
  cex = par("cex"),
  labels = rownames(x),
  ...
)

Arguments

x

A numeric vector, matrix or data frame (or any object for is.Correlation() or as.Correlation()).

...

Further arguments passed to functions.

formula

A formula with no response variable, referring only to numeric variables.

data

An optional data frame (or similar, see model.frame()) containing the variables in the formula. By default the variables are taken from environment(formula).

subset

An optional vector used to select rows (observations) of the data matrix x.

na.action

A function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options() and na.fail() is used if that is not set. The 'factory-fresh' default is na.omit().

y

NULL (default), or a vector, matrix or data frame with dimensions compatible with x for Correlation(). The default is equivalent to y = x, but more efficient.

use

An optional character string giving a method for computing correlations in the presence of missing values. This must be (an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs".

method

A character string indicating which correlation coefficient is to be computed: one of "pearson" (default), "kendall", or "spearman". The name can be abbreviated.

digits

Digits to print after the decimal separator.

cutoff

Correlation coefficients lower than this (in absolute value) are suppressed.

object

A 'Correlation' object.

cutpoints

The cut points to use for categories. Specify only positive values: the absolute values of the correlation coefficients are summarized, and the negative equivalents are computed automatically for the graph. Do not include 0 or 1 in the cutpoints.

symbols

The symbols to use to summarize the correlation matrix.

outline

Do we draw the outline of the ellipse?

palette

A function that can produce a palette of colors.

col

Color of the ellipse. If NULL (default), the colors will be computed using cutpoints and palette.

numbers

Do we print correlation values in the center of the ellipses?

type

Do we plot a complete matrix, or only lower or upper triangle?

diag

Do we plot items on the diagonal? They always have a correlation of one.

cex.lab

The expansion factor for labels.

cex

The expansion factor for text.

choices

The items to select.

lty

The line type to draw.

ar.length

The length of the arrow head.

pos

The position relative to arrows.

labels

The labels to draw near the arrows.
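
The strings accepted by the use argument are the same as those of cor() in base R. As a minimal sketch, assuming the default method hands use over to cor() unchanged (this page does not state it explicitly), the base R code below shows how three of the options behave on data with missing values. The data frame df is made up for illustration.

```r
# Hypothetical data with one missing value in each of two columns
df <- data.frame(
  a = c(1, 2, 3, 4, 5, 6),
  b = c(2, 1, 4, 3, NA, 5),
  c = c(6, 5, NA, 3, 2, 1)
)

# "everything": any pair involving a column with an NA yields NA
r_all <- cor(df, use = "everything")

# "complete.obs": rows containing any NA are dropped before computing
r_complete <- cor(df, use = "complete.obs")

# "pairwise.complete.obs": each pair uses all rows complete for that pair
r_pairwise <- cor(df, use = "pairwise.complete.obs")

r_all
r_complete
r_pairwise
```

With "pairwise.complete.obs", different cells of the matrix can be based on different subsets of rows, which is worth keeping in mind when comparing coefficients.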

Value

Correlation() and as.Correlation() create a 'Correlation' object, while is.Correlation() tests for it.

There are print() and summary() methods for the 'Correlation' object. They differ in that summary() encodes the correlations symbolically (using symnum()), which makes large correlation matrices more readable.

The plot() method draws ellipses on a graph to represent the correlation matrix visually. This is essentially the plotcorr() function from package ellipse, with slightly different default arguments and with default cutpoints equivalent to those used in the summary() method.
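
To give an idea of the symbolic encoding, here is a rough sketch that calls symnum() directly on a plain cor() matrix with the default cutpoints and symbols listed under Usage. How summary() calls symnum() internally is an assumption here, so the result may differ in details from the output of the method.

```r
# Pearson correlations for the longley data set (shipped with R)
r <- cor(longley)

# corr = TRUE makes symnum() work on abs(r) and add 0 and 1 to the cut points
sym <- symnum(
  r,
  cutpoints = c(0.3, 0.6, 0.8, 0.9, 0.95),
  symbols = c(" ", ".", ",", "+", "*", "B"),
  corr = TRUE
)
sym
```

Each coefficient is replaced by the symbol of the interval that contains its absolute value, so the sign of the correlation is not visible in this view.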

See also

cov(), cov2cor(), cov.wt(), symnum(), plotcorr(); see also panel_cor()

Author

Philippe Grosjean (phgrosjean@sciviews.org). The plot.Correlation() method wraps code from the plotcorr() function in package ellipse.

Examples

# This is a simple correlation coefficient
cor(rnorm(10), runif(10))
#> [1] -0.4933597
Correlation(rnorm(10), runif(10))
#> Matrix of Pearson's product-moment correlation:
#> (calculation uses everything)
#>   x      y     
#> x  1.000 -0.305
#> y -0.305  1.000

# 'Correlation' objects allow better inspection of the correlation matrices
# than the output of default R cor() function
(longley.cor <- Correlation(longley))
#> Matrix of Pearson's product-moment correlation:
#> (calculation uses everything)
#>              GNP.deflator GNP    Unemployed Armed.Forces Population Year  
#> GNP.deflator  1.000        0.992  0.621      0.465        0.979      0.991
#> GNP           0.992        1.000  0.604      0.446        0.991      0.995
#> Unemployed    0.621        0.604  1.000     -0.177        0.687      0.668
#> Armed.Forces  0.465        0.446 -0.177      1.000        0.364      0.417
#> Population    0.979        0.991  0.687      0.364        1.000      0.994
#> Year          0.991        0.995  0.668      0.417        0.994      1.000
#> Employed      0.971        0.984  0.502      0.457        0.960      0.971
#>              Employed
#> GNP.deflator  0.971  
#> GNP           0.984  
#> Unemployed    0.502  
#> Armed.Forces  0.457  
#> Population    0.960  
#> Year          0.971  
#> Employed      1.000  
summary(longley.cor) # Synthetic view of the correlation matrix
#> Matrix of Pearson's product-moment correlation:
#> (calculation uses everything)
#>              GNP. GNP U A P Y E
#> GNP.deflator 1                 
#> GNP          B    1            
#> Unemployed   ,    ,   1        
#> Armed.Forces .    .     1      
#> Population   B    B   , . 1    
#> Year         B    B   , . B 1  
#> Employed     B    B   . . B B 1
#> attr(,"legend")
#> [1] 0 ‘ ’ 0.3 ‘.’ 0.6 ‘,’ 0.8 ‘+’ 0.9 ‘*’ 0.95 ‘B’ 1
plot(longley.cor)    # Graphical representation


# Use of the formula interface
(mtcars.cor <- Correlation(~ mpg + cyl + disp + hp, data = mtcars,
  method = "spearman", na.action = "na.omit"))
#> Matrix of Spearman's rank correlation rho:
#> (missing values are managed with na.omit)
#>      mpg    cyl    disp   hp    
#> mpg   1.000 -0.911 -0.909 -0.895
#> cyl  -0.911  1.000  0.928  0.902
#> disp -0.909  0.928  1.000  0.851
#> hp   -0.895  0.902  0.851  1.000

mtcars.cor2 <- Correlation(mtcars, method = "spearman")
print(mtcars.cor2, cutoff = 0.6)
#> Matrix of Spearman's rank correlation rho:
#> (calculation uses everything)
#>      mpg    cyl    disp   hp     drat   wt     qsec   vs     am     gear  
#> mpg   1.000 -0.911 -0.909 -0.895  0.651 -0.886         0.707              
#> cyl  -0.911  1.000  0.928  0.902 -0.679  0.858        -0.814              
#> disp -0.909  0.928  1.000  0.851 -0.684  0.898        -0.724 -0.624       
#> hp   -0.895  0.902  0.851  1.000         0.775 -0.667 -0.752              
#> drat  0.651 -0.679 -0.684         1.000 -0.750                0.687  0.745
#> wt   -0.886  0.858  0.898  0.775 -0.750  1.000               -0.738 -0.676
#> qsec                      -0.667                1.000  0.792              
#> vs    0.707 -0.814 -0.724 -0.752                0.792  1.000              
#> am                 -0.624         0.687 -0.738                1.000  0.808
#> gear                              0.745 -0.676                0.808  1.000
#> carb -0.657                0.733               -0.659 -0.634              
#>      carb  
#> mpg  -0.657
#> cyl        
#> disp       
#> hp    0.733
#> drat       
#> wt         
#> qsec -0.659
#> vs   -0.634
#> am         
#> gear       
#> carb  1.000
summary(mtcars.cor2)
#> Matrix of Spearman's rank correlation rho:
#> (calculation uses everything)
#>      m cy ds h dr w q v a g cr
#> mpg  1                        
#> cyl  * 1                      
#> disp * *  1                   
#> hp   + *  +  1                
#> drat , ,  ,  . 1              
#> wt   + +  +  , ,  1           
#> qsec . .  .  ,      1         
#> vs   , +  ,  , .  . , 1       
#> am   . .  ,  . ,  ,     1     
#> gear . .  .  . ,  ,     + 1   
#> carb , .  .  ,    . , ,     1 
#> attr(,"legend")
#> [1] 0 ‘ ’ 0.3 ‘.’ 0.6 ‘,’ 0.8 ‘+’ 0.9 ‘*’ 0.95 ‘B’ 1
plot(mtcars.cor2, type = "lower")


mtcars.cor2["mpg", "cyl"] # Extract a correlation from the correlation matrix
#> [1] -0.9108013
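
The effect of the cutoff argument of print() can be imitated in base R. This is only an illustrative sketch, not the code of the print method: it blanks out coefficients whose absolute value is below the cutoff in a plain cor() matrix. The names r, r_cut and cutoff are chosen for this example.

```r
# Spearman correlations computed with base R only
r <- cor(mtcars, method = "spearman")

# Blank out coefficients whose absolute value is below the cutoff
cutoff <- 0.6
r_cut <- round(r, 3)
r_cut[abs(r) < cutoff] <- NA

# na.print = "" shows the suppressed entries as empty cells
print(r_cut, na.print = "")
```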