disjoin.Rd
Transform a factor into separate variables (one per level), each binary-coded (0 for absent, 1 for present)
disjoin(x)
Use cut() to transform a numerical variable into a factor variable
a matrix containing the data with binary coding
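To make the shape of the result concrete, here is a minimal base-R sketch of the same kind of coding on a toy factor. It is only an illustration, not the implementation used by disjoin(); the names `f` and `coded` are hypothetical and do not come from the package.

```r
# Hypothetical toy factor with three levels (illustration only)
f <- factor(c("a", "b", "a", "c"))

# Compare each observation's level index with every level index:
# the result has one row per observation and one column per level,
# with 1 where the observation belongs to that level and 0 elsewhere.
# This is a base-R sketch, not the code behind disjoin().
coded <- outer(as.integer(f), seq_along(levels(f)), "==") * 1L
colnames(coded) <- levels(f)
coded
```

With no missing values in the factor, every row of such a matrix contains exactly one 1, so the row sums are all equal to 1.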
Fromentin J.-M., F. Ibanez & P. Legendre, 1993. A phytosociological method for interpreting plankton data. Mar. Ecol. Prog. Ser., 93:285-306.
Gebski, V.J., 1985. Some properties of splicing when applied to non-linear smoothers. Comput. Stat. Data Anal., 3:151-157.
Grandjouan, G., 1982. Une méthode de comparaison statistique entre les répartitions des plantes et des climats. Thèse d'Etat, Université Louis Pasteur, Strasbourg.
Ibanez, F., 1976. Contribution à l'analyse mathématique des événements en Ecologie planctonique. Optimisations méthodologiques. Bull. Inst. Océanogr. Monaco, 72:1-96.
# Artificial data with 1/5 of zeros
Z <- c(abs(rnorm(8000)), rep(0, 2000))
# Let the program choose the cuts
table(cut(Z, breaks=5))
#>
#> (-0.00359,0.717] (0.717,1.43] (1.43,2.15] (2.15,2.87]
#> 6256 2561 944 218
#> (2.87,3.59]
#> 21
# Create one class for zeros, and 5 classes (quintiles) for the other observations
Z2 <- Z[Z != 0]
cuts <- c(-1e-10, 1e-10, quantile(Z2, 1:5/5, na.rm=TRUE))
cuts
#> 20% 40% 60%
#> -0.0000000001 0.0000000001 0.2472909942 0.5205766411 0.8265420885
#> 80% 100%
#> 1.2690374813 3.5869454757
table(cut(Z, breaks=cuts))
#>
#> (-1e-10,1e-10] (1e-10,0.247] (0.247,0.521] (0.521,0.827] (0.827,1.27]
#> 2000 1600 1600 1600 1600
#> (1.27,3.59]
#> 1600
# Binary coding of these data
disjoin(cut(Z, breaks=cuts))[1:10, ]
#> (-1e-10,1e-10] (1e-10,0.247] (0.247,0.521] (0.521,0.827] (0.827,1.27]
#> [1,] 0 0 1 0 0
#> [2,] 0 0 0 1 0
#> [3,] 0 0 0 0 0
#> [4,] 0 0 1 0 0
#> [5,] 0 0 0 0 0
#> [6,] 0 0 1 0 0
#> [7,] 0 0 0 0 1
#> [8,] 0 0 0 0 0
#> [9,] 0 0 1 0 0
#> [10,] 0 0 0 0 0
#> (1.27,3.59]
#> [1,] 0
#> [2,] 0
#> [3,] 1
#> [4,] 0
#> [5,] 1
#> [6,] 0
#> [7,] 0
#> [8,] 1
#> [9,] 0
#> [10,] 1