A C++ implementation of the Average Correlation Clustering Algorithm (ACCA; https://www.sciencedirect.com/science/article/pii/S1532046410000158), originally developed for genetic studies using Pearson correlation as a similarity measure. Unlike traditional clustering methods that rely on distance metrics such as Euclidean or Mahalanobis distance, ACCA groups data based on correlation patterns.
This implementation works directly with the correlation matrix produced by the corr_matrix function and supports mixed data types along with various correlation methods.
ACCA is an unsupervised clustering method, meaning it identifies patterns without predefined labels. As with k-means, the number of clusters must be chosen in advance through the k argument.
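To make the idea of grouping by correlation rather than distance concrete, the following base R sketch performs a single reassignment step: each variable is moved to the cluster whose other members it correlates with most strongly on average. This is only an illustration of the principle. The helper `avg_cor_assign`, the use of absolute Pearson correlation, and the hand-picked starting clusters are assumptions made for this example; they are not the package's C++ code and may differ from it in detail.

```r
# Illustrative sketch only: one reassignment step based on average correlation.
# `avg_cor_assign` is a hypothetical helper, not part of corrp.
avg_cor_assign <- function(m, clusters) {
  vars <- rownames(m)
  vapply(vars, function(v) {
    scores <- vapply(clusters, function(members) {
      # Exclude the variable itself so its self-correlation of 1 does not bias the mean
      others <- setdiff(members, v)
      if (length(others) == 0L) return(-Inf)
      mean(m[v, others])
    }, numeric(1))
    # Pick the cluster with the highest average correlation
    names(clusters)[which.max(scores)]
  }, character(1))
}

# Assumption: absolute Pearson correlation on the numeric iris columns
m <- abs(stats::cor(iris[, 1:4]))

# Arbitrary starting partition chosen for the example
clusters <- list(
  cluster1 = c("Sepal.Length", "Petal.Length"),
  cluster2 = c("Sepal.Width", "Petal.Width")
)

assignment <- avg_cor_assign(m, clusters)
print(assignment)
```

In this toy run Petal.Width moves to cluster1, because its average correlation with Sepal.Length and Petal.Length is higher than its correlation with Sepal.Width. An iterative algorithm would repeat such steps until the assignments stop changing, which is what the maxrep and maxiter arguments bound.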
Usage
acca(m, k, ...)
# S3 method for class 'cmatrix'
acca(m, k, maxrep = 2L, maxiter = 100L, ...)
# S3 method for class 'matrix'
acca(m, k, maxrep = 2L, maxiter = 100L, ...)
Arguments
- m
[matrix(1)] correlation matrix from corr_matrix, or a distance matrix.
- k
[integer(1)] number of clusters considered.
- ...
Not used. Included for S3 method consistency.
- maxrep
[integer(1)] maximum number of iterations without change in the clusters.
- maxiter
[integer(1)] maximum number of iterations.
Value
[acca_list(k)] A list with the final result of the clustering method. Each element of the list holds the names of the variables assigned to one of the k clusters.
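Because the result is a list of character vectors, it is often convenient to flatten it into a named membership vector (variable name to cluster label). The sketch below does this in base R. To stay runnable without the package, it builds by hand an object with the same shape as the printed output in the Examples section; the `membership` name is just an illustrative choice.

```r
# Hand-built stand-in with the same structure as an acca() result
# (element names and class copied from the printed example output).
result <- structure(
  list(
    cluster1 = "Sepal.Length",
    cluster2 = "Sepal.Width",
    cluster3 = c("Species", "Petal.Length", "Petal.Width")
  ),
  class = c("acca_list", "list")
)

# Drop the class so base list functions apply directly
clusters <- unclass(result)

# Named vector: names are variables, values are cluster labels
membership <- stats::setNames(
  rep(names(clusters), lengths(clusters)),
  unlist(clusters, use.names = FALSE)
)

print(membership)
```

With a real result, `membership[["Petal.Width"]]` then gives the cluster of a single variable, and `table(membership)` gives the cluster sizes.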
References
Bhattacharya, Anindya, and Rajat K. De. "Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values." Journal of Biomedical Informatics 43.4 (2010): 560-568.
Examples
# Clustering a correlation matrix with 3 clusters
x <- corrp::corrp(iris)
m <- corrp::corr_matrix(x)
result <- corrp::acca(m, k = 3)
print(result)
#> $cluster1
#> [1] "Sepal.Length"
#>
#> $cluster2
#> [1] "Sepal.Width"
#>
#> $cluster3
#> [1] "Species" "Petal.Length" "Petal.Width"
#>
#> attr(,"class")
#> [1] "acca_list" "list"
# Clustering with 5 clusters and increasing the maximum number of iterations
x <- corrp::corrp(iris)
m <- corrp::corr_matrix(x)
result <- corrp::acca(m, k = 5, maxiter = 200)
print(result)
#> $cluster1
#> [1] "Species"
#>
#> $cluster2
#> [1] "Petal.Width"
#>
#> $cluster3
#> [1] "Sepal.Length"
#>
#> $cluster4
#> [1] "Petal.Length"
#>
#> $cluster5
#> [1] "Sepal.Width"
#>
#> attr(,"class")
#> [1] "acca_list" "list"
# Adjusting the maximum number of iterations without change in clusters
x <- corrp::corrp(iris)
m <- corrp::corr_matrix(x)
result <- corrp::acca(m, k = 2, maxrep = 50)
print(result)
#> $cluster1
#> [1] "Petal.Length" "Sepal.Length" "Petal.Width"
#>
#> $cluster2
#> [1] "Species" "Sepal.Width"
#>
#> attr(,"class")
#> [1] "acca_list" "list"