| Title: | Bayesian Clustering Factor Models |
|---|---|
| Description: | Implements the Bayesian Clustering Factor Models (BCFM) for simultaneous clustering and latent factor analysis of multivariate longitudinal data. The model accounts for within-cluster dependence through shared latent factors while allowing heterogeneity across clusters, enabling flexible covariance modeling in high-dimensional settings. Inference is performed using Markov chain Monte Carlo (MCMC) methods with computationally intensive steps implemented via 'Rcpp'. Model selection and visualization tools are provided. The methodology is described in Shin, Ferreira, and Tegge (2018) <doi:10.1002/sim.70350>. |
| Authors: | Allison Tegge [aut], Marco Ferreira [aut], Hwasoo Shin [aut], Meriem Touami [aut, cre] |
| Maintainer: | Meriem Touami <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.0.0 |
| Built: | 2026-05-12 07:22:54 UTC |
| Source: | https://github.com/ategge/bcfm |
Fits a single Bayesian Clustering Factor Model (BCFM) using the C++ implementation. This function is a wrapper around BCFMcpp that handles timing and output formatting.
BCFM.fit(data, model.attributes, hyp.parm, n.iter = 50000, vague.mu = FALSE, covariance = TRUE, p.exponent = 2, every = 10)
data |
An array of dimensions (observations, variables, time points) |
model.attributes |
Model attributes generated by initialize.model.attributes |
hyp.parm |
Hyperparameters generated by initialize.hyp.parm |
n.iter |
Number of MCMC iterations. Default is 50000. |
vague.mu |
Logical indicating whether to use vague priors for mu. Default is FALSE. |
covariance |
Logical indicating whether to model covariance structure. Default is TRUE. |
p.exponent |
Exponent of the Dirichlet prior on the cluster probabilities. Default is 2. |
every |
Integer specifying the frequency of progress updates during MCMC. Default is 10. |
A list containing:
Output from the BCFMcpp C++ function
The model attributes used
The hyperparameters used
POSIXct timestamp when model started
POSIXct timestamp when model completed
Total elapsed time for model execution
BCFM.model.selection for fitting multiple models,
initialize.model.attributes, initialize.hyp.parm
# Prepare data using the included simulated dataset
data(sim.data)
data.pre <- init.data(sim.data, paste0("V", 1:5))
# Initialize model components
model.attributes <- initialize.model.attributes(S = nrow(sim.data), times = 1, R = 5, L = 2, G = 2)
cluster.hyperparms <- initialize.cluster.hyperparms(data.pre, model.attributes)
hyp.parm <- initialize.hyp.parm(model.attributes, cluster.hyperparms)
# Fit model
result <- BCFM.fit(data.pre, model.attributes, hyp.parm, n.iter = 100, every = 10)
result$run.time
Fits Bayesian Clustering Factor Models across a grid of group numbers and factor numbers. For each combination, the function fits the BCFM, calculates the IC, and saves the results. This is the primary function for model selection, used to determine the optimal number of clusters and latent factors.
BCFM.model.selection(data, cluster.vars, grouplist, factorlist, n.iter = 50000, vague.mu = FALSE, covariance = TRUE, p.exponent = 2, every = 10, cluster.size = 0.05, burnin = NA, output_dir = tempdir(), seed = NULL)
data |
A data frame containing the variables to be analyzed |
cluster.vars |
A character vector specifying the column names of variables to be used for clustering |
grouplist |
A numeric vector specifying the numbers of groups to test (e.g., c(2, 3, 4, 5)) |
factorlist |
A numeric vector specifying the numbers of latent factors to test (e.g., c(1, 2, 3)) |
n.iter |
Number of MCMC iterations for each model. Default is 50000. |
vague.mu |
Logical indicating whether to use vague priors for mu. Default is FALSE. |
covariance |
Logical indicating whether to model covariance structure. Default is TRUE. |
p.exponent |
Exponent of the Dirichlet prior on the cluster probabilities. Default is 2. |
every |
Integer specifying the frequency of progress updates during MCMC. Default is 10. |
cluster.size |
Minimum proportion required for each cluster. Default is 0.05. |
burnin |
Number of initial MCMC iterations to discard when calculating IC. If NA, an appropriate burnin is determined automatically. |
output_dir |
Directory where results will be saved. Defaults to tempdir(). |
seed |
Optional integer seed for reproducibility. |
The function performs the following steps for each group-factor combination:
Preprocesses data using init.data
Determines optimal variable ordering using permutation.order
Initializes model attributes and hyperparameters
Fits BCFM model using BCFM.fit
Calculates IC for model comparison
Saves individual results and cumulative IC matrix
The IC matrix can be used to identify the optimal model configuration by selecting the combination of groups and factors with the lowest IC value.
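As a minimal sketch of that selection step, the base R snippet below locates the smallest entry of an IC matrix and maps it back to the tested numbers of groups and factors. The matrix values, and the names `best`, `best.G`, and `best.L`, are made up for illustration; the rows-as-groups, columns-as-factors layout follows the ggplot_IC documentation.

```r
# Toy IC matrix: rows are numbers of groups, columns are numbers of factors.
# The values are invented for illustration only.
grouplist  <- 2:3
factorlist <- 1:3
IC.matrix <- matrix(c(100, 95, 90, 92, 88, 85), nrow = 2, ncol = 3,
                    dimnames = list(paste0("G", grouplist),
                                    paste0("F", factorlist)))

# Locate the cell with the smallest IC; NA cells (e.g. failed fits) are skipped.
best <- which(IC.matrix == min(IC.matrix, na.rm = TRUE), arr.ind = TRUE)[1, ]

# Map matrix indices back to the tested numbers of groups and factors.
best.G <- grouplist[best["row"]]
best.L <- factorlist[best["col"]]
cat("Lowest IC at G =", best.G, "and L =", best.L, "\n")
```

If several cells tie for the minimum, this sketch keeps only the first one in column-major order.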
Invisibly returns NULL. Results are saved to disk:
Individual model results for each group-factor combination (where X is the number of groups and Y is the number of factors), containing SDresult and variable order
Contains IC.matrix, timing information, and data. Load this file to compare models and identify the optimal configuration.
This function can be computationally intensive as it fits multiple models. Consider running on high-performance computing resources for large datasets or extensive model grids. The function includes error handling to continue execution even if individual models fail to converge.
BCFM.fit for fitting a single model,
init.data, initialize.model.attributes,
initialize.hyp.parm
# Run model selection using the included simulated dataset
data(sim.data)
BCFM.model.selection(data = sim.data, cluster.vars = paste0("V", 1:5), grouplist = 2:3, factorlist = 1:2, n.iter = 100, every = 10, burnin = 10)
# Load and examine IC results
load(file.path(tempdir(), "IC.Rdata"))
print(IC.matrix)
Runs a Gibbs sampler for the common factors X, factor loadings B, group means mu, group covariances Omega, idiosyncratic variances sigma^2, group assignments Z, and group probabilities probs. This function is implemented with Rcpp.
BCFMcpp(data, model.attributes, hyp.parm, n.iter, every = 1, verbose = FALSE)
data |
The dataset |
model.attributes |
Model attributes generated by initialize.model.attributes |
hyp.parm |
Hyperparameters generated by initialize.hyp.parm |
n.iter |
Total number of iterations |
every |
Save every |
verbose |
If TRUE, print the results at every 10th step. |
A list of Gibbs samples for the parameters listed in the description.
Returns the mode of a vector.
getmode(v)
v |
The vector whose mode is to be found. |
The mode of v.
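For readers unfamiliar with the idea, a common base R idiom for the mode of a vector is sketched below. `getmode_sketch` is a hypothetical name and this is not necessarily how the package's getmode is implemented; in particular, the tie-breaking rule here (first value encountered wins) is an assumption.

```r
# Hypothetical helper, not the package's getmode(): most frequent value of v.
getmode_sketch <- function(v) {
  u <- unique(v)                          # distinct values, in order of appearance
  u[which.max(tabulate(match(v, u)))]     # value with the highest count
}

getmode_sketch(c(2, 3, 3, 1, 3, 2))
getmode_sketch(c("a", "b", "b"))
```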
Builds column-wise plots of the factor loadings. Parameters fixed at 1 are marked with red dashed vertical lines.
ggplot_B.CI(Gibbs, true.val = NA, burnin = NA, permutation = NA, main.bool = TRUE, var_labels = NULL)
Gibbs |
Result of Gibbs sampler from BCFM function |
true.val |
True values of factor loadings. If not available, NA. |
burnin |
Number of burn-in iterations. If not set, the first tenth of the sample is used as burn-in. |
permutation |
Permutation of variables. If not set, no permutation. |
main.bool |
Add title of the plots. Default is TRUE. |
var_labels |
Character vector of variable names. If NULL, defaults to Variable 1, Variable 2, etc. |
A ggplot object
Returns a trace plot of the factor loadings B for assessing MCMC convergence.
ggplot_B.trace(Gibbs, burnin = NA, permutation = NA, true.val = NA, factor.num = 1, var_labels = NULL)
Gibbs |
MCMC sample simulated from |
burnin |
Number of burn-in iterations. If not specified, no burn-in is removed. |
permutation |
Permutation order vector, if applicable |
true.val |
True values of factor loadings. If not available, NA. |
factor.num |
The index of variable to plot (which variable's loadings across all factors to display). Use NULL to plot all variables. |
var_labels |
Character vector of variable names. If NULL, uses Variable 1, Variable 2, etc. |
A ggplot object showing trace plots
Creates two plots showing IC values across different numbers of groups and factors to help identify the optimal model configuration.
ggplot_IC(matrix, factor_list = 2:4, group_list = 2:4, combine = TRUE)
matrix |
An IC matrix from BCFM.model.selection, with rows representing groups and columns representing factors. |
factor_list |
Numeric vector of factor values corresponding to matrix columns. Default is 2:4. |
group_list |
Numeric vector of group values corresponding to matrix rows. Default is 2:4. |
combine |
Logical. If TRUE (default), returns a combined plot using ggarrange. If FALSE, returns a list of two separate ggplot objects. |
The function creates two complementary visualizations:
Plot 1: IC vs. number of groups, with lines for each number of factors
Plot 2: IC vs. number of factors, with lines for each number of groups
Lower IC values indicate better model fit.
If combine = TRUE, a combined ggplot object. If combine = FALSE, a list with two elements:
ggplot showing IC vs. number of groups
ggplot showing IC vs. number of factors
# Create a toy IC matrix for demonstration
IC.matrix <- matrix(c(100, 95, 90, 92, 88, 85), nrow = 2, ncol = 3)
rownames(IC.matrix) <- paste0("G", 2:3)
colnames(IC.matrix) <- paste0("F", 1:3)
# Combined plot (default)
ggplot_IC(IC.matrix, factor_list = 1:3, group_list = 2:3)
# Separate ggplot objects
plots <- ggplot_IC(IC.matrix, factor_list = 1:3, group_list = 2:3, combine = FALSE)
plots$by_groups
plots$by_factors
Visualizes the latent factor profile means (mu) for each cluster, similar to Latent Profile Analysis (LPA) plots.
ggplot_latent.profiles(Gibbs, burnin = NA, factor_labels = NULL, cluster_names = NULL, colors = NULL, title = "Latent Factor Profiles by Cluster", x_label = "Factor", y_label = "Posterior Mean")
Gibbs |
Gibbs sample derived from |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
factor_labels |
Character vector of factor names. If NULL, defaults to Factor 1, Factor 2, etc. |
cluster_names |
Character vector of cluster names. If NULL, defaults to Cluster 1, Cluster 2, etc. |
colors |
Named vector of colors for each cluster. If NULL, uses default color palette |
title |
Plot title. Default is "Latent Factor Profiles by Cluster" |
x_label |
X-axis label. Default is "Factor" |
y_label |
Y-axis label. Default is "Posterior Mean" |
A ggplot object
# Fit a model first using the included simulated dataset
data(sim.data)
data.pre <- init.data(sim.data, paste0("V", 1:5))
model.attributes <- initialize.model.attributes(S = nrow(sim.data), times = 1, R = 5, L = 2, G = 2)
cluster.hyperparms <- initialize.cluster.hyperparms(data.pre, model.attributes)
hyp.parm <- initialize.hyp.parm(model.attributes, cluster.hyperparms)
result <- BCFM.fit(data.pre, model.attributes, hyp.parm, n.iter = 100, every = 10)
# Plot latent profiles
ggplot_latent.profiles(Gibbs = result$Result)
Returns multiple density plots of the group mean parameter mu.
ggplot_mu.density(Gibbs, true.val = NA, add.legend = FALSE, burnin = NA, layout.dim = NA, main.title = "Posterior Densities of Group Means (mu)", x.label = "mu", y.label = "Density")
Gibbs |
The Gibbs sample from BCFM function |
true.val |
The true value of mu, if applicable |
add.legend |
Add legend on extra pane |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
layout.dim |
Dimension of panes. If not specified, the plots are in one column. |
main.title |
Main title for the entire plot. Default is "Posterior Densities of Group Means (mu)" |
x.label |
X-axis label. Default is "mu" |
y.label |
Y-axis label. Default is "Density" |
A ggplot object (grob from grid.arrange)
Returns multiple plots of the diagonal of the group covariance Omega, built with ggplot2 and gridExtra. Results are shown factor by factor, with different colors representing different factors.
ggplot_omega.density(Gibbs, group.select = 1, true.val = NA, burnin = NA, main.title = "Posterior Densities of Omega", x.label = "Value", y.label = "Density", factor_labels = NULL, show.offdiag = TRUE)
Gibbs |
Gibbs sample from BCFM |
group.select |
Group/cluster to plot. If not specified, the first group will be used. |
true.val |
True values of Omega, if applicable. |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
main.title |
Main title for the plot. Default is "Posterior Densities of Omega" |
x.label |
X-axis label. Default is "Value" |
y.label |
Y-axis label. Default is "Density" |
factor_labels |
Character vector of factor names. If NULL, defaults to Factor 1, Factor 2, etc. |
show.offdiag |
Show off-diagonal elements. Default is TRUE for any k. |
A ggplot object showing densities of Omega elements
Returns a density plot of the cluster assignment probabilities p. Different colors represent different clusters.
ggplot_probs.density(Gibbs, burnin = NA, truep = NA, main.title = "Posterior Densities of Cluster Probabilities", x.label = "Probability", y.label = "Density", cluster_names = NULL)
Gibbs |
MCMC sample simulated from |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
truep |
True values of probabilities. If not available, NA. |
main.title |
Title of the plot. Default is "Posterior Densities of Cluster Probabilities" |
x.label |
X-axis label. Default is "Probability" |
y.label |
Y-axis label. Default is "Density" |
cluster_names |
Character vector of cluster names. If NULL, defaults to Cluster 1, Cluster 2, etc. |
A ggplot object (grob from grid.arrange) with plot and legend
Returns a trace plot of the cluster probabilities after burn-in. Different colors represent different groups.
ggplot_probs.trace(Gibbs, burnin = NA, main.title = "Trace Plot: Cluster Probabilities", x.label = "BCFM Iteration (post burn-in)", y.label = "Probability", cluster_names = NULL)
Gibbs |
Gibbs sample from |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
main.title |
Title of the plot. Default is "Trace Plot: Cluster Probabilities" |
x.label |
X-axis label. Default is "BCFM Iteration (post burn-in)" |
y.label |
Y-axis label. Default is "Probability" |
cluster_names |
Character vector of cluster names. If NULL, defaults to Cluster 1, Cluster 2, etc. |
A ggplot object showing trace plots
Returns a credible interval plot of the idiosyncratic variances sigma squared. The lines are 95% intervals and the circles are posterior means.
ggplot_sigma2.CI(Gibbs, burnin = NA, permutation = NA, main.bool = TRUE, main.title = NULL, x.label = "Variables", y.label = "Variance", var_labels = NULL)
Gibbs |
Gibbs sample from |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
permutation |
Permutation vector, if applicable |
main.bool |
Return main title. Default is TRUE. |
main.title |
Main title for the plot. Default is expression for sigma squared. |
x.label |
X-axis label. Default is "Variables" |
y.label |
Y-axis label. Default is "Variance" |
var_labels |
Character vector of variable names. If NULL, defaults to Variable 1, Variable 2, etc. |
A ggplot object showing the 95% credible intervals (lines) and posterior means (points) of the idiosyncratic variances sigma squared for each variable.
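The quantities behind such a plot can be sketched in base R as follows. The `draws` matrix is simulated and hypothetical (the layout of the actual Gibbs output is not shown here), and the use of equal-tailed quantile intervals is an assumption; the package may compute its intervals differently. The first tenth of the iterations is dropped, matching the documented default burn-in.

```r
set.seed(1)
# Hypothetical MCMC draws: 1000 iterations (rows) for 3 variances (columns).
draws <- matrix(rgamma(3000, shape = 2, rate = 2), nrow = 1000, ncol = 3)

# Discard the first tenth of the iterations as burn-in.
burnin <- floor(nrow(draws) / 10)
kept <- draws[-seq_len(burnin), , drop = FALSE]

# Posterior means and equal-tailed 95% intervals, one per variable.
post.mean <- colMeans(kept)
ci <- apply(kept, 2, quantile, probs = c(0.025, 0.975))

summary.table <- data.frame(
  variable = paste("Variable", seq_len(ncol(kept))),
  mean     = post.mean,
  lower    = ci[1, ],
  upper    = ci[2, ]
)
print(summary.table)
```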
Returns a credible interval plot of the factor loadings covariance tau. The lines are 95% intervals and the circles are posterior means.
ggplot_tau.CI(Gibbs, burnin = NA, true.val = NA, main.bool = TRUE, main.title = NULL, x.label = "Factor", y.label = "Tau", factor_labels = NULL)
Gibbs |
Gibbs sample from |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
true.val |
True values of the taus |
main.bool |
Return main title. Default is TRUE. |
main.title |
Main title for the plot. Default is expression for tau. |
x.label |
X-axis label. Default is "Factor" |
y.label |
Y-axis label. Default is "Tau" |
factor_labels |
Character vector of factor names. If NULL, defaults to Factor 1, Factor 2, etc. |
A ggplot object
Plots the proportion of variability explained by each factor based on eigenvalues of the correlation matrix. Useful for determining the number of factors.
ggplot_variability(data, nfactors = 5, main.title = "Proportion of Variability Explained by Factors", x.label = "Number of Factors", y.label = "Proportion of Variability")
data |
The data matrix |
nfactors |
Number of factors to display. Default is 5. |
main.title |
Main title for the plot. Default is "Proportion of Variability Explained by Factors" |
x.label |
X-axis label. Default is "Number of Factors" |
y.label |
Y-axis label. Default is "Proportion of Variability" |
A ggplot object
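The proportions that such a plot displays can be computed in base R from the eigenvalues of the correlation matrix, as in the sketch below. The data matrix `X` is simulated for illustration, and the exact computation inside ggplot_variability may differ.

```r
set.seed(1)
# Simulated data: 200 observations on 6 variables (illustrative only).
X <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)
X[, 2] <- X[, 1] + 0.3 * X[, 2]   # induce correlation so one component dominates

# Eigenvalues of the correlation matrix, returned in decreasing order.
ev <- eigen(cor(X), symmetric = TRUE, only.values = TRUE)$values

# Proportion of total variability attributed to each component.
prop <- ev / sum(ev)
cum.prop <- cumsum(prop)

head(prop, 5)   # proportions for the first five components
```

Because the eigenvalues of a correlation matrix sum to the number of variables, `prop` always sums to 1.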
Draws a heatmap of the group assignments Z using ggplot2. Groups are sorted by size, starting with the group that has the most assigned observations.
ggplot_Zit.heatmap(Gibbs, true.val = NA, burnin = NA, main.title = "Cluster Assignment Heatmap", x.label = "Cluster", y.label = "Subject-Time")
Gibbs |
Gibbs sample from |
true.val |
Table of true group assignments, if applicable |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
main.title |
Title of the plot. Default is "Cluster Assignment Heatmap" |
x.label |
X-axis label. Default is "Cluster" |
y.label |
Y-axis label. Default is "Subject-Time" |
A ggplot object
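A heatmap of this kind is typically built from the posterior assignment frequencies. The base R sketch below computes them from a simulated matrix of assignment draws; the name `Z.draws` and its iterations-by-subjects layout are assumptions made for illustration and do not describe the structure of the package's Gibbs output.

```r
set.seed(1)
G <- 3          # number of clusters
n.subj <- 5     # number of subject-time units
n.iter <- 200   # number of retained MCMC iterations

# Hypothetical assignment draws: one row per iteration, one column per subject.
Z.draws <- matrix(sample(1:G, n.iter * n.subj, replace = TRUE), n.iter, n.subj)

# Share of iterations in which each subject is assigned to each cluster.
assign.prob <- t(apply(Z.draws, 2, function(z) tabulate(z, nbins = G) / length(z)))
dimnames(assign.prob) <- list(paste("Subject", 1:n.subj), paste("Cluster", 1:G))

# Most likely cluster per subject, then order rows by that cluster for display.
map.cluster <- max.col(assign.prob, ties.method = "first")
assign.prob[order(map.cluster), ]
```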
Computes the Laplace-Metropolis estimate of the marginal likelihood, evaluated at the posterior mean. The Woodbury matrix identity is used to speed up the calculation.
IC(data.row, Gibbs, model.attributes, cluster.size = 0.05, burnin = NA)
data.row |
The dataset |
Gibbs |
Gibbs sample derived from |
model.attributes |
Model attributes generated by |
cluster.size |
Minimum proportion required for each cluster (default 0.05) |
burnin |
Number of burn-in iterations. If not specified, the first tenth of the sample is used as burn-in. |
The value of Laplace-Metropolis marginal density
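To illustrate why the Woodbury identity helps, the sketch below assumes a factor-model covariance of the form B Omega B' + diag(sigma2), which is a plausible but unconfirmed reading of the model. It inverts that R x R matrix by solving only an L x L system and checks the result against a direct inverse. All values are simulated.

```r
set.seed(1)
R <- 8   # number of variables
L <- 2   # number of factors

B <- matrix(rnorm(R * L), R, L)                           # factor loadings
Omega <- crossprod(matrix(rnorm(L * L), L, L)) + diag(L)  # positive definite L x L
sigma2 <- runif(R, 0.5, 1.5)                              # idiosyncratic variances

# Woodbury: (D + B Omega B')^{-1}
#         = D^{-1} - D^{-1} B (Omega^{-1} + B' D^{-1} B)^{-1} B' D^{-1}
Dinv <- diag(1 / sigma2)
DinvB <- B / sigma2                           # D^{-1} B: row i divided by sigma2[i]
inner <- solve(Omega) + crossprod(B, DinvB)   # only an L x L matrix to invert
Sigma.inv.woodbury <- Dinv - DinvB %*% solve(inner, t(DinvB))

# Direct R x R inversion, for comparison only.
Sigma.inv.direct <- solve(B %*% Omega %*% t(B) + diag(sigma2))

all.equal(Sigma.inv.woodbury, Sigma.inv.direct)
```

The saving comes from L being much smaller than R: the expensive solve is on an L x L matrix, while the diagonal part is inverted elementwise.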
Prepares the input data by converting it into a 3D array format required by the BCFM model. If data is already in the correct 3D array format, it returns the data as-is. Takes selected clustering variables and creates an array with dimensions (observations, variables, time points).
init.data(data, cluster.vars = NULL)
data |
A data frame, matrix, or 3D array containing the data to be used for clustering. If a 3D array with appropriate dimensions is provided and cluster.vars is NULL, the function returns the data unchanged. |
cluster.vars |
A character vector specifying the column names of variables to be used for clustering (required for data frames). If NULL and data is already a 3D array, the function returns data as-is. If NULL and data is a matrix, all columns are used. |
A 3D array with dimensions (n, p, t) where n is the number of observations, p is the number of clustering variables, and t is the number of time points (defaults to 1 for cross-sectional data).
# Example 1: Data frame with variable selection
data <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
cluster.vars <- c("x1", "x2", "x3")
data.pre <- init.data(data, cluster.vars)
# Example 2: Matrix (uses all columns)
data_matrix <- matrix(rnorm(300), nrow = 100, ncol = 3)
data.pre <- init.data(data_matrix)
# Example 3: 3D array (returns as-is)
data_3d <- array(rnorm(1500), dim = c(100, 3, 5))
data.pre <- init.data(data_3d) # Returns unchanged
Returns a list of hyperparameters for the Omegas and mus.
initialize.cluster.hyperparms(data, model.attributes, covariance = FALSE, diag.Psi = FALSE, vague.mu = FALSE, zero.mu = FALSE, seed = NULL)
data |
The dataset. |
model.attributes |
Model attributes generated by initialize.model.attributes |
covariance |
Use of covariance matrix of common factors. If FALSE, it uses the correlation matrix. |
diag.Psi |
Diagonal matrix for cluster covariance. If FALSE, it uses the sample covariance. |
vague.mu |
Use of large cluster covariance prior. |
zero.mu |
Set the cluster mean prior at 0. If FALSE, the cluster mean priors are the sample means of the clusters. |
seed |
Optional integer seed for reproducibility. |
A list containing the mean and variance hyperparameters of mu and the scale hyperparameter of Omega.
Returns a list of hyperparameters for Omega, sigma^2, and mu. It also incorporates the results from cluster.hyperparms and information from model.attributes.
initialize.hyp.parm(model.attributes, cluster.hyperparms, n.sigma = 2.2, n.s2.sigma = 0.1, n.tau = 1, n.s2.tau = 1, omega.diag.nu = 2, p.exponent = NA)
model.attributes |
Model attributes generated by initialize.model.attributes |
cluster.hyperparms |
Cluster related hyperparameters generated by initialize.cluster.hyperparms |
n.sigma |
The shape parameter of sigma^2. Default is 2.2. |
n.s2.sigma |
The rate parameter of sigma^2. Default is 0.1. |
n.tau |
The shape parameter of tau^2. Default is 1. |
n.s2.tau |
The rate parameter of tau^2. Default is 1. |
omega.diag.nu |
The shape parameter for the first cluster covariance |
p.exponent |
Exponent of the Dirichlet prior on the cluster probabilities. |
A list of fixed hyperparameters of mu, Omega, and sigma squared.
Basic setting and information of the dataset: number of subjects, timepoints, variables, factors and groups.
initialize.model.attributes(S = 216, times = 5, R = 9, L = 3, G = 4)
S |
Number of subjects |
times |
Number of timepoints |
R |
Number of variables |
L |
Number of factors |
G |
Number of groups |
A list of model attributes
Finds the permutation vector that reorders the variables according to the largest absolute entry of each eigenvector. The first L positions are set this way, one per factor; the remaining variables keep their original order.
permutation.order(data, covariance = FALSE, L = 1, fa = TRUE)
data |
The dataset |
covariance |
Logical indicating whether the analysis uses the covariance matrix (TRUE) or the correlation matrix (FALSE) |
L |
Number of latent factors |
fa |
Use factor analysis to sort the variables |
The vector of permutation
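One way to read the ordering rule described above is sketched in base R below. It is an interpretation, not the package's code: the names `lead` and `perm` are hypothetical, the correlation matrix is used (as with covariance = FALSE), and the behaviour of the `fa = TRUE` option is not reproduced.

```r
set.seed(1)
X <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)   # simulated data, 5 variables
L <- 2                                              # number of latent factors

# Eigenvectors of the correlation matrix, one column per component.
vecs <- eigen(cor(X), symmetric = TRUE)$vectors

# For each of the first L eigenvectors, pick the variable with the largest
# absolute entry that has not been picked yet.
lead <- integer(0)
for (l in seq_len(L)) {
  score <- abs(vecs[, l])
  score[lead] <- -Inf            # exclude variables already chosen
  lead <- c(lead, which.max(score))
}

# Leading variables first, all remaining variables in their original order.
perm <- c(lead, setdiff(seq_len(ncol(X)), lead))
perm
```

Applying `X[, perm]` then places one strongly loading variable per factor in the first L columns.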
Finds the permutation vector that reorders the variables according to the largest absolute entry of each eigenvector. The first positions are set this way, one per factor; the remaining variables keep their original order. The data are then permuted and, if needed, scaled.
permutation.scale(data, permutation = NA, covariance = FALSE, return.array = TRUE, num.layers = 1)
data |
The dataset |
permutation |
Vector with the permutation of the data |
covariance |
Logical indicating whether the analysis uses the covariance matrix (TRUE) or the correlation matrix (FALSE) |
return.array |
Return the data as 3-dimensional array |
num.layers |
Number of timepoints |
The permuted dataset, returned as either a matrix or an array.
A simulated dataset for demonstrating the Bayesian Clustering Factor Model.
sim.data
A data frame with 200 rows and 20 variables (V1 through V20). All variables are simulated numeric values.
Simulated data generated for package examples
data(sim.data)
dim(sim.data)
head(sim.data)