core

Given an interaction_model, this returns a new interaction_model that is the "k-core" of the "largest connected component" of the original interaction_model. This function is recommended when diagnose(im) shows that the majority of rows/columns have 1, 2, or 3 connections. In this case, the data is potentially too sparse for pca.

If you simply throw away the rows/columns that are weakly connected, then you will reduce the connections of those that remain. The k-core is what you get if you keep iterating until nothing more is removed. In particular, it finds the largest subset of rows and columns from the interaction_model such that every row and column has at least core_threshold connections, or "data points", in interaction_tibble.

This is exactly the k-core if the rows and columns correspond to unique, non-overlapping elements. If the elements in the rows match some elements in the columns, then those elements are represented twice: once for the row and once for the column. It is possible that only one of those is retained.
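The iteration described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper prune_to_core and the edges data frame (one line per row/column connection) are hypothetical names used only for illustration, and the "largest connected component" step is omitted.

```r
# Hypothetical helper, not part of the package: repeatedly drop connections
# whose row or column has fewer than `threshold` connections, until every
# remaining row and column meets the threshold.
prune_to_core <- function(edges, threshold = 3) {
  repeat {
    # Number of connections of the row / column behind each edge
    row_deg <- as.vector(table(edges$row)[as.character(edges$row)])
    col_deg <- as.vector(table(edges$col)[as.character(edges$col)])
    keep <- row_deg >= threshold & col_deg >= threshold
    if (all(keep)) break          # nothing left to remove: this is the core
    edges <- edges[keep, , drop = FALSE]
  }
  edges
}

# Toy data: row "c" starts with 2 connections but loses them as "d" and
# then column "z" are pruned away.
edges <- data.frame(
  row = c("a", "a", "b", "b", "c", "c", "d"),
  col = c("x", "y", "x", "y", "y", "z", "z"),
  stringsAsFactors = FALSE
)

core_edges <- prune_to_core(edges, threshold = 2)
print(core_edges)
```

In this toy example a single pass would only remove "d", which has one connection. Removing "d" leaves column "z" with one connection, and removing "z" leaves row "c" with one connection, so only rows "a" and "b" and columns "x" and "y" survive the repeated pruning.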

Usage

core(im_input, core_threshold = 3)

Arguments

core_threshold

An integer value that sets the minimum number of connections a row or column must have to be included in the final interaction model. Defaults to 3.

im

The interaction_model to be cleaned. The Usage signature above names this argument im_input.

Value

Returns a modified version of the input interaction model (of the same type as im), which represents the k-core of the largest connected component based on the specified core_threshold. This cleaned interaction model will only include rows and columns that meet the minimum number of connections defined by core_threshold.