Builds and caches the document-level link table, document-term table, sparse document-term matrix, and summary metadata used by Module 3 topic modeling.

Usage

module3_construct_docs(
  filtered_dir,
  output_dir,
  tf_cluster_map = NULL,
  check_repeated_values = FALSE,
  ...
)

Arguments

filtered_dir

Directory containing Module 3 filtered differential-link CSV files.

output_dir

Directory where topic input caches are written.

tf_cluster_map

Optional named vector mapping TF names to motif clusters. Defaults to `NULL`.
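
As an illustration of the expected shape, here is a minimal sketch of a named vector in base R. It assumes that the vector names are the TF names and the values are the motif-cluster labels; the source does not state which way round the mapping goes, and the TF and cluster names below are hypothetical placeholders.

```r
# Hypothetical TF names and cluster labels, for illustration only.
# Assumption: names are TF names, values are motif-cluster labels.
tf_cluster_map <- c(
  TF_A = "cluster_1",
  TF_B = "cluster_1",
  TF_C = "cluster_2"
)

# Look up the cluster assigned to a single TF.
tf_cluster_map[["TF_A"]]

# List the TFs that belong to each cluster.
split(names(tf_cluster_map), tf_cluster_map)
```

A named character vector keeps the lookup to a single subsetting call and can be built from a two-column table with `setNames()`.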

check_repeated_values

Logical; if `TRUE`, warn when a term is repeated with inconsistent values. The default, `FALSE`, is intended for high-throughput runs; set to `TRUE` for diagnostic audits.

...

Additional topic-document construction arguments passed to the internal Module 3 document builder.

Value

A list containing the cache file paths and summary counts of the inputs.
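
A call might look like the following sketch. The directory paths are hypothetical, and the element names of the returned list are not documented here, so the example inspects the result with `str()` rather than assuming any particular field. The call is guarded so that the snippet does nothing when the package is not loaded or the input directory does not exist.

```r
# Hypothetical locations; replace with the directories used in your project.
filtered_dir <- file.path("results", "module3", "filtered")
output_dir <- file.path("results", "module3", "topic_inputs")

# Run only if the function is available and the input directory exists.
if (exists("module3_construct_docs") && dir.exists(filtered_dir)) {
  docs <- module3_construct_docs(
    filtered_dir = filtered_dir,
    output_dir = output_dir,
    check_repeated_values = TRUE  # enable the diagnostic audit
  )

  # Inspect the top level of the returned list (cache paths and counts).
  str(docs, max.level = 1)
}
```

Setting `check_repeated_values = TRUE` is worthwhile on a first run over new data; the default `FALSE` can be restored once the inputs are known to be consistent.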