Supporting Tools

Beyond the core curation workflow, Latch Curate provides supporting tools for object consistency, interoperability, version control, project management, search, and data delivery.

Linting and Conversion

Linting Workflow

To ensure a consistent object structure, we developed a linting workflow that quickly verifies every task. This tool:
  • Checks that the count-construction validation criteria pass
  • Verifies that all controlled variables use a restricted term set
  • Catches stray errors early in large ingestion projects
  • Saves teams considerable time on quality assurance

Access the linting workflow: Lint Curated AnnData Workflow
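
The two structural checks listed above can be illustrated with a minimal sketch. This is not the Latch linting workflow itself: the term sets, the column names, the `lint_object` function, and the specific count criterion (raw counts are non-negative integers) are all assumptions made for illustration, and the object is modeled with plain Python containers rather than an AnnData instance.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary. The real restricted term sets are
# not given in this document.
ALLOWED_TERMS = {
    "sex": {"male", "female", "unknown"},
    "sample_type": {"tissue", "cell line", "organoid"},
}


@dataclass
class LintReport:
    errors: list = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return not self.errors


def lint_object(obs: dict, counts: list) -> LintReport:
    """Lint a toy object: `obs` maps column name -> values, `counts` is a row-major matrix."""
    report = LintReport()

    # Check 1: every controlled variable exists and uses only allowed terms.
    for column, allowed in ALLOWED_TERMS.items():
        if column not in obs:
            report.errors.append(f"missing controlled variable: {column}")
            continue
        unexpected = sorted(set(obs[column]) - allowed)
        if unexpected:
            report.errors.append(f"{column}: terms outside restricted set: {unexpected}")

    # Check 2 (assumed criterion): raw counts are non-negative integers.
    for i, row in enumerate(counts):
        if any(v < 0 or not float(v).is_integer() for v in row):
            report.errors.append(f"row {i}: counts must be non-negative integers")

    return report
```

Collecting every error into one report, rather than stopping at the first failure, fits the goal of catching stray errors early across a large ingestion project.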

Format Conversion

Many computational biologists prefer Seurat to Scanpy. Because reliable conversion libraries were lacking, we implemented a library that converts AnnData objects to Seurat in pure R. It reads the relevant slots from .h5ad files directly on disk, which avoids embedding a Python interpreter inside the R session.

Access the conversion workflow: AnnData To Seurat Conversion Workflow

Version Control and Reproducibility

Curated datasets are living assets, and new computational tools or updated scientific knowledge often require re-processing previously curated objects.

Asset Management

Each task outputs assets into directories that can be uploaded to version-controlled blob stores:
  • Driver scripts
  • JSON configuration files
  • Agent logs
  • Validation reports
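
One way to make such an output directory easy to track in a version-controlled blob store is to record a content hash for every asset. The sketch below assumes a per-task directory on local disk. The `manifest.json` file name and both function names are hypothetical and are not part of Latch Curate.

```python
import hashlib
import json
from pathlib import Path


def build_manifest(task_dir: Path) -> dict:
    """Map each file under task_dir (relative POSIX path) to its SHA-256 digest."""
    manifest = {}
    for path in sorted(task_dir.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(task_dir).as_posix()] = digest
    return manifest


def write_manifest(task_dir: Path, manifest_name: str = "manifest.json") -> Path:
    """Write the manifest next to the assets and return its path."""
    manifest = build_manifest(task_dir)
    # Skip a manifest left over from a previous run so reruns stay stable.
    manifest.pop(manifest_name, None)
    out = task_dir / manifest_name
    out.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return out
```

Comparing two manifests shows exactly which driver scripts, configuration files, logs, or reports changed between uploads.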

Workflow Reproducibility

Because the agentic workflow runs inside a versioned container, with input data mounted to a sandboxed file system at well-defined locations, rerunning a workflow with modified inputs or parameters is straightforward. Key features:
  • Versioned containers ensure a consistent execution environment
  • Fixed input/output paths enable reliable re-runs
  • Comprehensive logging tracks all processing decisions
  • Parameter files allow exact reproduction of previous runs
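
To illustrate how a parameter file plus a container version can pin down a run, here is a minimal sketch. It assumes parameters are stored as a flat JSON-serializable dictionary. The function names, the parameter keys in the usage note, and the version strings are hypothetical.

```python
import hashlib
import json


def run_fingerprint(container_version: str, params: dict) -> str:
    """Return a stable hash identifying a (container version, parameters) pair."""
    # Sorting keys makes the fingerprint independent of key order in the file.
    canonical = json.dumps(
        {"container": container_version, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def with_overrides(params: dict, overrides: dict) -> dict:
    """Return a copy of params with selected values replaced for a rerun."""
    return {**params, **overrides}
```

Two runs with the same fingerprint used the same container and parameters, so a rerun built with something like `with_overrides(params, {"min_genes": 500})` is recognizably a new run and does not silently replace the old one.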

Data Portal for Project Management

As curated data accumulate, project management becomes critical. We built a data portal that covers storage and indexing, project organization, and data distribution.

Storage and Indexing

  • Stores curated H5AD files
  • Indexes the metadata generated during curation
  • Enables search and filtering by metadata fields
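
Search and filtering over indexed metadata can be pictured as exact-match filtering over per-dataset records. The sketch below is an in-memory illustration only, not the portal's API: the `DatasetRecord` class, the `search` function, and the metadata fields in the usage note are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    path: str  # location of the curated H5AD file
    metadata: dict = field(default_factory=dict)  # fields produced during curation


def search(records: list, **filters) -> list:
    """Return the records whose metadata matches every filter exactly."""
    return [
        record
        for record in records
        if all(record.metadata.get(key) == value for key, value in filters.items())
    ]
```

For example, `search(records, organism="human", tissue="lung")` would return only the records whose metadata carries both of those values. A record that lacks a filtered field never matches.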

Project Organization

  • Supports internal project organization
  • Manages dataset collections by indication or study type
  • Tracks curation progress across teams

Data Distribution

  • Delivers curated data to external teams or partners
  • Provides secure access controls
  • Maintains audit trails for data usage

Integration with Latch Platform

Latch Data Integration

  • Direct upload of curated objects to Latch Data
  • Automatic organization of outputs by project
  • Version tracking for all curated datasets

Workflow Integration

  • Seamless connection to downstream analysis workflows
  • Automatic triggering of follow-up analyses
  • Support for batch processing of multiple datasets

Collaboration Features

  • Share curated datasets within workspaces
  • Collaborative review of curation reports
  • Team-based quality control processes