Supporting Tools
Beyond the core curation workflow, Latch Curate provides supporting tools that help curators manage object consistency, interoperability, version control, project management, search, and data delivery.
Linting and Conversion
Linting Workflow
To ensure a consistent object structure, we developed a linting workflow that quickly verifies every task. This tool:
- Ensures count-construction validation criteria pass
- Verifies all controlled variables use a restricted term set
- Catches stray errors early in large ingestion projects
- Saves teams considerable time on quality assurance
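The checks listed above can be sketched in a few lines. This is a minimal illustration, not Latch Curate's actual linter: the function names `lint_obs` and `lint_counts`, the column names, the permitted terms, and the specific count rule (every value is a non-negative whole number) are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Iterable, Mapping

# Hypothetical restricted vocabulary; the real term sets are not given here.
ALLOWED_TERMS: dict[str, set[str]] = {
    "sex": {"female", "male", "unknown"},
    "sample_site": {"blood", "tumor", "healthy"},
}


def lint_obs(
    obs: Mapping[str, Iterable[str]],
    allowed: Mapping[str, set[str]] = ALLOWED_TERMS,
) -> list[str]:
    """Return one message per controlled column that is missing or off-vocabulary."""
    problems = []
    for column, terms in allowed.items():
        if column not in obs:
            problems.append(f"missing controlled column: {column}")
            continue
        unexpected = sorted(set(obs[column]) - terms)
        if unexpected:
            problems.append(f"{column}: unexpected terms {unexpected}")
    return problems


def lint_counts(rows: Iterable[Iterable[float]]) -> list[str]:
    """Assumed count rule: every value is a non-negative whole number."""
    problems = []
    for i, row in enumerate(rows):
        if any(v < 0 or not float(v).is_integer() for v in row):
            problems.append(f"row {i}: counts must be non-negative integers")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a single pass report every problem in an object, which is what makes this kind of check useful across a large ingestion project.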
Format Conversion
Many computational biologists prefer Seurat to Scanpy. Because reliable conversion libraries were lacking, we implemented a library that converts AnnData objects to Seurat in pure R by reading the relevant slots from .h5ad files directly on disk, avoiding approaches that embed Python interpreters inside R sessions.
Access the conversion workflow: AnnData To Seurat Conversion Workflow
Version Control and Reproducibility
Curated datasets are living assets, and new computational tools or updated scientific knowledge often require re-processing previously curated objects.
Asset Management
Each task outputs assets into directories that can be uploaded to version-controlled blob stores:
- Driver scripts
- JSON configuration files
- Agent logs
- Validation reports
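One way to make such a directory easy to version is to record a content hash for every asset before upload. The sketch below is an assumption about how this could be done, not a description of Latch Curate's implementation; `build_manifest` and `changed_assets` are hypothetical helper names, and the upload step itself is left out.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def build_manifest(task_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to task_dir) to its SHA-256 digest."""
    manifest: dict[str, str] = {}
    for path in sorted(task_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(task_dir).as_posix()
            manifest[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def changed_assets(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Paths that were added, removed, or modified between two manifests."""
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))
```

Comparing the manifest of a re-processed task against the previous one shows at a glance which driver scripts, configuration files, logs, or reports actually changed.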
Workflow Reproducibility
Because the agentic workflow runs inside a versioned container with input data mounted to a sandboxed file system at well-defined locations, rerunning these workflows with modified inputs or parameters is straightforward. Key features:
- Versioned containers ensure a consistent execution environment
- Fixed input/output paths enable reliable re-runs
- Comprehensive logging tracks all processing decisions
- Parameter files allow exact reproduction of previous runs
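A parameter file of this kind can be modeled as a small, serializable record. The following is a hedged sketch only: the `RunSpec` class, its field names, the mount paths `/data/input` and `/data/output`, and the image tag format are all hypothetical and chosen for illustration.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import asdict, dataclass, field, replace


@dataclass(frozen=True)
class RunSpec:
    container_image: str                # pinned image tag or digest
    input_dir: str = "/data/input"      # hypothetical fixed mount point
    output_dir: str = "/data/output"    # hypothetical fixed mount point
    params: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # sort_keys gives a stable serialization, so equal specs hash equally.
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "RunSpec":
        return cls(**json.loads(text))

    def fingerprint(self) -> str:
        """Short identifier that changes whenever any field changes."""
        return hashlib.sha256(self.to_json().encode()).hexdigest()[:12]

    def with_params(self, **overrides) -> "RunSpec":
        """Copy of this spec with some parameters changed, for a re-run."""
        return replace(self, params={**self.params, **overrides})
```

Storing the serialized spec next to the task outputs means a previous run can be reproduced by loading the file unchanged, while `with_params` yields a modified re-run whose fingerprint differs from the original.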
Data Portal for Project Management
As curated data accumulate, project management becomes critical. We built a data portal that:
Storage and Indexing
- Stores curated H5AD files
- Indexes the metadata generated during curation
- Enables search and filtering by metadata fields
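Search and filtering over indexed metadata can be pictured as matching key-value criteria against one record per curated file. This is an illustrative sketch under assumed names: the record fields (`path`, `indication`, `organism`) and the function `filter_datasets` are hypothetical and do not reflect the portal's actual schema or API.

```python
from __future__ import annotations

from typing import Any


def filter_datasets(index: list[dict[str, Any]], **criteria: Any) -> list[dict[str, Any]]:
    """Return the records whose metadata match every key=value criterion."""
    return [
        record
        for record in index
        if all(record.get(key) == value for key, value in criteria.items())
    ]


# Hypothetical index: one metadata record per curated H5AD file.
INDEX = [
    {"path": "study_a.h5ad", "indication": "melanoma", "organism": "human"},
    {"path": "study_b.h5ad", "indication": "melanoma", "organism": "mouse"},
    {"path": "study_c.h5ad", "indication": "lupus", "organism": "human"},
]

human_melanoma = filter_datasets(INDEX, indication="melanoma", organism="human")
```

Here `human_melanoma` contains only the record for `study_a.h5ad`; calling the function with no criteria returns every record.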
Project Organization
- Supports internal project organization
- Manages dataset collections by indication or study type
- Tracks curation progress across teams
Data Distribution
- Delivers curated data to external teams or partners
- Provides secure access controls
- Maintains audit trails for data usage
Integration with Latch Platform
Latch Data Integration
- Direct upload of curated objects to Latch Data
- Automatic organization of outputs by project
- Version tracking for all curated datasets
Workflow Integration
- Seamless connection to downstream analysis workflows
- Automatic triggering of follow-up analyses
- Support for batch processing of multiple datasets
Collaboration Features
- Share curated datasets within workspaces
- Collaborative review of curation reports
- Team-based quality control processes