Overview

One of the main requirements of Nextflow workflows is a shared, POSIX-compliant file system accessible to all workflow tasks. All input files are downloaded and staged into a “workdir,” a directory on a mounted shared filesystem. Workflow tasks read these files as part of their computation and write intermediate or output files back to the workdir.

A shared filesystem is required because workflow inputs can be extremely large, often multiple terabytes, and tasks must share files even when they are scheduled on different nodes in the cluster. A shared filesystem also lets tasks write large amounts of data without each task having to request large local storage resources up front.

Latch provides two options for shared storage when running Nextflow workflows: EFS and ObjectiveFS.

EFS

EFS (Amazon Elastic File System) is a shared file system offered by AWS with nearly unlimited storage capacity and high throughput. EFS is mounted into every Nextflow process and can be accessed like any other directory.

EFS scales automatically as storage needs grow, so it can support both small and large workloads without performance degradation. EFS also provides strong data consistency and file locking, which are requirements for Nextflow shared file systems.

OFS

ObjectiveFS (OFS) is a serverless shared filesystem that uses AWS S3 as its storage layer. Its architecture differs from EFS: file operations are processed directly on the host rather than on a set of central servers.

The OFS filesystem is POSIX-compliant and can scale up to 1 PB of data. OFS provides read-and-write consistency guarantees and the same durability and availability guarantees as AWS S3, while offering high read-and-write performance and enforced data encryption.

Usage Examples

By default, all new workflows generated with latch init use OFS as the underlying storage. Follow the Nextflow Tutorial to generate a Nextflow project on Latch.

To configure your filesystem, change the initialize function in the wf/entrypoint.py file.

  1. The initialize step of the workflow provisions a shared filesystem. Here you can configure which filesystem you want to use for your workflow:
# imports used by this snippet (defined at the top of wf/entrypoint.py)
import os

import requests
from latch.resources.tasks import custom_task


@custom_task(cpu=0.25, memory=0.5, storage_gib=1)
def initialize() -> str:
    """
    Initialize the workflow by provisioning a shared storage volume.

    This function requests a shared storage volume from the Nextflow dispatcher service
    and returns the name of the provisioned volume.

    Returns:
        str: The name of the provisioned storage volume.

    Raises:
        RuntimeError: If the execution token is not available.
    """
    token = os.environ.get("FLYTE_INTERNAL_EXECUTION_ID")
    if token is None:
        raise RuntimeError("failed to get execution token")

    headers = {"Authorization": f"Latch-Execution-Token {token}"}

    print("Provisioning shared storage volume... ", end="")
    resp = requests.post(
        ### CHANGE THE URL HERE TO PROVISION OFS OR EFS
        "http://nf-dispatcher-service.flyte.svc.cluster.local/provision-storage-ofs",
        headers=headers,
        json={
            "version": 2,
            "storage_expiration_hours": 100,  # storage will expire 100 hours after the start of the execution
            # "fs_size_tb": 10,  # OFS only: expected file system size, used to provision more memory for tasks
        },
    )
    resp.raise_for_status()
    print("Done.")

    return resp.json()["name"]
  2. In the request above, use http://nf-dispatcher-service.flyte.svc.cluster.local/provision-storage-ofs as the URL to provision OFS storage, or http://nf-dispatcher-service.flyte.svc.cluster.local/provision-storage-efs to provision EFS storage.

  3. You can specify several parameters to configure the shared filesystem for your workload; example request bodies are shown after this list.

  • The storage_expiration_hours option specifies when the data in your storage is cleaned up. Set it to 0 to delete the storage after the execution completes, or to a non-zero value to keep the storage for relaunching.
    • EFS storage expiration defaults to 0 hours.
    • OFS storage expiration defaults to 30 days.
  • The version option specifies the version of the Nextflow integration to use. It should be set to 2 for new workflows unless you are running a legacy workflow built on version 1 of the integration.
  • fs_size_tb (OFS only) - the approximate expected filesystem size in TB. OFS requires the filesystem size to be specified ahead of time so that extra memory can be provisioned for each task. See OFS task memory requirement for more details.
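
For example (a sketch based on the parameters above; the values are illustrative, not defaults), the request body in initialize could be changed as follows to provision short-lived EFS storage or an OFS filesystem sized for roughly 10 TB of data:

resp = requests.post(
    # EFS: delete the storage as soon as the execution completes
    "http://nf-dispatcher-service.flyte.svc.cluster.local/provision-storage-efs",
    headers=headers,
    json={
        "version": 2,
        "storage_expiration_hours": 0,
    },
)

resp = requests.post(
    # OFS: keep the storage for 30 days and size it for ~10 TB of data
    "http://nf-dispatcher-service.flyte.svc.cluster.local/provision-storage-ofs",
    headers=headers,
    json={
        "version": 2,
        "storage_expiration_hours": 720,  # 30 days, for relaunching
        "fs_size_tb": 10,  # OFS only: provisions extra task memory (see OFS task memory requirement)
    },
)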

OFS task memory requirement

Unlike EFS, OFS is not an NFS filesystem and does not use an external server to process file requests. OFS runs as a FUSE process on every node in the cluster and mounts the filesystem into every workflow task. OFS uses node memory to store the filesystem index and a local cache that speed up file operations, so each workflow task must request extra memory to account for OFS memory usage.

The fs_size_tb parameter specifies the approximate storage size of your filesystem, which determines the extra memory requirement for each workflow task. The memory requirements by filesystem size are:

Filesystem Size (TB)    Additional Task Memory Request (GB)
0-1                     2
2-10                    3
11-20                   4
21-30                   5
31-40                   6
41-50                   7

You can approximate the size of your filesystem by adding up all input file sizes and multiplying by 2 to account for intermediate files.
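
As a quick sketch of this estimate (the input sizes below are made-up example values):

import math

# Hypothetical input sizes for a workflow, in TB.
input_sizes_tb = [1.5, 2.0, 0.5]

# Double the total input size to account for intermediate files.
estimated_fs_size_tb = math.ceil(sum(input_sizes_tb) * 2)

# 4 TB of inputs -> fs_size_tb=8, which falls in the 2-10 TB bracket,
# so every workflow task requests an extra 3 GB of memory.
print(estimated_fs_size_tb)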

Comparison

Cost

The main differences between EFS and OFS are their cost models and underlying storage layers.

The EFS pricing model includes charges for data storage and data access. OFS uses S3 as its storage layer, so its pricing model includes charges for mounting OFS filesystems, S3 storage, and the additional RAM provisioned for tasks.

                                        EFS       ObjectiveFS
Storage ($/GB/month)                    0.30      0.023
Throughput Reads ($/GB transferred)     0.03      N/A
Throughput Writes ($/GB transferred)    0.06      N/A
Mount Cost ($/mount/hr)                 N/A       0.18
RAM Cost ($/GiB/hr)                     N/A       0.009972
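
As a rough illustration of how the two cost models differ, consider a hypothetical workload (all numbers below are assumptions for the sake of the example): 1 TB of data stored in the workdir for one month, read once and written once, with 10 nodes mounting the filesystem for 24 hours and 3 GiB of extra RAM per node for OFS.

gb_stored = 1000       # 1 TB kept in the workdir for one month
mount_hours = 10 * 24  # 10 nodes, each mounting the filesystem for 24 hours

efs_cost = (
    gb_stored * 0.30    # storage, $/GB/month
    + gb_stored * 0.03  # throughput reads, $/GB transferred
    + gb_stored * 0.06  # throughput writes, $/GB transferred
)  # -> $390.00

ofs_cost = (
    gb_stored * 0.023             # S3 storage, $/GB/month
    + mount_hours * 0.18          # mount cost, $/mount/hr
    + mount_hours * 3 * 0.009972  # 3 GiB of extra RAM per node, $/GiB/hr
)  # -> ~$73.38

print(f"EFS: ${efs_cost:.2f}, OFS: ${ofs_cost:.2f}")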

Performance

OFS has better overall performance on single-mount benchmarks. However, EFS throughput is higher in a Nextflow environment with many tasks reading from and writing to the same file system, due to slow distributed file locking in OFS. Both systems perform well on common Nextflow workloads.

                                          EFS       ObjectiveFS
Sequential Read (MB/s)                    92.55     122.23
Sequential Write (MB/s)                   124.20    125.07
Random Read (MB/s)                        58.39     77.79
Random Write (MB/s)                       73.69     87.90
Staging to workdir from LData (MB/s)      212.31    188.40
Writing to LData from workdir (MB/s)      246.64    208.74

Notes:

  • For the sequential read/write benchmarks, we measured the time to copy a 1 GB file to and from the file system.
  • For the random read/write benchmarks, we measured the time to copy 1 GB in randomly chosen 1 MB chunks to and from the file system.
  • For the staging benchmarks, we measured the total time to transfer a file between LData and the filesystem in each direction.
  • OFS benchmarks were performed with a pre-warmed cache.

Choosing a Shared Storage Option

The right storage option depends on your workload and budget requirements. Here is a summary of the file system comparison:

                    EFS    ObjectiveFS
Cost                +      +++
Performance         ++     +++
Execution time      ++     +

EFS:

Pros:

  • Stable high throughput on Nextflow workloads

Cons:

  • Expensive. The throughput and storage costs of EFS can be very significant depending on the input size and the workload.
  • Storing data for relaunch can be expensive

OFS:

Pros:

  • Lower cost due to using S3. Does not have throughput charges.
  • Good performance with pre-warmed cache

Cons:

  • Throughput can vary more on Nextflow workloads
  • Requires all tasks to provision extra memory

Summary

For most workloads, OFS is the better, more cost-effective option. It performs well on most workloads and makes experimentation cheaper. Executions are usually bound by the CPU time spent processing inputs, so the lower file system throughput in a Nextflow environment does not significantly impact workflow performance for most workloads.