Version: 1.1.1
Date: 2018-01-08

Scientific Filesystem



Here we present the Scientific Filesystem (SCIF), an organizational format that exposes executables and metadata so that they are discoverable. The format includes a known filesystem structure, a definition of a set of environment variables that describe it, and functions for generating those variables and interacting with the libraries, metadata, and executables located within.

Although SCIF is not exclusively for containers, it works optimally when contained, because a container provides an encapsulated, reproducible environment. Containers traditionally have one entry point, one environment context, and one set of labels to describe them. A container created with a Scientific Filesystem can expose multiple entry points, each of which includes its own environment, metadata, installation steps, tests, files, and a primary executable script. SCIF thus brings internal modularity and programmatic accessibility to encapsulated, reproducible environments.
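To make the idea of "a known filesystem structure plus a set of environment variables per entry point" concrete, here is a minimal sketch in Python. It is not the SCIF client or its API. The base path `/scif`, the folder names, the variable names, and the helper functions `app_environment` and `all_environments` are all assumptions chosen for illustration; consult the SCIF specification for the actual layout and variable definitions.

```python
from pathlib import PurePosixPath

# Assumed base location of the filesystem (illustrative, not taken from the text).
SCIF_BASE = PurePosixPath("/scif")


def app_environment(name, base=SCIF_BASE):
    """Derive a set of environment variables for one app (entry point).

    Every path is computed from the base and the app name alone, which is
    what makes a "known" structure predictable for any program reading it.
    The variable names here are hypothetical placeholders.
    """
    approot = base / "apps" / name
    return {
        "SCIF_APPNAME": name,
        "SCIF_APPROOT": str(approot),
        "SCIF_APPBIN": str(approot / "bin"),    # executables for this app
        "SCIF_APPLIB": str(approot / "lib"),    # libraries for this app
        "SCIF_APPMETA": str(approot / "scif"),  # metadata for this app
        "SCIF_APPDATA": str(base / "data" / name),
    }


def all_environments(names, base=SCIF_BASE):
    """Build one independent environment per app name."""
    return {name: app_environment(name, base) for name in names}


if __name__ == "__main__":
    # Two hypothetical apps living side by side in the same container.
    for app, env in all_environments(["hello-world", "analysis"]).items():
        print(app)
        for key, value in sorted(env.items()):
            print(f"  {key}={value}")
```

Because each app's variables are derived only from its name and the shared base, a tool can discover and address any entry point without knowing anything else about the container's contents.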

What will I learn reading this?

We will start by reviewing the background and rationale for a scientific organizational format, and how SCIF achieves the goals of modularity, transparency, and consistency. We then review the organizational structure of the standard, and the different levels of internal modules that it affords. For this work, we provide several tutorials that demonstrate using the Scientific Filesystem with Docker and Singularity, and we have additionally implemented and released the organizational format as a native integration with the Singularity software. Finally, we discuss use cases for SCIF in the context of containers, including how SCIF can be used to evaluate software, provide metrics, serve scientific workflows, and execute a primary function under different contexts. To encourage collaboration and sharing of apps, we have developed an open-source, version-controlled, tested, and programmatically accessible web infrastructure at https://sci-f.github.io/apps. For developers, we provide a getting started guide for integrating SCIF into other container technologies or contexts. Because SCIF makes scientific containers easy to develop, it offers scientists a way to generate self-documenting containers that are programmatically parseable, exposing software and its associated metadata, environments, and files so that they can be quickly found and used.

Getting Started

Resources

We have provided several examples and tutorials for getting started with SCIF. If you have a workflow or container that you’d like to see added, please reach out. If you would like to see other ways to contribute, here are some suggestions. This work will remain open for contributions, and early contributions will be represented in an official submission.

Citation

If SCIF has been useful to you, please cite our work in GigaScience!

Vanessa Sochat; The Scientific Filesystem (SCIF), GigaScience, giy023,
https://doi.org/10.1093/gigascience/giy023