File Systems and Storage

In order to exploit non-volatile memory (NVM), NextgenIO makes use of the GekkoFS file system, which can be used by the Data Scheduler for stage-in and stage-out operations.

dataClay provides object storage and can be used directly by applications written in object-oriented code.

NVDIMM partitions reserved for App Direct mode require an interface that gives applications Direct Access (DAX) to the memory. This interface can take the form of a DAX-enabled file system mounted on the memory (FSDAX) or of Device DAX (DevDAX).

Direct Access (DAX)

FSDAX

Several file systems enable DAX access to the NVM for applications, such as ext4 and XFS. On NextgenIO the file system used for FSDAX is ext4.

Once the file system is mounted on the NVM App Direct partition (i.e. the namespace in the NVM reserved for direct access), applications can access the memory in the same manner as in a traditional file system.
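As a minimal sketch of this access pattern, the Python snippet below memory-maps a file on a DAX-mounted ext4 file system and writes to it directly. The mount point /mnt/pmem and the file name are assumptions for illustration; the mount command in the comment is one common way to enable the dax option.

    import mmap
    import os

    # Assumed mount point for the ext4 file system created on the fsdax
    # namespace, e.g. mounted with "mount -o dax /dev/pmem0 /mnt/pmem".
    PMEM_FILE = "/mnt/pmem/example.dat"
    SIZE = 4096

    # Create and size the file backing the mapping.
    fd = os.open(PMEM_FILE, os.O_CREAT | os.O_RDWR)
    os.ftruncate(fd, SIZE)

    # With a DAX mount, this mapping bypasses the page cache: loads and
    # stores go directly to the NVM media.
    with mmap.mmap(fd, SIZE) as buf:
        buf[0:13] = b"hello, pmem!\n"
    os.close(fd)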

EchoFS and GekkoFS make use of file-system-enabled DAX. Device DAX is technically possible for EchoFS, but this feature is currently not enabled, as the file system relies on FUSE to access the memory. DevDAX support for GekkoFS may also be implemented in the future.

DevDAX

Device Direct Access means that an application can perform byte-level operations on the NVM without the intervention of a file system. In contrast to FSDAX, which provides a block device that can support a DAX-enabled file system, this mode exposes a single character device file (/dev/daxX.Y). For more details see the pmem.io website.
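A hedged sketch of DevDAX access from Python follows. It assumes a device /dev/dax0.0 whose alignment is 2 MiB; mappings of a devdax character device must be a multiple of that alignment, and the actual device name and alignment on a given system should be checked (e.g. with ndctl).

    import mmap
    import os

    # Assumed device alignment of 2 MiB; mapping length must be a
    # multiple of it.
    ALIGN = 2 * 1024 * 1024

    fd = os.open("/dev/dax0.0", os.O_RDWR)
    # No file system is involved: the mapping exposes the raw NVM range
    # and loads/stores operate on it at byte granularity.
    with mmap.mmap(fd, ALIGN) as buf:
        buf[0:5] = b"hello"
    os.close(fd)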

GekkoFS

This file system allows for POSIX-like NVM storage operations and acts as an ad-hoc file system for the lifetime of a single batch job. POSIX compliance has been tested with OpenFOAM and Python applications [1]. GekkoFS can operate as a burst buffer, performing stage-in and stage-out data transfers for the scheduled batch job. In these respects it is similar to EchoFS.

GekkoFS forms a collaborative burst buffer, acting as a single file system for the directly accessible memory, combining the memory on all the nodes allocated to the job by the Job Scheduler. Data and metadata are distributed in blocks over the available storage space.
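As a simplified illustration of such a distribution (a sketch of the idea, not GekkoFS's actual implementation), a block's home node can be derived deterministically by hashing the file path and block index onto the list of nodes in the allocation:

    import hashlib

    # Illustrative only: map a (path, block index) pair onto one of the
    # job's nodes by hashing, so every client computes the same location
    # without central coordination.
    def node_for_block(path: str, block_idx: int, nodes: list) -> str:
        key = f"{path}:{block_idx}".encode()
        digest = int.from_bytes(hashlib.md5(key).digest()[:8], "big")
        return nodes[digest % len(nodes)]

    nodes = ["node0", "node1", "node2", "node3"]
    print(node_for_block("/gkfs/data/file.dat", 0, nodes))
    print(node_for_block("/gkfs/data/file.dat", 1, nodes))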

The file system functions as an interposition library that redirects all file-system operations requested by the job to file-system daemons running on the nodes. The system layout is illustrated in Figure 1.

The interception of file-system operation calls is performed by the GekkoFS client, which is preloaded by the application at launch. The client also holds a file map containing all data storage locations across the nodes. Like the other metadata, this file map is distributed over the nodes. The client maintains an overview of all data and can send requests to individual daemons to perform I/O operations.
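A minimal sketch of launching an application with the client preloaded is shown below. The library path and the mount directory are assumptions for illustration; the exact names should be taken from the GekkoFS documentation for the system in use.

    import os
    import subprocess

    env = dict(os.environ)
    # Assumed path to the GekkoFS client interposition library.
    env["LD_PRELOAD"] = "/opt/gekkofs/lib/libgkfs_intercept.so"

    # The application now sees the ad-hoc file system under the chosen
    # mount directory; all I/O there is redirected to the GekkoFS daemons.
    subprocess.run(["./my_app", "--input", "/gkfs/data"], env=env, check=True)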


Figure 1: The architecture of the GekkoFS distributed file system. All communication between the application and the file system runs via the GekkoFS client, which redirects commands to the daemons. A GekkoFS daemon runs on each file system node. Image from Vef et al. [2].

[1] As of May 2019.
[2] Vef, M.-A. et al., GekkoFS - A Temporary Distributed File System for HPC Applications, Proceedings of IEEE CLUSTER, 2018.

dataClay

dataClay is an object store designed to make use of the features of SCM. It can be accessed directly by applications written in an object-oriented programming language; currently Java and Python are the two languages supported by dataClay. A full description of the data store can be found in the dataClay documentation.

A core feature of the data store is that it lets users make any application-created object persistent in memory. Storing an object in this manner not only saves it for later use, but also allows the object to be accessed from other applications. dataClay is able to use both FSDAX and DevDAX to access the NVRAM.

dataClay consists of two main components: the logic module and the data service. The logic module provides centralised storage, handling the metadata for all objects and checking permissions for user access to the objects stored in the data service.

The data service handles the storage of the persistent objects, as well as any execution requests involving these objects. These execution requests are expected to be mainly calls to methods of the class to which the given object belongs.

dataClay can be called by any application written in one of the supported languages; however, specific effort has been made to improve the performance of dataClay in combination with PyCOMPSs.

Overview of dataClay object methods

obj.make_persistent()
    Store obj in dataClay and create its Object ID. This method also
    allows the user to specify which language the object should be
    associated with.
obj.get_location()
    Return the location of obj in the data service (if copies of obj
    exist, one location is returned at random).
obj.get_all_locations()
    Find all data service locations where obj or its copies are stored.
obj.new_replica()
    Create a copy of obj.
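A minimal usage sketch of these methods from Python follows. The Person class and its import path are hypothetical stand-ins for an application class already registered with dataClay; the registration step itself is elided.

    # Hypothetical registered model class; the real import path depends
    # on how the application's model was registered with dataClay.
    from model.classes import Person

    p = Person(name="Ada", age=36)

    # Store the object in dataClay; it receives an Object ID and becomes
    # visible to other applications.
    p.make_persistent()

    # Inspect where the object and any copies live in the data service.
    print(p.get_location())
    print(p.get_all_locations())

    # Create a copy of the object in the data service.
    p.new_replica()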