GFARM Grid File System

GFARM, a global dependable virtual file system, is a next-generation network shared file system. It is intended as an alternative to NFS that meets the demand for a much larger, more reliable, and faster file system. It is an open source distributed file system, generally used for large-scale clusters.

Grid Data Farm is a petascale data-intensive computing project initiated in Japan. The project is a collaboration among the High Energy Accelerator Research Organization (KEK), the National Institute of Advanced Industrial Science and Technology (AIST), the University of Tokyo, Tokyo Institute of Technology, and the University of Tsukuba. The challenge involves constructing a peta- to exascale parallel file system that exploits the local storage of PCs spread over the worldwide Grid.

The design objectives of GFARM are to facilitate reliable file sharing by providing a global virtual file system, and to build high-performance data computing in a Grid, both distributed and parallel across administrative domains. The file system aims to implement a global virtual file system that supports a complete set of standard POSIX APIs while still retaining the parallel and distributed data computing features of the Grid Data Farm architecture. The basic purpose of this seminar is to provide detailed information regarding the architecture, replication management, metadata organization, space utilization strategies, and protocol specifications of the GFARM file system.

In GFARM, files can be shared among all nodes and clients, and applications can access files using the same path regardless of the file location. The main components of GFARM are the metadata server, the file system nodes, and the Gfarm clients, and the file system can be mounted from all cluster nodes and sites. Through automatic replica selection, the system avoids access concentration. It supports fault tolerance. Single sign-on and authentication are provided to clients, based on the Grid Security Infrastructure. The disadvantages of this file system are:

  • Concurrent access
  • File size is limited because file striping is not available
  • Versioning is not supported

GFARM Grid File System

  • Files can be shared among all nodes and clients
  • Files can be replicated physically and stored on any file system node
  • Applications can access a file regardless of its location
  • File system nodes can be distributed
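
The points above can be pictured as a lookup from one logical path to the set of nodes that hold a physical replica. The following is only a minimal sketch of that idea, not GFARM's actual metadata format: the class `MetadataCatalog`, its methods, the example path, and the node names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class MetadataCatalog:
    """Toy in-memory map from a logical path to the nodes holding a replica.

    Hypothetical illustration only; not GFARM's real metadata schema.
    """
    replicas: Dict[str, Set[str]] = field(default_factory=dict)

    def add_replica(self, path: str, node: str) -> None:
        # Record that `node` stores a physical copy of the file at `path`.
        self.replicas.setdefault(path, set()).add(node)

    def locate(self, path: str) -> Set[str]:
        # Return a copy of the replica set so callers cannot mutate the catalog.
        if path not in self.replicas:
            raise FileNotFoundError(path)
        return set(self.replicas[path])


catalog = MetadataCatalog()
catalog.add_replica("/gfarm/data/run01.dat", "node-a")
catalog.add_replica("/gfarm/data/run01.dat", "node-c")

# Every client uses the same logical path, wherever the copies live.
holders = catalog.locate("/gfarm/data/run01.dat")
```

In this sketch a client never needs to know which node stores the file; it asks for the logical path and receives the candidate nodes.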

To improve I/O performance, priority is given to the local disk. If the local disk does not have enough space, the least busy node is selected for storage. The same mechanism is followed for file access. Since GFARM caches all metadata in memory, there is no disk wait when responding to a request. It also manages the file descriptors for each process and keeps track of the processes that open each file.
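
The selection policy described above can be sketched roughly as follows. This is an illustration under stated assumptions, not GFARM's implementation: the `Node` fields, the numeric `load` measure of busyness, and both function names are hypothetical. The sketch also assumes that a fallback node must have enough free space, which the text does not say.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_bytes: int  # free space on the node's local disk
    load: float      # hypothetical busyness measure; lower means less busy


def choose_storage_node(local: str, nodes: List[Node], size: int) -> Optional[Node]:
    """Prefer the local node if it has room, else the least busy node that does."""
    # Assumption: only nodes with enough free space are eligible.
    candidates = [n for n in nodes if n.free_bytes >= size]
    for n in candidates:
        if n.name == local:
            return n
    return min(candidates, key=lambda n: n.load, default=None)


def choose_replica_node(local: str, holders: List[Node]) -> Optional[Node]:
    """For reads: prefer a local replica, else the least busy replica holder."""
    for n in holders:
        if n.name == local:
            return n
    return min(holders, key=lambda n: n.load, default=None)


nodes = [
    Node("node-a", free_bytes=10, load=0.9),
    Node("node-b", free_bytes=500, load=0.2),
    Node("node-c", free_bytes=500, load=0.7),
]

# node-a is local but too full, so the least busy node with space is used.
write_target = choose_storage_node("node-a", nodes, size=100)
# node-c is local and has room, so it is chosen despite its higher load.
local_target = choose_storage_node("node-c", nodes, size=100)
# No replica on node-a, so the least busy holder serves the read.
read_source = choose_replica_node("node-a", [nodes[1], nodes[2]])
```

Spreading reads over the least busy replica holder is one simple way to get the effect the text calls avoiding access concentration.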