Distributed File Systems are file systems that allow access to files from multiple hosts via a computer network, making it possible for multiple users on multiple machines to share files and storage resources.

Distributed File Systems are designed to be “transparent” in several respects (e.g. location, concurrency, failure, replication): client programs see a system that behaves like a local file system. Behind the scenes, the Distributed File System handles locating files, transporting data, and potentially providing the other features listed below.
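Location transparency can be illustrated with a small sketch. The `Namespace` class, the host names and the paths below are hypothetical and are not taken from any particular file system; the sketch only shows the idea that a client keeps using one stable logical path while the system is free to move the file between hosts.

```python
from typing import Dict, Tuple


class Namespace:
    """Hypothetical lookup table: logical path -> (host, physical path)."""

    def __init__(self) -> None:
        self._locations: Dict[str, Tuple[str, str]] = {}

    def register(self, logical_path: str, host: str, physical_path: str) -> None:
        # Record where the file currently lives.
        self._locations[logical_path] = (host, physical_path)

    def resolve(self, logical_path: str) -> Tuple[str, str]:
        # Clients only ever supply the logical path.
        try:
            return self._locations[logical_path]
        except KeyError:
            raise FileNotFoundError(logical_path) from None

    def migrate(self, logical_path: str, new_host: str, new_physical_path: str) -> None:
        # Moving the file changes the mapping, not the name clients use.
        if logical_path not in self._locations:
            raise FileNotFoundError(logical_path)
        self._locations[logical_path] = (new_host, new_physical_path)


ns = Namespace()
ns.register("/projects/report.txt", "server-a", "/export/vol1/report.txt")
before = ns.resolve("/projects/report.txt")
ns.migrate("/projects/report.txt", "server-b", "/export/vol7/report.txt")
after = ns.resolve("/projects/report.txt")
```

The client-facing name `/projects/report.txt` is the same before and after the migration; only the internal mapping changes. A real system would keep this mapping in a metadata service or encode it in the protocol rather than in an in-memory dictionary.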

Distributed File Systems can be categorised into:

  • Distributed File Systems are also called network file systems. Many implementations exist; they are location dependent and use access control lists (ACLs).
  • Distributed fault-tolerant File Systems replicate data between nodes (between servers or servers/clients) for high availability and offline (disconnected) operation.
  • Distributed parallel File Systems stripe data over multiple servers for high performance. They are normally used in high-performance computing (HPC).
  • Distributed parallel fault-tolerant File Systems stripe and replicate data over multiple servers for high performance and to maintain data integrity. Even if a server fails, no data is lost. These file systems are used in both high-performance computing (HPC) and high-availability clusters.
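The difference between striping and replication in the categories above can be made concrete with a minimal sketch. The function names, the round-robin placement and the server names are assumptions chosen for illustration; real parallel file systems use more elaborate placement policies.

```python
from typing import Dict, List


def stripe(data: bytes, stripe_size: int) -> List[bytes]:
    """Split data into fixed-size stripe units (the last one may be shorter)."""
    if stripe_size <= 0:
        raise ValueError("stripe_size must be positive")
    return [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]


def place(num_units: int, servers: List[str], replicas: int) -> Dict[int, List[str]]:
    """Assign each stripe unit to `replicas` distinct servers, round-robin."""
    if not 1 <= replicas <= len(servers):
        raise ValueError("replicas must be between 1 and the number of servers")
    return {
        unit: [servers[(unit + r) % len(servers)] for r in range(replicas)]
        for unit in range(num_units)
    }


def survives_failure(placement: Dict[int, List[str]], failed: str) -> bool:
    """True if every stripe unit still has a copy on a server other than `failed`."""
    return all(
        any(server != failed for server in holders)
        for holders in placement.values()
    )


servers = ["s1", "s2", "s3"]            # hypothetical server names
units = stripe(b"abcdefgh", 3)          # three stripe units
striped_only = place(len(units), servers, replicas=1)
striped_and_replicated = place(len(units), servers, replicas=2)
```

With `replicas=1` the data is spread over all three servers, so reads can proceed in parallel, but losing any one server loses a stripe unit. With `replicas=2` every unit has a second copy on a different server, so a single server failure loses no data, at the cost of twice the storage.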

The objectives of this research initiative are:

  • Evaluate and compare the performance of various Distributed File Systems
  • Explore and evaluate the use of Distributed File Systems as Object Storage
  • Explore the use of Distributed File Systems in OpenStack
  • Explore the use of Distributed File Systems in Hadoop

Problem Statement

With the increasing need for and use of cloud storage services, providers must be able to deliver a reliable service that is also easy to manage. Distributed File Systems provide the basis for a cloud storage service.

Articles and Info

Distributed File Systems blog post series:

Contact Point