Storage, together with computing and networking, is one of the fundamental parts of IaaS.
The research initiative on cloud storage at ICCLab, under the Infrastructure theme, explores the limiting factors of the available storage systems. Its aim is to identify new technologies and provide solutions that improve the efficiency of data management in cloud environments.
The need for advanced distributed architectures and software components that allow the deployment of secure, reliable, highly available and high-performing storage systems is underscored by the rapid growth of user-generated data. This trend sets challenging requirements for service and infrastructure providers, who must find efficient solutions for permanent data storage in their data centers.
About Cloud Storage Systems
A cloud storage system typically combines software components running in a distributed environment with a set of physical machines (i.e., servers), and exposes access to a logical layer of storage.
Cloud storage provides an abstract view of the multiple physical storage resources that it manages (these can be located across multiple servers, or even across different data centers). Internally, it handles the mechanisms that ensure reliability and performance, keeping them transparent to the user.
The main concepts found in cloud storage systems are:
- Data replication and reliability. Policies can be defined in such a way that copies of the same data are spread across different failure domains, to ensure availability and disaster recovery.
- Data placement. A cloud storage system exposes a logical view of storage and internally handles how data is assigned to the available resources. This allows, for example, striping data to improve access performance through parallel accesses, or balancing the load across a set of nodes.
- Availability. As a distributed system, cloud storage must not exhibit any single point of failure. This is usually achieved by introducing redundancy in hardware components and by implementing fail-over policies to recover from failures.
- Performance. Concurrent accesses to data can improve data rates significantly, as different portions of the same file or object can be served by different disks or nodes.
- Geo-replication. A cloud storage system can replicate data so that it is closer to where it is consumed (e.g., across data centers in different regions) to improve access efficiency.
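The replication and placement concepts above can be illustrated with a small Python sketch. It is not how any particular system (such as Ceph or Swift) places data; it is a minimal example that assumes a hypothetical cluster map (`NODES`), uses a simple hash-based ranking to choose nodes deterministically, and enforces one replica per failure domain. The helper `stripe` shows how an object could be split into chunks that different nodes might serve in parallel. All names, node labels and sizes are assumptions made for illustration.

```python
import hashlib

# Hypothetical cluster map: node name -> failure domain (e.g., a rack).
NODES = {
    "node-a": "rack-1",
    "node-b": "rack-1",
    "node-c": "rack-2",
    "node-d": "rack-2",
    "node-e": "rack-3",
}


def score(object_name: str, node: str) -> int:
    """Deterministic pseudo-random score for an (object, node) pair."""
    digest = hashlib.sha256(f"{object_name}:{node}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def place_replicas(object_name: str, nodes: dict, replicas: int = 3) -> list:
    """Pick `replicas` nodes for an object, at most one per failure domain."""
    # Rank nodes by their score for this object (highest first).
    ranked = sorted(nodes, key=lambda n: score(object_name, n), reverse=True)
    chosen, used_domains = [], set()
    for node in ranked:
        domain = nodes[node]
        if domain in used_domains:
            continue  # a replica already lives in this failure domain
        chosen.append(node)
        used_domains.add(domain)
        if len(chosen) == replicas:
            return chosen
    raise ValueError("not enough failure domains for the requested replicas")


def stripe(data: bytes, stripe_size: int) -> list:
    """Split data into fixed-size chunks that could be read in parallel."""
    return [data[i:i + stripe_size] for i in range(0, len(data), stripe_size)]


if __name__ == "__main__":
    print(place_replicas("photo.jpg", NODES))
    print(stripe(b"0123456789", 4))
```

Because placement is computed from the object name and the cluster map alone, any client with the same map can locate the replicas without consulting a central lookup table. This is one possible design choice among several.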
Objectives
- Implement research ideas into working prototypes that can attract industrial interest
- Obtain funding by participating in financed research projects
- Produce and distribute our open source implementations
- Maintain and increase the reputation of the ICCLab in international contexts
- Define a strong field of expertise in Distributed File Systems and software solutions for storage
- Explore and implement clustered storage architectures
From an applied research perspective, cloud computing and the growing demand for efficient data storage solutions offer a field in which many areas and directions can be explored and evaluated.
Here at the ICCLab, the following aspects are currently being developed in the cloud storage initiative:
- Distributed File Systems (DFSs)
  - E.g., Ceph, OpenStack Swift, …
- Storage architectures
- High availability (for storage)
- Object storage
- Highly distributed storage resources
- Independence between storage and computation resources
Contact
- Vincenzo Pii (piiv-at-zhaw.ch)
- Andy Edmonds (edmo-at-zhaw.ch)