As the trend continues to move towards Serverless Computing, Edge Computing and Functions as a Service (FaaS), the need for a storage system that can adapt to these architectures grows. A smart car that has to make a decision in a split second cannot wait to ask a data center what to do. Such scenarios are a driver for new storage solutions in more distributed architectures. In our work, we have been considering a distributed storage solution that exposes different local endpoints to applications distributed over a mix of cloud and local resources; such applications can give the storage infrastructure an indicator of the nature of the data, which can then be used to determine where it should be stored. For example, data could be considered either latency-sensitive (in which case the storage system should try to store it as locally as possible) or loss-sensitive (in which case the storage system should ensure it is on reliable storage).
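The hint-driven placement described above can be illustrated with a minimal Python sketch. It is not the implementation used in our work: the names `DataHint`, `Endpoint` and `choose_endpoint`, the latency figures and the `durable` flag are all hypothetical and only serve to show how a data-type indicator could steer the choice of endpoint.

```python
from dataclasses import dataclass
from enum import Enum


class DataHint(Enum):
    """Hypothetical indicator an application attaches to its data."""
    LATENCY_SENSITIVE = "latency_sensitive"
    LOSS_SENSITIVE = "loss_sensitive"


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of one storage endpoint."""
    name: str
    latency_ms: float  # assumed round-trip time from the application
    durable: bool      # assumed flag: True if backed by reliable storage


def choose_endpoint(hint, endpoints):
    """Pick a storage endpoint for one piece of data based on its hint."""
    if not endpoints:
        raise ValueError("no endpoints available")
    if hint is DataHint.LATENCY_SENSITIVE:
        # Store as locally as possible: the lowest latency wins.
        return min(endpoints, key=lambda e: e.latency_ms)
    # Loss-sensitive: restrict to reliable storage, then prefer the closest.
    durable = [e for e in endpoints if e.durable]
    if not durable:
        raise ValueError("no reliable endpoint available")
    return min(durable, key=lambda e: e.latency_ms)


# Illustrative endpoints only; the values are made up.
endpoints = [
    Endpoint("edge-node", latency_ms=2.0, durable=False),
    Endpoint("regional-dc", latency_ms=25.0, durable=True),
    Endpoint("central-cloud", latency_ms=80.0, durable=True),
]
```

With these example endpoints, latency-sensitive data would go to `edge-node`, while loss-sensitive data would go to `regional-dc`, the closest endpoint marked as reliable.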
Storage, together with computing and networking, is one of the fundamental parts of IaaS.
The research initiative on cloud storage at ICCLab, under the Infrastructure theme, focuses on the exploration of the limiting factors of the available storage systems, aiming at identifying new technologies and providing solutions that can be used to improve the efficiency of data management in cloud environments.
The need for advanced distributed architectures and software components that allow the deployment of secure, reliable, highly available and high-performing storage systems is made clear by the rapid growth of user-generated data. This trend sets challenging requirements for service and infrastructure providers, who must find efficient solutions for permanent data storage in their data centers.
About Cloud Storage Systems
A cloud storage system typically combines software components running in a distributed environment with a set of physical machines (i.e., servers) to expose access to a logical layer of storage.
Cloud storage provides an abstract view of the multiple physical storage resources that it manages (these can be located across multiple servers, or even across different data centers) and it internally handles different layers of transparency that ensure reliability and performance.
The main concepts found in cloud storage systems are:
- Data replication and reliability. Policies can be defined in such a way that copies of the same data are spread across different failure domains, to ensure availability and disaster recovery.
- Data placement. A cloud storage system exposes a logical view of storage and internally handles how data is assigned to the available resources. This allows, for example, striping data to improve access performance through parallel accesses, or balancing the load across a set of nodes.
- Availability. As a distributed system, cloud storage must not exhibit any single point of failure. This is usually achieved by introducing redundancy in hardware components and by implementing fail-over policies to recover from failures.
- Performance. Concurrent accesses to data can improve data rates significantly, as different portions of the same file or object can be served by different disks or nodes.
- Geo-replication. A cloud storage system can replicate data so that it is closer to where it is consumed (e.g., across data centers in different regions) to improve access efficiency.
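Two of the concepts above, replication across failure domains and striping for parallel access, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any particular storage system implements them: the function names, the node-to-domain mapping and the round-robin chunk assignment are all hypothetical.

```python
from collections import defaultdict


def place_replicas(nodes, replicas):
    """Choose `replicas` nodes, each in a different failure domain.

    `nodes` maps a node name to its failure domain (e.g. a rack or a
    data center); both are hypothetical labels.
    """
    by_domain = defaultdict(list)
    for node, domain in sorted(nodes.items()):
        by_domain[domain].append(node)
    if replicas > len(by_domain):
        raise ValueError("not enough failure domains for the replica count")
    # Take the first node of each domain, in sorted domain order.
    return [by_domain[d][0] for d in sorted(by_domain)[:replicas]]


def stripe(data, nodes, chunk_size):
    """Split `data` into fixed-size chunks, assigned round-robin to nodes."""
    layout = defaultdict(list)
    for offset in range(0, len(data), chunk_size):
        index = offset // chunk_size
        node = nodes[index % len(nodes)]
        layout[node].append((index, data[offset:offset + chunk_size]))
    return dict(layout)


def reassemble(layout):
    """Rebuild the original bytes from a striped layout."""
    chunks = sorted(c for per_node in layout.values() for c in per_node)
    return b"".join(chunk for _, chunk in chunks)
```

For example, with nodes `{"n1": "rack-a", "n2": "rack-a", "n3": "rack-b", "n4": "rack-c"}`, asking for three replicas never places two copies in `rack-a`. Striping an object over several nodes lets a reader fetch the chunks in parallel before reassembling them in index order.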
Objectives
- Implement research ideas into working prototypes that can attract industrial interest
- Obtain funding by participating in financed research projects
- Produce and distribute our open source implementations
- Maintain and increase the reputation of the ICCLab in international contexts
- Define a strong field of expertise in Distributed File Systems and software solutions for storage
- Explore and implement clustered storage architectures
From an applied research perspective, cloud computing and the growing demand for efficient data storage solutions offer a field in which many areas and directions can be explored and evaluated.
Here at the ICCLab, the following aspects are currently being developed in the cloud storage initiative:
- Distributed File Systems (DFSs)
- Storage architectures
- High availability (for storage)
- Object storage
- Highly distributed storage resources
- Independence between storage and computation resources