Distributed file systems can be relatively cheap for end users; however, their performance can degrade sharply once corporate workloads exceed a certain threshold.
That is why many engineers around the world are working hard on ways to improve how distributed file systems perform.
Experts from the Massachusetts Institute of Technology (MIT) have tried to solve this problem with a system called BlueDBM, which connects SSDs in a specific configuration. The developers say BlueDBM copes with huge amounts of data, and that the data can be accessed almost instantly.
According to the developers, one of the most interesting applications for such systems is storing data from simulations of physical processes at the scale of the universe, where the storage system may hold files a few exabytes in size.
MIT experts say their brainchild provides a ‘real-time interface’ that grants access to huge data sets intended for sophisticated analysis and processing. According to the report, the system places FPGA chips (field-programmable gate arrays) between the host computers and the storage devices, with the chips communicating over their own network. In practice, this significantly reduces data-access latency, offloads requests from the shared network, and eases the scalability limits of the storage system.
The secret behind this performance is the combination of PCIe flash-storage modules with storage controllers implemented on FPGA chips. All of the modules are connected by a low-latency gigabit data-transfer network that uses SERDES (SERializer/DESerializer) technology.
‘In our system, each individual node that stores data is independent of its neighbors, which makes it possible to resolve the performance bottleneck efficiently,’ the development team explains. In addition, according to their description, ‘the controllers operate over a separate network, which can increase the speed of the system even further.’
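To see why independent nodes help, consider a minimal sketch (in Python, purely illustrative; BlueDBM’s actual protocol runs on FPGA controllers, and the node names and hashing scheme here are assumptions): if each node owns a fixed slice of the data, a client can route every request directly to the responsible node, so no central coordinator becomes a bottleneck.

```python
import hashlib

# Hypothetical node names; in BlueDBM these would be flash nodes
# with FPGA controllers on their own low-latency network.
NODES = ["node0", "node1", "node2", "node3"]

def owner(key: str) -> str:
    """Pick the node responsible for `key` by hashing it (simplified routing).

    Because the mapping is computed locally from the key alone, clients
    contact storage nodes directly and in parallel -- no shared metadata
    server sits on the critical path.
    """
    h = int(hashlib.sha256(key.encode()).hexdigest(), 16)
    return NODES[h % len(NODES)]

# Requests for different keys spread across the nodes independently:
print(owner("sim/universe/chunk-17"))
```

The design choice this illustrates is that scalability comes from removing shared state on the data path, which is exactly what the quote above attributes to the independent nodes and the separate controller network.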
Some improvements include:
- Improved FTL: the current FTL (flash translation layer) is designed for read-intensive applications; the next step is to optimize writes to flash memory.
- DRAM Caching: reads and writes to the SSD can be cached in DRAM on the FPGA board, which reduces writes to the flash and improves performance.
- Database Acceleration: Existing applications can already take advantage of BlueDBM’s distributed flash store.
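The DRAM-caching idea above can be sketched with a small write-back cache (a Python illustration of the general technique under assumed page/eviction semantics, not BlueDBM’s implementation): writes land in DRAM, repeated writes to the same page are coalesced, and flash is touched only when a dirty page is evicted.

```python
from collections import OrderedDict

class DramCache:
    """Toy write-back LRU cache standing in for DRAM on the FPGA board."""

    def __init__(self, capacity: int, flash: dict):
        self.capacity = capacity
        self.flash = flash              # stands in for the SSD
        self.cache = OrderedDict()      # page -> (data, dirty flag)
        self.flash_writes = 0           # how often we actually hit flash

    def read(self, page):
        if page in self.cache:
            self.cache.move_to_end(page)       # refresh LRU position
            return self.cache[page][0]
        data = self.flash.get(page)            # miss: fetch from flash
        self._insert(page, data, dirty=False)
        return data

    def write(self, page, data):
        # Writes go to DRAM only; flash sees them on eviction at most once.
        self._insert(page, data, dirty=True)

    def _insert(self, page, data, dirty):
        if page in self.cache:
            dirty = dirty or self.cache[page][1]
        self.cache[page] = (data, dirty)
        self.cache.move_to_end(page)
        if len(self.cache) > self.capacity:
            old_page, (old_data, old_dirty) = self.cache.popitem(last=False)
            if old_dirty:                      # flush only dirty evictions
                self.flash[old_page] = old_data
                self.flash_writes += 1
```

For example, ten successive writes to the same page cost zero flash writes until the page is evicted, at which point only the final value is flushed; this write coalescing is how DRAM caching reduces flash wear and improves performance.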
Full research article available here: http://people.csail.mit.edu/wjun/papers/fpga2014-wjun.pdf
The system holds great promise for accelerating large-scale Big Data applications.
It may also capture the interest of security agencies, which have a natural demand for processing extremely large volumes of data (given Edward Snowden’s story, these guys have plenty of info to process and structure).
When do you think this advance will become part of the common ecosystem? Do you think the development could serve the essential needs of Big Brother? Your comments below are highly appreciated.