Huawei Technologies Co. Ltd.

11/30/2023 | Press release | Distributed by Public on 11/30/2023 04:08

Decoupled Storage-Compute Architecture: The New De Facto Standard for Distributed Databases

Increased Internet use and rising costs are driving core systems to embrace distributed databases, which are shifting from a coupled to a decoupled storage-compute architecture.

Open-source databases are reshaping enterprise core systems, with MySQL and PostgreSQL occupying the top two spots in the global database market. To keep services running smoothly, decoupled storage-compute architecture has become the de facto standard for distributed databases.

Trends

1. Distributed databases built on an open-source ecosystem are replacing traditional core systems to better suit service changes, achieve higher efficiency at lower cost, and facilitate long-term technology evolution

Digital and mobile technologies have greatly changed the interaction channels between enterprises and customers. Internet applications, such as mobile apps, are now the best medium for triggering customers' purchase behavior. This leads to rapid business growth, but also brings unpredictable and fluctuating workload surges to core systems. The core system must be sufficiently elastic to ensure that resources are quickly expanded during peak hours to keep services running smoothly, while idle resources are released during off-peak hours to prevent resource wastage.
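The elasticity requirement described above can be illustrated with a minimal sketch. It is not taken from any product: the function name nodes_needed, the per-node capacity, the minimum node count, and the hourly load figures are all hypothetical values chosen for illustration. The idea is simply that the number of compute nodes follows the workload up during peaks and back down afterwards, while never dropping below a safety floor.

```python
import math


def nodes_needed(requests_per_second: float,
                 capacity_per_node: float,
                 min_nodes: int = 2) -> int:
    """Return the number of compute nodes required for the given load.

    All parameters are illustrative assumptions:
    - capacity_per_node: requests per second one node can serve
    - min_nodes: floor kept even when idle, for availability
    """
    if capacity_per_node <= 0:
        raise ValueError("capacity_per_node must be positive")
    required = math.ceil(requests_per_second / capacity_per_node)
    return max(min_nodes, required)


# Hypothetical load samples (requests per second) across a day,
# including one peak-hour surge.
hourly_load = [800, 1200, 9500, 4000, 900]

# Scale out for the surge, then release nodes once it passes.
plan = [nodes_needed(load, capacity_per_node=1000) for load in hourly_load]
print(plan)
```

In this sketch the node count rises only for the surge sample and returns to the floor afterwards, which is the behavior the paragraph asks of a core system: expand quickly at peak, release idle resources off-peak.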

High O&M costs are another factor driving enterprises to reconstruct traditional core systems. According to an Oracle user survey, 97% of respondents believe that cost is the biggest challenge associated with using Oracle databases, with 35% turning to open-source or other non-Oracle cloud databases.

According to 6sense, MySQL tops the database ranking with a 42.95% market share. PostgreSQL ranks second, with Oracle Database coming in third.

2. Decoupled storage-compute architecture has become the de facto standard for distributed databases because it keeps services running smoothly

Stability is the top consideration for core databases. Performance, functionality, and energy efficiency are also important appraisal criteria.

In the early stages of adopting a distributed database, both the pilot service scale and the data volume are small. To minimize the initial investment, many enterprises deploy the database application and its data on the same server. This is what we call a coupled storage-compute architecture, and it carries an eggs-in-one-basket risk: a single server failure affects both compute and data. Some enterprises therefore add servers and keep redundant data copies as a temporary fix for service stability. But as the distributed database scales out, the data volume and the number of servers grow quickly, and every redundant copy multiplies the storage that must be purchased, wasting a significant share of the investment. As the volume of data continues to grow, synchronizing the redundant copies also consumes more and more network bandwidth.

This is especially risky in a multi-site disaster recovery (DR) architecture, where network bottlenecks may cause data loss if a disaster occurs.
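To make the redundancy cost of a coupled architecture concrete, here is a rough back-of-the-envelope sketch. None of the numbers come from this article: the 100 TB logical data set, the three full copies, the 2 TB of daily writes, and the 1.25x protection overhead assumed for a shared storage pool are all hypothetical, and real overheads depend on the products and protection schemes actually used.

```python
def raw_capacity_tb(logical_tb: float, overhead_factor: float) -> float:
    """Raw capacity to purchase for a given amount of logical data.

    overhead_factor is an assumption: 3.0 models three full copies,
    1.25 models a hypothetical protection overhead in a shared pool.
    """
    return logical_tb * overhead_factor


def replication_traffic_tb(daily_writes_tb: float, copies: int) -> float:
    """Data sent over the network per day to keep the copies in sync.

    Each write lands on one copy and must be shipped to the other
    (copies - 1) copies.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return daily_writes_tb * (copies - 1)


logical_tb = 100.0  # hypothetical data set size

coupled_raw = raw_capacity_tb(logical_tb, overhead_factor=3.0)
shared_raw = raw_capacity_tb(logical_tb, overhead_factor=1.25)
sync_traffic = replication_traffic_tb(daily_writes_tb=2.0, copies=3)

print(f"coupled, 3 copies: {coupled_raw} TB raw")
print(f"shared pool (assumed 1.25x): {shared_raw} TB raw")
print(f"daily sync traffic with 3 copies: {sync_traffic} TB")
```

Under these assumed figures, both the raw capacity and the synchronization traffic scale with the number of copies, which is why the cost and bandwidth pressure grow as the data set grows.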

As these problems become increasingly prominent, distributed database construction has gradually evolved from coupled to decoupled storage-compute architecture. In the decoupled scenario, enterprises can use performant, reliable, and shared enterprise-level all-flash storage pools to ensure high data availability. The architecture isolates applications from data, eliminating the need to use multiple redundant data copies for high availability. And the powerful and mature DR capabilities of storage systems can compensate for the insufficient DR capabilities of open-source databases. Most importantly, decoupled storage-compute architecture has been comprehensively tested in traditional core systems, resulting in the development of mature product systems and O&M expertise.

Enterprises can focus on how distributed databases can help them drive business growth without concerns about frequent O&M issues.

Figure 1: Database architecture evolution from coupled to decoupled storage-compute

Currently, major banks around the world have built new core systems using distributed databases that adopt decoupled storage-compute architecture. New mainstream database solutions, such as Amazon Aurora, Alibaba PolarDB, Huawei GaussDB, and Tencent TDSQL, have shifted to decoupled storage-compute architecture, making it the de facto standard for distributed database construction.

3. Distributed databases are driving the development of a new data paradigm

Open-source databases, such as MySQL and PostgreSQL, are deployed on standalone servers. Unlike Oracle RAC, they cannot coordinate multiple database nodes to simultaneously read from and write to the same database. As a result, distributed databases built on them face significant bottlenecks when scaling performance. However, by using professional storage devices to provide shared data access across nodes and by implementing a consistent cache layer between database nodes, distributed databases can provide the same multi-primary capabilities as Oracle RAC. For example, GreatDB and TeleDB have worked with Huawei storage to implement multi-primary capabilities through Huawei's Cantian database storage engine, improving database performance by up to a factor of 10.

Suggestions

1. Promote decoupled storage-compute architecture for distributed databases

Although the industry has seen numerous examples of distributed databases built on coupled storage-compute architecture, decoupled storage-compute architecture has become the necessary choice from the perspectives of technology, O&M, and evolution. If enterprises plan to build a distributed database, they should adopt decoupled storage-compute architecture to prevent repeated construction and resource wastage. Enterprises that already use coupled storage-compute architecture should gradually shift to decoupled architecture to drive future cost reductions, efficiency improvements, and continuous expansion.

2. Encourage the database team and storage team to jointly incubate a new data paradigm

The new data paradigm built with distributed databases centers on the idea that database software is no longer a one-size-fits-all solution. To meet enterprise requirements, a collaborative effort between database and hardware infrastructure is essential. Therefore, the database team should not work alone when building enterprise core systems. Instead, the database and storage teams should work together to build databases and core systems so that they can fully leverage both the software and hardware advantages and establish a new data paradigm.

Learn more about Huawei Storage and subscribe to this blog to get notifications of all the latest posts.

Disclaimer: Any views and/or opinions expressed in this post by individual authors or contributors are their personal views and/or opinions and do not necessarily reflect the views and/or opinions of Huawei Technologies.
