IBM x FUJIFILM Help OPPO Manage Cold Data: Optimizing Storage Costs and Enhancing Efficiency


OPPO, which positions itself as an explorer of cutting-edge technology, is committed to building a range of smart devices and services for the era of interconnectivity, with the aim of making life better. According to market reports on the global mobile phone industry, OPPO shipped over 100 million smartphones in 2023 alone, making it one of the world's leading smartphone suppliers. ColorOS now provides comprehensive coverage of system applications and serves 600 million users worldwide.

With the explosive growth of audio and video applications and artificial intelligence technology, OPPO Cloud Service, as the main carrier of future cloud-based applications, holds not only massive amounts of user data but also OPPO's own analytical data, such as computing models and algorithm training data. OPPO is focused on how to serve hundreds of millions of users around the world while protecting and securing their data as its user base grows.

Tiered Storage: The New Choice for Data Management

For enterprises with massive amounts of data, it is impractical to store everything on a single storage medium. Storage media differ in capacity, performance, and functionality, and their costs differ accordingly. Drawing on years of experience, storage vendors and users have jointly introduced tiered storage solutions to address these storage pain points.

He Xiaochun, head of OPPO's cloud computing department, said: "OPPO, as an explorer in the mobile phone industry, faces the question of how to handle the substantial annual growth in data volume from the company's internal business units and the continuous increase in backup data from mobile phone users. After weighing two concerns, protecting the security of all stored data and reducing storage costs, our research and development team analyzed and compared the characteristics of various storage media. We decided in 2023 to bring tape storage systems into the mobile phone industry and began to introduce and deploy a long-term archiving solution for massive data."


He Xiaochun, Head of OPPO's Cloud Computing Department

Massive data can be stored and accessed across different storage media under the distributed, scalable management of cloud computing. Based on access frequency, data is divided into three classes: hot data, which must be immediately available; cold data, which has not been active for more than a year and can be archived outside daily processes but may still be recalled at some point; and warm data, which falls between the two.
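The three temperature classes can be sketched as a simple rule on the time since an object was last accessed. This is only an illustration: the one-year cold threshold comes from the text, while the 30-day hot window and the name `classify_temperature` are assumptions made for the example, not OPPO's actual policy.

```python
from datetime import datetime, timedelta

# Assumed thresholds: the article only states that cold data has been
# inactive for more than a year; the 30-day hot window is hypothetical.
HOT_WINDOW = timedelta(days=30)
COLD_AFTER = timedelta(days=365)


def classify_temperature(last_access: datetime, now: datetime) -> str:
    """Return 'hot', 'warm', or 'cold' based on time since last access."""
    idle = now - last_access
    if idle > COLD_AFTER:
        return "cold"   # inactive for more than a year: archive candidate
    if idle <= HOT_WINDOW:
        return "hot"    # recently accessed: keep immediately available
    return "warm"       # everything in between
```

A real system would likely combine several signals (access counts, object age, business rules) rather than a single timestamp.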

OPPO has a huge amount of data to handle, and the volume grows rapidly every day. To manage this data well and keep it safe, OPPO needs to gauge the "temperature" of the data from signals such as access frequency. Based on data temperature, OPPO has defined a five-tier storage resource pool.

Tier 0 storage resource pool: Tier 0 stores hot, frequently accessed data. It loads hot data into server memory at the various CDN data centers, enabling quick responses to users' real-time access requests.

Tier 1 storage resource pool: Using structured and unstructured data storage technologies, data is stored on local flash drives to meet users' normal access needs.

Tier 2 storage resource pool: Built with distributed technology on traditional high-capacity hard disks, this low-cost, high-capacity pool serves users with no discernible access latency.

Tier 3 storage resource pool: This massive storage resource pool combines traditional tape technology with distributed storage technology. It offers high storage density and high write bandwidth while remaining cost-effective and consuming little power. It serves data access requests within minutes.

Tier 4 storage resource pool: This tier provides offline data storage on tape.

For data with different "temperatures", organizations need to provide different technical architectures. The nearline tape data storage solution meets the needs of OPPO's current and future cold data management. Not only does it reduce storage costs, but it also provides minute-level data access.


TCO Pressure Brought by the Growth of Massive Data

OPPO's current object storage uses a distributed architecture: data is stored as objects in distributed storage clusters to achieve high availability, reliability, and scalability. In this model, data is divided into multiple slices stored on multiple nodes, and redundancy and replication mechanisms ensure reliability and fault tolerance. Distributed object storage not only improves read/write performance and concurrency but also allows capacity to be expanded dynamically and data to be managed flexibly, meeting the needs of different application scenarios.

 


OPPO's HDD-based object storage adopts the classic distributed object architecture. The metadata server stores the index information for objects and their slices, while the servers in the HDD storage pool store the slices themselves. A client, communicating through the S3 protocol, first obtains the index information from the metadata server and then retrieves the object's data from the slice storage servers.
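The two-step read path described above (index lookup, then slice retrieval) can be sketched with in-memory stand-ins. All class and function names here (`MetadataServer`, `SliceServer`, `SliceLocation`, `get_object`) are hypothetical and do not represent OPPO's implementation or the S3 API; the sketch only shows the order of operations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SliceLocation:
    server_id: str  # which slice storage server holds the slice
    slice_id: str   # identifier of the slice on that server


@dataclass
class MetadataServer:
    # object key -> ordered slice locations (the "index information")
    index: Dict[str, List[SliceLocation]] = field(default_factory=dict)

    def lookup(self, key: str) -> List[SliceLocation]:
        return self.index[key]


@dataclass
class SliceServer:
    slices: Dict[str, bytes] = field(default_factory=dict)

    def read(self, slice_id: str) -> bytes:
        return self.slices[slice_id]


def get_object(key: str, metadata: MetadataServer,
               servers: Dict[str, SliceServer]) -> bytes:
    """Step 1: resolve slice locations. Step 2: fetch and join the slices."""
    locations = metadata.lookup(key)
    return b"".join(servers[loc.server_id].read(loc.slice_id)
                    for loc in locations)
```

Keeping the index separate from the slice data is what lets the slice pool grow by adding servers without changing how clients address objects.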

Because of the enormous amount of data and the large scale of the distributed object storage cluster, OPPO needs to purchase a significant number of servers each year to build new storage space or replace out-of-warranty servers. OPPO realized that there was room to improve its IT spending.

Tape Storage Solution and Cold Data

In order to reduce the TCO of object storage and optimize the storage service for massive data, OPPO introduced tape storage. Tape storage is an efficient, robust, and secure solution for organizations to manage long-term storage, active archiving, data protection, and recovery needs.

In distributed storage systems, tape storage is widely used for cold data. Rather than keeping all data on expensive, high-performance storage devices, organizations can move less frequently accessed data to tape. There are two types of tape media: LTO (Linear Tape-Open), an open tape format, and IBM's proprietary 3592 tape. Each has its own corresponding type of tape drive.

Dr. Shi Zemin, Deputy Head of the Recording Media Division of Fujifilm (China) Investment Co., Ltd., said, “Tape storage has the following advantages:

Cost-Effective: Tape storage is a highly economical option. By offloading cold data from high-performance storage to tape, organizations can take pressure off expensive storage and reduce overall storage costs. The 2022 study Improving Information Technology Sustainability with Modern Tape Storage [1] compared the average cost of archiving on hard disk and on tape, assuming that 100 petabytes of information must be stored for 10 years. The total cost of ownership (TCO) is 78% lower when cold data on HDDs is archived to tape than with an HDD-only solution.

Reliability: Due to its low bit error rate, error correction capabilities, and redundancy features, tape is more reliable than other media. In a distributed storage system, archiving data on tape increases data redundancy and improves data reliability.

Scalability: Tape technology scales easily because capacity can be added seamlessly with new media. By integrating tape storage into a tiered storage architecture, organizations can readily expand storage capacity to meet growing data storage needs.

Green Energy: According to the same 2022 paper, Improving Information Technology Sustainability with Modern Tape Storage, assuming that 100 petabytes of information must be stored for 10 years, a deep archive solution storing 100% of the data on hard disk drives generates 2,663 tons of CO2e over ten years. A deep archive solution storing all the data on tape generates only 79 tons of CO2e, a 97% reduction. Converting carbon emissions to electricity consumption (taking 1 kWh as equivalent to 0.997 kg of CO2), this corresponds to a saving of about 2.59 million kWh.”
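The emissions figures quoted above can be checked with a few lines of arithmetic. The inputs are the numbers from the text; the variable names are illustrative only.

```python
HDD_TONS_CO2E = 2663     # 100 PB on HDD for 10 years (figure from the text)
TAPE_TONS_CO2E = 79      # the same data on tape (figure from the text)
KG_CO2_PER_KWH = 0.997   # conversion factor quoted in the text

saved_tons = HDD_TONS_CO2E - TAPE_TONS_CO2E
reduction_pct = 100 * saved_tons / HDD_TONS_CO2E
saved_kwh = saved_tons * 1000 / KG_CO2_PER_KWH  # tons -> kg -> kWh

print(f"CO2e saved: {saved_tons} t ({reduction_pct:.0f}% reduction)")
print(f"Equivalent electricity: {saved_kwh / 1e6:.2f} million kWh")
```

The 2,584-ton difference is about 97% of the HDD figure, and dividing it by 0.997 kg per kWh gives roughly 2.59 million kWh, consistent with the quoted numbers.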


Dr. Shi Zemin, General Manager of Recording Media Division in Fujifilm (China) Investment Co., Ltd.

Evaluating the IBM Tape Nearline Solution for the OPPO Cold Data Storage Project

The OPPO cold data storage project adopts the IBM tape nearline storage solution, which is based on the IBM tape library TS4500 and 3592 enterprise tape media.


Hou Miao, General Manager of IBM China and General Manager of the Technology Department in China

Hou Miao, general manager of IBM China and general manager of its technology department in China, said,

“The TS4500 is an IBM enterprise tape library that can be configured with 18 frames, 128 3592 tape drives, and 17,550 3592 tape cartridges. Configured with 20 TB type JE 3592 cartridges, the entire library can store 351 PB of data; configured with 50 TB type JF 3592 cartridges, it can store up to 877.5 PB.

IBM's enterprise-level 3592 tape media combines IBM's tape drive technology with Fujifilm’s expertise in nanotechnology, nano-coating, and nano-dispersion, accumulated over decades of film manufacturing. This original technology gives the 3592 tape a higher storage density: the JF series reaches up to 50 TB per cartridge. The 3592 tape is also more stable and reliable, meeting data storage requirements in data center environments, and it performs better, especially in random access to small files. Overall, the 3592 tape has six sustainable advantages compared to other types of media: low cost, low energy consumption, high security, high reliability, large capacity, and energy-saving environmental sustainability.”
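The library capacities quoted above follow directly from the cartridge count and per-cartridge capacity. A minimal check, using decimal units (1 PB = 1,000 TB) and an illustrative function name:

```python
CARTRIDGES = 17_550                 # cartridge count quoted in the text
TB_PER_CARTRIDGE = {"JE": 20, "JF": 50}


def library_capacity_pb(tb_per_cartridge: float,
                        cartridges: int = CARTRIDGES) -> float:
    """Total library capacity in PB, assuming 1 PB = 1000 TB."""
    return cartridges * tb_per_cartridge / 1000


for media, tb in TB_PER_CARTRIDGE.items():
    print(f"{media}: {library_capacity_pb(tb)} PB")
```

With 17,550 cartridges this gives 351 PB for JE media and 877.5 PB for JF media, matching the quoted figures.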


Joint Research on Distributed Tape Storage Architecture


Rao Youqing, IBM Hyperscale System Solutions Chief Architect

The OPPO cold data storage project was designed jointly by He Xiaochun, Tang Hu, Hou Jingpeng, Wu HuoChen, and other core members of OPPO's technology development and architecture team, together with Liang Xiao, sales manager for IBM hyperscale system solutions, and chief architect Rao Youqing. The project integrates OPPO object storage with IBM tape nearline storage technology to build an end-to-end distributed tape storage architecture. OPPO's warm and hot data is stored in the HDD storage pool, while cold data is stored in a storage pool built on tape. OPPO sorts data by "temperature": when data is cold enough, the system archives it to tape, and when it is requested again, the system recalls it from tape within minutes. The tape-based storage pool uses erasure coding to ensure high reliability and availability of the data on tape while improving tape capacity utilization.
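To illustrate the idea behind erasure coding, the sketch below uses the simplest possible scheme: a single XOR parity shard that can rebuild one lost data shard. The article does not say which erasure code or parameters OPPO uses, so this is a teaching example under that assumption, not a description of the production system; the function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(shards: List[bytes]) -> bytes:
    """Compute one parity shard as the XOR of equal-length data shards."""
    return reduce(xor_bytes, shards)


def recover(shards: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the parity."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild at most one lost shard")
    result = list(shards)
    if missing:
        present = [s for s in shards if s is not None]
        # XOR of the parity with all surviving shards yields the lost one.
        result[missing[0]] = reduce(xor_bytes, present, parity)
    return result
```

Production systems typically use codes such as Reed-Solomon that tolerate several simultaneous losses, which is what allows less raw capacity to be spent on redundancy than full replication.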

 


OPPO cold data storage architecture diagram

User Experience after Actual Use

Monitoring of the distributed tape storage system shows that the average tape throughput of a single node is about 440 MB/s, and the read/write throughput of the cache file system is around 1 GB/s. Utilization of the local cache file system stays within 70%, which meets OPPO’s archive requirements for cold data.
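As a rough feel for what the measured throughput means, the sketch below estimates how long an archive job would take at 440 MB/s per node. The data volume and node count are hypothetical inputs, and the estimate assumes ideal linear scaling across nodes with no recall traffic or mount overhead.

```python
TAPE_MB_PER_S_PER_NODE = 440  # average single-node tape throughput from the text


def archive_hours(data_tb: float, nodes: int = 1) -> float:
    """Estimated hours to write data_tb terabytes to tape (1 TB = 1e6 MB)."""
    seconds = data_tb * 1_000_000 / (TAPE_MB_PER_S_PER_NODE * nodes)
    return seconds / 3600


print(f"100 TB on 1 node:  {archive_hours(100):.1f} h")
print(f"100 TB on 4 nodes: {archive_hours(100, nodes=4):.1f} h")
```

Under these assumptions, a single node needs a little over 63 hours to write 100 TB, so sustained archive rates depend mainly on how many nodes and drives write in parallel.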


OPPO's object storage cold data archiving system has been in production for over half a year and is running stably. While building the system, OPPO found that although the files on the sliced data nodes and the erasure-coded data nodes were the same size before compression, they occupied different amounts of tape space after compression. OPPO has solved this issue and achieved better allocation of storage space.

The enterprise tape nearline storage solution, built on OPPO object storage, IBM Spectrum Scale/Archive EE, the IBM TS4500 tape library, and Fujifilm tape technology, integrates disk storage with tape storage, manages a unified enterprise-level object namespace, and lets data move seamlessly among different storage pools. The tape nearline storage system strengthens IT support for the business and effectively reduces the cost of data storage. Compared with traditional distributed disk storage, it offers low cost, low power consumption, and high storage density, and supports a greener data center, all of which help reduce the cost of archiving infrequently accessed data.

Looking ahead, OPPO will continue to explore long-term preservation technologies for massive cold data, diversify its storage services, reduce storage costs and data center energy consumption, and provide more economical, secure, and reliable data storage solutions for all kinds of businesses.


Source: 至顶网 Storage Channel


2024/06/06 13:56
