OPPO, which describes itself as an explorer of ultimate technology, is committed to building intelligent devices and services for the era of interconnectivity, with the aim of making life better. According to market reports on the global mobile phone industry, OPPO shipped over 100 million smartphones in 2023 alone, making it one of the world's leading smartphone suppliers. ColorOS now offers comprehensive coverage of system applications and serves 600 million users worldwide.
With the explosive growth of audio and video applications and artificial intelligence technology, OPPO Cloud Service, the main carrier of future cloud-based applications, holds not only massive amounts of user data but also OPPO's own analytical data, such as computing models and algorithm training data. OPPO is focused on how to serve hundreds of millions of users around the world while protecting and securing their data as the user base grows.
For enterprises with massive amounts of data, it is impractical to store everything on a single storage medium. Storage media differ in capacity and performance depending on their intended use, and their cost differs accordingly. Drawing on years of experience, storage vendors and users have jointly introduced tiered storage solutions to address these storage pain points.
He Xiaochun, head of OPPO's cloud computing department, said: "As an explorer in the mobile phone industry, OPPO has to cope with both the substantial annual growth in data from the company's internal business units and the continuous increase in backup data from mobile phone users. After weighing two goals, securing all stored data and reducing storage costs, our research and development team analyzed and compared the characteristics of various storage media. We decided in 2023 to bring tape storage into the mobile phone industry and began trial import and deployment of a long-term archiving solution for massive data."
He Xiaochun, Head of OPPO's Cloud Computing Department
Massive data can be accessed through different storage media under distributed, scalable cloud management. Based on access frequency, data is divided into three tiers. Hot data must be immediately available. Cold data has not been active for more than a year and can be archived outside of daily processes, although it may be recalled at some point. Warm data falls between the two.
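The three temperature classes can be expressed as a small classification function. The following is only a sketch: the one-year cold boundary comes from the description above, while the 30-day hot window and the names `HOT_WINDOW`, `COLD_WINDOW`, and `classify_temperature` are assumptions made for illustration, not OPPO's actual policy.

```python
from datetime import datetime, timedelta

# Assumed threshold: the article does not say how recent "hot" access is.
HOT_WINDOW = timedelta(days=30)
# From the article: cold data has been inactive for more than a year.
COLD_WINDOW = timedelta(days=365)


def classify_temperature(last_access: datetime, now: datetime) -> str:
    """Return 'hot', 'warm', or 'cold' based on the time since last access."""
    idle = now - last_access
    if idle > COLD_WINDOW:
        return "cold"
    if idle <= HOT_WINDOW:
        return "hot"
    return "warm"
```

A real system would likely also weigh access counts and object size, but idle time alone is enough to show how the three classes partition the data.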
OPPO handles a huge amount of data, and the volume grows rapidly every year. To manage this data better and keep it safe, OPPO gauges the "temperature" of data from signals such as access frequency. Based on data temperature, OPPO has defined a five-tier storage resource pool.
Tier 0 storage resource pool: Tier 0 stores hot, high-frequency data. It loads hot data into server memory in the various CDN data centers, which enables a quick response to users' real-time access requests.
Tier 1 storage resource pool: This tier uses structured and unstructured storage technologies to keep data on local flash drives and meet users' normal access needs.
Tier 2 storage resource pool: This tier combines distributed technology with low-cost, high-capacity traditional disks to build a storage resource pool that serves users with no discernible access latency.
Tier 3 storage resource pool: This massive storage resource pool is built on tape storage, combining traditional tape technology with distributed storage technology. It provides a high-density pool with high write bandwidth that is also cost-effective and consumes little power. It meets data access requirements within minutes.
Tier 4 storage resource pool: This tier provides offline data storage using tape technology.
For data with different "temperatures", organizations need to provide different technical architectures. The nearline tape data storage solution meets the needs of OPPO's current and future cold data management. Not only does it reduce storage costs, but it also provides minute-level data access.
The current OPPO object storage adopts a distributed architecture, which is a technology that stores data in the form of objects in distributed storage clusters to achieve high availability, high reliability, and high scalability. In this mode, data is divided into multiple slices and stored on multiple nodes, which ensures data reliability and fault tolerance through data redundancy and replication mechanisms. This distributed object storage technology not only improves data read/write performance and concurrency but also enables dynamic expansion and flexible management of data, meeting the needs of different application scenarios.
OPPO's object HDD storage adopts the classic distributed object architecture. The metadata server stores the relevant index information of objects and slices, while the servers in the HDD storage pool store the slice information of objects. The client obtains index information from the metadata server through the S3 protocol and then retrieves the specific data of the object from the slice storage server.
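The two-step read path described above (index lookup on the metadata server, then slice retrieval from the storage servers) can be sketched with in-memory stand-ins. All class and function names here are hypothetical, the round-robin slice placement is an assumption, and the S3 protocol layer is omitted entirely.

```python
from dataclasses import dataclass


@dataclass
class SliceLocation:
    server: str    # name of the slice storage server holding the slice
    slice_id: str  # key of the slice on that server


class MetadataServer:
    """Holds the index: object key -> ordered list of slice locations."""

    def __init__(self) -> None:
        self.index: dict[str, list[SliceLocation]] = {}

    def lookup(self, key: str) -> list[SliceLocation]:
        return self.index[key]


class SliceServer:
    """Stands in for one server in the HDD storage pool."""

    def __init__(self) -> None:
        self.slices: dict[str, bytes] = {}

    def read(self, slice_id: str) -> bytes:
        return self.slices[slice_id]


def put_object(meta, servers, key, data, slice_size):
    """Cut data into slices, spread them round-robin, and record the index."""
    names = sorted(servers)
    locations = []
    for offset in range(0, len(data), slice_size):
        n = offset // slice_size
        server = names[n % len(names)]
        slice_id = f"{key}#{n}"
        servers[server].slices[slice_id] = data[offset:offset + slice_size]
        locations.append(SliceLocation(server, slice_id))
    meta.index[key] = locations


def get_object(meta, servers, key):
    """Step 1: fetch the index. Step 2: read each slice and reassemble."""
    locations = meta.lookup(key)
    return b"".join(servers[loc.server].read(loc.slice_id) for loc in locations)
```

Keeping the index separate from the slice data is what lets the storage pool grow by adding slice servers without touching existing objects.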
Due to the enormous amount of data and the large scale of the distributed object storage cluster, OPPO needs to purchase a significant number of servers each year to build new storage space or replace out-of-warranty servers. OPPO realized that there was room to reduce its IT spending.
To reduce the total cost of ownership (TCO) of object storage and optimize the storage service for massive data, OPPO introduced tape storage. Tape storage is an efficient, robust, and secure way for organizations to handle long-term storage, active archiving, data protection, and recovery.
In distributed storage systems, tape storage is widely used for cold data. Rather than keeping all data on expensive, high-performance storage devices, organizations can move less frequently accessed data to tape. There are two types of tape media: LTO (Linear Tape-Open), which is an open tape format, and IBM's proprietary 3592 tape. Each has its own corresponding type of tape drive.
Dr. Shi Zemin, Deputy Head of the Recording Media Division of Fujifilm (China) Investment Co., Ltd., said, “Tape storage has the following advantages:
Cost-Effective: Tape storage is a highly economical option. By offloading cold data from high-performance storage to tape, organizations can take pressure off expensive storage and reduce overall storage costs. The 2022 study Improving Information Technology Sustainability with Modern Tape Storage [1] compared the cost of archiving on hard disk and on tape, assessing the impact of the storage medium when 100 petabytes of information must be stored for 10 years. The total cost of ownership (TCO) is 78% lower when cold data on HDDs is archived to tape than with solutions that use HDDs only.
Reliability: Due to its low bit error rate, error correction capabilities, and redundancy features, tape is more reliable than other media. In a distributed storage system, archiving data on tape increases data redundancy and improves data reliability.
Scalability: Tape technology is easily scalable due to its ability to seamlessly add additional capacity with new media. By integrating tape storage into a tiered storage architecture, organizations can easily expand the storage capacity to meet the growing needs of data storage.
Green Energy: According to the paper Improving Information Technology Sustainability with Modern Tape Storage (2022), if 100 petabytes of information must be stored for 10 years, a deep archive solution that keeps 100% of the data on hard disk drives generates 2,663 tons of CO2e over the ten years. A deep archive solution that keeps all the data on tape generates only 79 tons of CO2e, a 97% reduction. Converting carbon emissions to electricity consumption at 0.997 kg of CO2 per kWh, this is equivalent to saving 2.59 million kWh.”
Dr. Shi Zemin, General Manager of the Recording Media Division, Fujifilm (China) Investment Co., Ltd.
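The emissions figures quoted above can be checked with a few lines of arithmetic. This sketch uses only the numbers given in the text (2,663 and 79 tons of CO2e, and the factor of 0.997 kg of CO2 per kWh); the variable names are illustrative.

```python
HDD_TONS_CO2E = 2663    # all data on hard disk, 100 PB over 10 years
TAPE_TONS_CO2E = 79     # all data on tape, same workload
KG_CO2_PER_KWH = 0.997  # conversion factor quoted in the text

saved_tons = HDD_TONS_CO2E - TAPE_TONS_CO2E
reduction_pct = saved_tons / HDD_TONS_CO2E * 100
saved_kwh = saved_tons * 1000 / KG_CO2_PER_KWH  # tons -> kg -> kWh

print(f"CO2e avoided: {saved_tons} t ({reduction_pct:.0f}% reduction)")
print(f"Equivalent electricity: {saved_kwh / 1e6:.2f} million kWh")
# CO2e avoided: 2584 t (97% reduction)
# Equivalent electricity: 2.59 million kWh
```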
The OPPO cold data storage project adopts the IBM tape nearline storage solution, which is based on the IBM tape library TS4500 and 3592 enterprise tape media.
Hou Miao, General Manager of IBM, China and General Manager of the Technology Department in China
Hou Miao, General Manager of IBM, China and General Manager of the Technology Department in China, said,
“The TS4500 tape library is an IBM enterprise tape library that can be configured with 18 frames, 128 3592 tape drives, and 17,550 3592-type tapes. If configured with 20TB JE type 3592 tapes, the entire tape library can store 351PB of data; if configured with 50TB JF type 3592 tapes, the entire tape library can store up to 877.5PB of data.
IBM's enterprise-level 3592 tape media combines IBM's tape drive technology with Fujifilm’s expertise in nanotechnology, nano-coating, and nano-dispersion, accumulated over decades of film manufacturing. This original technology gives the 3592 tape a higher storage density, with the JF series reaching up to 50TB per cartridge. The 3592 tape is also more stable and reliable, meeting data storage requirements in data center environments, and it performs better, especially in random access to small files. Overall, the 3592 tape has six sustainable advantages compared to other types of media: low cost, low energy consumption, high security, high reliability, large capacity, and energy-saving environmental sustainability.”
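The library capacities quoted above follow directly from the slot count and the cartridge sizes. A minimal check, assuming decimal units (1 PB = 1,000 TB) and using illustrative names:

```python
SLOTS = 17_550                       # 3592 cartridges in a full TS4500
CARTRIDGE_TB = {"JE": 20, "JF": 50}  # capacity per cartridge type, in TB


def library_capacity_pb(cartridge_type: str, slots: int = SLOTS) -> float:
    """Raw capacity of a fully populated library, in petabytes."""
    return slots * CARTRIDGE_TB[cartridge_type] / 1000


print(library_capacity_pb("JE"))  # 351.0
print(library_capacity_pb("JF"))  # 877.5
```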
Joint research on distributed tape storage architecture
Rao Youqing, IBM Hyperscale System Solutions Chief Architect
The OPPO cold data storage project was designed jointly by OPPO's core technology development and architecture team, including He Xiaochun, Tang Hu, Hou Jingpeng, and Wu HuoChen, together with Liang Xiao, sales manager for IBM hyperscale system solutions, and chief architect Rao Youqing. The project integrates OPPO object storage with IBM tape nearline storage technology to build an end-to-end distributed tape storage architecture. OPPO's warm and hot data is stored in the HDD storage pool, while cold data is stored in a tape storage pool. OPPO sorts data by ‘temperature’: once data is cold enough, the system archives it to tape, and when access is requested, the system recalls the data from tape within minutes. The tape-based storage pool uses erasure coding to ensure high reliability and high availability of the data on tape while improving tape capacity utilization.
OPPO cold data storage architecture diagram
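The article does not say which erasure code or parameters the tape pool uses. As an illustration of the principle only, the sketch below uses the simplest possible scheme, k data slices plus one XOR parity slice, which can rebuild any single lost slice. Production systems typically use codes such as Reed-Solomon that tolerate several simultaneous losses. The function names are hypothetical.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal slices (zero-padded) plus one XOR parity slice."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    slices = [padded[i * size:(i + 1) * size] for i in range(k)]
    parity = reduce(xor_bytes, slices)
    return slices + [parity]


def recover(slices: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing slice (marked None) from the survivors."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("single-parity XOR can rebuild only one lost slice")
    repaired = list(slices)
    if missing:
        survivors = [s for s in slices if s is not None]
        # XOR of all surviving slices equals the missing one.
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired
```

With k = 4, the parity adds 25% to the stored volume, compared with 100% for keeping a full second copy. This is the sense in which erasure coding improves capacity utilization relative to replication.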
Monitoring of the distributed tape storage system shows that the average throughput of a single tape node is about 440MB/s and that the read/write performance of the cache file system is around 1GB/s. Utilization of the local cache file system stays within 70%, which meets OPPO’s archive requirements for cold data.
OPPO's object storage cold data archiving system has been in production for over half a year and is running stably. While building the system, OPPO found that although the files on the sliced data nodes and the erasure-coded data nodes were the same size before compression, they occupied different amounts of tape after compression. OPPO has solved this issue and achieved better allocation of storage volume.
The Enterprise Tape Nearline Storage Solution, which is based on OPPO object storage, IBM Spectrum Scale/Archive EE, the IBM TS4500 tape library, and Fujifilm tape technology, integrates disk storage and tape storage, manages a unified enterprise-level object namespace, and moves data seamlessly among different storage pools. The tape nearline storage system enhances IT support for the business and effectively reduces the cost of data storage. Compared to traditional distributed disk storage, it offers low cost, low power consumption, and high storage density, supports greener data centers, and, to some extent, reduces the cost of archiving low-frequency data.
Looking ahead, OPPO will continue to explore long-term preservation technologies for massive cold data, diversify its storage services, and reduce storage costs. At the same time, it aims to reduce energy consumption in data centers and provide more economical, more secure, and more reliable data storage for all kinds of businesses.