Market Research Report
Product Code: 2043020

Crossing AI Memory Wall: Storage Layer Reallocation and HBF Analysis

Publication Date: | Publisher: TrendForce | Language: English | 13 Pages | Delivery: within 1-2 business days at the earliest

Product Code: TRi-182

In AI inference, MoE architectures and long-context processing have sharply increased the memory capacity required for model weights and the KV cache, shifting the bottleneck from insufficient compute to limited memory capacity. Rapid growth in warm data is expected to drive a restructuring of the storage hierarchy, in which HBM (High Bandwidth Memory) handles hot data and HBF (High Bandwidth Flash) carries warm data to optimize cost-performance. However, commercializing HBF still requires overcoming challenges in advanced packaging processes and the inherent characteristics of NAND flash.
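
To make the capacity pressure concrete, the following is a minimal back-of-the-envelope sketch of how KV cache size scales with context length. It assumes a standard attention layout that stores one key tensor and one value tensor per layer, with no cache compression. The function name and every model dimension below are hypothetical illustrations and are not figures from the report.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    Assumes each layer stores two tensors (keys and values), each shaped
    [batch_size, num_kv_heads, seq_len, head_dim], at bytes_per_elem bytes
    per element (2 for FP16/BF16).
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


if __name__ == "__main__":
    # Hypothetical model dimensions, chosen only for illustration.
    config = dict(num_layers=32, num_kv_heads=8, head_dim=128)

    for seq_len in (4096, 32768, 131072):
        gib = kv_cache_bytes(seq_len=seq_len, **config) / 2**30
        print(f"context {seq_len:>7} tokens -> {gib:.1f} GiB of KV cache per sequence")
```

Under these assumptions the cache grows linearly with context length and with the number of concurrent sequences, so long-context serving multiplies the capacity needed on top of the model weights.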

Key Highlights

  • Bottleneck: AI advancements have shifted the bottleneck from compute power to memory capacity.
  • Hierarchy: Surging warm data demands tiered storage, with HBM for hot data and HBF for warm data, to maximize cost-efficiency.
  • HBF Hurdles: Commercialization requires overcoming challenges in advanced packaging and the limitations of NAND flash.

Table of Contents

1. Development Bottlenecks of LLM: Impact on Computing Structures by Transformation of Model Architectures

  • Figure 1: Features of MoE
  • Figure 2: Deployment Strategies among AI Storage Vendors

2. From Computing Bottlenecks to Restructuring of Storage Layers

  • Figure 3: Hot, Warm, and Cold Architectures of Storage Layers
  • Figure 4: “H³” Architecture
  • Table 1: Comparison between HBM and HBF

3. TRI’s View