#deep_learning_recommender_system #memory_disaggregation #total_cost_of_ownership #RDMA

DisaggRec: Architecting disaggregated systems for large-scale personalized recommendation

Meta Info

Presented in arXiv:2212.00939.

Authors: Liu Ke (Meta AI & UWash), Xuan Zhang (UWash), Benjamin Lee (Meta AI & UPenn), G. Edward Suh (Meta AI & Cornell), Hsien-Hsin S. Lee (Meta AI & Intel).

Understanding the paper

TL;DRs

This paper presents DisaggRec, a disaggregated system for large-scale recommendation serving that decouples compute and memory resources.

Terminology

- Node
  - Compute nodes (CNs): supply high-performance processors but only a limited amount of memory.
  - Memory nodes (MNs): supply high-capacity DRAM devices.
- Strategy
  - Scale-up: equip a single server with sufficient resources to serve end-to-end model inference.
  - Scale-out: the model's SparseNet is sharded and distributed across multiple servers when the embedding tables cannot fit into a single server's memory.
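
The difference between the two strategies can be illustrated with a minimal sketch. It assumes a first-fit-decreasing placement of embedding tables onto servers, which is a generic bin-packing heuristic and not necessarily the sharding policy used in the paper. The function names, table names, and sizes are all hypothetical.

```python
from typing import Dict, List


def shard_tables(table_sizes_gb: Dict[str, float],
                 server_memory_gb: float) -> List[List[str]]:
    """Place embedding tables onto servers with first-fit-decreasing.

    Hypothetical helper: returns one list of table names per server.
    """
    shards: List[List[str]] = []   # table names held by each server
    free: List[float] = []         # remaining memory on each server
    # Place the largest tables first so they are not stranded later.
    for name, size in sorted(table_sizes_gb.items(),
                             key=lambda kv: kv[1], reverse=True):
        if size > server_memory_gb:
            raise ValueError(f"table {name!r} exceeds one server's memory")
        for i, remaining in enumerate(free):
            if size <= remaining:
                shards[i].append(name)
                free[i] -= size
                break
        else:
            # No existing server has room: add another server.
            shards.append([name])
            free.append(server_memory_gb - size)
    return shards


def serving_strategy(table_sizes_gb: Dict[str, float],
                     server_memory_gb: float) -> str:
    """Scale-up if every table fits on one server, otherwise scale-out."""
    n_servers = len(shard_tables(table_sizes_gb, server_memory_gb))
    return "scale-up" if n_servers <= 1 else "scale-out"


# Made-up table sizes in GB.
tables = {"user_id": 40.0, "item_id": 30.0, "category": 20.0}
print(serving_strategy(tables, server_memory_gb=128.0))
print(serving_strategy(tables, server_memory_gb=64.0))
```

With 128 GB per server all three tables fit on one machine, so the model can be served scale-up. With 64 GB the tables have to be split across two servers, which is the scale-out case.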

Problems

Monolithic servers provision computing and memory in fixed proportions, leading to idle resources and wasted costs.
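
A small worked calculation shows where the idle resources come from. This is a sketch under assumed numbers: the server shape (32 cores, 256 GB) and the workload demand are hypothetical and are not taken from the paper.

```python
import math
from typing import Tuple


def monolithic_provisioning(compute_demand_cores: float,
                            memory_demand_gb: float,
                            cores_per_server: int,
                            mem_per_server_gb: float) -> Tuple[int, float, float]:
    """Return (servers needed, idle compute fraction, idle memory fraction).

    With a fixed compute-to-memory ratio per server, the scarcer resource
    dictates the server count and the other resource sits partly idle.
    """
    n = max(math.ceil(compute_demand_cores / cores_per_server),
            math.ceil(memory_demand_gb / mem_per_server_gb))
    idle_compute = 1.0 - compute_demand_cores / (n * cores_per_server)
    idle_memory = 1.0 - memory_demand_gb / (n * mem_per_server_gb)
    return n, idle_compute, idle_memory


# Hypothetical memory-heavy workload: 64 cores of compute, 2 TB of embeddings.
n, idle_cpu, idle_mem = monolithic_provisioning(64, 2048, 32, 256)
print(n, idle_cpu, idle_mem)  # 8 servers, 0.75 idle compute, 0.0 idle memory
```

In this example memory forces eight servers even though two would cover the compute demand, so three quarters of the purchased cores go unused. Provisioning CNs and MNs independently is meant to avoid this kind of stranding.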

Technical Details

Co-optimize the partitioning strategies for recommendation models and design strategies for disaggregated CNs and MNs.
Minimize the cost subject to latency targets and availability requirements.
Focus on two industry-grade models: a memory-intensive RM1 and a compute-intensive RM2.
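
The cost objective can be read as a constrained search over the number of CNs and MNs. The sketch below is a brute-force illustration only. The latency model, node costs, and capacities are hypothetical placeholders, availability requirements are not modeled, and the paper's actual optimization procedure may differ.

```python
from itertools import product
from typing import Callable, Optional, Tuple


def cheapest_config(max_cns: int, max_mns: int,
                    cn_cost: float, mn_cost: float,
                    mn_capacity_gb: float, model_size_gb: float,
                    latency_ms: Callable[[int, int], float],
                    latency_target_ms: float
                    ) -> Optional[Tuple[int, int, float]]:
    """Return the cheapest (num_cns, num_mns, cost) meeting both constraints.

    Returns None when no configuration in the search range is feasible.
    """
    best: Optional[Tuple[int, int, float]] = None
    for n_cn, n_mn in product(range(1, max_cns + 1), range(1, max_mns + 1)):
        if n_mn * mn_capacity_gb < model_size_gb:
            continue  # embedding tables do not fit in the MN pool
        if latency_ms(n_cn, n_mn) > latency_target_ms:
            continue  # misses the latency target
        cost = n_cn * cn_cost + n_mn * mn_cost
        if best is None or cost < best[2]:
            best = (n_cn, n_mn, cost)
    return best


def toy_latency(n_cn: int, n_mn: int) -> float:
    # Toy model (assumption): compute and embedding lookup time each
    # shrink in proportion to the number of nodes serving them.
    return 100.0 / n_cn + 20.0 / n_mn


print(cheapest_config(max_cns=8, max_mns=8, cn_cost=3.0, mn_cost=1.0,
                      mn_capacity_gb=256.0, model_size_gb=1000.0,
                      latency_ms=toy_latency, latency_target_ms=40.0))
```

Under these made-up numbers the search settles on three CNs and four MNs: four MNs is the minimum that holds the model, and three CNs is the minimum that brings the toy latency under the target.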

System Architecture

Disaggregated System Architecture.

Model Serving

RPC-based Model Serving.
