The H200 targets enterprises and organizations that require large GPU memory capacity and high memory bandwidth for AI workloads. Typical adopters include cloud service providers, enterprises deploying private AI infrastructure, and research institutions scaling large language models without transitioning immediately to newer GPU architectures.
The H200 is available in production through NVIDIA's data center ecosystem and OEM partners. It is commonly deployed in multi-GPU configurations within HGX and DGX platforms, as well as in PCIe-based enterprise servers using the H200 NVL variant. Availability and pricing vary by configuration, system vendor, and regional supply.
The NVIDIA H200 is a Hopper-generation data center GPU designed to improve large-model training and high-throughput inference by expanding memory capacity and bandwidth versus H100. It pairs Hopper Tensor Cores with HBM3e to reduce memory bottlenecks that commonly limit transformer workloads, especially at long context lengths and higher batch sizes.
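As a rough illustration of why capacity matters at long context, the sketch below estimates the KV-cache footprint of a decoder-only transformer during generation. The model dimensions are hypothetical placeholders for a 70B-class model with grouped-query attention, not measurements of any specific system.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Estimate KV-cache size: two tensors (K and V) per layer, each of
    shape [batch, n_kv_heads, seq_len, head_dim], stored in 16-bit."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Hypothetical 70B-class model (80 layers, 8 KV heads, head_dim 128).
weights_gb = 70e9 * 2 / 1e9   # BF16 weights: ~140 GB total
cache_gb = kv_cache_bytes(n_layers=80, n_kv_heads=8, head_dim=128,
                          seq_len=32_768, batch=8) / 1e9

print(f"weights ~{weights_gb:.0f} GB, KV cache ~{cache_gb:.0f} GB")
# An 8-GPU HGX H200 node offers 8 x 141 GB = 1128 GB of HBM3e versus
# 8 x 80 GB = 640 GB on HGX H100, leaving far more room for cache growth,
# which scales linearly in both context length and batch size.
```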
H200 is deployed as a GPU module in SXM-based multi-GPU platforms and as a PCIe accelerator (H200 NVL) for air-cooled enterprise server designs. It commonly underpins 4- or 8-GPU baseboards and turnkey systems used in AI factories, hyperscale training clusters, and enterprise inference deployments.
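A minimal check, assuming PyTorch with CUDA support is installed on the node, that all GPUs on a baseboard are visible and peer-accessible; it uses only standard torch.cuda calls and works on any multi-GPU system, not just H200.

```python
import torch

assert torch.cuda.is_available(), "CUDA driver or runtime not found"

n = torch.cuda.device_count()
for i in range(n):
    props = torch.cuda.get_device_properties(i)
    # total_memory is reported in bytes; an H200 shows roughly 141 GB.
    print(f"GPU {i}: {props.name}, {props.total_memory / 1e9:.0f} GB")

# Peer access between GPU pairs indicates a direct path (NVLink or PCIe P2P).
for i in range(n):
    for j in range(n):
        if i != j and not torch.cuda.can_device_access_peer(i, j):
            print(f"warning: no peer access between GPU {i} and GPU {j}")
```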
| Specification | H200 GPU |
|---|---|
| Architecture | NVIDIA Hopper |
| Memory | 141 GB HBM3e |
| Memory Bandwidth | ~4.8 TB/s |
| Interconnect | NVLink 4 (multi-GPU) |
| Form Factor | SXM, PCIe (H200 NVL) |
| Max TDP | Up to 700 W (SXM), up to 600 W (H200 NVL) |
| Precision Support | FP64, TF32, FP16, BF16, FP8, INT8 |
| Peak AI Compute (FP8) | ~4 PFLOPS with sparsity (~2 PFLOPS dense) |
| Process Node | TSMC 4N |
| Transistor Count | ~80 billion |
| MIG Support | Yes (up to 7 instances) |
| NVLink (peer) | ~900 GB/s bidirectional aggregate |
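The precision row above lists FP8 among the supported formats. Below is a minimal sketch of exercising Hopper's FP8 Tensor Cores through NVIDIA Transformer Engine; it assumes the transformer-engine package is installed, and the layer and batch sizes are arbitrary.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Delayed-scaling recipe; HYBRID uses E4M3 for forward, E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True).cuda()
x = torch.randn(16, 4096, device="cuda", dtype=torch.bfloat16)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)  # the GEMM runs on FP8 Tensor Cores

y.sum().backward()  # gradients flow through the FP8 region as usual
```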
Compared to H100, the H200 delivers higher effective throughput on modern transformer workloads primarily due to increased memory capacity and bandwidth, while maintaining Hopper-generation compute efficiency and software compatibility.
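To make the bandwidth effect concrete, here is a back-of-envelope roofline estimate for memory-bound, batch-1 decoding, where each generated token must stream the full model weights from HBM. The figures are peak spec-sheet numbers and a hypothetical 13B-parameter FP16 model, so real throughput lands well below these ceilings.

```python
H100_BW = 3.35e12   # H100 SXM peak HBM3 bandwidth, bytes/s
H200_BW = 4.8e12    # H200 peak HBM3e bandwidth, bytes/s

model_bytes = 13e9 * 2  # hypothetical 13B params in FP16

# Memory-bound decode ceiling: tokens/s <= bandwidth / bytes read per token.
for name, bw in [("H100", H100_BW), ("H200", H200_BW)]:
    print(f"{name}: <= {bw / model_bytes:.0f} tokens/s (upper bound)")

print(f"bandwidth ratio: {H200_BW / H100_BW:.2f}x")  # ~1.43x higher ceiling
```

The ~1.43x bandwidth ratio sets the best-case uplift for purely bandwidth-bound kernels; compute-bound phases such as prefill see less benefit.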
Comparable NVIDIA GPUs:
- H100: same Hopper architecture and software stack, with 80 GB HBM3 and ~3.35 TB/s of bandwidth
- GH200: Grace Hopper superchip pairing a Hopper GPU with a Grace CPU over NVLink-C2C
- B200: Blackwell-generation successor with higher compute and memory bandwidth

Competitor GPUs:
- AMD Instinct MI300X: 192 GB HBM3, positioned against H100/H200 for training and inference
- Intel Gaudi 3: 128 GB HBM2e AI accelerator targeting training and inference
For organizations already standardized on Hopper systems, H200 provides a straightforward upgrade path from H100. Buyers planning for longer-term scaling or frontier model training may evaluate B200 or alternative high-memory accelerators depending on availability and software requirements.
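When weighing that upgrade path, a simple capacity calculation often drives the decision. The sketch below counts how many GPUs are needed just to hold model weights at a given precision; it deliberately ignores KV cache, activations, and framework overhead, all of which raise the real requirement.

```python
import math

def gpus_for_weights(params, bytes_per_param, gpu_mem_gb, usable_frac=0.9):
    """Minimum GPU count to shard the weights, reserving 10% headroom."""
    usable = gpu_mem_gb * 1e9 * usable_frac
    return math.ceil(params * bytes_per_param / usable)

# Hypothetical 405B-parameter model served in FP8 (1 byte per parameter).
for name, mem in [("H100 (80 GB)", 80), ("H200 (141 GB)", 141)]:
    print(f"{name}: at least {gpus_for_weights(405e9, 1, mem)} GPUs for weights alone")
```

On these assumptions the weights alone need at least 6 H100s but only 4 H200s, which can be the difference between one node and two.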
In summary, the H200 extends the Hopper platform with HBM3e to relieve the memory capacity and bandwidth constraints of modern transformer workloads, deploys into existing HGX, DGX, and H200 NVL systems with full Hopper software compatibility, and serves as a practical upgrade path for organizations scaling AI infrastructure ahead of a transition to newer-generation architectures.