Leaderboard · Updated May 24, 2026 · 4 min read

Cheapest GPU Instances 2026 (AWS, GCP, Azure)

TL;DR

For inference, AWS g5.xlarge (A10G) is the cheapest per inference token. For training, GCP a2-highgpu-1g (A100) leads on $/training-token. Azure's T4-based NC4as_T4_v3 wins for low-cost batch inference.

Equivalent SKUs · monthly cost

us-east-1 / us-central1 · Linux · on-demand

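| Workload | AWS SKU | $/mo | Alternative SKU | $/mo | Winner |
|---|---|---|---|---|---|
| Inference (small batch) | g5.xlarge | $734 | GCP a2-highgpu-1g | $2,145 | AWS, 66% cheaper |
| Batch inference | g5.xlarge | $734 | Azure NC4as_T4_v3 | $384 | Azure, 48% cheaper |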
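
The winner percentages appear to be the relative saving of the cheaper option against the pricier one. A minimal Python sketch that reproduces them from the monthly prices in the table; the helper name `savings_pct` is illustrative, and the 730-hour month used to back out an hourly rate is an assumption rather than something the table states.

```python
def savings_pct(cheaper: float, pricier: float) -> float:
    """Relative saving of the cheaper option versus the pricier one, in percent."""
    return (pricier - cheaper) / pricier * 100


# Monthly on-demand prices from the table (USD).
g5_xlarge = 734
a2_highgpu_1g = 2145
nc4as_t4_v3 = 384

small_batch = savings_pct(g5_xlarge, a2_highgpu_1g)  # g5.xlarge vs a2-highgpu-1g
batch = savings_pct(nc4as_t4_v3, g5_xlarge)          # NC4as_T4_v3 vs g5.xlarge

print(f"Small-batch inference: {small_batch:.0f}% cheaper")
print(f"Batch inference: {batch:.0f}% cheaper")

# Assumption: a 730-hour month, used to convert a monthly figure to an hourly rate.
HOURS_PER_MONTH = 730
hourly_g5 = g5_xlarge / HOURS_PER_MONTH
```

Swapping in your own monthly quotes gives the same comparison for other SKU pairs.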

Best for each workload

Cheapest A100 hour · GCP wins

GCP a2-highgpu-1g with A100 40GB is consistently 15–20% cheaper than equivalents elsewhere.

Cheapest T4 hour · Azure wins

Azure NC4as_T4_v3 has the lowest on-demand sticker price per T4 hour.

Best $/inference · AWS wins

g5.xlarge's A10G hits the sweet spot for small-batch inference workloads.
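
Cost per inference depends on throughput as well as price: it is the hourly rate divided by the tokens generated in an hour. The sketch below shows that arithmetic under stated assumptions. The hourly rate is derived from the $734 monthly figure using an assumed 730-hour month, and the throughput value is a hypothetical placeholder, not a benchmark result.

```python
def cost_per_million_tokens(hourly_usd: float, tokens_per_second: float) -> float:
    """USD needed to generate one million tokens at a sustained throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_usd / tokens_per_hour * 1_000_000


# Assumption: $734/month spread over a 730-hour month for g5.xlarge.
g5_hourly = 734 / 730

# Hypothetical throughput; measure your own model, batch size, and precision.
example_tps = 500.0

cost = cost_per_million_tokens(g5_hourly, example_tps)
print(f"${cost:.2f} per 1M tokens")
```

Running the same function with measured throughput for each SKU is what decides the winner for a given model; a pricier GPU can still win if its throughput advantage is larger than its price premium.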


Frequently asked

Should I use Spot/Preemptible for GPU workloads?
For training: yes, with checkpointing. Savings are typically 70%. For real-time inference: no — eviction kills latency SLOs.
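
A rough way to sanity-check the Spot trade-off for training is to discount the on-demand price and then inflate it by the share of work redone after evictions. This is a sketch only: the 70% discount comes from the answer above, while the 10% rework fraction and the function name are hypothetical and should be replaced with your own eviction and checkpoint-interval data.

```python
def effective_spot_cost(on_demand_usd: float, discount: float, rework_fraction: float) -> float:
    """Spot cost after accounting for compute repeated since the last checkpoint.

    discount:        fraction saved versus on-demand (0.70 means 70% off)
    rework_fraction: extra compute redone after evictions, as a fraction of the job
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    if rework_fraction < 0:
        raise ValueError("rework_fraction must be non-negative")
    return on_demand_usd * (1 - discount) * (1 + rework_fraction)


# $2,145/month is the a2-highgpu-1g on-demand figure quoted in this article.
on_demand = 2145
# Hypothetical: 10% of the work is repeated because of evictions.
spot = effective_spot_cost(on_demand, discount=0.70, rework_fraction=0.10)
print(f"Effective Spot cost: ${spot:,.2f} vs ${on_demand:,.2f} on-demand")
```

Under this simple model, Spot stays cheaper as long as `(1 - discount) * (1 + rework_fraction)` is below 1, which is why frequent checkpointing matters more than the exact discount.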
What about Lambda Labs, RunPod, and CoreWeave?
Specialized GPU clouds often beat hyperscaler pricing by 30–50% for raw GPU hours. The trade-off is fewer integrations (no managed databases, no IAM, less ops tooling).
