Leaderboard · Updated May 24, 2026 · 4 min read

Cheapest GPU Instances 2026 (AWS, GCP, Azure)

TL;DR

For inference, AWS g5.xlarge (A10G) is the cheapest per inference token. For training, GCP a2-highgpu-1g (A100) leads on $/training-token. Azure's T4-based NC4as_T4_v3 wins for low-cost batch inference.

Equivalent SKUs · monthly cost

us-east-1 / us-central1 · linux · on-demand

Workload	AWS SKU	$/mo	Alternative SKU	$/mo	Winner
Inference (small batch)	AWS g5.xlarge	$734	GCP a2-highgpu-1g	$2,145	AWS −66%
Batch inference	AWS g5.xlarge	$734	Azure NC4as_T4_v3	$384	Azure −48%
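
The Winner percentages can be reproduced from the two monthly figures in each row. Here is a minimal Python sketch, assuming the percentage is the cheaper option's saving relative to the pricier one. The helper names `savings_pct` and `winner` are illustrative and not part of any tool mentioned on this page.

```python
def savings_pct(cheaper: float, pricier: float) -> float:
    """Percent saved by choosing the cheaper option over the pricier one."""
    if pricier <= 0:
        raise ValueError("pricier must be positive")
    return (1 - cheaper / pricier) * 100


# Monthly on-demand figures from the table (USD).
# NC4as_T4_v3 is an Azure SKU, so it is labeled as Azure here.
rows = {
    "Inference (small batch)": {"AWS g5.xlarge": 734, "GCP a2-highgpu-1g": 2145},
    "Batch inference": {"AWS g5.xlarge": 734, "Azure NC4as_T4_v3": 384},
}


def winner(options: dict) -> tuple:
    """Return (cheapest SKU, rounded percent saved vs. the other option)."""
    (low_name, low), (_, high) = sorted(options.items(), key=lambda kv: kv[1])
    return low_name, round(savings_pct(low, high))


for workload, options in rows.items():
    name, pct = winner(options)
    print(f"{workload}: {name} wins, -{pct}%")
```

To compare a different pair of SKUs, swap the monthly figures in `rows` for your own quotes. The calculation assumes exactly two options per row.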

Best for each workload

Cheapest A100 hour · GCP wins

GCP a2-highgpu-1g with A100 40GB is consistently 15–20% cheaper than equivalents elsewhere.

Cheapest T4 hour · Azure wins

Azure NC4as_T4_v3 has the lowest list price per T4 hour.

Best $/inference · AWS wins

g5.xlarge's A10G hits the sweet spot for small-batch inference workloads.

Frequently asked

Should I use Spot/Preemptible for GPU workloads?

For training: yes, with checkpointing. Savings are typically around 70%. For real-time inference: no, because evictions break latency SLOs.
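
Checkpointing is what lets a training job survive a Spot or Preemptible eviction. The following is a minimal sketch in plain Python, not any provider's or framework's API. The function names, the JSON state format, and the `stop_at` parameter (used only to simulate an eviction) are all assumptions for illustration. A real job would save model and optimizer state to durable storage instead of a step counter.

```python
import json
import os
import tempfile


def save_checkpoint(path: str, state: dict) -> None:
    """Write state atomically so an eviction mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)  # atomic rename within the same directory


def load_checkpoint(path: str) -> dict:
    """Return the last saved state, or a fresh one if none exists."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0}


def train(path: str, total_steps: int, every: int = 100, stop_at=None) -> int:
    """Run or resume a placeholder training loop; stop_at simulates eviction."""
    step = load_checkpoint(path)["step"]
    while step < total_steps:
        if stop_at is not None and step >= stop_at:
            return step  # simulated eviction: the process would die here
        step += 1  # stand-in for one real training step
        if step % every == 0 or step == total_steps:
            save_checkpoint(path, {"step": step})
    return step
```

After an eviction, calling `train` again with the same path resumes from the last saved step, so at most `every` steps of work are lost. A smaller `every` loses less work but spends more time writing checkpoints.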

What about Lambda Labs, RunPod, and CoreWeave?

Specialized GPU clouds often beat hyperscaler pricing by 30–50% for raw GPU hours. The trade-off is fewer integrations (no managed databases, no IAM, less ops tooling).

Run your own comparison

Plug in your exact vCPU / RAM / region. No signup.
