0G Compute Inference

0G Compute Network provides decentralized AI inference services. It supports several model types, including large language models (LLMs), text-to-image generation, and speech-to-text transcription.

Prerequisites

  • Node.js >= 22.0.0
  • A wallet with 0G tokens (either testnet or mainnet)
  • An EVM-compatible wallet (for the Web UI)

Supported Service Types

  • Chatbot Services: Conversational AI with models like GPT, DeepSeek, and others
  • Text-to-Image: Generate images from text descriptions using Stable Diffusion and similar models
  • Speech-to-Text: Transcribe audio to text using Whisper and other speech recognition models

Available Services

Testnet Services
View Testnet Services (2 Available)
#  Model                 Type        Provider     Input (per 1M tokens)  Output (per 1M tokens)
1  qwen-2.5-7b-instruct  Chatbot     0xa48f01...  0.05 0G                0.10 0G
2  qwen-image-edit-2511  Image-Edit  0x4b2a9...   -                      0.005 0G/image

Available Models by Type:

Chatbots (1 model):

  • Qwen 2.5 7B Instruct: Fast and efficient conversational model

Image-Edit (1 model):

  • Qwen Image Edit 2511: Advanced image editing and manipulation model

All testnet services feature TeeML verifiability and are ideal for development and testing.

Mainnet Services
View Mainnet Services (6 Available)
#  Model                      Type            Provider     Input (per 1M tokens)  Output (per 1M tokens)
1  GLM-5-FP8                  Chatbot         0xd9966e...  1 0G                   3.2 0G
2  deepseek-chat-v3-0324      Chatbot         0x1B3AAe...  0.30 0G                1.00 0G
3  gpt-oss-120b               Chatbot         0xBB3f5b...  0.10 0G                0.49 0G
4  qwen3-vl-30b-a3b-instruct  Chatbot         0x4415ef...  0.49 0G                0.49 0G
5  whisper-large-v3           Speech-to-Text  0x36aCff...  0.05 0G                0.11 0G
6  z-image                    Text-to-Image   0xE29a72...  -                      0.003 0G/image

Available Models by Type:

Chatbots (4 models):

  • GLM-5-FP8: High-performance reasoning model (FP8 quantized)
  • GPT-OSS-120B: Large-scale open-source GPT model
  • Qwen3 VL 30B A3B Instruct: Efficient conversational model (text-only; image input is not yet supported)
  • DeepSeek Chat V3: Optimized conversational model

Speech-to-Text (1 model):

  • Whisper Large V3: OpenAI's state-of-the-art transcription model

Text-to-Image (1 model):

  • Z-Image: Fast high-quality image generation

All mainnet services feature TeeML verifiability for trusted execution in production environments.
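
The per-1M-token prices in the mainnet table can be turned into a rough per-request estimate. The sketch below is illustrative only: `MAINNET_CHAT_PRICING` and `estimateCost` are hypothetical names, not part of the 0G SDK, and the prices are copied by hand from the table, so they may drift from what providers actually charge.

```typescript
// Prices in 0G per 1M tokens, copied from the mainnet table (may be outdated).
interface TokenPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

const MAINNET_CHAT_PRICING: Record<string, TokenPricing> = {
  "GLM-5-FP8": { inputPerMillion: 1, outputPerMillion: 3.2 },
  "deepseek-chat-v3-0324": { inputPerMillion: 0.3, outputPerMillion: 1.0 },
  "gpt-oss-120b": { inputPerMillion: 0.1, outputPerMillion: 0.49 },
  "qwen3-vl-30b-a3b-instruct": { inputPerMillion: 0.49, outputPerMillion: 0.49 },
};

// Hypothetical helper: estimates the fee for a single request, in 0G.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const pricing = MAINNET_CHAT_PRICING[model];
  if (!pricing) throw new Error(`Unknown model: ${model}`);
  const total =
    inputTokens * pricing.inputPerMillion + outputTokens * pricing.outputPerMillion;
  return total / 1_000_000;
}

// 2,000 prompt tokens and 500 completion tokens on deepseek-chat-v3-0324:
// (2000 * 0.30 + 500 * 1.00) / 1,000,000 = 0.0011 0G
console.log(estimateCost("deepseek-chat-v3-0324", 2_000, 500).toFixed(6));
```

An estimate like this is only useful for budgeting; the amount actually settled is determined by the provider.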

Choose Your Interface

Feature           Web UI   CLI      SDK
Setup time        ~1 min   ~2 min   ~5 min
Interactive chat
Automation
App integration
Direct API access

Best for: Quick testing, experimentation and direct frontend integration.

Option 1: Use the Hosted Web UI

Visit the official 0G Compute Marketplace directly — no installation required:

https://compute-marketplace.0g.ai/inference

Option 2: Run Locally

Installation

pnpm add @0glabs/0g-serving-broker -g

Launch Web UI

0g-compute-cli ui start-web

Open http://localhost:3090 in your browser.

Getting Started

1. Connect & Fund

  1. Connect your wallet (MetaMask recommended)
  2. Deposit some 0G tokens using the account dashboard
  3. Browse available AI models and their pricing

2. Start Using AI Services

Option A: Chat Interface

  • Click "Chat" on any chatbot provider
  • Start conversations immediately
  • Perfect for testing and experimentation

Option B: Get API Integration

  • Click "Build" on any provider
  • Get step-by-step integration guides
  • Copy-paste ready code examples

Understanding Delayed Fee Settlement

How Fee Settlement Works

0G Compute Network uses delayed (batch) settlement for provider fees. This means:

  • Fees are not deducted immediately after each inference request. Instead, the provider accumulates usage fees and settles them on-chain in batches.
  • Your sub-account balance may appear to drop suddenly when a batch settlement occurs. For example, if you make 10 requests and the provider settles all at once, you'll see a single larger deduction rather than 10 small ones.
  • You are only charged for actual usage — no extra fees are deducted. The total amount settled always matches the sum of your individual request costs.
  • This is by design to reduce on-chain transaction costs and improve efficiency for both users and providers.

What this means in practice:

  • After making requests, your provider sub-account balance may temporarily appear higher than your "true" available balance
  • When settlement occurs, the balance updates to reflect all accumulated fees at once
  • If you see a sudden balance decrease, check your usage history — the total will match your actual usage

This behavior is visible in the Web UI (provider sub-account balances), CLI (get-account), and SDK (getAccount()).
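
Because the reported sub-account balance lags behind actual usage, a client may want to keep its own running total of unsettled fees. The following is a minimal sketch of that idea. `PendingFeeTracker` is a hypothetical local helper, not part of the 0G SDK, and it assumes you can estimate each request's fee yourself and that amounts are expressed in base units as `bigint`.

```typescript
// Hypothetical local tracker for fees that have been incurred but not yet settled on-chain.
class PendingFeeTracker {
  private pending: bigint = BigInt(0);

  // Call after each request with your own estimate of its fee (base units).
  recordRequest(fee: bigint): void {
    this.pending += fee;
  }

  // Call when you observe a batch settlement of `settled` base units.
  recordSettlement(settled: bigint): void {
    this.pending = settled >= this.pending ? BigInt(0) : this.pending - settled;
  }

  // Reported balance minus unsettled fees, floored at zero.
  effectiveBalance(reportedBalance: bigint): bigint {
    const remaining = reportedBalance - this.pending;
    return remaining > BigInt(0) ? remaining : BigInt(0);
  }
}

// Ten requests of 0.0011 0G each against a reported balance of 1 0G.
const feeTracker = new PendingFeeTracker();
for (let i = 0; i < 10; i++) {
  feeTracker.recordRequest(BigInt("1100000000000000"));
}
// Prints 989000000000000000 (0.989 0G), although the reported balance is still 1 0G.
console.log(feeTracker.effectiveBalance(BigInt("1000000000000000000")).toString());
```

Once the provider settles the batch, the reported balance drops by the same total and the two figures agree again.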

Rate Limits

Per-User Rate Limits

Each provider enforces per-user rate limits to ensure fair resource sharing across all users. The default limits are:

  • 30 requests per minute per user (sustained)
  • Burst allowance of 5 requests (short spikes allowed)
  • 5 concurrent requests per user

If you exceed these limits, the provider will return HTTP 429 Too Many Requests. Wait briefly and retry. These limits are set by individual providers and may vary.
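
One way to stay under these limits is to throttle on the client side. The sketch below uses a token bucket sized to the default figures above (30 requests per minute, burst of 5). How providers implement their limits is not specified here, so treat the token bucket as an assumed approximation; `TokenBucket` is a hypothetical helper, and it does not enforce the separate cap of 5 concurrent requests.

```typescript
// Hypothetical client-side throttle approximating the default provider limits.
class TokenBucket {
  private readonly ratePerMinute: number;
  private readonly burst: number;
  private readonly now: () => number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(ratePerMinute = 30, burst = 5, now: () => number = () => Date.now()) {
    this.ratePerMinute = ratePerMinute;
    this.burst = burst;
    this.now = now;
    this.tokens = burst; // start with a full burst allowance
    this.lastRefillMs = now();
  }

  // Returns 0 and consumes a token if a request may be sent now;
  // otherwise returns the number of milliseconds until a token is available.
  tryAcquire(): number {
    const t = this.now();
    const refill = ((t - this.lastRefillMs) * this.ratePerMinute) / 60_000;
    this.tokens = Math.min(this.burst, this.tokens + refill);
    this.lastRefillMs = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return 0;
    }
    return Math.ceil(((1 - this.tokens) * 60_000) / this.ratePerMinute);
  }
}
```

A caller would check `tryAcquire()` before each request, sleep for the returned number of milliseconds when it is non-zero, and then call `tryAcquire()` again. The injectable `now` parameter exists only to make the class testable with a fake clock.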

Troubleshooting

Common Issues

Error: Too many requests (429)

You are sending requests too quickly. Each provider enforces per-user rate limits (default: 30 requests/min, 5 concurrent).

  • Wait a few seconds and retry
  • Reduce request frequency — for batch workloads, add a delay between requests
  • Check concurrent requests — ensure you are not sending more than 5 simultaneous requests
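
The "wait and retry" advice can be automated with a small wrapper. This is a sketch under assumptions: `withRetryOn429` is a hypothetical helper, `send` stands for whatever function performs your request and resolves to an object with an HTTP `status`, and the attempt count and delays are illustrative defaults rather than values recommended by providers.

```typescript
// Hypothetical helper: retries a request with exponential backoff while it returns HTTP 429.
async function withRetryOn429<T extends { status: number }>(
  send: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 2_000,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    const res = await send();
    // Return on any non-429 status, or give up and return the last 429 response.
    if (res.status !== 429 || attempt >= maxAttempts) return res;
    // Delays grow as 2 s, 4 s, 8 s with the default base delay.
    await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
}
```

The wrapper works with the standard `fetch` API, since a `Response` has a numeric `status` field. The `sleep` parameter can be replaced in tests to avoid real delays.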

Error: Insufficient balance

Your provider sub-account doesn't have enough funds. Each provider requires a minimum locked balance of 1 0G to serve requests.

CLI:

# Deposit to the main account
0g-compute-cli deposit --amount 10
# Transfer to the provider sub-account
0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1

SDK:

// Deposit to main account
await broker.ledger.depositFund(10);
// Transfer to provider sub-account (minimum 1 0G recommended)
await broker.ledger.transferFund(providerAddress, 'inference', BigInt(1) * BigInt(10 ** 18));

Note: In Node.js, the SDK provides background auto-funding that periodically checks sub-account balances and tops up when insufficient. In browser environments, you must transfer funds manually.
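
The `transferFund` call above passes the amount in base units as a `bigint`, and the `BigInt(1) * BigInt(10 ** 18)` pattern only works for whole-number amounts because `BigInt(1.5)` throws. The following sketch shows one way to convert fractional amounts safely. `toBaseUnits` is a hypothetical helper, not an SDK function, and it assumes 18 decimals, as implied by the `10 ** 18` factor in the snippet.

```typescript
// Assumption: 1 0G = 10^18 base units, as implied by the SDK snippet.
const DECIMALS = 18;

// Hypothetical helper: parses a decimal 0G amount such as "1.5" into base units
// using string arithmetic, so no floating-point rounding is involved.
function toBaseUnits(amount: string): bigint {
  if (!/^\d+(\.\d+)?$/.test(amount)) {
    throw new Error(`Invalid amount: ${amount}`);
  }
  const [whole, fraction = ""] = amount.split(".");
  if (fraction.length > DECIMALS) {
    throw new Error(`At most ${DECIMALS} decimal places are supported`);
  }
  return BigInt(whole + fraction.padEnd(DECIMALS, "0"));
}

console.log(toBaseUnits("1").toString());   // 1000000000000000000
console.log(toBaseUnits("1.5").toString()); // 1500000000000000000

// Possible usage with the call shown earlier:
// await broker.ledger.transferFund(providerAddress, 'inference', toBaseUnits("1.5"));
```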

Error: Provider not acknowledged

You need to acknowledge the provider before using their service. The easiest way is to transfer funds, which auto-acknowledges:

CLI:

0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1

SDK:

// transferFund auto-acknowledges the provider's TEE signer
await broker.ledger.transferFund(providerAddress, 'inference', BigInt(1) * BigInt(10 ** 18));

Error: No funds in provider sub-account

Transfer funds to the specific provider sub-account:

0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1

Check your account balance:

0g-compute-cli get-account

Web UI not starting

If the web UI fails to start:

  1. Check whether another service is using port 3090. If so, start the UI on a different port:

0g-compute-cli ui start-web --port 3091

  2. Ensure the package was installed globally:

pnpm add @0glabs/0g-serving-broker -g

Next Steps


Questions? Join our Discord for support.