Inference

Run inference by connecting to individual 0G Compute providers via the @0gfoundation/0g-compute-ts-sdk SDK. You manage per-provider sub-accounts and sign every request with your wallet. For fine-tuning via the same SDK see Fine-tuning; for funding and sub-account management see Account.

Not sure which path to use?

0G Compute offers two ways to run inference:

  • Router (recommended for most applications) — a single OpenAI-compatible API endpoint with one unified balance, automatic provider failover, and an API key. Use this if you're building a server-side app, agent, or prototype.
  • Direct (this page) — connect to individual providers, manage per-provider sub-accounts, and sign requests with your wallet. Use this for browser dApps with wallet signing or when you need direct on-chain control.

Side-by-side comparison: Router vs Direct.

If your balance on pc.0g.ai looks empty

The default Router view on pc.0g.ai shows the Router balance, which is a separate on-chain pool from the per-provider sub-accounts described on this page. To see funds you've deposited on compute-marketplace.0g.ai (or through the CLI/SDK below), switch to Advanced mode using the top-right toggle on pc.0g.ai — it's the same Direct flow embedded in the new UI.

Prerequisites

  • Node.js >= 22.0.0
  • A wallet funded with 0G tokens (testnet or mainnet)
  • An EVM-compatible wallet (for the Web UI)

Supported Service Types

  • Chatbot Services: Conversational AI with models like GPT, DeepSeek, and others
  • Text-to-Image: Generate images from text descriptions using Stable Diffusion and similar models
  • Speech-to-Text: Transcribe audio to text using Whisper and other speech recognition models

Available Services

The provider and model catalog changes frequently (providers join and leave, pricing is set per-provider). This page does not reproduce the list — check a live source instead:

  • Web UI: pc.0g.ai (switch to Advanced mode, top-right) or compute-marketplace.0g.ai/inference. Both show the current provider catalog with pricing, health, and TEE attestation.
  • CLI: 0g-compute-cli inference list-providers
  • SDK: await broker.inference.listService()

Verification modes

Each service declares one of two TEE verification modes:

TeeML — The AI model runs directly inside a Trusted Execution Environment. The TEE guarantees that both the model and the computation are protected, and responses are signed by the TEE's private key. Used by self-hosted models.

TeeTLS — The Broker runs inside a TEE and proxies requests to a centralized LLM provider over HTTPS. This provides cryptographic proof that responses genuinely came from the real provider:

  • Authentic routing: During the TLS handshake, the Broker verifies the provider's certificate against trusted Certificate Authorities, ensuring the connection reaches the real provider — not an imposter.
  • Cryptographic proof: The Broker captures the provider's TLS certificate fingerprint and bundles it together with the request hash, response hash, and provider identity into a signed routing proof using its TEE-protected private key.
  • Privacy preservation: Since the Broker runs inside a TEE, it cannot inspect or tamper with user data in transit — 0G acts as a verifiable relay, not a middleman. This is conceptually similar to zkTLS but with stronger privacy properties, as the TEE ensures the relay itself is trustworthy.
  • End-to-end integrity: The TEE attestation proves the Broker is running unmodified code, the CA/TLS system guarantees only the real provider holds a valid certificate for their domain, and the TEE signature binds everything together — a verifier can confirm the proof came from a genuine TEE and that the fingerprint belongs to the expected provider.
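
To make the "signed routing proof" idea concrete, here is a minimal sketch of binding a request hash, response hash, certificate fingerprint, and provider identity under one signature. It is not the Broker's actual proof format: the field names, the newline-joined encoding, and the choice of Ed25519 are all assumptions made for illustration, and a locally generated key pair stands in for the TEE-protected key.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Hypothetical proof fields; the real Broker format may differ.
interface RoutingProofFields {
  requestHash: string;
  responseHash: string;
  certFingerprint: string;
  providerId: string;
}

const sha256Hex = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// Serialize the fields in a fixed order so signer and verifier agree on the bytes.
function proofBytes(f: RoutingProofFields): Buffer {
  return Buffer.from(
    [f.requestHash, f.responseHash, f.certFingerprint, f.providerId].join("\n"),
    "utf8",
  );
}

function signProof(f: RoutingProofFields, privateKey: KeyObject): Buffer {
  // Ed25519 takes no separate digest algorithm, hence `null`.
  return sign(null, proofBytes(f), privateKey);
}

function verifyProof(f: RoutingProofFields, signature: Buffer, publicKey: KeyObject): boolean {
  return verify(null, proofBytes(f), publicKey, signature);
}

// Stand-in for the TEE-held key pair (assumption: generated locally here).
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

const fields: RoutingProofFields = {
  requestHash: sha256Hex('{"prompt":"hello"}'),
  responseHash: sha256Hex('{"text":"hi there"}'),
  certFingerprint: sha256Hex("example provider certificate bytes"),
  providerId: "example-provider",
};

const signature = signProof(fields, privateKey);
const genuineOk = verifyProof(fields, signature, publicKey);
const tamperedOk = verifyProof(
  { ...fields, responseHash: sha256Hex('{"text":"altered"}') },
  signature,
  publicKey,
);
```

Because every field is covered by the same signature, changing any one of them (here the response hash) makes verification fail, which is the property the bullets above rely on.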

Choose Your Interface

Feature        Web UI   CLI      SDK
Setup time     ~1 min   ~2 min   ~5 min
Interactive chat
Automation
App integration
Direct API access

Best for: Quick testing, experimentation and direct frontend integration.

Option 1: Use the Hosted Web UI

Two hosted entry points run the same Direct flow against the same per-provider sub-accounts: pc.0g.ai (switch to Advanced mode using the top-right toggle) and compute-marketplace.0g.ai/inference.

Option 2: Run Locally

Installation

pnpm add @0gfoundation/0g-compute-ts-sdk -g

Launch Web UI

0g-compute-cli ui start-web

Open http://localhost:3090 in your browser.

Getting Started

1. Connect & Fund

  1. Connect your wallet (MetaMask recommended)
  2. Deposit some 0G tokens using the account dashboard
  3. Browse available AI models and their pricing

2. Start Using AI Services

Option A: Chat Interface

  • Click "Chat" on any chatbot provider
  • Start conversations immediately
  • Perfect for testing and experimentation

Option B: Get API Integration

  • Click "Build" on any provider
  • Get step-by-step integration guides
  • Copy-paste ready code examples

Understanding Delayed Fee Settlement

How Fee Settlement Works

0G Compute Network uses delayed (batch) settlement for provider fees. This means:

  • Fees are not deducted immediately after each inference request. Instead, the provider accumulates usage fees and settles them on-chain in batches.
  • Your sub-account balance may appear to drop suddenly when a batch settlement occurs. For example, if you make 10 requests and the provider settles all at once, you'll see a single larger deduction rather than 10 small ones.
  • You are only charged for actual usage — no extra fees are deducted. The total amount settled always matches the sum of your individual request costs.
  • This is by design to reduce on-chain transaction costs and improve efficiency for both users and providers.

What this means in practice:

  • After making requests, your provider sub-account balance may temporarily appear higher than your "true" available balance
  • When settlement occurs, the balance updates to reflect all accumulated fees at once
  • If you see a sudden balance decrease, check your usage history — the total will match your actual usage

This behavior is visible in the Web UI (provider sub-account balances), CLI (get-account), and SDK (getAccount()).
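
The arithmetic behind delayed settlement can be sketched as follows. This is an illustrative model only, not SDK code: the `PendingFee` type and the helper functions are hypothetical, and the amounts are arbitrary base units chosen to mirror the "10 requests, one deduction" example above.

```typescript
// Hypothetical record of one metered request; not an SDK type.
interface PendingFee {
  requestId: string;
  feeBaseUnits: bigint;
}

function sumPending(pending: PendingFee[]): bigint {
  let total = BigInt(0);
  for (const p of pending) total += p.feeBaseUnits;
  return total;
}

// What you can really still spend: the displayed balance minus unsettled fees.
function effectiveBalance(displayed: bigint, pending: PendingFee[]): bigint {
  return displayed - sumPending(pending);
}

// One batch settlement: a single deduction equal to the sum of the pending fees.
function settle(displayed: bigint, pending: PendingFee[]): { balance: bigint; deducted: bigint } {
  const deducted = sumPending(pending);
  return { balance: displayed - deducted, deducted };
}

const displayedBalance = BigInt(1000);
const pendingFees: PendingFee[] = Array.from({ length: 10 }, (_, i) => ({
  requestId: `req-${i}`,
  feeBaseUnits: BigInt(10),
}));

const beforeSettlement = effectiveBalance(displayedBalance, pendingFees);
const afterSettlement = settle(displayedBalance, pendingFees);
```

Before settlement the displayed balance is still 1000 while the effective balance is already 900. After settlement a single deduction of 100 brings the displayed balance to the same 900.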

This applies only to the Direct flow. The Router uses a different billing path with a single unified balance — there are no per-provider sub-accounts and no delayed batch settlement visible to callers.

Rate Limits

Per-User Rate Limits

Each provider enforces per-user rate limits to ensure fair resource sharing across all users. The default limits are:

  • 30 requests per minute per user (sustained)
  • Burst allowance of 5 requests (short spikes allowed)
  • 5 concurrent requests per user

If you exceed these limits, the provider will return HTTP 429 Too Many Requests. Wait briefly and retry. These limits are set by individual providers and may vary.
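
One way to stay under these limits is to throttle on the client before sending. The sketch below is a small token bucket that reads the defaults as "refill 30 tokens per minute, hold at most 5". That reading is an assumption: providers may enforce their limits differently, and the `TokenBucket` class is hypothetical rather than part of the SDK. Time is passed in explicitly so the behavior is easy to reason about.

```typescript
// Hypothetical client-side throttle; not part of the 0G SDK.
class TokenBucket {
  private tokens: number;
  private lastMs: number;
  private readonly capacity: number;
  private readonly refillPerMinute: number;

  constructor(capacity: number, refillPerMinute: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerMinute = refillPerMinute;
    this.tokens = capacity; // start with a full burst allowance
    this.lastMs = nowMs;
  }

  // Returns true if a request may be sent at time `nowMs`, consuming one token.
  tryAcquire(nowMs: number): boolean {
    const elapsedMs = Math.max(0, nowMs - this.lastMs);
    this.lastMs = Math.max(this.lastMs, nowMs);
    const refilled = (elapsedMs * this.refillPerMinute) / 60000;
    this.tokens = Math.min(this.capacity, this.tokens + refilled);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Assumed mapping of the defaults: burst of 5, sustained 30 requests per minute.
const bucket = new TokenBucket(5, 30, 0);

// Six back-to-back attempts at t = 0 ms: the first five pass, the sixth is refused.
const burstResults = [0, 0, 0, 0, 0, 0].map((t) => bucket.tryAcquire(t));

// Two seconds later exactly one token has been refilled.
const laterResults = [bucket.tryAcquire(2000), bucket.tryAcquire(2000)];
```

In an application you would call `tryAcquire(Date.now())` before each request and delay the request when it returns false.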

Troubleshooting

Common Issues

Error: Too many requests (429)

You are sending requests too quickly. Each provider enforces per-user rate limits (default: 30 requests/min, 5 concurrent).

  • Wait a few seconds and retry
  • Reduce request frequency — for batch workloads, add a delay between requests
  • Check concurrent requests — ensure you are not sending more than 5 simultaneous requests
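
"Wait a few seconds and retry" can be sketched as exponential backoff around any function that returns an object with an HTTP `status`. The helper names, the 500 ms base delay, the 8 s cap, and the retry count are illustrative assumptions, not values defined by 0G or by any provider.

```typescript
// Delay before retry number `attempt` (0-based): 500 ms, 1 s, 2 s, ... capped at 8 s.
function backoffDelayMs(attempt: number, baseMs: number = 500, capMs: number = 8000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `send` until it returns a non-429 status or the retry budget is used up.
// `send` is any caller-supplied function, for example a wrapper around fetch.
async function withRetryOn429<T extends { status: number }>(
  send: () => Promise<T>,
  maxRetries: number = 4,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const res = await send();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    await sleep(backoffDelayMs(attempt));
  }
}

const firstDelays = [0, 1, 2, 3].map((a) => backoffDelayMs(a));
const cappedDelay = backoffDelayMs(10);
```

If the last attempt still returns 429, the helper hands that response back to the caller rather than throwing, so the caller decides how to surface the error.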

Error: Insufficient balance

Your provider sub-account doesn't have enough funds. Each provider requires a minimum locked balance of 1 0G to serve requests.

CLI:

# Deposit to main account
0g-compute-cli deposit --amount 10
# Transfer to provider sub-account
0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1

SDK:

// Deposit to main account
await broker.ledger.depositFund(10);
// Transfer to provider sub-account (minimum 1 0G recommended)
await broker.ledger.transferFund(providerAddress, 'inference', BigInt(1) * BigInt(10 ** 18));

Note: In Node.js, the SDK provides background auto-funding that periodically checks sub-account balances and tops up when insufficient. In browser environments, you must transfer funds manually.
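
The SDK example above passes the transfer amount as a BigInt scaled by 10 ** 18. For fractional amounts such as 0.5 0G, a string-based conversion avoids floating-point rounding. The helper below is a hypothetical utility, not an SDK function, and it assumes 18 decimals, which is what the 10 ** 18 factor in the example implies.

```typescript
// Assumption: 18 decimals, matching the 10 ** 18 factor used in the SDK example.
const OG_DECIMALS = 18;

// Converts a decimal string such as "1" or "0.5" into base units as a bigint.
function toBaseUnits(amount: string, decimals: number = OG_DECIMALS): bigint {
  const match = /^(\d+)(?:\.(\d+))?$/.exec(amount.trim());
  if (!match) throw new Error(`Invalid amount: ${amount}`);
  const whole = match[1];
  const fraction = match[2] ?? "";
  if (fraction.length > decimals) {
    throw new Error(`Too many decimal places (max ${decimals}): ${amount}`);
  }
  // Right-pad the fractional part so the concatenation is an integer in base units.
  const padded = fraction + "0".repeat(decimals - fraction.length);
  return BigInt(whole + padded);
}

const oneToken = toBaseUnits("1");
const halfToken = toBaseUnits("0.5");
```

The result has the same shape as the BigInt(1) * BigInt(10 ** 18) value in the example, so it could be passed in its place when the amount is not a whole number.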

Error: Provider not acknowledged

You need to acknowledge the provider before using their service. The easiest way is to transfer funds, which auto-acknowledges:

CLI:

0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1

SDK:

// transferFund auto-acknowledges the provider's TEE signer
await broker.ledger.transferFund(providerAddress, 'inference', BigInt(1) * BigInt(10 ** 18));

Error: No funds in provider sub-account

Transfer funds to the specific provider sub-account:

0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1

Check your account balance:

0g-compute-cli get-account

Web UI not starting

If the web UI fails to start:

  1. Check whether another service is using port 3090. If so, start the UI on a different port:
0g-compute-cli ui start-web --port 3091
  2. Ensure the package was installed globally:
pnpm add @0gfoundation/0g-compute-ts-sdk -g

Questions? Join our Discord for support.