Inference
Run inference by connecting to individual 0G Compute providers via the @0gfoundation/0g-compute-ts-sdk SDK. You manage per-provider sub-accounts and sign every request with your wallet. For fine-tuning via the same SDK see Fine-tuning; for funding and sub-account management see Account.
0G Compute offers two ways to run inference:
- Router (recommended for most applications) — a single OpenAI-compatible API endpoint with one unified balance, automatic provider failover, and an API key. Use this if you're building a server-side app, agent, or prototype.
- Direct (this page) — connect to individual providers, manage per-provider sub-accounts, and sign requests with your wallet. Use this for browser dApps with wallet signing or when you need direct on-chain control.
Side-by-side comparison: Router vs Direct.
The default Router view on pc.0g.ai shows the Router balance, which is a separate on-chain pool from the per-provider sub-accounts described on this page. To see funds you've deposited on compute-marketplace.0g.ai (or through the CLI/SDK below), switch to Advanced mode using the top-right toggle on pc.0g.ai — it's the same Direct flow embedded in the new UI.
Prerequisites
- Node.js >= 22.0.0
- A wallet with 0G tokens (either testnet or mainnet)
- An EVM-compatible wallet (for the Web UI)
Supported Service Types
- Chatbot Services: Conversational AI with models like GPT, DeepSeek, and others
- Text-to-Image: Generate images from text descriptions using Stable Diffusion and similar models
- Speech-to-Text: Transcribe audio to text using Whisper and other speech recognition models
Available Services
The provider and model catalog changes frequently (providers join and leave, pricing is set per-provider). This page does not reproduce the list — check a live source instead:
- Web UI: pc.0g.ai (switch to Advanced mode, top-right) or compute-marketplace.0g.ai/inference. Both show the current provider catalog with pricing, health, and TEE attestation.
- CLI: 0g-compute-cli inference list-providers
- SDK: await broker.inference.listService()
Verification modes
Each service declares one of two TEE verification modes:
TeeML — The AI model runs directly inside a Trusted Execution Environment. The TEE guarantees that both the model and the computation are protected, and responses are signed by the TEE's private key. Used by self-hosted models.
TeeTLS — The Broker runs inside a TEE and proxies requests to a centralized LLM provider over HTTPS. This provides cryptographic proof that responses genuinely came from the real provider:
- Authentic routing: During the TLS handshake, the Broker verifies the provider's certificate against trusted Certificate Authorities, ensuring the connection reaches the real provider — not an imposter.
- Cryptographic proof: The Broker captures the provider's TLS certificate fingerprint and bundles it together with the request hash, response hash, and provider identity into a signed routing proof using its TEE-protected private key.
- Privacy preservation: Since the Broker runs inside a TEE, it cannot inspect or tamper with user data in transit — 0G acts as a verifiable relay, not a middleman. This is conceptually similar to zkTLS but with stronger privacy properties, as the TEE ensures the relay itself is trustworthy.
- End-to-end integrity: The TEE attestation proves the Broker is running unmodified code, the CA/TLS system guarantees only the real provider holds a valid certificate for their domain, and the TEE signature binds everything together — a verifier can confirm the proof came from a genuine TEE and that the fingerprint belongs to the expected provider.
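To make the proof components listed above concrete, the sketch below models a routing proof as a plain object and checks only the parts a verifier can recompute locally. The `RoutingProof` shape, its field names, and the choice of SHA-256 are assumptions for illustration; this page does not specify the wire format, and verifying the TEE signature and attestation is deliberately left out.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape: field names are illustrative, not the SDK's format.
interface RoutingProof {
  providerId: string;
  certFingerprint: string; // hex digest of the provider's TLS certificate
  requestHash: string;     // hex digest of the request body
  responseHash: string;    // hex digest of the response body
}

// Assumes SHA-256; the real proof may use a different hash.
const sha256Hex = (data: string): string =>
  createHash("sha256").update(data, "utf8").digest("hex");

// Returns a list of mismatches; an empty list means the local checks passed.
// The TEE signature over the proof must still be verified separately.
function checkRoutingProof(
  proof: RoutingProof,
  requestBody: string,
  responseBody: string,
  expectedFingerprint: string
): string[] {
  const problems: string[] = [];
  if (proof.requestHash !== sha256Hex(requestBody)) {
    problems.push("request hash mismatch");
  }
  if (proof.responseHash !== sha256Hex(responseBody)) {
    problems.push("response hash mismatch");
  }
  if (proof.certFingerprint.toLowerCase() !== expectedFingerprint.toLowerCase()) {
    problems.push("certificate fingerprint mismatch");
  }
  return problems;
}
```

Returning a list of problems rather than a boolean lets a caller report exactly which binding failed.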
Choose Your Interface
| Feature | Web UI | CLI | SDK |
|---|---|---|---|
| Setup time | ~1 min | ~2 min | ~5 min |
| Interactive chat | ✅ | ❌ | ❌ |
| Automation | ❌ | ✅ | ✅ |
| App integration | ❌ | ❌ | ✅ |
| Direct API access | ❌ | ❌ | ✅ |
- Web UI
- CLI
- SDK
Best for: Quick testing, experimentation and direct frontend integration.
Option 1: Use the Hosted Web UI
Two hosted entry points — both run the same Direct flow against the same per-provider sub-accounts:
- https://compute-marketplace.0g.ai/inference — the original Marketplace UI
- https://pc.0g.ai with the top-right toggle set to Advanced — the same flow embedded in the new pc.0g.ai UI (the default "Router" mode on pc.0g.ai is a different, newer system — see the Router docs)
Option 2: Run Locally
Installation
pnpm add @0gfoundation/0g-compute-ts-sdk -g
Launch Web UI
0g-compute-cli ui start-web
Open http://localhost:3090 in your browser.
Getting Started
1. Connect & Fund
- Connect your wallet (MetaMask recommended)
- Deposit some 0G tokens using the account dashboard
- Browse available AI models and their pricing
2. Start Using AI Services
Option A: Chat Interface
- Click "Chat" on any chatbot provider
- Start conversations immediately
- Perfect for testing and experimentation
Option B: Get API Integration
- Click "Build" on any provider
- Get step-by-step integration guides
- Copy-paste ready code examples
Best for: Automation, scripting, and server environments
Installation
pnpm add @0gfoundation/0g-compute-ts-sdk -g
Setup Environment
Choose Network
0g-compute-cli setup-network
Login with Wallet
Enter your wallet private key when prompted. This will be used for account management and service payments.
0g-compute-cli login
Create Account & Add Funds
Before using inference services, you need to fund your account. For detailed account management, see Account.
0g-compute-cli deposit --amount 10
0g-compute-cli get-account
0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1
CLI Commands
List Providers
0g-compute-cli inference list-providers
Verify Provider
Check provider's TEE attestation and reliability before using:
0g-compute-cli inference verify --provider <PROVIDER_ADDRESS>
This command outputs the provider's report and verifies their Trusted Execution Environment (TEE) status.
Acknowledge Provider (Optional)
If you already used transfer-fund to fund a provider, acknowledgement happens automatically. This command is only needed if you want to acknowledge without transferring funds:
0g-compute-cli inference acknowledge-provider --provider <PROVIDER_ADDRESS>
Direct API Access
Generate an authentication token for direct API calls:
0g-compute-cli inference get-secret --provider <PROVIDER_ADDRESS>
This generates a Bearer token in the format app-sk-<SECRET> that you can use for direct API calls.
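As a small illustration of how the token is used, the helper below assembles the URL and Authorization header for a direct call. It relies on the `/v1/proxy/...` paths and the `app-sk-` prefix shown on this page; the function name `buildProxyRequest` and the `ProxyRoute` type are hypothetical and not part of the SDK or CLI.

```typescript
// Routes taken from the API usage examples on this page.
type ProxyRoute =
  | "chat/completions"
  | "images/generations"
  | "audio/transcriptions";

// Builds the URL and headers for a direct call through the provider proxy.
// `serviceUrl` is the provider's service URL; `secret` is the app-sk-... token.
function buildProxyRequest(serviceUrl: string, secret: string, route: ProxyRoute) {
  if (!secret.startsWith("app-sk-")) {
    throw new Error("Expected a token of the form app-sk-<SECRET>");
  }
  const base = serviceUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/v1/proxy/${route}`,
    headers: { Authorization: `Bearer ${secret}` },
  };
}
```

Checking the prefix early catches the common mistake of pasting a wallet key or an unrelated API key into the header.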
API Usage Examples
- Chatbot
- Text-to-Image
- Speech-to-Text
Use for conversational AI and text generation.
- cURL
- JavaScript
- Python
curl <service_url>/v1/proxy/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer app-sk-<YOUR_SECRET>" \
-d '{
"model": "<service.model>",
"messages": [
{
"role": "system",
"content": "You are a helpful assistant."
},
{
"role": "user",
"content": "Hello!"
}
]
}'
// ES module import so that top-level await below is valid
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: `${service.url}/v1/proxy`,
apiKey: 'app-sk-<YOUR_SECRET>'
});
const completion = await client.chat.completions.create({
model: service.model,
messages: [
{
role: 'system',
content: 'You are a helpful assistant.'
},
{
role: 'user',
content: 'Hello!'
}
]
});
console.log(completion.choices[0].message);
from openai import OpenAI
client = OpenAI(
base_url=f"{service.url}/v1/proxy",
api_key='app-sk-<YOUR_SECRET>'
)
completion = client.chat.completions.create(
model=service.model,
messages=[
{
'role': 'system',
'content': 'You are a helpful assistant.'
},
{
'role': 'user',
'content': 'Hello!'
}
]
)
print(completion.choices[0].message)
Generate images from text descriptions.
- cURL
- JavaScript
- Python
curl <service_url>/v1/proxy/images/generations \
-H "Content-Type: application/json" \
-H "Authorization: Bearer app-sk-<YOUR_SECRET>" \
-d '{
"model": "<service.model>",
"prompt": "A cute baby sea otter playing in the water",
"n": 1,
"size": "1024x1024"
}'
// ES module import so that top-level await below is valid
import OpenAI from 'openai';
const client = new OpenAI({
baseURL: `${service.url}/v1/proxy`,
apiKey: 'app-sk-<YOUR_SECRET>'
});
const response = await client.images.generate({
model: service.model,
prompt: 'A cute baby sea otter playing in the water',
n: 1,
size: '1024x1024'
});
console.log(response.data);
from openai import OpenAI
client = OpenAI(
base_url=f"{service.url}/v1/proxy",
api_key='app-sk-<YOUR_SECRET>'
)
response = client.images.generate(
model=service.model,
prompt='A cute baby sea otter playing in the water',
n=1,
size='1024x1024'
)
print(response.data)
Transcribe audio files to text.
- cURL
- JavaScript
- Python
curl <service_url>/v1/proxy/audio/transcriptions \
-H "Authorization: Bearer app-sk-<YOUR_SECRET>" \
-H "Content-Type: multipart/form-data" \
-F "file=@audio.ogg" \
-F "model=whisper-large-v3" \
-F "response_format=json"
// ES module imports so that top-level await below is valid
import OpenAI from 'openai';
import fs from 'fs';
const client = new OpenAI({
baseURL: `${service.url}/v1/proxy`,
apiKey: 'app-sk-<YOUR_SECRET>'
});
const transcription = await client.audio.transcriptions.create({
file: fs.createReadStream('audio.ogg'),
model: 'whisper-large-v3',
response_format: 'json'
});
console.log(transcription.text);
from openai import OpenAI
client = OpenAI(
base_url=f"{service.url}/v1/proxy",
api_key='app-sk-<YOUR_SECRET>'
)
with open('audio.ogg', 'rb') as audio_file:
transcription = client.audio.transcriptions.create(
file=audio_file,
model='whisper-large-v3',
response_format='json'
)
print(transcription.text)
Start Local Proxy Server
Run a local OpenAI-compatible server:
# Start server on port 3000 (default)
0g-compute-cli inference serve --provider <PROVIDER_ADDRESS>
# Custom port
0g-compute-cli inference serve --provider <PROVIDER_ADDRESS> --port 8080
Then use any OpenAI-compatible client to connect to http://localhost:3000.
Best for: Application integration and programmatic access
Installation
pnpm add @0gfoundation/0g-compute-ts-sdk
Get up and running in minutes with the TypeScript starter kit.
- TypeScript Starter Kit - Complete examples with TypeScript and CLI tool
Initialize the Broker
- Node.js
- Browser
import { ethers } from "ethers";
import { createZGComputeNetworkBroker } from "@0gfoundation/0g-compute-ts-sdk";
// Choose your network
const RPC_URL = process.env.NODE_ENV === 'production'
? "https://evmrpc.0g.ai" // Mainnet
: "https://evmrpc-testnet.0g.ai"; // Testnet
const provider = new ethers.JsonRpcProvider(RPC_URL);
const wallet = new ethers.Wallet(process.env.PRIVATE_KEY!, provider);
const broker = await createZGComputeNetworkBroker(wallet);
import { BrowserProvider } from "ethers";
import { createZGComputeNetworkBroker } from "@0gfoundation/0g-compute-ts-sdk";
// Check if MetaMask is installed
if (typeof window.ethereum === "undefined") {
throw new Error("Please install MetaMask");
}
const provider = new BrowserProvider(window.ethereum);
const signer = await provider.getSigner();
const broker = await createZGComputeNetworkBroker(signer);
In the browser, @0gfoundation/0g-compute-ts-sdk requires polyfills for Node.js built-in modules.
Vite example:
pnpm add -D vite-plugin-node-polyfills
// vite.config.js
import { nodePolyfills } from 'vite-plugin-node-polyfills';
export default {
plugins: [
nodePolyfills({
include: ['crypto', 'stream', 'util', 'buffer', 'process'],
globals: { Buffer: true, global: true, process: true }
})
]
};
In browser environments, the SDK does not auto-fund provider sub-accounts. Auto-funding requires a wallet signature for each transfer, which would trigger unexpected wallet popups (e.g. MetaMask) during active chat sessions — a poor user experience.
For browser dApps, you must manage funds manually:
- Deposit to your main account: await broker.ledger.depositFund(10)
- Transfer to the provider sub-account: await broker.ledger.transferFund(providerAddress, 'inference', amount)
In Node.js environments (server-side), the SDK provides background auto-funding that periodically checks provider sub-account balances and tops up from the ledger as needed.
Discover Services
// List all available services
const services = await broker.inference.listService();
// Filter by service type
const chatbotServices = services.filter(s => s.serviceType === 'chatbot');
const imageServices = services.filter(s => s.serviceType === 'text-to-image');
const speechServices = services.filter(s => s.serviceType === 'speech-to-text');
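When more than a couple of service types are in play, a single grouping pass can replace the repeated filters. The sketch below assumes only the `serviceType` field used in the example above; `ServiceLike` and `groupByServiceType` are hypothetical names, and any other fields on the real service objects pass through untouched.

```typescript
// Minimal structural type: only `serviceType` is taken from the example above.
interface ServiceLike {
  serviceType: string;
}

// Groups services by their declared type in one pass over the list.
function groupByServiceType<T extends ServiceLike>(services: T[]): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const service of services) {
    const bucket = groups.get(service.serviceType) ?? [];
    bucket.push(service);
    groups.set(service.serviceType, bucket);
  }
  return groups;
}
```

Usage would look like `groupByServiceType(await broker.inference.listService()).get('chatbot') ?? []`.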
Verify Provider (Optional)
All providers listed on the 0G Compute Network have already been verified by the 0G team. This step is optional and intended for users who want to independently verify a provider's TEE attestation.
The SDK performs automated checks and provides guidance for manual verification steps.
Automated checks:
- TEE signer address match (contract vs attestation report)
- Docker Compose hash verification (calculated vs event log)
Manual steps (instructions included in output):
- Docker image integrity verification via sigstore
- Full quote verification using dstack-verifier
// Verify with real-time step output
const result = await broker.inference.verifyService(
providerAddress,
'./reports', // directory to save attestation reports
(step) => console.log(step.message) // optional: print each step as it happens
);
// Check automated verification results programmatically
if (result.signerVerification.allMatch && result.composeVerification.passed) {
console.log('Automated checks passed');
} else {
console.warn('Automated checks failed — review result for details');
}
// Access structured data
console.log('Signer match:', result.signerVerification.allMatch);
console.log('Compose hash:', result.composeVerification.passed);
console.log('Docker images:', result.dockerImages);
console.log('Reports saved to:', result.outputDirectory);
verifyService can only verify signer address and compose hash automatically. To fully verify a provider's TEE environment, you must also follow the manual steps in the output — including running dstack-verifier and checking image integrity via sigstore.
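For scripts that gate on verification, it can help to turn the result into a list of failed automated checks. This sketch models only the two fields read in the example above (`signerVerification.allMatch` and `composeVerification.passed`); the `AutomatedChecks` type and `failedAutomatedChecks` helper are hypothetical, and an empty list says nothing about the manual steps.

```typescript
// Only the two fields read in the example above are modeled here.
interface AutomatedChecks {
  signerVerification: { allMatch: boolean };
  composeVerification: { passed: boolean };
}

// Returns the names of the automated checks that did not pass.
// An empty array means both automated checks passed; the manual
// dstack-verifier and sigstore steps are still required for full verification.
function failedAutomatedChecks(result: AutomatedChecks): string[] {
  const failed: string[] = [];
  if (!result.signerVerification.allMatch) failed.push("TEE signer address");
  if (!result.composeVerification.passed) failed.push("Docker Compose hash");
  return failed;
}
```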
Account Management
For detailed account operations, see Account.
- Ledger creation (depositFund): Requires a minimum of 3 0G for the initial deposit.
- Provider sub-account: Each provider requires a minimum locked balance of 1 0G to serve requests. Transfers below this amount may result in rejected requests.
In Node.js environments, the SDK provides background auto-funding that periodically checks provider sub-account balances and tops up from the ledger when insufficient. In browser environments, you must transfer funds manually.
- Node.js
- Browser
// Deposit to main account
await broker.ledger.depositFund(10);
// Node.js: SDK provides background auto-funding that periodically checks
// provider sub-account balances and tops up from the ledger when needed.
// Deposit to main account
await broker.ledger.depositFund(10);
// Browser: manually transfer funds to provider sub-account (minimum 1 0G).
// This also auto-acknowledges the provider's TEE signer on-chain.
await broker.ledger.transferFund(providerAddress, 'inference', BigInt(1) * BigInt(10 ** 18));
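The transfer above passes the amount in base units (1 0G written as 10^18). A string-based conversion avoids floating-point rounding for fractional amounts. The sketch assumes 18 decimals, as implied by `BigInt(10 ** 18)`; `toBaseUnits` and `meetsProviderMinimum` are hypothetical helpers, not SDK functions.

```typescript
// Assumes 18 decimals, matching BigInt(10 ** 18) in the example above.
const ONE_0G = BigInt("1000000000000000000");

// Converts a decimal string such as "1.5" to base units without floating point.
function toBaseUnits(amount: string): bigint {
  const match = /^(\d+)(?:\.(\d{1,18}))?$/.exec(amount.trim());
  if (!match) throw new Error(`Invalid amount: ${amount}`);
  const whole = BigInt(match[1]);
  // Right-pad the fractional digits to 18 places, e.g. "5" -> "500000000000000000".
  const fraction = BigInt((match[2] ?? "").padEnd(18, "0"));
  return whole * ONE_0G + fraction;
}

// Minimum provider sub-account balance stated on this page: 1 0G.
function meetsProviderMinimum(amountBaseUnits: bigint): boolean {
  return amountBaseUnits >= ONE_0G;
}
```

With these helpers the transfer could be written as `broker.ledger.transferFund(providerAddress, 'inference', toBaseUnits('1'))`.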
Make Inference Requests
- Chatbot
- Text-to-Image
- Speech-to-Text
const messages = [{ role: "user", content: "Hello!" }];
// Get service metadata
const { endpoint, model } = await broker.inference.getServiceMetadata(providerAddress);
// Generate auth headers
const headers = await broker.inference.getRequestHeaders(
providerAddress
);
// Make request
const response = await fetch(`${endpoint}/chat/completions`, {
method: "POST",
headers: { "Content-Type": "application/json", ...headers },
body: JSON.stringify({ messages, model })
});
const data = await response.json();
const answer = data.choices[0].message.content;
// Optional: verify response integrity via TEE signature (see Response Processing below)
const chatID = response.headers.get("ZG-Res-Key") || data.id;
if (chatID) {
const isValid = await broker.inference.processResponse(
providerAddress,
chatID
);
}
const prompt = "A cute baby sea otter";
// Get service metadata
const { endpoint, model } = await broker.inference.getServiceMetadata(providerAddress);
// Generate auth headers
const headers = await broker.inference.getRequestHeaders(
providerAddress
);
// Make request
const response = await fetch(`${endpoint}/images/generations`, {
method: "POST",
headers: { "Content-Type": "application/json", ...headers },
body: JSON.stringify({
model,
prompt,
n: 1,
size: "1024x1024"
})
});
const data = await response.json();
const imageUrl = data.data[0].url;
// Optional: verify response integrity via TEE signature
const chatID = response.headers.get("ZG-Res-Key");
if (chatID) {
const isValid = await broker.inference.processResponse(providerAddress, chatID);
}
// Get service metadata first: `model` is needed to build the form data
const { endpoint, model } = await broker.inference.getServiceMetadata(providerAddress);
const formData = new FormData();
formData.append('file', audioFile); // audioFile is a File or Blob
formData.append('model', model);
formData.append('response_format', 'json');
// Generate auth headers
const headers = await broker.inference.getRequestHeaders(
providerAddress
);
// Make request
const response = await fetch(`${endpoint}/audio/transcriptions`, {
method: "POST",
headers: { ...headers },
body: formData
});
const data = await response.json();
const transcription = data.text;
// Optional: verify response integrity via TEE signature
const chatID = response.headers.get("ZG-Res-Key");
if (chatID) {
const isValid = await broker.inference.processResponse(
providerAddress,
chatID
);
}
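The request examples above parse the body without looking at the HTTP status, so an error response surfaces later as a confusing property access failure. One possible guard is sketched below; `parseInferenceResponse` is a hypothetical helper that takes the status and raw text rather than a fetch `Response`, which keeps it easy to test.

```typescript
// Turns a raw HTTP status and body text into parsed JSON, or throws a
// descriptive error. It is independent of fetch types by design.
function parseInferenceResponse(status: number, bodyText: string): unknown {
  if (status === 429) {
    throw new Error("Rate limited (HTTP 429): wait briefly and retry");
  }
  if (status < 200 || status >= 300) {
    // Include a short excerpt of the body to help with debugging.
    throw new Error(`Provider returned HTTP ${status}: ${bodyText.slice(0, 200)}`);
  }
  try {
    return JSON.parse(bodyText);
  } catch {
    throw new Error("Provider returned a non-JSON body");
  }
}
```

In the examples above this would replace `await response.json()` with `parseInferenceResponse(response.status, await response.text())`.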
Response Processing & Verification
Use processResponse when you want to verify response integrity via the provider's TEE signature. Pass the chatID from the response header (ZG-Res-Key) to enable verification.
The processResponse method verifies that an inference response came from a genuine TEE environment by checking the provider's signature for the given chatID.
Parameters:
- providerAddress: The address of the provider.
- chatID: Response identifier for TEE verification. Get it from the ZG-Res-Key response header, or fall back to data.id for chatbot responses. processResponse returns null if chatID is omitted (verification skipped).
- Chatbot
- Text-to-Image
- Speech-to-Text
- Streaming Responses
For chatbot services, verify the response using the chatID from headers or response body:
const response = await fetch(`${endpoint}/chat/completions`, {
method: "POST",
headers: { "Content-Type": "application/json", ...headers },
body: JSON.stringify({ messages, model })
});
const data = await response.json();
// Get chatID: prioritize ZG-Res-Key header, fall back to response body
let chatID = response.headers.get("ZG-Res-Key") || response.headers.get("zg-res-key");
if (!chatID) {
chatID = data.id || data.chatID;
}
// Verify response integrity via TEE signature
if (chatID) {
const isValid = await broker.inference.processResponse(
providerAddress,
chatID
);
console.log("Response valid:", isValid);
}
For text-to-image services, verify using the chatID from response headers:
const requestBody = {
model,
prompt: "A cute baby sea otter",
size: "1024x1024",
n: 1
};
const response = await fetch(`${endpoint}/images/generations`, {
method: "POST",
headers: { "Content-Type": "application/json", ...headers },
body: JSON.stringify(requestBody)
});
const data = await response.json();
// Get chatID from response headers for verification
const chatID = response.headers.get("ZG-Res-Key") || response.headers.get("zg-res-key");
if (chatID) {
const isValid = await broker.inference.processResponse(
providerAddress,
chatID
);
console.log("Response valid:", isValid);
}
For speech-to-text services, verify using the chatID from response headers:
const formData = new FormData();
formData.append('file', audioFile);
formData.append('model', model);
const response = await fetch(`${endpoint}/audio/transcriptions`, {
method: "POST",
headers: { ...headers },
body: formData
});
const data = await response.json();
// Get chatID from response headers for verification
const chatID = response.headers.get("ZG-Res-Key") || response.headers.get("zg-res-key");
if (chatID) {
const isValid = await broker.inference.processResponse(
providerAddress,
chatID
);
console.log("Response valid:", isValid);
}
For streaming responses, handle chatID differently based on service type:
- Chatbot Streaming
- Speech-to-Text Streaming
// For chatbot streaming, first check headers then try to get ID from stream
let chatID = response.headers.get("ZG-Res-Key") || response.headers.get("zg-res-key");
let streamChatID = null;
const decoder = new TextDecoder();
const reader = response.body.getReader();
// Process stream
let rawBody = '';
while (true) {
const { done, value } = await reader.read();
if (done) break;
rawBody += decoder.decode(value, { stream: true });
}
// Parse chatID from stream data as fallback
for (const line of rawBody.split('\n')) {
const trimmed = line.trim();
if (!trimmed || trimmed === 'data: [DONE]') continue;
try {
const jsonStr = trimmed.startsWith('data:')
? trimmed.slice(5).trim()
: trimmed;
const message = JSON.parse(jsonStr);
if (!streamChatID && (message.id || message.chatID)) {
streamChatID = message.id || message.chatID;
}
} catch {}
}
// Use chatID from header if available, otherwise from stream data
const finalChatID = chatID || streamChatID;
if (finalChatID) {
const isValid = await broker.inference.processResponse(
providerAddress,
finalChatID
);
console.log("Chatbot streaming response valid:", isValid);
}
// For speech-to-text streaming, get chatID from headers
const chatID = response.headers.get("ZG-Res-Key") || response.headers.get("zg-res-key");
if (chatID) {
const isValid = await broker.inference.processResponse(
providerAddress,
chatID
);
console.log("Audio streaming response valid:", isValid);
}
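The chatbot streaming example above buffers the stream and extracts only the ID. A variation that also collects the generated text in the same pass is sketched below. It assumes OpenAI-style chunks where the text lives at `choices[0].delta.content`, which this page does not confirm, and it only considers lines starting with `data:`. `summarizeStream` and `StreamSummary` are hypothetical names.

```typescript
interface StreamSummary {
  chatID: string | null; // first `id` seen in the stream, if any
  text: string;          // concatenated delta content
}

// Parses a fully buffered SSE body ("data: {...}" lines) in one pass.
function summarizeStream(rawBody: string): StreamSummary {
  let chatID: string | null = null;
  let text = "";
  for (const line of rawBody.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue;
    const payload = trimmed.slice(5).trim();
    if (payload === "" || payload === "[DONE]") continue;
    try {
      const chunk = JSON.parse(payload);
      if (chatID === null && typeof chunk.id === "string") chatID = chunk.id;
      // Assumed chunk layout: choices[0].delta.content (OpenAI-style).
      const delta = chunk.choices?.[0]?.delta?.content;
      if (typeof delta === "string") text += delta;
    } catch {
      // Ignore lines that are not valid JSON (for example, partial frames).
    }
  }
  return { chatID, text };
}
```

The returned `chatID` serves as the stream fallback when the ZG-Res-Key header is absent.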
Key Points:
- processResponse is optional. Use it when you want to verify response integrity via the TEE signature.
- Pass the chatID parameter to enable verification. Without chatID, the method returns null (verification skipped).
- chatID retrieval: Always prioritize ZG-Res-Key from the response headers. Only use fallback methods when the header is not present.
  - Chatbot: First try the ZG-Res-Key header, then check data.id as a fallback
  - Text-to-Image & Speech-to-Text: Get chatID from the ZG-Res-Key response header
  - Streaming: Check headers first, then try to get id from the stream data as a fallback
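The retrieval order described in the key points can be captured in one small function. This is a sketch, not an SDK API: `resolveChatID` is a hypothetical helper, and the body fallback should only be supplied for chatbot responses.

```typescript
// `headerValue` is the result of response.headers.get("ZG-Res-Key");
// `body` is the parsed JSON body, passed only for chatbot responses.
function resolveChatID(
  headerValue: string | null,
  body?: { id?: unknown }
): string | null {
  // The header always takes priority when present and non-empty.
  if (headerValue !== null && headerValue.trim() !== "") {
    return headerValue.trim();
  }
  // Fallback: the `id` field of the response body.
  if (body && typeof body.id === "string" && body.id !== "") {
    return body.id;
  }
  return null; // no chatID: verification will be skipped
}
```

Because `Headers.get` matches header names case-insensitively, a single lookup of `ZG-Res-Key` is enough before calling this helper.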
Understanding Delayed Fee Settlement
0G Compute Network uses delayed (batch) settlement for provider fees. This means:
- Fees are not deducted immediately after each inference request. Instead, the provider accumulates usage fees and settles them on-chain in batches.
- Your sub-account balance may appear to drop suddenly when a batch settlement occurs. For example, if you make 10 requests and the provider settles all at once, you'll see a single larger deduction rather than 10 small ones.
- You are only charged for actual usage — no extra fees are deducted. The total amount settled always matches the sum of your individual request costs.
- This is by design to reduce on-chain transaction costs and improve efficiency for both users and providers.
What this means in practice:
- After making requests, your provider sub-account balance may temporarily appear higher than your "true" available balance
- When settlement occurs, the balance updates to reflect all accumulated fees at once
- If you see a sudden balance decrease, check your usage history — the total will match your actual usage
This behavior is visible in the Web UI (provider sub-account balances), CLI (get-account), and SDK (getAccount()).
This applies only to the Direct flow. The Router uses a different billing path with a single unified balance — there are no per-provider sub-accounts and no delayed batch settlement visible to callers.
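The gap between the reported balance and the balance that is really available can be illustrated with a little bookkeeping. The sketch below is a worked example only: it assumes the caller tracks per-request fees locally, which the SDK may or may not expose, and `SubAccountView`, `effectiveBalance`, and `settle` are hypothetical names. Amounts are in base units.

```typescript
// All amounts are in base units (bigint). Names are illustrative.
interface SubAccountView {
  onChainBalance: bigint;  // what get-account / getAccount() reports
  unsettledFees: bigint[]; // per-request fees tracked locally by the caller
}

// Balance left once all accumulated fees are eventually settled.
function effectiveBalance(view: SubAccountView): bigint {
  const pending = view.unsettledFees.reduce((sum, fee) => sum + fee, BigInt(0));
  return view.onChainBalance - pending;
}

// Models a batch settlement: the on-chain balance drops by the sum of
// the accumulated fees in one step and the pending list is cleared.
function settle(view: SubAccountView): SubAccountView {
  return { onChainBalance: effectiveBalance(view), unsettledFees: [] };
}
```

With a reported balance of 1000 units and ten unsettled requests of 5 units each, `effectiveBalance` gives 950 both before and after `settle`; only the reported balance changes, in a single step of 50.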
Rate Limits
Each provider enforces per-user rate limits to ensure fair resource sharing across all users. The default limits are:
- 30 requests per minute per user (sustained)
- Burst allowance of 5 requests (short spikes allowed)
- 5 concurrent requests per user
If you exceed these limits, the provider will return HTTP 429 Too Many Requests. Wait briefly and retry. These limits are set by individual providers and may vary.
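A client can stay under these limits by throttling itself before sending. The token bucket below is one way to do that. Mapping the default limits to a capacity of 5 tokens refilled at 30 per minute is an assumption; the provider's actual algorithm is not described here. `TokenBucket` is a hypothetical class, and time is passed in explicitly so the behavior is deterministic.

```typescript
// Token bucket: up to `capacity` requests in a burst, refilled at
// `ratePerMinute` tokens per minute. Time is supplied by the caller.
class TokenBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly capacity: number;
  private readonly ratePerMinute: number;

  constructor(capacity: number, ratePerMinute: number, nowMs: number) {
    this.capacity = capacity;
    this.ratePerMinute = ratePerMinute;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes a token if a request may be sent at `nowMs`.
  tryAcquire(nowMs: number): boolean {
    const elapsedMs = Math.max(0, nowMs - this.lastRefillMs);
    const refill = (elapsedMs * this.ratePerMinute) / 60000;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

A caller would create `new TokenBucket(5, 30, Date.now())` and check `tryAcquire(Date.now())` before each request. The limit of 5 concurrent requests needs a separate counter.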
Troubleshooting
Common Issues
Error: Too many requests (429)
You are sending requests too quickly. Each provider enforces per-user rate limits (default: 30 requests/min, 5 concurrent).
- Wait a few seconds and retry
- Reduce request frequency — for batch workloads, add a delay between requests
- Check concurrent requests — ensure you are not sending more than 5 simultaneous requests
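The "wait and retry" advice above is often implemented as exponential backoff. The sketch below is one possible shape: the base delay of 1 second, the 30 second cap, and the retry count are arbitrary choices rather than values recommended by 0G, and `backoffDelayMs` and `withRetryOn429` are hypothetical helpers.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Calls `send` and retries while it reports HTTP 429, up to `maxRetries`
// times. `sleep` is injectable so the wait can be replaced in tests.
async function withRetryOn429<T extends { status: number }>(
  send: () => Promise<T>,
  maxRetries = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const response = await send();
    if (response.status !== 429 || attempt >= maxRetries) return response;
    await sleep(backoffDelayMs(attempt));
  }
}
```

Since a fetch `Response` has a `status` field, a request can be wrapped as `withRetryOn429(() => fetch(url, options))`. The final response is returned even when it is still a 429, so the caller decides how to report it.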
Error: Insufficient balance
Your provider sub-account doesn't have enough funds. Each provider requires a minimum locked balance of 1 0G to serve requests.
CLI:
Deposit to Main Account
0g-compute-cli deposit --amount 10
Transfer to Provider Sub-Account (minimum 1 0G recommended)
0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1
SDK:
// Deposit to main account
await broker.ledger.depositFund(10);
// Transfer to provider sub-account (minimum 1 0G recommended)
await broker.ledger.transferFund(providerAddress, 'inference', BigInt(1) * BigInt(10 ** 18));
Note: In Node.js, the SDK provides background auto-funding that periodically checks sub-account balances and tops up when insufficient. In browser environments, you must transfer funds manually.
Error: Provider not acknowledged
You need to acknowledge the provider before using their service. The easiest way is to transfer funds, which auto-acknowledges:
CLI:
0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1
SDK:
// transferFund auto-acknowledges the provider's TEE signer
await broker.ledger.transferFund(providerAddress, 'inference', BigInt(1) * BigInt(10 ** 18));
Error: No funds in provider sub-account
Transfer funds to the specific provider sub-account:
0g-compute-cli transfer-fund --provider <PROVIDER_ADDRESS> --amount 1
Check your account balance:
0g-compute-cli get-account
Web UI not starting
If the web UI fails to start:
- If another service is already using port 3090, start the UI on a different port:
0g-compute-cli ui start-web --port 3091
- Ensure the package was installed globally:
pnpm add @0gfoundation/0g-compute-ts-sdk -g
Next Steps
- Manage Accounts → Account
- Fine-tune Models → Fine-tuning Guide
- Become a Provider → Provider Setup
- View Examples → GitHub
Questions? Join our Discord for support.