Documentation
Build with
Ranus AI.
Production-ready docs for integrating Ranus models through a familiar, OpenAI-compatible chat completions API. Start with direct HTTP for simple use cases, then layer in frameworks like LangChain for retrieval, tools, and multi-step workflows.
Authenticate safely
Create an API key in the dashboard and keep it server-side using environment variables such as RANUS_API_KEY.
Choose your integration
Use direct HTTP for lean production paths, or LangChain when you need retrieval, tool calling, memory, or more advanced orchestration.
Observe usage
Track credits, token consumption, model activity, and request health in one clean dashboard after deployment.
01. Quickstart
Start with one request
Use the same request shape across cURL, Node.js, Python, and Go. This reduces onboarding friction and keeps integration patterns consistent across teams and services.
Recommended starting path
Node.js
Recommended
Best fit for Next.js, Express, background jobs, and other server-side JavaScript runtimes using native fetch.
ts
const response = await fetch('https://api.ranus.tech/v1/chat/completions', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${process.env.RANUS_API_KEY}`,
},
body: JSON.stringify({
model: 'ranus-smart',
messages: [
{ role: 'system', content: 'You are a helpful marketing assistant.' },
{ role: 'user', content: 'Write a product launch email.' },
],
temperature: 0.4,
max_tokens: 500,
}),
});
if (!response.ok) {
throw new Error('Ranus request failed');
}
const data = await response.json();
console.log(data.choices[0]?.message?.content);
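If several services make this call, you may want to centralize the request shape in a small helper. The sketch below is illustrative; the ranusChat name and its defaults are ours, not part of the API.
ts
// Illustrative helper (not an official SDK): wraps the request shape shown above.
type ChatMessage = { role: 'system' | 'user' | 'assistant'; content: string };

async function ranusChat(messages: ChatMessage[], model = 'ranus-smart'): Promise<string> {
  const response = await fetch('https://api.ranus.tech/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.RANUS_API_KEY}`,
    },
    body: JSON.stringify({ model, messages, temperature: 0.4, max_tokens: 500 }),
  });
  if (!response.ok) {
    throw new Error(`Ranus request failed with status ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0]?.message?.content ?? '';
}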
02. Endpoint
OpenAI-compatible chat completions endpoint
Ranus exposes a production-ready chat completions API so teams can integrate quickly using familiar request and response conventions.
HTTP endpoint
POST https://api.ranus.tech/v1/chat/completions
03. Authentication
Authenticate with a Bearer token
Create an API key from the dashboard and send it in the Authorization header. Keep secrets on the server, never in browser-exposed code or public repositories.
Authorization header
Authorization: Bearer sk_live_xxx
Security note
Create and manage keys from the dashboard API Keys page. For production, store the value in an environment variable like RANUS_API_KEY.
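On the server, a small startup guard keeps a missing key from surfacing later as confusing 401 responses. This is a sketch, not something the API requires.
ts
// Server-side only: read the key from the environment and fail fast if it is missing.
const apiKey = process.env.RANUS_API_KEY;
if (!apiKey) {
  throw new Error('RANUS_API_KEY is not set; configure it as a server-side environment variable.');
}

const headers = {
  'Content-Type': 'application/json',
  Authorization: `Bearer ${apiKey}`,
};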
04. Models
Choose the right model for the workload
Each model is tuned for a different balance of speed, reasoning depth, and credit efficiency. Multipliers determine relative credit usage per token.
| Alias | Use case | Multiplier (in/out) | Max output (tokens) |
|---|---|---|---|
| ranus-fast | Fast and efficient for classification, extraction, routing, and short chat. | 1x · 1x | 4,096 |
| ranus-smart | Balanced default for production assistants, support workflows, and content. | 1x · 2x | 32,768 |
| ranus-reason | Deeper reasoning for coding, planning, technical analysis, and complex tasks. | 2x · 3x | 128,000 |
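As a rough illustration of what the multipliers mean for relative cost (an assumption on our part, not the billing formula), weighting prompt and completion tokens separately looks like this:
ts
// Rough illustration only: assumes the in/out multipliers weight prompt and completion tokens separately.
const MULTIPLIERS = {
  'ranus-fast': { input: 1, output: 1 },
  'ranus-smart': { input: 1, output: 2 },
  'ranus-reason': { input: 2, output: 3 },
} as const;

function relativeCreditWeight(
  model: keyof typeof MULTIPLIERS,
  promptTokens: number,
  completionTokens: number,
): number {
  const m = MULTIPLIERS[model];
  return promptTokens * m.input + completionTokens * m.output;
}

// Same traffic (120 prompt + 300 completion tokens):
// ranus-fast   -> 120*1 + 300*1 = 420
// ranus-smart  -> 120*1 + 300*2 = 720
// ranus-reason -> 120*2 + 300*3 = 1140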
Guidance for ranus-reason
This is a reasoning-oriented model. Give it enough output budget with max_tokens for multi-step answers, and expect richer usage telemetry such as reasoning-related token accounting when available.
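For example, a ranus-reason request might raise max_tokens well above the quickstart default so multi-step answers are not cut off. The prompt and output budget below are illustrative.
ts
// Illustrative ranus-reason request: reserve a larger output budget for multi-step answers.
const response = await fetch('https://api.ranus.tech/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.RANUS_API_KEY}`,
  },
  body: JSON.stringify({
    model: 'ranus-reason',
    messages: [
      { role: 'system', content: 'You are a careful technical planner.' },
      { role: 'user', content: 'Plan a zero-downtime migration from MySQL to Postgres.' },
    ],
    max_tokens: 4000,
  }),
});
const data = await response.json();
console.log(data.choices[0]?.message?.content);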
Need more detail? Review the full Models catalog.
05. Integrations
Use direct HTTP or build on LangChain
Direct HTTP is ideal when you want minimal abstraction and complete request control. LangChain is recommended when your product needs retrieval, tools, composable chains, or multi-step orchestration.
When to choose LangChain
LangChain
Recommended for RAG
Best choice when you need retrieval, tools, memory, chains, or agent-style orchestration on top of Ranus models.
ts
import { ChatOpenAI } from '@langchain/openai';
import { HumanMessage, SystemMessage } from '@langchain/core/messages';
const model = new ChatOpenAI({
apiKey: process.env.RANUS_API_KEY,
model: 'ranus-smart',
temperature: 0.2,
configuration: {
baseURL: 'https://api.ranus.tech/v1',
},
});
const result = await model.invoke([
new SystemMessage('Summarize support tickets clearly and concisely.'),
new HumanMessage('Customer cannot access billing dashboard after upgrading plan.'),
]);
console.log(result.content);
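The same model instance also drops into composable chains. A minimal sketch reusing the model configured above; the prompt text and input variable are illustrative.
ts
import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';

// Reuse the `model` configured above inside a small prompt -> model -> parser chain.
const prompt = ChatPromptTemplate.fromMessages([
  ['system', 'Summarize support tickets clearly and concisely.'],
  ['human', 'Ticket: {ticket}'],
]);

const chain = prompt.pipe(model).pipe(new StringOutputParser());

const summary = await chain.invoke({
  ticket: 'Customer cannot access billing dashboard after upgrading plan.',
});
console.log(summary);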
06. Response
Expect structured response metadata
Responses include the assistant output and usage fields for token and credit accounting. This makes it straightforward to surface telemetry in analytics, observability, or billing workflows.
JSON response
{
"id": "req_abc123",
"object": "chat.completion",
"model": "ranus-smart",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Subject: Launching Our New Product..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 120,
"completion_tokens": 300,
"total_tokens": 420,
"credits_used": 1.86,
"remaining_credits": 49998.14
}
}
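The usage block can be lifted straight into whatever logging or analytics pipeline you already run. A minimal sketch: logUsage is our name, and console.log stands in for your telemetry sink.
ts
// Surface usage metadata from a parsed chat completion; swap console.log for your own telemetry sink.
function logUsage(data: any): void {
  const { usage } = data;
  console.log({
    requestId: data.id,
    model: data.model,
    promptTokens: usage.prompt_tokens,
    completionTokens: usage.completion_tokens,
    totalTokens: usage.total_tokens,
    creditsUsed: usage.credits_used,
    remainingCredits: usage.remaining_credits,
  });
}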
07. Errors & limits
Handle predictable error codes
Error responses follow a consistent shape so your client and backend can classify failures cleanly and react with the right UX, alerting, or retry policy.
Error response
{
"error": {
"code": "quota_exceeded",
"message": "Monthly credit limit exceeded.",
"request_id": "req_abc123"
}
}
- invalid_api_key — invalid or malformed key
- api_key_revoked — key has been revoked
- subscription_inactive — no active paid subscription
- quota_exceeded — monthly credits exhausted
- rate_limit_exceeded — too many requests per minute
- model_not_found — invalid model alias
- model_unavailable — model temporarily unavailable
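A small classifier over these codes can drive retry and alerting behavior. The sketch below treats only rate limits and temporary model outages as retryable; tune the policy for your own traffic.
ts
// Retry only transient errors; surface everything else immediately.
const RETRYABLE_CODES = new Set(['rate_limit_exceeded', 'model_unavailable']);

async function callWithRetry(
  makeRequest: () => Promise<Response>,
  maxAttempts = 3,
): Promise<any> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await makeRequest();
    if (response.ok) {
      return response.json();
    }

    const body = await response.json().catch(() => null);
    const code = body?.error?.code;

    if (RETRYABLE_CODES.has(code) && attempt < maxAttempts) {
      // Exponential backoff: 0.5s, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, 500 * 2 ** (attempt - 1)));
      continue;
    }
    throw new Error(`Ranus error ${code ?? response.status}: ${body?.error?.message ?? 'request failed'}`);
  }
  throw new Error('Ranus request failed after retries');
}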
08. FAQ
Common implementation questions
Short answers to the questions developers typically ask before shipping their first production integration.
Ready to build
Build fast.
Ship with confidence.
Start with one API key, one endpoint, and one consistent request shape across your stack.