For years, AI progress has centered on scaling particular person basis fashions: bigger parameters, longer context home windows, stronger reasoning, and higher software use. Sakana AI’s Fugu factors elsewhere, behaving like one mannequin from the skin whereas coordinating a number of skilled brokers internally.
A single API name can set off direct answering, specialist delegation, intermediate verification, and remaining synthesis, hiding orchestration complexity behind a standard LLM interface. On this article, a sensible information to Fugu’s structure, variants, pricing, benchmarks, entry, code, assessments, enterprise match, trade-offs, and use instances.
What’s Sakana Fugu?
Sakana Fugu is an OpenAI-compatible managed mannequin API that appears like a single LLM however works as a multi-agent system internally. Builders ship a immediate to at least one mannequin ID, resembling fugu or fugu-ultra, whereas Fugu handles agent choice, position project, coordination, verification, and remaining response.
As an alternative of manually constructing planner, coder, reviewer, researcher, or supervisor brokers with frameworks like LangGraph, AutoGen, or CrewAI, groups get orchestration packaged into the mannequin itself. This reduces the necessity to handle prompts, routing, retries, reminiscence, state, monitoring, and failure restoration.
Why the naming issues
The title “Sakana” means fish in Japanese. The corporate typically frames its analysis round collective intelligence, just like how a faculty of fish can behave as one coordinated system. Fugu follows that concept. Many brokers coordinate behind one interface.
Why Multi-Agent System as a Mannequin Issues
Most manufacturing AI programs right now fall into one among three patterns:
- Single-model prompting
- Device-augmented LLM purposes
- Manually designed multi-agent workflows
Single-model prompting is straightforward, however it might fail on advanced duties that require planning, execution, verification, and iteration.
Device-augmented LLMs enhance usefulness by connecting fashions to look, databases, code execution, APIs, or enterprise programs. However the mannequin nonetheless normally acts because the central reasoning engine.
Multi-agent workflows go additional. They divide work throughout specialised brokers. For instance:
- A planner breaks down the duty.
- A researcher gathers context.
- A coder writes code.
- A reviewer checks for correctness.
- A verifier assessments the reply.
- A supervisor coordinates the method.
This could enhance reliability on tough duties, however constructing it nicely is tough. Groups should reply many system design questions:
- Which agent ought to deal with which job?
- How ought to brokers talk?
- When ought to the system cease?
- How ought to intermediate outputs be verified?
- How ought to price and latency be managed?
- How ought to failures be recovered?
- How ought to compliance restrictions be utilized?
Fugu makes an attempt to make this simpler by turning multi-agent orchestration right into a model-level functionality. The developer doesn’t have to design each agent interplay manually.
Sakana Fugu Launch Overview
Sakana Fugu was launched as Sakana AI’s industrial multi-agent orchestration product. The preliminary beta positioned it as a system that coordinates swimming pools of frontier basis fashions for coding, arithmetic, scientific reasoning, analysis, and sophisticated evaluation.
The newest Fugu launch makes the product simpler to entry by means of Sakana’s console and an OpenAI-compatible API. The core launch message is straightforward: builders can plug multi-agent intelligence into current workflows with out rewriting their utility round a brand new SDK or orchestration framework.
Fugu vs Fugu Extremely
Sakana Fugu is available in two important mannequin choices: Fugu and Fugu Extremely.
Fugu
Fugu is the default mannequin for on a regular basis work. It balances efficiency and latency. It’s appropriate for coding help, code evaluation, chatbots, inner assistants, doc evaluation, and interactive workflows the place response time issues.
A key level is that Fugu can path to one of the best mannequin based mostly on the duty. It additionally permits customers to choose particular brokers out of the mannequin pool, which might help with knowledge, privateness, compliance, or organizational necessities.
Fugu Extremely
Fugu Extremely is optimized for optimum reply high quality. It coordinates a deeper pool of skilled brokers and is meant for onerous, high-stakes, multi-step issues. Based on the Sakana, Fugu Extremely can route between one to 3 brokers relying on the issue.
Fugu Extremely is best fitted to workloads the place accuracy, depth, and persistence matter greater than latency. Examples embrace:
- Paper replica
- Kaggle-style knowledge science workflows
- Cybersecurity evaluation
- Literature evaluation
- Patent investigation
- Deep technical analysis
- Advanced code evaluation
- Scientific reasoning
Comparability desk
| Function | Fugu | Fugu Extremely |
| Finest for | On a regular basis coding, chat, evaluation, interactive workflows | Onerous reasoning, analysis, high-stakes evaluation |
| Design objective | Stability high quality and latency | Maximize high quality |
| Agent pool | Versatile, with opt-out help | Mounted full pool |
| Latency | Decrease | Increased |
| Price | Relies on lively underlying agent tier | Mounted token pricing |
| Really useful customers | Builders, product groups, inner instruments | Researchers, superior builders, enterprise evaluation groups |
| Fundamental trade-off | Much less depth than Extremely | Increased price and response time |
Structure: How Fugu Works Internally
Fugu’s structure could be understood as a managed orchestration layer wrapped inside a mannequin API.
From the skin, the movement seems like this:

Internally, the system is nearer to this:

Sakana Fugu exposes a single API whereas internally coordinating a pool of specialised fashions. The consumer sends one request, and Fugu handles routing, delegation, verification, and synthesis.
Core structure elements
1. API gateway
The developer interacts with a typical API floor. This issues as a result of Fugu helps OpenAI-compatible endpoints, so groups can reuse current OpenAI SDK shoppers with a distinct base URL and API key.
2. Orchestrator mannequin
The orchestrator is the core intelligence layer. It decides how the duty must be dealt with. For less complicated duties, it could reply with minimal orchestration. For advanced duties, it might coordinate a number of skilled brokers.
3. Agent pool
Fugu has entry to a pool of underlying fashions or brokers. These brokers might have totally different strengths throughout coding, reasoning, analysis, long-context evaluation, or different specialised duties.
4. Dynamic routing
As an alternative of hardcoding a workflow, Fugu dynamically selects which agent or brokers to make use of. That is necessary as a result of mannequin strengths are sometimes task-specific. One mannequin might carry out higher at code era, one other at mathematical reasoning, one other at long-context synthesis.
5. Delegation and communication
The orchestrator can break down a posh job into subtasks. It might probably ship targeted directions to totally different brokers and management what context every agent receives.
6. Verification
For tough duties, the system can use verification-style habits. One agent might resolve, one other might critique or validate, and the orchestrator might mix the outcomes.
7. Synthesis
The ultimate reply is returned as a single response. The consumer doesn’t see the complete inner agent graph. .
Pricing
Fugu has two pricing modes: pay-as-you-go and subscription plans.
Pay-as-you-go
Pay-as-you-go is designed for heavier manufacturing workloads. Sakana says consumption-based tokens are served at greater precedence than monthly-plan tokens.
Fugu pricing
Fugu pricing is dependent upon the lively agent setup.
| Energetic brokers | Billing rule |
| 1 agent | Pay the usual fee for the particular underlying mannequin |
| A number of brokers | Charges are usually not stacked. You’re charged one fee based mostly on the top-tier mannequin concerned |
That is necessary as a result of many multi-agent programs develop into costly when every mannequin name is billed individually. Fugu’s pricing mannequin tries to keep away from stacking mannequin charges throughout brokers.
Fugu Extremely pricing
Fugu Extremely has mounted pricing for fugu-ultra-20260615 per 1M tokens.
| Token sort | Commonplace worth | Context larger than 272K |
| Enter | $5 per 1M tokens | $10 per 1M tokens |
| Output | $30 per 1M tokens | $45 per 1M tokens |
| Cached enter | $0.50 per 1M tokens | $1.00 per 1M tokens |
Subscription plans
Subscription plans are designed for people and on a regular basis hands-on use. Each tier contains each Fugu and Fugu Extremely.
| Plan | Worth | Finest for | Utilization |
| Commonplace | $20/month | Light-weight every day utilization, occasional API calls, small experiments | Baseline allowance |
| Professional | $100/month | Common coding, evaluation, analysis, and evaluation periods | 10x Commonplace utilization |
| Max | $200/month | Heavy long-running workloads | 20x Commonplace utilization |
Benchmark Outcomes
Sakana studies Fugu and Fugu Extremely benchmark scores throughout coding, reasoning, science, agentic duties, long-context reasoning, and cybersecurity-style analysis.
Sakana Fugu and Fugu Extremely in contrast with frontier baseline fashions throughout coding, reasoning, science, long-context, and agentic benchmarks.
Benchmarks are helpful, however they shouldn’t be handled as direct manufacturing ensures. Fugu’s benchmark profile suggests three sensible insights.
1. Fugu is strongest when duties require orchestration
The strongest use case isn’t a easy one-shot reply. The mannequin is designed for duties that profit from decomposition, skilled choice, verification, and synthesis.
Examples:
- Debug this repository.
- Evaluate this pull request.
- Reproduce this analysis paper.
- Examine this patent panorama.
- Analyze a potential safety vulnerability.
- Evaluate a number of technical approaches and advocate one.
2. Extremely isn’t at all times routinely higher
Fugu Extremely is optimized for reply high quality, however Fugu can outperform it on some benchmarks. Builders ought to benchmark each fashions on their very own workload earlier than standardizing.
A sensible routing technique could possibly be:
Use fugu for interactive work.
Use fugu-ultra for advanced, high-value duties.
Fallback to fugu when latency or price issues.
3. Multi-agent efficiency comes with hidden complexity
Although Fugu hides orchestration complexity from the developer, the underlying system nonetheless performs extra work. This could have an effect on latency, price, and observability.
Groups ought to monitor:
- Whole tokens
- Orchestration tokens
- Latency by job sort
- High quality by workload class
- Failure instances
- Mannequin model habits
- Price per profitable consequence
Technical Arms-on: Utilizing Sakana Fugu API
Sakana fugu documentation: https://console.sakana.ai/get-started
1: Create an API key
Go to the Sakana console API key web page login and create API: https://console.sakana.ai/api-keys

Create an API key and retailer it securely. The secret’s proven solely as soon as.
2: Set setting variables
export FUGU_API_KEY="your_api_key_here"
export FUGU_BASE_URL="https://api.sakana.ai/v1"
3: Set up the OpenAI Python SDK
pip set up openai
4: Primary Responses API name
import os
from openai import OpenAI
consumer = OpenAI(
api_key=os.environ["FUGU_API_KEY"],
base_url=os.environ.get("FUGU_BASE_URL", "https://api.sakana.ai/v1"),
)
response = consumer.responses.create(
mannequin="fugu",
enter="Clarify Sakana Fugu in easy phrases for a software program engineer.",
)
print(response.output_text)
Step 5: Use Fugu Extremely for more durable reasoning
import os
from openai import OpenAI
consumer = OpenAI(
api_key=os.environ["FUGU_API_KEY"],
base_url=os.environ.get("FUGU_BASE_URL", "https://api.sakana.ai/v1"),
)
response = consumer.responses.create(
mannequin="fugu-ultra",
directions="You're a senior AI architect. Be exact and technical.",
enter="""
Evaluate single-agent LLM programs, manually designed multi-agent workflows,
and Sakana Fugu-style multi-agent programs as a mannequin.
Deal with structure, price, latency, observability, and governance.
""",
)
print(response.output_text)
Conclusion
Sakana Fugu stands out as a result of it shifts the abstraction layer. As an alternative of providing simply one other massive mannequin, it packages multi-agent orchestration behind a mannequin API.
For builders, this implies simpler entry to agentic workflows with out constructing advanced orchestration programs from scratch. For technical leaders, it provides a managed method to enhance reasoning, coding, analysis, and evaluation whereas lowering dependence on a single mannequin supplier.
Fugu is greatest suited for advanced, ambiguous, high-value duties moderately than easy chatbot prompts. Nonetheless, groups ought to undertake it rigorously, given its restricted routing transparency, potential latency, unclear token accounting, and regional constraints.
The only method to consider Fugu is that this: it’s not only a mannequin you immediate. It’s a mannequin that manages different fashions. That makes it an necessary step towards the subsequent era of AI purposes.
Regularly Requested Questions
A. It’s uncovered as a single mannequin API, however internally it behaves as a multi-agent orchestration system.
A. Use fugu for traditional work and fugu-ultra for advanced, high-value duties. Use fugu-ultra-20260615 if you wish to pin a selected Extremely model.
A. Sure. It helps OpenAI-compatible Responses, Chat Completions, and Fashions APIs.
Login to proceed studying and luxuriate in expert-curated content material.