Ship AI features faster with one API

Sixfinger API gives you a single integration point for fast chat generation, streaming outputs, and plan-based governance. Build support bots, copilots, coding assistants, and multilingual applications without switching providers.

Includes full plan-to-model mapping below.

10 models · Streaming enabled · Plan-safe limits · Referral bonuses · Single API key


4 plans available

15 models in the top tier

3 RPM baseline throughput on the Free plan

50,000,000 tokens of monthly capacity on the Plus plan

Interactive Advisor

Plan Comparator

Set expected monthly volume and get the most cost-efficient plan.

Inputs: requests, tokens, streaming.
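The comparison the Plan Comparator describes can be approximated offline. The sketch below is not Sixfinger's implementation: `Plan`, `PLANS`, and `cheapest_plan` are hypothetical names, and the figures are copied from the quick comparison table on this page. It ignores the streaming input because every listed plan has streaming enabled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    price_usd: int
    monthly_requests: int
    monthly_tokens: int


# Figures copied from the quick comparison table on this page.
PLANS = [
    Plan("Free", 0, 200, 20_000),
    Plan("Starter", 5, 3_000, 300_000),
    Plan("Pro", 15, 75_000, 7_500_000),
    Plan("Plus", 39, 500_000, 50_000_000),
]


def cheapest_plan(requests: int, tokens: int) -> Optional[Plan]:
    """Return the lowest-priced plan covering both monthly quotas, or None."""
    fits = [
        p for p in PLANS
        if requests <= p.monthly_requests and tokens <= p.monthly_tokens
    ]
    return min(fits, key=lambda p: p.price_usd) if fits else None


if __name__ == "__main__":
    plan = cheapest_plan(requests=2_000, tokens=250_000)
    print(plan.name if plan else "No listed plan covers this volume")
```

A volume that exceeds the Plus quotas returns `None`, which a caller would need to handle separately.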

Model Decision UI

Pick your workload profile and receive a model suggestion.

Inputs: task, latency, budget.

Token and Cost Simulator

Estimator

Forecast response cost and speed before deploying prompts.

Inputs: prompt characters, output tokens, model tier.
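A rough version of this estimate can be computed by hand. The sketch below rests on assumptions that this page does not confirm: a heuristic of about 4 characters per token, a per-request cap that applies to output tokens only, and a monthly quota that counts prompt and output tokens together. The function name `estimate` is hypothetical, and the page lists no per-token prices, so only token usage is estimated.

```python
import math

# Per-plan figures from the pricing section of this page.
MAX_TOKENS_PER_REQUEST = {"Free": 100, "Starter": 500, "Pro": 1_500, "Plus": 3_000}
MONTHLY_TOKENS = {
    "Free": 20_000,
    "Starter": 300_000,
    "Pro": 7_500_000,
    "Plus": 50_000_000,
}

# ASSUMPTION: about 4 characters per token, a rough heuristic, not a Sixfinger figure.
CHARS_PER_TOKEN = 4


def estimate(prompt_chars: int, output_tokens: int, plan: str) -> dict:
    """Estimate token usage for one request and how many such requests fit in a month."""
    prompt_tokens = math.ceil(prompt_chars / CHARS_PER_TOKEN)
    total = prompt_tokens + output_tokens
    return {
        "prompt_tokens": prompt_tokens,
        "total_tokens": total,
        # ASSUMPTION: the per-request cap is applied to output tokens only.
        "within_request_cap": output_tokens <= MAX_TOKENS_PER_REQUEST[plan],
        # ASSUMPTION: prompt and output tokens both count toward the monthly quota.
        "requests_per_month_by_tokens": MONTHLY_TOKENS[plan] // total if total else 0,
    }


if __name__ == "__main__":
    print(estimate(prompt_chars=400, output_tokens=100, plan="Starter"))
```

Swapping in a real tokenizer count for `prompt_tokens` would tighten the estimate considerably.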

Build With Sixfinger

Production-style usage examples teams deploy on this API.

Support desk copilot

Low-latency multilingual support with model fallback.

Coding assistant

Code-first prompt routing with fast stream delivery.

Backoffice classifier

Tagging, triage, and summary pipelines with one key.
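The model fallback mentioned for the support desk copilot could look roughly like the sketch below. It assumes a caller-supplied `send(model, prompt)` function that wraps whatever HTTP client you use. Both that function and `complete_with_fallback` are hypothetical names, and no Sixfinger endpoint or response format is assumed.

```python
from typing import Callable, Sequence


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    send: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order; return (model, reply) from the first that succeeds."""
    errors = []
    for model in models:
        try:
            return model, send(model, prompt)
        except Exception as exc:  # a real client would catch narrower error types
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


if __name__ == "__main__":
    def fake_send(model: str, prompt: str) -> str:
        # Stand-in transport: the first model always times out.
        if model == "llama-8b-instant":
            raise TimeoutError("simulated timeout")
        return f"[{model}] reply to: {prompt}"

    print(complete_with_fallback("Merhaba", ["llama-8b-instant", "allam-2-7b"], fake_send))
```

Listing the fastest model first and a larger one second keeps the common path low-latency while still returning an answer when the first call fails.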

Why teams choose Sixfinger API

Fast Start

Even the free tier includes streaming and multiple model options for fast prototyping.

Predictable Limits

Per-minute, hourly, daily, and monthly controls are clear and predictable.

Model Flexibility

General chat, coding-heavy, and reasoning-centric choices across plan levels.

Regional Quality

Dedicated options such as qwen3-32b (Turkish) and allam-2-7b (Turkish/Arabic) for regional-language quality.

Operational Safety

Native usage stats endpoint and account-level keys for safer operations.

Growth Bonus

Referral bonuses increase monthly capacity without integration changes.
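Because the per-minute, hourly, and daily limits are published, a client can throttle itself before the API rejects a request. The sketch below is one way to do that with sliding windows. It assumes the server counts requests in a comparable way, which this page does not specify, and `MultiWindowLimiter` is a hypothetical helper, not part of any Sixfinger SDK.

```python
import time
from collections import deque
from typing import Callable

# (window in seconds, max requests) for the Free plan: RPM 3, RPH 60, RPD 100.
FREE_LIMITS = [(60, 3), (3_600, 60), (86_400, 100)]


class MultiWindowLimiter:
    """Sliding-window limiter that enforces several windows at once."""

    def __init__(self, limits, clock: Callable[[], float] = time.monotonic):
        self.limits = list(limits)
        self.clock = clock
        self.sent = deque()  # timestamps of accepted requests, oldest first
        self.longest = max(window for window, _ in self.limits)

    def try_acquire(self) -> bool:
        """Record a request and return True only if every window has room."""
        now = self.clock()
        # Forget timestamps older than the longest window.
        while self.sent and now - self.sent[0] >= self.longest:
            self.sent.popleft()
        for window, cap in self.limits:
            recent = sum(1 for t in self.sent if now - t < window)
            if recent >= cap:
                return False
        self.sent.append(now)
        return True


if __name__ == "__main__":
    limiter = MultiWindowLimiter(FREE_LIMITS)
    print([limiter.try_acquire() for _ in range(4)])
```

The injectable `clock` exists so the limiter can be tested without waiting for real time to pass.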

Plan and pricing

Free

4 models

$0 USD / month

RPM: 3
RPH: 60
RPD: 100

200 requests/month
20,000 tokens/month
Max 100 tokens/request
Streaming: enabled

Models: llama-8b-instant, allam-2-7b, step-3.5-flash, nemotron-3-super-120b-a12b

Starter

11 models

$5 USD / month

RPM: 15
RPH: 300
RPD: 1,500

3,000 requests/month
300,000 tokens/month
Max 500 tokens/request
Streaming: enabled

Models: llama-8b-instant, allam-2-7b, gpt4-nano, qwen3-32b, llama-70b, llama-maverick-17b, +5 more

Pro

15 models

$15 USD / month

RPM: 50
RPH: 1,500
RPD: 15,000

75,000 requests/month
7,500,000 tokens/month
Max 1500 tokens/request
Streaming: enabled

Models: llama-8b-instant, allam-2-7b, gpt4-nano, qwen3-32b, llama-70b, llama-maverick-17b, +9 more

Plus

15 models

$39 USD / month

RPM: 150
RPH: 5,000
RPD: 50,000

500,000 requests/month
50,000,000 tokens/month
Max 3000 tokens/request
Streaming: enabled

Models: llama-8b-instant, allam-2-7b, gpt4-nano, qwen3-32b, llama-70b, llama-maverick-17b, +9 more

Model catalog by plan

Free plan

Key	Name	Language	Speed	Size
llama-8b-instant	Llama 3.1 8B Instant	Multilingual	Very Fast	8B
allam-2-7b	Allam 2 7B	Turkish/Arabic	Fast	7B
step-3.5-flash	Step 3.5 Flash	Multilingual	Very Fast	Unknown
nemotron-3-super-120b-a12b	Nemotron 3 Super 120B A12B	Multilingual	Fast	120B

Starter plan

Key	Name	Language	Speed	Size
llama-8b-instant	Llama 3.1 8B	Multilingual	Very Fast	8B
allam-2-7b	Allam 2 7B	Turkish/Arabic	Fast	7B
gpt4-nano	GPT-4.1 Nano	Multilingual	Very Fast	Nano
qwen3-32b	Qwen3 32B	Turkish	Fast	32B
llama-70b	Llama 3.3 70B	Multilingual	Fast	70B
llama-maverick-17b	Llama Maverick	Multilingual	Fast	17B
llama-scout-17b	Llama Scout	Multilingual	Very Fast	17B
gpt-oss-20b	GPT OSS 20B	Multilingual	Fast	20B
glm-4.5-air	GLM 4.5 Air	Multilingual	Fast	Unknown
qwen3-coder	Qwen3 Coder	Multilingual	Fast	Unknown
lfm-2.5-1.2b-thinking	LFM 2.5 1.2B Thinking	Multilingual	Very Fast	1.2B

Pro plan

Key	Name	Language	Speed	Size
llama-8b-instant	Llama 3.1 8B	Multilingual	Very Fast	8B
allam-2-7b	Allam 2 7B	Turkish/Arabic	Fast	7B
gpt4-nano	GPT-4.1 Nano	Multilingual	Very Fast	Nano
qwen3-32b	Qwen3 32B	Turkish	Fast	32B
llama-70b	Llama 3.3 70B	Multilingual	Fast	70B
llama-maverick-17b	Llama Maverick	Multilingual	Fast	17B
llama-scout-17b	Llama Scout	Multilingual	Very Fast	17B
gpt-oss-20b	GPT OSS 20B	Multilingual	Fast	20B
gpt-oss-120b	GPT OSS 120B	Multilingual	Fast	120B
kimi-k2	Kimi K2	Chinese	Fast	Unknown
step-3.5-flash	Step 3.5 Flash	Multilingual	Very Fast	Unknown
nemotron-3-super-120b-a12b	Nemotron 3 Super 120B A12B	Multilingual	Fast	120B
glm-4.5-air	GLM 4.5 Air	Multilingual	Fast	Unknown
qwen3-coder	Qwen3 Coder	Multilingual	Fast	Unknown
lfm-2.5-1.2b-thinking	LFM 2.5 1.2B Thinking	Multilingual	Very Fast	1.2B

Plus plan

Key	Name	Language	Speed	Size
llama-8b-instant	Llama 3.1 8B	Multilingual	Very Fast	8B
allam-2-7b	Allam 2 7B	Turkish/Arabic	Fast	7B
gpt4-nano	GPT-4.1 Nano	Multilingual	Very Fast	Nano
qwen3-32b	Qwen3 32B	Turkish	Fast	32B
llama-70b	Llama 3.3 70B	Multilingual	Fast	70B
llama-maverick-17b	Llama Maverick	Multilingual	Fast	17B
llama-scout-17b	Llama Scout	Multilingual	Very Fast	17B
gpt-oss-20b	GPT OSS 20B	Multilingual	Fast	20B
gpt-oss-120b	GPT OSS 120B	Multilingual	Fast	120B
kimi-k2	Kimi K2	Chinese	Fast	Unknown
step-3.5-flash	Step 3.5 Flash	Multilingual	Very Fast	Unknown
nemotron-3-super-120b-a12b	Nemotron 3 Super 120B A12B	Multilingual	Fast	120B
glm-4.5-air	GLM 4.5 Air	Multilingual	Fast	Unknown
qwen3-coder	Qwen3 Coder	Multilingual	Fast	Unknown
lfm-2.5-1.2b-thinking	LFM 2.5 1.2B Thinking	Multilingual	Very Fast	1.2B

Quick comparison

Plan	Price (USD/month)	Rate limits (RPM / RPH / RPD)	Monthly requests	Monthly tokens	Model count	Max tokens/request
Free	$0	3 / 60 / 100	200	20,000	4	100
Starter	$5	15 / 300 / 1,500	3,000	300,000	11	500
Pro	$15	50 / 1,500 / 15,000	75,000	7,500,000	15	1,500
Plus	$39	150 / 5,000 / 50,000	500,000	50,000,000	15	3,000
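Two derived figures help when reading this table: the effective price per million tokens if the whole monthly token quota is used, and the average token budget per request if the whole request quota is used. The short calculation below uses only the table's numbers; the names `ROWS` and `derived` are illustrative.

```python
# (plan, price in USD, monthly requests, monthly tokens) from the table above.
ROWS = [
    ("Free", 0, 200, 20_000),
    ("Starter", 5, 3_000, 300_000),
    ("Pro", 15, 75_000, 7_500_000),
    ("Plus", 39, 500_000, 50_000_000),
]


def derived(price_usd: int, requests: int, tokens: int) -> tuple[float, float]:
    """Return (USD per million quota tokens, average tokens available per request)."""
    return price_usd * 1_000_000 / tokens, tokens / requests


for name, price, requests, tokens in ROWS:
    usd_per_million, tokens_per_request = derived(price, requests, tokens)
    print(
        f"{name:8} ${usd_per_million:6.2f} per 1M tokens, "
        f"{tokens_per_request:.0f} tokens/request on average"
    )
```

On these figures, every plan averages 100 tokens per request when both quotas are fully used, and the effective price per million tokens falls from about $16.67 on Starter to $2.00 on Pro and $0.78 on Plus.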

Popular use cases

Support assistant

Low-latency streaming answers with plan-safe limits for predictable operations.

Coding copilot

Route coding tasks to the GPT OSS models, or to a plan with larger models when needed.

Multilingual chat

Use Turkish-focused and multilingual models under one API key.

Content production

Scale article generation and rewriting with usage analytics and upgrade paths.

Reasoning workflows

Assign heavier prompts to reasoning-capable models in the Pro and Plus plans.

Backoffice automation

Backoffice bots for summaries, tagging, and classification tasks.

FAQ

How many models can I use?

It depends on your plan. Free starts with a focused set of 4 models, Starter offers 11, and Pro and Plus unlock the full catalog of 15.

Can I select a model manually?

Yes. Send the model parameter in chat requests if that model is available in your current plan.

Is streaming supported?

Yes. Streaming is enabled on all plans, including the free tier.

How do I increase limits?

Upgrade your plan from the dashboard to increase RPM, monthly requests, and token budget.
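For manual model selection, a client can check a model key against the plan's catalog before sending a request. The sketch below copies the model keys from the catalog on this page. Only the `model` parameter is named in this FAQ, so the `message` field and the overall body shape are hypothetical, as is the `build_chat_payload` helper.

```python
# Model keys per plan, copied from the catalog on this page.
FREE = {"llama-8b-instant", "allam-2-7b", "step-3.5-flash", "nemotron-3-super-120b-a12b"}
STARTER = {
    "llama-8b-instant", "allam-2-7b", "gpt4-nano", "qwen3-32b", "llama-70b",
    "llama-maverick-17b", "llama-scout-17b", "gpt-oss-20b", "glm-4.5-air",
    "qwen3-coder", "lfm-2.5-1.2b-thinking",
}
# Pro and Plus list the same 15 keys.
FULL = STARTER | FREE | {"gpt-oss-120b", "kimi-k2"}

MODELS_BY_PLAN = {"Free": FREE, "Starter": STARTER, "Pro": FULL, "Plus": FULL}


def build_chat_payload(plan: str, model: str, message: str) -> dict:
    """Return a request body, refusing models the plan does not include."""
    if model not in MODELS_BY_PLAN[plan]:
        raise ValueError(f"{model!r} is not available on the {plan} plan")
    # ASSUMPTION: only "model" is named on this page; "message" and the
    # overall body shape are hypothetical placeholders.
    return {"model": model, "message": message}


if __name__ == "__main__":
    print(build_chat_payload("Starter", "qwen3-coder", "Explain this stack trace"))
```

Failing early on the client avoids spending a rate-limited request on a model the plan cannot use.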

Release Feed

Playground 2.0 is live: Split-pane prompt lab, streaming telemetry, template cards.

Command Palette added: Use Cmd/Ctrl+K for instant navigation.

Theme profiles: Toggle light/dark and high-contrast profiles for long sessions.

Start free, scale when needed

Create your account, generate your API key, and start with the free plan. Upgrade only when your product needs more throughput and larger model coverage.
