Puras is an agentic backend-as-a-service. You write a skill — a prompt plus a typed contract — and Puras runs it server-side as a full agent: it plans, calls tools, and iterates until there's a finished result. You call it like any other API. One key, any app.
We launched quietly two weeks ago, and the platform has already outgrown what we said then. Here's what's actually live today.
What is an agentic backend?
A plain backend executes requests. An agentic backend executes goals.
It's built for work where a single model call isn't enough — jobs that need to plan, use tools, and iterate:
- research a topic and produce a cited report,
- turn a store listing into a finished, captioned video ad,
- build a playable mini-game from a logo and a handful of sprites.
That work runs for minutes, not milliseconds. It calls models, browsers,
ffmpeg, and other skills along the way. An agentic backend gives it a place
to run — with the agent loop, tool execution, file storage, billing, and
observability already built — so your app just submits a job and reads the
result.
One primitive: the skill
The whole platform reduces to three nouns: a skill is one capability, a skillpack is a versioned bundle of skills you deploy as one unit, and a job is one run of one skill.
A skill is a folder with a skill.yaml:
title: Ad Creative
description: Research a product page and produce a finished ad image.
entrypoint: SKILL.md
text_model: claude/sonnet-4-6
image_model: google/nano-banana
input_schema:
  type: object
  properties:
    product_url: { type: string, description: "Public product page" }
  required: [product_url]
output_schema:
  type: object
  properties:
    image_url: { type: string }
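To make the typed contract concrete, here is a minimal sketch of what checking a job's inputs against that input_schema could look like. It is not Puras's validator: validate_inputs and the error strings are illustrative names, and it only handles required keys and flat primitive types, where a full JSON Schema validator covers far more.

```python
# Illustrative only: a tiny subset of JSON Schema validation.
# INPUT_SCHEMA mirrors the input_schema from the skill.yaml above.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "product_url": {"type": "string", "description": "Public product page"},
    },
    "required": ["product_url"],
}

# JSON Schema primitive type names mapped to Python types.
_PY_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_inputs(inputs: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the inputs pass."""
    problems = []
    for name in schema.get("required", []):
        if name not in inputs:
            problems.append(f"missing required input: {name}")
    for name, value in inputs.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # unknown keys are ignored in this sketch
        expected = _PY_TYPES.get(spec.get("type"))
        if expected is None:
            continue  # no type declared, nothing to check
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and spec["type"] in ("integer", "number")
        if bool_as_number or not isinstance(value, expected):
            problems.append(f"{name}: expected {spec['type']}")
    return problems
```

Calling validate_inputs({"product_url": "https://example.com"}, INPUT_SCHEMA) returns an empty list, while an empty dict reports the missing product_url.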
The entrypoint alone decides how it runs:
- Point it at a .md file and the worker runs an agentic loop with that file as the system prompt. The agent gets built-in tools (bash, web search and fetch, screenshots, file read/write/edit, media generation, transcription, subagents, and a persistent workspace memory) plus any custom tools you declare.
- Point it at main.py:run and the worker calls your Python in an isolated subprocess. Deterministic, fast, no LLM in the loop.
Same submission API, same billing, same observability. The only difference is what runs inside the worker.
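The entrypoint rule is simple enough to sketch in a few lines. This is a guess at the shape of the dispatch, not the worker's actual code: classify_entrypoint and the labels it returns are hypothetical.

```python
def classify_entrypoint(entrypoint: str) -> str:
    """Map an entrypoint string to the kind of run it implies.

    Assumed rule, taken from the two cases described above:
    a .md file means an agentic loop, module.py:function means plain Python.
    """
    if entrypoint.lower().endswith(".md"):
        return "agentic"
    module, sep, func = entrypoint.partition(":")
    if sep and module.endswith(".py") and func.isidentifier():
        return "deterministic"
    raise ValueError(f"unrecognized entrypoint: {entrypoint!r}")
```

With this sketch, SKILL.md classifies as agentic and main.py:run as deterministic; anything else is rejected rather than guessed at.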
Deploy an AI agent in two minutes
pip install puras
puras login # paste your workspace API key
puras init # scaffold a starter skillpack
puras deploy # zip + push a versioned deployment
puras run hello --input prompt="Say hi to Ada"
Or skip the CLI entirely and tell your coding agent. Puras is MCP-first — the server is hosted, auth is OAuth, there's nothing to install:
claude mcp add --transport http puras https://mcp.puras.co/mcp
From Claude Code, Cursor, or VS Code, your agent can scaffold, push, fork, run, and tail skills directly. Prompt to production without leaving your editor.
Call it from anywhere
Every deployed skill is an API. Address it by its
workspace/skillpack/skill path:
curl -X POST "https://api.puras.co/v1/jobs?skillpack=acme/growth-tools&wait=true" \
-H "Authorization: Bearer $PURAS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"skill": "ad-creative", "inputs": {"product_url": "https://example.com"}}'
One endpoint, three call modes: async (poll the job id), sync (?wait=true),
or live streaming (?stream=true, Server-Sent Events). From Python:
import puras
client = puras.Client() # reads PURAS_API_KEY
result = client.run(
    "acme/growth-tools/ad-creative",
    inputs={"product_url": "https://example.com"},
)
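If you are calling the HTTP endpoint yourself, the three call modes differ only in the query string. Below is a small sketch that builds the URL for each mode without sending anything. The helper name job_url and the mode labels are ours, and it assumes async is simply the absence of both flags.

```python
from urllib.parse import urlencode

API_BASE = "https://api.puras.co/v1/jobs"  # endpoint from the curl example


def job_url(skillpack: str, mode: str = "async") -> str:
    """Build the job-submission URL for one of the three call modes."""
    params = {"skillpack": skillpack}
    if mode == "sync":
        params["wait"] = "true"    # block until the job finishes
    elif mode == "stream":
        params["stream"] = "true"  # Server-Sent Events
    elif mode != "async":
        raise ValueError(f"unknown mode: {mode!r}")
    # safe="/" keeps the workspace/skillpack path readable in the query string.
    return f"{API_BASE}?{urlencode(params, safe='/')}"
```

For example, job_url("acme/growth-tools", "sync") reproduces the URL used in the curl call above.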
Ask for a capability, never wire a model
Skills declare models the way they declare schemas — high-level, in
skill.yaml: text_model for the agent (claude/sonnet-4-6,
claude/opus-4-8, …) and image_model / video_model / audio_model
defaults for media.
Inside the skill, media is four capability verbs (generate_image,
generate_video, generate_audio, transcribe), and the inputs determine the
task:
- pass an image → image-to-video,
- pass lip-sync audio → a talking head,
- pass neither → text-to-video.
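The three cases above amount to a small decision rule. Here is a sketch of it, with loud caveats: infer_video_task, its parameter names, and the task labels are hypothetical, and the precedence when both an image and lip-sync audio are passed (audio wins here) is our assumption, not documented behavior.

```python
from typing import Optional


def infer_video_task(
    image: Optional[str] = None,
    lipsync_audio: Optional[str] = None,
) -> str:
    """Pick the video task from which inputs are present.

    Assumption: lip-sync audio takes precedence over an image when both
    are supplied. The source only describes the three single-input cases.
    """
    if lipsync_audio is not None:
        return "talking-head"
    if image is not None:
        return "image-to-video"
    return "text-to-video"
```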
The model is a portable family token — google/veo, bytedance/seedance,
kuaishou/kling, google/nano-banana, elevenlabs/tts — or just "auto".
If your skill asks a family for a task it can't do, you get a clear error up
front — never a silently-wrong paid render.
Any run can override any of them, no redeploy:
{ "skill": "ad-creative", "inputs": { "...": "..." },
"media_models": { "video": "bytedance/seedance-fast" } }
Same skill, different models. An override that can't serve a given call falls back to a capable model and flags the fallback with a warning, so an override never breaks a deployed skill. When a better model ships, switching is a one-line override, not a redeploy.
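One way to picture the two failure behaviors (hard error for a declared model, soft fallback for an override) is the sketch below. Everything in it is illustrative: the capability table uses made-up model names, resolve_model is not a Puras function, and falling back to the skill's declared model is our assumption about what "a capable model" means.

```python
from typing import Optional

# Hypothetical capability table; model names and task sets are invented.
CAPABILITIES = {
    "example/video-a": {"text-to-video", "image-to-video"},
    "example/video-b": {"text-to-video"},
}


class UnsupportedTaskError(ValueError):
    """Raised up front when a skill's declared model cannot do the task."""


def resolve_model(task: str, declared: str, override: Optional[str] = None):
    """Return (model_to_use, warning_or_None) for one media call."""
    # A declared model that cannot do the task is an error before any render.
    if task not in CAPABILITIES.get(declared, set()):
        raise UnsupportedTaskError(f"{declared} cannot do {task}")
    # No override, or a capable override: use it without a warning.
    if override is None or task in CAPABILITIES.get(override, set()):
        return (override or declared), None
    # An incapable override falls back to the declared model and says so.
    warning = f"override {override} cannot do {task}; fell back to {declared}"
    return declared, warning
```

The asymmetry is the point: a skill author finds out about a bad declaration immediately, while a caller's bad override degrades gracefully instead of failing the job.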
Every job is a pipeline you can watch
A running job isn't a spinner. The dashboard renders it as a live pipeline — research, parallel asset generation, render, output — and the API streams the same events over SSE. A failure lands on the node that broke — not a generic 500.
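If you consume the stream yourself, you need a Server-Sent Events parser. The sketch below implements the standard SSE line format (event and data fields, blank line to dispatch, colon-prefixed comments). The event names and JSON payloads in the sample are invented for illustration; the real event vocabulary is not documented here.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield {"event", "data"} dicts from an iterable of SSE lines."""
    event, data = "message", []  # "message" is the SSE default event type
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # Blank line dispatches the buffered event, if it carries data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Hypothetical stream: these event names and payloads are made up.
SAMPLE = [
    "event: node_started\n",
    'data: {"node": "research"}\n',
    "\n",
    ": keep-alive\n",
    "event: node_failed\n",
    'data: {"node": "render", "error": "ffmpeg exited 1"}\n',
    "\n",
]

events = list(parse_sse(SAMPLE))
failed_node = json.loads(events[-1]["data"])["node"]
```

In this made-up stream the failure is attributed to the render node, which is the kind of per-node signal the pipeline view is describing.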
After the run:
- GET /v1/jobs/{id}/usage gives the line-item cost breakdown per model call.
- Skills can declare evals: deterministic check graders and LLM-judge rubric graders that score every finished run into an eval_score, so quality is a number you can track, not a vibe.
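As a rough mental model of the deterministic half of evals, here is a sketch that scores a job output as the fraction of checks it passes. This is an assumption throughout: the function names, the way checks are written, and the averaging rule are ours, and rubric graders (which need an LLM judge) are left out.

```python
from typing import Callable

Check = Callable[[dict], bool]


def eval_score(output: dict, checks: dict[str, Check]) -> float:
    """Score an output as the fraction of deterministic checks it passes.

    Assumed aggregation: a plain average. The source does not say how
    check and rubric results are actually combined into eval_score.
    """
    if not checks:
        raise ValueError("at least one check is required")
    passed = sum(1 for check in checks.values() if check(output))
    return passed / len(checks)


# Hypothetical checks for the ad-creative output schema.
CHECKS = {
    "has_image_url": lambda out: bool(out.get("image_url")),
    "is_https": lambda out: str(out.get("image_url", "")).startswith("https://"),
}
```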
The backend you stop maintaining
- Drive — workspace-level file storage. Every run gets its own browsable folder, grouped by skill; skills read and write it as plain files.
- Workspace memory — a shared, queryable brain. What one skill learns about your product, every other skill can reuse, so work done once isn't redone.
- Secrets — skillpack-scoped and write-only (values are never returned by the API), injected as env vars at runtime.
- Versioned deployments — every deploy is immutable. Activation is a
rolling switch, and any caller can pin
?version=N to keep running a release it has validated.
No servers, no queues, no Docker.
Skills you can run today
Puras isn't a toolkit demo — these skills are deployed, public, and runnable in a free in-browser playground right now:
| Skill | What you get |
|---|---|
| AI Playable Ad Generator | A real, MRAID-ready HTML5 mini-game in one file, from a logo and sprites |
| AI Game Ad Generator | A captioned video ad from a Play Store / App Store link |
| AI End Card Maker | Install-driving end cards in the placement sizes UA networks need |
| Auto-Caption Burner | Word-synced karaoke captions burned onto any video |
| AI Product Photo Generator | An art-directed product photo set: hero, detail, in-use, lifestyle |
| AI UGC Video Generator | A handheld, talk-to-camera UGC ad with a cast creator |
| AI Product Video Generator | A cinematic reveal clip from product photos |
| AI Image Ad Generator | Researched static ads with headline and CTA, per placement |
| AI Landing Page Generator | A self-contained landing page at a stable live URL |
| AI Talking Avatar Generator | A lip-synced talking-head video that reads your script verbatim |
| AI Content Repurposer | One source rewritten natively for LinkedIn, X, Reddit, Instagram |
| AI Deep Research Agent | A cited, verified research report as polished HTML or PDF |
Every one of them is built on the same primitives described above — and every
public skill has a machine-readable markdown twin at its page URL + .md,
so your coding agent can read the contract directly.
Pricing: usage-based, success-only
No subscriptions, no seats. You're charged per job, on success only — failed and cancelled jobs cost $0, no matter how much they did. Every job has a line-item breakdown of what it used.
Every workspace gets $10 of free credit, replenished at $2 a day, and every public skill has a free in-browser playground: sign in with Google and run it. No credit card.
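The success-only rule is easy to state in code. This sketch assumes a job has a status string and a list of line items with a cost_usd field; both the status values and the field name are hypothetical, chosen only to illustrate the rule.

```python
from decimal import Decimal


def job_charge(status: str, line_items: list[dict]) -> Decimal:
    """Return what a job costs under success-only billing.

    Assumed shapes: status is "succeeded", "failed", or "cancelled", and
    each line item carries its cost as a decimal string in "cost_usd".
    """
    if status != "succeeded":
        return Decimal("0")  # failed and cancelled jobs cost nothing
    return sum((Decimal(item["cost_usd"]) for item in line_items), Decimal("0"))


# Made-up usage breakdown for one run.
ITEMS = [
    {"model": "text", "cost_usd": "0.12"},
    {"model": "image", "cost_usd": "0.30"},
]
```

With these made-up items, a succeeded job is charged the sum of its line items, and the same job marked failed or cancelled is charged zero, however much work it did.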
Get started
- Browse the public AI skills and run one in the playground.
- Build and deploy a skillpack — zero to deployed in a few minutes.
- Or connect your coding agent and let it do both:
claude mcp add --transport http puras https://mcp.puras.co/mcp
FAQ
What is an agentic backend? A managed backend that runs AI agents server-side: long-running jobs that plan, call tools, and iterate to a finished result, exposed to your app as a plain API with billing and observability built in.
What's the difference between an agentic and a deterministic skill?
The entrypoint. A .md entrypoint runs an LLM tool-use loop with the file
as its system prompt; a main.py:run entrypoint runs plain Python. Both
deploy, bill, and stream the same way.
Do I need my own model API keys? No. Model usage — LLM tokens and media generation — is metered into each job's cost and billed to your workspace balance. One Puras key is all your app holds.
Can my coding agent use Puras directly?
Yes — that's the primary dev surface. The hosted MCP server at
mcp.puras.co exposes the full loop (scaffold, push, run, tail, fork) over
OAuth, with nothing to install.
