Mehmet Ecevit · 7 min read

Introducing Puras: The Agentic Backend-as-a-Service

Deploy AI agents as skills behind one API, with the agent loop, media models, billing, and observability built in.

Puras is an agentic backend-as-a-service. You write a skill — a prompt plus a typed contract — and Puras runs it server-side as a full agent: it plans, calls tools, and iterates until there's a finished result. You call it like any other API. One key, any app.

We launched quietly two weeks ago, and the platform has already outgrown what we said then. Here's what's actually live today.

What is an agentic backend?

A plain backend executes requests. An agentic backend executes goals.

It's built for work where a single model call isn't enough — jobs that need to plan, use tools, and iterate:

  • research a topic and produce a cited report,
  • turn a store listing into a finished, captioned video ad,
  • build a playable mini-game from a logo and a handful of sprites.

That work runs for minutes, not milliseconds. It calls models, browsers, ffmpeg, and other skills along the way. An agentic backend gives it a place to run — with the agent loop, tool execution, file storage, billing, and observability already built — so your app just submits a job and reads the result.

One primitive: the skill

The whole platform reduces to three nouns: a skill is one capability, a skillpack is a versioned bundle of skills you deploy as one unit, and a job is one run of one skill.

A skill is a folder with a skill.yaml:

```yaml
title: Ad Creative
description: Research a product page and produce a finished ad image.
entrypoint: SKILL.md
text_model: claude/sonnet-4-6
image_model: google/nano-banana

input_schema:
  type: object
  properties:
    product_url: { type: string, description: "Public product page" }
  required: [product_url]

output_schema:
  type: object
  properties:
    image_url: { type: string }
```
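
To make the "typed contract" concrete, here is a minimal sketch of what checking inputs against the input_schema above could look like. It is hand-rolled for illustration and is not how Puras validates requests; `validate_inputs` and `_PY_TYPES` are hypothetical names.

```python
# Illustrative contract check only; not Puras's actual validator.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "product_url": {"type": "string", "description": "Public product page"},
    },
    "required": ["product_url"],
}

# Map the JSON Schema type names used in this sketch to Python types.
_PY_TYPES = {"string": str, "object": dict}


def validate_inputs(inputs: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means the inputs pass."""
    problems = []
    for name in schema.get("required", []):
        if name not in inputs:
            problems.append(f"missing required field: {name}")
    for name, value in inputs.items():
        spec = schema.get("properties", {}).get(name)
        if spec is None:
            continue  # unknown fields are ignored in this sketch
        expected = _PY_TYPES.get(spec.get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"{name}: expected {spec['type']}")
    return problems
```

A caller could run this before submitting a job, so a malformed request fails locally instead of starting a run.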

The entrypoint alone decides how it runs:

  • Point it at a .md file and the worker runs an agentic loop with that file as the system prompt. The agent gets built-in tools — bash, web search and fetch, screenshots, file read/write/edit, media generation, transcription, subagents, and a persistent workspace memory — plus any custom tools you declare.
  • Point it at main.py:run and the worker calls your Python in an isolated subprocess. Deterministic, fast, no LLM in the loop.

Same submission API, same billing, same observability. The only difference is what runs inside the worker.
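
For the deterministic path, a main.py might look something like the sketch below. The `run(inputs) -> dict` signature is an assumption drawn from the `main.py:run` entrypoint form, not a documented interface, and this example skill (normalizing a URL) is made up for illustration.

```python
# main.py: hypothetical deterministic skill. The run(inputs) -> dict
# signature is an assumption, not a documented Puras interface.
from urllib.parse import urlparse


def run(inputs: dict) -> dict:
    """Normalize a product URL without calling any model."""
    url = inputs["product_url"].strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"product_url must be an http(s) URL, got {url!r}")
    # Return only plain JSON-serializable values.
    return {"host": parsed.netloc.lower(), "path": parsed.path or "/"}
```

Because there is no LLM in the loop, the same input always produces the same output, which makes this kind of skill easy to unit-test locally.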

Deploy an AI agent in two minutes

```bash
pip install puras
puras login      # paste your workspace API key
puras init       # scaffold a starter skillpack
puras deploy     # zip + push a versioned deployment
puras run hello --input prompt="Say hi to Ada"
```

Or skip the CLI entirely and tell your coding agent. Puras is MCP-first — the server is hosted, auth is OAuth, there's nothing to install:

```bash
claude mcp add --transport http puras https://mcp.puras.co/mcp
```

From Claude Code, Cursor, or VS Code, your agent can scaffold, push, fork, run, and tail skills directly. Prompt to production without leaving your editor.

Call it from anywhere

Every deployed skill is an API. Address it by its workspace/skillpack/skill path:

```bash
curl -X POST "https://api.puras.co/v1/jobs?skillpack=acme/growth-tools&wait=true" \
  -H "Authorization: Bearer $PURAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"skill": "ad-creative", "inputs": {"product_url": "https://example.com"}}'
```

One endpoint, three call modes: async (poll the job id), sync (?wait=true), or live streaming (?stream=true, Server-Sent Events). From Python:

```python
import puras

client = puras.Client()  # reads PURAS_API_KEY
result = client.run("acme/growth-tools/ad-creative",
                    inputs={"product_url": "https://example.com"})
```
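
If you are calling the HTTP endpoint without the SDK, the three call modes differ only in the query string. The sketch below builds the URL for each mode and parses a Server-Sent Events stream using the standard SSE wire format. `jobs_url` and `parse_sse` are hypothetical helpers, and the JSON shape of each event is an assumption.

```python
import json
from urllib.parse import urlencode

API_BASE = "https://api.puras.co/v1/jobs"  # endpoint from the curl example


def jobs_url(skillpack: str, mode: str = "async") -> str:
    """Build the submission URL for one of the three call modes."""
    params = {"skillpack": skillpack}
    if mode == "sync":
        params["wait"] = "true"
    elif mode == "stream":
        params["stream"] = "true"
    elif mode != "async":
        raise ValueError(f"unknown mode: {mode}")
    # safe="/" keeps the workspace/skillpack slash readable, as in the curl call.
    return f"{API_BASE}?{urlencode(params, safe='/')}"


def parse_sse(lines):
    """Yield the decoded JSON payload of each Server-Sent Event.

    "data:" lines accumulate until a blank line ends the event. Comment
    lines (starting with ":") and other fields are skipped. That each
    payload is JSON is an assumption about the event format.
    """
    data = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            value = line[5:]
            # The SSE format strips a single leading space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data:
            yield json.loads("\n".join(data))
            data = []
```

Feeding `parse_sse` the decoded lines of a streaming HTTP response gives you one dict per event, which you can forward to a progress UI.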

Ask for a capability, never wire a model

Skills declare models the way they declare schemas — high-level, in skill.yaml: text_model for the agent (claude/sonnet-4-6, claude/opus-4-8, …) and image_model / video_model / audio_model defaults for media.

Inside the skill, media is four capability verbs — generate_image, generate_video, generate_audio, transcribe — and the inputs fix the task:

  • pass an image → image-to-video,
  • pass lip-sync audio → a talking head,
  • pass neither → text-to-video.

The model is a portable family token — google/veo, bytedance/seedance, kuaishou/kling, google/nano-banana, elevenlabs/tts — or just "auto". If your skill asks a family for a task it can't do, you get a clear error up front — never a silently-wrong paid render.
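
The "inputs fix the task" rule for video can be pictured as a small dispatch function. This is a sketch of the idea, not the platform's code: `infer_video_task` is a hypothetical name, and giving lip-sync audio priority when both inputs are present is an assumption the text above does not settle.

```python
def infer_video_task(image=None, lip_sync_audio=None) -> str:
    """Derive the video task from which inputs are present."""
    # Assumption: lip-sync audio wins if an image is also supplied.
    if lip_sync_audio is not None:
        return "talking-head"
    if image is not None:
        return "image-to-video"
    return "text-to-video"
```

The caller never names a task explicitly, so the same generate_video call covers all three cases.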

Any run can override any of them, no redeploy:

```json
{ "skill": "ad-creative", "inputs": { "...": "..." },
  "media_models": { "video": "bytedance/seedance-fast" } }
```

Same skill, different models. An override that can't serve a given call falls back to a capable model and flags it with a warning, so an override never breaks a deployed skill. When a better model ships, switching is a one-line override, not a redeploy.
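
One way to picture the override-with-fallback behavior is the sketch below. Everything in it is illustrative: `resolve_model` is a hypothetical function, the capability table is passed in by the caller with made-up model tokens, and falling back to the skill's own default is an assumption about which "capable model" gets picked.

```python
def resolve_model(kind, task, skill_defaults, overrides, capabilities):
    """Pick the model for one media call.

    Returns (model, warning); warning is None when no fallback happened.
    """
    default = skill_defaults[kind]
    chosen = overrides.get(kind, default)
    if task in capabilities.get(chosen, set()):
        return chosen, None
    # The override cannot serve this call: fall back to the skill default
    # (an assumption) and flag it instead of failing the job.
    if chosen != default and task in capabilities.get(default, set()):
        return default, f"{chosen} cannot do {task}; fell back to {default}"
    # The skill's own model cannot do the task: fail before any paid render.
    raise ValueError(f"{chosen} cannot do {task}")


# Hypothetical capability table with placeholder model tokens.
CAPS = {
    "example/video-a": {"text-to-video", "image-to-video"},
    "example/video-fast": {"text-to-video"},
}
DEFAULTS = {"video": "example/video-a"}
```

With this table, overriding video to `example/video-fast` works for a text-to-video call, while an image-to-video call falls back to `example/video-a` and returns a warning.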

Every job is a pipeline you can watch

A running job isn't a spinner. The dashboard renders it as a live pipeline — research, parallel asset generation, render, output — and the API streams the same events over SSE. A failure lands on the node that broke — not a generic 500.

After the run:

  • GET /v1/jobs/{id}/usage gives the line-item cost breakdown per model call.
  • Skills can declare evals — deterministic check graders and LLM-judge rubric graders that score every finished run into an eval_score, so quality is a number you can track, not a vibe.
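
As a rough illustration of the eval idea, the sketch below pairs a deterministic check with a stand-in for an LLM judge and averages them into one score. The grader signatures, the averaging rule, and the function names are all assumptions; the source only says that graders score each finished run into an eval_score.

```python
def check_https_image(output: dict) -> float:
    """Deterministic check grader: 1.0 if image_url is an https URL, else 0.0."""
    url = output.get("image_url", "")
    return 1.0 if isinstance(url, str) and url.startswith("https://") else 0.0


def eval_score(output: dict, graders) -> float:
    """Combine grader scores into one number (simple mean; an assumption)."""
    scores = [grader(output) for grader in graders]
    return sum(scores) / len(scores)


def fake_judge(output: dict) -> float:
    """Stand-in for an LLM-judge rubric grader; returns a fixed score here."""
    return 0.5
```

Tracking this number per deployment version is what turns "the new prompt feels better" into a comparison you can chart.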

The backend you stop maintaining

  • Drive — workspace-level file storage. Every run gets its own browsable folder, grouped by skill; skills read and write it as plain files.
  • Workspace memory — a shared, queryable brain. What one skill learns about your product, every other skill can reuse, so work done once isn't redone.
  • Secrets — skillpack-scoped and write-only (values are never returned by the API), injected as env vars at runtime.
  • Versioned deployments — every deploy is immutable. Activation is a rolling switch, and any caller can pin ?version=N to keep running a release it has validated.

No servers, no queues, no Docker.
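
Since secrets arrive as environment variables, reading one inside a Python skill needs nothing beyond the standard library. A minimal sketch, where `get_secret` and the variable name `EXAMPLE_API_TOKEN` are hypothetical:

```python
import os


def get_secret(name: str) -> str:
    """Read a secret injected as an env var; fail loudly if it is missing."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"secret {name} is not set for this skillpack")
    return value
```

Inside a skill you would call `get_secret("EXAMPLE_API_TOKEN")` once at startup, so a missing secret fails the run immediately instead of midway through.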

Skills you can run today

Puras isn't a toolkit demo — these skills are deployed, public, and runnable in a free in-browser playground right now:

| Skill | What you get |
| --- | --- |
| AI Playable Ad Generator | A real, MRAID-ready HTML5 mini-game in one file, from a logo and sprites |
| AI Game Ad Generator | A captioned video ad from a Play Store / App Store link |
| AI End Card Maker | Install-driving end cards in the placement sizes UA networks need |
| Auto-Caption Burner | Word-synced karaoke captions burned onto any video |
| AI Product Photo Generator | An art-directed product photo set: hero, detail, in-use, lifestyle |
| AI UGC Video Generator | A handheld, talk-to-camera UGC ad with a cast creator |
| AI Product Video Generator | A cinematic reveal clip from product photos |
| AI Image Ad Generator | Researched static ads with headline and CTA, per placement |
| AI Landing Page Generator | A self-contained landing page at a stable live URL |
| AI Talking Avatar Generator | A lip-synced talking-head video that reads your script verbatim |
| AI Content Repurposer | One source rewritten natively for LinkedIn, X, Reddit, Instagram |
| AI Deep Research Agent | A cited, verified research report as polished HTML or PDF |

Every one of them is built on the same primitives described above, and every public skill has a machine-readable markdown twin at its page URL with .md appended, so your coding agent can read the contract directly.

Pricing: usage-based, success-only

No subscriptions, no seats. You're charged per job, on success only — failed and cancelled jobs cost $0, no matter how much they did. Every job has a line-item breakdown of what it used.
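
The success-only rule comes down to a few lines of arithmetic. In this sketch the status strings and the `cost_usd` field are assumptions about how a job and its line items might be represented; `billed_amount` is a hypothetical helper.

```python
from decimal import Decimal


def billed_amount(status: str, line_items: list) -> Decimal:
    """Charge the sum of line items only when the job succeeded."""
    if status != "succeeded":
        # Failed and cancelled jobs cost $0, whatever they consumed.
        return Decimal("0")
    # Decimal avoids float rounding surprises when adding money.
    return sum((Decimal(item["cost_usd"]) for item in line_items), Decimal("0"))
```

A job with line items of 0.12 and 0.30 bills 0.42 on success and 0 if it fails or is cancelled.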

Every workspace gets $10 of free credit, replenished at $2 a day, and every public skill has a free in-browser playground: sign in with Google and run it. No credit card.

Get started

  1. Browse the public AI skills and run one in the playground.
  2. Build and deploy a skillpack — zero to deployed in a few minutes.
  3. Or connect your coding agent and let it do both: claude mcp add --transport http puras https://mcp.puras.co/mcp

FAQ

What is an agentic backend? A managed backend that runs AI agents server-side: long-running jobs that plan, call tools, and iterate to a finished result, exposed to your app as a plain API with billing and observability built in.

What's the difference between an agentic and a deterministic skill? The entrypoint. A .md entrypoint runs an LLM tool-use loop with the file as its system prompt; a main.py:run entrypoint runs plain Python. Both deploy, bill, and stream the same way.

Do I need my own model API keys? No. Model usage — LLM tokens and media generation — is metered into each job's cost and billed to your workspace balance. One Puras key is all your app holds.

Can my coding agent use Puras directly? Yes — that's the primary dev surface. The hosted MCP server at mcp.puras.co exposes the full loop (scaffold, push, run, tail, fork) over OAuth, with nothing to install.