puras / Avatar Studio

Turn a script — and a photo — into a talking-head video

An AI talking-avatar generator: paste your exact words and an optional presenter photo, and it returns a lip-synced talking-head video that reads your script to camera, in the language you choose. No camera, talent, or editing.

Input: presenter photo + script → Output: talking video

~$1.10 per clip · Try it free: $10 free credits, no card.

Script: the exact words spoken, verbatim. Plain prose, no SSML.


Everything you can do

What AI Talking Avatar Generator does

Script in, video out

Paste your exact words and the avatar reads them to camera, lip-synced — nothing to film.

Your words, exactly

It speaks the script verbatim, so product names and calls-to-action come out right every time.

Ready to post

You get a finished talking-head clip to drop into ads, explainers, or social feeds.

Callable from your product

Generate avatar videos on demand from any language with one API key.

Who it's for

Built for teams who ship.

Marketing & social teams

Produce spokesperson clips without booking talent or a shoot.

Course & content creators

Turn scripts into talking-head lessons and explainers at scale.

Product & app teams

Generate onboarding and explainer videos programmatically.

Comparison

AI Talking Avatar Generator vs HeyGen

An API-native, usage-based alternative to HeyGen — your skill and prompt run server-side, called from your own product, billed per run.

| | HeyGen | puras |
| Pricing | Monthly subscription with credit tiers | Usage-based: pay per video, exact cost each run |
| How it works | Pick an avatar and edit in their studio | A script (and optional presenter photo) in, a lip-synced clip out |
| Your own face | Avatar creation flow inside their app | Upload any front-facing portrait; it lip-syncs that photo directly |
| Where it runs | A web app you log into | API-native: generate talking-head clips from your own product |

Comparison

AI Talking Avatar Generator vs Synthesia

An API-native, usage-based alternative to Synthesia — your skill and prompt run server-side, called from your own product, billed per run.

| | Synthesia | puras |
| Built for | Studio avatars for corporate and training video | Script-to-clip spokesperson videos for ads, social, and products |
| Pricing | Per-seat subscription with video-minute allowances | Usage-based: pay per video, exact cost each run |
| How it works | Assemble scenes in their editor | One call: script in, finished lip-synced MP4 out |
| Where it runs | A web app you log into | API-native: generate clips from any language with one key |

Comparison

AI Talking Avatar Generator vs D-ID

An API-native, usage-based alternative to D-ID — your skill and prompt run server-side, called from your own product, billed per run.

| | D-ID | puras |
| Pricing | Credit packs and subscription tiers | Usage-based: pay per video, exact cost each run |
| Languages | Varies by plan tier | Reads your exact words, in the language you choose |
| Output | Talking-photo clips inside their platform | Finished MP4s in 9:16, 1:1, or 16:9, ready for feeds and ads |
| Where it runs | Their studio or their API plans | API-native by default: same skill in the browser and in your code |

Free to try

Try AI Talking Avatar Generator free — in your browser

The playground above is the real skill, not a demo. Load an example or bring your own inputs, sign in with Google, and run it — no credit card, no subscription, no install. After the free try, runs are usage-based and every run reports its exact cost.

Run it free

FAQ

Questions, answered.

How do I turn a photo into a talking AI video?

Upload a clear, front-facing portrait and a text script — the talking photo AI lip-syncs the photo to a generated voice reading your exact words, and returns a finished MP4 clip. The photo's framing carries straight through to the output.

How do I make an AI video of someone talking from a script?

Paste the exact words you want spoken; the avatar reads them verbatim to camera, lip-synced, with no filming. Supply a presenter photo or let the skill cast a fitting presenter from the script itself.

Can I make multilingual talking head videos?

Yes — set a language code or leave it on auto-detect, and the avatar speaks your script in that language. It reads your words verbatim, so the script you write is the script that's spoken.

Can I make a talking avatar from a photo of myself?

Yes — give it your own front-facing portrait as the presenter and it lip-syncs that photo to your script. It uses the photo you supply rather than training a separate model, so there's nothing to set up.

Is there a talking avatar API for developers?

Yes. This is an API-native skill: the browser playground and the API run the same skill. Call it from any language with one API key to generate lip-synced talking-head MP4s on demand, with usage-based pricing and the exact cost reported per run.

What format are the talking avatar videos exported in?

Each run returns a finished video clip — one per aspect ratio you pick (9:16, 1:1, or 16:9). When you supply your own presenter photo, the clip keeps that photo's framing. Ready to drop into ads, social feeds, explainers, or training videos.

Looking for a cheaper HeyGen or Synthesia alternative?

This is a usage-based talking avatar generator with no seats and no per-task credit bundles — you pay for the run you make and see its exact cost each time, then embed it in your own product via the API.

Powered by puras

This skill is infrastructure.

AI Talking Avatar Generator runs on puras, the agentic backend — every skill here is a versioned, server-side capability your product can call with one API key: async, sync, or streamed. No model wiring, no queue, no servers.

Related skills

Ecommerce & product creative

For developers

Run it from your own stack.

This skill is an API. One call runs the whole pipeline server-side as a long-running job and returns the result — from Python, plain HTTP, or an MCP-connected coding agent.

API access — MCP · Python SDK · cURL · JSON schemas
MCP · recommended for coding agents
claude mcp add --transport http puras https://mcp.puras.co/mcp

OAuth on first call — no key to paste. Then ask the agent to run talking-avatar from skillpack puras/avatar-studio.

CLI
pip install puras
puras login  # or set PURAS_API_KEY
puras run puras/avatar-studio/talking-avatar -i key=value
Python SDK
import puras

client = puras.Client()  # PURAS_API_KEY from env
result = client.run(
    "puras/avatar-studio/talking-avatar",
    {
        "voice": "auto",
        "script": "Big news — we just shipped dark mode. Open Settings, tap Appearance, and pick Dark. Your eyes will thank you tonight.\n",
        "avatar_image": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-presenter.jpg",
    },
)
HTTP API · wait=true blocks until the job finishes
curl -X POST "https://api.puras.co/v1/jobs?skillpack=puras/avatar-studio&wait=true" \
  -H "Authorization: Bearer $PURAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"skill":"talking-avatar","inputs":{"voice":"auto","script":"Big news — we just shipped dark mode. Open Settings, tap Appearance, and pick Dark. Your eyes will thank you tonight.\n","avatar_image":"https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-presenter.jpg"}}'
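If the SDK is not installed, the same HTTP request can be assembled with Python's standard library. The sketch below only builds the POST request from the cURL example and does not send it. The helper name build_job_request is hypothetical, and the shape of the response body is not documented on this page, so parsing it is left out.

```python
import json
import os
import urllib.parse
import urllib.request

# Endpoint taken from the cURL example on this page.
API_BASE = "https://api.puras.co/v1/jobs"


def build_job_request(skill, inputs, skillpack="puras/avatar-studio",
                      api_key=None, wait=True):
    """Build (but do not send) the job request shown in the cURL example.

    Hypothetical helper, not part of the puras SDK.
    """
    key = api_key or os.environ.get("PURAS_API_KEY", "")
    params = {"skillpack": skillpack}
    if wait:
        # wait=true blocks until the job finishes, per the HTTP API note.
        params["wait"] = "true"
    # Keep "/" unescaped so the URL matches the cURL example.
    query = urllib.parse.urlencode(params, safe="/")
    body = json.dumps({"skill": skill, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}?{query}",
        data=body,
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_job_request(
    "talking-avatar",
    {"voice": "auto", "script": "Big news: we just shipped dark mode."},
)
# To send it: urllib.request.urlopen(req), then read the JSON body.
```

Because wait=true holds the connection open until the job finishes, a caller would probably want a generous timeout on the send; the right value is not stated here.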
Input schema (JSON Schema)
{
  "type": "object",
  "required": [
    "script"
  ],
  "properties": {
    "look": {
      "type": "text",
      "maxLength": 400,
      "description": "Optional steer for the generated presenter and setting. Ignored with avatar_image."
    },
    "voice": {
      "enum": [
        "auto",
        "warm_female",
        "warm_male",
        "energetic_female",
        "energetic_male",
        "calm_narrator",
        "authoritative_male"
      ],
      "type": "string",
      "default": "auto",
      "description": "Voice persona. `auto` fits it to the script."
    },
    "script": {
      "type": "text",
      "maxLength": 1000,
      "minLength": 2,
      "description": "The exact words spoken, verbatim. Plain prose, no SSML."
    },
    "language": {
      "type": "string",
      "maxLength": 12,
      "description": "Optional language code (e.g. \"en\", \"tr\"). Empty = auto-detect."
    },
    "avatar_image": {
      "type": "image",
      "description": "Optional presenter portrait. Clear, front-facing, mouth visible."
    },
    "aspect_ratios": {
      "type": "array",
      "items": {
        "enum": [
          "9:16",
          "1:1",
          "16:9"
        ],
        "type": "string"
      },
      "default": [
        "9:16"
      ],
      "minItems": 1,
      "description": "Output frame(s). Pick one or more — one clip is rendered per selected ratio. Ignored when you supply avatar_image (the photo's framing wins).",
      "uniqueItems": true
    }
  }
}
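The constraints in the input schema can be checked locally before spending a run. This is a minimal sketch that mirrors only the limits listed above (required script, length bounds, voice and aspect-ratio enums, unique ratios). The function name check_inputs is hypothetical, and it assumes the server counts characters the same way Python's len() does, which this page does not confirm.

```python
VOICES = {
    "auto", "warm_female", "warm_male", "energetic_female",
    "energetic_male", "calm_narrator", "authoritative_male",
}
RATIOS = {"9:16", "1:1", "16:9"}


def check_inputs(inputs):
    """Return a list of problems; an empty list means these checks passed.

    Hypothetical client-side helper that mirrors the input schema.
    """
    problems = []

    script = inputs.get("script")
    if not isinstance(script, str):
        problems.append("script is required and must be a string")
    elif not 2 <= len(script) <= 1000:
        problems.append("script must be 2 to 1000 characters")

    if inputs.get("voice", "auto") not in VOICES:
        problems.append("voice is not one of the listed personas")

    if len(inputs.get("look", "")) > 400:
        problems.append("look must be at most 400 characters")

    if len(inputs.get("language", "")) > 12:
        problems.append("language must be at most 12 characters")

    ratios = inputs.get("aspect_ratios", ["9:16"])
    if (not ratios or len(set(ratios)) != len(ratios)
            or not set(ratios) <= RATIOS):
        problems.append(
            "aspect_ratios must be a non-empty list of unique values "
            "from 9:16, 1:1, 16:9"
        )

    return problems


print(check_inputs({"script": "Hello there.", "aspect_ratios": ["9:16", "1:1"]}))
```

Passing this check does not guarantee the run is accepted; the server remains the authority on what it validates.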
Output schema (JSON Schema)
{
  "type": "object",
  "properties": {
    "videos": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "video_url": {
            "type": "video",
            "description": "Drive path to the rendered talking-avatar clip (with narration); served to readers as a stable media URL. The playground renders it with a <video> player."
          },
          "aspect_ratio": {
            "enum": [
              "9:16",
              "1:1",
              "16:9"
            ],
            "type": "string",
            "description": "The frame this clip was rendered for."
          }
        }
      },
      "minItems": 1,
      "description": "One rendered clip per aspect ratio."
    }
  }
}
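Since one clip is rendered per aspect ratio, a caller will often want to look clips up by ratio. The sketch below works on an object shaped like the output schema. Where that object sits inside the SDK or HTTP response is not documented here, and the sample URLs are made up for illustration.

```python
def clips_by_ratio(output):
    """Map aspect_ratio -> video_url for an object matching the output schema.

    Hypothetical helper; entries missing either field are skipped because
    the schema does not mark them as required.
    """
    result = {}
    for clip in output.get("videos", []):
        ratio = clip.get("aspect_ratio")
        url = clip.get("video_url")
        if ratio and url:
            result[ratio] = url
    return result


# Made-up sample shaped like the output schema (not a real response).
sample_output = {
    "videos": [
        {"video_url": "https://example.com/clip-9x16.mp4", "aspect_ratio": "9:16"},
        {"video_url": "https://example.com/clip-1x1.mp4", "aspect_ratio": "1:1"},
    ]
}

clips = clips_by_ratio(sample_output)
print(clips.get("9:16"))
```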

Try AI Talking Avatar Generator free.

Run it in the playground above — $10 free credit, no card. Every run returns an exact cost receipt.

Want this in your own pipeline? Deploy your own skill →