# AI Talking Avatar Generator

> Agent skill on puras — published by puras. An AI talking-head video generator: turn a text script — and an optional presenter photo — into a lip-synced talking-avatar video that reads your exact words to camera, in the language you choose. No camera, talent, or editing — a finished clip ready for ads, explainers, or social.

- **Skill path:** `puras/avatar-studio/talking-avatar`
- **Skillpack:** Avatar Studio (`avatar-studio`)
- **Skillpack ID:** `577be7ab-db36-4689-9bda-5c35ef02ce25`
- **Deployment version:** v21
- **Human page:** https://puras.co/skills/puras/avatar-studio/talking-avatar
- **API base:** `https://api.puras.co`

## Description

An AI talking-head video generator: turn a text script — and an optional presenter photo — into a lip-synced talking-avatar video that reads your exact words to camera, in the language you choose. No camera, talent, or editing — a finished clip ready for ads, explainers, or social.

**Turn a script, and optionally a photo, into a talking-head video**

## Who it's for

- **Marketing & social teams** — Produce spokesperson clips without booking talent or a shoot.
- **Course & content creators** — Turn scripts into talking-head lessons and explainers at scale.
- **Product & app teams** — Generate onboarding and explainer videos programmatically.

## AI Talking Avatar Generator vs HeyGen

| | HeyGen | puras |
| --- | --- | --- |
| Pricing | Monthly subscription with credit tiers | Usage-based — pay per video, exact cost each run |
| How it works | Pick an avatar and edit in their studio | A script (and optional presenter photo) in, a lip-synced clip out |
| Your own face | Avatar creation flow inside their app | Upload any front-facing portrait — it lip-syncs that photo directly |
| Where it runs | A web app you log into | API-native — generate talking-head clips from your own product |

## AI Talking Avatar Generator vs Synthesia

| | Synthesia | puras |
| --- | --- | --- |
| Built for | Studio avatars for corporate and training video | Script-to-clip spokesperson videos for ads, social, and products |
| Pricing | Per-seat subscription with video-minute allowances | Usage-based — pay per video, exact cost each run |
| How it works | Assemble scenes in their editor | One call: script in, finished lip-synced MP4 out |
| Where it runs | A web app you log into | API-native — generate clips from any language with one key |

## AI Talking Avatar Generator vs D-ID

| | D-ID | puras |
| --- | --- | --- |
| Pricing | Credit packs and subscription tiers | Usage-based — pay per video, exact cost each run |
| Languages | Varies by plan tier | Reads your exact words, in the language you choose |
| Output | Talking-photo clips inside their platform | Finished MP4s in 9:16, 1:1, or 16:9 — ready for feeds and ads |
| Where it runs | Their studio or their API plans | API-native by default — same skill in the browser and in your code |

## FAQ

### How do I turn a photo into a talking AI video?

Upload a clear, front-facing portrait and a text script. The skill lip-syncs the photo to a generated voice reading your exact words and returns a finished MP4 clip. The photo's framing carries straight through to the output.

### How do I make an AI video of someone talking from a script?

Paste the exact words you want spoken; the avatar reads them verbatim to camera, lip-synced, with no filming. Supply a presenter photo or let the skill cast a fitting presenter from the script itself.

### Can I make multilingual talking head videos?

Yes — set a language code or leave it on auto-detect, and the avatar speaks your script in that language. It reads your words verbatim, so the script you write is the script that's spoken.

### Can I make a talking avatar from a photo of myself?

Yes — give it your own front-facing portrait as the presenter and it lip-syncs that photo to your script. It uses the photo you supply rather than training a separate model, so there's nothing to set up.

### Is there a talking avatar API for developers?

Yes — this is an API-native skill, not a web app. Call it from any language with one API key to generate lip-synced talking-head MP4s on demand, with usage-based pricing and the exact cost reported per run.

### What format are the talking avatar videos exported in?

Each run returns a finished video clip — one per aspect ratio you pick (9:16, 1:1, or 16:9). When you supply your own presenter photo, the clip keeps that photo's framing. Ready to drop into ads, social feeds, explainers, or training videos.

### Looking for a cheaper HeyGen or Synthesia alternative?

This is a usage-based talking avatar generator with no seats and no per-task credit bundles — you pay for the run you make and see its exact cost each time, then embed it in your own product via the API.

## Try it free

AI Talking Avatar Generator has a free in-browser playground on its page (https://puras.co/skills/puras/avatar-studio/talking-avatar) — load an example or bring your own inputs and run it with a Google sign-in. No credit card, no subscription; runs are usage-based after the free try.

## Input schema

```json
{
  "type": "object",
  "required": [
    "script"
  ],
  "properties": {
    "look": {
      "type": "text",
      "maxLength": 400,
      "description": "Optional steer for the generated presenter and setting. Ignored with avatar_image."
    },
    "voice": {
      "enum": [
        "auto",
        "warm_female",
        "warm_male",
        "energetic_female",
        "energetic_male",
        "calm_narrator",
        "authoritative_male"
      ],
      "type": "string",
      "default": "auto",
      "description": "Voice persona. `auto` fits it to the script."
    },
    "script": {
      "type": "text",
      "maxLength": 1000,
      "minLength": 2,
      "description": "The exact words spoken, verbatim. Plain prose, no SSML."
    },
    "language": {
      "type": "string",
      "maxLength": 12,
      "description": "Optional language code (e.g. \"en\", \"tr\"). Empty = auto-detect."
    },
    "avatar_image": {
      "type": "image",
      "description": "Optional presenter portrait. Clear, front-facing, mouth visible."
    },
    "aspect_ratios": {
      "type": "array",
      "items": {
        "enum": [
          "9:16",
          "1:1",
          "16:9"
        ],
        "type": "string"
      },
      "default": [
        "9:16"
      ],
      "minItems": 1,
      "description": "Output frame(s). Pick one or more — one clip is rendered per selected ratio. Ignored when you supply avatar_image (the photo's framing wins).",
      "uniqueItems": true
    }
  }
}
```
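The constraints in the schema above can be checked client-side before a run is started. The following is a minimal sketch, not part of the puras SDK: `validate_inputs` is a hypothetical helper, it assumes `maxLength` and `minLength` count characters the way Python's `len` does, and it skips `avatar_image`, whose accepted forms the schema does not spell out.

```python
# Hypothetical client-side check mirroring the input schema; not a puras API.
VOICES = {
    "auto", "warm_female", "warm_male", "energetic_female",
    "energetic_male", "calm_narrator", "authoritative_male",
}
ASPECT_RATIOS = {"9:16", "1:1", "16:9"}


def validate_inputs(inputs: dict) -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []

    # `script` is the only required field (2 to 1000 characters).
    script = inputs.get("script")
    if not isinstance(script, str):
        problems.append("script is required and must be a string")
    elif not 2 <= len(script) <= 1000:
        problems.append("script must be 2 to 1000 characters")

    # `voice` defaults to "auto" when omitted.
    if inputs.get("voice", "auto") not in VOICES:
        problems.append("voice is not one of the listed personas")

    if len(inputs.get("look", "")) > 400:
        problems.append("look must be at most 400 characters")

    if len(inputs.get("language", "")) > 12:
        problems.append("language must be at most 12 characters")

    # `aspect_ratios` defaults to ["9:16"]; it needs at least one unique, known item.
    ratios = inputs.get("aspect_ratios", ["9:16"])
    if not isinstance(ratios, list) or len(ratios) < 1:
        problems.append("aspect_ratios must be a non-empty list")
    elif len(set(ratios)) != len(ratios):
        problems.append("aspect_ratios must not contain duplicates")
    elif not set(ratios) <= ASPECT_RATIOS:
        problems.append("aspect_ratios contains an unsupported ratio")

    return problems


print(validate_inputs({"script": "Hello there.", "aspect_ratios": ["9:16", "1:1"]}))
```

Returning a list of problems rather than raising on the first one lets a form or an agent report every issue in a single pass.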

## Output schema

```json
{
  "type": "object",
  "properties": {
    "videos": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "video_url": {
            "type": "video",
            "description": "Drive path to the rendered talking-avatar clip (with narration); served to readers as a stable media URL. The playground renders it with a <video> player."
          },
          "aspect_ratio": {
            "enum": [
              "9:16",
              "1:1",
              "16:9"
            ],
            "type": "string",
            "description": "The frame this clip was rendered for."
          }
        }
      },
      "minItems": 1,
      "description": "One rendered clip per aspect ratio."
    }
  }
}
```
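Since the output is a list with one entry per rendered clip, callers will often want to look a clip up by its frame. A small sketch of that follows, assuming the result has already been decoded into a Python `dict` shaped like the schema above; `clips_by_ratio` is a hypothetical helper name, not an SDK function.

```python
def clips_by_ratio(output: dict) -> dict[str, str]:
    """Map each aspect ratio in the result to its clip URL."""
    return {
        clip["aspect_ratio"]: clip["video_url"]
        for clip in output.get("videos", [])
    }


# Sample result in the shape of the output schema.
example_output = {
    "videos": [
        {
            "video_url": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-demo.mp4",
            "aspect_ratio": "9:16",
        }
    ]
}

clips = clips_by_ratio(example_output)
print(clips.get("9:16"))  # URL of the vertical clip
print(clips.get("16:9"))  # None, because this ratio was not rendered
```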

## Examples

### Presenter photo + script → talking video

Give it a presenter portrait and the exact words to say. The avatar reads your script verbatim in a fitted voice and you get back the lip-synced clip — framing follows the photo.

Inputs:

```json
{
  "voice": "auto",
  "script": "Big news — we just shipped dark mode. Open Settings, tap Appearance, and pick Dark. Your eyes will thank you tonight.\n",
  "avatar_image": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-presenter.jpg"
}
```

Outputs:

```json
{
  "videos": [
    {
      "video_url": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-demo.mp4",
      "aspect_ratio": "9:16"
    }
  ]
}
```

### Script only, any language — skill casts the presenter

No image, no language flag. The skill infers a fitting presenter from the script, auto-detects the language (here German), speaks it verbatim, and returns the lip-synced clip.

Inputs:

```json
{
  "script": "Ich habe diesen Avatar in wenigen Minuten erstellt — und jetzt kann er meine Botschaft ganz natürlich sprechen.",
  "aspect_ratios": [
    "9:16"
  ]
}
```

Outputs:

```json
{
  "videos": [
    {
      "video_url": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/avatar-de-demo.mp4",
      "aspect_ratio": "9:16"
    }
  ]
}
```

## Use this skill

puras runs this skill on its own backend — you send inputs and get the result back. Three ways to call, fastest first:

### 1. MCP server — recommended for coding agents, no API key

Connect the puras MCP server; auth is OAuth in the browser on first call, so there is nothing to paste.

```bash
claude mcp add --transport http puras https://mcp.puras.co/mcp
```

Any MCP client works — point it at `https://mcp.puras.co/mcp` (HTTP transport). Then ask the agent to run `talking-avatar` from skillpack `puras/avatar-studio` with your inputs.

### 2. CLI / Python SDK — `pip install puras`

```bash
pip install puras
puras login            # or set PURAS_API_KEY
puras run puras/avatar-studio/talking-avatar -i key=value
```

From Python:

```python
import puras

client = puras.Client()   # PURAS_API_KEY from env
result = client.run("puras/avatar-studio/talking-avatar", {
    "voice": "auto",
    "script": "Big news — we just shipped dark mode. Open Settings, tap Appearance, and pick Dark. Your eyes will thank you tonight.\n",
    "avatar_image": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-presenter.jpg",
})
```

### 3. HTTP API

`wait=true` blocks until the run reaches a terminal status and returns the result inline.

```bash
curl -X POST "https://api.puras.co/v1/jobs?skillpack=puras/avatar-studio&wait=true" \
  -H "Authorization: Bearer $PURAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"skill":"talking-avatar","inputs":{"voice":"auto","script":"Big news — we just shipped dark mode. Open Settings, tap Appearance, and pick Dark. Your eyes will thank you tonight.\n","avatar_image":"https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-presenter.jpg"}}'
```
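The same request can be assembled from Python without the SDK. This is a sketch using only the standard library: the endpoint, query parameters, headers, and body shape are copied from the curl call above, while `build_job_request` and the `YOUR_API_KEY` fallback are illustrative names. The request is only built and printed here, since sending it would start a billed run.

```python
import json
import os
import urllib.parse
import urllib.request

API_BASE = "https://api.puras.co"


def build_job_request(skillpack: str, skill: str, inputs: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the blocking job request shown in the curl example."""
    # safe="/" keeps the slash in "puras/avatar-studio" unescaped, matching the curl URL.
    query = urllib.parse.urlencode({"skillpack": skillpack, "wait": "true"}, safe="/")
    body = json.dumps({"skill": skill, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v1/jobs?{query}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_job_request(
    "puras/avatar-studio",
    "talking-avatar",
    {"script": "Hello from the API.", "aspect_ratios": ["9:16", "16:9"]},
    api_key=os.environ.get("PURAS_API_KEY", "YOUR_API_KEY"),
)
print(request.get_method(), request.full_url)

# To actually run the job (this incurs usage), send the request:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```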

Mint an API key (for CLI / SDK / API) from https://puras.co/api-keys.

## Pricing

Runs are usage-based and billed against your workspace credit balance. A job's cost is the sum of the model-token usage and any media usage it incurs; there is no per-call platform fee. The playground and the job result both report the exact cost of each run.

- Pricing page: https://puras.co/pricing
- Machine-readable model price registry: `https://api.puras.co/v1/pricing`