# Avatar Studio

> Agent skillpack on puras, published by puras. AI talking-avatar video skills: turn a script and an optional presenter photo into a lip-synced talking-head video that reads your exact words to camera, in the language you choose.

- **Skillpack path:** `puras/avatar-studio`
- **Skillpack ID:** `577be7ab-db36-4689-9bda-5c35ef02ce25`
- **Deployment version:** v21
- **Skills in pack:** 1
- **Human page:** https://puras.co/skills/puras/avatar-studio
- **API base:** `https://api.puras.co`

## Use these skills

puras runs these skills on its own backend — you send inputs and get the result back. Three ways to call, fastest first:

### 1. MCP server — recommended for coding agents, no API key

Connect the puras MCP server; auth is OAuth in the browser on first call, so there is nothing to paste.

```bash
claude mcp add --transport http puras https://mcp.puras.co/mcp
```

Any MCP client works — point it at `https://mcp.puras.co/mcp` (HTTP transport). Then ask the agent to run `<skill>` from skillpack `puras/avatar-studio` with your inputs.

### 2. CLI / Python SDK — `pip install puras`

```bash
pip install puras
puras login            # or set PURAS_API_KEY
puras run puras/avatar-studio/<skill> -i key=value
```

From Python:

```python
import puras

client = puras.Client()   # PURAS_API_KEY from env
result = client.run("puras/avatar-studio/<skill>", {})
```

### 3. HTTP API

`wait=true` blocks until the run reaches a terminal status and returns the result inline.

```bash
curl -X POST "https://api.puras.co/v1/jobs?skillpack=puras/avatar-studio&wait=true" \
  -H "Authorization: Bearer $PURAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"skill":"<skill>","inputs":{}}'
```

Mint an API key (for CLI / SDK / API) from https://puras.co/api-keys.

## Skills

- `talking-avatar` (AI Talking Avatar Generator): turns a text script, plus an optional presenter photo, into a lip-synced talking-avatar video in the language you choose.

### AI Talking Avatar Generator (`talking-avatar`)

Turns a text script, and optionally a presenter photo, into a lip-synced talking-avatar video that reads your exact words to camera in the language you choose. It needs no camera, talent, or editing, and returns a finished clip ready for ads, explainers, or social posts.

Spec: https://puras.co/skills/puras/avatar-studio/talking-avatar.md

**Input schema**

```json
{
  "type": "object",
  "required": [
    "script"
  ],
  "properties": {
    "look": {
      "type": "text",
      "maxLength": 400,
      "description": "Optional steer for the generated presenter and setting. Ignored with avatar_image."
    },
    "voice": {
      "enum": [
        "auto",
        "warm_female",
        "warm_male",
        "energetic_female",
        "energetic_male",
        "calm_narrator",
        "authoritative_male"
      ],
      "type": "string",
      "default": "auto",
      "description": "Voice persona. `auto` fits it to the script."
    },
    "script": {
      "type": "text",
      "maxLength": 1000,
      "minLength": 2,
      "description": "The exact words spoken, verbatim. Plain prose, no SSML."
    },
    "language": {
      "type": "string",
      "maxLength": 12,
      "description": "Optional language code (e.g. \"en\", \"tr\"). Empty = auto-detect."
    },
    "avatar_image": {
      "type": "image",
      "description": "Optional presenter portrait. Clear, front-facing, mouth visible."
    },
    "aspect_ratios": {
      "type": "array",
      "items": {
        "enum": [
          "9:16",
          "1:1",
          "16:9"
        ],
        "type": "string"
      },
      "default": [
        "9:16"
      ],
      "minItems": 1,
      "description": "Output frame(s). Pick one or more — one clip is rendered per selected ratio. Ignored when you supply avatar_image (the photo's framing wins).",
      "uniqueItems": true
    }
  }
}
```
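
The limits in this schema can be checked client-side before a job is submitted. The sketch below is a minimal, unofficial validator: `validate_inputs` is a hypothetical helper name and not part of the puras SDK, and it assumes string lengths are counted in Unicode code points, as Python's `len` does. It skips `avatar_image`, because the schema does not spell out which image references are accepted.

```python
# Allowed values copied from the input schema above.
VOICES = ("auto", "warm_female", "warm_male", "energetic_female",
          "energetic_male", "calm_narrator", "authoritative_male")
ASPECT_RATIOS = ("9:16", "1:1", "16:9")


def validate_inputs(inputs: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical helper: the server remains the authority on what is valid.
    """
    problems = []

    # script is the only required field: 2 to 1000 characters.
    script = inputs.get("script")
    if not isinstance(script, str):
        problems.append("script is required and must be a string")
    elif not 2 <= len(script) <= 1000:
        problems.append("script must be 2 to 1000 characters long")

    # look is optional, at most 400 characters.
    look = inputs.get("look")
    if look is not None and (not isinstance(look, str) or len(look) > 400):
        problems.append("look must be a string of at most 400 characters")

    # voice defaults to "auto" when omitted.
    if inputs.get("voice", "auto") not in VOICES:
        problems.append("voice must be one of: " + ", ".join(VOICES))

    # language is optional; empty means auto-detect.
    language = inputs.get("language", "")
    if not isinstance(language, str) or len(language) > 12:
        problems.append("language must be a string of at most 12 characters")

    # aspect_ratios defaults to ["9:16"]; entries must be known and unique.
    ratios = inputs.get("aspect_ratios", ["9:16"])
    if not isinstance(ratios, list) or not ratios:
        problems.append("aspect_ratios must be a non-empty list")
    elif any(r not in ASPECT_RATIOS for r in ratios):
        problems.append("aspect_ratios entries must be one of: " + ", ".join(ASPECT_RATIOS))
    elif len(set(ratios)) != len(ratios):
        problems.append("aspect_ratios must not contain duplicates")

    return problems
```

Calling `validate_inputs({"script": "Hello there."})` returns an empty list, while a one-character script or a repeated aspect ratio each yields one problem message.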

**Output schema**

```json
{
  "type": "object",
  "properties": {
    "videos": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "video_url": {
            "type": "video",
            "description": "Drive path to the rendered talking-avatar clip (with narration); served to readers as a stable media URL. The playground renders it with a <video> player."
          },
          "aspect_ratio": {
            "enum": [
              "9:16",
              "1:1",
              "16:9"
            ],
            "type": "string",
            "description": "The frame this clip was rendered for."
          }
        }
      },
      "minItems": 1,
      "description": "One rendered clip per aspect ratio."
    }
  }
}
```
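
Because one clip comes back per aspect ratio, it can be convenient to index the clips by ratio. The following is a small sketch, not SDK code: `clips_by_ratio` is a hypothetical helper, and it assumes `result` is already a dict shaped like the output schema above. Whether the SDK or HTTP response wraps that object in a larger job envelope is not stated here, so unwrap it first if needed.

```python
def clips_by_ratio(result: dict) -> dict[str, str]:
    """Map each aspect ratio to the URL of the clip rendered for it.

    Assumes `result` follows the output schema: a "videos" list whose items
    carry "aspect_ratio" and "video_url".
    """
    return {
        clip["aspect_ratio"]: clip["video_url"]
        for clip in result.get("videos", [])
    }
```

For a run that requested `["9:16", "16:9"]`, `clips_by_ratio(result)["16:9"]` would then give the landscape clip's URL.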

**Presenter photo + script → talking video**

Give the skill a presenter portrait and the exact words to say. The avatar reads the script verbatim in a voice fitted to it (`voice` is `auto`), and you get back the lip-synced clip. Framing follows the photo, so `aspect_ratios` is ignored.

Inputs:

```json
{
  "voice": "auto",
  "script": "Big news — we just shipped dark mode. Open Settings, tap Appearance, and pick Dark. Your eyes will thank you tonight.\n",
  "avatar_image": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-presenter.jpg"
}
```

Outputs:

```json
{
  "videos": [
    {
      "video_url": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-demo.mp4",
      "aspect_ratio": "9:16"
    }
  ]
}
```

**Script only, any language — skill casts the presenter**

No image, no language flag. The skill infers a fitting presenter from the script, auto-detects the language (here German), speaks it verbatim, and returns the lip-synced clip.

Inputs:

```json
{
  "script": "Ich habe diesen Avatar in wenigen Minuten erstellt — und jetzt kann er meine Botschaft ganz natürlich sprechen.",
  "aspect_ratios": [
    "9:16"
  ]
}
```

Outputs:

```json
{
  "videos": [
    {
      "video_url": "https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/avatar-de-demo.mp4",
      "aspect_ratio": "9:16"
    }
  ]
}
```

**Call**

```bash
curl -X POST "https://api.puras.co/v1/jobs?skillpack=puras/avatar-studio&wait=true" \
  -H "Authorization: Bearer $PURAS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"skill":"talking-avatar","inputs":{"voice":"auto","script":"Big news — we just shipped dark mode. Open Settings, tap Appearance, and pick Dark. Your eyes will thank you tonight.\n","avatar_image":"https://uozfqcfhlhugotnevscg.supabase.co/storage/v1/object/public/puras-public-skills/talking-avatar/dark-mode-presenter.jpg"}}'
```
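
The same request can be assembled with Python's standard library, for environments where neither curl nor the `puras` package is available. This is a sketch under assumptions: `build_job_request` and `run_job` are hypothetical helper names, the 600-second timeout is an arbitrary guess at how long a `wait=true` render might block, and the shape of the JSON response is not documented here, so `run_job` simply returns whatever the API sends back.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.puras.co"
SKILLPACK = "puras/avatar-studio"


def build_job_request(skill: str, inputs: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/jobs request shown in the curl call."""
    # safe="/" keeps the slash in the skillpack path unescaped, matching the curl URL.
    query = urllib.parse.urlencode({"skillpack": SKILLPACK, "wait": "true"}, safe="/")
    body = json.dumps({"skill": skill, "inputs": inputs}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/v1/jobs?{query}",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_job(request: urllib.request.Request, timeout: float = 600.0):
    """Send the request and return the parsed JSON response.

    The timeout is an assumption; wait=true blocks until the run finishes.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A typical use would read the key from the environment, for example `run_job(build_job_request("talking-avatar", {"script": "Hello there."}, os.environ["PURAS_API_KEY"]))` after `import os`.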

---

## Pricing

Runs are billed by usage against your workspace credit balance: a job costs the sum of the model-token usage, plus any media usage, that it incurs. There is no per-call platform fee. The playground and the job result report the exact cost of each run.

- Pricing page: https://puras.co/pricing
- Machine-readable model price registry: `https://api.puras.co/v1/pricing`
