upstage-document-parse Skill Usage Guide

2026-03-31 Source: 网淘吧

Upstage Document Parse

Extract structured content from documents using Upstage's Document Parse API.

Supported formats

PDF (up to 1000 pages with asynchronous processing), PNG, JPG, JPEG, TIFF, BMP, GIF, WEBP, DOCX, PPTX, XLSX, HWP

Installation

clawhub install upstage-document-parse

API key setup

Get your API key from the Upstage Console.
Configure the API key:

openclaw config set skills.entries.upstage-document-parse.apiKey "your-api-key"

Or add it to ~/.openclaw/openclaw.json:

{
  "skills": {
    "entries": {
      "upstage-document-parse": {
        "apiKey": "your-api-key"
      }
    }
  }
}
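
For scripts that call the API directly, the key stored in the configuration file above can be read back with a few lines of Python. This is a minimal sketch: the helper name `load_api_key` is hypothetical (it is not part of OpenClaw or Upstage), and it assumes the file uses exactly the nested layout shown above.

```python
import json
from pathlib import Path

# Key path inside openclaw.json, as shown in the configuration snippet above.
KEY_PATH = ("skills", "entries", "upstage-document-parse", "apiKey")


def load_api_key(config_path="~/.openclaw/openclaw.json"):
    """Return the configured API key, or None if the file or key is missing.

    Hypothetical helper for illustration only.
    """
    path = Path(config_path).expanduser()
    if not path.is_file():
        return None
    node = json.loads(path.read_text(encoding="utf-8"))
    # Walk down the nested dictionaries one key at a time.
    for part in KEY_PATH:
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node
```

Returning `None` instead of raising lets the caller fall back to another source for the key.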

Usage examples

Simply ask the agent to parse your document:

"Parse this PDF: ~/Documents/report.pdf"
"Parse: ~/Documents/report.jpg"

Synchronous API (small documents)

Suitable for small documents (fewer than 20 pages recommended).

Parameters

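Parameter	Type	Default	Description
`model`	string	required	Use `document-parse` (latest) or `document-parse-nightly`
`document`	file	required	The document file to parse
`mode`	string	`standard`	`standard` (text-focused), `enhanced` (complex tables/images), or `auto`
`ocr`	string	`auto`	`auto` (images only) or `force` (always run OCR)
`output_formats`	string	`['html']`	`text`, `html`, `markdown` (passed as an array)
`coordinates`	boolean	`true`	Include bounding-box coordinates
`base64_encoding`	string	`[]`	Elements to return Base64-encoded: `["table"]`, `["figure"]`, etc.
`chart_recognition`	boolean	`true`	Convert charts to tables (beta)
`merge_multipage_tables`	boolean	`false`	Merge tables that span pages (beta; limited to 20 pages when true)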
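
The curl examples in this guide pass list-valued parameters as bracketed strings such as `['html', 'markdown']`. The following is a minimal sketch of a helper that builds form fields the same way from Python values. The name `build_form_fields` is hypothetical, and sending booleans as lowercase `true`/`false` is an assumption that this guide does not confirm.

```python
def build_form_fields(model="document-parse", **options):
    """Convert Python values into string form fields for a multipart request.

    Hypothetical helper for illustration only.
    """
    fields = {"model": model}
    for name, value in options.items():
        if value is None:
            continue  # omit the field so the API default applies
        if isinstance(value, bool):
            fields[name] = "true" if value else "false"  # assumed encoding
        elif isinstance(value, (list, tuple)):
            # str(list) yields "['html', 'markdown']", matching the curl examples
            fields[name] = str(list(value))
        else:
            fields[name] = str(value)
    return fields
```

The resulting dictionary could then be passed as the `data=` argument of `requests.post`, alongside `files={"document": f}`.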

Basic parsing

curl -X POST "https://api.upstage.ai/v1/document-digitization" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY" \
  -F "document=@/path/to/file.pdf" \
  -F "model=document-parse"

Extract Markdown

curl -X POST "https://api.upstage.ai/v1/document-digitization" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY" \
  -F "document=@report.pdf" \
  -F "model=document-parse" \
  -F "output_formats=['markdown']"

Enhanced mode for complex documents

curl -X POST "https://api.upstage.ai/v1/document-digitization" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY" \
  -F "document=@complex.pdf" \
  -F "model=document-parse" \
  -F "mode=enhanced" \
  -F "output_formats=['html', 'markdown']"

Force OCR for scanned documents

curl -X POST "https://api.upstage.ai/v1/document-digitization" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY" \
  -F "document=@scan.pdf" \
  -F "model=document-parse" \
  -F "ocr=force"

Extract table images as Base64

curl -X POST "https://api.upstage.ai/v1/document-digitization" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY" \
  -F "document=@invoice.pdf" \
  -F "model=document-parse" \
  -F "base64_encoding=['table']"

Response structure

{
  "api": "2.0",
  "model": "document-parse-251217",
  "content": {
    "html": "<h1>...</h1>",
    "markdown": "# ...",
    "text": "..."
  },
  "elements": [
    {
      "id": 0,
      "category": "heading1",
      "content": { "html": "...", "markdown": "...", "text": "..." },
      "page": 1,
      "coordinates": [{"x": 0.06, "y": 0.05}, ...]
    }
  ],
  "usage": { "pages": 1 }
}

Element categories

paragraph, heading1, heading2, heading3, list, table, figure, chart, equation, caption, header, footer, index, footnote
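
Because each element carries a `category`, a parsed response can be narrowed down to just the tables or headings. The sketch below assumes the response has the shape shown under the response structure above; `elements_by_category` is a hypothetical helper, and the sample data is made up for illustration.

```python
def elements_by_category(response, *categories, fmt="markdown"):
    """Return (page, content) pairs for elements in the given categories."""
    wanted = set(categories)
    results = []
    for element in response.get("elements", []):
        if element.get("category") in wanted:
            # Each element holds its content in several formats; pick one.
            content = element.get("content", {}).get(fmt, "")
            results.append((element.get("page"), content))
    return results


# Made-up sample in the documented response shape.
sample = {
    "elements": [
        {"id": 0, "category": "heading1", "content": {"markdown": "# Report"}, "page": 1},
        {"id": 1, "category": "paragraph", "content": {"markdown": "Intro"}, "page": 1},
        {"id": 2, "category": "table", "content": {"markdown": "| a | b |"}, "page": 2},
    ]
}
print(elements_by_category(sample, "heading1", "table"))
```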

Asynchronous API (large documents)

Suitable for documents of up to 1000 pages. Documents are processed in batches of 10 pages.
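
To make the batching concrete, the sketch below computes which pages would fall into each batch. It assumes batches are filled in page order with a shorter final batch; that ordering is an assumption, since this guide only states the batch size. The function name `batch_page_ranges` is hypothetical.

```python
def batch_page_ranges(total_pages, batch_size=10):
    """Return inclusive, 1-indexed (start_page, end_page) tuples per batch."""
    if total_pages < 0 or batch_size < 1:
        raise ValueError("total_pages must be >= 0 and batch_size >= 1")
    return [
        (start, min(start + batch_size - 1, total_pages))
        for start in range(1, total_pages + 1, batch_size)
    ]
```

Under these assumptions, a 25-page document yields three batches (pages 1-10, 11-20, and 21-25), and a 1000-page document yields 100 batches.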

Submit a request

curl -X POST "https://api.upstage.ai/v1/document-digitization/async" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY" \
  -F "document=@large.pdf" \
  -F "model=document-parse" \
  -F "output_formats=['markdown']"

Response:

{"request_id": "uuid-here"}

Check status and retrieve results

curl "https://api.upstage.ai/v1/document-digitization/requests/{request_id}" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY"

The response includes a download link for each batch (results remain available for 30 days).

List all requests

curl "https://api.upstage.ai/v1/document-digitization/requests" \
  -H "Authorization: Bearer $UPSTAGE_API_KEY"

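Status values

submitted: the request has been received
started: processing is in progress
completed: results are ready for download
failed: an error occurred (check the failure message)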
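
A polling loop should treat both the completed and the failed state as terminal; otherwise a failed request is polled forever. The sketch below takes the status lookup as a callable so it can be exercised without network access. `wait_for_request` is a hypothetical helper, and the lowercase status strings are an assumption based on the `"completed"` check in this guide's Python example.

```python
import time

TERMINAL_STATUSES = {"completed", "failed"}  # assumed status strings


def wait_for_request(fetch_status, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status() until the request reaches a terminal status.

    fetch_status is any callable returning the parsed status response
    (a dict with a "status" key). Raises TimeoutError once roughly
    `timeout` seconds have been spent sleeping between polls.
    """
    waited = 0.0
    while True:
        response = fetch_status()
        if response.get("status") in TERMINAL_STATUSES:
            return response
        if waited >= timeout:
            raise TimeoutError(
                f"request still {response.get('status')!r} after {waited:.0f}s"
            )
        sleep(interval)
        waited += interval
```

With `requests`, `fetch_status` could be a lambda that performs the status `GET` shown above and returns `.json()`. The caller still has to inspect the returned status to tell success from failure.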

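Notes

Results are stored for 30 days
Download links expire after 15 minutes (fetch the status again to get fresh links)
Documents are processed in batches of up to 10 pages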

Python usage

import requests

api_key = "up_xxx"

# Sync
with open("doc.pdf", "rb") as f:
    response = requests.post(
        "https://api.upstage.ai/v1/document-digitization",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"document": f},
        data={"model": "document-parse", "output_formats": "['markdown']"}
    )
print(response.json()["content"]["markdown"])

# Async for large docs
with open("large.pdf", "rb") as f:
    r = requests.post(
        "https://api.upstage.ai/v1/document-digitization/async",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"document": f},
        data={"model": "document-parse"}
    )
request_id = r.json()["request_id"]

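# Poll for results; stop on failure instead of looping forever
import time
while True:
    status = requests.get(
        f"https://api.upstage.ai/v1/document-digitization/requests/{request_id}",
        headers={"Authorization": f"Bearer {api_key}"}
    ).json()
    if status["status"] == "completed":
        break
    if status["status"] == "failed":
        raise RuntimeError(f"Document parse request failed: {status}")
    time.sleep(5)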

LangChain integration

from langchain_upstage import UpstageDocumentParseLoader

loader = UpstageDocumentParseLoader(
    file_path="document.pdf",
    output_format="markdown",
    ocr="auto"
)
docs = loader.load()

Environment variable (alternative)

You can also set the API key as an environment variable:

export UPSTAGE_API_KEY="your-api-key"
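
In a script, the variable can be read at startup and turned into the `Authorization` header used by the requests in this guide. This is a minimal sketch; `auth_headers` is a hypothetical helper name, and failing early on a missing key is a design choice rather than anything the API requires.

```python
import os


def auth_headers(env=os.environ):
    """Build the Bearer header from UPSTAGE_API_KEY, failing early if unset."""
    api_key = env.get("UPSTAGE_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("UPSTAGE_API_KEY is not set")
    return {"Authorization": f"Bearer {api_key}"}
```

Accepting the environment as a parameter keeps the function easy to test with a plain dictionary.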

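Tips

Use mode=enhanced for complex tables, charts, and images
Use mode=auto to let the API decide page by page
Use the async API for documents longer than 20 pages
Use ocr=force for scanned PDFs or images
Set merge_multipage_tables=true to merge tables split across pages (up to 20 pages in enhanced mode)
Async API results are retained for 30 days
Server-side timeout: 5 minutes per request (sync API)
A standard document takes about 3 seconds to process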
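
The page-count guidance in these tips can be folded into a small routing function. The sketch below takes the 20-page and 1000-page thresholds from this guide; the function name `choose_endpoint` is hypothetical, the caller is assumed to know the page count already, and treating exactly 20 pages as synchronous is a judgment call.

```python
BASE_URL = "https://api.upstage.ai/v1/document-digitization"
SYNC_PAGE_LIMIT = 20     # recommended ceiling for the synchronous API
ASYNC_PAGE_LIMIT = 1000  # maximum pages for asynchronous processing


def choose_endpoint(page_count):
    """Pick the sync or async endpoint URL for a document of page_count pages."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count > ASYNC_PAGE_LIMIT:
        raise ValueError(f"documents over {ASYNC_PAGE_LIMIT} pages are not supported")
    if page_count > SYNC_PAGE_LIMIT:
        return f"{BASE_URL}/async"
    return BASE_URL
```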
