POST /get_ocr_page_text
Python
import os
from unstructured import UnstructuredClient

client = UnstructuredClient(
    username=os.environ.get("UNSTRUCTURED_USERNAME"),  # This is the default and can be omitted
    password=os.environ.get("UNSTRUCTURED_PASSWORD"),  # This is the default and can be omitted
)
response = client.data_source.get_ocr_page_text(
    data_connector_name="data_connector_name",
    file_name="file_name",
)
print(response.page_number)
{
  "total_page_count": 123,
  "page_text": "<string>",
  "page_number": 123
}
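
Because each response carries total_page_count alongside the text of a single page, a caller can walk a whole file one page at a time. The sketch below is one way to do that, not the documented method. It assumes pages are numbered from 1 (the reference does not say), and fetch_page is a hypothetical callable standing in for the real request; here it is replaced by an in-memory fake so the example runs without credentials.

```python
from typing import Callable, Dict, List


def collect_all_pages(fetch_page: Callable[[int], Dict], first_page: int = 1) -> List[str]:
    """Collect page_text for every page, using total_page_count from the first response."""
    first = fetch_page(first_page)
    texts = [first["page_text"]]
    total = first["total_page_count"]
    # Assumption: page numbers are contiguous and start at first_page.
    for number in range(first_page + 1, first_page + total):
        texts.append(fetch_page(number)["page_text"])
    return texts


# Hypothetical stand-in for a real call to get_ocr_page_text.
fake_document = ["page one", "page two", "page three"]


def fake_fetch(page_number: int) -> Dict:
    return {
        "total_page_count": len(fake_document),
        "page_text": fake_document[page_number - 1],
        "page_number": page_number,
    }


all_text = collect_all_pages(fake_fetch)
print(len(all_text))
```

To use this against the real endpoint, fake_fetch would be swapped for a function that issues the request with the given page_number and returns the decoded response.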


Authorizations

Authorization (string, header, required)

Basic authentication header of the form Basic <encoded-value>, where <encoded-value> is the base64-encoded string username:password.
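
As an illustration of the header format described above, the following sketch builds the Authorization value by hand using only the standard library. The credentials shown are placeholders, and the helper name basic_auth_header is made up for this example; the client shown earlier presumably handles this step itself.

```python
import base64
from typing import Dict


def basic_auth_header(username: str, password: str) -> Dict[str, str]:
    """Build an Authorization header for HTTP Basic authentication."""
    # Join the credentials as "username:password", then base64-encode the bytes.
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


# Placeholder credentials, for illustration only.
headers = basic_auth_header("alice", "s3cret")
print(headers["Authorization"])
```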

Body

application/json
data_connector_name (string, required)
file_name (string, required)
page_number (integer | null, optional)
chunk_id (string | null, optional)
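
The body fields above can be assembled into a JSON payload as in the sketch below. It sends the two required fields always and includes page_number and chunk_id only when they are set. Whether the server treats an omitted field the same as an explicit null is an assumption here, and build_request_body, "my_connector", and "report.pdf" are hypothetical names used for illustration.

```python
import json
from typing import Any, Dict, Optional


def build_request_body(
    data_connector_name: str,
    file_name: str,
    page_number: Optional[int] = None,
    chunk_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body for get_ocr_page_text."""
    body: Dict[str, Any] = {
        "data_connector_name": data_connector_name,
        "file_name": file_name,
    }
    # Assumption: nullable fields may simply be left out when unset.
    if page_number is not None:
        body["page_number"] = page_number
    if chunk_id is not None:
        body["chunk_id"] = chunk_id
    return body


body = build_request_body("my_connector", "report.pdf", page_number=2)
print(json.dumps(body))
```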

Response

Successful Response

total_page_count (integer, required)
page_text (string, required)
page_number (integer, required)
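
For callers who work with the raw JSON rather than the client's response object, the three required response fields map naturally onto a small typed record. The following is a minimal sketch under that assumption; OcrPageText and parse_ocr_page_text are illustrative names, not part of the API, and the sample payload is invented.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrPageText:
    total_page_count: int
    page_text: str
    page_number: int


def parse_ocr_page_text(raw: str) -> OcrPageText:
    """Decode a response body and check that every required field is present."""
    data = json.loads(raw)
    required = ("total_page_count", "page_text", "page_number")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"response is missing required fields: {missing}")
    return OcrPageText(
        total_page_count=int(data["total_page_count"]),
        page_text=str(data["page_text"]),
        page_number=int(data["page_number"]),
    )


# Invented sample payload with the documented shape.
page = parse_ocr_page_text('{"total_page_count": 3, "page_text": "hello", "page_number": 1}')
print(page.page_number, page.total_page_count)
```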