Python

import os
from unstructured import UnstructuredClient

client = UnstructuredClient(
    username=os.environ.get("UNSTRUCTURED_USERNAME"),  # This is the default and can be omitted
    password=os.environ.get("UNSTRUCTURED_PASSWORD"),  # This is the default and can be omitted
)
response = client.data_source.ingest(
    data_connector_name="data_connector_name",
)
print(response)

{
  "detail": "<string>"
}

Ingest

Process documents with OCR and ingest them into Unstructured.

This endpoint performs optical character recognition on documents and stores the extracted data.

Request Body

Field	Type	Description
`data_connector_name`	`str`	Name of the data connector to use. Required.
`file_names`	`List[str]`	Specific files to process. If omitted, all files are processed.
`job_id`	`str`	Custom job ID for tracking. Auto-generated if not provided.
`clean_up_out_of_sync`	`bool`	Remove files from the VDB that are no longer in the source. Default: `true`.
`file_count_to_run`	`int`	Maximum number of files to process.
`use_llm`	`bool`	Use an LLM for enhanced extraction. Default: `false`.
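As a rough illustration of how these fields and defaults fit together, the sketch below models the request body as a dataclass. `IngestRequest` and `to_payload` are hypothetical names invented for this example and are not part of the Unstructured client; dropping unset optional fields from the payload is likewise an assumption about how you might want to call the endpoint.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class IngestRequest:
    """Hypothetical helper mirroring the documented request body fields."""

    data_connector_name: str                 # required
    file_names: Optional[List[str]] = None   # omitted means all files
    job_id: Optional[str] = None             # auto-generated if not provided
    clean_up_out_of_sync: bool = True        # documented default
    file_count_to_run: Optional[int] = None  # no limit unless set
    use_llm: bool = False                    # documented default

    def to_payload(self) -> dict:
        # Drop optional fields that were left unset (None) so that the
        # payload contains only values the caller actually chose.
        return {key: value for key, value in asdict(self).items() if value is not None}


payload = IngestRequest(
    data_connector_name="my-documents",
    file_count_to_run=100,
).to_payload()
print(payload)
```

Keeping the defaults in one place makes it easier to see which values differ from the documented behavior before a job is submitted.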

Response

200: OCR job started successfully. Returns job tracking information.
400: Bad Request (e.g., invalid data connector, unsupported VDB type)
500: Internal Server Error
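The status codes above could be handled along these lines. This is only a sketch: `classify_ingest_status` and its return labels are hypothetical, and treating a 500 as worth retrying is an assumption, not documented behavior.

```python
def classify_ingest_status(status_code: int) -> str:
    """Hypothetical helper: map a documented status code to a next step."""
    if status_code == 200:
        # Job started; the response carries job tracking information.
        return "track_job"
    if status_code == 400:
        # For example an invalid data connector or an unsupported VDB type.
        return "fix_request"
    if status_code == 500:
        # Assumption: a server-side error may succeed on a later attempt.
        return "retry_later"
    # Any code not listed in the documentation.
    return "unexpected"


for code in (200, 400, 500, 404):
    print(code, classify_ingest_status(code))
```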

Example

{
  "data_connector_name": "my-documents",
  "use_llm": true,
  "clean_up_out_of_sync": true,
  "file_count_to_run": 100
}
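For readers not using the Python client, the example body could be wrapped in a plain HTTP request roughly as follows. This sketch only builds the request and does not send it. The base URL is a placeholder, the path `/ocr/ingest` is an assumption based on the endpoint label on this page, and authentication is left out because the page does not say how the username and password are transmitted.

```python
import json
import urllib.request

# Placeholder base URL (assumption); replace with your deployment's address.
BASE_URL = "https://unstructured.example.com"

# The example request body from the documentation.
body = {
    "data_connector_name": "my-documents",
    "use_llm": True,
    "clean_up_out_of_sync": True,
    "file_count_to_run": 100,
}

request = urllib.request.Request(
    url=f"{BASE_URL}/ocr/ingest",  # path assumed from the endpoint label
    data=json.dumps(body).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)

# Authentication is intentionally omitted: the scheme is not documented here.
# To actually send the request you would call urllib.request.urlopen(request).
print(request.get_method(), request.full_url)
```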

POST /ocr/ingest

Body

application/json

Field	Type	Notes
`data_connector_name`	`string`	required
`file_names`	`string[] | null`	optional
`job_id`	`string | null`	optional
`clean_up_out_of_sync`	`boolean`	optional, default: `true`
`file_count_to_run`	`integer | null`	optional
`use_llm`	`boolean`	optional, default: `false`

Response

Successful Response
