Metadata Properties
| Property | Description | Example |
|---|---|---|
| Values | The extracted or classified value(s) | ["NDA", "Non-Disclosure Agreement"] |
| Evidence | Text snippet supporting the extraction | “This Non-Disclosure Agreement is entered into…” |
| Confidence | AI confidence score (0-1) | 0.95 |
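
To make the three properties concrete, here is a minimal sketch of how a single metadata entry could be modeled in Python. The `MetadataEntry` class and its field names are hypothetical, chosen only to mirror the table above; they are not taken from the platform's SDK.

```python
from dataclasses import dataclass


@dataclass
class MetadataEntry:
    """Hypothetical container mirroring the three properties in the table."""

    values: list[str]   # extracted or classified value(s)
    evidence: str       # text snippet supporting the extraction
    confidence: float   # AI confidence score, expected to lie in [0, 1]

    def __post_init__(self) -> None:
        # Reject scores outside the documented 0-1 range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


# Example entry built from the values shown in the table.
entry = MetadataEntry(
    values=["NDA", "Non-Disclosure Agreement"],
    evidence="This Non-Disclosure Agreement is entered into…",
    confidence=0.95,
)
```

Validating the score at construction time is one possible design choice; it surfaces malformed entries early instead of letting them reach search filters or retrieval.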
Metadata Levels
| Level | Description | Use Case |
|---|---|---|
| File-Level | Aggregated metadata for the entire document | Document classification, search filters |
| Chunk-Level | Granular metadata per text segment | Precise evidence location, RAG retrieval |
Metadata Standardization
The platform includes AI-powered standardization to clean and normalize extracted values:
| Feature | Description |
|---|---|
| Deduplication | Merge similar values (e.g., “Inc.” and “Incorporated”) |
| Normalization | Standardize formats (dates, currencies, names) |
| Bulk Standardization | Apply standardization across multiple tags |
Standardization helps ensure consistency across your metadata, making it easier to search, filter, and analyze your documents.
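
The effect of deduplication and normalization can be illustrated with a small rule-based sketch. This is not the platform's AI-powered standardization: the suffix map, the list of date formats, and the function names `canonical_name`, `deduplicate`, and `normalize_date` are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Optional

# Hypothetical alias map: company suffix variants and their canonical form.
SUFFIX_ALIASES = {
    "inc": "Incorporated",
    "incorporated": "Incorporated",
    "corp": "Corporation",
    "corporation": "Corporation",
}

# Hypothetical set of input date formats to try, in order.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")


def canonical_name(name: str) -> str:
    """Drop commas and expand known suffix variants to one spelling."""
    words = name.replace(",", " ").split()
    return " ".join(SUFFIX_ALIASES.get(w.lower().rstrip("."), w) for w in words)


def deduplicate(values: list[str]) -> list[str]:
    """Merge values whose canonical forms match, ignoring case."""
    seen: dict[str, str] = {}
    for value in values:
        canon = canonical_name(value)
        seen.setdefault(canon.lower(), canon)  # keep the first spelling seen
    return list(seen.values())


def normalize_date(text: str) -> Optional[str]:
    """Return the date in ISO 8601 form, or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


companies = deduplicate(["Acme Inc.", "Acme Incorporated", "ACME, Inc."])
iso_date = normalize_date("01/05/2024")
```

Here the three company spellings collapse into the single value `Acme Incorporated`, and the date is rewritten as `2024-01-05`. Applying the same functions to every tag's values would correspond loosely to bulk standardization.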
How Metadata Generation Works
1. Document Processing: Documents are chunked and prepared for analysis.
2. Tag Application: The AI applies your Tags to extract or classify information from each chunk.
3. Evidence Capture: The system captures the text snippet that supports each extraction.
4. Aggregation: Chunk-level metadata is aggregated to create file-level metadata.
5. Standardization: Optional normalization and deduplication clean the results.
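
The flow above can be sketched end to end in a few lines of Python. Everything here is an assumption made for illustration: a keyword lookup stands in for the AI model, the confidence score is a fixed placeholder, paragraphs serve as chunks, and aggregation keeps the highest-confidence entry per value. The platform's actual chunking and aggregation rules are not described in this section, and the optional standardization step is left out.

```python
# Hypothetical keyword rules standing in for the AI "Contract Type" tag.
CONTRACT_TYPE_KEYWORDS = {
    "Non-Disclosure Agreement": "NDA",
    "Master Services Agreement": "MSA",
}


def chunk_document(text: str) -> list[str]:
    """Step 1: split the document into paragraph chunks."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def apply_tag(chunk: str) -> list[dict]:
    """Steps 2-3: classify a chunk and keep the chunk text as evidence."""
    entries = []
    for phrase, label in CONTRACT_TYPE_KEYWORDS.items():
        if phrase in chunk:
            # 0.9 is a placeholder score, not a real model output.
            entries.append({"value": label, "evidence": chunk, "confidence": 0.9})
    return entries


def aggregate(chunk_results: list[list[dict]]) -> dict:
    """Step 4: keep the highest-confidence entry for each distinct value."""
    best: dict[str, dict] = {}
    for chunk_index, entries in enumerate(chunk_results):
        for entry in entries:
            current = best.get(entry["value"])
            if current is None or entry["confidence"] > current["confidence"]:
                best[entry["value"]] = {**entry, "chunk_index": chunk_index}
    ordered = sorted(best)
    return {
        "values": ordered,
        "evidence": [best[v]["evidence"] for v in ordered],
        "confidence": max((e["confidence"] for e in best.values()), default=0.0),
    }


document = (
    "This Non-Disclosure Agreement is entered into by the parties.\n\n"
    "Payment terms are net 30 days.\n\n"
    "The Non-Disclosure Agreement remains in force for two years."
)

chunks = chunk_document(document)
chunk_level = [apply_tag(chunk) for chunk in chunks]  # chunk-level metadata
file_level = aggregate(chunk_level)                   # file-level metadata
```

In this sketch, `chunk_level` retains one result list per chunk, which is what makes precise evidence location possible, while `file_level` holds the single aggregated view used for classification and search filters.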
Example Metadata Output
For a contract document with a “Contract Type” classification tag:
Python SDK
- Generate Metadata
- Batch Processing
- List Metadata
- Upsert & Delete

