This cookbook shows you how to use the platform’s AI to automatically generate custom taxonomies. Instead of manually defining tags one by one, you can describe your use case in natural language, and the AI will build a complete taxonomy for you.
The suggest feature allows you to bootstrap complex taxonomies in seconds:
```python
from unstructured import UnstructuredClient

client = UnstructuredClient(
    username="your-username",
    password="your-password",
)

# Describe what you want to extract
suggestions = client.taxonomy.suggest(
    user_context="""
    I need to analyze legal contracts. Extract key terms, financial
    obligations, risk indicators, important dates, and anything
    relevant for compliance. Focus on NDAs, MSAs, and employment agreements.
    """,
    data_connector_name="my-documents",  # Optional: use documents to ground suggestions
)

# Review the suggestions
print("🤖 AI-Suggested Taxonomy:")
print(suggestions.suggestion)
```
Example output:
```text
🤖 AI-Suggested Taxonomy:
{
  "contract_type": {
    "description": "Primary contract classification: NDA, MSA, SLA, SOW, Employment, Lease",
    "type": "word"
  },
  "parties": {
    "description": "All parties to the contract with their legal names",
    "type": "list[string]"
  },
  "effective_date": {
    "description": "Date when the contract becomes legally binding",
    "type": "date"
  },
  "total_value": {
    "description": "Total monetary value of the contract in USD",
    "type": "float"
  }
}
```
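Before building on a suggestion, it can help to check it programmatically. The following is a minimal sketch that assumes the suggestion is a plain dict with the shape shown in the example output. The `suggestion` literal is a stand-in for `suggestions.suggestion`, and `summarize_by_type` and `find_incomplete_tags` are hypothetical helpers, not part of the client.

```python
# Stand-in for `suggestions.suggestion`, copied from the example output (assumed shape).
suggestion = {
    "contract_type": {
        "description": "Primary contract classification: NDA, MSA, SLA, SOW, Employment, Lease",
        "type": "word",
    },
    "parties": {
        "description": "All parties to the contract with their legal names",
        "type": "list[string]",
    },
    "effective_date": {
        "description": "Date when the contract becomes legally binding",
        "type": "date",
    },
    "total_value": {
        "description": "Total monetary value of the contract in USD",
        "type": "float",
    },
}


def summarize_by_type(suggestion):
    """Group suggested tag names by their suggested type (hypothetical helper)."""
    groups = {}
    for name, details in suggestion.items():
        groups.setdefault(details.get("type", "unknown"), []).append(name)
    return groups


def find_incomplete_tags(suggestion):
    """Return the names of tags that lack a description or a type (hypothetical helper)."""
    return [
        name
        for name, details in suggestion.items()
        if not details.get("description") or not details.get("type")
    ]


groups = summarize_by_type(suggestion)
for type_name, names in sorted(groups.items()):
    print(f"{type_name}: {', '.join(names)}")

incomplete = find_incomplete_tags(suggestion)
print("Tags needing attention:", incomplete)
```

A check like this makes it easier to spot tags whose type or description you may want to adjust before creating the taxonomy.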
For higher accuracy, point to specific files in your data connector. The AI will analyze these documents to suggest relevant tags:
```python
# The AI analyzes your files to suggest the most relevant tags
suggestions = client.taxonomy.suggest(
    user_context="Extract key data from these vendor invoices",
    data_connector_name="my-s3-bucket",
    file_names=[
        "invoices/sample-invoice-1.pdf",
        "invoices/sample-invoice-2.pdf",
    ],
)
```
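If you already have a listing of the file paths in your connector, a small helper can pick a few representative samples to pass as `file_names`. This is only a sketch: `pick_sample_files` and the `keys` list are hypothetical, and how you obtain the listing depends on your storage, which is not covered here.

```python
from pathlib import PurePosixPath


def pick_sample_files(keys, prefix, suffix=".pdf", limit=2):
    """Return up to `limit` paths under `prefix` with the given extension (hypothetical helper).

    The extension comparison is case-insensitive, and results are sorted
    so the selection is deterministic.
    """
    matches = sorted(
        key
        for key in keys
        if key.startswith(prefix) and PurePosixPath(key).suffix.lower() == suffix
    )
    return matches[:limit]


# Hypothetical listing of paths in the data connector.
keys = [
    "invoices/sample-invoice-2.pdf",
    "invoices/readme.txt",
    "invoices/sample-invoice-1.pdf",
    "contracts/nda.pdf",
    "invoices/sample-invoice-3.PDF",
]

file_names = pick_sample_files(keys, prefix="invoices/")
print(file_names)
```

The resulting list can then be supplied as the `file_names` argument in the call shown above.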
You can modify the AI suggestions before creating the taxonomy:
```python
# Get suggestions
response = client.taxonomy.suggest(
    user_context="Legal contract analysis",
    data_connector_name="my-docs",
)

# Convert to list for editing
tags = []
for name, details in response.suggestion.items():
    tags.append({
        "name": name,
        "description": details["description"],
        "output_type": details["type"],
    })

# Add a custom tag the AI might have missed
tags.append({
    "name": "reviewed_by_legal",
    "description": "Whether this contract has been reviewed",
    "output_type": "boolean",
})

# Create the customized taxonomy
client.taxonomy.upsert(
    taxonomy_name="legal-contracts-custom",
    tags=tags,
)
```
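The same conversion can be wrapped in a reusable function that also drops unwanted tags and guards against duplicate names. The sketch below assumes the suggestion dict and tag fields (`name`, `description`, `output_type`) have the shapes used above. `suggestion_to_tags` is a hypothetical helper, and the sample `suggestion` stands in for `response.suggestion`.

```python
def suggestion_to_tags(suggestion, drop=(), extra=()):
    """Convert a suggestion dict into a list of tag dicts (hypothetical helper).

    drop:  names of suggested tags to leave out.
    extra: additional tag dicts to append; a name that collides with a
           kept or previously added tag raises ValueError.
    """
    tags = []
    seen = set()
    for name, details in suggestion.items():
        if name in drop:
            continue
        tags.append({
            "name": name,
            "description": details["description"],
            "output_type": details["type"],
        })
        seen.add(name)
    for tag in extra:
        if tag["name"] in seen:
            raise ValueError(f"duplicate tag name: {tag['name']}")
        tags.append(dict(tag))
        seen.add(tag["name"])
    return tags


# Stand-in for `response.suggestion` (assumed shape).
suggestion = {
    "contract_type": {"description": "Primary contract classification", "type": "word"},
    "parties": {"description": "All parties to the contract", "type": "list[string]"},
    "total_value": {"description": "Total monetary value in USD", "type": "float"},
}

reviewed = {
    "name": "reviewed_by_legal",
    "description": "Whether this contract has been reviewed",
    "output_type": "boolean",
}

tags = suggestion_to_tags(suggestion, drop={"total_value"}, extra=[reviewed])
for tag in tags:
    print(f"{tag['name']} ({tag['output_type']})")
```

The returned list has the same structure as the `tags` list built above, so it could be passed to `client.taxonomy.upsert` in the same way.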