Extraction Pipeline
The LLM-powered entity and relation extraction system.
ExtractionPipeline
Extracts entities and relations from conversational messages using an LLM.
Usage
    pipeline = ExtractionPipeline(llm_client)
    # extract is async, so it must be awaited inside a coroutine
    result = await pipeline.extract("My wife's name is Lena")
extract (async)
Extract entities and relations from a user message.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `str` | The user's conversational message. | required |
Returns:
| Type | Description |
|---|---|
| `ExtractionResult` | The extracted entities and relations, plus timing info. On failure, an empty result is returned; the method never raises. |
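The "never raises" contract can be pictured with a small stand-in. This is a minimal sketch under stated assumptions, not the library's implementation: `SketchPipeline`, `SketchResult`, its field names (`entities`, `relations`, `elapsed_ms`), the prompt string, and both demo clients are hypothetical names introduced only for illustration.

```python
import asyncio
import json
import time
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SketchResult:
    """Hypothetical stand-in for ExtractionResult; the field names are assumed."""
    entities: List[Dict[str, Any]] = field(default_factory=list)
    relations: List[Dict[str, Any]] = field(default_factory=list)
    elapsed_ms: float = 0.0


class SketchPipeline:
    """Hypothetical stand-in that returns an empty result instead of raising."""

    def __init__(self, llm_client: Any) -> None:
        self._llm = llm_client

    async def extract(self, message: str) -> SketchResult:
        start = time.perf_counter()
        try:
            raw = await self._llm.extract("Extract entities and relations as JSON.", message)
            data = json.loads(raw)
            result = SketchResult(
                entities=list(data.get("entities", [])),
                relations=list(data.get("relations", [])),
            )
        except Exception:
            # LLM failure, invalid JSON, or an unexpected shape: fall back to empty.
            result = SketchResult()
        result.elapsed_ms = (time.perf_counter() - start) * 1000.0
        return result


class FailingClient:
    """Hypothetical client whose call always fails."""

    async def extract(self, system_prompt: str, user_message: str) -> str:
        raise RuntimeError("simulated LLM outage")


class CannedClient:
    """Hypothetical client that always returns the same JSON payload."""

    async def extract(self, system_prompt: str, user_message: str) -> str:
        return json.dumps({"entities": [{"name": "Lena"}], "relations": []})


failed = asyncio.run(SketchPipeline(FailingClient()).extract("My wife's name is Lena"))
ok = asyncio.run(SketchPipeline(CannedClient()).extract("My wife's name is Lena"))
```

With this design a caller checks whether the result is empty rather than wrapping every call in `try`/`except`.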
Result Types
ExtractionResult (dataclass)
Complete result from processing a single message.
ExtractedEntity (dataclass)
An entity extracted from a user message.
ExtractedRelation (dataclass)
A relation extracted between two entities.
LLM Clients
LLMClient Protocol
Bases: Protocol
Protocol for LLM clients used by the extraction pipeline.
extract (async)
Send extraction prompt to LLM and return raw text response.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `system_prompt` | `str` | System instructions for extraction. | required |
| `user_message` | `str` | The user's conversational message to extract from. | required |
Returns:
| Type | Description |
|---|---|
| `str` | Raw LLM response text (expected to be JSON). |
Raises:
| Type | Description |
|---|---|
| `LLMError` | If the LLM call fails. |
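As a rough illustration of the protocol described above, any object with a matching async `extract` method can serve as a client. The sketch below is an assumption-laden stand-in: `LLMClientSketch`, the local `LLMError` class, and `StaticClient` are hypothetical and only mirror the documented signature; the library's own definitions may differ.

```python
import asyncio
from typing import Protocol, runtime_checkable


class LLMError(Exception):
    """Local stand-in for the library's LLMError (its real base class is not documented here)."""


@runtime_checkable
class LLMClientSketch(Protocol):
    """Mirrors the documented signature: two strings in, raw response text out."""

    async def extract(self, system_prompt: str, user_message: str) -> str: ...


class StaticClient:
    """Hypothetical client: returns fixed JSON text and raises LLMError on blank input."""

    async def extract(self, system_prompt: str, user_message: str) -> str:
        if not user_message.strip():
            raise LLMError("empty message")
        return '{"entities": [], "relations": []}'


# Structural typing: StaticClient never inherits from the protocol.
client: LLMClientSketch = StaticClient()
raw = asyncio.run(client.extract("Extract entities as JSON.", "My wife's name is Lena"))
```

Because the protocol is structural, test doubles and production clients can be swapped without sharing a base class.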
MockLLMClient
Mock LLM that returns predetermined extraction results.
Register responses with set_response(message_substring, json_response).
Falls back to an empty extraction if no match is found.
set_response
Register a canned response for messages containing the given substring.
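The substring matching and empty fallback described above could look roughly like this. It is a sketch, not the actual `MockLLMClient`: the class name `MockClientSketch`, the shape of the empty extraction JSON, case-sensitive matching, and first-registered-wins ordering are all assumptions.

```python
import asyncio
import json

# Assumed shape of an "empty extraction"; the real payload is not documented here.
EMPTY_EXTRACTION = '{"entities": [], "relations": []}'


class MockClientSketch:
    """Hypothetical mock that maps message substrings to canned JSON responses."""

    def __init__(self) -> None:
        self._responses = {}  # substring -> canned JSON text, in registration order

    def set_response(self, message_substring: str, json_response: str) -> None:
        self._responses[message_substring] = json_response

    async def extract(self, system_prompt: str, user_message: str) -> str:
        # Return the first registered response whose substring occurs in the message.
        for substring, response in self._responses.items():
            if substring in user_message:
                return response
        return EMPTY_EXTRACTION


mock = MockClientSketch()
mock.set_response("wife", json.dumps({"entities": [{"name": "Lena"}], "relations": []}))

hit = asyncio.run(mock.extract("", "My wife's name is Lena"))
miss = asyncio.run(mock.extract("", "Nice weather today"))
```

Here `hit` carries the canned payload, while `miss` falls back to the empty extraction because no registered substring matches.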
AnthropicLLMClient
LLM client using the Anthropic async API (Claude Haiku for extraction).
Utilities
repair_llm_json
Attempt to parse and repair common LLM JSON output issues.
Handles:
- Markdown fences / prose around fenced blocks
- Preamble / trailing extra text (extracts first complete JSON object/array)
- Trailing commas before closing braces/brackets
- Unclosed top-level brackets/braces (best-effort)
Returns:
| Type | Description |
|---|---|
| `dict[str, Any] \| list[Any] \| None` | Parsed dict/list, or `None` if repair fails. |
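The repair steps listed above can be sketched with the standard library alone. This is an illustrative approximation, not the library's code: `repair_llm_json_sketch` and its helpers are hypothetical names, and the order of the steps is an assumption. One known limitation of the sketch is that it starts at the first `{` or `[` it sees, so bracketed words in a preamble are not skipped.

```python
import json
import re
from typing import Any, Dict, List, Union

_FENCE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL | re.IGNORECASE)
_TRAILING_COMMA = re.compile(r",\s*([}\]])")


def _decode_prefix(text: str) -> Any:
    """Parse one JSON value at the start of text, ignoring anything after it."""
    try:
        value, _end = json.JSONDecoder().raw_decode(text)
        return value
    except json.JSONDecodeError:
        return None


def _close_open_brackets(text: str) -> str:
    """Append closers for brackets/braces still open outside string literals."""
    stack, in_string, escaped = [], False, False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    return text + "".join(reversed(stack))


def repair_llm_json_sketch(text: str) -> Union[Dict[str, Any], List[Any], None]:
    # 1. Prefer the body of a markdown fence if one is present.
    fenced = _FENCE.search(text)
    candidate = fenced.group(1) if fenced else text
    # 2. Drop any preamble before the first opening brace or bracket.
    starts = [i for i in (candidate.find("{"), candidate.find("[")) if i != -1]
    if not starts:
        return None
    candidate = candidate[min(starts):]
    # 3. Strict parse first, so valid JSON is never altered.
    value = _decode_prefix(candidate)
    if value is not None:
        return value
    # 4. Remove trailing commas before closing braces/brackets and retry.
    no_commas = _TRAILING_COMMA.sub(r"\1", candidate)
    value = _decode_prefix(no_commas)
    if value is not None:
        return value
    # 5. Best effort: drop a dangling comma, then close whatever is still open.
    closed = _close_open_brackets(no_commas.rstrip().rstrip(","))
    return _decode_prefix(_TRAILING_COMMA.sub(r"\1", closed))


fenced_result = repair_llm_json_sketch('Sure!\n```json\n{"a": 1,}\n```\nHope that helps.')
unclosed_result = repair_llm_json_sketch('{"entities": [{"name": "Lena"}')
```

Trying the strict parse before any rewriting keeps the regex-based comma removal away from well-formed input, where it could otherwise alter a string value that happens to contain `,}`.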