Integrate PlainID into LangChain to enforce policy-driven control across LLM pipelines.
This enables:
- Prompt validation (categorization).
- PII protection (anonymization).
- Secure retrieval (RAG filtering).
The integration is built on top of:
core-plainid for authorization and Policy evaluation. See our documentation on the Integration Core for more information.
Installation
pip install langchain-plainid
This automatically installs the required core-plainid dependency.
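To confirm that both distributions landed in the active environment, a small check such as the following can help. It is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and is not part of either package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Distribution names are taken from the pip command above.
for name in ("langchain-plainid", "core-plainid"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```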
Define Request Context
Every request is evaluated through PlainID using a Request Context, which represents the calling identity (user, agent, system, etc.).
To define a Request Context:
- Import the required classes.
from core_plainid.models.context.request_context import AdditionalIdentity, RequestContext
- Create a RequestContext object and define the entity ID and entity type.
request_context = RequestContext(
    entity_id="your_entity_id",
    entity_type_id="your_entity_type",
    additional_identities=[
        AdditionalIdentity(
            entity_id="agent_id",
            entity_type_id="agent_type",
        ),
    ],
)
- Pass the Request Context into the configuration.
config = {"configurable": {"request_context": request_context}}
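The sketch below illustrates how a component further down a pipeline could read the Request Context back out of the configuration dictionary. SketchIdentity, SketchRequestContext, and context_from_config are hypothetical stand-ins that only mirror the field names shown above; they are not the classes shipped in core_plainid, whose internals may differ.

```python
from dataclasses import dataclass, field
from typing import Any


# Hypothetical stand-ins that mirror the field names used in this guide.
@dataclass(frozen=True)
class SketchIdentity:
    entity_id: str
    entity_type_id: str


@dataclass(frozen=True)
class SketchRequestContext:
    entity_id: str
    entity_type_id: str
    additional_identities: list[SketchIdentity] = field(default_factory=list)


def context_from_config(config: dict[str, Any]) -> SketchRequestContext:
    """Return the Request Context stored under config["configurable"]."""
    context = config.get("configurable", {}).get("request_context")
    if context is None:
        raise ValueError("config['configurable']['request_context'] is required")
    return context


sketch_context = SketchRequestContext(
    entity_id="your_entity_id",
    entity_type_id="your_entity_type",
    additional_identities=[SketchIdentity("agent_id", "agent_type")],
)
sketch_config = {"configurable": {"request_context": sketch_context}}
```

Failing loudly when the context is missing is a design choice in this sketch: a request without an identity cannot be evaluated against a policy, so it should not proceed silently.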
Control Prompts with Categorization
Categorization ensures that incoming prompts are aligned with organizational policies before they reach your LLM.
To control prompts using categorization:
- Import the required classes.
from core_plainid.categorization.categorizer import Categorizer
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from langchain_plainid.categorization.categorization_runnable import CategorizationRunnable
- Configure the PlainIDPermissionsProvider.
permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)
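- Create a Categorizer instance. The classifier object passed as classifier_provider is not created in this snippet and must already be defined.
categorizer = Categorizer(
    classifier_provider=classifier,
    permissions_provider=permissions_provider,
    all_categories=["contract", "HR", "finance"],
)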
- Wrap the Categorizer with CategorizationRunnable.
categorization = CategorizationRunnable(categorizer=categorizer)
- Invoke the categorization process.
result = await categorization.ainvoke(
    "What is the weather today?",
    config=config,
)
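The idea behind categorization can be sketched without any PlainID dependency: classify the prompt, then compare the result with the categories the caller is allowed to use. Everything in this example is an assumption made for illustration: the keyword table stands in for a real classifier, the allowed set stands in for the permissions PlainID would return, and the rule that every detected category must be allowed is one possible policy, not necessarily the one the Categorizer applies.

```python
class PromptNotAllowedError(Exception):
    """Raised when a prompt falls outside the caller's allowed categories."""


# Hypothetical keyword table standing in for a real classifier.
KEYWORDS = {
    "contract": ("contract", "agreement", "clause"),
    "HR": ("salary", "vacation", "employee"),
    "finance": ("invoice", "budget", "revenue"),
}


def classify(prompt: str) -> set[str]:
    """Return every category whose keywords appear in the prompt."""
    text = prompt.lower()
    return {cat for cat, words in KEYWORDS.items() if any(w in text for w in words)}


def check_prompt(prompt: str, allowed: set[str]) -> str:
    """Return the prompt unchanged if all of its categories are allowed."""
    matched = classify(prompt)
    # Assumption: a prompt with no category, or with any disallowed one, is rejected.
    if not matched or not matched <= allowed:
        raise PromptNotAllowedError(f"categories not allowed: {sorted(matched)}")
    return prompt
```

Under these assumptions, a prompt such as "What is the weather today?" matches none of the categories and is rejected before it reaches the LLM.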
Protect Sensitive Data (PII Anonymization)
Automatically detect and mask sensitive information before it propagates downstream.
To protect sensitive data using anonymization:
- Import the required classes.
from core_plainid.anonymization.presidio_anonymizer import PresidioAnonymizer
from langchain_plainid.anonymization.anonymization_runnable import AnonymizationRunnable
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from core_plainid.models.context.request_context import RequestContext
- Configure the PlainIDPermissionsProvider.
permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)
- Create a PresidioAnonymizer instance.
anonymizer = PresidioAnonymizer(
    permissions_provider=permissions_provider,
    encrypt_key="your_16_char_key!",
)
- Wrap the anonymizer with AnonymizationRunnable.
anonymization = AnonymizationRunnable(anonymizer=anonymizer)
- Invoke the anonymization process.
result = await anonymization.ainvoke(
    "John Smith lives in New York",
    config=config,
)
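To make the masking step concrete, here is a deliberately simplified sketch that replaces pattern-based PII with placeholder labels. It uses only regular expressions from the standard library and is not how PresidioAnonymizer works internally; the patterns, labels, and the mask_pii helper are assumptions for illustration. Detecting names and locations, as in the example prompt above, generally requires entity recognition that plain patterns cannot provide.

```python
import re

# Simplified patterns for illustration; real detectors cover far more cases.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def mask_pii(text: str) -> str:
    """Replace every pattern match with a placeholder such as <EMAIL>."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


print(mask_pii("Contact jane.doe@example.com or +1 555-123-4567"))
# prints: Contact <EMAIL> or <PHONE>
```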
Secure Retrieval Across Vector Stores
The retrieval system enforces PlainID authorization policies on document retrieval from vector stores. It supports multiple vector stores simultaneously, where each PlainID resource type maps to a single vector store collection (for example, a ChromaDB collection or a FAISS index).
PlainID Setup
Configure Rulesets in PlainID using a custom template name (one resource type per vector store collection). For example, if you have a customer collection with country and age metadata:
# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: rs1
ruleset(asset, identity, requestParams, action) if {
    asset.template == "customer"
    asset["country"] == "Sweden"
    asset["country"] != "Russia"
    asset["age"] >= 5
}
# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: rs2
ruleset(asset, identity, requestParams, action) if {
    asset.template == "customer"
    asset["country"] == "Norway"
    asset["age"] <= 100
}
Store country and age as document metadata in your vector store, because PlainID uses these metadata fields to build the filters applied during retrieval.
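One way to picture the result is as a metadata filter evaluated against each document. The sketch below writes the two rulesets as a Chroma-style where filter and evaluates it in plain Python. It rests on two assumptions: that the rulesets are combined with a logical OR, and that the filter takes this shape. The directive that FilterDirectiveProvider actually produces may differ, and the matches helper is hypothetical.

```python
import operator
from typing import Any

OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gte": operator.ge,
    "$lte": operator.le,
}

# Rulesets rs1 and rs2 as one filter.
# Assumption: a document is allowed if ANY ruleset matches.
CUSTOMER_FILTER = {
    "$or": [
        {"$and": [
            {"country": {"$eq": "Sweden"}},
            {"country": {"$ne": "Russia"}},
            {"age": {"$gte": 5}},
        ]},
        {"$and": [
            {"country": {"$eq": "Norway"}},
            {"age": {"$lte": 100}},
        ]},
    ]
}


def matches(metadata: dict[str, Any], where: dict[str, Any]) -> bool:
    """Evaluate a filter against one document's metadata."""
    for key, value in where.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in value):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in value):
                return False
        else:
            if key not in metadata:
                return False  # a missing field can never satisfy a condition
            for op, operand in value.items():
                if not OPS[op](metadata[key], operand):
                    return False
    return True
```

With this filter, documents tagged Sweden or Norway with an age of 5 pass, while a document tagged Finland does not, and a document lacking the age field is excluded.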
To configure secure retrieval across vector stores:
- Import the required classes.
from langchain_chroma import Chroma
from langchain_core.documents import Document
from core_plainid.models.context.request_context import RequestContext
from langchain_plainid.retrieval.filter_directive_provider import FilterDirectiveProvider
from langchain_plainid.retrieval.multi_store_retriever import MultiStoreRetriever
from langchain_plainid.retrieval.retrieval_runnable import RetrievalRunnable
- Configure the FilterDirectiveProvider.
filter_provider = FilterDirectiveProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)
- Create the document datasets.
customer_docs = [
    Document(page_content="Stockholm is the capital of Sweden.", metadata={"country": "Sweden", "age": 5}),
    Document(page_content="Oslo is the capital of Norway.", metadata={"country": "Norway", "age": 5}),
    Document(page_content="Helsinki is the capital of Finland.", metadata={"country": "Finland", "age": 5}),
]
product_docs = [
    Document(page_content="Widget A is available in Europe.", metadata={"region": "Europe", "price": 10}),
    Document(page_content="Widget B is available in Asia.", metadata={"region": "Asia", "price": 20}),
]
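- Initialize the vector stores. The embeddings argument is not created in this snippet and must be an embedding model instance that you have already defined.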
customer_store = Chroma.from_documents(customer_docs, embeddings, collection_name="customers")
product_store = Chroma.from_documents(product_docs, embeddings, collection_name="products")
- Create a MultiStoreRetriever.
retriever = MultiStoreRetriever(
    filter_provider=filter_provider,
    resource_types=["customer", "product"],
    vector_stores=[customer_store, product_store],
    k=4,
)
- Create the RequestContext.
request_context = RequestContext(
    entity_id="your_entity_id",
    entity_type_id="your_entity_type",
)
- Execute the retrieval request.
docs = await retriever.aretrieve(
    "What is the capital of Sweden?",
    request_context=request_context,
)
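The pairing of resource types with vector stores can be pictured as a fan-out: each store is searched under its own filter and the permitted results are merged. The following is a toy model under stated assumptions, not the MultiStoreRetriever implementation. Stores are plain lists, relevance is a word-overlap count rather than an embedding search, resource types and stores are matched by position, and a resource type with no filter is treated as fully denied. All names are hypothetical.

```python
from typing import Any, Callable

MetadataFilter = Callable[[dict[str, Any]], bool]


def word_overlap(query: str, text: str) -> int:
    """Toy relevance score: number of lowercase words shared with the query."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve_across_stores(
    query: str,
    resource_types: list[str],
    stores: list[list[dict[str, Any]]],
    filters: dict[str, MetadataFilter],
    k: int = 4,
) -> list[dict[str, Any]]:
    """Filter each store with its own rule, then return the k best matches."""
    if len(resource_types) != len(stores):
        raise ValueError("each resource type needs exactly one store")
    allowed: list[dict[str, Any]] = []
    for resource_type, store in zip(resource_types, stores):
        rule = filters.get(resource_type)
        if rule is None:
            continue  # assumption: no filter for a type means no access
        allowed.extend(doc for doc in store if rule(doc["metadata"]))
    allowed.sort(key=lambda doc: word_overlap(query, doc["text"]), reverse=True)
    return allowed[:k]


toy_customers = [
    {"text": "Stockholm is the capital of Sweden.", "metadata": {"country": "Sweden"}},
    {"text": "Oslo is the capital of Norway.", "metadata": {"country": "Norway"}},
    {"text": "Helsinki is the capital of Finland.", "metadata": {"country": "Finland"}},
]
toy_products = [
    {"text": "Widget A is available in Europe.", "metadata": {"region": "Europe"}},
]
# Only the customer type has a filter here, so product documents are never returned.
toy_filters = {"customer": lambda meta: meta.get("country") in {"Sweden", "Norway"}}
```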
Convert Retrieval into a Runnable
The RetrievalRunnable wraps the MultiStoreRetriever as a LangChain Runnable[str, list[Document]], allowing it to be used in LangChain chains.
To convert retrieval into a Runnable:
- Import the RetrievalRunnable.
from langchain_plainid.retrieval.retrieval_runnable import RetrievalRunnable
- Wrap the retriever.
retrieval = RetrievalRunnable(retriever=retriever)
- Invoke the Runnable with configuration.
docs = await retrieval.ainvoke(
    "What is the capital of Sweden?",
    config=config,
)
Apply Policy-Based Retrieval
Use this approach to directly enforce PlainID Policies during retrieval.
To apply Policy-Based retrieval:
- Import the required classes.
from langchain_plainid.retrieval.filter_directive_provider import FilterDirectiveProvider
from langchain_plainid.retrieval.multi_store_retriever import MultiStoreRetriever
- Configure the FilterDirectiveProvider.
filter_provider = FilterDirectiveProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)
- Create a MultiStoreRetriever.
retriever = MultiStoreRetriever(
    filter_provider=filter_provider,
    resource_types=["customer", "product"],
    vector_stores=[customer_store, product_store],
    k=4,
)
- Execute the retrieval request.
results = await retriever.aretrieve(
    "What is the capital of Sweden?",
    request_context=request_context,
)
Build a Fully Governed AI Chain
One of the key benefits of wrapping PlainID components as LangChain Runnables is the ability to chain them using the | (pipe) operator. The Request Context is passed once via configuration and flows through all runnables in the chain.
To build a fully governed AI chain:
- Create a chain using the pipe operator.
chain = categorization | anonymization | retrieval
- Define the configuration with the Request Context.
config = {"configurable": {"request_context": request_context}}
- Invoke the chain.
docs = await chain.ainvoke("What is John Smith's contract status?", config=config)
This chain performs the following actions:
- Categorizes the prompt and verifies it matches allowed categories in PlainID.
- Anonymizes the prompt by detecting and masking or encrypting PII.
- Retrieves documents by querying vector stores with PlainID-enforced filters.
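The way a single configuration flows through a piped chain can be shown with a small stand-in for the Runnable interface. This sketch does not use LangChain or PlainID: the Step class, the three placeholder steps, and the dictionary used as a Request Context are all hypothetical and only mimic the composition pattern described above.

```python
import asyncio
from typing import Any, Awaitable, Callable

StepFn = Callable[[Any, dict], Awaitable[Any]]


class Step:
    """Minimal stand-in for a Runnable: composes with | and forwards config."""

    def __init__(self, fn: StepFn) -> None:
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        async def run_both(value: Any, config: dict) -> Any:
            # The same config object is handed to both steps.
            return await other.fn(await self.fn(value, config), config)

        return Step(run_both)

    async def ainvoke(self, value: Any, config: dict) -> Any:
        return await self.fn(value, config)


async def categorize_step(prompt: str, config: dict) -> str:
    # Placeholder check; a real step would evaluate the caller's permissions.
    if "contract" not in prompt.lower():
        raise PermissionError("prompt category not allowed")
    return prompt


async def anonymize_step(prompt: str, config: dict) -> str:
    # Placeholder masking of one known name.
    return prompt.replace("John Smith", "<PERSON>")


async def retrieve_step(prompt: str, config: dict) -> list[str]:
    caller = config["configurable"]["request_context"]["entity_id"]
    return [f"documents visible to {caller} for: {prompt}"]


sketch_chain = Step(categorize_step) | Step(anonymize_step) | Step(retrieve_step)
chain_config = {"configurable": {"request_context": {"entity_id": "user-1"}}}

print(asyncio.run(sketch_chain.ainvoke("What is John Smith's contract status?", chain_config)))
# prints: ["documents visible to user-1 for: What is <PERSON>'s contract status?"]
```

Because each step receives the output of the previous one, ordering matters: in this sketch the retrieval step only ever sees the anonymized prompt.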