LangChain

Integrate PlainID into LangChain to enforce policy-driven control across LLM pipelines.

This enables:

Prompt validation (categorization).
PII protection (anonymization).
Secure retrieval (RAG filtering).

The integration is built on top of:

core-plainid for authorization and Policy evaluation. See our documentation on the Integration Core for more information.

Installation

pip install langchain-plainid

This automatically installs the required core-plainid dependency.

Define Request Context

Every request is evaluated through PlainID using a Request Context, which represents the calling identity (user, agent, system, etc.).

To define a Request Context:

Import the required classes.

from core_plainid.models.context.request_context import AdditionalIdentity, RequestContext

Create a RequestContext object and define the entity ID and entity type. The example below also attaches one additional identity for an agent.

request_context = RequestContext(
    entity_id="your_entity_id",
    entity_type_id="your_entity_type",
    additional_identities=[
        AdditionalIdentity(
            entity_id="agent_id",
            entity_type_id="agent_type",
        ),
    ],
)

Pass the Request Context into the configuration.

config = {"configurable": {"request_context": request_context}}
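
The configuration is an ordinary dictionary, with the Request Context stored under the configurable key. As a rough illustration of that shape, the sketch below uses a stand-in dataclass instead of the real RequestContext, plus a hypothetical helper, get_request_context, that is not part of langchain-plainid. It only shows how a component could read the context back and fail early when it is missing.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class StubRequestContext:
    """Stand-in for core_plainid's RequestContext (illustration only)."""

    entity_id: str
    entity_type_id: str


def get_request_context(config: Optional[dict]) -> Any:
    """Hypothetical helper: read the request context from a LangChain-style config."""
    configurable = (config or {}).get("configurable", {})
    if "request_context" not in configurable:
        raise ValueError("config['configurable']['request_context'] is required")
    return configurable["request_context"]


config = {
    "configurable": {
        "request_context": StubRequestContext("your_entity_id", "your_entity_type"),
    }
}
context = get_request_context(config)
print(context.entity_id)
```

Failing early on a missing context keeps an unauthenticated call from reaching the policy check with an empty identity.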

Control Prompts with Categorization

Categorization ensures that incoming prompts are aligned with organizational policies before they reach your LLM.

To control prompts using categorization:

Import the required classes.

from core_plainid.categorization.categorizer import Categorizer
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from langchain_plainid.categorization.categorization_runnable import CategorizationRunnable

Configure the PlainIDPermissionsProvider.

permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

Create a Categorizer instance.

# `classifier` must be a classifier provider instance created beforehand (not shown here).
categorizer = Categorizer(
    classifier_provider=classifier,
    permissions_provider=permissions_provider,
    all_categories=["contract", "HR", "finance"],
)

Wrap the Categorizer with CategorizationRunnable.

categorization = CategorizationRunnable(categorizer=categorizer)

Invoke the categorization process.

result = await categorization.ainvoke(
    "What is the weather today?",
    config=config,
)
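
The call above uses await, which only works inside a coroutine (or in a notebook or REPL that supports top-level await). In a regular script, one option is to wrap the call in a coroutine and start it with asyncio.run. StubRunnable below is a hypothetical stand-in for CategorizationRunnable so the sketch runs without PlainID credentials; its pass-through return value is invented for illustration and does not reflect the real result format.

```python
import asyncio


class StubRunnable:
    """Hypothetical stand-in exposing the same async entry point, ainvoke."""

    async def ainvoke(self, prompt, config=None):
        await asyncio.sleep(0)  # simulate an awaitable policy check
        return prompt  # invented behavior: pass the prompt through unchanged


async def main():
    categorization = StubRunnable()
    config = {"configurable": {"request_context": None}}  # placeholder context
    return await categorization.ainvoke("What is the weather today?", config=config)


if __name__ == "__main__":
    print(asyncio.run(main()))
```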

Protect Sensitive Data (PII Anonymization)

Automatically detect and mask sensitive information before it propagates downstream.

To protect sensitive data using anonymization:

Import the required classes.

from core_plainid.anonymization.presidio_anonymizer import PresidioAnonymizer
from langchain_plainid.anonymization.anonymization_runnable import AnonymizationRunnable
from core_plainid.utils.plainid_permissions_provider import PlainIDPermissionsProvider
from core_plainid.models.context.request_context import RequestContext

Configure the PlainIDPermissionsProvider.

permissions_provider = PlainIDPermissionsProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

Create a PresidioAnonymizer instance.

anonymizer = PresidioAnonymizer(
    permissions_provider=permissions_provider,
    encrypt_key="your_16_char_key",  # placeholder: exactly 16 characters
)

Wrap the anonymizer with AnonymizationRunnable.

anonymization = AnonymizationRunnable(anonymizer=anonymizer)

Invoke the anonymization process.

result = await anonymization.ainvoke(
    "John Smith lives in New York",
    config=config,
)
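
The encrypt_key placeholder suggests a 16-character key. Assuming the key is used for AES encryption, as in Presidio's own encrypt operator (which accepts 128-, 192-, or 256-bit keys), it can help to validate the length before constructing the anonymizer. check_aes_key is a hypothetical helper, not part of core-plainid.

```python
VALID_AES_KEY_BYTES = (16, 24, 32)  # AES-128, AES-192, AES-256


def check_aes_key(key: str) -> str:
    """Hypothetical helper: return the key if its UTF-8 length is a valid AES key size."""
    size = len(key.encode("utf-8"))
    if size not in VALID_AES_KEY_BYTES:
        raise ValueError(
            f"encrypt key is {size} bytes; expected one of {VALID_AES_KEY_BYTES}"
        )
    return key


checked = check_aes_key("0123456789abcdef")  # 16 bytes: accepted
```

In practice the key would typically come from a secret store or environment variable rather than being hard-coded.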

Secure Retrieval Across Vector Stores

The retrieval system enforces PlainID authorization policies on document retrieval from vector stores. It supports multiple vector stores simultaneously, where each PlainID resource type maps to a single vector store collection (for example, a ChromaDB collection or a FAISS index).

PlainID Setup

Configure Rulesets in PlainID with a custom template name for each resource type (one resource type per vector store collection). For example, for a customer collection with country and age metadata:

# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: rs1
ruleset(asset, identity, requestParams, action) if {
    asset.template == "customer"
    asset["country"] == "Sweden"
    asset["country"] != "Russia"
    asset["age"] >= 5
}

# METADATA
# custom:
#   plainid:
#     kind: Ruleset
#     name: rs2
ruleset(asset, identity, requestParams, action) if {
    asset.template == "customer"
    asset["country"] == "Norway"
    asset["age"] <= 100
}

Note that you need to add country and age parameters to your vector store as document metadata. PlainID uses these metadata fields to build the filters applied during retrieval.
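
To make the effect of these Rulesets concrete, the sketch below hand-writes the kind of metadata filter they could produce and applies it to sample metadata in plain Python. It assumes that a document is allowed when it satisfies any one Ruleset, and it borrows Chroma-style operator names ($and, $or, $eq, $ne, $gte, $lte); the filter PlainID actually generates may look different. The matches function is a hypothetical evaluator written only for this illustration.

```python
import operator

# Comparison operators, named after Chroma's metadata filter syntax.
OPS = {
    "$eq": operator.eq,
    "$ne": operator.ne,
    "$gte": operator.ge,
    "$lte": operator.le,
}

# Hand-written equivalent of rs1 OR rs2 (the OR combination is an assumption).
customer_filter = {
    "$or": [
        {"$and": [
            {"country": {"$eq": "Sweden"}},
            {"country": {"$ne": "Russia"}},
            {"age": {"$gte": 5}},
        ]},
        {"$and": [
            {"country": {"$eq": "Norway"}},
            {"age": {"$lte": 100}},
        ]},
    ]
}


def matches(metadata: dict, where: dict) -> bool:
    """Evaluate a single-key filter clause against one document's metadata."""
    (key, value), = where.items()
    if key == "$and":
        return all(matches(metadata, clause) for clause in value)
    if key == "$or":
        return any(matches(metadata, clause) for clause in value)
    (op, expected), = value.items()
    return key in metadata and OPS[op](metadata[key], expected)


samples = [
    {"country": "Sweden", "age": 5},
    {"country": "Norway", "age": 5},
    {"country": "Finland", "age": 5},
]
allowed = [m["country"] for m in samples if matches(m, customer_filter)]
print(allowed)  # ['Sweden', 'Norway']
```

A document whose metadata lacks country or age never matches in this sketch, which is why those fields need to be present on every document.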

To configure secure retrieval across vector stores:

Import the required classes.

from langchain_chroma import Chroma
from langchain_core.documents import Document
from core_plainid.models.context.request_context import RequestContext
from langchain_plainid.retrieval.filter_directive_provider import FilterDirectiveProvider
from langchain_plainid.retrieval.multi_store_retriever import MultiStoreRetriever
from langchain_plainid.retrieval.retrieval_runnable import RetrievalRunnable

Configure the FilterDirectiveProvider.

filter_provider = FilterDirectiveProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

Create the document datasets.

customer_docs = [
    Document(page_content="Stockholm is the capital of Sweden.", metadata={"country": "Sweden", "age": 5}),
    Document(page_content="Oslo is the capital of Norway.", metadata={"country": "Norway", "age": 5}),
    Document(page_content="Helsinki is the capital of Finland.", metadata={"country": "Finland", "age": 5}),
]

product_docs = [
    Document(page_content="Widget A is available in Europe.", metadata={"region": "Europe", "price": 10}),
    Document(page_content="Widget B is available in Asia.", metadata={"region": "Asia", "price": 20}),
]

Initialize the vector stores.

# `embeddings` must be an embeddings model instance created beforehand (not shown here).
customer_store = Chroma.from_documents(customer_docs, embeddings, collection_name="customers")
product_store = Chroma.from_documents(product_docs, embeddings, collection_name="products")

Create a MultiStoreRetriever.

retriever = MultiStoreRetriever(
    filter_provider=filter_provider,
    resource_types=["customer", "product"],
    vector_stores=[customer_store, product_store],
    k=4,
)

Create the RequestContext.

request_context = RequestContext(
    entity_id="your_entity_id",
    entity_type_id="your_entity_type",
)

Execute the retrieval request.

docs = await retriever.aretrieve(
    "What is the capital of Sweden?",
    request_context=request_context,
)
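
Because resource_types and vector_stores are passed as two parallel lists, their order matters. The sketch below assumes the retriever pairs them by position, which is not stated explicitly here, and shows a hypothetical pre-flight check, pair_stores, that catches a mismatch before the retriever is built. Plain strings stand in for the vector store objects.

```python
def pair_stores(resource_types: list, vector_stores: list) -> dict:
    """Hypothetical check: pair each resource type with the store at the same index."""
    if len(resource_types) != len(vector_stores):
        raise ValueError(
            f"{len(resource_types)} resource types but {len(vector_stores)} vector stores"
        )
    if len(set(resource_types)) != len(resource_types):
        raise ValueError("resource types must be unique")
    return dict(zip(resource_types, vector_stores))


mapping = pair_stores(
    ["customer", "product"],
    ["customers-collection", "products-collection"],
)
print(mapping["customer"])  # customers-collection
```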

Convert Retrieval into a Runnable

The RetrievalRunnable wraps the MultiStoreRetriever as a LangChain Runnable[str, list[Document]], allowing it to be used in LangChain chains.

To convert retrieval into a Runnable:

Import the RetrievalRunnable.

from langchain_plainid.retrieval.retrieval_runnable import RetrievalRunnable

Wrap the retriever.

retrieval = RetrievalRunnable(retriever=retriever)

Invoke the Runnable with configuration.

docs = await retrieval.ainvoke(
    "What is the capital of Sweden?",
    config=config,
)

Apply Policy-Based Retrieval

Use this approach to directly enforce PlainID Policies during retrieval.

To apply Policy-Based retrieval:

Import the required classes.

from langchain_plainid.retrieval.filter_directive_provider import FilterDirectiveProvider
from langchain_plainid.retrieval.multi_store_retriever import MultiStoreRetriever

Configure the FilterDirectiveProvider.

filter_provider = FilterDirectiveProvider(
    base_url="https://platform-product.us1.plainid.io",
    client_id="your_client_id",
    client_secret="your_client_secret",
)

Create a MultiStoreRetriever.

retriever = MultiStoreRetriever(
    filter_provider=filter_provider,
    resource_types=["customer", "product"],
    vector_stores=[customer_store, product_store],
    k=4,
)

Execute the retrieval request.

results = await retriever.aretrieve(
    "What is the capital of Sweden?",
    request_context=request_context,
)

Build a Fully Governed AI Chain

One of the key benefits of wrapping PlainID components as LangChain Runnables is the ability to chain them using the | (pipe) operator. The Request Context is passed once via configuration and flows through all runnables in the chain.

To build a fully governed AI chain:

Create a chain using the pipe operator.

chain = categorization | anonymization | retrieval

Define the configuration with the Request Context.

config = {"configurable": {"request_context": request_context}}

Invoke the chain.

docs = await chain.ainvoke("What is John Smith's contract status?", config=config)

This chain performs the following actions:

Categorizes the prompt and verifies it matches allowed categories in PlainID.
Anonymizes the prompt by detecting and masking or encrypting PII.
Retrieves documents by querying vector stores with PlainID-enforced filters.
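
The way a single configuration reaches every step can be sketched without any PlainID or LangChain dependency. Stage and Pipeline below are hypothetical stand-ins for Runnables, and the three steps are toy functions (the <PERSON> placeholder and the returned document text are invented); the sketch only illustrates that each step runs in order and receives the same config.

```python
import asyncio


class Stage:
    """Hypothetical stand-in for a Runnable: an async step that supports `|`."""

    def __init__(self, name, func):
        self.name = name
        self.func = func

    def __or__(self, other):
        return Pipeline([self, other])

    async def ainvoke(self, value, config=None):
        return await self.func(value, config)


class Pipeline:
    """Runs stages in order, handing the same config to each one."""

    def __init__(self, stages):
        self.stages = stages

    def __or__(self, other):
        return Pipeline(self.stages + [other])

    async def ainvoke(self, value, config=None):
        for stage in self.stages:
            value = await stage.ainvoke(value, config)
        return value


seen = []  # records which stage saw which request context


def make_stage(name, transform):
    async def step(value, config):
        seen.append((name, config["configurable"]["request_context"]))
        return transform(value)

    return Stage(name, step)


chain = (
    make_stage("categorize", lambda prompt: prompt)
    | make_stage("anonymize", lambda prompt: prompt.replace("John Smith", "<PERSON>"))
    | make_stage("retrieve", lambda prompt: [f"document for: {prompt}"])
)


async def main():
    config = {"configurable": {"request_context": "your_entity_id"}}
    return await chain.ainvoke("What is John Smith's contract status?", config=config)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Note that the output of each step becomes the input of the next, so the order of the steps in the chain determines what the retrieval step actually searches for.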
