The service is responsible for discovery processes, including connecting to vector databases, identifying relevant metadata filters for authorization, and discovering MCP services, tools, and other connectors. It also ensures secure communication with the PlainID Cloud platform to deliver discovered Assets.
Connectors
Pinecone
The connector discovers all indexes and namespaces in a Pinecone instance, along with the metadata keys of the documents they contain, subject to configurable filters. Vectors are processed in batches to minimize the memory footprint, which makes the connector suitable for large datasets with millions of vectors.
The connector provides:
- Automatic discovery of all indexes and namespaces
- Metadata key extraction and type inference (string/number/bool/array_string)
- Regex-based filtering of namespaces and metadata keys
- Batch processing of vectors (1,000 at a time) to limit memory usage
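The key extraction and batching described above can be sketched roughly as follows. This is an illustrative sketch only, not the connector's actual implementation: the function names (`infer_type`, `batches`, `discover_keys`) are hypothetical, and the input is assumed to be a plain stream of metadata dictionaries rather than real Pinecone API responses.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List, Set

BATCH_SIZE = 1000  # batch size mentioned in the feature list


def infer_type(value: Any) -> str:
    """Map a metadata value to one of: string, number, bool, array_string."""
    # bool is checked first because in Python bool is a subclass of int.
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    # An empty list is treated as array_string here (an assumption).
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return "array_string"
    raise ValueError(f"unsupported metadata value: {value!r}")


def batches(items: Iterable[Any], size: int = BATCH_SIZE) -> Iterator[List[Any]]:
    """Yield lists of at most `size` items so only one batch is held in memory."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def discover_keys(metadata_stream: Iterable[Dict[str, Any]]) -> Dict[str, Set[str]]:
    """Collect every metadata key and the set of types observed for it."""
    seen: Dict[str, Set[str]] = {}
    for batch in batches(metadata_stream):
        for metadata in batch:
            for key, value in metadata.items():
                seen.setdefault(key, set()).add(infer_type(value))
    return seen
```

Because the stream is consumed one batch at a time, memory use depends on the batch size and the number of distinct keys, not on the total number of vectors.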
Configuration
Values can use environment variable substitution (for example, ${LOG_LEVEL:info}). Some settings (e.g. http, management) are defined by the micro-infra framework and may have additional defaults.
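To illustrate the substitution syntax, here is a minimal sketch of how a `${NAME:default}` placeholder could be resolved. It assumes the part after the colon is a fallback used when the variable is not set; the `substitute` helper and its error behavior for a missing variable without a default are hypothetical and may differ from what the framework does.

```python
import os
import re
from typing import Mapping, Optional

# ${NAME} or ${NAME:default}; the default may itself contain colons (e.g. a URL).
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}")


def substitute(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace each placeholder with the environment value or its default."""
    env = os.environ if env is None else env

    def replace(match: "re.Match[str]") -> str:
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        # Assumption: an unset variable with no default is treated as an error.
        raise KeyError(f"environment variable {name} is not set and has no default")

    return _PLACEHOLDER.sub(replace, value)
```

With this sketch, `substitute("${LOG_LEVEL:info}", {})` yields `info`, while passing `{"LOG_LEVEL": "debug"}` yields `debug`.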
High-level Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| log.level | string | no | info | Log level (e.g. debug, info, warn, error). Can be set via LOG_LEVEL env. |
| plainIdUrl | string | yes* | — | Base URL of PlainID API (used to obtain JWT). Required for sources that send data to the orchestrator. |
| plainIdDiscoveryUrl | string | yes* | — | Base URL of the orchestrator (pinecone-orch-connector or mcp-orch-connector). Required when using Pinecone or PlainIdMcpGateway. |
| discoverySources | array | yes | — | List of discovery source entries. |
| http | object | no | (micro-infra) | HTTP server. Code defaults: port = 8080, jwt.jwtIgnoreVerification = true. Other fields from plainttp. |
| management | object | no | (micro-infra) | Monitoring/health. Schema from monitor.MonitorManagerConfig. |
Discovery Source Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| popId | string | yes | Unique identifier for the source (Point of Presence). |
| environmentId | string (UUID) | yes | PlainID environment UUID; required for sending to the orchestrator. |
| type | string | yes | Source type: Pinecone or PlainIdMcpGateway. |
| periodicStart | string (cron) | no | Cron expression (6 fields, with seconds). If empty, no periodic discovery is scheduled. |
| vendor | object | yes | Type-specific config (see below). |
| plainIdCredentials | object | yes | clientId, clientSecret for JWT. |
| metadataKeys | object | no | Metadata key filter: mode (include/exclude), patterns (regex). Used by Pinecone. |
| availabilityThreshold | float | no | Availability threshold (0–1). Default: 0.1. Pinecone only. |
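As a reading aid for periodicStart, the sketch below splits a 6-field cron expression into named fields. The field order (seconds first, then minute, hour, day of month, month, day of week) is an assumption based on the common Quartz-style convention; the table above only states that there are six fields including seconds. The `describe_cron` helper is hypothetical and does not validate field values.

```python
from typing import Dict, Optional

# Assumed field order for a 6-field cron expression with seconds.
CRON_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(expression: str) -> Optional[Dict[str, str]]:
    """Return a field-name -> value mapping, or None when no schedule is set."""
    if not expression.strip():
        # An empty periodicStart means no periodic discovery is scheduled.
        return None
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))
```

Under that assumed order, `"0 0 1 * * ?"` reads as second 0, minute 0, hour 1, on every day, that is, once a day at 01:00:00.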
Pinecone Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| vendor.pinecone.apiKey | string | yes | — | Pinecone API key. |
| vendor.pinecone.sampleLimit | uint | no | 0 (no limit) | Max number of vectors to analyze per namespace. 0 = no limit. |
| vendor.collections | object | no | — | Namespace filter: mode (include/exclude), patterns (regex). |
FilterConfig (collections / metadataKeys) Parameters
| Parameter | Type | Description |
|---|---|---|
| mode | string | include or exclude. |
| patterns | []string | List of regex patterns. |
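The include/exclude semantics of a FilterConfig can be sketched as below. This is an assumption-laden illustration: the `apply_filter` helper is hypothetical, and it assumes each pattern must match the whole name (`re.fullmatch`). The service may instead use unanchored matching, in which case a pattern such as `users` would also match `users_archive`.

```python
import re
from typing import Iterable, List, Optional, Sequence


def apply_filter(
    names: Iterable[str],
    mode: Optional[str] = None,
    patterns: Sequence[str] = (),
) -> List[str]:
    """Keep or drop names according to mode ('include' or 'exclude')."""
    if mode is None:
        # No filter configured: every name passes through.
        return list(names)
    if mode not in ("include", "exclude"):
        raise ValueError(f"mode must be 'include' or 'exclude', got {mode!r}")

    compiled = [re.compile(p) for p in patterns]

    def matches(name: str) -> bool:
        # Assumption: a pattern must match the entire name.
        return any(rx.fullmatch(name) for rx in compiled)

    if mode == "include":
        return [n for n in names if matches(n)]
    return [n for n in names if not matches(n)]
```

For example, with mode `exclude` and patterns `users` and `books_.*`, the namespaces `users` and `books_2020` are dropped and `orders` is kept.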
PlainID MCP Gateway Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| vendor.plainIdMcpGateway.url | string | yes | URL of pid-mcp service (e.g. http://pid-mcp:5235). |
plainIdCredentials Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| clientId | string | yes (for orchestrator sources) | PlainID API client ID. |
| clientSecret | string | yes (for orchestrator sources) | PlainID API client secret. |
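The required fields from the tables above can be checked before starting the service. The following is a minimal sketch, assuming the configuration has already been parsed into a Python dictionary; `validate_source` is a hypothetical helper, not part of the service, and it covers only the required fields documented here.

```python
import uuid
from typing import Any, Dict, List

# Source type -> (vendor section, required key), taken from the parameter tables.
SOURCE_TYPES = {
    "Pinecone": ("pinecone", "apiKey"),
    "PlainIdMcpGateway": ("plainIdMcpGateway", "url"),
}


def validate_source(source: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in one discoverySources entry."""
    errors: List[str] = []
    for field in ("popId", "environmentId", "type", "vendor", "plainIdCredentials"):
        if not source.get(field):
            errors.append(f"missing required field: {field}")

    env_id = source.get("environmentId")
    if env_id:
        try:
            uuid.UUID(str(env_id))
        except ValueError:
            errors.append("environmentId is not a valid UUID")

    source_type = source.get("type")
    if source_type and source_type not in SOURCE_TYPES:
        errors.append(f"unknown type: {source_type}")
    elif source_type and source.get("vendor"):
        section, key = SOURCE_TYPES[source_type]
        vendor_section = source["vendor"].get(section) or {}
        if not vendor_section.get(key):
            errors.append(f"missing required field: vendor.{section}.{key}")

    creds = source.get("plainIdCredentials")
    if creds:
        for key in ("clientId", "clientSecret"):
            if not creds.get(key):
                errors.append(f"missing required field: plainIdCredentials.{key}")
    return errors
```

An empty list means the entry has all documented required fields; each string in a non-empty list names one missing or malformed field.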
Configuration examples
Example: Pinecone only
log:
  level: info
plainIdUrl: "https://api.dev8.plainid.cloud"
plainIdDiscoveryUrl: "https://api.app.dev8.plainid.cloud"
discoverySources:
  - popId: POP456
    environmentId: "550e8400-e29b-41d4-a716-446655440000"
    type: Pinecone
    periodicStart: "0 0 1 * * ?"
    vendor:
      pinecone:
        apiKey: "your-pinecone-api-key"
        sampleLimit: 50000   # optional; 0 = no limit
      collections:
        mode: exclude
        patterns:
          - "users"
          - "books_.*"
    plainIdCredentials:
      clientId: "your-client-id"
      clientSecret: "your-client-secret"
    metadataKeys:
      mode: exclude
      patterns:
        - "createdAt"
        - "timestamp.*"
    availabilityThreshold: 0.1
Example: PlainID MCP Gateway only
log:
  level: info
plainIdUrl: "https://api.dev8.plainid.cloud"
plainIdDiscoveryUrl: "https://api.app.dev8.plainid.cloud"
discoverySources:
  - popId: POP789
    environmentId: "550e8400-e29b-41d4-a716-446655440000"
    type: PlainIdMcpGateway
    periodicStart: "0 0 1 * * ?"
    vendor:
      plainIdMcpGateway:
        url: "http://pid-mcp:5235"
    plainIdCredentials:
      clientId: "your-client-id"
      clientSecret: "your-client-secret"
Example: Pinecone and PlainID MCP Gateway in one configuration
A single process can run both source types. There is one top-level plainIdDiscoveryUrl. If you need different base URLs per source, run separate instances with different configs.
log:
  level: info
plainIdUrl: "https://api.dev8.plainid.cloud"
plainIdDiscoveryUrl: "https://api.app.dev8.plainid.cloud"
discoverySources:
  - popId: Pinecone-1
    environmentId: "550e8400-e29b-41d4-a716-446655440000"
    type: Pinecone
    periodicStart: "0 0 1 * * ?"
    vendor:
      pinecone:
        apiKey: "your-pinecone-api-key"
      collections:
        mode: include
        patterns: ["prod_.*"]
    plainIdCredentials:
      clientId: "your-client-id"
      clientSecret: "your-client-secret"
    metadataKeys:
      mode: exclude
      patterns:
        - "createdAt"
    availabilityThreshold: 0.1
  - popId: MCP-1
    environmentId: "550e8400-e29b-41d4-a716-446655440000"
    type: PlainIdMcpGateway
    periodicStart: "0 0 2 * * ?"
    vendor:
      plainIdMcpGateway:
        url: "http://pid-mcp:5235"
    plainIdCredentials:
      clientId: "your-client-id"
      clientSecret: "your-client-secret"