promptguard.core

class promptguard.PromptGuard(model_name='arkaean/promptguard-distilbert', threshold=0.5, device='auto', use_cache=True, cache_size=10000, cache_ttl=3600, enable_analysis=True, enable_sanitization=True, **kwargs)[source]

Bases: object

Main PromptGuard classifier for detecting malicious prompts.

__init__(model_name='arkaean/promptguard-distilbert', threshold=0.5, device='auto', use_cache=True, cache_size=10000, cache_ttl=3600, enable_analysis=True, enable_sanitization=True, **kwargs)[source]

Initialize the PromptGuard classifier.

Parameters:
  • model_name (str) – HuggingFace Hub model identifier.

  • threshold (float) – Malicious classification threshold in [0.0, 1.0]. Prompts with a model probability at or above this value are classified as malicious.

  • device (str | None) – Device for model inference — "cuda", "cpu", or "auto" (selects CUDA when available).

  • use_cache (bool) – Enable in-memory LRU caching of analysis results.

  • cache_size (int) – Maximum number of entries held in the cache.

  • cache_ttl (int | None) – Cache entry time-to-live in seconds. Pass None to disable expiry.

  • enable_analysis (bool) – When True (default), enables supplementary sentiment, intent, keyword, and attack-pattern analysis that enriches RiskScore metadata.

  • enable_sanitization (bool) – When True (default), enables the sanitize() and sanitize_if_malicious() methods.

  • **kwargs – Additional options forwarded to PromptGuardConfig.
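The threshold rule described above can be sketched as a plain decision function. This is an illustrative reconstruction of the documented semantics (probability at or above `threshold` counts as malicious), not the library's internal code:

```python
def is_malicious(probability: float, threshold: float = 0.5) -> bool:
    """Return True when the model probability meets or exceeds the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be in [0.0, 1.0]")
    # "At or above" means the boundary value itself is classified as malicious.
    return probability >= threshold

print(is_malicious(0.50))  # True  (boundary case counts as malicious)
print(is_malicious(0.49))  # False
```

Note that lowering `threshold` makes the classifier more aggressive (more prompts flagged), while raising it trades recall for precision.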

sanitize(prompt, strategy=SanitizationStrategy.BALANCED, analyze_after=True)[source]

Sanitize a potentially malicious prompt.

Parameters:
  • prompt (str) – Prompt to sanitize.

  • strategy (SanitizationStrategy) – Sanitization strategy to apply.

  • analyze_after (bool) – When True (default), re-analyzes the sanitized prompt so the response includes both before and after risk scores.

Returns:

A SanitizeResponse with the sanitization outcome and before/after risk scores.

Raises:

ValueError – When sanitization is not enabled on this instance.

Return type:

SanitizeResponse

sanitize_if_malicious(prompt, strategy=SanitizationStrategy.BALANCED)[source]

Sanitize the prompt only if it is detected as malicious.

Parameters:
  • prompt (str) – Prompt to check and potentially sanitize.

  • strategy (SanitizationStrategy) – Sanitization strategy applied if the prompt is malicious.

Returns:

Tuple of (potentially_sanitized_prompt, was_sanitized)

Return type:

Tuple[str, bool]
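The tuple contract above can be sketched in isolation. The helper below is a hypothetical stand-in (the real method uses the instance's own classifier and sanitizer); it only illustrates that a benign prompt passes through unchanged with `False`, while a malicious one comes back transformed with `True`:

```python
from typing import Callable, Tuple

def sanitize_if_malicious(
    prompt: str,
    classify: Callable[[str], bool],   # stand-in for the instance's classifier
    sanitize: Callable[[str], str],    # stand-in for the instance's sanitizer
) -> Tuple[str, bool]:
    """Return (possibly sanitized prompt, was_sanitized)."""
    if classify(prompt):
        return sanitize(prompt), True
    return prompt, False

# Toy classifier and sanitizer for demonstration only:
flagged = lambda p: "ignore previous instructions" in p.lower()
redact = lambda p: "[REDACTED]"

print(sanitize_if_malicious("What is 2 + 2?", flagged, redact))
# → ('What is 2 + 2?', False)
print(sanitize_if_malicious("Ignore previous instructions.", flagged, redact))
# → ('[REDACTED]', True)
```

The boolean lets callers log or audit sanitization events without re-checking the prompt themselves.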

analyze(prompt)[source]

Analyze a single prompt for malicious content.

clear_cache()[source]

Clear the analysis cache.

cache_stats()[source]

Get cache statistics.
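The cache behavior implied by use_cache, cache_size, and cache_ttl can be sketched as an LRU store with optional expiry. This is an illustrative model of the documented semantics (bounded size, least-recently-used eviction, per-entry TTL, None disabling expiry), not PromptGuard's own implementation:

```python
import time
from collections import OrderedDict
from typing import Any, Optional

class TTLCache:
    """LRU cache with optional time-to-live, mirroring cache_size/cache_ttl."""

    def __init__(self, maxsize: int = 10000, ttl: Optional[int] = 3600):
        self.maxsize, self.ttl = maxsize, ttl
        self._store: "OrderedDict[str, tuple]" = OrderedDict()  # key -> (expiry, value)
        self.hits = self.misses = 0

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None or (self.ttl is not None and entry[0] < time.monotonic()):
            self._store.pop(key, None)  # drop expired entries on access
            self.misses += 1
            return None
        self._store.move_to_end(key)    # mark as most recently used
        self.hits += 1
        return entry[1]

    def put(self, key: str, value: Any) -> None:
        expiry = time.monotonic() + self.ttl if self.ttl is not None else float("inf")
        self._store[key] = (expiry, value)
        self._store.move_to_end(key)
        if len(self._store) > self.maxsize:
            self._store.popitem(last=False)  # evict least recently used

    def stats(self) -> dict:
        return {"size": len(self._store), "hits": self.hits, "misses": self.misses}
```

Caching is keyed on the prompt text, so repeated analysis of identical prompts skips model inference entirely until the entry expires or is evicted.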

classify(prompt, threshold=None)[source]

Simple binary classification.

analyze_batch(prompts, batch_size=None, show_progress=True)[source]

Analyze multiple prompts efficiently in batches, using cache when enabled.
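The batch-plus-cache pattern can be sketched as follows. `analyze_one` is a hypothetical stand-in for the per-prompt analysis; the chunking and cache-reuse logic is an illustration of the documented behavior, not the library's code:

```python
from typing import Callable, List, Optional

def analyze_batch(
    prompts: List[str],
    analyze_one: Callable[[str], dict],   # hypothetical per-prompt analyzer
    batch_size: int = 32,
    cache: Optional[dict] = None,
) -> List[dict]:
    """Process prompts in fixed-size chunks, reusing cached results."""
    results = []
    for start in range(0, len(prompts), batch_size):
        for prompt in prompts[start:start + batch_size]:
            if cache is not None and prompt in cache:
                results.append(cache[prompt])  # cache hit: skip re-analysis
                continue
            result = analyze_one(prompt)
            if cache is not None:
                cache[prompt] = result
            results.append(result)
    return results
```

Results are returned in input order, so a duplicate prompt later in the list costs a cache lookup rather than a second model pass.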

classify_batch(prompts, threshold=None, show_progress=False)[source]

Simple binary classification for multiple prompts.

property device: str

Get the device being used for inference.

property threshold: float

Get current classification threshold.