promptguard.core¶
- class promptguard.PromptGuard(model_name='arkaean/promptguard-distilbert', threshold=0.5, device='auto', use_cache=True, cache_size=10000, cache_ttl=3600, enable_analysis=True, enable_sanitization=True, **kwargs)[source]¶
Bases: object
Main PromptGuard classifier for detecting malicious prompts.
- __init__(model_name='arkaean/promptguard-distilbert', threshold=0.5, device='auto', use_cache=True, cache_size=10000, cache_ttl=3600, enable_analysis=True, enable_sanitization=True, **kwargs)[source]¶
Initialise the PromptGuard classifier.
- Parameters:
model_name (str) – HuggingFace Hub model identifier.
threshold (float) – Malicious classification threshold in [0.0, 1.0]. Prompts with a model probability at or above this value are classified as malicious.
device (str | None) – Device for model inference: "cuda", "cpu", or "auto" (selects CUDA when available).
use_cache (bool) – Enable in-memory LRU caching of analysis results.
cache_size (int) – Maximum number of entries held in the cache.
cache_ttl (int | None) – Cache entry time-to-live in seconds. Pass None to disable expiry.
enable_analysis (bool) – When True (default), enables supplementary sentiment, intent, keyword, and attack-pattern analysis that enriches RiskScore metadata.
enable_sanitization (bool) – When True (default), enables the sanitize() and sanitize_if_malicious() methods.
**kwargs – Additional options forwarded to PromptGuardConfig.
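The use_cache, cache_size, and cache_ttl options describe an in-memory LRU cache with optional time-based expiry. A minimal stand-alone sketch of those semantics, using only the standard library (this is an illustration of the documented behaviour, not the library's internal implementation):

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Sketch of an LRU cache with optional per-entry TTL expiry."""

    def __init__(self, cache_size=10000, cache_ttl=3600):
        self.cache_size = cache_size
        self.cache_ttl = cache_ttl          # seconds; None disables expiry
        self._entries = OrderedDict()       # prompt -> (result, inserted_at)

    def get(self, prompt):
        entry = self._entries.get(prompt)
        if entry is None:
            return None
        result, inserted_at = entry
        if self.cache_ttl is not None and time.monotonic() - inserted_at > self.cache_ttl:
            del self._entries[prompt]       # entry expired; treat as a miss
            return None
        self._entries.move_to_end(prompt)   # mark as most recently used
        return result

    def put(self, prompt, result):
        self._entries[prompt] = (result, time.monotonic())
        self._entries.move_to_end(prompt)
        if len(self._entries) > self.cache_size:
            self._entries.popitem(last=False)  # evict least recently used
```

With cache_ttl=None entries never expire and only the LRU size bound applies, matching the "Pass None to disable expiry" behaviour described above.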
- sanitize(prompt, strategy=SanitizationStrategy.BALANCED, analyze_after=True)[source]¶
Sanitise a potentially malicious prompt.
- Parameters:
prompt (str) – Prompt to sanitise.
strategy (SanitizationStrategy) – Sanitisation strategy to apply.
analyze_after (bool) – When True (default), the sanitised prompt is re-analysed and the result stored in SanitizeResponse.sanitized_analysis.
- Returns:
A SanitizeResponse with the sanitisation outcome and before/after risk scores.
- Raises:
ValueError – When sanitisation is not enabled on this instance.
- Return type:
SanitizeResponse
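The response described above can be pictured as a small record type. A hypothetical sketch: only sanitized_analysis is named in these docs, so the other field names here are illustrative assumptions, not the library's actual attributes:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RiskScore:
    """Hypothetical risk-score shape; field names are assumptions."""
    probability: float      # model probability that the prompt is malicious
    is_malicious: bool      # probability >= threshold


@dataclass
class SanitizeResponse:
    """Sketch of the sanitisation outcome. Only `sanitized_analysis` is
    named in the docs; the other field names are illustrative."""
    sanitized_prompt: str
    original_analysis: RiskScore                # risk score before sanitisation
    sanitized_analysis: Optional[RiskScore]     # filled when analyze_after=True, else None
```

Comparing original_analysis with sanitized_analysis is what gives the "before/after risk scores" mentioned in the return description.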
- sanitize_if_malicious(prompt, strategy=SanitizationStrategy.BALANCED)[source]¶
Sanitise the prompt only if it is detected as malicious.
- Parameters:
prompt (str) – Prompt to check and, if malicious, sanitise.
strategy (SanitizationStrategy) – Sanitisation strategy to apply when needed.
- Returns:
Tuple of (potentially_sanitized_prompt, was_sanitized).
- Return type:
tuple[str, bool]
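The check-then-sanitise flow above can be sketched as a small function. Here analyze and sanitize are stand-in callables supplied by the caller, not the library's own API; the threshold comparison mirrors the "at or above" rule from the constructor docs:

```python
def sanitize_if_malicious(prompt, analyze, sanitize, threshold=0.5):
    """Sketch of the flow: analyse first, sanitise only when the
    probability meets the threshold. `analyze` returns a probability in
    [0.0, 1.0]; `sanitize` returns a cleaned prompt."""
    probability = analyze(prompt)
    if probability >= threshold:        # at or above threshold => malicious
        return sanitize(prompt), True   # (sanitised prompt, was_sanitized)
    return prompt, False                # benign prompts pass through unchanged
```

The tuple return mirrors the documented (potentially_sanitized_prompt, was_sanitized) shape: callers can branch on the boolean without re-analysing.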
- analyze_batch(prompts, batch_size=None, show_progress=True)[source]¶
Analyze multiple prompts efficiently in batches, using cache when enabled.
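A cache-aware batching loop of the kind described can be sketched as follows. classify_batch stands in for the model's batched inference and cache is a plain dict standing in for the LRU cache; neither is the library's actual interface:

```python
def analyze_batch(prompts, classify_batch, cache, batch_size=32):
    """Sketch of cache-aware batch analysis: cached prompts are answered
    from the cache, and only the misses are sent to the model,
    `batch_size` prompts at a time."""
    results = {}
    misses = []
    for prompt in prompts:
        if prompt in results:
            continue                    # duplicate within this call
        cached = cache.get(prompt)
        if cached is not None:
            results[prompt] = cached    # cache hit: skip model inference
        else:
            results[prompt] = None      # placeholder, filled below
            misses.append(prompt)
    for start in range(0, len(misses), batch_size):
        batch = misses[start:start + batch_size]
        for prompt, score in zip(batch, classify_batch(batch)):
            cache[prompt] = score       # store for future calls
            results[prompt] = score
    return [results[p] for p in prompts]
```

Deduplicating before batching means repeated prompts within one call cost a single inference, and results are returned in the original input order.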