promptguard.schemas

Data Models

class promptguard.RiskScore(is_malicious, probability, risk_level, confidence, explanation, metadata=None)[source]

Bases: object

Result of a single-prompt security analysis.

is_malicious: bool

True when the model probability exceeds the configured threshold.

probability: float

Malicious probability in [0.0, 1.0] from the model.

risk_level: RiskLevel

Coarse-grained RiskLevel derived from probability.

confidence: float

Distance from the decision boundary, scaled to [0.0, 1.0].

explanation: str

Human-readable summary with evidence.

metadata: Dict[str, Any] | None = None

Optional per-analyser detail (sentiment, intent, keywords, attack_patterns).

to_dict()[source]

Serialise to a plain dictionary.

class promptguard.SanitizationResult(original, sanitized, was_modified, removed_patterns, strategy, confidence, risk_reduction)[source]

Bases: object

Outcome of a single prompt sanitisation operation.

original: str

The original (pre-sanitisation) prompt text.

sanitized: str

The cleaned prompt text.

was_modified: bool

True when sanitized differs from original.

removed_patterns: List[str]

Fragments of text that were matched and removed or replaced.

strategy: SanitizationStrategy

The SanitizationStrategy that was applied.

confidence: float

Estimated confidence that the sanitised prompt is safe ([0.0, 1.0]).

risk_reduction: float

Estimated reduction in risk ([0.0, 1.0]).

to_dict()[source]

Serialise to a plain dictionary.

class promptguard.SanitizeResponse(sanitization, original_analysis, sanitized_analysis, risk_before, risk_after, risk_reduction)[source]

Bases: object

Typed result returned by PromptGuard.sanitize().

sanitization: SanitizationResult

Detailed sanitisation outcome.

original_analysis: RiskScore

RiskScore for the original prompt.

sanitized_analysis: RiskScore | None

RiskScore for the sanitised prompt, or None when analyze_after was False.

risk_before: float

Malicious probability of the original prompt.

risk_after: float | None

Malicious probability after sanitisation, or None.

risk_reduction: float

Difference risk_before - risk_after (0.0 when unchanged).

Enumerations

class promptguard.RiskLevel(*values)[source]

Bases: str, Enum

Categorised risk level returned by the classifier.

LOW = 'low'
MEDIUM = 'medium'
HIGH = 'high'
class promptguard.Intent(*values)[source]

Bases: str, Enum

Detected intent of the analysed prompt.

QUESTION = 'question'
INSTRUCTION = 'instruction'
CONVERSATION = 'conversation'
JAILBREAK = 'jailbreak'
INJECTION = 'injection'
UNKNOWN = 'unknown'
class promptguard.Sentiment(*values)[source]

Bases: str, Enum

Detected sentiment of the analysed prompt.

POSITIVE = 'positive'
NEUTRAL = 'neutral'
NEGATIVE = 'negative'
class promptguard.SanitizationStrategy(*values)[source]

Bases: str, Enum

Strategy controlling how aggressively a prompt is sanitised.

CONSERVATIVE = 'conservative'
BALANCED = 'balanced'
MINIMAL = 'minimal'