Text Signals API¶
PerplexitySignal¶
Bases: BaseSignal
Analyzes text complexity using Perplexity metrics.
This signal assumes that AI-generated text tends to have lower perplexity (more predictable) compared to human-written text.
Mechanism:
1. Tokenize input text using a pretrained tokenizer (e.g., GPT-2).
2. Calculate perplexity using the corresponding language model.
3. Map perplexity to an AI probability score using a logistic function.
   - Low perplexity -> high AI probability.
   - High perplexity -> low AI probability (human).
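The logistic mapping in step 3 can be sketched as follows. This is a minimal illustration using only the standard library; the helper name `perplexity_to_ai_score` and the midpoint/steepness constants are assumptions for illustration, not the library's actual names or defaults:

```python
import math

def perplexity_to_ai_score(ppl: float, midpoint: float = 60.0, steepness: float = 0.15) -> float:
    """Logistic mapping from perplexity to an AI probability.

    midpoint and steepness are illustrative values, not veridex defaults:
    low perplexity -> score near 1.0 (AI), high perplexity -> near 0.0 (human).
    """
    return 1.0 / (1.0 + math.exp(steepness * (ppl - midpoint)))

print(perplexity_to_ai_score(20.0))   # low perplexity  -> close to 1.0 (AI)
print(perplexity_to_ai_score(150.0))  # high perplexity -> close to 0.0 (human)
```

Any monotonically decreasing squashing function works here; the logistic is a common choice because its midpoint and steepness are easy to calibrate.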
Attributes:
| Name | Type | Description |
|---|---|---|
| model_name | str | Name of the underlying model (default: "gpt2"). |
| device | Optional[str] | Device to run model on ('cpu', 'cuda'). |
Source code in veridex/text/perplexity.py
name property¶
Returns 'perplexity'.
dtype property¶
Returns 'text'.
__init__(model_name='gpt2', device=None)¶
Initialize the Perplexity signal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model_name | str | Identifier for the model used to calculate perplexity. | 'gpt2' |
| device | Optional[str] | Device to run model on. | None |
Source code in veridex/text/perplexity.py
run(input_data)¶
Calculate perplexity and convert to an AI score.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_data | str | Text to analyze. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| DetectionResult | DetectionResult | |
Source code in veridex/text/perplexity.py
ZlibEntropySignal¶
Bases: BaseSignal
Detects AI content using compression ratio (zlib entropy).
This method employs a compression-based approach under the hypothesis that AI-generated content is more predictable (lower entropy) and thus more compressible than human content.
Algorithm:
ratio = len(zlib(text)) / len(text)
- Lower ratio (< 0.6) -> highly compressible -> likely AI.
- Higher ratio (> 0.8) -> less compressible -> likely human.
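A minimal sketch of the ratio computation, using only the standard library (the helper name and the sample strings are illustrative, not part of the veridex API):

```python
import zlib

def zlib_ratio(text: str) -> float:
    """Compression ratio: compressed size / raw size, over UTF-8 bytes.

    Illustrative helper, not the veridex implementation.
    """
    raw = text.encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)

repetitive = "the cat sat on the mat. " * 40       # highly predictable
varied = "Quartz glyphs vex bold jumping frogs: wxyz 0193!?"

print(zlib_ratio(repetitive))  # well below 0.6 -> leans AI under this heuristic
print(zlib_ratio(varied))      # noticeably higher -> leans human
```

Note that very short inputs compress poorly regardless of origin, so in practice the heuristic is only meaningful on passages long enough for zlib to find structure.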
Attributes:
| Name | Type | Description |
|---|---|---|
| name | str | 'zlib_entropy' |
| dtype | str | 'text' |
Source code in veridex/text/entropy.py
BinocularsSignal¶
Bases: BaseSignal
Implements the 'Binoculars' Zero-Shot Detection method.
This detection strategy forms a ratio from the perplexities of two models: an 'Observer' model and a 'Performer' model.
Formula:
Score = log(PPL_Observer) / log(PPL_Performer)
Interpretation: If the score is below a certain threshold (typically ~0.90), the text is considered AI-generated. This method is considered state-of-the-art for zero-shot detection.
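Following the formula and threshold above, the scoring rule can be sketched like this (the function names are hypothetical helpers for illustration, not the veridex API; real usage computes the two perplexities with the observer and performer models):

```python
import math

def binoculars_score(ppl_observer: float, ppl_performer: float) -> float:
    """Score = log(PPL_Observer) / log(PPL_Performer), per the formula above."""
    return math.log(ppl_observer) / math.log(ppl_performer)

def is_ai(ppl_observer: float, ppl_performer: float, threshold: float = 0.90) -> bool:
    """Scores below the threshold are classified as AI-generated."""
    return binoculars_score(ppl_observer, ppl_performer) < threshold

# If the observer finds the text easier to predict than the performer does,
# the log-perplexity ratio drops below 1.0 and may cross the threshold:
print(is_ai(math.e ** 1.7, math.e ** 2.0))  # score 0.85 < 0.90 -> True
```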
Reference: "Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text" (arXiv:2401.12070)
Attributes:
| Name | Type | Description |
|---|---|---|
| observer_id | str | HuggingFace ID for the observer model. |
| performer_id | str | HuggingFace ID for the performer model. |
| use_mock | bool | If True, returns dummy results without loading models (for testing). |
Source code in veridex/text/binoculars.py
name property¶
Returns 'binoculars'.
dtype property¶
Returns 'text'.
__init__(observer_id='tiiuae/falcon-7b-instruct', performer_id='tiiuae/falcon-7b', use_mock=False)¶
Initialize the Binoculars signal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| observer_id | str | HuggingFace ID for the observer model. | 'tiiuae/falcon-7b-instruct' |
| performer_id | str | HuggingFace ID for the performer model. | 'tiiuae/falcon-7b' |
| use_mock | bool | If True, returns dummy results without loading models (for testing). | False |
Source code in veridex/text/binoculars.py
run(input_data)¶
Execute Binoculars detection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_data | str | Text to analyze. | required |
Returns:
| Name | Type | Description |
|---|---|---|
DetectionResult |
DetectionResult
|
|
Source code in veridex/text/binoculars.py
DetectGPTSignal¶
Bases: BaseSignal
Implements the DetectGPT zero-shot detection method.
DetectGPT uses the curvature of the model's log-probability function to distinguish human-written text from AI-generated text. The core idea is that AI-generated text occupies regions of negative log-curvature.
Reference: "DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature" (Mitchell et al., 2023).
Algorithm:
1. Compute log-probability of original text log p(x).
2. Generate k perturbed versions x_tilde using a mask-filling model (e.g., T5).
3. Compute log-probability of perturbations log p(x_tilde).
4. Calculate curvature score: log p(x) - mean(log p(x_tilde)).
5. Normalize score to [0, 1] range.
Note: This signal is computationally expensive as it requires loading two LLMs (Base and Perturbation) and running multiple forward passes.
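Once the log-probabilities are in hand, steps 4-5 above reduce to a simple statistic. A sketch with a hypothetical helper (the expensive model calls from steps 1-3 are omitted; the name and the inputs are illustrative):

```python
import statistics

def curvature_score(log_p_original: float, log_p_perturbed: list[float]) -> float:
    """Step 4: log p(x) minus the mean log-probability of the perturbations.

    Hypothetical helper; the signal also normalizes this value to [0, 1]
    (step 5), which is omitted here.
    """
    return log_p_original - statistics.mean(log_p_perturbed)

# AI-generated text tends to sit at a local maximum of log p: perturbing it
# lowers the log-probability noticeably, so the curvature comes out positive.
print(curvature_score(-120.0, [-128.0, -131.0, -126.0]))  # positive -> leans AI
```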
Attributes:
| Name | Type | Description |
|---|---|---|
| base_model_name | str | HuggingFace model ID for probability computation. |
| perturbation_model_name | str | HuggingFace model ID for perturbation (T5). |
| n_perturbations | int | Number of perturbed samples to generate. |
| device | str | Computation device ('cpu', 'cuda'). |
Source code in veridex/text/detectgpt.py
name property¶
Returns 'detect_gpt'.
dtype property¶
Returns 'text'.
__init__(base_model_name='gpt2-medium', perturbation_model_name='t5-base', n_perturbations=10, device=None)¶
Initialize the DetectGPT signal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base_model_name | str | Name of the model to use for computing log-probabilities. Defaults to "gpt2-medium". | 'gpt2-medium' |
| perturbation_model_name | str | Name of the model to use for generating perturbations. Defaults to "t5-base". | 't5-base' |
| n_perturbations | int | Number of perturbed samples to generate. Defaults to 10. | 10 |
| device | Optional[str] | Device to run models on ('cpu' or 'cuda'). If None, auto-detects CUDA. | None |
Source code in veridex/text/detectgpt.py
run(input_data)¶
Analyzes the text using DetectGPT logic.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_data | str | The text to analyze. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| DetectionResult | DetectionResult | Result object containing the curvature-based score. - score: 0.0 (Human) to 1.0 (AI). - metadata: contains 'curvature', 'original_log_prob'. |
Source code in veridex/text/detectgpt.py
TDetectSignal¶
Bases: DetectGPTSignal
Implements T-Detect (West et al., 2025), a robust variant of DetectGPT.
Instead of assuming a Gaussian distribution for perturbations (Z-score), T-Detect uses a Student's t-distribution to better model the heavy-tailed nature of adversarial or non-native text perturbations.
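The key difference from the Gaussian z-score can be sketched by scaling the curvature into a t statistic over the perturbation sample. The helper name and inputs are hypothetical, and veridex's real implementation may differ; converting the statistic into a probability would use the Student's t CDF with n - 1 degrees of freedom, which tolerates heavy-tailed perturbations better than the normal CDF:

```python
import statistics

def t_statistic(log_p_original: float, log_p_perturbed: list[float]) -> float:
    """Curvature scaled by the perturbation sample's standard error.

    Hypothetical helper for illustration, not the veridex API.
    """
    mu = statistics.mean(log_p_perturbed)
    sd = statistics.stdev(log_p_perturbed)   # sample std dev (ddof=1)
    n = len(log_p_perturbed)
    return (log_p_original - mu) / (sd / n ** 0.5)

# Positive t -> original text is more probable than its perturbations -> leans AI.
print(t_statistic(-120.0, [-128.0, -131.0, -126.0]))
```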
Source code in veridex/text/tdetect.py
HumanOODSignal¶
Bases: BaseSignal
Implements a zero-shot "Human Texts Are Outliers" (HumanOOD) detection signal.
This signal treats the LLM's own generations as the "In-Distribution" (ID) class and human texts as outliers (OOD).
It generates N samples from the model to form an ID cluster, then computes the Mahalanobis or Euclidean distance of the input text's embedding from this cluster.
Higher distance = More likely to be Human (Outlier). Lower distance = More likely to be Machine (ID).
Result score is 1.0 - (normalized_distance), so that AI (Low distance) -> 1.0.
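The distance-to-score mapping can be sketched as follows, using the Euclidean variant (the helper name, the toy 2-D embeddings, and the max-distance normalization scheme are assumptions for illustration, not the veridex implementation):

```python
import math

def ood_score(embedding: list[float], cluster: list[list[float]]) -> float:
    """1.0 - normalized distance from the centroid of model-generated samples.

    Euclidean variant; illustrative helper, not the veridex API.
    """
    dim = len(embedding)
    centroid = [sum(v[i] for v in cluster) / len(cluster) for i in range(dim)]
    dist = math.dist(embedding, centroid)
    # Normalize by the farthest in-distribution sample (an assumed scheme),
    # clamping so far-away outliers bottom out at 0.0.
    max_dist = max(math.dist(v, centroid) for v in cluster) or 1.0
    return max(0.0, 1.0 - dist / max_dist)

cluster = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]  # model generations
print(ood_score([1.0, 1.0], cluster))  # at the centroid -> 1.0 (leans AI)
print(ood_score([5.0, 5.0], cluster))  # far outlier     -> 0.0 (leans human)
```

The Mahalanobis variant replaces `math.dist` with a distance weighted by the inverse covariance of the cluster, which accounts for embedding dimensions with unequal spread.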
Source code in veridex/text/human_ood.py