API Reference¶
Base¶
- class tsadmetrics.base.Metric.Metric(name=None, config_file=None, **params)¶
Bases: object
Base class for time series anomaly detection metrics.
This class provides common functionality for metric configuration, including parameter validation from a YAML configuration file and support for a parameter schema defined in each subclass.
- Attributes:
- name (str):
Name of the metric instance.
- params (dict):
Dictionary of parameters used by the metric.
- binary_prediction (bool):
Whether the metric expects binary predictions (True) or continuous scores (False).
- Parameters:
- name (str, optional):
The name of the metric. If not provided, it defaults to the lowercase name of the subclass.
- config_file (str, optional):
Path to a YAML configuration file. Parameters defined in the file under the metric’s name will be loaded automatically.
- **params:
Additional parameters passed directly to the metric. These override those loaded from the configuration file.
- Raises:
- ValueError:
If a required parameter is missing or if the configuration file is not found.
- TypeError:
If a parameter does not match its expected type as defined in the schema.
- _compute(y_true, y_pred)¶
Compute the value of the metric (core implementation).
This method contains the actual logic of the metric and must be implemented by subclasses. It is automatically called by compute() after input validation.
- Parameters:
- y_true (array-like):
Ground truth binary labels.
- y_pred (array-like):
Predicted binary labels, or continuous anomaly scores when binary_prediction is False.
- Returns:
float: The value of the metric.
- Raises:
NotImplementedError: If the method is not overridden by a subclass.
- _validate_inputs(y_true, y_pred)¶
Validate that y_true and y_pred are valid sequences of the same length.
- If binary_prediction = True:
Both y_true and y_pred must be binary (0 or 1).
- If binary_prediction = False:
y_true must be binary (0 or 1), y_pred can be continuous values.
- Raises:
ValueError: If lengths differ or values are not valid.
TypeError: If inputs are not array-like.
- compute(y_true, y_pred)¶
Compute the value of the metric (wrapper method).
This method performs input validation and then calls the internal _compute() method, which contains the actual metric logic.
Important: Subclasses should not override this method. Instead, implement _compute() to define the behavior of the metric.
- Parameters:
- y_true (array-like):
Ground truth binary labels.
- y_pred (array-like):
Predicted binary labels, or continuous anomaly scores when binary_prediction is False.
- Returns:
float: The value of the metric.
- Raises:
- NotImplementedError:
If _compute() is not implemented by the subclass.
- configure(config_file=None, **params)¶
Load and validate metric parameters from a YAML configuration file and/or from explicit keyword arguments.
- Parameters:
- config_file (str, optional):
Path to the configuration file. If provided, it will load parameters under the section with the metric’s name.
- **params:
Parameters passed directly to the metric instance.
- Raises:
- ValueError:
If a required parameter is not specified or the configuration file is missing.
- TypeError:
If a parameter value does not match the expected type.
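The relationship between compute(), _validate_inputs() and _compute() described above can be illustrated with a small stand-alone sketch. The classes SketchMetric and SketchAccuracy are hypothetical and do not reproduce the tsadmetrics source; they only mirror the documented behaviour (validate first, then delegate to the subclass).

```python
class SketchMetric:
    """Hypothetical stand-in for the documented base class (not tsadmetrics code)."""

    binary_prediction = True

    def compute(self, y_true, y_pred):
        # Wrapper: validate first, then delegate to the subclass logic.
        self._validate_inputs(y_true, y_pred)
        return self._compute(y_true, y_pred)

    def _validate_inputs(self, y_true, y_pred):
        try:
            n_true, n_pred = len(y_true), len(y_pred)
        except TypeError as exc:
            raise TypeError("inputs must be array-like") from exc
        if n_true != n_pred:
            raise ValueError("y_true and y_pred must have the same length")
        if any(v not in (0, 1) for v in y_true):
            raise ValueError("y_true must be binary (0 or 1)")
        if self.binary_prediction and any(v not in (0, 1) for v in y_pred):
            raise ValueError("y_pred must be binary (0 or 1)")

    def _compute(self, y_true, y_pred):
        raise NotImplementedError("subclasses must implement _compute()")


class SketchAccuracy(SketchMetric):
    """Toy subclass: fraction of points where the prediction equals the label."""

    def _compute(self, y_true, y_pred):
        return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

A subclass only overrides _compute(); calling SketchAccuracy().compute(...) runs the shared validation before the metric logic.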
Evaluation¶
- class tsadmetrics.evaluation.Report.Report¶
Bases: object
- generate_report(results, output_file)¶
Generate a report from the evaluation results.
- Parameters:
- results (dict):
Dictionary containing evaluation results.
- output_file (str):
Path to the output file where the report will be saved.
- class tsadmetrics.evaluation.Runner.Runner(dataset_evaluations, metrics=None)¶
Bases: object
Orchestrates the evaluation of datasets using a set of metrics.
The Runner class provides functionality to:
Load datasets from direct data, file references, or a global YAML configuration file.
Load metrics either directly from a list or from a configuration file.
Evaluate all datasets against all metrics.
Optionally generate a report summarizing the evaluation results.
- Parameters:
- dataset_evaluations (list or str):
Accepted formats:
Global config file (str): If a string is provided and metrics is None, it is assumed to be the path to a configuration file that defines both datasets and metrics.
Direct data (list of tuples). Example:
[
    ("dataset1", y_true1, (y_pred_binary1, y_pred_continuous1)),
    ("dataset2", y_true2, (y_pred_binary2, y_pred_continuous2)),
    ("dataset3", y_true3, y_pred3)
]
where y_pred may be binary or continuous.
File references (list of tuples). Example:
[
    ("dataset1", "result1.csv"),
    ("dataset2", "result2.csv")
]
Each file must contain:
y_true
Either (y_pred_binary and y_pred_continuous) or (y_pred)
- metrics (list or str, optional):
List of metrics: Each element is a tuple of the form (metric_name, {param_name: value, …}).
Example:
[ ("pwf", {"beta": 1.0}), ("rpate", {"alpha": 0.5}), ("adc", {}) ]
Config file (str): Path to a YAML file containing metric definitions.
- Attributes:
- dataset_evaluations (list):
Loaded datasets in normalized format: (name, y_true, y_pred_binary, y_pred_continuous, y_pred)
- metrics (list):
List of metrics with their configurations.
- Raises:
- ValueError:
If a configuration file is invalid or required fields are missing.
- run(generate_report=False, report_file='evaluation_report.csv')¶
Run the evaluation for all datasets and metrics.
- Parameters:
- generate_report (bool, optional):
If True, generates a report of the evaluation results. Defaults to False.
- report_file (str, optional):
Path where the report will be saved if generate_report is True. Defaults to “evaluation_report.csv”.
- Returns:
- pd.DataFrame:
DataFrame structured as follows:
The first row contains the parameters of each metric.
The subsequent rows contain the metric values for each dataset.
The index column represents the dataset names, with the first row labeled as ‘params’.
Example:
dataset   | metric1        | metric2
----------|----------------|--------
params    | {'param1':0.2} | {}
dataset1  | 0.5            | 1.0
dataset2  | 0.125          | 1.0
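The normalized dataset format listed under Attributes, (name, y_true, y_pred_binary, y_pred_continuous, y_pred), can be sketched for the direct-data input as follows. The helper normalize_entry is hypothetical, and filling unused slots with None is an assumption; the reference does not state what the Runner stores there. The sketch also assumes arrays are passed as lists or array objects, not as tuples.

```python
def normalize_entry(entry):
    """Map a direct-data tuple to
    (name, y_true, y_pred_binary, y_pred_continuous, y_pred).

    Assumption: slots that do not apply are filled with None.
    """
    name, y_true, preds = entry
    if isinstance(preds, tuple) and len(preds) == 2:
        # A (y_pred_binary, y_pred_continuous) pair was supplied.
        y_pred_binary, y_pred_continuous = preds
        return (name, y_true, y_pred_binary, y_pred_continuous, None)
    # A single prediction array, which may be binary or continuous.
    return (name, y_true, None, None, preds)


datasets = [
    ("dataset1", [0, 1, 1], ([0, 1, 0], [0.1, 0.9, 0.4])),
    ("dataset3", [0, 0, 1], [0.2, 0.1, 0.8]),
]
normalized = [normalize_entry(d) for d in datasets]
```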
Metrics¶
Registry¶
- class tsadmetrics.metrics.Registry.Registry¶
Bases: object
Central registry for anomaly detection metrics.
This class provides a centralized interface to register, retrieve, and load metric classes for anomaly detection tasks.
- classmethod available_metrics()¶
List all registered metric names.
- Returns:
list[str]: A list of registered metric names.
- classmethod get_metric(name: str, **params) → Metric¶
Retrieve and instantiate a registered metric by name.
- Args:
name (str): Name of the metric to retrieve.
**params: Parameters to initialize the metric instance.
- Returns:
Metric: An instance of the requested metric.
- Raises:
ValueError: If the metric name is not registered.
- classmethod load_metrics_from_file(filepath: str)¶
Load and instantiate metrics from a YAML configuration file.
- Args:
filepath (str): Path to the YAML configuration file.
- Returns:
list[tuple[str, dict]]: A list of tuples containing the metric name and the parameters used to instantiate it.
- classmethod load_metrics_info_from_file(filepath: str)¶
Load metric definitions (names and parameters) from a YAML configuration file.
- Args:
filepath (str): Path to the YAML file.
- Returns:
list[tuple[str, dict]]: A list of tuples containing the metric name and its parameters, e.g. [("metric_name", {"param1": value, ...}), ...].
- Raises:
ValueError: If the YAML file contains invalid entries or unsupported format.
- classmethod register(metric_cls: Type[Metric])¶
Register a metric class using its name attribute.
- Args:
metric_cls (Type[Metric]): The metric class to register. The class must define a name attribute.
- Raises:
ValueError: If the metric class does not define a name attribute or if a metric with the same name is already registered.
- tsadmetrics.metrics.Registry.auto_register()¶
Automatically register all subclasses of Metric found in the project.
This function inspects the current inheritance tree of Metric and registers each subclass in the central registry.
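The register / get_metric / available_metrics contract documented for the Registry can be illustrated with a minimal stand-alone sketch. SketchRegistry and ToyMetric are hypothetical names; the code below is not the tsadmetrics implementation and only reproduces the documented error conditions.

```python
class SketchRegistry:
    """Hypothetical minimal registry; not the tsadmetrics implementation."""

    _metrics = {}

    @classmethod
    def register(cls, metric_cls):
        name = getattr(metric_cls, "name", None)
        if not name:
            raise ValueError("metric class must define a name attribute")
        if name in cls._metrics:
            raise ValueError(f"metric '{name}' is already registered")
        cls._metrics[name] = metric_cls

    @classmethod
    def available_metrics(cls):
        return sorted(cls._metrics)

    @classmethod
    def get_metric(cls, name, **params):
        if name not in cls._metrics:
            raise ValueError(f"metric '{name}' is not registered")
        # Instantiate the registered class with the given parameters.
        return cls._metrics[name](**params)


class ToyMetric:
    name = "toy"

    def __init__(self, **params):
        self.params = params


SketchRegistry.register(ToyMetric)
```

Because names are the lookup key, registering two classes with the same name is rejected, which is why each metric below has a distinct fixed identifier.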
Metric Types¶
Single-Point Based Metrics (SPM)¶
These metrics evaluate predictions by considering each point independently, without taking into account the temporal context in which anomalies occur. In other words, they treat each instant in isolation, ignoring the continuity or structure of anomalies over time.
- class tsadmetrics.metrics.spm.DiceCoefficient(**kwargs)¶
Bases: Metric
Calculate the Dice Coefficient for anomaly detection in time series.
The Dice Coefficient is a similarity measure between the predicted and ground-truth binary anomaly segments. It is mathematically equivalent to the F1-score but derived from a set-theoretic perspective. The metric quantifies the overlap between the predicted anomalies and the actual anomalies, taking values between 0 and 1, where 1 indicates perfect agreement.
The Dice Coefficient is defined as:
\[\mathrm{DiceCoefficient} = \frac{2 \cdot TP}{2 \cdot TP + FP + FN}\]
where:
\(TP\) is the number of true positives (correctly detected anomaly points),
\(FP\) is the number of false positives (incorrectly predicted anomalies),
\(FN\) is the number of false negatives (missed true anomalies).
- Notes:
The Dice Coefficient is symmetric: swapping prediction and ground truth yields the same result.
If both y_true and y_pred are all zeros (no anomalies), the metric returns 1.0 to represent perfect agreement in the absence of anomalies.
- Reference:
- For more information, see the original paper:
https://www.sciencedirect.com/science/article/pii/S0094576522003162
- Attributes:
- name (str):
Fixed name identifier for this metric: “dicec”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
(none)
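As a worked illustration of the formula and the all-zeros convention noted above, the following stand-alone function computes the Dice Coefficient from point-wise counts. The function name dice_coefficient is hypothetical and the code is a sketch, not the library's implementation.

```python
def dice_coefficient(y_true, y_pred):
    """Dice = 2*TP / (2*TP + FP + FN) on binary point-wise labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    if denom == 0:
        # No anomalies in either sequence: perfect agreement, as noted above.
        return 1.0
    return 2 * tp / denom
```

For y_true = [1, 1, 0, 0] and y_pred = [1, 0, 1, 0], TP = FP = FN = 1, so the score is 2 / 4 = 0.5.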
- class tsadmetrics.metrics.spm.PointwiseAucPr(**kwargs)¶
Bases: Metric
Point-wise Area Under the Precision-Recall Curve (AUC-PR) for anomaly detection.
This metric computes the standard Area Under the Precision-Recall Curve (AUC-PR) in a point-wise manner. Each time-series data point is treated independently when calculating precision and recall, making this suitable for anomaly detection tasks where anomalies are labeled at the individual point level.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- Attributes:
- name (str):
Fixed name identifier for this metric: “pw_auc_pr”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
- class tsadmetrics.metrics.spm.PointwiseAucRoc(**kwargs)¶
Bases: Metric
Point-wise Area Under the Receiver Operating Characteristic Curve (AUC-ROC) for anomaly detection.
This metric computes the standard Area Under the ROC Curve (AUC-ROC) in a point-wise manner. Each time-series data point is treated independently when calculating the true positive rate and false positive rate. It is widely used to evaluate the ability of anomaly scoring functions to distinguish between normal and anomalous points.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- Attributes:
- name (str):
Fixed name identifier for this metric: “pw_auc_roc”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
- class tsadmetrics.metrics.spm.PointwiseFScore(**kwargs)¶
Bases: Metric
Point-wise F-score for anomaly detection in time series.
This metric computes the classical F-score without considering temporal context, treating each time-series point independently. It balances precision and recall according to the configurable parameter beta.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- Attributes:
- name (str):
Fixed name identifier for this metric: “pwf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score.
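A minimal sketch of the point-wise F-score, assuming the standard F-beta definition \(F_\beta = (1 + \beta^2) \cdot P \cdot R / (\beta^2 \cdot P + R)\), in which beta > 1 favours recall. The function name pointwise_f_score is hypothetical, and returning 0.0 when there are no true positives is an assumption rather than documented library behaviour.

```python
def pointwise_f_score(y_true, y_pred, beta=1.0):
    """Standard F-beta computed point by point on binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # assumption: undefined precision/recall scored as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

With precision 1.0 and recall 0.5, beta = 1 gives 2/3, while beta = 2 gives 5/9 because the lower recall weighs more heavily.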
- class tsadmetrics.metrics.spm.PrecisionAtK(**kwargs)¶
Bases: Metric
Precision at K (P@K) for anomaly detection in time series.
This metric evaluates how many of the top-k points with the highest anomaly scores correspond to true anomalies. It is particularly useful when focusing on identifying the most anomalous points rather than setting a global threshold.
By definition, k is automatically set to the number of true anomalies present in y_true.
\[k = \sum(y\_true)\]
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- Attributes:
- name (str):
Fixed name identifier for this metric: “pak”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
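The P@K definition above, with k set to the number of true anomalies, can be sketched as follows. The function name precision_at_k is hypothetical; tie-breaking by original order and returning 0.0 when y_true has no anomalies are assumptions made for this illustration.

```python
def precision_at_k(y_true, scores):
    """Fraction of the k highest-scoring points that are true anomalies,
    where k is the number of anomalies in y_true."""
    k = sum(y_true)
    if k == 0:
        return 0.0  # assumption: no anomalies to retrieve
    # Indices sorted by descending score; Python's sort is stable,
    # so ties keep their original order (an assumption of this sketch).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_k = ranked[:k]
    return sum(y_true[i] for i in top_k) / k
```

For y_true = [0, 1, 0, 1] and scores = [0.1, 0.9, 0.8, 0.2], k = 2 and the top two points are indices 1 and 2, of which one is a true anomaly, giving 0.5.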
Temporal Evaluation Metrics (TEM)¶
This category includes metrics that incorporate temporal context in the evaluation process. They consider not only whether an anomaly was detected, but also when and how it occurred relative to the original sequence. These metrics are specifically designed for time series anomaly detection and are particularly useful for analyzing one or more properties of the model related to the temporal structure of anomalies, such as their duration, anticipation, coverage, or overlap.
Tolerant Partial Detection Metrics (TPDM)¶
These metrics consider a predicted anomaly valid if it occurs at any point within the interval of a real anomaly. They assume that partial detection is sufficient to signal a potential anomaly, allowing further verification.
- class tsadmetrics.metrics.tem.tpdm.BalancedPointadjustedFScore(**kwargs)¶
Bases: Metric
Balanced point-adjusted F-score for anomaly detection in time series.
This metric modifies the standard F-score by applying a temporal adjustment: for each ground-truth anomalous segment, if at least one point is predicted as anomalous, the entire segment is considered correctly detected. Additionally, for each false positive point at time t, all points in the range [t - floor(w/2), t + floor(w/2)] are set to 1, which can generate additional false positives. The adjusted predictions are then compared to the ground-truth labels using the standard point-wise F-score formula.
- Reference:
- For more information, see the original paper:
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=10890568
- Attributes:
- name (str):
Fixed name identifier for this metric: “bpaf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- beta (float):
Weight factor for recall in the F-score (default = 1.0).
- w (int):
Temporal window size for expanding each false positive point.
- class tsadmetrics.metrics.tem.tpdm.CompositeFScore(**kwargs)¶
Bases: Metric
Composite F-score for anomaly detection in time series.
This metric combines aspects of the point-wise F-score and the segment-wise F-score. It is defined as the harmonic mean of point-wise precision and segment-wise recall. Using point-wise precision ensures that false positives are properly penalized, a limitation often found in purely segment-wise metrics.
- Reference:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “cf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- class tsadmetrics.metrics.tem.tpdm.PointadjustedAucPr(**kwargs)¶
Bases: Metric
Point-adjusted Area Under the Precision-Recall Curve (AUC-PR) for anomaly detection.
Unlike the standard point-wise AUC-PR, this variant uses a point-adjusted evaluation:
Each anomalous segment in y_true is considered correctly detected if at least one point within that segment is predicted as anomalous.
Once a segment is detected, all its points are marked as detected in the adjusted prediction.
This adjustment accounts for the fact that detecting any part of an anomalous segment is often sufficient in practice.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- Attributes:
- name (str):
Fixed name identifier for this metric: “pa_auc_pr”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
- class tsadmetrics.metrics.tem.tpdm.PointadjustedAucRoc(**kwargs)¶
Bases: Metric
Point-adjusted Area Under the ROC Curve (AUC-ROC) for anomaly detection in time series.
Unlike standard point-wise AUC-ROC, this metric applies point-adjusted evaluation:
Each anomalous segment in y_true is considered correctly detected if at least one point within that segment is predicted as anomalous.
Once a segment is detected, all its points are marked as detected in the adjusted predictions.
Adjusted predictions are then used to compute true positive rate (TPR) and false positive rate (FPR) at multiple thresholds to construct the ROC curve.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- Attributes:
- name (str):
Fixed name identifier for this metric: “pa_auc_roc”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
- Raises:
- ValueError:
If y_true and y_pred have mismatched lengths.
- TypeError:
If inputs are not array-like.
- class tsadmetrics.metrics.tem.tpdm.PointadjustedFScore(**kwargs)¶
Bases: Metric
Point-adjusted F-score for anomaly detection in time series.
This metric modifies the standard F-score by applying a temporal adjustment to the predictions:
For each ground-truth anomalous segment, if at least one point is predicted as anomalous, all points in that segment are considered correctly detected.
The adjusted predictions are then compared to the ground-truth labels using the standard point-wise F-score formula.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “paf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- beta (float):
Weight factor for recall in the F-score (default = 1.0).
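The point adjustment step described above can be sketched as a stand-alone transformation of the predictions. The helpers find_segments and point_adjust are hypothetical names for this illustration and are not taken from the library; the adjusted output would then be scored with a point-wise F-score.

```python
def find_segments(labels):
    """Return (start, end) index pairs, end inclusive, for each run of 1s."""
    segments, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(labels) - 1))
    return segments


def point_adjust(y_true, y_pred):
    """Mark a whole ground-truth segment as detected if any of its
    points is predicted as anomalous; other predictions are unchanged."""
    adjusted = list(y_pred)
    for start, end in find_segments(y_true):
        if any(y_pred[i] == 1 for i in range(start, end + 1)):
            for i in range(start, end + 1):
                adjusted[i] = 1
    return adjusted
```

For y_true = [0, 1, 1, 1, 0, 0, 1, 1] and y_pred = [0, 0, 1, 0, 0, 1, 0, 0], the first segment is filled in and the second stays undetected, giving [0, 1, 1, 1, 0, 1, 0, 0]; the false positive at index 5 is kept.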
- class tsadmetrics.metrics.tem.tpdm.RangebasedFScore(**kwargs)¶
Bases: Metric
Range-based F-score for anomaly detection in time series.
This metric evaluates anomaly detection performance over temporal ranges, combining range-based precision and recall into a harmonic mean. It accounts for positional bias, existence and overlap rewards, and cardinality penalties, allowing fine-grained control over missed detections and false alarms.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “rbf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- p_alpha (float):
Relative importance of existence reward for precision (0 <= p_alpha <= 1).
- r_alpha (float):
Relative importance of existence reward for recall (0 <= r_alpha <= 1).
- p_bias (str):
Positional bias for precision (“flat”, “front”, “middle”, “back”).
- r_bias (str):
Positional bias for recall (“flat”, “front”, “middle”, “back”).
- cardinality_mode (str, optional):
Cardinality factor type (“one”, “reciprocal”, “gamma”).
- beta (float):
Weight of recall relative to precision in the F-score. Default = 1.
- class tsadmetrics.metrics.tem.tpdm.SegmentwiseFScore(**kwargs)¶
Bases: Metric
Segment-wise F-score for anomaly detection in time series.
This metric computes the F-score at the segment level rather than point-wise. Each contiguous segment of anomalies in the ground truth is treated as a unit.
- True positive (TP): at least one predicted anomaly within a ground-truth segment.
- False negative (FN): no predicted anomaly in a ground-truth segment.
- False positive (FP): predicted segment with no overlap with any ground-truth segment.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “swf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- beta (float):
Weight of recall relative to precision in the F-score. Default is 1.0 (balanced F1-score).
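The segment-level TP, FN and FP definitions above can be sketched as follows. The function names are hypothetical, the F-beta combination assumes the standard formula, and returning 0.0 when no segment is detected is an assumption of this illustration.

```python
def find_segments(labels):
    """Return (start, end) index pairs, end inclusive, for each run of 1s."""
    segments, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(labels) - 1))
    return segments


def segmentwise_counts(y_true, y_pred):
    """Count segment-level TP, FN and FP as defined above."""
    true_segs = find_segments(y_true)
    pred_segs = find_segments(y_pred)
    # TP: ground-truth segments containing at least one predicted point.
    tp = sum(1 for s, e in true_segs
             if any(y_pred[i] == 1 for i in range(s, e + 1)))
    fn = len(true_segs) - tp
    # FP: predicted segments with no overlap with any ground-truth point.
    fp = sum(1 for s, e in pred_segs
             if not any(y_true[i] == 1 for i in range(s, e + 1)))
    return tp, fn, fp


def segmentwise_f_score(y_true, y_pred, beta=1.0):
    tp, fn, fp = segmentwise_counts(y_true, y_pred)
    if tp == 0:
        return 0.0  # assumption: no detected segment scores 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```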
Precise Temporal Detection Metrics (PTDM)¶
These metrics require the predicted anomaly to cover a significant portion of the real anomaly’s duration. They are stricter about temporal accuracy than other types and value precise detection over partial alignment.
- class tsadmetrics.metrics.tem.ptdm.AverageDetectionCount(**kwargs)¶
Bases: Metric
Calculate average detection count for anomaly detection in time series.
This metric computes, for each ground-truth anomalous segment, the percentage of points within that segment that are predicted as anomalous. It then averages these percentages across all true anomaly events, providing an estimate of detection coverage per event.
- Reference:
- Implementation based on:
- Attributes:
- name (str):
Fixed name identifier for this metric: “adc”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
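The per-segment coverage average described above can be sketched as follows. The function names are hypothetical; expressing coverage as a fraction in [0, 1] instead of a percentage, and returning 0.0 when y_true has no anomalous segment, are assumptions of this illustration.

```python
def find_segments(labels):
    """Return (start, end) index pairs, end inclusive, for each run of 1s."""
    segments, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(labels) - 1))
    return segments


def average_detection_count(y_true, y_pred):
    """Mean, over ground-truth segments, of the fraction of each
    segment's points that are predicted as anomalous."""
    segments = find_segments(y_true)
    if not segments:
        return 0.0  # assumption: nothing to detect
    coverages = [
        sum(1 for i in range(s, e + 1) if y_pred[i] == 1) / (e - s + 1)
        for s, e in segments
    ]
    return sum(coverages) / len(coverages)
```

For y_true = [0, 1, 1, 1, 1, 0, 1, 1] and y_pred = [0, 1, 1, 1, 0, 0, 0, 1], the segment coverages are 3/4 and 1/2, so the average is 0.625.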
- class tsadmetrics.metrics.tem.ptdm.DetectionAccuracyInRange(**kwargs)¶
Bases: Metric
Calculate detection accuracy in range for anomaly detection in time series.
This metric measures the proportion of predicted anomaly events that correspond to true anomalies. It is defined as:
\[\text{DAIR} = \frac{EM + DA}{EM + DA + FA}\]
Where:
- EM (Exact Match):
Number of predicted anomaly segments that exactly match a true anomaly segment.
- DA (Detected Anomaly):
Number of true anomaly points not exactly matched where at least one prediction falls within a window [i-k, i+k] around the true point index i or within the true segment range.
- FA (False Anomaly):
Number of predicted anomaly segments that do not overlap any true anomaly segment even within a k-step tolerance window around true points.
- Reference:
- For more information, see the original paper:
https://acta.sapientia.ro/content/docs/evaluation-metrics-for-anomaly-detection.pdf
- Attributes:
- name (str):
Fixed name identifier for this metric: “dair”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- k (int):
Half-window size for tolerance around each true anomaly point. A prediction within k time steps of a true point counts toward detection.
- class tsadmetrics.metrics.tem.ptdm.PointadjustedAtKFScore(**kwargs)¶
Bases: Metric
Calculate the Point-adjusted at K% F-score for anomaly detection in time series.
This metric extends the standard Point-adjusted F-Score by introducing a minimum coverage threshold K for each anomalous segment. For every ground-truth anomalous segment, if at least K% of its points are predicted as anomalous, the entire segment is marked as detected (all points are marked positive in the adjusted prediction). Otherwise, the segment remains unchanged, preserving only the correctly detected points for point-level evaluation.
In this way, partial detections below the threshold contribute proportionally at the point level, but not at the segment level.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “pakf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- k (float):
The minimum percentage of the anomaly that must be detected to consider the anomaly as detected.
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
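The K% adjustment described above can be sketched as a transformation of the predictions. The function names are hypothetical, and treating k as a percentage in [0, 100] with an inclusive threshold ("at least K%") is an assumption based on the wording of this page; the library may expect a different scale.

```python
def find_segments(labels):
    """Return (start, end) index pairs, end inclusive, for each run of 1s."""
    segments, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(labels) - 1))
    return segments


def point_adjust_at_k(y_true, y_pred, k):
    """Fill a ground-truth segment only when at least k percent of its
    points are predicted as anomalous (k assumed to be in [0, 100])."""
    adjusted = list(y_pred)
    for s, e in find_segments(y_true):
        length = e - s + 1
        detected = sum(1 for i in range(s, e + 1) if y_pred[i] == 1)
        if 100.0 * detected / length >= k:
            for i in range(s, e + 1):
                adjusted[i] = 1
    return adjusted
```

With y_true = [0, 1, 1, 1, 1, 0, 1, 1] and y_pred = [0, 1, 1, 0, 0, 0, 1, 0], both segments are exactly 50% covered: k = 50 fills both segments, while k = 60 leaves the predictions unchanged.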
- class tsadmetrics.metrics.tem.ptdm.PointadjustedAtKLFScore(**kwargs)¶
Bases: Metric
Calculates the point-adjusted at K% F-score with a tolerance window l for anomaly detection in time series.
It extends the standard F-score by applying a temporal adjustment: if at least K% of the points within an anomalous segment are predicted as anomalous, the entire segment is considered correctly detected. Additionally, a tolerance window of size l is applied around each predicted positive point, so that points within ±l positions of a true anomaly are also counted as detected, making the metric more robust to small temporal misalignments.
- Reference:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “paklf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- k (float):
The minimum percentage of the anomaly that must be detected to consider the anomaly as detected.
- l (int):
The tolerance window (in time steps) applied around each predicted positive point. Points within ±l distance of a true anomaly are treated as correctly detected. Default is 0, meaning no tolerance.
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
- class tsadmetrics.metrics.tem.ptdm.TimeseriesAwareFScore(**kwargs)¶
Bases: Metric
Calculate time series aware F-score for anomaly detection in time series.
This metric is based on the range_based_f_score, but introduces two key modifications. First, a predicted anomalous segment is only counted as a true positive if it covers at least a fraction \({\theta}\) of the ground‑truth anomaly range. Second, each labeled anomaly is extended by a tolerance window of length \({\delta}\) at its end, within which any overlap contribution decays linearly from full weight down to zero. Unlike the original range-based formulation, this variant omits cardinality and positional bias terms, focusing solely on overlap fraction and end‑tolerance decay.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “taf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- y_true (np.array):
The ground truth binary labels for the time series data.
- y_pred (np.array):
The predicted binary labels for the time series data.
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
- alpha (float):
Relative importance of the existence reward versus overlap reward (\({0 \leq \alpha \leq 1}\)).
- delta (float):
Tolerance window length at the end of each true anomaly segment.
If past_range is True, \({\delta}\) must be a float in (0, 1], representing the fraction of the segment’s length to extend. E.g., \({\delta}\) = 0.5 extends a segment of length 10 by 5 time steps.
If past_range is False, \({\delta}\) must be a non-negative integer, representing an absolute number of time steps to extend each segment.
- theta (float):
Minimum fraction (\({ 0 \leq \theta \leq 1}\)) of the true anomaly range that must be overlapped by predictions for the segment to count as detected.
- past_range (bool):
Determines how \({\delta}\) is interpreted.
True: \({\delta}\) is treated as a fractional extension of each segment’s length.
False: \({\delta}\) is treated as an absolute number of time steps.
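The two interpretations of \({\delta}\) can be illustrated with a small helper that converts it into a number of time steps. The function tolerance_steps is hypothetical, and truncating fractional results toward zero is an assumption; the reference does not state how the library rounds.

```python
def tolerance_steps(segment_length, delta, past_range):
    """Number of time steps by which a true segment is extended."""
    if past_range:
        # delta is a fraction of the segment's length, in (0, 1].
        if not 0 < delta <= 1:
            raise ValueError("delta must be in (0, 1] when past_range is True")
        return int(segment_length * delta)  # assumption: truncate
    # delta is an absolute, non-negative number of time steps.
    if delta < 0 or int(delta) != delta:
        raise ValueError("delta must be a non-negative integer "
                         "when past_range is False")
    return int(delta)
```

For a segment of length 10, delta = 0.5 with past_range=True yields 5 steps, matching the example in the parameter description, while delta = 3 with past_range=False yields 3 steps regardless of segment length.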
- class tsadmetrics.metrics.tem.ptdm.TotalDetectedInRange(**kwargs)¶
Bases: Metric
Calculate total detected in range for anomaly detection in time series.
This metric measures the proportion of true anomaly events that are correctly detected. It is defined as:
\[\text{TDIR} = \frac{EM + DA}{EM + DA + MA}\]
Where:
- EM (Exact Match):
Number of predicted anomaly segments that exactly match a true anomaly segment.
- DA (Detected Anomaly):
Number of true anomaly points not exactly matched where at least one prediction falls within a window [i-k, i+k] around the true point index i or within the true segment range.
- MA (Missed Anomaly):
Number of true anomaly segments that do not overlap any predicted anomaly segment even within a k-step tolerance window around true points.
- Reference:
- For more information, see the original paper:
https://acta.sapientia.ro/content/docs/evaluation-metrics-for-anomaly-detection.pdf
- Attributes:
- name (str):
Fixed name identifier for this metric: “tdir”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- k (int):
Half-window size for tolerance around each true anomaly point. A prediction within k time steps of a true point counts toward detection.
- class tsadmetrics.metrics.tem.ptdm.WeightedDetectionDifference(**kwargs)¶
Bases: Metric
Calculate weighted detection difference for anomaly detection in time series.
For each true anomaly segment, each point in the segment is assigned a weight based on a Gaussian function centered at the segment’s midpoint: points closer to the center receive higher weights, which decay with distance according to the standard deviation sigma. These weights form the basis for scoring both correct detections and false alarms.
WS (Weighted Sum) is defined as the sum of Gaussian weights for all predicted anomaly points that fall within any true anomaly segment (extended by delta time steps at the ends). WF (False Alarm Weight) is the sum of Gaussian weights for all predicted anomaly points that do not overlap any true anomaly segment (within the same extension).
The final score is:
\[\text{WDD} = \text{WS} - \text{WF} \cdot \text{FA}\]
Where:
- WS:
Sum of Gaussian weights for all predicted anomaly points that fall within any true anomaly segment (extended by delta time steps at the ends).
- WF:
Sum of Gaussian weights for all predicted anomaly points that do not overlap any true anomaly segment (within the same extension).
- FA (False Anomaly):
Number of predicted anomaly segments that do not overlap any true anomaly segment even within a k-step tolerance window around true points.
- Reference:
- For more information, see the original paper:
https://acta.sapientia.ro/content/docs/evaluation-metrics-for-anomaly-detection.pdf
- Attributes:
- name (str):
Fixed name identifier for this metric: “wdd”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- k (int):
The maximum number of time steps within which an anomaly must be predicted to be considered detected.
Temporal Matching Evaluation Metrics (TMEM)¶
These metrics measure how well real and predicted anomalies are aligned, penalizing temporal deviations in start, duration, or end of the events.
- class tsadmetrics.metrics.tem.tmem.AbsoluteDetectionDistance(**kwargs)¶
Bases: Metric
Calculate absolute detection distance for anomaly detection in time series.
This metric computes, for each predicted anomaly point that overlaps a ground-truth anomaly segment, the relative distance from that point to the temporal center of the corresponding segment. It then sums all those distances and divides by the total number of such matching predicted points, yielding the mean distance to segment centers for correctly detected points.
- Reference:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “add”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- class tsadmetrics.metrics.tem.tmem.EnhancedTimeseriesAwareFScore(**kwargs)¶
Bases:
Metric
Calculate enhanced time series aware F-score for anomaly detection in time series.
This metric is similar to the range-based F-score in that it accounts for both detection existence and overlap proportion. Additionally, it requires that a significant fraction \({\theta_r}\) of each true anomaly segment be detected, and that a significant fraction \({\theta_p}\) of each predicted segment overlaps with the ground truth. Finally, F-score contributions from each event are weighted by the square root of the true segment’s length, providing a compromise between point-wise and segment-wise approaches.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “etaf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- theta_p (float):
Minimum fraction (\({0 \leq \theta_p \leq 1}\)) of a predicted segment that must be overlapped by ground truth to count as detected.
- theta_r (float):
Minimum fraction (\({0 \leq \theta_r \leq 1}\)) of a true segment that must be overlapped by predictions to count as detected.
- class tsadmetrics.metrics.tem.tmem.TemporalDistance(**kwargs)¶
Bases:
Metric
Calculate temporal distance for anomaly detection in time series.
This metric computes the sum of the distances from each labelled anomaly point to the closest predicted anomaly point, and from each predicted anomaly point to the closest labelled anomaly point.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “td”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- distance (int):
The distance type used for the temporal distance calculation:
- 0: Euclidean distance
- 1: Squared Euclidean distance
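The description of TemporalDistance above can be illustrated with a minimal pure-Python sketch. This is not the library's implementation: the function name `temporal_distance` is hypothetical, and returning infinity when either sequence contains no anomaly is an assumption made here because the text does not specify that case.

```python
def temporal_distance(y_true, y_pred, distance=0):
    """Sum of nearest-neighbour distances between labelled and predicted points.

    distance=0 uses the absolute index difference (Euclidean in 1-D),
    distance=1 uses the squared difference.
    """
    true_idx = [i for i, v in enumerate(y_true) if v == 1]
    pred_idx = [i for i, v in enumerate(y_pred) if v == 1]

    # Assumption: with no labelled or no predicted points there is no
    # "closest point", so this sketch returns infinity.
    if not true_idx or not pred_idx:
        return float("inf")

    def gap(a, b):
        d = abs(a - b)
        return d if distance == 0 else d * d

    # Labelled point -> closest predicted point
    total = sum(min(gap(t, p) for p in pred_idx) for t in true_idx)
    # Predicted point -> closest labelled point
    total += sum(min(gap(p, t) for t in true_idx) for p in pred_idx)
    return total


y_true = [0, 0, 1, 1, 0, 0, 0, 0]
y_pred = [0, 0, 0, 1, 0, 0, 1, 0]
print(temporal_distance(y_true, y_pred, distance=0))  # 4
print(temporal_distance(y_true, y_pred, distance=1))  # 10
```

In the example, the late prediction at index 6 is three steps from the nearest label, so it dominates the squared variant.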
Delay-Penalized Metrics (DPM)¶
These metrics penalize predictions that occur significantly after the real anomaly starts, incentivizing early detection.
- class tsadmetrics.metrics.tem.dpm.DelayThresholdedPointadjustedFScore(**kwargs)¶
Bases:
Metric
Calculate delay thresholded point-adjusted F-score for anomaly detection in time series.
This metric is based on the standard F-score, but applies a temporal adjustment to the predictions before computing it. Specifically, for each ground-truth anomalous segment, if at least one point within the first k time steps of the segment is predicted as anomalous, all points in the segment are marked as correctly detected. The adjusted predictions are then compared to the ground-truth labels using the standard point-wise F-score formulation.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “dtpaf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- k (int):
Maximum number of time steps from the start of an anomaly segment within which a prediction must occur for the segment to be considered detected.
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
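The adjustment described for DelayThresholdedPointadjustedFScore can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper names are hypothetical, "the first k time steps" is read as the first k points of the segment, and a segment that is not detected in time has its predictions cleared (the text only specifies what happens on a timely detection).

```python
def find_segments(labels):
    """Return inclusive (start, end) index pairs for each run of 1s."""
    segments, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(labels) - 1))
    return segments


def fbeta_from_labels(y_true, y_pred, beta=1.0):
    """Standard point-wise F-beta computed from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


def delay_thresholded_pa_fscore(y_true, y_pred, k, beta=1.0):
    adjusted = list(y_pred)
    for start, end in find_segments(y_true):
        # Look only at the first k points of the segment.
        window_end = min(start + k, end + 1)
        detected = any(y_pred[i] == 1 for i in range(start, window_end))
        # Assumption: a timely detection credits the whole segment;
        # otherwise the whole segment counts as missed.
        for i in range(start, end + 1):
            adjusted[i] = 1 if detected else 0
    return fbeta_from_labels(y_true, adjusted, beta)


y_true = [0, 1, 1, 1, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
print(delay_thresholded_pa_fscore(y_true, y_pred, k=2))  # second segment detected too late
print(delay_thresholded_pa_fscore(y_true, y_pred, k=3))  # both segments detected in time
```

Raising k from 2 to 3 lets the detection at index 8 count, which is why the score increases in the second call.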
- class tsadmetrics.metrics.tem.dpm.EarlyDetectionScore(**kwargs)¶
Bases:
Metric
Calculate the Early Detection (ED) score for anomaly detection in time series.
This metric quantifies how early an anomaly is detected relative to its true occurrence by evaluating the timing of the first prediction within a defined anomaly window around each ground-truth anomaly.
For every anomaly window \(\mathrm{aw}_d(a) = [t_b, t_e]\), the ED score is computed as \(\mathrm{ed}_a = 1 - \frac{\mathrm{pos}_d(T(a)) - b}{e - b}\) if a detection occurs inside the window, and 0 otherwise, where \(\mathrm{pos}_d(T(a))\) denotes the position of the first predicted anomaly within the window and \(b\) and \(e\) represent the window’s start and end indices, respectively. Consequently, detections closer to the start of the window yield scores approaching 1 (indicating early detection), while detections near the end yield values closer to 0.
Following the specification in the ADE paper, the anomaly window length is defined as \(\lVert \mathrm{aw}_d \rVert = 0.1 \cdot \lVert d \rVert / \lVert A_d \rVert\), where \(\lVert d \rVert\) is the length of the time series and \(\lVert A_d \rVert\) is the total number of anomalies. Each window is symmetrically centered on the anomaly and bounded within the series limits.
The final ED score is obtained by averaging the individual \(\mathrm{ed}_a\) values for all ground-truth anomalies, thus providing a single measure of how promptly the model detects anomalies across the entire series.
- Reference:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “eds”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly labels.
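The ED formula above can be made concrete with a short sketch. Several details are assumptions for illustration only: anomalies are passed as single point indices (`anomaly_positions`, a hypothetical argument), the half-width of each window is obtained by integer truncation, and a degenerate one-point window scores 1 when it contains a detection. The library may resolve these details differently.

```python
def early_detection_score(anomaly_positions, y_pred):
    """Average ed_a = 1 - (pos - b) / (e - b) over all anomaly windows."""
    n = len(y_pred)
    if not anomaly_positions:
        return 0.0

    # Window length 0.1 * |d| / |A_d|, written as n / (10 * |A_d|).
    window_len = n / (10 * len(anomaly_positions))
    half = int(window_len // 2)  # assumption: truncate to an integer half-width

    scores = []
    for a in anomaly_positions:
        # Window centered on the anomaly and clipped to the series limits.
        b = max(0, a - half)
        e = min(n - 1, a + half)
        first = next((i for i in range(b, e + 1) if y_pred[i] == 1), None)
        if first is None:
            scores.append(0.0)   # no detection inside the window
        elif e == b:
            scores.append(1.0)   # assumption for a one-point window
        else:
            scores.append(1 - (first - b) / (e - b))
    return sum(scores) / len(scores)


y_pred = [0] * 100
y_pred[19] = 1   # early inside the window [18, 22]
y_pred[72] = 1   # at the very end of the window [68, 72]
print(early_detection_score([20, 70], y_pred))  # 0.375
```

The first anomaly contributes 0.75 and the second contributes 0, giving the mean of 0.375.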
- class tsadmetrics.metrics.tem.dpm.LatencySparsityawareFScore(**kwargs)¶
Bases:
Metric
Calculate latency and sparsity aware F-score for anomaly detection in time series.
This metric is based on the standard F-score, but applies a temporal adjustment to the predictions before computing it. Specifically, for each ground-truth anomalous segment, all points in the segment are marked as correctly detected only after the first true positive is predicted within that segment. This encourages early detection by delaying credit for correct predictions until the anomaly is initially detected. Additionally, to reduce the impact of scattered false positives, predictions are subsampled using a sparsity factor n, so that only one prediction is considered every n time steps. The adjusted predictions are then used to compute the standard point-wise F-score.
- Reference:
- Implementation based on:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “lsaf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- ni (int):
The batch size used in the implementation to handle latency and sparsity.
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
- class tsadmetrics.metrics.tem.dpm.MeanTimeToDetect(**kwargs)¶
Bases:
Metric
Calculate mean time to detect for anomaly detection in time series.
This metric quantifies the average detection delay across all true anomaly events. For each ground-truth anomaly segment, let i be the index where the segment starts, and let \({j \geq i}\) be the first index within that segment where the model predicts an anomaly. The detection delay for that event is defined as:
\[\Delta t = j - i\]
The MTTD is the mean of all such \({\Delta t}\) values, one per true anomaly segment, and expresses the average number of time steps between the true onset of an anomaly and its first detection.
- Reference:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “mttd”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
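The MTTD definition can be sketched in a few lines of plain Python. The function names are hypothetical, and the handling of segments that are never detected is an assumption: this sketch skips them and returns NaN when no segment is detected at all, which may differ from the library's behaviour.

```python
import math


def find_segments(labels):
    """Return inclusive (start, end) index pairs for each run of 1s."""
    segments, start = [], None
    for i, v in enumerate(labels):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            segments.append((start, i - 1))
            start = None
    if start is not None:
        segments.append((start, len(labels) - 1))
    return segments


def mean_time_to_detect(y_true, y_pred):
    delays = []
    for i, end in find_segments(y_true):
        # j is the first index in [i, end] where an anomaly is predicted.
        for j in range(i, end + 1):
            if y_pred[j] == 1:
                delays.append(j - i)  # delta t = j - i
                break
        # Assumption: segments without any detection are skipped.
    return sum(delays) / len(delays) if delays else math.nan


y_true = [0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1]
y_pred = [0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0]
print(mean_time_to_detect(y_true, y_pred))  # 2.0
```

Here the delays are 1 and 3 for the first two segments, and the third segment is never detected, so it does not enter the mean in this sketch.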
- class tsadmetrics.metrics.tem.dpm.NabScore(**kwargs)¶
Bases:
Metric
Calculate NAB score for anomaly detection in time series.
This metric rewards early and accurate detections of anomalies while penalizing false positives. For each ground truth anomaly segment, only the first correctly predicted anomaly point contributes positively to the score, with earlier detections receiving higher rewards. In contrast, every false positive prediction contributes negatively.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “nab_score”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
Temporal Shift-Tolerant Metrics (TSTM)¶
These metrics allow a temporal tolerance in detecting an anomaly, considering predictions correct if they occur near the real event, even if they do not exactly match its start or end. This flexibility is useful when exact timing is less critical, but detection within a reasonable window is important.
- class tsadmetrics.metrics.tem.tstm.AffiliationbasedFScore(**kwargs)¶
Bases:
Metric
Calculate affiliation based F-score for anomaly detection in time series.
This metric combines the affiliation-based precision and recall into a single score using the harmonic mean, adjusted by a weight \({\beta}\) to control the relative importance of recall versus precision. Since both precision and recall are distance-based, the F-score reflects a balance between how well predicted anomalies align with true anomalies and vice versa.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “aff_f”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
- class tsadmetrics.metrics.tem.tstm.NormalizedAffiliationbasedFScore(**kwargs)¶
Bases:
Metric
Calculate normalized affiliation-based F-score for anomaly detection in time series.
This metric combines the affiliation-based precision and recall into a single score, weighted by \(\beta\), but first applies an affine normalization to the affiliation-based precision using a threshold parameter \(\alpha\):
\[P_{u}^{aff} = \frac{\mathrm{precision}_{aff} - \alpha}{1 - \alpha}\]
Then the (signed) F-score is computed as:
\[F_{\beta} = \frac{(1 + \beta^2)\,|P_{u}^{aff}|\,\mathrm{recall}_{aff}} {\beta^2\,P_{u}^{aff} + \mathrm{recall}_{aff}} \times \operatorname{sign}(P_{u}^{aff})\]
- Notes:
If there are no predicted anomalies, the score is 0.
If \(1-\alpha = 0\), the score is 0 to avoid division by zero.
If the denominator \(\beta^2 P_{u}^{aff} + \mathrm{recall}_{aff}\) is 0, the score is 0 to avoid division by zero.
- Reference:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “naff_f”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
- alpha (float):
Normalization threshold applied to affiliation-based precision before computing the F-score. Must satisfy \(\alpha < 1\). Default is 0.
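The normalization and signed F-score formulas above can be checked numerically with a small sketch. It assumes the affiliation-based precision and recall have already been computed elsewhere and are passed in as plain floats; the function name is hypothetical, and the case with no predicted anomalies (score 0 per the notes) is left to the caller because precision is not defined there.

```python
import math


def normalized_affiliation_fscore(precision_aff, recall_aff, alpha=0.0, beta=1.0):
    """Signed F-beta on the affinely normalized affiliation precision."""
    # Guard from the notes: 1 - alpha = 0 gives a score of 0.
    if 1 - alpha == 0:
        return 0.0
    p_u = (precision_aff - alpha) / (1 - alpha)

    # Guard from the notes: a zero denominator gives a score of 0.
    denom = beta**2 * p_u + recall_aff
    if denom == 0:
        return 0.0

    sign = math.copysign(1.0, p_u) if p_u != 0 else 0.0
    return (1 + beta**2) * abs(p_u) * recall_aff / denom * sign


# Precision above the threshold alpha yields a positive score ...
print(normalized_affiliation_fscore(0.8, 0.6, alpha=0.5))
# ... and precision below alpha yields a negative one.
print(normalized_affiliation_fscore(0.4, 0.6, alpha=0.5))
```

With alpha = 0.5, a precision of 0.8 normalizes to 0.6 and a precision of 0.4 normalizes to -0.2, so the two calls return values close to 0.6 and -0.6 respectively.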
- class tsadmetrics.metrics.tem.tstm.Pate(**kwargs)¶
Bases:
Metric
Calculate PATE score for anomaly detection in time series using real-valued anomaly scores.
This version of PATE operates on continuous anomaly scores rather than binary predictions. It assigns weights to each score according to its temporal proximity to the true anomaly intervals. An early buffer of length early and a delay buffer of length delay define the tolerance regions before and after each anomaly. High scores within the true interval receive full weight, while scores in the buffer zones are linearly decayed based on their distance from the interval edges. Scores outside all tolerance zones contribute as false positives, and intervals with insufficiently high scores are penalized as false negatives.
The final PATE score aggregates these weighted contributions to produce a smooth, continuous performance measure sensitive to both timing and confidence.
- Reference:
- Implementation based on:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “pate”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
- Parameters:
- early (int):
Length of the early buffer zone before each anomaly interval.
- delay (int):
Length of the delay buffer zone after each anomaly interval.
- class tsadmetrics.metrics.tem.tstm.PateFScore(**kwargs)¶
Bases:
Metric
Calculate PATE F-Score for anomaly detection in time series using binary predictions.
This metric evaluates how well the predicted binary anomalies align with the ground truth, considering temporal proximity around each real anomaly interval. It defines two tolerance zones: an early buffer of length early preceding the true interval and a delay buffer of length delay following it. Detections within the true interval receive full credit, while those in the buffer zones receive linearly decaying weights depending on their temporal distance from the true anomaly. Predictions outside these regions are treated as false positives, and missed intervals as false negatives.
The weighted contributions of precision and recall are combined into a final F-Score, measuring both timing accuracy and detection completeness.
- Reference:
- Implementation based on:
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “pate_f1”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- early (int):
Length of the early buffer: the maximum number of time steps before the start of an anomaly interval within which a prediction still counts as an early detection.
- delay (int):
Length of the delay buffer: the maximum number of time steps after the end of an anomaly interval within which a prediction still counts as a delayed detection.
- class tsadmetrics.metrics.tem.tstm.TimeTolerantFScore(**kwargs)¶
Bases:
Metric
Calculate time tolerant F-score for anomaly detection in time series.
This metric is based on the standard F-score, but applies a temporal adjustment to the predictions before computing it. Specifically, a predicted anomalous point is considered a true positive if it lies within a temporal window of size \({\tau}\) around any ground-truth anomalous point. This allows for small temporal deviations in the predictions to be tolerated. The adjusted predictions are then used to compute the standard point-wise F-score.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “ttf”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always True since it requires binary anomaly scores.
- Parameters:
- t (int):
The time tolerance parameter.
- beta (float):
The beta value, which determines the weight of recall relative to precision in the combined score. Default is 1, which gives equal weight to precision and recall.
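One way to read the time-tolerant adjustment is sketched below. It is an assumption-laden illustration rather than the library's implementation: the function name is hypothetical, the window is taken as `t` steps on either side of a point, and recall is defined symmetrically (a labelled point counts as found if any prediction lies within `t` steps), which the text above does not spell out.

```python
def time_tolerant_fscore(y_true, y_pred, t, beta=1.0):
    """F-beta where matches are allowed within t time steps."""
    true_idx = [i for i, v in enumerate(y_true) if v == 1]
    pred_idx = [i for i, v in enumerate(y_pred) if v == 1]
    if not true_idx or not pred_idx:
        return 0.0

    def near(i, others):
        # True if any index in `others` lies within t steps of i.
        return any(abs(i - o) <= t for o in others)

    # A prediction is a true positive if it is close to some labelled point.
    precision = sum(near(p, true_idx) for p in pred_idx) / len(pred_idx)
    # Assumption: a labelled point is recalled if some prediction is close to it.
    recall = sum(near(g, pred_idx) for g in true_idx) / len(true_idx)

    if precision + recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


y_true = [0, 0, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [0, 1, 0, 0, 0, 0, 0, 1, 0, 0]
print(time_tolerant_fscore(y_true, y_pred, t=0))  # 0.0, no exact match
print(time_tolerant_fscore(y_true, y_pred, t=1))  # 0.5, index 1 is within one step of index 2
```

With no tolerance the prediction at index 1 is a plain false positive, while a tolerance of one step turns it into a match for the label at index 2.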
- class tsadmetrics.metrics.tem.tstm.VusPr(**kwargs)¶
Bases:
Metric
Calculate the VUS-PR (Volume Under the PR Surface) score for anomaly detection in time series.
This metric is an extension of the classical AUC-PR, incorporating a temporal tolerance parameter window that smooths the binary ground-truth labels. It allows for some flexibility in the detection of anomalies that are temporally close to the true events. The final metric integrates the PR-AUC over several levels of temporal tolerance (from 0 to window), yielding a volume under the PR surface.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “vus_pr”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
- Parameters:
- window (int):
Maximum temporal tolerance used to smooth the evaluation. Default is 4.
- class tsadmetrics.metrics.tem.tstm.VusRoc(**kwargs)¶
Bases:
Metric
Calculate the VUS-ROC (Volume Under the ROC Surface) score for anomaly detection in time series.
This metric extends the classical AUC-ROC by introducing a temporal tolerance l, which smooths the binary ground-truth labels. The idea is to allow a flexible evaluation that tolerates small misalignments in the detection of anomalies. The final score is computed by integrating the ROC-AUC over values of l from 0 to window, thus producing a volume under the ROC surface.
- Reference:
- Implementation based on:
https://link.springer.com/article/10.1007/s10618-023-00988-8
- For more information, see the original paper:
- Attributes:
- name (str):
Fixed name identifier for this metric: “vus_roc”.
- binary_prediction (bool):
Indicates whether this metric expects binary predictions. Always False since it requires continuous anomaly scores.
- Parameters:
- window (int):
Maximum temporal tolerance l used to smooth the evaluation. Default is 4.