FASS (Feature Attribution Stability Suite) is a benchmarking framework for evaluating how robust explanation methods are under prediction-invariant perturbations, that is, input changes that leave the model's prediction unchanged. It targets a known weakness in explainable AI: attribution methods can produce noticeably different explanations for inputs that the model classifies identically. FASS gives a systematic way to measure and compare the stability of attribution techniques, so that unstable methods can be identified and more reliable ones developed.
The benchmark pipeline has five steps:
1. Apply controlled perturbations (geometric, photometric, compression)
2. Generate attribution maps using selected XAI methods
3. Align perturbed and original inputs for comparison
4. Compute stability metrics across attribution maps
5. Aggregate results into benchmark scores
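The five steps above can be sketched as a single loop. This is a minimal illustration, not the FASS implementation: `toy_attribution`, `cosine_similarity`, `run_benchmark`, and the `PERTURBATIONS` table are hypothetical names, the attribution function is a stand-in for a real XAI method, and alignment is read here as applying the inverse geometric transform to the perturbed attribution map. Compression perturbations are left out to keep the sketch dependent on NumPy only.

```python
import numpy as np

def hflip(x):
    """Geometric perturbation: horizontal flip (it is its own inverse)."""
    return x[:, ::-1]

def brighten(x, delta=0.1):
    """Photometric perturbation: constant brightness offset, clipped to [0, 1]."""
    return np.clip(x + delta, 0.0, 1.0)

def identity(a):
    """No alignment needed when the perturbation does not move pixels."""
    return a

# Hypothetical table: (name, perturbation, inverse applied to the attribution map)
PERTURBATIONS = [
    ("hflip", hflip, hflip),
    ("brighten", brighten, identity),
]

def toy_attribution(image):
    """Stand-in for a real XAI method: absolute deviation from the mean pixel."""
    return np.abs(image - image.mean())

def cosine_similarity(a, b):
    """Placeholder stability metric between two attribution maps."""
    a, b = a.ravel(), b.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def run_benchmark(images, attribute, metric, perturbations):
    scores = {name: [] for name, _, _ in perturbations}
    for image in images:
        reference = attribute(image)                 # attribution of the original
        for name, perturb, invert in perturbations:
            perturbed = perturb(image)               # step 1: perturb
            attr = attribute(perturbed)              # step 2: attribute
            aligned = invert(attr)                   # step 3: back to original frame
            scores[name].append(metric(reference, aligned))  # step 4: score
    # step 5: aggregate per perturbation type
    return {name: float(np.mean(vals)) for name, vals in scores.items()}
```

Calling `run_benchmark(images, toy_attribution, cosine_similarity, PERTURBATIONS)` on a list of 2-D arrays with values in [0, 1] returns one mean score per perturbation name. In a real run, `attribute` would wrap a method such as Grad-CAM or Integrated Gradients, and `metric` would be one of the stability metrics.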
FASS introduces a three-axis decomposition for attribution stability:
• Structural Similarity: pixel-level consistency (SSIM)
• Rank Correlation: ordering consistency of feature importance
• Top-k Overlap: agreement on most important features
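The three axes can be illustrated with small NumPy functions operating on a pair of attribution maps. This is a sketch under stated assumptions rather than the FASS code: the function names are hypothetical, `global_ssim` computes SSIM over a single window covering the whole map (standard SSIM averages over local windows), the rank correlation is Spearman's with ordinal ranks (ties are not averaged), and top-k overlap is taken as intersection size divided by k.

```python
import numpy as np

def global_ssim(a, b, data_range=1.0):
    """SSIM computed once over the whole map (simplified, single window)."""
    c1 = (0.01 * data_range) ** 2   # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(num / den)

def spearman_rank_correlation(a, b):
    """Pearson correlation of the pixel ranks (ordinal ranks, no tie handling)."""
    rank_a = np.argsort(np.argsort(a.ravel())).astype(float)
    rank_b = np.argsort(np.argsort(b.ravel())).astype(float)
    return float(np.corrcoef(rank_a, rank_b)[0, 1])

def top_k_overlap(a, b, k):
    """Fraction of the k highest-attribution pixels shared by both maps."""
    top_a = set(np.argsort(a.ravel())[-k:].tolist())
    top_b = set(np.argsort(b.ravel())[-k:].tolist())
    return len(top_a & top_b) / k

# Usage: compare a map with itself and with its 180-degree rotation.
original = np.arange(16, dtype=float).reshape(4, 4) / 15.0
rotated = original[::-1, ::-1]
scores_same = (global_ssim(original, original),
               spearman_rank_correlation(original, original),
               top_k_overlap(original, original, k=4))
scores_rotated = (global_ssim(original, rotated),
                  spearman_rank_correlation(original, rotated),
                  top_k_overlap(original, rotated, k=4))
```

For maps with many tied values, a tie-aware implementation such as `scipy.stats.spearmanr` is the safer choice. The three scores can disagree: two maps may share their top-k pixels while differing in overall structure, which is the motivation for reporting the axes separately.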
The current scope of FASS:
• Supports Grad-CAM, Integrated Gradients, SHAP, and LIME
• Evaluates image classification and other vision models
• Enables robustness comparison under realistic perturbations
Key contributions:
• First systematic framework for attribution stability benchmarking
• Decouples prediction correctness from explanation robustness
• Provides standardized metrics for XAI evaluation
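Decoupling prediction correctness from explanation robustness implies that stability is scored only on perturbations that leave the predicted class unchanged. The sketch below shows one way such a filter could look. All names (`prediction_invariant`, `stability_over_invariant_pairs`) are hypothetical, `predict` is assumed to be any callable returning a vector of class scores, and the example model, attribution function, and metric are toy stand-ins.

```python
import numpy as np

def prediction_invariant(predict, original, perturbed):
    """True if the predicted class is the same for both inputs."""
    return int(np.argmax(predict(original))) == int(np.argmax(predict(perturbed)))

def stability_over_invariant_pairs(predict, attribute, metric, pairs):
    """Average a metric over pairs whose prediction is unchanged.

    Returns (mean score, number of pairs kept). Pairs where the predicted
    class flips are skipped, so a score reflects explanation changes only.
    """
    scores = []
    for original, perturbed in pairs:
        if not prediction_invariant(predict, original, perturbed):
            continue  # prediction changed: not a stability measurement
        scores.append(metric(attribute(original), attribute(perturbed)))
    mean_score = float(np.mean(scores)) if scores else float("nan")
    return mean_score, len(scores)

# Toy stand-ins for illustration only.
def toy_predict(x):
    s = x.sum()
    return np.array([s, -s])        # class 0 if the pixel sum is positive

def toy_attribute(x):
    return x                        # identity "attribution"

def mean_abs_diff(a, b):
    return float(np.mean(np.abs(a - b)))

base = np.ones((2, 2))
pairs = [
    (base, 0.5 * base),   # same predicted class: kept
    (base, -base),        # predicted class flips: skipped
]
score, kept = stability_over_invariant_pairs(
    toy_predict, toy_attribute, mean_abs_diff, pairs
)
```

Reporting the number of kept pairs alongside the score makes it visible when a perturbation type flips predictions often and the stability estimate rests on few samples.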
Built with: Python, PyTorch, OpenCV, PIL, NumPy, SciPy, XAI libraries