Abstract: This paper introduces a multilayered, multimodal Explainable Artificial Intelligence (XAI) framework designed for interdisciplinary research, where the transparency of AI-driven insights is paramount. Our approach comprises five key modules: (1) a deep neural network that extracts features and provides initial classification or clustering; (2) XAI tools (e.g., Grad-CAM, SHAP) that generate localized attributions, highlighting the salient regions or features that most influence the network's decisions; (3) a soft model that applies heuristic thresholds and domain-specific segmentation to refine and interpret the raw explanatory signals; (4) an intermediate model that consolidates predictions, heuristic outputs, and relevant metadata into a structured “draft” explanation; and (5) a final generative model that produces a comprehensive, human-readable report by contextualizing the intermediate findings with large language models. The proposed cascade is inherently multimodal, integrating imagery, tabular data, and text. The “soft” heuristic layer lets domain experts adjust sensitivity thresholds to better capture anomalies or critical variations across data sources. The final generative component further enriches interpretability by producing narrative-style explanations that link model predictions to the underlying evidence. Experiments and use cases across diverse fields, including environmental monitoring, geology, engineering, and social sciences, demonstrate that the framework fosters transparency, adaptability, and user trust.
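The five-module cascade above can be sketched as a simple pipeline. This is a minimal illustrative sketch, not the authors' implementation: every function body is a stub (the neural network, attribution, and generative steps are replaced by placeholders), and all names and the `threshold` value are hypothetical.

```python
# Illustrative sketch of the five-module XAI cascade. All function names,
# values, and the threshold are placeholders, not the paper's actual code.

def neural_network(sample):
    # Module 1: feature extraction and an initial score (stubbed).
    return {"score": 0.87, "features": sample}

def xai_attributions(prediction):
    # Module 2: per-feature attributions (SHAP/Grad-CAM-style), stubbed
    # here as decreasing values over the feature list.
    return {f: 1.0 / (i + 1) for i, f in enumerate(prediction["features"])}

def soft_model(attributions, threshold=0.4):
    # Module 3: heuristic thresholding; domain experts tune `threshold`
    # to control sensitivity to anomalies.
    return {f: v for f, v in attributions.items() if v >= threshold}

def intermediate_model(prediction, salient, metadata):
    # Module 4: consolidate prediction, salient features, and metadata
    # into a structured "draft" explanation.
    return {"score": prediction["score"], "salient": salient, "meta": metadata}

def generative_report(draft):
    # Module 5: a real system would prompt an LLM with the draft;
    # here we simply format a narrative string.
    parts = ", ".join(sorted(draft["salient"]))
    return (f"Prediction score {draft['score']:.2f} was driven mainly by "
            f"{parts} (source: {draft['meta']['source']}).")

def explain(sample, metadata, threshold=0.4):
    pred = neural_network(sample)
    attrs = xai_attributions(pred)
    salient = soft_model(attrs, threshold)
    draft = intermediate_model(pred, salient, metadata)
    return generative_report(draft)

print(explain(["slope", "rainfall", "soil_type"], {"source": "survey-2023"}))
```

Raising `threshold` narrows the explanation to the strongest attributions, which is how the soft layer lets experts trade coverage for precision.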