In this paper, we describe a procedure for creating an information flow model (a readable twin) of a deep learning model (an unreadable model). This procedure has been implemented as a Python tool called Human Readable Twin Explainer (HuReTEx). Artifacts generated by the key layers of the deep learning model for the training cases are aggregated and form the basis for building a model in the form of a flow graph. Then, the most important prediction paths in this graph are determined. Together with appropriately presented artifacts (e.g., images or natural-language descriptions), these paths constitute a clear explanation of the knowledge acquired by the model during training.
Pancerz et al. studied this question.
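
To make the pipeline described above more concrete, the following minimal Python sketch illustrates the general idea of turning per-layer artifacts into a flow graph and ranking prediction paths. It is not the HuReTEx implementation or its API. It assumes that the artifacts of each key layer have already been aggregated into discrete node labels (for example, cluster identifiers), so that every training case is a sequence of labels ending with the predicted class. The function names `build_flow_graph` and `top_paths`, the sample labels, and the use of relative path frequency as the importance measure are all hypothetical choices made for illustration.

```python
from collections import Counter


def build_flow_graph(traces):
    """Count how many training cases flow along each edge.

    `traces` is a list of label sequences, one per training case.
    Position i in a sequence is the (assumed, pre-aggregated) node
    label of that case at layer i; the last position is the prediction.
    Nodes are (layer_index, label) pairs, so equal labels on different
    layers stay distinct.
    """
    edges = Counter()
    for trace in traces:
        for layer, (src, dst) in enumerate(zip(trace, trace[1:])):
            edges[((layer, src), (layer + 1, dst))] += 1
    return edges


def top_paths(traces, k=3):
    """Return the k most frequent complete paths with their relative
    frequency (an assumed, simple measure of path importance)."""
    counts = Counter(tuple(trace) for trace in traces)
    total = len(traces)
    return [(path, n / total) for path, n in counts.most_common(k)]


# Hypothetical training cases: two key layers plus the predicted class.
traces = [
    ("a1", "b1", "cat"),
    ("a1", "b1", "cat"),
    ("a2", "b1", "dog"),
    ("a1", "b2", "cat"),
]

graph = build_flow_graph(traces)
for (src, dst), count in sorted(graph.items()):
    print(src, "->", dst, "cases:", count)

for path, share in top_paths(traces, k=2):
    print(" -> ".join(path), f"share of cases: {share:.2f}")
```

In a fuller version of this sketch, each node label would be linked back to a human-readable artifact, such as a representative image or a short textual description, so that a top-ranked path can be read as an explanation.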