A scoping review on synthetic datasets: Opportunities, challenges and future directions for endodontics
Document Type
Article
Department
Dental-oral, Maxillo-facial Surgery
Abstract
Background: Recent advances in Artificial Intelligence (AI) and machine learning algorithms have transformed diagnostic tasks in dentistry and endodontics. However, the development of robust AI systems remains constrained by the scarcity of annotated, high-quality datasets along with issues like class imbalance and privacy concerns. Synthetic data, replicating the statistical and visual characteristics of real data, may offer a promising solution to these limitations.
Objective: This scoping review aimed to map (i) existing synthetic datasets in dentistry/endodontics, (ii) commonly used synthetic data generation techniques, (iii) their fidelity and diagnostic utility, and (iv) current limitations and future research directions in endodontics.
Methods: A comprehensive search was conducted across PubMed, Scopus, EBSCO, and Google Scholar using keywords related to synthetic datasets in dentistry and endodontics. Data on the most commonly used synthetic imaging modality, model architecture, training dataset size, evaluation metrics, and real-world clinical utility (use cases) of the generated synthetic datasets were extracted and analysed descriptively.
Results: Eleven studies met the inclusion criteria, primarily using Generative Adversarial Networks (GANs), with only one utilising a hybrid GAN-diffusion model. The most common synthetic imaging modalities were panoramic radiographs (n = 4), intraoral clinical images and lateral cephalograms (n = 2 each), followed by periapical, bitewing, and TMJ Magnetic Resonance Imaging (MRI) (n = 1 each). Training datasets ranged from 509 to 35 254 images, with stable generation achieved using as few as 1720 images in the hybrid model. Evaluation metrics varied, with Fréchet Inception Distance (FID), Signal-to-Noise Ratio (SNR), and visual Turing tests being most common. Only four studies evaluated clinical utility, using synthetic images to train AI models. While synthetic data alone showed moderate fidelity, combining it with real images consistently improved accuracy across tasks like canal classification, implant segmentation, tooth numbering, and skeletal pattern recognition.
Conclusion: Synthetic data use in endodontics is still emerging but holds strong potential. While GANs dominate current research, diffusion and hybrid models may yield more realistic images. Future efforts should focus on generating diverse, clinically representative datasets to improve diagnostics, education, and AI-driven endodontic care.
AKU Student
no
Publication (Name of Journal)
The International Endodontic Journal
DOI
10.1111/iej.70116
Recommended Citation
Naved, N., Umer, F. (2026). A scoping review on synthetic datasets: Opportunities, challenges and future directions for endodontics. The International Endodontic Journal.
Available at:
https://ecommons.aku.edu/pakistan_fhs_mc_surg_dent_oral_maxillofac/296
Comments
Volume, issue and pagination are not provided by author/publisher.