Objective:
This work aims to develop an automated segmentation method for the prostate and its surrounding organs-at-risk (OAR) in pelvic computed tomography (CT) to facilitate prostate radiation treatment planning.
Approach:
In this work, we propose a novel deep-learning algorithm combining a U-shaped convolutional neural network (CNN) and vision transformer (VIT) for multi-organ (i.e., bladder, prostate, rectum, left and right femoral heads) segmentation in male pelvic CT images. The U-shaped model consists of three components: a CNN-based encoder for local feature extraction, a token-based VIT for capturing global dependencies from the CNN features, and a CNN-based decoder for predicting the segmentation outcome from the VIT’s output. The novelty of our network is a token-based multi-head self-attention (MHSA) mechanism used in the transformer, which encourages long-range dependencies and forwards informative high-resolution feature maps from the encoder to the decoder. In addition, a knowledge distillation strategy is deployed to further enhance the learning capability of the proposed network.
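The abstract does not state which form of knowledge distillation is used. As a minimal sketch, assuming the common temperature-softened formulation in which a teacher's per-voxel class scores guide a student network, the loss for a single voxel might look like the following. The function names, the temperature, and the weight alpha are illustrative assumptions, not the authors' settings.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, label,
                      temperature=2.0, alpha=0.5):
    """Assumed loss for one voxel: cross-entropy on the hard label plus
    KL(teacher || student) on temperature-softened distributions."""
    # Hard-label term: standard cross-entropy at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL divergence between softened distributions.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(q_teacher, q_student) if t > 0)
    # The customary temperature**2 factor keeps both terms on a similar scale.
    return (1 - alpha) * ce + alpha * temperature ** 2 * kl
```

In a segmentation setting this per-voxel value would typically be averaged over all voxels; how the teacher is obtained is not described in the abstract.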
Main results:
We evaluated the network using 1) a dataset collected from 94 patients with prostate cancer and 2) the public CT-ORG dataset. The network’s performance was evaluated quantitatively for each organ based on 1) volume similarity between the segmented contours and ground truth using Dice score, segmentation sensitivity, and precision; 2) surface similarity evaluated by Hausdorff distance (HD), mean surface distance (MSD), and residual mean square distance (RMS); and 3) percentage volume difference (PVD). The performance was then compared against other state-of-the-art methods. On the first dataset, the average volume similarity measures obtained by the network over all organs were Dice score = 0.91, sensitivity = 0.90, and precision = 0.92; the average surface similarities were HD = 3.78 mm, MSD = 1.24 mm, and RMS = 2.03 mm; and the average percentage volume difference was PVD = 9.9%. On the CT-ORG dataset, the network obtained Dice score = 0.93, sensitivity = 0.93, and precision = 0.93; average surface similarities of HD = 5.82 mm, MSD = 1.16 mm, and RMS = 1.24 mm; and an average percentage volume difference of PVD = 6.6%.
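The volume-based measures above can be computed from voxel counts. The sketch below assumes binary masks flattened to equal-length sequences of 0/1 labels, with both masks non-empty, and assumes PVD is defined as the absolute volume difference relative to the ground-truth volume; the abstract does not give the exact definition. The function name is illustrative.

```python
def overlap_metrics(pred, truth):
    """Dice, sensitivity, precision, and percentage volume difference
    for two flat 0/1 masks of equal length (both assumed non-empty)."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    pred_volume = tp + fp
    truth_volume = tp + fn
    return {
        "dice": 2 * tp / (pred_volume + truth_volume),
        "sensitivity": tp / truth_volume,
        "precision": tp / pred_volume,
        # Assumed definition: |V_pred - V_truth| / V_truth, in percent.
        "pvd": abs(pred_volume - truth_volume) / truth_volume * 100,
    }

# Toy example: the prediction covers the whole organ plus one extra voxel.
metrics = overlap_metrics([1, 1, 1, 1, 0, 0], [0, 1, 1, 1, 0, 0])
```

In this toy case the prediction over-segments by one voxel, so sensitivity is 1 while precision drops to 0.75.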
Significance:
In summary, we propose a token-based transformer network with knowledge distillation for multi-organ segmentation using CT images. This method provides accurate and reliable segmentation results for each organ using CT imaging, facilitating the prostate radiation clinical workflow.
Background
The number of patients undergoing proton therapy has increased in recent years. Current treatment planning systems (TPS) calculate dose maps using three-dimensional (3D) maps of relative stopping power (RSP) and mass density. The patient-specific maps of RSP and mass density are obtained by translating the CT number (HU) acquired using single-energy computed tomography (SECT) with appropriate conversions and coefficients. The proton dose calculation uncertainty of this approach is 2.5%-3.5% plus a 1 mm margin. SECT is the major clinical modality for proton therapy treatment planning. It would therefore be valuable to enhance proton dose calculation accuracy using a deep learning (DL) approach centered on SECT.
Objectives
The purpose of this work is to develop a deep learning method to generate mass density and relative stopping power (RSP) maps based on clinical single-energy CT (SECT) data for proton dose calculation in proton therapy treatment.
Methods
Artificial neural networks (ANN), fully convolutional neural networks (FCNN), and residual neural networks (ResNet) were used to learn the correlation between voxel-specific mass density, RSP, and SECT CT number (HU). A stoichiometric calibration method based on SECT data and an empirical model based on dual-energy CT (DECT) images were chosen as reference models to evaluate the performance of deep learning neural networks. SECT images of a CIRS 062M electron density phantom were used as the training dataset for deep learning models. CIRS anthropomorphic M701 and M702 phantoms were used to test the performance of deep learning models.
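For context, a conventional SECT calibration is usually applied as a lookup curve from CT number to RSP. The sketch below assumes a piecewise-linear curve with clamping at the ends. The calibration nodes are hypothetical placeholders chosen for readability, not measured values from this study, and the function name is illustrative.

```python
from bisect import bisect_right

# Hypothetical (HU, RSP) calibration nodes; placeholders, not measured data.
CALIBRATION = [(-1000.0, 0.0), (0.0, 1.0), (1000.0, 1.5), (3000.0, 2.5)]

def hu_to_rsp(hu, nodes=CALIBRATION):
    """Piecewise-linear HU-to-RSP lookup, clamped to the end nodes."""
    xs = [x for x, _ in nodes]
    if hu <= xs[0]:
        return nodes[0][1]
    if hu >= xs[-1]:
        return nodes[-1][1]
    i = bisect_right(xs, hu)          # index of the first node above hu
    (x0, y0), (x1, y1) = nodes[i - 1], nodes[i]
    return y0 + (y1 - y0) * (hu - x0) / (x1 - x0)
```

A learned model replaces this fixed one-dimensional curve with a mapping fitted to phantom data, which is the comparison the study sets up.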
Results
For M701, the mean absolute percentage errors (MAPE) of the mass density map by FCNN are 0.39%, 0.92%, 0.68%, 0.92%, and 1.57% on the brain, spinal cord, soft tissue, bone, and lung, respectively, whereas with the SECT stoichiometric method, they are 0.99%, 2.34%, 1.87%, 2.90%, and 12.96%. For RSP maps, the MAPE of FCNN on M701 are 0.85%, 2.32%, 0.75%, 1.22%, and 1.25%, whereas with the SECT reference model, they are 0.95%, 2.61%, 2.08%, 7.74%, and 8.62%.
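The MAPE values reported above follow the usual definition: the mean of voxel-wise absolute errors relative to the reference value, expressed in percent. A minimal sketch, assuming flat sequences of predicted and reference values for one tissue region with no zero reference values:

```python
def mape(predicted, reference):
    """Mean absolute percentage error over paired values.
    Reference values are assumed to be nonzero."""
    errors = [abs(p - r) / abs(r) for p, r in zip(predicted, reference)]
    return 100 * sum(errors) / len(errors)
```

For example, predictions of 1.02 and 0.98 against reference values of 1.00 give a MAPE of about 2%.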
Conclusion
The results show that deep learning neural networks have the potential to generate accurate voxel-specific material property information, which can be used to improve the accuracy of proton dose calculation.
Advances in knowledge
Deep learning-based frameworks are proposed to estimate material mass density and RSP from SECT with improved accuracy compared with conventional methods.
The hippocampus plays a crucial role in memory and cognition. Because of the associated toxicity from whole brain radiotherapy, more advanced treatment planning techniques prioritize hippocampal avoidance, which depends on an accurate segmentation of the small and complexly shaped hippocampus. To achieve accurate segmentation of the anterior and posterior regions of the hippocampus from T1 weighted (T1w) MR images, we developed a novel model, Hippo-Net, which uses a cascaded model strategy. The proposed model consists of two major parts: (1) a localization model that detects the volume-of-interest (VOI) of the hippocampus, and (2) an end-to-end morphological vision transformer network (Franchi et al 2020 Pattern Recognit. 102 107246, Ranem et al 2022 IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops (CVPRW) pp 3710–3719) that performs substructure segmentation within the hippocampus VOI. The substructures include the anterior and posterior regions of the hippocampus, which are defined as the hippocampus proper and parts of the subiculum. The vision transformer incorporates the dominant features extracted from MR images, which are further improved by learning-based morphological operators. The integration of these morphological operators into the vision transformer increases the accuracy and the ability to separate the hippocampus structure into its two distinct substructures. A total of 260 T1w MRI datasets from the Medical Segmentation Decathlon dataset were used in this study. We conducted a five-fold cross-validation on the first 200 T1w MR images and then performed a hold-out test on the remaining 60 T1w MR images with the model trained on the first 200 images. In five-fold cross-validation, the Dice similarity coefficients were 0.900 ± 0.029 and 0.886 ± 0.031 for the hippocampus proper and parts of the subiculum, respectively. The mean surface distances (MSDs) were 0.426 ± 0.115 mm and 0.401 ± 0.100 mm for the hippocampus proper and parts of the subiculum, respectively. The proposed method showed great promise in automatically delineating hippocampus substructures on T1w MR images. It may facilitate the current clinical workflow and reduce the physicians’ effort.
Background: Daily or weekly cone-beam computed tomography (CBCT) scans are commonly used for accurate patient positioning during the image-guided radiotherapy (IGRT) process, making CBCT an ideal option for adaptive radiotherapy (ART) replanning. However, the presence of severe artifacts and inaccurate Hounsfield unit (HU) values prevents its use for quantitative applications such as organ segmentation and dose calculation. To enable the clinical practice of online ART, it is crucial to obtain CBCT scans with a quality comparable to that of a CT scan. Purpose: This work aims to develop a conditional diffusion model to perform image translation from the CBCT to the CT distribution for the image quality improvement of CBCT. Methods: The proposed method is a conditional denoising diffusion probabilistic model (DDPM) that utilizes a time-embedded U-net architecture with residual and attention blocks to gradually transform a white Gaussian noise sample to the target CT distribution conditioned on the CBCT. The model was trained on deformed planning CT (dpCT) and CBCT image pairs, and its feasibility was verified in a brain patient study and a head-and-neck (H&N) patient study. The performance of the proposed algorithm was evaluated using mean absolute error (MAE), peak signal-to-noise ratio (PSNR) and normalized cross-correlation (NCC) metrics on generated synthetic CT (sCT) samples. The proposed method was also compared to four other diffusion model-based sCT generation methods. Results: In the brain patient study, the MAE, PSNR, and NCC of the generated sCT were 25.99 HU, 30.49 dB, and 0.99, respectively, compared to 40.63 HU, 27.87 dB, and 0.98 of the CBCT images. In the H&N patient study, the metrics were 32.56 HU, 27.65 dB, and 0.98 for sCT and 38.99 HU, 27.00 dB, and 0.98 for CBCT. Compared to the other four diffusion models and one Cycle generative adversarial network (Cycle GAN), the proposed method showed superior results in both visual quality and quantitative analysis. Conclusions: The proposed conditional DDPM method can generate sCT from CBCT with accurate HU numbers and reduced artifacts, enabling accurate CBCT-based organ segmentation and dose calculation for online ART.
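The three image-quality metrics named in the Methods can be sketched as follows, assuming the images are flattened to equal-length sequences of HU values. Definitions of PSNR and NCC vary between papers: here PSNR uses a caller-supplied intensity range, and NCC is the zero-mean normalized cross-correlation. Both choices are assumptions, since the abstract does not specify them.

```python
import math

def mae(a, b):
    """Mean absolute error between two images (same units as the input)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def psnr(a, b, data_range):
    """Peak signal-to-noise ratio in dB for an assumed intensity range."""
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    return float("inf") if mse == 0 else 10 * math.log10(data_range ** 2 / mse)

def ncc(a, b):
    """Zero-mean normalized cross-correlation (images must not be constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)
```

Lower MAE and higher PSNR and NCC indicate closer agreement with the reference CT, which is the direction of improvement reported for sCT over CBCT.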
In this work, we demonstrate a method for rapid synthesis of high-quality CT images from unpaired, low-quality CBCT images, permitting CBCT-based adaptive radiotherapy. We adapt contrastive unpaired translation (CUT) to be used with medical images and evaluate the results on an institutional pelvic CT dataset. We compare the method against cycleGAN using mean absolute error, structural similarity index, root mean squared error, and Fréchet Inception Distance and show that CUT significantly outperforms cycleGAN while requiring less time and fewer resources. The investigated method improves the feasibility of online adaptive radiotherapy over the present state-of-the-art.
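CUT trains its generator with a patchwise contrastive (InfoNCE-style) loss: a feature from an output patch should match the feature of the input patch at the same location and differ from features at other locations. The sketch below shows that loss for one query patch on plain Python lists. The function names and the temperature default are illustrative assumptions, and the adaptation to medical images described above is not reproduced here.

```python
import math

def _normalize(v):
    """Scale a feature vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def _dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def patch_nce_loss(query, positive, negatives, temperature=0.07):
    """Cross-entropy of picking the positive patch among all candidates,
    using cosine similarity scaled by an assumed temperature."""
    q = _normalize(query)
    logits = [_dot(q, _normalize(positive)) / temperature]
    logits += [_dot(q, _normalize(n)) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]
```

The loss is near zero when the query aligns with its positive patch and grows when it aligns with a negative patch instead.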
Purpose: Dose escalation to dominant intraprostatic lesions (DILs) is a novel treatment strategy to improve the treatment outcome of prostate radiation therapy. Treatment planning requires accurate and fast delineation of the prostate and DILs. In this study, a 3D cascaded scoring convolutional neural network is proposed to automatically segment the prostate and DILs from MRI. Methods and materials: The proposed cascaded scoring convolutional neural network performs end-to-end segmentation by locating a region-of-interest (ROI), identifying the object within the ROI, and defining the target. A scoring strategy, which learns to judge the segmentation quality of the DIL, is integrated into the cascaded convolutional neural network to address the challenge of segmenting the irregular shapes of the DIL. To evaluate the proposed method, 77 patients who underwent MRI and PET/CT were retrospectively investigated. The prostate and DIL ground truth contours were delineated by experienced radiologists. The proposed method was evaluated with fivefold cross-validation and holdout testing. Results: The average centroid distance, volume difference, and Dice similarity coefficient (DSC) value for prostate/DIL are 4.3 ± 7.5/3.73 ± 3.78 mm, 4.5 ± 7.9/0.41 ± 0.59 cc, and 89.6 ± 8.9/84.3 ± 11.9%, respectively. Comparable results were obtained in the holdout test. Similar or superior segmentation outcomes were seen when comparing the results of the proposed method to those of competing segmentation approaches. Conclusions: The proposed automatic segmentation method can accurately and simultaneously segment both the prostate and DILs. The intended future use for this algorithm is focal boost prostate radiation therapy.
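The first stage of a cascaded approach, locating an ROI and restricting later stages to it, can be illustrated with a bounding-box crop. This is a generic sketch rather than the network described above: it assumes a coarse binary mask is already available as a nested list indexed [z][y][x], and the function names and margin parameter are hypothetical.

```python
def bounding_box(mask, margin=0):
    """Return end-exclusive (start, stop) ranges along z, y, x that enclose
    all nonzero voxels, padded by `margin` and clipped to the volume.
    Returns None for an empty mask."""
    coords = [(z, y, x)
              for z, plane in enumerate(mask)
              for y, row in enumerate(plane)
              for x, value in enumerate(row) if value]
    if not coords:
        return None
    shape = (len(mask), len(mask[0]), len(mask[0][0]))
    box = []
    for axis in range(3):
        lo = max(min(c[axis] for c in coords) - margin, 0)
        hi = min(max(c[axis] for c in coords) + 1 + margin, shape[axis])
        box.append((lo, hi))
    return tuple(box)

def crop(volume, box):
    """Extract the sub-volume described by `box` from a [z][y][x] volume."""
    (z0, z1), (y0, y1), (x0, x1) = box
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]
```

Cropping to the ROI lets a second-stage model work at full resolution on a much smaller volume.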
Purpose: To determine the dosimetric effects of rotational errors on target coverage using volumetric modulated arc therapy (VMAT) for multi-target stereotactic radiosurgery (SRS).
Methods and Materials: This retrospective study includes 50 SRS cases, each with 2 intracranial planning target volumes (PTVs). Both PTVs were planned for simultaneous treatment to 21 Gy using a single-isocenter, non-coplanar VMAT SRS technique. Rotational errors of 0.5°, 1.0°, and 2.0° were simulated about all axes. The dose to 95% of the PTV (D95) and the volume covered by 95% of the prescribed dose (V95) were evaluated using multivariate analysis to determine how PTV coverage is related to PTV volume, PTV separation, and rotational error.
Results: At 0.5° rotational error, D95 values and V95 coverage rates were ≥ 95% in all cases. For rotational errors of 1.0°, 7% of targets had D95 and V95 values below 95%. Coverage worsened substantially when the rotational error increased to 2.0°: D95 and V95 values were > 95% for only 63% of the targets. Multivariate analysis showed that PTV volume and distance to isocenter were strong predictors of target coverage.
Conclusions: The effects of rotational errors on target coverage were studied across a broad range of SRS cases. In general, the risk of compromised coverage increases with decreasing target volume, increasing rotational error and increasing distance between targets. Multivariate regression models from this study may be used to quantify the dosimetric effects of rotational errors on target coverage given patient-specific input parameters of PTV volume and distance to isocenter.
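The dependence on distance to isocenter follows from simple geometry: a point at perpendicular distance d from a rotation axis through the isocenter moves along a chord of length 2·d·sin(θ/2) when the patient rotates by θ. The sketch below evaluates that expression. It is a geometric illustration only, not the multivariate regression model from the study, and the function name is hypothetical.

```python
import math

def rotational_shift_mm(distance_mm, angle_deg):
    """Displacement of a point at `distance_mm` from the rotation axis
    after a rotation of `angle_deg` degrees (chord length)."""
    return 2 * distance_mm * math.sin(math.radians(angle_deg) / 2)

# Shifts for a target 50 mm from the axis at the three simulated errors.
shifts = {angle: rotational_shift_mm(50.0, angle) for angle in (0.5, 1.0, 2.0)}
```

For a target 50 mm from the axis, the shift is roughly 0.44 mm at 0.5°, 0.87 mm at 1.0°, and 1.75 mm at 2.0°, so a small target far from the isocenter can lose a meaningful fraction of its coverage.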
This study evaluated the feasibility of using artificial intelligence (AI) segmentation software for volumetric modulated arc therapy (VMAT) prostate planning in conjunction with knowledge-based planning to facilitate a fully automated workflow. Two commercially available AI software programs, Radformation AutoContour (Radformation, New York, NY) and Siemens AI-Rad Companion (Siemens Healthineers, Malvern, PA), were used to auto-segment the rectum, bladder, femoral heads, and bowel bag on 30 retrospective clinical cases (10 intact prostate, 10 prostate bed, and 10 prostate and lymph node). Physician-segmented target volumes were transferred to AI structure sets. In-house RapidPlan models were used to generate plans using the original, physician-segmented structure sets as well as Radformation and Siemens AI-generated structure sets. Thus, there were three plans for each of the 30 cases, totaling 90 plans. Following RapidPlan optimization, planning target volume (PTV) coverage was set to 95%. Then, the plans optimized using AI structures were recalculated on the physician structure set with fixed monitor units. In this way, physician contours were used as the gold standard for identifying any clinically relevant differences in dose distributions. One-way analysis of variance (ANOVA) was used for statistical analysis. No statistically significant differences were observed across the three sets of plans for intact prostate, prostate bed, or prostate and lymph nodes. The results indicate that an automated VMAT prostate planning workflow can consistently achieve high plan quality. However, our results also show that small but consistent differences in contouring preferences may lead to subtle differences in planning results. Therefore, the clinical implementation of auto-contouring should be carefully validated.
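The one-way ANOVA used to compare the three sets of plans reduces to an F statistic: the ratio of between-group to within-group mean squares. A minimal sketch, assuming each group is a list of one dose metric for one plan set; the function name is illustrative, and converting F to a p-value requires the F distribution from a statistics package, which is omitted here.

```python
def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of numeric groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

An F value close to or below 1 indicates that differences between plan sets are no larger than the case-to-case variation within each set.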
Background:
The hippocampus plays a crucial role in memory and cognition. Because of the associated toxicity from whole brain radiotherapy, more advanced treatment planning techniques prioritize hippocampal avoidance, which depends on an accurate segmentation of the small and complexly shaped hippocampus. Purpose: To achieve accurate segmentation of the anterior and posterior regions of the hippocampus from T1 weighted (T1w) MR images, we developed a novel model, Hippo-Net, which uses a mutually enhanced strategy.
Methods:
The proposed model consists of two major parts: 1) a localization model that detects the volume-of-interest (VOI) of the hippocampus, and 2) an end-to-end morphological vision transformer network that performs substructure segmentation within the hippocampus VOI. The substructures include the anterior and posterior regions of the hippocampus, which are defined as the hippocampus proper and parts of the subiculum. The vision transformer incorporates the dominant features extracted from MR images, which are further improved by learning-based morphological operators. The integration of these morphological operators into the vision transformer increases the accuracy and the ability to separate the hippocampus structure into its two distinct substructures.
A total of 260 T1w MRI datasets from the Medical Segmentation Decathlon dataset were used in this study. We conducted a five-fold cross-validation on the first 200 T1w MR images and then performed a hold-out test on the remaining 60 T1w MR images with the model trained on the first 200 images. The segmentations were evaluated in two ways: 1) multiple metrics, including the Dice similarity coefficient (DSC), 95th percentile Hausdorff distance (HD95), mean surface distance (MSD), volume difference (VD), and center-of-mass distance (COMD); and 2) volumetric Pearson correlation analysis.
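The two surface-based metrics can be sketched on point sets. The code below assumes each contour surface is given as a non-empty list of (x, y, z) points in millimetres and uses a brute-force nearest-neighbour search. HD95 conventions differ between toolkits; this version takes the larger of the two directed 95th percentiles with a nearest-rank percentile, which is an assumption rather than the definition used in the study.

```python
import math

def directed_distances(source, target):
    """Distance from each source point to its nearest target point."""
    return [min(math.dist(p, q) for q in target) for p in source]

def mean_surface_distance(a, b):
    """Symmetric mean of nearest-neighbour distances between two surfaces."""
    d = directed_distances(a, b) + directed_distances(b, a)
    return sum(d) / len(d)

def _percentile_nearest_rank(values, pct):
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def hd95(a, b):
    """Larger of the two directed 95th-percentile surface distances."""
    return max(_percentile_nearest_rank(directed_distances(a, b), 95),
               _percentile_nearest_rank(directed_distances(b, a), 95))
```

The brute-force search is quadratic in the number of surface points, which is acceptable for small structures such as the hippocampus but would usually be replaced by a distance transform or a spatial index for large organs.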
Results:
In five-fold cross-validation, the DSCs were 0.900 ± 0.029 and 0.886 ± 0.031 for the hippocampus proper and parts of the subiculum, respectively. The MSDs were 0.426 ± 0.115 mm and 0.401 ± 0.100 mm for the hippocampus proper and parts of the subiculum, respectively. Conclusions: The proposed method showed great promise in automatically delineating hippocampus substructures on T1w MR images. It may facilitate the current clinical workflow and reduce the physicians’ effort.
Radiation therapy aims to control malignant and, less commonly, benign diseases while preserving the surrounding healthy tissues. Standard courses of radiation therapy last up to six weeks, during which time anatomical changes are often anticipated due to tumor shrinkage and the day-to-day variations of organ filling and patient positioning. Historically, clinicians have compensated for these variations by adding generous margins around target volumes to prevent a geometric miss but at the expense of increased radiation dose to the healthy tissues. One alternative is adaptive radiotherapy, where the patient receives customized treatment based on the “anatomy of the day.” This approach reduces the need for large margins by directly accounting for the inter-fraction variations and consequently better spares the healthy tissues. Adaptive radiotherapy has been an active research area for some time and has finally been commercialized and implemented in some radiotherapy clinics, due in large part to machine learning. In this Research Topic “Machine Learning-Based Adaptive Radiotherapy Treatments: From Bench Top to Bedside”, machine learning applications in various stages of the adaptive radiotherapy workflow are covered, including image registration, segmentation, treatment planning, and clinical decision support.