Hardware-Accelerated On-Demand Rigid
and Nonrigid 3D Image Registration
 
Authors:
Raj Shekhar, PhD, University of Maryland School of Medicine; William L. Plishker, PhD; Reza M. Rad, PhD
 
Background:
Three-dimensional (3D) image registration is fundamental to various medical imaging procedures. It is a prerequisite for comparing serial images in longitudinal studies that quantify disease progression or regression, often in response to a treatment. It is also a necessary first step in multimodality image fusion before images from two or more modalities with complementary information can be meaningfully overlaid. An emerging application of image registration is in image-guided therapies, in which higher-resolution, higher-quality pretreatment images typically need to be registered with lower-resolution, lower-quality intratreatment images. In all of these instances, the images are misregistered because they are usually acquired at different times, separated by hours to years, with the patient in different body orientations.

Decades of research have led to the development of accurate, reliable, and fully automated image registration algorithms, the most recognized of which is intensity-based registration by maximizing mutual information (MI) between a pair of images. Although MI-based registration is applicable to a range of modalities and organs (rigid or nonrigid) and is therefore quite versatile, its slow execution continues to limit its full clinical potential.
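To make the similarity measure concrete, the following is a minimal software sketch of estimating MI from the joint intensity histogram of two images. It is illustrative only and is not the hardware implementation described here; the function name, the flattened-list image representation, and the `bins` and `max_value` parameters are assumptions made for the example.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=32, max_value=256):
    """Estimate MI (in bits) between two equally sized lists of intensities.

    `a` and `b` are flattened images with values in [0, max_value).
    `bins` and `max_value` are illustrative choices, not values from the paper.
    """
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and of equal size")

    width = max_value / bins
    # Quantize each intensity into a histogram bin index.
    qa = [min(int(v / width), bins - 1) for v in a]
    qb = [min(int(v / width), bins - 1) for v in b]

    n = len(a)
    joint = Counter(zip(qa, qb))  # joint histogram
    hist_a = Counter(qa)          # marginal histogram of image a
    hist_b = Counter(qb)          # marginal histogram of image b

    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        p_i = hist_a[i] / n
        p_j = hist_b[j] / n
        mi += p_ij * math.log2(p_ij / (p_i * p_j))
    return mi


if __name__ == "__main__":
    image = [0, 64, 128, 192] * 16
    # An image compared with itself yields its own entropy.
    print(mutual_information(image, image, bins=4))
```

A registration algorithm of this kind would evaluate such a measure repeatedly while searching over transformation parameters, which is why the cost of each evaluation dominates the total execution time.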

We present here a hardware-accelerated implementation of MI-based image registration that has reduced the time of registration from hours to less than a minute. Our implementation executes a well-known and well-validated nonrigid image registration algorithm[1] on three field-programmable gate array (FPGA) chips, resident on a single, commercially available add-on PC board. Because rigid registration is the first step in nonrigid registration, our implementation is able to compute rigid registration as well. Highly optimized pipelined and parallel computing along with a custom parallel memory architecture that significantly reduces data transfer overhead have helped achieve over 100-fold acceleration. The implementation has been optimized to produce results comparable to those from an equivalent software-only implementation.[2]

 
Evaluation:
We evaluated our three-FPGA implementation on three separate image databases. The first database, from Vanderbilt University, included 10 pairs of brain computed tomography (CT) and magnetic resonance (MR) images and tested the rigid registration capability. For testing nonrigid image registration, we used two in-house image databases with validation markups that we had developed earlier for the validation of our nonrigid image registration algorithms.[3,4] The first of these two databases included 10 pairs of whole-body positron emission tomography (PET) and CT images acquired on separate (not hybrid PET/CT) scanners. The second included 10 pairs of exhale and inhale CT scans of the lung or the abdomen.

For characterization of speed, images were resized to a 256 x 256 x 256 matrix. Often this meant upsampling in the slice direction and cropping and/or downsampling slices to have 256 x 256 within-slice voxels. This resampling did not affect the registration accuracy significantly, because the registration accuracy is decided, to a large degree, by the largest voxel dimension in the original data. Our achievement of subvoxel accuracy reported below confirms this. The time of image registration was < 1 min (see Table 1).
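The arithmetic behind this resizing can be sketched as follows. The example assumes pure resampling with an unchanged field of view (no cropping), and the image dimensions and voxel spacings are hypothetical values chosen for illustration, not data from the study.

```python
def resampled_spacing(shape, spacing, target=(256, 256, 256)):
    """Voxel spacing (mm) after resizing `shape` to `target`.

    Assumes the physical field of view is preserved along each axis,
    i.e. resampling only, with no cropping.
    """
    return tuple(n * s / t for n, s, t in zip(shape, spacing, target))


if __name__ == "__main__":
    # Hypothetical CT volume: 512 x 512 x 64 voxels at 0.5 x 0.5 x 4.0 mm.
    shape = (512, 512, 64)
    spacing = (0.5, 0.5, 4.0)

    print(resampled_spacing(shape, spacing))  # spacing of the 256^3 volume
    print(max(spacing))                       # largest original voxel dimension
```

In this hypothetical case the slices are downsampled and the slice direction is upsampled. The upsampled slices are interpolated and add no new information, so the original 4.0 mm slice spacing remains the limiting resolution.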

Table 1. Time of image registration

We assessed registration accuracy by comparing our results with the reference results. For the Vanderbilt brain images, the reference results were, in fact, the ground truth determined from known fiducial markers (although erased from the images). For the two nonrigid image registration databases, no ground truth existed. For these images, we treated a previously validated, albeit slower, single-FPGA implementation as providing the reference solution. No significant difference was seen in the results of the two implementations (see Table 2). In each case, we normalized the observed difference by the largest voxel dimension (often the slice spacing) of the images in the original resolution prior to any resizing. This presented a unified approach to assess accuracy in spite of variations in voxel dimensions.
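The normalization can be illustrated with a short sketch. It assumes the observed difference is measured as the Euclidean distance, in millimeters, between a position mapped by the evaluated registration and the same position mapped by the reference solution; the text does not specify how differences were measured or aggregated, so the function name, point coordinates, and spacings below are hypothetical.

```python
import math


def normalized_error(evaluated_point, reference_point, original_spacing):
    """Registration difference expressed in units of the largest voxel dimension.

    Points are (x, y, z) positions in mm; `original_spacing` is the voxel
    size (mm) of the image at its original resolution, before any resizing.
    """
    distance_mm = math.dist(evaluated_point, reference_point)
    return distance_mm / max(original_spacing)


if __name__ == "__main__":
    # Hypothetical values: a 5 mm difference with 4 mm slice spacing.
    print(normalized_error((10.0, 20.0, 30.0), (10.0, 23.0, 34.0), (0.5, 0.5, 4.0)))
```

A value below 1 means the difference is smaller than the largest voxel dimension of the original image, which allows results from images with different voxel sizes to be compared on one scale.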

Table 2. Accuracy of image registration

 
Discussion:
The time of image registration is a function of image size, the degree of starting misalignment, and the mode of image registration (rigid versus nonrigid). We resized the images to a 256 x 256 x 256 voxel matrix to remove size dependency in reporting the time of registration. The resulting voxel count is comparable to that in most clinical imaging exams and, therefore, the reported times represent realistic speed estimates in practice. Overall, our implementation allowed rigid image registration in < 1/2 min and nonrigid image registration in < 1 min.

Hardware-accelerated execution did not compromise the quality of image registration. A normalized registration accuracy of < 1 represents achievement of subvoxel accuracy (i.e., the images are registered with an error smaller than the largest voxel dimension). A value of 1, conversely, represents voxel-order accuracy. Subvoxel accuracy was achieved for the first two databases. For the exhale-inhale CT database, the normalized registration accuracy was voxel-order. Because most registration algorithms aim to achieve subvoxel or voxel-order accuracies, our results indicate that we retained registration accuracy with the hardware implementation.

Finally, the use of images from different modalities and of different organs demonstrates the general-purpose nature of our image registration technology. We have incorporated the developed technology as a plug-in into OsiriX, a well-known medical image visualization application. OsiriX thus serves as the front end of our technology. Our successful OsiriX integration indicates that the described technology can be integrated into most picture archiving and communication systems (PACS) to give radiologists easy access to the high-speed image registration capability described here.

 
Conclusion:
We have developed and presented an accurate, hardware-accelerated image registration technology capable of completing most image registration tasks in < 1 min. Furthermore, the technology is compact (it comes in the form of an add-on PC board) and can be easily integrated into a PACS for enterprise-wide use. The subminute speed and easy enterprise-wide access have the potential to enable on-demand image registration in a broad range of image-based diagnostic and therapeutic applications.
 
References:
[1] Walimbe V, Shekhar R. Automatic elastic image registration by interpolation of 3D rotations and translations from discrete rigid-body transformations. Medical Image Analysis. 2006;10(6):899-914.

[2] Dandekar O, Shekhar R. FPGA-accelerated deformable registration for improved target-delineation during CT-guided interventions. IEEE Trans on Biomedical Circuits and Systems. 2007;1(2):116-127.

[3] Shekhar R, Walimbe V, Raja S, et al. Automated 3-dimensional elastic registration of whole-body PET and CT from separate or combined scanners. Journal of Nuclear Medicine. 2005;46(9):1488-1496.

[4] Shekhar R, Lei P, Castro-Pareja CR, Plishker WL, D’Souza WD. Automatic segmentation of phase-correlated CT scans through nonrigid image registration using geometrically regularized free-form deformation. Medical Physics. 2007;34(7):3054-3066.