A Preliminary Study of the Progression of DICOM Object Conformance
and Metadata Usage in the Last Twenty Years |
| |
| Authors: |
| Dongbai Guo, PhD, Oracle Corporation |
| |
| Hypothesis: |
| With the worldwide adoption of the DICOM standard over the last 20 years, we expect the quality of DICOM objects to improve as knowledge of the standard deepens and software toolkits become more widely distributed. If we quantify the quality of a group of DICOM objects by the percentage of objects conforming to the standard and by the average number of metadata attribute tags used per object, we should see an improving trend over the last 20 years. |
| |
| Introduction: |
| Knowledge of the quality of DICOM objects is important for quality assurance and for data and application integration in a healthcare enterprise. It is also important when designing large-scale enterprise medical imaging software. Although there is much anecdotal evidence of good and bad DICOM objects, there has been no study of the quality of DICOM objects over an extended period. In this work, we collected a large number of DICOM images and quantitatively measured the changes in DICOM object quality over the last 20 years, hoping to draw a definitive conclusion on the progression of metadata quality. |
| |
| Methods: |
| We collected 73,681 distinct DICOM objects (30.15 GB in total) from both public and private sources. We set up conformance validation rules specifying a list of mandatory DICOM attributes that every DICOM object must include. The list covered the most common data elements from the DICOM SOP Common, Study, and Series modules. We also extracted all metadata from every DICOM object, classified the data elements as standard or private, and counted the occurrences of each class. We grouped all images by their image-capture year and studied the trend over the last two decades.
The program ran on a PC with two 3.00 GHz CPUs and 2 GB of memory running the Linux operating system. The software package we used was Oracle 11.1 Multimedia DICOM. The total time to collect the statistics was 1946 seconds. |
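The validation and classification steps described above can be sketched in a few lines of Python. This is only an illustration, not the Oracle Multimedia DICOM rule set used in the study: the list of mandatory attributes is an assumption (the study does not enumerate its rules), and the function names are hypothetical. The sketch relies on the DICOM convention that private data elements carry an odd group number.

```python
# Assumed mandatory attributes, drawn from the SOP Common, General Study,
# and General Series modules. The study's actual list is not published here.
MANDATORY_TAGS = {
    (0x0008, 0x0016): "SOPClassUID",
    (0x0008, 0x0018): "SOPInstanceUID",
    (0x0020, 0x000D): "StudyInstanceUID",
    (0x0020, 0x000E): "SeriesInstanceUID",
    (0x0008, 0x0060): "Modality",
}


def is_private(tag):
    """A data element is private when its group number is odd."""
    group, _element = tag
    return group % 2 == 1


def missing_mandatory(tags):
    """Return the names of assumed-mandatory attributes absent from `tags`."""
    present = set(tags)
    return sorted(name for tag, name in MANDATORY_TAGS.items()
                  if tag not in present)


def summarize(tags):
    """Count standard and private elements and check conformance.

    `tags` is a list of (group, element) pairs taken from one DICOM object.
    """
    private = sum(1 for tag in tags if is_private(tag))
    missing = missing_mandatory(tags)
    return {
        "total": len(tags),
        "standard": len(tags) - private,
        "private": private,
        "conformant": not missing,
        "missing": missing,
    }


# Hypothetical object: four standard elements, one private element,
# and no SeriesInstanceUID.
example_tags = [
    (0x0008, 0x0016),
    (0x0008, 0x0018),
    (0x0008, 0x0060),
    (0x0020, 0x000D),
    (0x0029, 0x0010),
]
summary = summarize(example_tags)
```

For the hypothetical object above, the summary reports one private element and flags the object as non-conformant because SeriesInstanceUID is missing.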
| |
| Results: |
| Of all images, 308 (0.42% of the total) did not contain any date information and were excluded from this study. Most of those images had been anonymized; the rest contained incomplete information. Another 251 images (0.34% of the total) could not be analyzed because of parsing errors, and further study is required to categorize them. The remaining 73,122 images (99.24% of the total) were processed by the program described above. The results are displayed in Table 1. The first column of this table shows the year in which a DICOM image was captured. The second column shows the total number of images captured that year. The third column shows the number of invalid images in that year (those that failed conformance validation because they are missing one or more basic DICOM data elements). The fourth column shows the average number of data elements (both standard and private) in these images. The fifth column shows the average number of standard data elements in these images.

Table 1. DICOM Quality and Metadata Usage Statistics |
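As a rough illustration of how the five columns described for Table 1 could be derived from per-object measurements, the following Python sketch groups records by capture year. The record layout and the function name are assumptions made for this example, and the sample values are invented for demonstration only; they are not the study's data.

```python
from collections import defaultdict


def table_rows(records):
    """Aggregate per-object records into one row per capture year.

    Each record is an assumed tuple:
    (year, conformant, total_elements, standard_elements).
    """
    by_year = defaultdict(list)
    for year, conformant, total, standard in records:
        by_year[year].append((conformant, total, standard))

    rows = []
    for year in sorted(by_year):
        group = by_year[year]
        n = len(group)
        rows.append({
            "year": year,                                   # column 1
            "images": n,                                    # column 2
            "invalid": sum(1 for c, _, _ in group if not c),  # column 3
            "avg_elements": sum(t for _, t, _ in group) / n,  # column 4
            "avg_standard": sum(s for _, _, s in group) / n,  # column 5
        })
    return rows


# Invented sample records, for demonstration only.
sample = [
    (2004, True, 100, 80),
    (2004, False, 60, 50),
    (2005, True, 120, 90),
]
rows = table_rows(sample)
```

Each returned dictionary corresponds to one table row, so the invented 2004 records collapse into a single row with two images, one of them invalid.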
| |
| Discussion: |
| We plotted the total number of images captured each year on a logarithmic scale in Figure 1. Because the sample data set was not collected from a single operating clinical enterprise, the curve is not as smooth as one obtained from the image archive of a typical healthcare organization. For example, there are fewer images from 2007 and 2008 than from the preceding years. Nevertheless, the plot supports earlier findings that DICOM content grows exponentially. Our sample data set indicates that the total number of medical images doubled every 2.18 years (with an R-squared value of 0.549) over the last 20 years.

Figure 1.
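One common way to estimate a doubling time from yearly counts is to fit a straight line to the base-2 logarithm of the counts; the reciprocal of the slope is then the doubling time in years. The Python sketch below shows that approach under the assumption of an ordinary least-squares fit. The abstract does not state which fitting procedure was used, the function names are hypothetical, and the sample counts are invented.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x, with R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


def doubling_time(years, counts):
    """Estimate the doubling time in years from yearly image counts.

    Fits log2(count) against year; a slope of b doublings per year
    corresponds to a doubling time of 1 / b years.
    """
    logs = [math.log2(c) for c in counts]
    slope, _intercept, r_squared = fit_line(years, logs)
    return 1.0 / slope, r_squared


# Invented counts that double exactly every two years.
years = [2000, 2002, 2004, 2006]
counts = [100, 200, 400, 800]
t_double, r2 = doubling_time(years, counts)
```

With the invented counts, the estimate comes out at two years with an R-squared of essentially 1. Real data with gaps, such as the lower counts noted for 2007 and 2008, would lower the R-squared value.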
Figure 2 shows the progression of the number of DICOM data elements embedded within a DICOM object, starting from 1992. We did not plot year 1991, since there is only a single non-conformant image from that year within our data set. This figure clearly shows the growing use of standard data elements in the last 20 years. However, the total number of data elements (both standard and private) does not exhibit a definitive trend.

Figure 2.
We also studied the reasons for the conformance validation failures in the sample data set. Non-conformance appears to occur in the early phase of DICOM adoption. For example, the four images from 2004 that failed validation came from the dental community, and 2004 was the year the DICOM standard was introduced to dental imaging. Images from the same source in the following year contained no such non-conformant content.
This is a preliminary study. Further work, such as building a large set of conformance constraints that measure different levels of conformance or even modality-specific conformance, may yield more insightful knowledge of the progression of DICOM metadata quality. A bigger sample data set with more coverage will certainly lead to better precision in quality measurement. |
| |
| Conclusion: |
| There is a linear increase of 2.17 standard data elements per DICOM object per year (with an R-squared value of 0.58), which indicates increasing adoption and usage of standard metadata. The quality of DICOM objects has improved: the percentage of non-conformant DICOM objects dropped from 0.22% in the 1990s to 0.017% in the 2000s. However, the total number of data elements fluctuates because the number of private data elements varies, which we speculate might be related to the introduction of new imaging modalities that had not yet been incorporated into the DICOM standard. Further studies are necessary to explain the fluctuating numbers of private data elements. |
| |
| References: |
The DICOM standard http://medical.nema.org
Oracle Multimedia DICOM Developer’s Guide 11g Release 1 (11.1) Part Number B28416-03, http://download.oracle.com/docs/cd/B28359_01/appdev.111/b28416/toc.htm |