PDF to DICOM Converter
A Python package for converting PDF files to Encapsulated PDF DICOM or to DICOM RGB Secondary Capture.
The package is available on PyPI and can be installed via pip:
pip install pdf2dcm
To check the installation, print the version number of the pdf2dcm package:
python -c 'import pdf2dcm; print(pdf2dcm.__version__)'
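The same check can be done from a script without raising an error when the package is missing. This is a minimal sketch using only the standard library; the helper name installed_version is hypothetical and not part of pdf2dcm.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if pdf2dcm is installed, otherwise None.
print(installed_version("pdf2dcm"))
```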
Poppler is a popular project that is used for the creation of DICOM RGB Secondary Capture files. You can check whether it is already installed by calling pdftoppm -h
in your terminal/cmd. Some recommended ways to install Poppler:
Conda
conda install -c conda-forge poppler
Ubuntu
sudo apt-get install poppler-utils
MacOS
brew install poppler
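Whether Poppler is reachable can also be checked from Python before attempting an RGB conversion. This is a minimal sketch that only looks for the pdftoppm executable on PATH; poppler_available is a hypothetical helper, not part of pdf2dcm.

```python
import shutil


def poppler_available(executable: str = "pdftoppm") -> bool:
    """Return True if the given Poppler executable is found on PATH."""
    return shutil.which(executable) is not None


if not poppler_available():
    print("pdftoppm not found on PATH; install Poppler first")
```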
from pdf2dcm import Pdf2EncapsDCM
converter = Pdf2EncapsDCM()
converted_dcm = converter.run(
    path_pdf='tests/test_data/test_file.pdf',
    path_template_dcm='tests/test_data/CT_small.dcm',
    suffix=".dcm",
)
print(converted_dcm)
# [ 'tests/test_data/test_file.dcm' ]
Parameters of converter.run:
- path_pdf (str): path of the PDF that needs to be encapsulated.
- path_template_dcm (str, optional): path to the template DICOM used for repersonalisation of the data.
- suffix (str, optional): suffix of the DICOM files. Defaults to ".dcm".
Returns:
- List[Path]: list of paths of the stored encapsulated DICOM files.
from pdf2dcm import Pdf2RgbSC
converter = Pdf2RgbSC()
converted_dcm = converter.run(
    path_pdf='tests/test_data/test_file.pdf',
    path_template_dcm='tests/test_data/CT_small.dcm',
    suffix=".dcm",
)
print(converted_dcm)
# [ 'tests/test_data/test_file_0.dcm', 'tests/test_data/test_file_1.dcm' ]
Parameters of converter.run:
- path_pdf (str): path of the PDF that needs to be converted.
- path_template_dcm (str, optional): path to the template DICOM used for repersonalisation of the data.
- suffix (str, optional): suffix of the DICOM files. Defaults to ".dcm".
Returns:
- List[Path]: list of paths of the stored secondary capture DICOM files.
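Both converters are called through run() with the same parameters, so a folder of PDFs can be converted with a small loop. This is a minimal sketch that relies only on the run() parameters documented here; convert_folder is a hypothetical helper, not part of pdf2dcm, and it accepts any object with a compatible run() method.

```python
from pathlib import Path
from typing import List, Optional


def convert_folder(converter, folder: str,
                   path_template_dcm: Optional[str] = None,
                   suffix: str = ".dcm") -> List[Path]:
    """Run converter.run() on every .pdf in folder and collect the returned paths."""
    outputs: List[Path] = []
    for pdf in sorted(Path(folder).glob("*.pdf")):
        kwargs = {"path_pdf": str(pdf), "suffix": suffix}
        # Only forward the template when one was given, since it is optional.
        if path_template_dcm is not None:
            kwargs["path_template_dcm"] = path_template_dcm
        outputs.extend(converter.run(**kwargs))
    return outputs


# Usage (requires pdf2dcm to be installed):
# from pdf2dcm import Pdf2EncapsDCM
# print(convert_folder(Pdf2EncapsDCM(), "tests/test_data"))
```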
- The name of the output DICOM is the same as the name of the input PDF.
- If no template is provided, no repersonalisation takes place.
- It is possible to produce DICOMs without a suffix by simply passing suffix="" to converter.run().
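The naming rules in these notes, together with the example outputs shown earlier, suggest where the output files land. The sketch below only predicts names from those examples. expected_outputs is a hypothetical helper, and the _0, _1 page-index pattern for secondary captures is inferred from the sample output rather than taken from the package's code.

```python
from pathlib import Path
from typing import List, Optional


def expected_outputs(path_pdf: str, suffix: str = ".dcm",
                     n_pages: Optional[int] = None) -> List[Path]:
    """Predict output paths: one file for an encapsulated PDF (n_pages=None),
    or one file per page for an RGB secondary capture."""
    pdf = Path(path_pdf)
    if n_pages is None:
        return [pdf.parent / f"{pdf.stem}{suffix}"]
    return [pdf.parent / f"{pdf.stem}_{i}{suffix}" for i in range(n_pages)]


print(expected_outputs("tests/test_data/test_file.pdf"))
print(expected_outputs("tests/test_data/test_file.pdf", suffix="", n_pages=2))
```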
Repersonalisation is the process of copying data regarding the identity of the encapsulated PDF from a template DICOM. Currently, the fields that are repersonalised by default are:
- PatientName
- PatientID
- PatientSex
- StudyInstanceUID
The fields SeriesInstanceUID and SOPInstanceUID are no longer repersonalised by copying, as copying them violates the DICOM standard.
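Conceptually, repersonalisation copies the listed attributes from the template onto the new file. The sketch below illustrates the idea with plain dictionaries standing in for DICOM datasets. It is not the package's implementation: repersonalise, the sample values, and the choice to skip fields missing from the template are all assumptions made for illustration.

```python
from typing import Dict, Iterable

# Default fields as listed in this README.
DEFAULT_FIELDS = ["PatientName", "PatientID", "PatientSex", "StudyInstanceUID"]


def repersonalise(target: Dict[str, str], template: Dict[str, str],
                  fields: Iterable[str] = DEFAULT_FIELDS) -> Dict[str, str]:
    """Return a copy of target with the given fields copied from template.
    Fields missing from the template are skipped."""
    result = dict(target)
    for name in fields:
        if name in template:
            result[name] = template[name]
    return result


template = {"PatientName": "DOE^JANE", "PatientID": "12345", "SOPInstanceUID": "1.2.3"}
new_file = {"PatientName": "", "SOPInstanceUID": "9.8.7"}
print(repersonalise(new_file, template))
# {'PatientName': 'DOE^JANE', 'SOPInstanceUID': '9.8.7', 'PatientID': '12345'}
```

Note that SOPInstanceUID keeps the new file's own value, since it is not among the copied fields.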
You can set the fields to repersonalise by passing repersonalisation_fields into Pdf2EncapsDCM() or Pdf2RgbSC().
Example:
from pdf2dcm import Pdf2RgbSC

fields = [
    "PatientName",
    "PatientID",
    "PatientSex",
    "StudyInstanceUID",
    "AccessionNumber",
]
converter = Pdf2RgbSC(repersonalisation_fields=fields)
Note: passing repersonalisation_fields overwrites the default fields.
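Because the passed list replaces the defaults, the defaults have to be repeated when the goal is only to add a field. This is a minimal sketch; BASE_FIELDS is a local copy of the default list given in this README, not a constant exported by pdf2dcm, and with_extra_fields is a hypothetical helper.

```python
# Local copy of the default fields listed in this README (an assumption:
# pdf2dcm is not known to export such a constant).
BASE_FIELDS = ["PatientName", "PatientID", "PatientSex", "StudyInstanceUID"]


def with_extra_fields(*extra: str) -> list:
    """Return the default fields plus any extras, in order and without duplicates."""
    return list(dict.fromkeys([*BASE_FIELDS, *extra]))


fields = with_extra_fields("AccessionNumber", "PatientID")
print(fields)
# ['PatientName', 'PatientID', 'PatientSex', 'StudyInstanceUID', 'AccessionNumber']

# Usage (requires pdf2dcm to be installed):
# converter = Pdf2RgbSC(repersonalisation_fields=fields)
```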