CyberPDF: Smart and Secure Coordinate-based Automated Health PDF Data Batch Extraction

March 1, 2019

Extracting data from files is a common and often laborious activity in today’s electronic health record systems. When document analysis is repetitive (e.g., processing a series of files with the same layout and extraction requirements), relying on data-entry staff to perform the task manually is costly and insecure. In particular, analyzing a large set of PDF files (a widely used format) to extract specific data and migrate it to other destinations for later use is tedious and frustrating to do by hand. This paper addresses the practical requirement of batch-extracting data from PDF files in health data document analysis and beyond. Specifically, we propose a Coordinate Based Information Extraction System (CBIES) as the basis for a smart, automatic PDF batch data extraction tool that frees health organizations from duplicated effort and reduces labor costs. The proposed technique lets users query a representative PDF document and then swiftly extract the same data from a series of files in batch. Furthermore, because security and privacy considerations are an essential part of any health record system, they are built into our approach. Based on CBIES, we implement a prototype PDF batch data extraction tool named CyberPDF. The tool exhibits high efficiency, security, and accuracy in multi-file data processing.
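To make the coordinate-based idea concrete, the following is a minimal sketch, not the CBIES or CyberPDF implementation, whose internals are not described in this abstract. It assumes that each PDF has already been parsed into words with page coordinates (a step a PDF library would perform), and that the user has marked rectangular regions on one representative document. The names `Word`, `Region`, `extract_fields`, and `extract_batch`, as well as the sample file names and field values, are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word and the page coordinates of its anchor point (assumed input)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Region:
    """A named rectangle selected once on a representative document."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, word: Word) -> bool:
        return self.x0 <= word.x <= self.x1 and self.y0 <= word.y <= self.y1


def extract_fields(words, regions):
    """Return {region name: text found inside that region} for one document."""
    fields = {}
    for region in regions:
        hits = [w for w in words if region.contains(w)]
        # Restore reading order: top to bottom, then left to right.
        hits.sort(key=lambda w: (w.y, w.x))
        fields[region.name] = " ".join(w.text for w in hits)
    return fields


def extract_batch(documents, regions):
    """Apply the same regions to every document that shares the layout."""
    return {doc_id: extract_fields(words, regions)
            for doc_id, words in documents.items()}


# Hypothetical regions, as if chosen on one representative file.
regions = [
    Region("patient_name", 100, 40, 300, 60),
    Region("visit_date", 400, 40, 520, 60),
]

# Hypothetical pre-parsed documents with identical layouts.
documents = {
    "a.pdf": [Word("Name:", 50, 50), Word("Jane", 110, 50),
              Word("Doe", 150, 50), Word("2019-03-01", 410, 50)],
    "b.pdf": [Word("Name:", 50, 50), Word("John", 110, 50),
              Word("Roe", 150, 50), Word("2019-02-14", 410, 50)],
}

results = extract_batch(documents, regions)
print(results)
```

Because the regions are defined once and reused, adding more files of the same layout requires no further user input. The sketch does not cover the security and privacy measures the abstract mentions.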
