Are We Disregarding Privacy Rules Because They Are Hard? Part 3 of 3


Shouldn’t This Be Easier By Now?

Eventually, someone in Information Technology or Database Administration gets asked to extract data from a PHI-rich line-of-business system or data warehouse but deliver it as de-identified data.  Almost any data extraction approach allows data to be masked, redacted, suppressed or even randomized in some way.  This type of functionality can give us data that is de-identified but often useless for testing, analytics or development.

Since my company, The EDI Project™, was founded in 2001, we have been asked many times to de-identify or anonymize data for testing and development work.  Each time, we have written custom code for the project.  That code is never transferable to another customer environment and must be redone for every scenario.  If we were doing this every time, we thought, there had to be other companies with the same problem.

It turns out there are tools on the market that extract data from a line-of-business system or data warehouse and anonymize it so that it is useful and not just de-identified into useless “John Doe” records.

For example, one of the largest integration engines on the market offers this functionality as a $250,000 add-on to its existing, very expensive suite of products.  It is complicated to learn and use, and it needs custom code if multiple systems must be anonymized the same way (e.g., enrollment, eligibility and claims data must have matching but anonymized names and dates of birth).

There are other tools in this space that sniff out vast data stores for PHI and attempt to automagically de-identify the data.  Usually this is a masking or data redaction type of approach, but even when it is not, many fields are marked as “suspect PHI” and left for human review.  I can’t blame them either.  While Patient Name or Date of Birth fields are easy enough to identify, free-form fields can be a nightmare.  Either way, these tools are usually very expensive and often leave the job half done.

There are a lot of cases where certain files, like EDI 837 claims, or maybe an enrollment database have to be de-identified for a test system.  Perhaps it is an ongoing extract of data from a data warehouse for an analytics study.  This is where, most of the time, the work is either not done (exemption granted) or custom code is deployed (expensive and time consuming).  But technology is supposed to be faster, better and cheaper, isn’t it?

Since we are the guys who are often asked to do the work, we looked at our experience in extracting health care data to design a tool we would want to use.  No compromises.  We wanted it to be easy to learn and use, and powerful enough to handle big data environments without becoming a bottleneck in any extraction work.  Finally, it had to anonymize data across multiple sources so that the matching but de-identified data maintained record integrity (i.e., all the records for one patient in the PHI data sources had corresponding records in the de-identified data sources).  Oh yeah – and since the main project being done is already expensive enough, the tool should be inexpensive.
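To make the record integrity requirement concrete, here is a minimal Python sketch of one common way to get matching but de-identified records across sources: derive the replacement values from a keyed hash of the patient identifier.  This is only an illustration under my own assumptions, not a description of how any particular product works.  The field names, the `SECRET_KEY` value, the `anonymize_record` helper and the tiny name pools are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical values for illustration only.
SECRET_KEY = b"replace-with-a-project-secret"
FIRST_NAMES = ["Alex", "Blake", "Casey", "Drew", "Emery", "Finley"]
LAST_NAMES = ["Archer", "Brooks", "Carver", "Dalton", "Ellis", "Foster"]


def _digest(member_id):
    """Keyed hash of a normalized member ID; the same ID always gives the same bytes."""
    normalized = member_id.strip().upper().encode("utf-8")
    return hmac.new(SECRET_KEY, normalized, hashlib.sha256).digest()


def anonymize_record(record):
    """Replace the identifying fields; leave every other field untouched."""
    d = _digest(record["member_id"])
    return {
        **record,
        "member_id": "ANON-" + d.hex()[:12],
        "first": FIRST_NAMES[d[0] % len(FIRST_NAMES)],
        "last": LAST_NAMES[d[1] % len(LAST_NAMES)],
    }


# The same patient as stored in two different source systems.
enrollment_row = {"member_id": "M1001", "first": "Jane", "last": "Smith", "plan": "PPO"}
claim_row = {"member_id": "m1001 ", "first": "JANE", "last": "SMITH", "amount": 125.00}

anon_enrollment = anonymize_record(enrollment_row)
anon_claim = anonymize_record(claim_row)

# Both outputs carry the same anonymized identity, so joins still work.
print(anon_enrollment["member_id"] == anon_claim["member_id"])
```

Because the replacement values depend only on the key and the normalized identifier, the enrollment and claims extracts can be anonymized in separate runs and still line up.  A real name pool would need to be far larger than six entries, since different patients can land on the same fake name.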

People have been using ETL (Extract, Transform, Load) tools for decades and are familiar with how they work.  Thinking about the “T” in “Transform”, a common thing to do would be to change a date from MMDDYYYY format to DDMMYYYY format.  This type of common transformation logic doesn’t have to be rewritten every time you extract from a new source.  The integrator just picks it from a list when doing mapping work.  Anonymizing PHI should be that simple as well.
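As a rough sketch of that “pick it from a list” idea, the Python below keeps reusable transformations in a lookup table and applies them through a field mapping.  The `TRANSFORMS` table, the `apply_mapping` helper and the field names are hypothetical and are not taken from any specific ETL product.

```python
from datetime import datetime


def mmddyyyy_to_ddmmyyyy(value):
    """Reformat a date string from MMDDYYYY to DDMMYYYY."""
    return datetime.strptime(value, "%m%d%Y").strftime("%d%m%Y")


# The "list" the integrator picks from: written once, reused for every source.
TRANSFORMS = {
    "date_mmddyyyy_to_ddmmyyyy": mmddyyyy_to_ddmmyyyy,
    "upper": str.upper,
    "strip": str.strip,
}


def apply_mapping(record, mapping):
    """Apply the named transform to each mapped field; pass other fields through."""
    result = {}
    for field, value in record.items():
        if field in mapping:
            result[field] = TRANSFORMS[mapping[field]](value)
        else:
            result[field] = value
    return result


source_row = {"dob": "12251980", "name": "smith", "plan": "PPO"}
mapping = {"dob": "date_mmddyyyy_to_ddmmyyyy", "name": "upper"}

print(apply_mapping(source_row, mapping))
```

The mapping is just data, so pointing the same transformation at a new source means editing the mapping, not rewriting the logic.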

Functions and drop-downs need to be available to anonymize every kind of PHI and handle it according to the special properties of that type of data.  Names are anonymized differently than zip codes.  More specifically, the anonymization routine for a Date of Birth (DOB) is different from the one for a Date of Service (DOS).  The software should already know those rules, so the integration team or subject matter expert does not have to define them.
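Here is a hedged Python sketch of what type-specific routines could look like.  The rules are illustrative assumptions of mine, not a compliance determination and not the behavior of any particular product: the DOB keeps its year so ages stay roughly right, every DOS for a patient moves by the same number of days so the time between visits is preserved, and the zip code keeps only its first three digits.  All function names, the key and the offsets are hypothetical.

```python
import hashlib
import hmac
from datetime import date, timedelta

SECRET_KEY = b"replace-with-a-project-secret"  # hypothetical key


def _patient_number(patient_id, purpose):
    """Stable per-patient integer, different for each purpose label."""
    msg = f"{purpose}:{patient_id}".encode("utf-8")
    digest = hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def anonymize_dob(patient_id, dob):
    """Keep the birth year; move the date to a patient-specific day in that year."""
    day_offset = _patient_number(patient_id, "dob") % 365  # 0..364
    return date(dob.year, 1, 1) + timedelta(days=day_offset)


def anonymize_dos(patient_id, dos):
    """Move every service date for a patient 1 to 30 days earlier, by the same amount."""
    shift = 1 + _patient_number(patient_id, "dos") % 30
    return dos - timedelta(days=shift)


def anonymize_zip(zip_code):
    """Keep the first three digits and zero out the rest."""
    return zip_code[:3] + "00"


patient = "M1001"
visit_1, visit_2 = date(2015, 3, 1), date(2015, 3, 15)

print(anonymize_dob(patient, date(1980, 12, 25)).year)
print((anonymize_dos(patient, visit_2) - anonymize_dos(patient, visit_1)).days)
print(anonymize_zip("98101"))
```

The point of the sketch is that each data type carries its own rule, so the person doing the mapping only has to say “this field is a DOS” and not how a DOS should be treated.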

As a result, we developed and launched our own Anonymization Engine called “Don’t Redact!™”.  We’re integrators, so we built the tool an integrator would want for getting this done quickly and easily.  Someone who has experience with integration tools can learn it in an afternoon, and your first sizeable anonymization effort can be deployed in a day or so after learning the ropes.

In the spirit of no compromises and disruptive technology, the Don’t Redact!™ Anonymization Engine is $25,000.

While The EDI Project™ is a professional services organization and we would be happy to deploy the software for you or set up your first live anonymized environment, the tool is well thought out and easy enough that you won’t need any services at all.

Want to find out more?  http://theediproject.com/anonymization.html

Part 1: Minimum Necessary or Optional   

Part 2: A False Choice. . . 


Are We Disregarding Privacy Rules Because They Are Hard? Part 1 of 3


Minimum Necessary or Optional?

One of the things that continues to excite me about the world of healthcare informatics is the opportunity to reduce the cost of care while providing better care and better overall outcomes.  Often people think in terms of a zero-sum game, where reducing the cost of care always reduces care and outcomes.  But the promise of technology is that it can make us more efficient; a man can dig a hole faster and to more precise dimensions with a shovel than with his bare hands.


Having the right tool for the right job is important. . . 

 

Much attention has been paid of late to re-admission rates for hospitals.  Hospital stays are expensive, and if a patient is sufficiently recovered from whatever put them there to begin with, they are usually eager to get home to continue recovering in a more familiar environment.  Both parties – the hospital and the patient – often want the stay to end as soon as possible.

But if the patient is released too early, it is always bad news.  At best, they must be re-admitted – often through the emergency room process.  Worse, they could relapse and not make it back to the hospital at all.  Outcomes for patients who are released too early are both worse and more expensive than if they had stayed in the hospital instead of being released.

Certainly, trusting our doctors is a first step, but they are often very busy and under the same pressures to release a patient discussed above.  There are simply too many variables to be perfect at this when practicing medicine.  While experience is a doctor’s most potent weapon, doctors can only draw from the experience available to them.  Patterns do exist, however, that indicate situations where additional caution is warranted when deciding to release.  No one doctor could ever amass enough experience to recognize them all, though.

Today, there are powerful analytic tools available that can take massive amounts of data and sift through looking for patterns that simply would not or could not be seen otherwise.  Rather than take a sample scenario and examine the data to see if that scenario is more likely to result in a readmission, these tools are capable of comparing millions or billions of situations to each other at the same time.  The result is finding co-morbidities or patterns of care that no one could have ever thought to test out on their own.

These types of comparisons were computational fairy tales just a few years ago but can be done today because of advancements in parallel processing.  The bad news is no matter how good the tools are, they are only as good as the data they have to examine in the first place. . . What if no one can get the data?

Minimum Necessary is the process that is defined in the HIPAA regulations:  When using or disclosing protected health information or when requesting protected health information from another covered entity, a covered entity must make reasonable efforts to limit protected health information to the minimum necessary to accomplish the intended purpose of the use, disclosure or request. 

 

Next: Part 2: A False Choice. . .

Part 3: Shouldn’t This Be Easier By Now?