Saturday, July 1, 2017

Week 5: Important Work

This week I have been working on PTM-84. As I mentioned on OpenMRS Talk, the current version cannot match patients across two data sources. It only supports deduplication, that is, matching a single set of patients against itself.

What is required?
My ultimate goal is for the Patient Matching 2.0 module to support incremental patient matching. For that, the module needs to
  • Fetch the new and updated patients by comparing their date created or date changed with the report date
  • Fetch all the other patients (excluding the patients that have already been fetched)
and perform the match between these two data sources.
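
To make the two fetch steps concrete, here is a minimal sketch of the split. It is not the module's code: PatientRecord, isNewOrUpdated and split are hypothetical names I made up for illustration, and I am assuming "new or updated" means the date created or date changed is later than the report date.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class IncrementalSplit {

    // Hypothetical, simplified patient record; NOT the OpenMRS Patient class.
    static class PatientRecord {
        final int id;
        final LocalDate dateCreated;
        final LocalDate dateChanged; // null if the record was never updated

        PatientRecord(int id, LocalDate dateCreated, LocalDate dateChanged) {
            this.id = id;
            this.dateCreated = dateCreated;
            this.dateChanged = dateChanged;
        }
    }

    // Assumption: a record counts as new or updated if it was created
    // or changed strictly after the report date.
    static boolean isNewOrUpdated(PatientRecord p, LocalDate reportDate) {
        return p.dateCreated.isAfter(reportDate)
                || (p.dateChanged != null && p.dateChanged.isAfter(reportDate));
    }

    // Returns two lists: index 0 holds the new/updated patients,
    // index 1 holds all the remaining patients.
    static List<List<PatientRecord>> split(List<PatientRecord> all, LocalDate reportDate) {
        List<PatientRecord> changed = new ArrayList<>();
        List<PatientRecord> unchanged = new ArrayList<>();
        for (PatientRecord p : all) {
            if (isNewOrUpdated(p, reportDate)) {
                changed.add(p);
            } else {
                unchanged.add(p);
            }
        }
        List<List<PatientRecord>> result = new ArrayList<>();
        result.add(changed);
        result.add(unchanged);
        return result;
    }

    public static void main(String[] args) {
        LocalDate reportDate = LocalDate.of(2017, 6, 1);
        List<PatientRecord> all = new ArrayList<>();
        all.add(new PatientRecord(1, LocalDate.of(2017, 1, 10), null));
        all.add(new PatientRecord(2, LocalDate.of(2017, 1, 12), LocalDate.of(2017, 6, 20)));
        all.add(new PatientRecord(3, LocalDate.of(2017, 6, 25), null));

        List<List<PatientRecord>> parts = split(all, reportDate);
        System.out.println("new/updated: " + parts.get(0).size());
        System.out.println("remaining:   " + parts.get(1).size());
    }
}
```

The two lists play the role of the two data sources: the first is matched against itself and against the second, while the second is never matched against itself again.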

Why is this necessary?
The current version is inefficient for implementations with a large number of records.

For instance, suppose we have 10,000 patients in our system and we want to match them on first name and date of birth. The goal is to find the duplicates among them. If we compare every patient with every other patient, that is roughly 50 million comparisons ( 10,000 x (10,000 - 1) / 2 ). If we run the same match a couple of days later, after 90 patients have been added and 10 updated, the current version would still compare everything with everything, and this time it would take about 51 million comparisons ( 10,090 x (10,090 - 1) / 2 ).
The current goal is to perform comparisons only for the records added or updated since that particular match. With such a method, instead of 51 million comparisons we would need only about 1 million [(100 x 99 / 2) + (100 x 9990)]: the 100 changed records are compared among themselves and then against the 9,990 unchanged ones.
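
The arithmetic above can be checked with a few lines of Java. This is only a sketch for the numbers in this example. The class and method names are mine and do not come from the module.

```java
public class ComparisonCount {

    // Deduplication: every record in one set is compared with every other,
    // which gives n(n-1)/2 pairs.
    static long dedupPairs(long n) {
        return n * (n - 1) / 2;
    }

    // Two data sources: every record in the first set is compared
    // with every record in the second set.
    static long crossPairs(long first, long second) {
        return first * second;
    }

    // Incremental match: changed records among themselves,
    // plus changed records against the unchanged ones.
    static long incrementalPairs(long changed, long unchanged) {
        return dedupPairs(changed) + crossPairs(changed, unchanged);
    }

    public static void main(String[] args) {
        System.out.println(dedupPairs(10_000));           // 49995000
        System.out.println(dedupPairs(10_090));           // 50899005
        System.out.println(incrementalPairs(100, 9_990)); // 1003950
    }
}
```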

To do that we need to work with two data sources. Without them, we would only be matching the added or updated patients against themselves (deduplication) and would never compare them with the existing records.

What have I done?
I had to change a couple of methods in MatchingReportUtils.java so that they support two data sources rather than only deduplication.
The methods are
  1. InitScratchTable
  2. CreRanSamAnalyzer
  3. CreAnalFormPairs
  4. CrePairdataSourAnalyzer
  5. ScoringData
  6. CreatingReport
