
Business Overview

RedSail Technologies is a leading U.S. provider of pharmacy software solutions with a focus on long-term care. The company’s mission is to improve healthcare and strengthen independent pharmacies by providing affordable, high-quality specialized software.

The client’s flagship product is a SaaS pharmacy management platform that handles patient information, orders, pharmaceutical product transfer and delivery, billing, and electronic prescriptions.

Project Scope

The client came to NIX United for a data analytics solution that could be used internally across the company’s products and also offered to other businesses as a SaaS product.

The NIX team’s scope included the following tasks:

  1. Create robust and secure data storage

  2. Set up fast data processing for future analytics

Solution

The NIX team was responsible for collecting data coming from the flagship product, as well as processing and storing it for further analytics and the development of predictive models.

  01. Data Collection

    The flagship SaaS product is built on an architecture of more than 100 microservices. Its two main data sources are the Apache Kafka streaming platform and an Azure data lake.

    Apache Kafka

    On the backend, Apache Kafka uses the operational PostgreSQL and MongoDB databases as data sources. For data transfer, we used Debezium, an open-source distributed platform for change data capture. The data arrives in Kafka as JSON messages, and each microservice gets its own topic.
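For illustration, a Debezium PostgreSQL connector is typically registered through the Kafka Connect REST API. The minimal sketch below shows such a registration in Python; the connector name, host, credentials, and table list are placeholders, not the client's actual configuration.

```python
import json
import requests

# Minimal Debezium PostgreSQL connector config; all values are placeholders.
connector = {
    "name": "orders-service-connector",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres.internal",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "********",
        "database.dbname": "orders",
        "topic.prefix": "orders",               # prefix for this service's Kafka topics
        "table.include.list": "public.orders",  # tables to capture changes from
    },
}

# Register the connector with a Kafka Connect worker (default REST port 8083).
resp = requests.post(
    "http://kafka-connect:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
```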

    Azure Data Lake

    Next, we used Apache Kafka as the data source for the Azure data lake. Using custom PySpark scripts, we set up a data stream from Kafka that is converted into Parquet files and stored in the data lake.
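A simplified version of such a PySpark streaming job might look like the following; the broker address, topic name, message schema, and storage paths are illustrative assumptions, not the production values.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-to-datalake").getOrCreate()

# Illustrative schema for one microservice's JSON messages.
schema = StructType([
    StructField("id", StringType()),
    StructField("payload", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read the raw JSON stream from the service's Kafka topic.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")  # placeholder broker
    .option("subscribe", "orders")                    # one topic per microservice
    .load()
)

# Kafka message values are binary; cast to string and parse the JSON into columns.
parsed = raw.select(
    from_json(col("value").cast("string"), schema).alias("data")
).select("data.*")

# Write the stream as Parquet files into the Azure data lake, with checkpointing.
query = (
    parsed.writeStream.format("parquet")
    .option("path", "abfss://lake@account.dfs.core.windows.net/raw/orders")
    .option("checkpointLocation", "abfss://lake@account.dfs.core.windows.net/checkpoints/orders")
    .start()
)
query.awaitTermination()
```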

  02. ETL and Data Warehouse

    As mentioned above, the NIX team used PySpark scripts to process streaming data from Apache Kafka. From Kafka, raw data flows in two directions: temporary PostgreSQL storage and the data lake.

    Data from all sources is stored in PostgreSQL tables and as Parquet files in the Azure data lake. For orchestration, we used Apache Airflow, with SQL functions extracting and processing the raw data: removing duplicates and empty rows before saving the cleaned data to permanent storage.
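A minimal Airflow DAG along these lines might look like the sketch below; the DAG id, connection id, schedule, and SQL function names are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

# Illustrative DAG: run cleanup SQL functions, then load into permanent storage.
with DAG(
    dag_id="raw_data_cleanup",        # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule="@hourly",               # assumed cadence
    catchup=False,
) as dag:
    dedupe = SQLExecuteQueryOperator(
        task_id="remove_duplicates_and_empty_rows",
        conn_id="warehouse_postgres",      # placeholder Airflow connection
        sql="SELECT clean_raw_orders();",  # hypothetical SQL function
    )
    persist = SQLExecuteQueryOperator(
        task_id="save_to_permanent_storage",
        conn_id="warehouse_postgres",
        sql="SELECT load_orders_to_warehouse();",  # hypothetical SQL function
    )

    dedupe >> persist
```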

  03. Legacy Data Processing and Maintenance

    The NIX team was also responsible for processing and transforming legacy data from the client’s claim processing system, which typically handles 1.3 million claims per day.

    The data was originally compressed and encrypted, so we decoded it in Python using a prebuilt Java package and then loaded it into the database.
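One common way to call a prebuilt Java package from Python is a JVM bridge such as JPype. The sketch below assumes a hypothetical decoder class and jar path, since the write-up does not name the actual package.

```python
import jpype
import jpype.imports

# Start the JVM with the decoder jar on the classpath (path is a placeholder).
jpype.startJVM(classpath=["libs/claims-decoder.jar"])

# Hypothetical Java class that decompresses and decrypts a raw claim record.
from com.example.claims import ClaimDecoder  # assumed package/class name

decoder = ClaimDecoder()

def decode_claim(raw_bytes: bytes) -> str:
    # Returns the claim as plain text (e.g. JSON), ready for database loading.
    return str(decoder.decode(raw_bytes))
```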

Outcome

The client received a robust data engineering and analytics solution that enables real-time reporting. All data generated within the company now serves as the bedrock for analytics.

Moreover, the client intends to offer this solution as a SaaS product, opening up promising new opportunities for stakeholders.


Team:

12 experts (1 Project Manager, 8 Data Engineers, 3 QA Engineers)

Tech Stack:

Azure, Apache Kafka, Apache Spark, Apache Airflow, PostgreSQL, MongoDB

 
