
Business Overview

RedSail Technologies is a leading U.S. provider of pharmacy software solutions with a focus on long-term care. The company’s mission is to improve healthcare and strengthen independent pharmacies by providing affordable, high-quality specialized software.

The client’s flagship product is a SaaS pharmacy management platform that handles patient information, orders, pharmaceutical product transfer and delivery, billing, and electronic prescriptions.

Project Scope

The client came to NIX United for a data analytics solution that could be used internally across the company’s products and also offered to other businesses as a SaaS product.

The NIX team’s scope included the following tasks:

  1. Create robust and secure data storage

  2. Set up fast data processing for future analytics

Solution

The NIX team was responsible for collecting data coming from the flagship product, as well as processing and storing it for further analytics and the development of predictive models.

  01. Data Collection

    The flagship SaaS product is built on an architecture of more than 100 microservices. Its two main data sources are the Apache Kafka streaming platform and an Azure data lake.

    Apache Kafka

    On the backend, Apache Kafka uses the operational PostgreSQL and MongoDB databases as data sources. For data transfer, we used Debezium, an open-source distributed platform for change data capture. The data arrives in Kafka as JSON messages, and each microservice gets its own topic.
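For illustration, a Debezium PostgreSQL connector is typically registered through the Kafka Connect REST API. The minimal sketch below shows such a registration in Python; the connector name, host, credentials, and table list are placeholders, not the client's actual configuration.

```python
import json
import requests

# Minimal Debezium PostgreSQL connector config; all values are placeholders.
connector = {
    "name": "orders-service-connector",  # hypothetical connector name
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres.internal",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "********",
        "database.dbname": "orders",
        "topic.prefix": "orders",               # prefix for this service's Kafka topics
        "table.include.list": "public.orders",  # tables to capture changes from
    },
}

# Register the connector with a Kafka Connect worker (default REST port 8083).
resp = requests.post(
    "http://kafka-connect:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
```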

    Azure Data Lake

    Next, we used Apache Kafka as the data source for the Azure data lake. Using custom PySpark scripts, we set up a data stream from Kafka that is converted into Parquet files and stored in the data lake.
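A simplified version of such a PySpark streaming job might look like the following; the broker address, topic name, message schema, and storage paths are illustrative assumptions, not the production values.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import from_json, col
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-to-datalake").getOrCreate()

# Illustrative schema for one microservice's JSON messages.
schema = StructType([
    StructField("id", StringType()),
    StructField("payload", StringType()),
    StructField("updated_at", TimestampType()),
])

# Read the raw JSON stream from the service's Kafka topic.
raw = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "kafka:9092")  # placeholder broker
    .option("subscribe", "orders")                    # one topic per microservice
    .load()
)

# Kafka message values are binary; cast to string and parse the JSON into columns.
parsed = raw.select(
    from_json(col("value").cast("string"), schema).alias("data")
).select("data.*")

# Write the stream as Parquet files into the Azure data lake, with checkpointing.
query = (
    parsed.writeStream.format("parquet")
    .option("path", "abfss://lake@account.dfs.core.windows.net/raw/orders")
    .option("checkpointLocation", "abfss://lake@account.dfs.core.windows.net/checkpoints/orders")
    .start()
)
query.awaitTermination()
```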

  02. ETL and Data Warehouse

    As mentioned above, the NIX team used PySpark scripts to process streaming data from Apache Kafka. From Kafka, raw data flows in two directions: temporary PostgreSQL storage and the data lake.

    Data from all sources is stored in PostgreSQL tables and as Parquet files in the Azure data lake. For orchestration, we used Apache Airflow, with SQL functions extracting and processing the raw data: removing duplicates and empty rows before saving the cleaned data to permanent storage.
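A minimal Airflow DAG along these lines might look like the sketch below; the DAG id, connection id, schedule, and SQL function names are hypothetical.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator

# Illustrative DAG: run cleanup SQL functions, then load into permanent storage.
with DAG(
    dag_id="raw_data_cleanup",        # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule="@hourly",               # assumed cadence
    catchup=False,
) as dag:
    dedupe = SQLExecuteQueryOperator(
        task_id="remove_duplicates_and_empty_rows",
        conn_id="warehouse_postgres",      # placeholder Airflow connection
        sql="SELECT clean_raw_orders();",  # hypothetical SQL function
    )
    persist = SQLExecuteQueryOperator(
        task_id="save_to_permanent_storage",
        conn_id="warehouse_postgres",
        sql="SELECT load_orders_to_warehouse();",  # hypothetical SQL function
    )

    dedupe >> persist
```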

  03. Legacy Data Processing and Maintenance

    The NIX team was also responsible for processing and transforming legacy data from the client’s claim processing system, which typically handles 1.3 million claims per day.

    The data was originally compressed and encrypted, so we decoded it in Python using a prebuilt Java package and then loaded it into the database.
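One common way to call a prebuilt Java package from Python is a JVM bridge such as JPype. The sketch below assumes a hypothetical decoder class and jar path, since the write-up does not name the actual package.

```python
import jpype
import jpype.imports

# Start the JVM with the decoder jar on the classpath (path is a placeholder).
jpype.startJVM(classpath=["libs/claims-decoder.jar"])

# Hypothetical Java class that decompresses and decrypts a raw claim record.
from com.example.claims import ClaimDecoder  # assumed package/class name

decoder = ClaimDecoder()

def decode_claim(raw_bytes: bytes) -> str:
    # Returns the claim as plain text (e.g. JSON), ready for database loading.
    return str(decoder.decode(raw_bytes))
```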

Outcome

The client received a robust data engineering and analytics solution that enables real-time reporting. All data generated within the company now serves as the bedrock for analytics.

Moreover, the client intends to offer this solution as a SaaS product, opening up promising new opportunities for stakeholders.


Team:

12 experts (1 Project Manager, 8 Data Engineers, 3 QA Engineers)

Tech Stack:

Azure, Apache Kafka, Apache Spark, Apache Airflow, PostgreSQL, MongoDB

 
