Business Overview

The client is a marketing technology company specializing in business intelligence. Its core product is a data-driven SaaS solution hosted on the Google Cloud Platform (GCP). It gives car dealerships deep insights into the customer journey and ranks the contribution of each channel, including paid search, display ads, email, third-party websites, organic search, social media, and brand websites.

In addition, the client’s services include custom reports for third-party car marketplaces. These reports help the marketplaces see how many leads they provide to each car dealer and measure their business efficiency more precisely.

Initially, the client’s team processed approximately 1,000 sales entries per month and created reports manually. As demand for the service rose sharply, manual processing became challenging and time-consuming, so the client urgently needed a solution that would fully automate the process.

NIX’s role was to build this functionality into the platform by setting up a new ETL process in the cloud service that finds, analyzes, and extracts data about the required websites from 5+ terabytes of storage and automatically generates a report.

Solution

Throughout the development process, the client’s technical team and our data engineering department cooperated closely to integrate the new data flows smoothly into a complex data-driven system. Initially, the technical experts on the client’s side envisioned a solution based on Google Cloud Functions; however, our architect and data engineers proposed a more suitable and efficient option.

Based on their experience and expertise in GCP services, they recommended building the solution on GCP Dataflow. This approach covered all of the client’s challenges and provided smooth auto-scaling and dynamic work rebalancing. It also greatly reduces the chance of scalability issues as the load grows.

Our data engineers quickly set up a demo process. The demonstration convinced the client of the effectiveness of this approach, and the solution was then deployed into the client’s system.

Data Flow Insights

  • 1

    When the function is triggered, an algorithm parses the sales data held in cloud storage, which spans numerous databases, and converts the data fields into one common format. Each sales record includes phone numbers, emails, addresses, and more.

  • 2

    The system calls the Google Geocoding API to obtain user coordinates. It then filters the data, keeping only the coordinates of house locations, so that they can be compared with the coordinates from the pageview records.

  • 3

    All structured data is gathered in a resulting .csv file, which is imported into Treasure Data.

  • 4

    The system aggregates the Treasure Data content, selects the required fields using matching logic that depends on the request, and generates the final report.
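
The first three steps can be illustrated with a minimal Python sketch. It is not the production pipeline: the field aliases, the `geocode` callable (a stand-in for a geocoding service), the pageview record layout, and the 0.5 km matching radius are all assumptions made for illustration, and the matching rule shown here (great-circle distance) is only one plausible way to compare coordinates.

```python
import csv
import io
import math

# Hypothetical mapping from source-specific column names to one common schema.
FIELD_ALIASES = {
    "phone": ("phone", "phone_number", "tel"),
    "email": ("email", "e_mail"),
    "address": ("address", "street_address"),
}


def normalize_record(raw):
    """Step 1: convert a raw sales record into the common field format."""
    record = {}
    for field, aliases in FIELD_ALIASES.items():
        value = next((raw[a] for a in aliases if raw.get(a)), "")
        record[field] = str(value).strip()
    record["email"] = record["email"].lower()
    # Keep digits only so differently formatted phone numbers compare equal.
    record["phone"] = "".join(ch for ch in record["phone"] if ch.isdigit())
    return record


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) pairs."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def match_sales_to_pageviews(sales, pageviews, geocode, max_km=0.5):
    """Step 2: geocode each sale and pair it with nearby pageview records.

    `geocode` is an assumed callable: address -> (lat, lon) or None.
    Each pageview is assumed to be a dict with "site", "lat", and "lon".
    """
    rows = []
    for raw in sales:
        record = normalize_record(raw)
        coords = geocode(record["address"])
        if coords is None:
            continue  # address could not be resolved; skip the record
        for view in pageviews:
            if haversine_km(coords, (view["lat"], view["lon"])) <= max_km:
                rows.append({**record, "lat": coords[0], "lon": coords[1],
                             "source_site": view["site"]})
    return rows


def to_csv(rows):
    """Step 3: serialize the matched rows as CSV text for later import."""
    fieldnames = ["phone", "email", "address", "lat", "lon", "source_site"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Passing the geocoder in as a callable keeps the sketch runnable without network access; in a real pipeline that callable would wrap the geocoding service, and the CSV text would be written to a file for import.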

Outcome

Now, the client’s employees involved in generating reports for third-party marketplaces can redirect their effort to other tasks which require human contribution.

Comprehensive automation of the data analysis process has dramatically increased the platform’s performance and precision. The updated system processes 1M sales entries instead of 1,000 and can scale further with minimal effort.

Team:

Solution Architect, Project Manager, 2 Data Engineers

Tech Stack:

Python, Google Cloud, PostgreSQL, Presto, Hive
