Data Engineering & Pipeline Management

Implemented a responsive, modern front-end app

Challenges

  • The application hosts hundreds of thousands of datasets, free/public or paid, sourced from thousands of providers
  • Enable enterprise users, decision scientists, and data analysts to upload their organizational datasets
  • Facilitate joins between the uploaded transactional/non-transactional datasets and the other publicly hosted datasets
  • Execute these joins within a few seconds for a seamless user experience

Solutions

  • Implemented a responsive, modern front-end app for data scientists using React and Redux
  • Designed and implemented all middle-tier services, including the APIs and the data access layer, on Python Django
  • Wrote an ANSI SQL code generator in Python that considers all user selections, connects with the metadata system, and generates the final query that runs on Snowflake
  • Built search and recommendation systems on Neo4j that help users find the features most pertinent to their own uploaded datasets
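
The SQL generator described above can be illustrated with a minimal sketch. This is not the project's actual code: the `JoinSelection` shape, the in-memory `METADATA` dictionary standing in for the metadata system, and the table and column names are all hypothetical, chosen only to show how user selections might be validated against metadata and turned into a single ANSI SQL join.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class JoinSelection:
    """Hypothetical shape of a user's join selection."""
    left_table: str
    right_table: str
    left_key: str
    right_key: str
    columns: Tuple[Tuple[str, str], ...]  # (table, column) pairs to output


# Hypothetical stand-in for the metadata system: table name -> known columns.
METADATA: Dict[str, Set[str]] = {
    "UPLOADED_SALES": {"ZIP_CODE", "REVENUE"},
    "PUBLIC_CENSUS": {"ZIP", "MEDIAN_INCOME", "POPULATION"},
}


def quote_ident(name: str) -> str:
    """Quote an identifier ANSI-style, doubling any embedded double quotes.

    Note: quoted identifiers are case-sensitive in Snowflake, so the names
    in this sketch are written in upper case.
    """
    return '"' + name.replace('"', '""') + '"'


def check_column(table: str, column: str, metadata: Dict[str, Set[str]]) -> None:
    """Reject any table or column the metadata does not know about."""
    if table not in metadata:
        raise ValueError(f"unknown table: {table}")
    if column not in metadata[table]:
        raise ValueError(f"unknown column: {table}.{column}")


def build_join_query(sel: JoinSelection,
                     metadata: Dict[str, Set[str]] = METADATA) -> str:
    """Validate the selection against metadata and return one SQL string."""
    if not sel.columns:
        raise ValueError("at least one output column is required")

    # Validate the join keys and every requested output column.
    to_check = [(sel.left_table, sel.left_key),
                (sel.right_table, sel.right_key),
                *sel.columns]
    for table, column in to_check:
        check_column(table, column, metadata)

    left, right = quote_ident(sel.left_table), quote_ident(sel.right_table)
    select_list = ", ".join(
        f"{quote_ident(t)}.{quote_ident(c)}" for t, c in sel.columns
    )
    return (
        f"SELECT {select_list} "
        f"FROM {left} "
        f"INNER JOIN {right} "
        f"ON {left}.{quote_ident(sel.left_key)} = "
        f"{right}.{quote_ident(sel.right_key)}"
    )


if __name__ == "__main__":
    selection = JoinSelection(
        left_table="UPLOADED_SALES",
        right_table="PUBLIC_CENSUS",
        left_key="ZIP_CODE",
        right_key="ZIP",
        columns=(("UPLOADED_SALES", "REVENUE"),
                 ("PUBLIC_CENSUS", "MEDIAN_INCOME")),
    )
    print(build_join_query(selection))
```

Validating every name against metadata before building the string, and quoting each identifier, keeps user selections from injecting arbitrary SQL into the generated query. A production generator would presumably also handle filters, aggregations, and multiple joins, which this sketch leaves out.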

Tools & Technologies

NumPy, Django, Redux, React, AWS, Snowflake

Key benefits

  • Snowflake runs complex joins between large datasets, including joins that apply various math functions, within seconds and can produce outputs of billions of rows
  • Snowflake automatically creates additional clusters as the number of concurrent queries increases
  • Data scientists can iterate on their models quickly and reach higher accuracy, since they now spend significantly less time finding the most relevant features