Infometry INC
Delivering Actionable Intelligence: a world-class Business Intelligence services and solutions provider
Role: Databricks Data Engineer
Location: Bangalore – Hybrid
Experience: 7 – 12 years
Job Description:
We are seeking a seasoned Databricks Data Engineer to join our dynamic team. As a Data Engineer, you will be responsible for designing, developing, and maintaining the data pipelines and infrastructure that support our Databricks implementation and analytics. You will work closely with cross-functional teams to understand business requirements, design efficient data solutions, and ensure data integrity and quality. The ideal candidate has expertise in using Databricks and Snowflake for ETL processes, proficiency in Python for scripting and automation, and a solid understanding of data warehousing concepts and SQL.
Responsibilities:
- Implement scalable and sustainable data engineering solutions using tools such as Databricks, Azure, Apache Spark, and Python; create, maintain, and optimize data pipelines as workloads move from development to production for specific use cases.
- Implement batch and real-time data ingestion/extraction processes through ETL, streaming, APIs, etc., between diverse source and target systems with structured and unstructured datasets (a minimal ingestion sketch follows this list).
- Migrate stored procedures and functions from Snowflake to Databricks notebooks.
- Collaborate with data analysts, the reporting team, and business advisors to gather requirements and define data models that effectively support business needs.
- Develop and maintain scalable, efficient data pipelines to ensure seamless data flow across systems, and address any issues or bottlenecks in existing pipelines.
- Implement robust data checks to ensure the accuracy and integrity of data. Summarize and validate large datasets to ensure they meet quality standards.
- Monitor data jobs for successful completion. Troubleshoot and resolve any issues that arise to minimize downtime and ensure continuity of data processes.
- Regularly review and audit data processes and pipelines to ensure compliance with internal standards and regulatory requirements
- Work within Agile methodologies: Scrum, sprint planning, backlog refinement, etc.
- Optimize Big Data solutions for high performance and scalability.
- Conduct performance optimization, tuning, and troubleshooting of data infrastructure components to ensure optimal performance and resource utilization.
- Develop and optimize SQL queries for data extraction, transformation, and analysis, ensuring adherence to best practices and performance.
- Design and implement data models and schemas to support reporting, visualization, and business intelligence initiatives.
- Monitor and troubleshoot data pipelines and processes, identifying and resolving issues to ensure uninterrupted data flow and availability.
- Work with ETL tools such as SnapLogic to extract data from multiple data sources.
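As a rough illustration of the ingestion and data-quality responsibilities above, here is a minimal sketch of a Databricks notebook cell that uses Auto Loader to stream raw JSON files into a Delta table behind a simple validity filter. It assumes the `spark` session that Databricks notebooks provide; all paths, column names, and table names are hypothetical placeholders.

```python
# Minimal sketch: Auto Loader ingestion with a basic data-quality gate.
# All paths, columns, and table names below are hypothetical placeholders.
from pyspark.sql import functions as F

raw = (
    spark.readStream.format("cloudFiles")           # Auto Loader source
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders/_schema")
    .load("/mnt/raw/orders/")                       # hypothetical landing zone
)

# Basic check: drop rows missing a key or carrying a negative amount.
clean = raw.filter(
    F.col("order_id").isNotNull() & (F.col("amount").cast("double") >= 0)
)

(
    clean.writeStream.format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/orders/_chk")
    .trigger(availableNow=True)                     # incremental batch-style run
    .toTable("bronze.orders")                       # hypothetical target table
)
```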
Requirements:
- 7–12 years of experience in a Data Engineering role working with Databricks and cloud technologies.
- Ability to work in US PST hours (50% IST and 50% PST time zone)
- Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree preferred.
- Proven experience in data engineering, with expertise in Databricks and Apache Spark for large-scale data processing and analytics.
- In-depth experience with Databricks core components such as DataFrames, Datasets, Spark SQL, Delta Lake, Databricks Notebooks, DBFS, and Databricks Connect.
- Experience migrating Snowflake stored procedures and functions to Databricks notebooks (see the migration sketch after this list).
- Experience with Delta Live Tables, Auto Loader, and Unity Catalog.
- Strong proficiency in SQL, with the ability to write complex queries.
- Solid understanding of data warehousing concepts, dimensional modeling, and data governance principles.
- Experience with version control systems (e.g., Git) and CI/CD pipelines for code deployment and automation.
- Excellent problem-solving skills and attention to detail, with the ability to troubleshoot and debug complex data issues.
- Strong communication and collaboration skills, with the ability to work effectively in a cross-functional team environment.
- Working experience with SnapLogic ETL tool
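To illustrate the Snowflake-to-Databricks migration requirement above, the sketch below re-expresses a hypothetical Snowflake stored procedure that upserts daily sales totals as a Spark SQL MERGE in a Databricks notebook cell. Table and column names are placeholders, and real migrations will vary with each procedure's logic.

```python
# Minimal migration sketch. The Snowflake original is, roughly:
#   CREATE PROCEDURE refresh_daily_sales() ...
#     MERGE INTO daily_sales USING (SELECT sale_date, SUM(amount) ...) ...
# In a Databricks notebook, the same upsert can run as Spark SQL on a Delta table:
spark.sql("""
    MERGE INTO reporting.daily_sales AS t
    USING (
        SELECT sale_date, SUM(amount) AS total_amount
        FROM raw.sales
        GROUP BY sale_date
    ) AS s
    ON t.sale_date = s.sale_date
    WHEN MATCHED THEN UPDATE SET t.total_amount = s.total_amount
    WHEN NOT MATCHED THEN INSERT (sale_date, total_amount)
        VALUES (s.sale_date, s.total_amount)
""")
```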
Note: We are looking for someone who can join immediately or within 30 days.
For more details, visit the company website: www.infometry.net
About Us:
Infometry is a leading Advanced Data Analytics product solutions and global services company headquartered in Silicon Valley, California, and Bangalore, India. Infometry products include INFOFISCUS & Informatica Cloud Connectors.
INFOFISCUS is a prebuilt Cloud Analytics solution for Finance, Supply Chain, Sales, and Marketing. It has been featured and listed on the Informatica Marketplace.
Informatica Cloud Connectors are built for Google Sheets, Google Drive, Google Ads, Google Pub/Sub, Google Bigtable, HubSpot, and Adaptive Insights.
In addition to products, Infometry helps customers with Enterprise Data Strategy, Cloud Data Orchestration, Enterprise Service Bus, Snowflake Migration, Data Quality, MDM, Informatica PowerCenter to Cloud Migration, Salesforce Implementation, Big Data and AI/ML Analytics.
We are an engineering partner for multiple data companies such as Informatica, Matillion, Snowflake, GCP, Tableau, Looker, Talend, Pentaho, Dell Boomi, etc. We are growing at a rapid pace of 200% every year, and we are looking for passionate members for our team. If you feel you are the one, follow us on LinkedIn and check our website for regular job updates.
To apply for this job email your details to Sreeja.Reddy@infometry.net