About
Highly motivated early-career Data Engineer with a strong foundation in building and optimizing scalable ETL pipelines and data warehouses on AWS. Proven ability to leverage technologies like Apache Spark, Python, and SQL to process multi-format data, automate workflows, and deliver actionable insights. Eager to apply certified cloud and big data expertise to drive data-driven solutions in a dynamic environment.
Work
Mactores
Data Engineer Intern
Mumbai, Maharashtra, India
Summary
• Gaining exposure to Apache Spark, AWS, and Databricks through internship training.
• Earned the Databricks Certified Data Engineer Associate and AWS Certified Cloud Practitioner certifications, demonstrating skills in data ingestion, transformation, and cloud workflows.
• Built and tested ETL pipelines in personal projects using Spark, Airflow, and key AWS services.
• Explored and implemented pipeline design strategies such as incremental loading, partitioning, and orchestration to improve efficiency and scalability (see the orchestration sketch after this list).
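For illustration, a minimal orchestration sketch of the kind of daily incremental pipeline described above, assuming Airflow 2.x; the DAG id, task names, and job script paths are hypothetical placeholders, not actual internship code.

```python
# Minimal Airflow DAG sketch: daily orchestration of an incremental ETL run.
# DAG id, task names, and script paths below are illustrative placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-eng",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="incremental_etl_daily",
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",  # one incremental load per day
    catchup=False,
) as dag:
    # Extract only rows changed since the last successful run (placeholder script).
    extract_incremental = BashOperator(
        task_id="extract_incremental",
        bash_command="python /opt/jobs/extract_incremental.py --date {{ ds }}",
    )

    # Transform and upsert into the warehouse via a Spark job (placeholder script).
    transform_and_upsert = BashOperator(
        task_id="transform_and_upsert",
        bash_command="spark-submit /opt/jobs/transform_upsert.py --date {{ ds }}",
    )

    extract_incremental >> transform_and_upsert
```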
Education
Vasantdada Patil Pratishthan's College of Engineering & Visual Arts, Mumbai University
B.Tech
Electronics & Telecommunication Engineering (EXTC)
Grade: 7.85 CGPA
Skills
Programming Languages
Python, SQL, PySpark.
Databases
MySQL, MongoDB.
Big Data & ETL Frameworks
Apache Spark, Hadoop, Apache Airflow, Databricks.
Developer Tools & Concepts
Git, Docker, VS Code, PyCharm, IntelliJ, Eclipse, SFTP Server, DBeaver, YAML, Linux (Ubuntu).
Cloud Platforms & Services
AWS, Databricks.
Projects
Automated Incremental ETL Pipeline
Summary
Designed and automated an ETL pipeline using Apache Spark, MySQL, S3, and Airflow, incorporating full-load and incremental strategies with upsert functionality. Optimized performance through advanced partitioning and mirroring, reducing processing time by 40% for large-scale datasets and enabling faster data retrieval and insights.
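For illustration only, a minimal sketch of the full-load and incremental-upsert pattern this project describes, assuming Delta Lake tables on S3 for the upsert step; the table name, key columns, paths, and credentials are hypothetical and not taken from the project itself.

```python
# Minimal PySpark sketch of the full-load / incremental-upsert pattern described above.
# Assumes Delta Lake on S3 for the upsert step; table names, key columns (order_id,
# updated_at, order_date), paths, and credentials are hypothetical placeholders.
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("incremental_etl_sketch").getOrCreate()

SOURCE_URL = "jdbc:mysql://example-host:3306/sales"    # placeholder MySQL source
TARGET_PATH = "s3a://example-bucket/warehouse/orders"  # placeholder Delta table on S3


def full_load(df):
    """Initial full load: write the whole table, partitioned by date for faster pruning."""
    (df.write.format("delta")
       .mode("overwrite")
       .partitionBy("order_date")
       .save(TARGET_PATH))


def extract_incremental(last_watermark: str):
    """Read only rows updated since the last successful run (incremental strategy)."""
    query = f"(SELECT * FROM orders WHERE updated_at > '{last_watermark}') AS src"
    return (spark.read.format("jdbc")
            .option("url", SOURCE_URL)
            .option("dbtable", query)
            .option("user", "etl_user")          # placeholder credentials
            .option("password", "etl_password")
            .load())


def upsert(changes_df):
    """Merge changed rows into the Delta table: update existing keys, insert new ones."""
    target = DeltaTable.forPath(spark, TARGET_PATH)
    (target.alias("t")
           .merge(changes_df.alias("s"), "t.order_id = s.order_id")
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())
```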