Dimitrios M.
Senior Big Data Engineer
Dimitrios is a Senior Big Data Engineer with over five years of experience designing and building large-scale data pipelines using Spark, Airflow, Python, and AWS. He specializes in distributed processing, ETL optimization, and cloud-native analytics, delivering solutions that are both efficient and scalable.
He has contributed to data platforms across eCommerce, fintech, and enterprise environments, working with companies such as Profitero+, EPAM, and EY. His projects include Spark performance tuning, high-volume data ingestion on AWS EMR, and orchestrating complex workflows with Airflow.
Known for his strong problem-solving skills and clear communication, Dimitrios consistently emphasizes reliability, efficiency, and well-structured engineering practices in every project he undertakes.
Main expertise
- SQL 5 years
- Data Engineering 5 years
- AWS 4 years
Other skills
- AWS Athena 2 years
- MySQL 1 year
- Snowflake 1 year

Selected experience
Employment
Senior Big Data Engineer
Profitero+ - 4 months
Profitero is a global e-commerce analytics platform used by major retail brands to track digital shelf performance, automate insights, and optimize marketplace visibility across Amazon, Walmart, and other major retailers.
- Leads Spark-based ETL optimization initiatives, reducing compute footprint and ensuring SLA-bound ETAs for mission-critical data products.
- Owns the architecture and implementation of high-volume ingestion pipelines running on AWS EMR and Databricks clusters.
- Implements advanced performance tuning, partitioning strategies, caching, and execution-plan refinement across PySpark and Scala Spark workloads.
- Designs Airflow DAGs for fully orchestrated extraction, transformation, model preparation, and delivery processes.
- Collaborates with data scientists and analytics teams to standardize transformation logic into reusable frameworks.
- Ensures production reliability by improving monitoring, logging, and alerting across EMR, Snowflake, and GCP workloads.
- Supports cross-team integration and shapes engineering standards for new data products.
Technologies:
Apache Spark
Python
SQL
AWS S3
Scala
Google Cloud
Apache Airflow
Snowflake
ETL
AWS EMR
Big Data Engineer
Profitero+ - 1 year 7 months
- Built advanced Spark jobs using an internal framework to process multi-source retail datasets at scale.
- Developed complete ETL pipelines, from raw ingestion (S3, APIs, vendor feeds) through cleaning and preparation to datamart outputs.
- Created recursive custom algorithms to optimize a bottleneck ETL step, improving execution speed and stability.
- Tuned CPU and memory usage, executor configurations, and storage formats to meet strict ETAs.
- Created and maintained Airflow DAGs orchestrating dozens of interdependent tasks.
- Collaborated with product, data, and client teams on integration patterns and requirements alignment.
- Ensured secure distribution of final datasets via SFTP automations.
Technologies:
Apache Spark
Python
SQL
AWS S3
Scala
Snowflake
ETL
AWS EMR
Senior Data Engineer
Mantaro Brands - 3 months
Mantaro Brands operates and scales multiple D2C consumer brands, relying heavily on data analytics, forecasting, and automated reporting infrastructure.
- Designed and optimized ingestion and transformation pipelines supporting marketplace analytics and brand operations.
- Implemented Spark-based workflows for forecasting, demand analysis, and product performance insights.
- Created Airflow DAGs to orchestrate ETL steps and validation checks.
- Improved Snowflake schema design and warehouse performance for reporting use cases.
- Ensured data readiness and quality for internal analytics and external marketplace integrations.
Technologies:
AWS
Python
SQL
Google Cloud
Apache Airflow
Snowflake
Big Data Developer
EPAM Systems - 1 year 11 months
EPAM is a global engineering and consulting firm delivering cloud, big data, and platform modernization for enterprise clients.
- Built sophisticated Spark jobs on a custom internal framework, executed on AWS EMR clusters.
- Designed ETL flows from heterogeneous data sources, applied cleaning and curation logic, and delivered final datamarts.
- Implemented recursive algorithms for complex transformations, significantly accelerating bottleneck jobs.
- Performed Spark and EMR performance tuning, reducing job runtimes to meet fixed ETAs.
- Developed Airflow DAGs to orchestrate multi-layer pipelines.
- Communicated with customers and cross-functional teams for data integration and requirement analysis.
As Dev Lead:
- Led task allocation, peer reviews, and onboarding for new engineers.
- Provided architecture guidance and ensured alignment with enterprise engineering standards.
- Supported teams with hands-on technical assistance across Spark, Airflow, and AWS EMR.
Technologies:
AWS
Apache Spark
Python
SQL
Scala
Apache Airflow
ETL
Team Leading
AWS EMR
Data Engineer / BI Developer
AbbVie - 1 year
- Contributed to the implementation of Palantir Foundry for enterprise data governance, modeling, and analytics.
- Built curated datasets, operational workflows, and governed pipelines within Foundry.
- Developed analytics-ready tables and dashboards to support commercial and regulatory operations.
- Collaborated closely with business and data science teams to structure high-value datasets.
Technologies:
Python
SQL
Data Modeling
Palantir Foundry
Education
MSc. Petroleum Engineering
Heriot-Watt University · 2015 - 2017
BSc. Engineer's degree, Pipeline Network Design, Construction and Operation
Gubkin Russian State University of Oil and Gas (National Research University) · 2008 - 2013
