Dimitrios M.

Senior Big Data Engineer

Dimitrios is a Senior Big Data Engineer with over five years of experience designing and building large-scale data pipelines using Spark, Airflow, Python, and AWS. He specializes in distributed processing, ETL optimization, and cloud-native analytics, delivering solutions that are both efficient and scalable.

He has contributed to data platforms across eCommerce, fintech, and enterprise environments, working with companies such as Profitero+, EPAM, and EY. His projects include Spark performance tuning, high-volume data ingestion on AWS EMR, and orchestrating complex workflows with Airflow.

Known for his strong problem-solving skills and clear communication, Dimitrios consistently emphasizes reliability, efficiency, and well-structured engineering practices in every project he undertakes.

Main expertise

  • SQL 5 years
  • Data Engineering 5 years
  • AWS 4 years

Other skills

  • AWS Athena 2 years
  • MySQL 1 year
  • Snowflake 1 year

Germany

Selected experience

Employment

  • Senior Big Data Engineer

    Profitero+ - 4 months

    Profitero is a global e-commerce analytics platform used by major retail brands to track digital shelf performance, automate insights, and optimize marketplace visibility across Amazon, Walmart, and other major retailers.

    • Leads Spark-based ETL optimization initiatives, reducing compute footprint and ensuring SLA-bound ETAs for mission-critical data products.
    • Owns the architecture and implementation of high-volume ingestion pipelines running on AWS EMR and Databricks clusters.
    • Implements advanced performance tuning, partitioning strategies, caching, and execution-plan refinement across PySpark and Scala Spark workloads.
    • Designs Airflow DAGs for fully orchestrated extraction, transformation, model preparation, and delivery processes.
    • Collaborates with data scientists and analytics teams to standardize transformation logic into reusable frameworks.
    • Ensures production reliability by improving monitoring, logging, and alerting across EMR, Snowflake, and GCP workloads.
    • Supports cross-team integration and shapes engineering standards for new data products.

    Technologies:

    • Apache Spark
    • Python
    • SQL
    • AWS S3
    • Scala
    • Google Cloud
    • Apache Airflow
    • Snowflake
    • ETL
    • AWS EMR
  • Big Data Engineer

    Profitero+ - 1 year 7 months

    • Built advanced Spark jobs using an internal framework to process multi-source retail datasets at scale.
    • Developed end-to-end ETL pipelines, from raw ingestion (S3, APIs, vendor feeds) through cleaning and preparation to datamart outputs.
    • Created custom recursive algorithms to optimize a bottleneck ETL step, improving execution speed and stability.
    • Tuned CPU and memory usage, executor configurations, and storage formats to meet strict ETAs.
    • Created and maintained Airflow DAGs orchestrating dozens of interdependent tasks.
    • Collaborated with product, data, and client teams on integration patterns and requirements alignment.
    • Ensured secure distribution of final datasets via automated SFTP transfers.

    Technologies:

    • Apache Spark
    • Python
    • SQL
    • AWS S3
    • Scala
    • Snowflake
    • ETL
    • AWS EMR
  • Senior Data Engineer

    Mantaro Brands - 3 months

    Mantaro Brands operates and scales multiple D2C consumer brands, relying heavily on data analytics, forecasting, and automated reporting infrastructure.

    • Designed and optimized ingestion and transformation pipelines supporting marketplace analytics and brand operations.
    • Implemented Spark-based workflows for forecasting, demand analysis, and product performance insights.
    • Created Airflow DAGs to orchestrate ETL steps and validation checks.
    • Improved Snowflake schema design and warehouse performance for reporting use cases.
    • Ensured data readiness and quality for internal analytics and external marketplace integrations.

    Technologies:

    • AWS
    • Python
    • SQL
    • Google Cloud
    • Apache Airflow
    • Snowflake
  • Big Data Developer

    EPAM Systems - 1 year 11 months

    EPAM is a global engineering and consulting firm delivering cloud, big data, and platform modernization for enterprise clients.

    • Built sophisticated Spark jobs on a custom internal framework, executed on AWS EMR clusters.
    • Designed ETL flows from heterogeneous data sources, applied cleaning and curation logic, and delivered final datamarts.
    • Implemented recursive algorithms for complex transformations, significantly accelerating bottleneck jobs.
    • Performed Spark and EMR performance tuning, reducing job runtimes to meet fixed ETAs.
    • Developed Airflow DAGs to orchestrate multi-layer pipelines.
    • Communicated with customers and cross-functional teams for data integration and requirement analysis.

    As Dev Lead:

    • Led task allocation, peer reviews, and onboarding for new engineers.
    • Provided architecture guidance and ensured alignment with enterprise engineering standards.
    • Supported teams with hands-on technical assistance across Spark, Airflow, and AWS EMR.

    Technologies:

    • AWS
    • Apache Spark
    • Python
    • SQL
    • Scala
    • Apache Airflow
    • ETL
    • Team Leading
    • AWS EMR
  • Data Engineer / BI Developer

    AbbVie - 1 year

    • Contributed to the implementation of Palantir Foundry for enterprise data governance, modeling, and analytics.
    • Built curated datasets, operational workflows, and governed pipelines within Foundry.
    • Developed analytics-ready tables and dashboards to support commercial and regulatory operations.
    • Collaborated closely with business and data science teams to structure high-value datasets.

    Technologies:

    • Python
    • SQL
    • Data Modeling
    • Palantir Foundry

Education

  • MSc, Petroleum Engineering

    Heriot-Watt University · 2015 - 2017

  • BSc, Engineer's degree, Pipeline Network Design, Construction and Operation

    Gubkin Russian State University of Oil and Gas (National Research University) · 2008 - 2013
