Samuel P.

Data Scientist

Samuel is a skilled Data and Business Intelligence professional with five years of experience spanning the telecom, consulting, and life sciences sectors. Holding a PhD in Statistics, he combines advanced modeling expertise with practical industry experience, effectively bridging the gap between complex analytics and actionable business insights.

Among his key achievements, Samuel designed and implemented an automated data workflow using Snowflake, Python, and Power BI, which reduced manual processing by 90%, improved data quality by 20%, and supported over 10 successful product launches within two years. This innovation significantly enhanced company-wide data accessibility and decision-making.

Recognized for his ability to translate complex data into clear, impactful insights, Samuel fosters strong cross-functional collaboration and consistently drives data-driven transformation across business functions.

Main expertise

  • R (programming language) 6 years
  • SQL 8 years
  • Data Science 6 years

Other skills

  • dbt 4 years
  • Data Engineering 3 years
  • Oracle 3 years

Italy

Selected experience

Employment

  • Data Scientist

    Telenet, Belgium - 1 year 4 months

    • Optimized a production-grade behavioral classification model across 2M+ entities, identifying and flagging high-risk user anomalies that drove measurable financial impact.
    • Developed auditable ETL/ELT data pipelines using Snowflake, Python, and dbt, achieving a 92% reduction in manual data handling while ensuring data quality and regulatory compliance.
    • Collaborated with Product, Sales, and Marketing teams to design and deploy dashboards that monitored and evaluated product launch performance.

    Technologies:

    • Qlik Sense
    • Python
    • SQL
    • Oracle
    • Microsoft Power BI
    • Data Science
    • Google Cloud
    • Pandas
    • Data Engineering
    • Git
    • Scrapy
    • Scikit-learn
    • ELT
    • Data Analytics
    • PL/SQL
    • Snowflake
    • Data Modeling
    • ETL
    • Machine Learning
    • Streamlit
    • dbt
    • Large Language Models (LLM)
    • Data Quality
    • GitHub Copilot
    • DAX
    • Microsoft Excel
  • Data Engineer

    Sopra Steria, Belgium - 1 year 2 months

    • Designed and developed end-to-end data pipelines in Azure and Databricks, integrating cross-functional datasets at scale and reducing analysis time by 40%.
    • Designed and optimized data models for enterprise-level reporting, ensuring robust data governance and standardized KPIs across multiple geographies.
    • Improved cross-country reporting consistency, reducing data discrepancies by 25% and strengthening reporting reliability.

    Technologies:

    • MongoDB
    • Databricks
    • Python
    • SQL
    • Azure
    • Azure Blob storage
    • Microsoft Power BI
    • Data Science
    • Google Cloud
    • Azure Data Factory
    • NumPy
    • Pandas
    • Data Engineering
    • BigQuery
    • Git
    • ELT
    • Apache Airflow
    • Data Analytics
    • Azure Synapse
    • Azure Cloud
    • Data Modeling
    • ETL
    • Machine Learning
    • Tableau
    • Streamlit
    • NoSQL
    • dbt
    • Microsoft Power Automate
    • Large Language Models (LLM)
    • Microsoft Fabric
    • PySpark
    • Data Quality
    • GitHub Copilot
    • DAX
    • Microsoft Excel
  • Data Scientist

    University of Trento, Italy - 1 year 9 months

    • Significantly improved data validation protocols, ensuring dataset integrity for critical decision support systems in alignment with Ethical and Responsible AI standards.
    • Trained and validated predictive AI/ML models on complex biomedical datasets, improving diagnostic accuracy by 5%.
    • Adapted analytical workflows to be scalable and cloud-ready within Python and R frameworks, reducing computing time by 53%.

    Technologies:

    • MongoDB
    • AWS
    • Python
    • SQL
    • Bash
    • Data Science
    • NumPy
    • Pandas
    • R (programming language)
    • Git
    • SciPy
    • Scikit-learn
    • Data Analytics
    • Data Modeling
    • Machine Learning
    • Tableau
    • NoSQL
    • Data Quality
    • Microsoft Excel
  • PhD Statistician

    University of Gothenburg, Sweden - 4 years 4 months

    • Developed statistical and machine learning frameworks for high-dimensional genomic and environmental data, resulting in five peer-reviewed publications.
    • Optimized regression, Bayesian, and machine learning models for biological datasets, improving predictive accuracy by 12%.

    Technologies:

    • Python
    • SQL
    • Bash
    • Data Science
    • NumPy
    • Pandas
    • Data Engineering
    • R (programming language)
    • Git
    • Scikit-learn
    • Data Analytics
    • Data Modeling
    • Machine Learning
    • Microsoft Power Platform
    • PySpark
    • Data Quality
    • Microsoft Excel

Education

  • Doctor of Philosophy, Statistics

    University of Gothenburg, Sweden · 2016 - 2021

  • MSc, Computer Science

    Uppsala University, Sweden · 2014 - 2016

  • BSc, Biology

    University of Padova, Italy · 2011 - 2014

Portfolio

  • Breweries Go-To-Market
  • Telecom Initiative Performance Dashboard
  • Outlook Calendar Notes
