Samuel P.
Data Scientist
Samuel is a skilled Data and Business Intelligence professional with five years of experience spanning the telecom, consulting, and life sciences sectors. Holding a PhD in Statistics, he combines advanced modeling expertise with practical industry experience, effectively bridging the gap between complex analytics and actionable business insights.
Among his key achievements, Samuel designed and implemented an automated data workflow using Snowflake, Python, and Power BI, which reduced manual processing by 90%, improved data quality by 20%, and supported over 10 successful product launches within two years. This innovation significantly enhanced company-wide data accessibility and decision-making.
Recognized for his ability to translate complex data into clear, impactful insights, Samuel fosters strong cross-functional collaboration and consistently drives data-driven transformation across business functions.
Main expertise
- R (programming language): 6 years
- SQL: 8 years
- Data Science: 6 years
Other skills
- dbt: 4 years
- Data Engineering: 3 years
- Oracle: 3 years
Selected experience
Employment
Data Scientist
Telenet, Belgium - 1 year 4 months
- Optimized a production-grade behavioral classification model across 2M+ entities, identifying and flagging high-risk user anomalies that drove measurable financial impact.
- Developed auditable ETL/ELT data pipelines using Snowflake, Python, and dbt, achieving a 92% reduction in manual data handling while ensuring data quality and regulatory compliance.
- Collaborated with Product, Sales, and Marketing teams to design and deploy dashboards that monitored and evaluated product launch performance.
Technologies: Qlik Sense, Python, SQL, Oracle, Microsoft Power BI, Data Science, Google Cloud, Pandas, Data Engineering, Git, Scrapy, Scikit-learn, ELT, Data Analytics, PL/SQL, Snowflake, Data Modeling, ETL, Machine Learning, Streamlit, dbt, Large Language Models (LLM), Data Quality, GitHub Copilot, DAX, Microsoft Excel
Data Engineer
Sopra Steria, Belgium - 1 year 2 months
- Designed and developed end-to-end data pipelines in Azure and Databricks, integrating cross-functional datasets at scale and reducing analysis time by 40%.
- Designed and optimized data models for enterprise-level reporting, ensuring robust data governance and standardized KPIs across multiple geographies.
- Improved cross-country reporting consistency, reducing data discrepancies by 25% and strengthening reporting reliability.
Technologies: MongoDB, Databricks, Python, SQL, Azure, Azure Blob Storage, Microsoft Power BI, Data Science, Google Cloud, Azure Data Factory, NumPy, Pandas, Data Engineering, BigQuery, Git, ELT, Apache Airflow, Data Analytics, Azure Synapse, Azure Cloud, Data Modeling, ETL, Machine Learning, Tableau, Streamlit, NoSQL, dbt, Microsoft Power Automate, Large Language Models (LLM), Microsoft Fabric, PySpark, Data Quality, GitHub Copilot, DAX, Microsoft Excel
Data Scientist
University of Trento, Italy - 1 year 9 months
- Significantly improved data validation protocols, ensuring dataset integrity for critical decision support systems in alignment with Ethical and Responsible AI standards.
- Trained and validated predictive AI/ML models on complex biomedical datasets, improving diagnostic accuracy by 5%.
- Adapted analytical workflows to be scalable and cloud-ready within Python and R frameworks, reducing computing time by 53%.
Technologies: MongoDB, AWS, Python, SQL, Bash, Data Science, NumPy, Pandas, R (programming language), Git, SciPy, Scikit-learn, Data Analytics, Data Modeling, Machine Learning, Tableau, NoSQL, Data Quality, Microsoft Excel
PhD Statistician
University of Gothenburg, Sweden - 4 years 4 months
- Developed statistical and machine learning frameworks for high-dimensional genomic and environmental data, resulting in five peer-reviewed publications.
- Optimized regression, Bayesian, and machine learning models for biological datasets, improving predictive accuracy by 12%.
Technologies: Python, SQL, Bash, Data Science, NumPy, Pandas, Data Engineering, R (programming language), Git, Scikit-learn, Data Analytics, Data Modeling, Machine Learning, Microsoft Power Platform, PySpark, Data Quality, Microsoft Excel
Education
Doctor of Philosophy, Statistics
University of Gothenburg, Sweden · 2016 - 2021
MSc, Computer Science
Uppsala University, Sweden · 2014 - 2016
BSc, Biology
University of Padova, Italy · 2011 - 2014