João M.
Deep Learning Research Engineer
João is a Deep Learning Engineer at ASML with over 10 years of experience in artificial intelligence.
He specializes in building advanced models, including large language models (LLMs), capable of code refactoring, bug detection, and continual learning. João works extensively with PyTorch and deploys models on cloud platforms and high-performance computing systems.
Prior to ASML, he led research teams at GAIPS Lab, published in leading AI conferences, and secured competitive grants from the U.S. Air Force and FCT. He also taught AI courses, earning a Teaching Excellence Award for his contributions to education.
João’s key projects include advancing continual learning techniques that let AI acquire new knowledge without forgetting previous tasks, and applying reinforcement learning to train models more efficiently with less data. He is passionate about making AI systems more effective, more practical, and capable of continual improvement.
Main expertise
- Python 10 years

- Machine Learning 10 years

- Data Science 10 years
Selected experience
Employment
AI Researcher
Utrecht University - 4 months
Developing a production-oriented AI system for personalised patient interventions in dementia care, combining reinforcement learning with large language models. Designed and built ToneRL, a hybrid architecture where a lightweight RL agent learns to control an LLM’s output in real time, adapting communication style to individual patients based on clinical feedback, without retraining or modifying the underlying model. Collaborating directly with clinicians and linguists to ensure outputs meet clinical communication standards. The architecture is designed for scalable deployment: one frozen base model serves all patients, with per-patient adaptation handled by lightweight policy instances requiring minimal compute.
Technologies:
MongoDB
Docker
- Project Management
- Budget Management
Python
C
C++
- Data Science
TensorFlow
NumPy
OpenCV
Keras
Pandas
- MLOps
Open source
LaTeX
PyTorch
PyCharm
- Unit Testing
JUnit
Git
Unix
SciPy
Scikit-learn
Matplotlib
Convolutional neural network
- Recurrent neural network
- Transformer Network
- NLP
Machine Learning
- Performance Testing
- Computer Vision
Cuda
OpenAI API
- Prompt Engineering
Ollama
- Neural Network
Security
Large Language Models (LLM)
Hugging Face Transformers
- JAX
- MLflow
PyTorch Lightning
Hugging Face
- LoRA
- QLoRA
AI
AI Engineering
vLLM
Agentic AI
Deep Learning Research Engineer
ASML - 1 year 2 months
- Led a research team on the "LLMs for Software Engineering" project, focusing on technical debt reduction, bug detection, and documentation analysis using Large Language Models.
- Designed, implemented, trained, tested, and deployed LLMs for automatic code refactoring and bug detection.
- Deployed models to cloud production environments and HPC distributed computing clusters.
- Monitored the ongoing performance of deployed models using tools such as MLflow, Sacred, and Weights & Biases.
- Connected the company’s research department with academic partners at TU/e.
Technologies:
Docker
Java
Flask
Python
C++
AWS S3
Azure
- Data Science
Google Cloud
TensorFlow
NumPy
OpenCV
XGBoost
Keras
Caffe
Matlab
Pandas
Linux
- MLOps
Open source
LaTeX
PyTorch
PyCharm
- Unit Testing
Git
- Command-line interface
Unix
SciPy
Scikit-learn
Matplotlib
Azure ML
Random Forest
- Clustering
- SVM
- PCA
Convolutional neural network
- Recurrent neural network
- Transformer Network
- NLP
Machine Learning
- Automation Testing
- Boost
Cuda
Pytest
Apache Flink
YAML
OpenAI API
- Prompt Engineering
Julia
- Mojo
Stable Diffusion
- Neural Network
Large Language Models (LLM)
- JAX
- MLflow
PyTorch Lightning
- Slurm
Hugging Face
Deep Learning Research Engineer
GAIPS Research - 5 years 8 months
- Designed, implemented, trained, tested, and deployed state-of-the-art deep learning architectures, including Actor-Critics, DQNs, and LLMs, using convolutional, recurrent, and attention-based mechanisms for feature extraction across a wide range of tasks.
- Deployed models to cloud production environments on Google Cloud and AWS, and to Slurm-managed HPC distributed computing clusters.
- Monitored the ongoing performance of deployed models using tools such as MLflow, Sacred, and Weights & Biases.
- Assembled the company’s HPC Slurm cluster.
- Led five research teams as first author, publishing a research paper for each in top-tier AI venues, including AAAI, IJCAI, ECAI, the Artificial Intelligence journal, and PLOS ONE.
- Presented AI research at top-tier international conferences such as AAAI, IJCAI, and ECAI.
- Secured two competitive funding grants, one from the U.S. Air Force Office of Scientific Research and another from the Portuguese Foundation for Science and Technology (FCT).
- Received the Best Paper award for the project “Helping People On The Fly: Ad Hoc Teamwork for Human-Robot Teams.”
Technologies:
Docker
Python
- Data Science
Joomla
NumPy
OpenCV
XGBoost
Keras
- MLOps
Open source
PyTorch
PyCharm
Git
- Command-line interface
SciPy
Scikit-learn
Matplotlib
Azure ML
Convolutional neural network
- Recurrent neural network
- Transformer Network
Machine Learning
- Computer Vision
- Boost
Cuda
Pytest
YAML
OpenAI API
- Neural Network
Hugging Face Transformers
- JAX
- MLflow
PyTorch Lightning
- Slurm
Hugging Face
Software Engineer
Thales - 8 months
- Reduced technical debt and increased overall test coverage of the Top Sky Tower solution, a tool for air traffic controllers to manage electronic strips.
- Implemented and tested critical security detection systems.
Technologies:
Java
C++
C#
WPF
Software Engineer
IST IT Department - 1 year
- Trained a Convolutional Neural Network to classify valid identity card images.
- Implemented software for automatic and periodic backups of the university’s records to the AWS cloud.
- Re-implemented legacy software using modern technologies such as Scala and Kotlin.
Technologies:
Java
Python
AWS S3
Scala
Kotlin
TensorFlow
Keras
PyTorch
- Computer Vision
Software Engineer
DV Trading LLC - 2 months
- Implemented graphical user interfaces using WPF and .NET for the trading team.
- Refactored and optimized code in several legacy projects, increasing overall performance of proprietary trading tools by up to 30%.
Technologies:
Java
C++
C#
.NET
WPF
Laravel Developer
Systems Group - 4 months
Designed and developed a website for the Trainees project, which matches companies with near-graduates from Portugal's ESHTE.
Technologies:
PHP
Laravel
MySQL
MariaDB
MongoDB
Docker
Education
Standalone course, Computer Science
Delft University of Technology · 2023 - 2023
Doctor of Philosophy, Computer Science
Instituto Superior Técnico · 2019 - 2025
MSc, Information Systems and Computer Engineering
Instituto Superior Técnico · 2016 - 2018
BSc, Information Systems and Computer Engineering
Instituto Superior Técnico · 2012 - 2016