Giulio Quaglia
AI and Data Engineer specialising in cloud-based data platforms, ETL pipeline design, and machine learning integration. Working across Azure, Databricks, Python, and Spark to build reliable, scalable data systems for the finance sector.
About me
background & motivations
I'm a Cloud Data Engineer based in Bologna, Italy, with roots in Biomedical Engineering. My academic path took me from Ancona to Turin, where I completed a 110/110 Master's at Politecnico di Torino with an AI and Data orientation, sharpening my focus on machine learning, signal processing, and data systems for healthcare applications.
Today I work at Prometeia, designing scalable ETL pipelines and data validation frameworks for the banking sector. Before that, I spent two years at Iconsulting working across the full data stack: from gathering requirements and designing data models, to building ETL pipelines, migrating legacy architectures to cloud-native Azure solutions, and delivering BI dashboards for clients across multiple industries.
Away from the terminal, I'm an endurance athlete and triathlete. The discipline a long-distance race demands (patience, iteration, and knowing when to push) shapes how I approach complex engineering problems.
Work Experience
career timeline
- Design and implement scalable data pipelines using Azure Data Factory and Databricks for financial analytics use cases.
- Develop production-grade Python solutions for cloud-based data processing, working alongside software engineers and data scientists.
- Build and optimise data models ensuring efficiency, reliability, and regulatory compliance for banking sector clients.
- Implement automated data quality checks, model monitoring, and exception reporting pipelines for risk and compliance teams.
- Gathered and analysed business requirements, translating them into data architecture decisions and ETL pipeline designs for data warehousing solutions.
- Designed and implemented end-to-end ETL workflows using SAP Data Services, IBM InfoSphere DataStage, Azure Data Factory, and Databricks across multiple client projects.
- Led the migration of legacy SAP Data Services pipelines to Azure Data Factory and Databricks, including SQL-to-Spark SQL translation and performance optimisation.
- Built Power BI dashboards and data models providing unified visibility across economic, financial, operational, and environmental KPIs for executive stakeholders.
- Diagnosed and resolved data stream and pipeline issues, improving system stability and overall performance in production environments.
- Collaborated with BI analysts, data architects, and client teams across the financial, utilities, and retail sectors.
- Designed and implemented a novel uncertainty quantification method integrated into the DeepSleepNetLite CNN architecture for automated sleep stage classification.
- Improved reliability and interpretability of model predictions, enabling clinicians to assess confidence levels before acting on AI-generated outputs.
- Conducted the full research cycle: literature review, model design, training, validation, and academic write-up. Graduated 110/110.
- Design and build custom websites for small businesses and professionals, handling both frontend implementation and hosting setup.
- Supported high school and university students in learning programming fundamentals, covering Python, C, and C++ over four years.
Selected Projects
engineering, research & personal work
Statistical Model Validation for Anti-Fraud Systems
Python framework for validation and monitoring of statistical fraud detection models in production banking. Automates data quality checks, model control processes, and exception reporting pipelines to ensure regulatory compliance across complex model estates.
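The kind of checks such a framework automates can be sketched in a few lines. The column names below (`fraud_score`, `transaction_id`) and the expected score range are illustrative placeholders, not the production schema:

```python
import pandas as pd

def run_quality_checks(df: pd.DataFrame, score_col: str = "fraud_score") -> list[str]:
    """Run basic data quality checks on a batch of model inputs/outputs.

    Returns a list of exception messages for downstream reporting.
    Column names here are illustrative, not a real client schema.
    """
    exceptions = []

    # Completeness: any column with missing values raises an exception entry
    null_counts = df.isna().sum()
    for col, n in null_counts[null_counts > 0].items():
        exceptions.append(f"{col}: {n} missing values")

    # Validity: model scores are probabilities, so they must lie in [0, 1]
    out_of_range = df[(df[score_col] < 0) | (df[score_col] > 1)]
    if not out_of_range.empty:
        exceptions.append(f"{score_col}: {len(out_of_range)} values outside [0, 1]")

    # Uniqueness: duplicated primary keys usually mean a double-loaded batch
    dupes = int(df.duplicated(subset="transaction_id").sum())
    if dupes:
        exceptions.append(f"transaction_id: {dupes} duplicate rows")

    return exceptions
```

In production the returned messages would feed the exception-reporting pipeline rather than stop at the caller.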
SAP ETL to Azure Migration & BI Dashboard
Full migration of legacy SAP Data Services ETL pipelines to Azure Data Factory and Databricks, including SQL-to-Spark SQL translation and performance tuning. Delivered an executive Power BI dashboard consolidating economic, financial, weather, and internal KPI data streams.
Multi-Agent CodeLlama
Autonomous code generation and validation loop with Developer + Reviewer LLM agents built on CodeLlama-7B/13B. Integrates unit tests, iterative feedback, and deterministic sampling to ensure output quality. Fully offline Docker stack optimised for CPU-only execution.
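The Developer-Reviewer loop can be sketched with the LLM calls abstracted into injected callables; in the real stack both would be CodeLlama-backed agents, and the function names and prompt format below are assumptions for illustration:

```python
from typing import Callable, Optional

def dev_review_loop(
    developer: Callable[[str], str],
    reviewer: Callable[[str, str], str],
    run_tests: Callable[[str], tuple[bool, str]],
    task: str,
    max_rounds: int = 3,
) -> Optional[str]:
    """Iterate Developer -> unit tests -> Reviewer until the tests pass.

    `developer` maps a prompt to candidate code; `reviewer` maps
    (code, failure log) to feedback folded into the next prompt.
    Plain callables keep the loop itself testable without a model.
    """
    prompt = task
    for _ in range(max_rounds):
        code = developer(prompt)
        passed, log = run_tests(code)
        if passed:
            return code  # tests pass: accept this candidate
        feedback = reviewer(code, log)
        prompt = f"{task}\n\n# Reviewer feedback:\n{feedback}"
    return None  # give up after max_rounds attempts
```

Deterministic sampling matters here: with temperature pinned, a failing round reruns reproducibly, which makes the feedback loop debuggable.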
Uncertainty Quantification for DeepSleepNet Lite
Novel uncertainty quantification method integrated into the DeepSleepNetLite CNN for automated sleep stage classification. Improved reliability and interpretability of predictions for clinical decision-making. Full research and validation cycle. Graduated 110/110.
OpenWeather API to MySQL Pipeline
End-to-end data pipeline that fetches hourly weather data from the OpenWeatherMap API, stores it in a MySQL database, and analyses it with Python and Pandas. Containerised with Docker for reproducibility and scheduled execution.
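A stripped-down sketch of the fetch-and-flatten step, using the public OpenWeatherMap current-weather endpoint; the table and column names (`hourly_weather`, `temp_c`, and so on) are illustrative, not the project's actual schema:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.openweathermap.org/data/2.5/weather"

def fetch_current(city: str, api_key: str) -> dict:
    """Fetch the current weather for one city (network call, needs a real key)."""
    query = urlencode({"q": city, "appid": api_key, "units": "metric"})
    with urlopen(f"{API_URL}?{query}") as resp:
        return json.load(resp)

def to_row(payload: dict) -> tuple:
    """Flatten the OpenWeatherMap JSON into a row for the weather table."""
    return (
        payload["dt"],                # unix timestamp of the observation
        payload["name"],              # city name as returned by the API
        payload["main"]["temp"],      # temperature in °C (units=metric)
        payload["main"]["humidity"],  # relative humidity, %
    )

# Illustrative target table; executed with a mysql-connector cursor in practice:
INSERT_SQL = (
    "INSERT INTO hourly_weather (observed_at, city, temp_c, humidity) "
    "VALUES (%s, %s, %s, %s)"
)
```

In the project this runs on a Docker-scheduled loop, with `cursor.execute(INSERT_SQL, to_row(payload))` doing the actual write.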
Data Warehouse Migration for BI Performance
Strategic data warehouse migration to improve BI department performance. Redesigned data models, streamlined ETL workflows with SAP Data Services and Oracle DB, and led performance tuning and testing phases. Produced detailed technical documentation for ongoing maintenance.
Skills & Education
technical stack & academic background
Endurance athlete and triathlete, with Sprint, Olympic, and Half-Ironman distances completed. That same discipline shapes how I approach long engineering challenges. Active contributor to open-source data and ML communities.
The Blog
machine learning, data engineering, biomedical ai, finance, sport & more
// I have ADHD and I write about everything I study. ML/AI, data engineering, software, biomedical applications, finance, crypto, training, nutrition. If it caught my attention long enough, there's probably a post about it.
5 things nobody tells you before migrating a SAP pipeline to Azure
Time windows, NULL semantics in Spark SQL, and why your test always passes but production doesn't. An honest diary from someone who lived it.
What an Ironman and a data pipeline have in common
Pacing, iteration, and the art of not giving up at km 30. Reflections on discipline and engineering.
Running LLMs locally for code review: is it worth it?
A year after building an offline Developer-Reviewer agent with CodeLlama. What worked, what didn't.
Get in Touch
open to interesting conversations
Good data projects start with good conversations.
Whether you're working on a data architecture problem, exploring a collaboration, or just want to talk pipelines, models, or triathlon race strategy, I'm always happy to hear from you.