AI and Data Engineer specialising in cloud-based data platforms, ETL pipeline design, and machine learning integration. Working across Azure, Databricks, Python, and Spark to build reliable, scalable data systems for the finance sector.
background & motivations
I'm a Cloud Data Engineer based in Bologna, Italy, with roots in Biomedical Engineering. My academic path took me from Ancona to Turin, where I completed a Master's at Politecnico di Torino (final grade 110/110) with an AI and Data orientation, sharpening my focus on machine learning, signal processing, and data systems for healthcare applications.
Today I work at Prometeia, designing scalable ETL pipelines and data validation frameworks for the banking sector. Before that, I spent two years at Iconsulting working across the full data stack: gathering requirements, designing data models, building ETL pipelines, migrating legacy architectures to cloud-native Azure solutions, and delivering BI dashboards for clients across multiple industries.
Away from the terminal, I'm an endurance athlete and triathlete. The discipline a long-distance race demands (patience, iteration, and knowing when to push) shapes how I approach complex engineering problems.
career timeline
engineering, research & personal work
Python framework for validation and monitoring of statistical fraud detection models in production banking. Automates data quality checks, model control processes, and exception reporting pipelines to ensure regulatory compliance across complex model estates.
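The framework itself is proprietary, so what follows is only a minimal, hypothetical sketch of the kind of rule it chains together: a data quality check whose failures feed an exception report. All names and thresholds are illustrative, not from the real codebase.

```python
import pandas as pd

# Hypothetical sketch: one data quality rule of the kind chained
# together before a model control run. Names are illustrative.
def check_null_rate(df: pd.DataFrame, column: str, max_null_rate: float) -> dict:
    """Flag a column whose share of NULLs exceeds a tolerated threshold."""
    null_rate = df[column].isna().mean()
    return {
        "check": "null_rate",
        "column": column,
        "observed": round(null_rate, 4),
        "threshold": max_null_rate,
        "passed": null_rate <= max_null_rate,
    }

transactions = pd.DataFrame({"amount": [120.0, None, 87.5, None, 43.1]})
result = check_null_rate(transactions, "amount", max_null_rate=0.10)
if not result["passed"]:
    # In the real pipeline a failed check would feed an exception
    # report rather than being printed inline.
    print(f"DQ failure: {result}")
```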
Full migration of legacy SAP Data Services ETL pipelines to Azure Data Factory and Databricks, including SQL-to-Spark SQL translation and performance tuning. Delivered an executive Power BI dashboard consolidating economic, financial, weather, and internal KPI data streams.
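One detail that makes SQL-to-Spark SQL translation non-trivial is NULL handling: Spark SQL's standard three-valued logic can silently change a filter's result compared to what a legacy engine tolerated. A small illustrative PySpark snippet (not taken from the project) shows the kind of rewrite involved:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("migration-sketch").getOrCreate()
spark.createDataFrame(
    [("A", None), ("B", "EUR"), ("C", "EUR")], ["account", "currency"]
).createOrReplaceTempView("accounts")

# Under three-valued logic, comparing against NULL yields NULL,
# which the WHERE clause filters out: this returns zero rows.
spark.sql("SELECT * FROM accounts WHERE currency = NULL").show()

# The translated query must be explicit about NULL semantics,
# via IS NULL or the NULL-safe equality operator <=>:
spark.sql("SELECT * FROM accounts WHERE currency <=> NULL").show()  # row A
```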
Autonomous code generation and validation loop with Developer + Reviewer LLM agents built on CodeLlama-7B/13B. Integrates unit tests, iterative feedback, and deterministic sampling to ensure output quality. Fully offline Docker stack optimised for CPU-only execution.
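Stripped of the model plumbing, the loop reduces to roughly the sketch below. The two agent functions are stubs standing in for deterministic CodeLlama calls; the task, the test, and all names are illustrative only.

```python
import subprocess, sys, tempfile, textwrap

MAX_ITERATIONS = 3

def developer_generate(task: str, feedback: str) -> str:
    """Stub for the Developer agent; in the real stack this wraps a
    deterministically sampled CodeLlama completion."""
    return textwrap.dedent("""\
        def add(a, b):
            return a + b
    """)

def run_unit_tests(code: str) -> tuple[bool, str]:
    """Write the candidate plus a fixed unit test to disk and execute it."""
    test = "\nassert add(2, 3) == 5\nprint('ok')\n"
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code + test)
        path = f.name
    proc = subprocess.run([sys.executable, path], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def reviewer_critique(code: str, test_log: str) -> str:
    """Stub for the Reviewer agent: turns failures into actionable feedback."""
    return f"Tests failed; revise.\n{test_log}"

feedback = ""
for attempt in range(1, MAX_ITERATIONS + 1):
    candidate = developer_generate("implement add(a, b)", feedback)
    passed, log = run_unit_tests(candidate)
    if passed:
        print(f"accepted after {attempt} iteration(s)")
        break
    feedback = reviewer_critique(candidate, log)
```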
Novel uncertainty quantification method integrated into the DeepSleepNetLite CNN for automated sleep stage classification, improving the reliability and interpretability of predictions for clinical decision-making. Covered the full research and validation cycle. Graduated 110/110.
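The thesis method itself isn't reproduced here. As a generic illustration of the idea, the snippet below computes predictive entropy over stochastic forward passes (the recipe behind techniques such as MC dropout), one common way to attach an uncertainty score to a sleep-stage prediction; it is not necessarily the thesis method, and all numbers are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate T=20 stochastic softmax outputs for one 30-second epoch
# over the 5 sleep stages (W, N1, N2, N3, REM).
logits = rng.normal(size=(20, 5))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

mean_probs = probs.mean(axis=0)                     # averaged prediction
entropy = -(mean_probs * np.log(mean_probs)).sum()  # predictive entropy

stage = ["W", "N1", "N2", "N3", "REM"][mean_probs.argmax()]
print(f"predicted stage: {stage}, uncertainty (entropy): {entropy:.3f}")
# High-entropy epochs can be flagged for clinician review instead of
# being scored automatically.
```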
End-to-end data pipeline that fetches hourly weather data from the OpenWeatherMap API, stores it in a MySQL database, and analyses it with Python and Pandas. Containerised with Docker for reproducibility and scheduled execution.
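A condensed sketch of the fetch-and-store step, assuming the standard OpenWeatherMap current-weather endpoint; the MySQL table, credentials, and environment variable name are placeholders.

```python
import os
import requests
import mysql.connector  # pip install mysql-connector-python

API_KEY = os.environ["OWM_API_KEY"]  # placeholder: assumed to be set
CITY = "Bologna,IT"

# Fetch current conditions from the OpenWeatherMap REST API.
resp = requests.get(
    "https://api.openweathermap.org/data/2.5/weather",
    params={"q": CITY, "appid": API_KEY, "units": "metric"},
    timeout=10,
)
resp.raise_for_status()
payload = resp.json()

# Persist one observation into MySQL (schema and credentials illustrative).
conn = mysql.connector.connect(
    host="localhost", user="weather", password="...", database="weather"
)
cur = conn.cursor()
cur.execute(
    "INSERT INTO observations (city, temp_c, humidity) VALUES (%s, %s, %s)",
    (CITY, payload["main"]["temp"], payload["main"]["humidity"]),
)
conn.commit()
cur.close()
conn.close()
```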
Strategic data warehouse migration to improve BI department performance. Redesigned data models, streamlined ETL workflows with SAP Data Services and Oracle DB, and led performance tuning and testing phases. Produced detailed technical documentation for ongoing maintenance.
technical stack & academic background
Endurance athlete and triathlete. Sprint, Olympic, and Half-Ironman distances completed. That same discipline shapes how I approach long engineering challenges. Active contributor to open-source data and ML communities.
machine learning, data engineering, biomedical ai, finance, sport & more
// I have ADHD and I write about everything I study: ML/AI, data engineering, software, biomedical applications, finance, crypto, training, nutrition. If it caught my attention long enough, there's probably a post about it.
Time windows, NULL semantics in Spark SQL, and why your test always passes but production doesn't. An honest diary from someone who lived it.
Pacing, iteration, and the art of not giving up at km 30. Reflections on discipline and engineering.
A year after building an offline Developer-Reviewer agent with CodeLlama. What worked, what didn't.
open to interesting conversations
Good data projects start with good conversations.
Whether you're working on a data architecture problem, exploring a collaboration, or just want to talk pipelines, models, or triathlon race strategy, I'm always happy to hear from you.