Proposals
Audio
Field | Contact | Title | Thesis type | Description |
---|---|---|---|---|
Audio signal processing | Pezzoli | Modeling and characterization of sound source directivities | Full thesis, Short thesis | The directivity is an inherent property of every sound source (e.g., a musical instrument). The goal of this thesis is to define suitable models for the directivity of sound sources which can be used when simulating directivities. Long or short thesis depends on the depth and novelty of the analysis. [Required knowledge: machine learning, basic knowledge of statistical signal processing, spherical harmonics decomposition of sound field]. |
Audio Signal Processing | Bernardini, Pezzoli | Spatial audio in networked music performance applications | Full thesis | In this thesis, novel approaches to the processing of multichannel audio content will be investigated. In particular, the main goal is to enhance immersion in the context of networked music performance. [Required knowledge: audio signal processing, ambisonics, binaural rendering] [Additional skills: VR] |
Audio Signal Processing | Bernardini | Unsupervised Speech Quality Estimation (in collaboration with Fraunhofer IIS, Erlangen) | Full thesis | Quantifying the quality of a speech signal is a challenging and complex problem due to the sophisticated structure of speech signals and the subjective nature of speech quality, influenced by cognitive factors. This challenge increases in the absence of a clean reference signal, a scenario of significant practical relevance. Recently, self-supervised and non-intrusive audio quality measures have gained popularity, driven by the impressive performance of self-supervised methods in tasks such as automatic speech recognition. This thesis aims to explore the application of self-supervised methods for speech signal quality estimation. Specifically, it involves obtaining subjective listening scores and validating the performance of non-intrusive speech quality measures using the acquired scores. These scores will be utilized to develop a self-supervised speech quality metric based on compressed latent space representations of the speech signal. The student undertaking this research is expected to have a foundational understanding of speech and audio signal processing, as well as deep learning. Practical skills required include proficiency in Python, along with experience in using the numpy and PyTorch libraries. [The thesis will be developed in collaboration with Fraunhofer IIS, Erlangen.] |
3D audio | Pezzoli, Greco | Novel approach to parametric soundfield reconstruction | Full thesis | Investigate a method for reconstructing sound fields at arbitrary locations using data from spatially distributed microphone arrays. Optimized for reverberant environments, this approach models the acoustic scene through parameters that define direct and diffuse sound components, capturing source location and directivity. This enables precise reconstruction, with parameters estimated in a relative coordinate system to support scalable and distributed processing. |
3D audio | Pezzoli | Physics-informed deep prior for sound field reconstruction | Full thesis | Sound field estimates provided by neural networks, such as deep prior, can potentially diverge from the underlying physics. This thesis aims at defining a novel paradigm for sound field reconstruction that leverages the generative power of neural networks and prior knowledge of physics. [Required knowledge: deep learning, acoustics] [Thesis in collaboration with Prof. S. Koyama of the National Institute of Informatics - Tokyo.] |
3D audio | Pezzoli | Ray-space-based kernel interpolation | Full thesis | Sound field reconstruction is at the base of several spatial audio applications. In this thesis, the combination of sound field representations and kernel interpolation makes it possible to overcome current limitations of sound field reconstruction, with potential benefits in several applications. [Required knowledge: acoustics] [Thesis in collaboration with Prof. S. Koyama of the National Institute of Informatics - Tokyo.] |
3D audio | Pezzoli | Physics-informed deep kernel interpolation for sound field reconstruction | Full thesis | Sound field reconstruction is at the base of several spatial audio applications. Deep kernel learning has potential for sound field reconstruction thanks to the possibility of adopting physics-informed neural networks to impose prior knowledge of the acoustics. The thesis aims to develop novel deep kernel models for the reconstruction of acoustic fields. [Required knowledge: deep learning, acoustics] [Thesis in collaboration with Prof. S. Koyama of the National Institute of Informatics - Tokyo.] |
Audio Signal Processing | Bernardini | Strategies for Clipping Prevention in Dynamic Sound Filtering | Full thesis | The thesis aims to validate a method capable of predicting the occurrence of clipping at the output of a network of parametric digital filters, typically used in digital audio effects. If validated, this method would enable us to continuously monitor the values that a filtering parameter can assume without causing clipping. The student will assess the effectiveness of the method, particularly in parametric equalizers, highlighting aspects of robustness and weaknesses in specific implementations when their parameters are altered during equalization. |
Audio Signal Processing | Mezza, Bernardini | Deep Packet Loss Concealment for Speech and Music | Full thesis | Audio communications over the Internet have become an integral part of everyday life. However, speed is often prioritized over reliability in order to respect strict real-time constraints. Consequently, short audio segments (packets) risk being severely delayed or lost. We recently developed deep Packet Loss Concealment (PLC) methods, as well as hybrid PLC algorithms combining signal-processing and deep-learning techniques in a synergistic way. In this thesis, we will explore their performance on speech and/or music signals. The thesis will not deal with network-related and other IP-related aspects. [Required knowledge: signal processing theory; practical experience with deep learning.] |
Audio Signal Processing | Mezza, Giampiccolo, Bernardini | Music Demixing | Full thesis | Music Demixing (MDX) refers to a set of novel techniques aimed at separating and extracting the audio stems (instruments) that make up a given song. Think of Spleeter by Deezer, Meta's Demucs, or the Ultimate Vocal Remover. The MDX field has grown considerably in the past few years, but the problem is far from being solved. In this thesis, we will develop cutting-edge deep-learning models for demixing music signals and drum recordings. [Required knowledge: experience with deep learning tools and libraries] |
Audio signal processing | Pezzoli | Modeling and characterization of HRTF | Full thesis, Short thesis | Head-Related Transfer Functions (HRTFs) are individualized acoustic characteristics of human ears. The goal of this thesis is to define suitable models for the HRTFs which can be used when simulating sound fields. Long or short thesis depends on the depth and novelty of the analysis. [Required knowledge: machine learning, basic knowledge of statistical signal processing, spherical harmonics decomposition of sound field]. |
3D audio | Pezzoli, Antonacci | Nearfield filter for spherical microphone array recordings | Full thesis | Spherical Microphone Arrays (SMA) are very suitable for binaural rendering and in general for spatial audio applications. In this thesis we are interested in developing new methods to filter the undesired signals in a nearfield region of the SMA. |
Audio Signal Processing | Pezzoli, Ostan | Development of acoustic simulation framework for GPU | Short thesis | Parallel implementations speed up the computation of acoustic simulations. In this short thesis, the student will develop a Room Impulse Response renderer for Spherical Microphone Arrays for GPUs. The software will preferably be developed with CUDA and Python. Required knowledge: computational acoustics (RIRs, Image Source Method, ...) and experience with practical coding. |
Musical Acoustics | Pezzoli, Cillo, Longo | Enhancement of a reduced-order finite-element model of a classical guitar | Full/short thesis | [Thesis abroad at the Institute of Engineering and Computational Mechanics (ITM), University of Stuttgart, Germany.] A recently developed high-fidelity finite-element guitar model combined with experimental modal analysis can successfully identify the material characteristics of already existing instruments. Parametric Model Order Reduction (PMOR) is applied to significantly reduce the computational time of the model. During the PMOR procedure, minor simplifications to the model need to be undertaken, leading to deviations of the reduced-order model from the original model. This thesis aims to enhance the reduced-order model via optimization and/or data-driven methods to compensate for the error term resulting from the simplifications in the reduced-order model. Required knowledge: foundations of Finite Element Methods, in-depth knowledge of deep learning (long version). |
Audio forensics | Bestagini, Antonacci | Detection of text-to-speech algorithms | Full thesis | Nowadays, text-to-speech and voice conversion algorithms are able to produce very realistic speech signals, which can easily trick the human ear. Moreover, this technology is evolving rapidly and it is not possible to take into account all new synthesis methods. It is therefore necessary to develop effective synthetic speech detection systems able to work in open-set scenarios. |
Audio signal processing | Pezzoli, Antonacci | Deep learning solution for localization of acoustic sources in the spherical harmonics domain | Full thesis | The spherical harmonics representation of the sound field is a widely adopted description of spatial sound. The goal of this thesis is to devise deep learning solutions that exploit the spherical harmonics representation for the analysis of the acoustic field, e.g., localization of acoustic sources. [Required knowledge: spherical harmonics decomposition of sound field, theoretical knowledge and practical experience with deep learning] |
Audio signal processing | Giampiccolo, Bernardini | Kolmogorov-Arnold Networks for Virtual Analog Modeling | Full thesis | The thesis concerns the study and implementation of Kolmogorov-Arnold neural networks for the emulation of audio circuits in the Wave Digital domain. Kolmogorov-Arnold networks reverse the Multi-layer Perceptron's paradigm by learning activation functions rather than weights and biases. We envision applying this theory in the context of Virtual Analog applications. |
Audio signal processing | Giampiccolo, Bernardini | Gradient Descent Methods for the Emulation of Nonlinear Audio Circuits | Full thesis | The thesis concerns the study and implementation of gradient methods for the emulation of audio circuits in the Wave Digital domain. In the presence of multiple nonlinear elements, iterative methods are needed to find the solution of nonlinear circuits. Over the past few years, several iterative techniques have been considered, namely fixed-point, Newton-Raphson methods, etc. We propose to change the paradigm and explore new ways of solving circuits in order to find the computationally cheapest one, as required for the real-time emulation of audio circuits in the context of Virtual Analog applications. |
Audio signal processing | Giampiccolo, Bernardini | Virtual Analog Modeling, Audio Circuit Emulation, Physical Modeling Sound Synthesis through Wave Digital Filters | Full thesis | |
Musical Acoustics | Antonacci, Olivieri, Pezzoli | Transfer Learning techniques for Nearfield Acoustic Holography analysis | Full/short thesis | Recent data-driven NAH methods can predict the vibrational behavior of sources from the acquisition of the radiated sound field. Nevertheless, these approaches depend on the training dataset used (i.e., acquisition setup and vibrational source). This thesis aims at extending the recent solutions with transfer learning strategies in order to tune the networks with different data and improve the model with specific physical priors to reconstruct the vibrational content with an unsupervised approach (long thesis). [knowledge of Deep Learning required] |
Musical Acoustics | Gonzalez, Malvermi, Antonacci | Experimental measurement and construction of violin top plates | Full/short thesis | The aim of this thesis is twofold: measure the material properties of violin top plates and build violin top plates with certain material properties. For this, the student will use a CNC router to build the plates and an experimental setup that measures the FRF of the plate to compute its material parameters. The goal is to be able to produce top plates with a defined mechanical response irrespective of the varying material parameters of the wood the top is made of. Experimental thesis in Cremona Campus, FEM modelling required, Fusion 360 optional. |
Musical Acoustics | Gonzalez, Antonacci | Role of f-hole design in stress distribution and radiation of the violin | Long thesis | The role of the f-holes in violins is to let the sound vibrations leave the body of the instrument and reach the audience. However, cutting holes in the top plate weakens it. By cutting curves and circles, the instrument maker avoids creating the stress concentrations associated with sharp corners. The aim of this thesis is to study the behaviour of a violin for different f-hole designs/locations. Comsol experience needed. |
Musical Acoustics | Gonzalez, Malvermi, Antonacci | Effect of tailpiece height on the acoustic response of a violin | Long thesis | Varying the height of the tailpiece is one of the ways luthiers can control the sound production of the violin. Changing the angle of the strings modifies the effective pressure that the bridge, and consequently the violin top plate, feels. This compression of the violin is believed to affect the sound production of the instrument. This thesis aims to study, by means of simulations, the effect that the net static force on the bridge has on the dynamics of the instrument. If time allows, the thesis could also include experimental measurements with the help of Amorim fine violins. |
Musical Acoustics | Gonzalez, Longo, Antonacci | Linear interpolation between shapes in western guitars | Long thesis | In one of our recent thesis projects we developed a completely parametric model of the guitar. The objective of this thesis is to study how vibrational characteristics change when smoothly varying the shape of a guitar between standard models, say between a Jumbo and a Dreadnought. The work involves the creation of different virtual models and their study with Comsol multiphysics. |
Musical Acoustics | Gonzalez, Greco, Antonacci | Timbral Study of 3D printed organ pipes | Long thesis | Recently, researchers have presented a theoretical model to understand the timbre of the organ by mapping its sound to a two-dimensional map defined by the spectral centroid and the slope of the spectral envelope. This thesis aims to study how geometric variations in 3D printed organ pipes determine the location of the sound in this timbral map. |
Musical Acoustics | Gonzalez, Malvermi, Antonacci | Experimental study of wooden metamaterials | Long thesis | Experimental realisation of metamaterials for instrument making: guitar top plates, violin top plates, archtop top plates. Studies of vibrational and stiffness behaviour. The student needs to live in Cremona. |
Musical Acoustics | Gonzalez, Antonacci | Developing a new Manouche guitar: studying different bracings models for the gypsy jazz icon | Long thesis | Manouche guitars are a mix between mandolins, parlour and archtop guitars. Created in Paris by Italian luthier Maccaferri, they represent a particular understanding of how to make instruments. Their design takes from the parlour guitar in terms of bracing, from the archtop in its shape and floating bridge, and from the mandolin in its bent top plate. The aim of this thesis is to study, by means of simulations, different bracing patterns that could inform a new way of crafting these instruments. The selected model will then be built by one of the advisors. |
Musical Acoustics | Greco, Antonacci | Neural Network-Based Prediction of Woodwind Mouthpiece Sound Characteristics through Finite Element Method Simulations | Long thesis | This master's thesis proposes a novel approach to explore the relationship between geometric parameters of woodwind instrument mouthpieces and their corresponding sound characteristics. Employing COMSOL Multiphysics, Finite Element Method (FEM) simulations will be conducted to assess impedance variations. Simulated geometries will be transformed into transfer matrices to create a dataset for training a neural network. The objective is to develop a predictive model capable of estimating sound behavior without explicit FEM simulations, thus offering a more efficient and accessible method for instrument design and optimization. The study aims to contribute to the field of music and acoustic engineering by reducing computational costs and time associated with traditional simulation methods. |
eXplainability in Generative AI and deep learning for audio and music applications | Ronchini, Comanducci | Enhancing understanding and transparency in generative models for music through eXplainable Artificial Intelligence | Long thesis | Research in generative models for music is rapidly growing, reflecting the increasing interest in using AI for creative purposes. However, current models have significant flaws, and their black-box nature makes it hard to understand how they generate music or make predictions. This thesis aims to address these issues by applying eXplainable Artificial Intelligence (XAI) techniques to music generative models. By making these models more transparent and easier to interpret, this research hopes to deepen our understanding of how they work and contribute to advancements in creative AI. If you are interested in this thesis, please send an email to luca.comanducci@polimi.it and francesca.ronchini@polimi.it. Requirements: Experience with deep learning packages. |
Ethics in deep learning for audio and music applications | Ronchini, Comanducci | Ethical aspects in generative AI for audio and music applications | Long thesis | The rapid advancement of deep learning systems has raised important ethical concerns, including issues related to increasing complexity, energy consumption, and broader societal impacts. This thesis aims to explore these ethical aspects within the context of generative AI for music or sound event detection, classification, and localization models (based on the student's interest). The research will focus on understanding the environmental and social impact of state-of-the-art models, examining critical parameters during both training and inference phases to assess their carbon footprint and/or other ethical implications (depending on the student's interests). If you are interested in this thesis, please send an email to luca.comanducci@polimi.it and francesca.ronchini@polimi.it. Requirements: none. Experience with PyTorch and deep learning packages is preferred, but not required. |
Generative AI for audio and music applications | Ronchini, Comanducci | Foley sound synthesis/Sound Scene Synthesis | Long thesis | This thesis aims to explore innovative approaches for generating Foley sounds, the sound effects integrated into multimedia during post-production to enhance acoustic realism. With growing interest in AI-driven sound synthesis, generative models offer new possibilities for automating and enriching Foley sound production. The scope of this research includes investigating various techniques to generate original audio clips across diverse sound categories, with the goal of enhancing deep learning techniques for Foley Sound Generation. If you are interested in this thesis, please send an email to luca.comanducci@polimi.it and francesca.ronchini@polimi.it. Requirements: Experience with deep learning packages. |
Human-AI interaction in music domain | Ronchini, Comanducci | Human-AI interaction with generative models for music | Long thesis | This thesis aims to investigate human-AI interaction in the context of generative models, focusing on how these systems can realistically be integrated into creative practices. While generative models represent a significant breakthrough, their practical use by musicians and practitioners remains an open question. This research will explore how users interact with these models, studying their impact on creativity and satisfaction. The project will include user experience studies and the development of tools to enable personalized music generation, evaluating how effectively these systems meet artistic needs. If you are interested in this thesis, please send an email to luca.comanducci@polimi.it and francesca.ronchini@polimi.it. Requirements: Experience with deep learning packages. |
Generative AI for audio and music applications | Ronchini, Comanducci | Generative models for Audio and Music Applications | Long thesis | This thesis aims to explore the efficient integration of generative models for various audio and music applications. Certain application areas of generative models remain underexplored, and new ways to leverage these models across different contexts need further investigation. The research will focus on how end-users can benefit from generative models, and how to enhance and apply them in previously unexplored domains. If you are interested in this thesis, please send an email to luca.comanducci@polimi.it and francesca.ronchini@polimi.it. Requirements: Experience with deep learning packages. |
Music Informatics/Human-computer interaction | Ronchini, Comanducci | INTERACTION DESIGN FOR LIVE PERFORMANCE OF ACOUSMATIC MUSIC (In collaboration with Fondazione Culturale San Fedele) CURRENTLY NOT AVAILABLE | Long thesis | Experimental music and technology developments have always gone hand in hand, from the development of the synthesizer to the recent introduction of artificial intelligence in music production practices. INNER SPACES is a series of events related to experimental electronic music and audiovisual arts realized in the exclusive context of the San Fedele Auditorium, in Milan. The objective of the thesis is to develop tools for augmented performance in the context of acousmatic/electronic music, with the possibility of including the developed software in the second season of INNER SPACES (spring). While the project will be focused on scientific developments related to audio and programming, a keen interest in and motivation for art and experimental music is highly desired. For any information please contact: luca.comanducci@polimi.it and francesca.ronchini@polimi.it. Preferred Requirements: Experience in Python/SuperCollider/MaxMSP; interest in acousmatic/experimental electronic music. |
Audio Signal Processing | Miotello, Pezzoli | Wavelet-based Deep Learning for Room Impulse Response Reconstruction | Full thesis | This thesis aims to develop a deep learning model that integrates wavelet transforms to efficiently estimate Room Impulse Responses (RIRs). By combining wavelet transforms with deep learning, the proposed model seeks to capture both temporal and spectral features of RIRs more effectively. The research will involve designing a neural network architecture that incorporates wavelet analysis, and evaluating its performance against existing methods. The expected outcome is a more accurate and efficient approach to RIR reconstruction, with applications in virtual reality and acoustic simulation tools. [Required knowledge: audio signal processing, deep learning] |
Audio Signal Processing | Miotello, Pezzoli | Relative Transfer Function Estimation using Physics-Informed Models | Full thesis | Relative Transfer Functions (RTFs) are defined as the ratios of acoustic transfer functions from a sound source to multiple microphones relative to a reference microphone, effectively characterizing the relative acoustic paths between sensors. This thesis proposes estimating RTFs by integrating physical constraints of sound propagation into the estimation process, utilizing physics-informed or physics-constrained neural networks. By embedding acoustic principles into neural network models, the aim is to enhance the accuracy and robustness of RTF estimation for improved applications in acoustic signal processing. [Required knowledge: audio signal processing, deep learning] |
Audio Signal Processing | Miotello, Pezzoli | Low-Rank Adaptation for Transfer Learning of Physics-Informed Neural Networks | Full thesis | Physics-informed neural networks (PINNs) effectively model physical systems by integrating PDEs characterizing a specific domain, but cannot generalize to different problems without retraining. This thesis proposes enhancing PINNs' transfer learning capabilities in acoustic signal processing using low-rank adaptation, a technique that simplifies neural networks by approximating weight matrices with lower-rank representations. By reducing computational demands and enabling efficient adaptation to new acoustic environments with minimal retraining, we aim to develop algorithms to improve the scalability and adaptability of PINNs in practical acoustic applications. [Required knowledge: audio signal processing, deep learning] |
Audio Signal Processing | Miotello, Pezzoli | Multichannel Sound Source Separation using Diffusion Models | Full thesis | Multichannel sound source separation involves extracting individual audio sources from a mixture recorded using an array of microphones. The process is crucial for real-world applications like speech enhancement, telecommunication systems and immersive audio experiences, where isolating specific sounds enhances quality and intelligibility. This thesis proposes exploiting diffusion models to improve multichannel sound source separation. [Required knowledge: audio signal processing, deep learning] |
Audio Signal Processing | Miotello, Pezzoli | Room Impulse Responses Synthesis using Generative Models | Full thesis | The accurate modeling of room impulse responses (RIRs) is crucial for applications in acoustics, audio signal processing, and virtual reality. Traditional methods for obtaining RIRs involve time-consuming measurements with expensive equipment, limiting their practicality. This thesis proposes the development of a generative model that synthesizes realistic RIRs based on room characteristics using state-of-the-art deep learning techniques, such as diffusion models. [Required knowledge: audio signal processing, deep learning] |
Music Informatics/Multimedia Forensics/Generative Models | Comanducci | Music Deepfake Detection | Full thesis | Deep learning-based music generation has recently been revolutionized by the introduction of Text-To-Music (TTM) models. These models perform well and are simple to use, lowering the technical proficiency needed to successfully interact with them. This combination of factors has made them extremely appealing to the general public and of interest to private industry. In fact, several commercial TTMs have been proposed, which have recently been sued by major record companies and have subsequently admitted to copyright infringement, having trained their respective models also on unlicensed music. As both the capabilities and the commercial interest of these models grow, it is becoming increasingly necessary to develop forensic approaches to detect and analyze music generated via TTMs. In this thesis we aim to explore the question: can we detect whether a piece of music is AI-generated? |
Image and Video
Field | Contact | Title | Thesis type | Description |
---|---|---|---|---|
Image/video forensics | Bestagini, Mandelli, Cannas | Detect and localize image and video manipulations | Full | Images and videos can be manipulated in many different ways (e.g., object insertion and removal, local retouching, laundering operations, etc.). We are interested in developing methods to detect and localize possible editing operations on images and videos. |
Image/video forensics | Bestagini, Mandelli, Cannas | Distinguish original videos from DeepFakes | Full | DeepFake videos can be maliciously spread online. We are interested in developing techniques to detect whether a video is a DeepFake or not, to explain why a detector says a video is fake, and to understand which DeepFake generation software has been used to create a video. |
Image/video forensics | Bestagini, Mandelli, Cannas | Assess the authenticity of satellite images | Full | Satellites can acquire visual data with different sensors. We are interested in developing techniques that verify whether an overhead image has been edited or not. |
Image/video forensics | Bestagini, Mandelli, Cannas | Forensic analysis of scientific images | Full | Scientific publications in the life science area typically contain characteristic kinds of images to showcase the achieved results (e.g., western blots, microscopy acquisitions, etc.). As these images differ from natural photographs, we are interested in developing novel techniques to detect possible scientific image forgery operations. |
Image processing | Bestagini, Mandelli, Giganti | Enhancement of emission maps | Full | Accurate maps of biogenic volatile organic compound (BVOC) emissions are crucial for understanding their effects on air quality and climate, yet existing maps often lack the spatial resolution needed for detailed analysis. This thesis proposes using Super-Resolution Neural Networks (SRNNs) to enhance these maps by generating high-resolution data from low-resolution inputs. SRNNs can capture finer spatial details and improve the accuracy of emission maps, bridging gaps in sparse data to support high-precision environmental modeling. |
Spatiotemporal processing | Bestagini, Mandelli, Giganti | Spatiotemporal analysis of climate data | Full | Climate data analysis is hindered by complex patterns and frequent data gaps. This thesis proposes using Spatiotemporal Graph Neural Networks (STGNNs) to improve climate forecasting and data imputation by capturing spatial and temporal relationships. By testing STGNNs for predicting future climate variables and filling missing data, this research aims to enhance data accuracy and reliability in climate modeling. |
Geophysics
Contact | Title | Thesis type | Description |
---|---|---|---|
Tubaro, Bestagini | Improving Full Waveform Inversion with CNNs | Full/short | Full Waveform Inversion reconstructs the subsurface velocities from a set of measurements. It is very expensive and time-consuming, and it relies on a number of tips and tricks to avoid local minima, numerical instability and optimization errors. |
Tubaro, Bestagini | Denoising and Interpolation of seismic data through CNNs | Full/short | The amount of data is constantly increasing and the areas of interest are more and more complex to analyze. Moreover, they require a subsurface mapping at increasingly higher resolution and higher fidelity. Can CNNs help this process? |
Tubaro, Bestagini | Machine Learning guided Seismic Interpretation | Full/short | Human experts visually inspect seismic images looking for subsurface features. On the other hand, Machine Learning techniques have proven to be effective in image segmentation (i.e., recognizing objects and targets from a set of pixels). Can we merge these two worlds? |
Currently on-going
Field | Supervisor | Topic | Student(s) |
---|---|---|---|
Musical acoustics | Pezzoli, Malvermi | Statistical characterization of directivity | Gian Marco Ricci |
Musical acoustics | Pezzoli, Malvermi | Deep prior based vibroacoustic analysis | Riccardo Sebastiani Croce |
Musical acoustics | Pezzoli, Malvermi | PINN based vibroacoustic analysis | Federico Zese |
3D audio | Pezzoli, Greco | Localization of sound sources using spherical harmonics | Silvia Messena |
3D audio | Pezzoli, Ostan | Acoustic Virtual Reality evaluation system | Francesca Del Gaudio |
Audio signal processing | Massi, Giampiccolo, Bernardini | Deep Learning Models of Nonlinear Time-Varying Circuits in the Wave Digital Domain | Shijie Yang |
Audio signal processing | Giampiccolo, Bernardini | Automatic Generation of VSTs based on WDFs | Stefano Ravasi |
Audio signal processing | Giampiccolo, Bernardini | Modeling Circuits with Two Multiport Nonlinearities | Sebastian Gafencu |
Audio signal processing | Giampiccolo, Bernardini | Modeling of MOSFETs for Virtual Analog Applications | Marco Ferrè |
Audio signal processing | Massi, Giampiccolo, Bernardini | Optimization of MEMS Loudspeaker circuital models via Automatic Differentiation | Lelio Casale |
Space-time audio | Pezzoli, Greco | Sound field reconstruction for 6DoF navigation | Silvio Attolini |
Space-time audio | Antonacci, Pezzoli | Sound field separation in the spherical harmonics domain | Sagi Della-Torre |
Audio signal processing | Giampiccolo, Massi, Bernardini | Vacuum Tubes Modeling by means of Neural Networks in the Wave Digital Domain | Genis Casanova |
Music informatics | Sarti, Mezza, Bernardini | Unsupervised selection of harmonic complexity metrics | Giorgio De Luca |
Musical Acoustics | Gonzalez, Antonacci | Random variation of guitar bracings | Mattia Vanessa |
Musical acoustics | Gonzalez, Antonacci | Metamaterials for guitarmaking | Gabriele Marelli, Mattia Lercari |
Musical acoustics / AI | Gonzalez, Antonacci | AI-powered pick up: making guitars sound great again | Emanuele Voltini |
Music Informatics | Sarti, Comanducci | HandMonizer, personalized digital musical instrument design | Antonios Pappas |
Generative AI for audio | Comanducci, Ronchini | Adding temporal information and event order modeling to generative models for audio/music | Marco Furio Colombo |
Deep Learning for audio | Ronchini, Comanducci | Balance between performance and carbon footprint of state-of-the-art deep learning systems for audio domain applications | Riccardo Passoni |
Generative AI for audio/music | Ronchini, Comanducci | Generative Controllable Neural Audio Synthesis | Simone Marcucci |
DCASE | Ronchini, Comanducci, Cobos | Sound Event Detection and Localization using Mel-FSGCC | Federico Angelo Luigi Ferreri |
Generative AI for audio/music | Ronchini, Comanducci | Timbre Transfer | Guglielmo Fraticcioli |
DCASE | Comanducci | Bioacoustic detection | Nicolò Pisanu |
Past (from 2017)
Field | Supervisor | Title | Student(s) | Link |
---|---|---|---|---|
Space-time audio | Pezzoli, Comanducci | Generative Models for HRTF prediction | Juan Camilo Albarracín Sánchez | |
Space-time audio | Pezzoli, Miotello | Spherical microphone array upsampling | Ferdinando Terminiello | |
3D audio | Pezzoli, Malvermi | Neural Network-based representation of sound source directivity | Edoardo Morena | |
Musical acoustics | Pezzoli | Nearfield Acoustic Holography solver based on Physics-Informed Neural Network | Xinmeng Luan | |
Space-time audio | Pezzoli, Miotello | Real-time microphone array rendering framework for binaural reproduction | Paolo Ostan | |
Music Informatics | Comanducci, Mezza | Impact of velocity on drum patterns perceived complexity | Gabriele Maucione | |
Audio signal processing | Giampiccolo, Bernardini | Wave Digital Models of Nonlinear Piezoelectric Loudspeakers | Armando Boemio | https://www.politesi.polimi.it/handle/10589/218000 |
Music Informatics | Comanducci, Ronchini, Zanoni | Personalized Music Generation using text-to-music models | Gabriele Perego | |
Space-time audio | Pezzoli | Analysis of the directivity of sound sources | Hou Hin Au-Yeung | |
Audio signal processing | Bernardini, Giampiccolo, Albertini | Application of antiderivative antialiasing to MOSFET elements in wave digital filters | Christian Parra | https://www.politesi.polimi.it/handle/10589/214898 |
Music Informatics | Zanoni, Comanducci | Procedural Music Generation For Video games | Francesco Zumerle | https://www.politesi.polimi.it/handle/10589/210809 |
Audio signal processing | Bernardini, Giampiccolo, Mezza | On the Use of Fundamental Frequency Estimation for Virtual Bass Enhancement | Fabio Spreafico | https://www.politesi.polimi.it/handle/10589/210018 |
Image forensics | Bestagini, Mandelli | Manipulation detection for scientific images | Giovanni Zanocco | |
Video forensics | Bestagini, Cannas | Deepfake video detection through multi-look analysis | Adriano Bonfantini | |
Video processing | Bestagini, Redondi | Automatic video analysis of badminton matches | Ivan Motasov | |
Space-time audio | Bernardini, Giampiccolo, Mezza | Designing of Scattering Delay Networks Via Automatic Differentiation | Francesco Boarino | https://www.politesi.polimi.it/handle/10589/211644 |
Audio signal processing | Bernardini, Giampiccolo | A Wave Digital Extended Fixed-Point Method for Virtual Analog Applications | Davide Marin Pasin | https://www.politesi.polimi.it/handle/10589/212614 |
Space-time audio | Antonacci, Pezzoli | Direction of Arrival Estimation Using Convolutional Recurrent Neural Network with Relative Harmonic Coefficients and Triplet Loss in Noisy and Reverberating Environments | Luca Cattaneo | https://www.politesi.polimi.it/handle/10589/208311 |
Musical Acoustics | Ripamonti, Malvermi, Gonzalez | Experimental Validation for data-driven Near-field Acoustic Holography | Alessio Lampis | |
Musical Acoustics | Antonacci, Malvermi | Improved sensors for low-cost Vibrometric Kit | Fabio Guarnieri | |
Audio signal processing | Bernardini, Giampiccolo | A Wave Digital Hierarchical Quasi-Newton Method for Virtual Analog Modeling | Luca Gobbato | https://www.politesi.polimi.it/handle/10589/198537 |
Musical Acoustics | Sarti, Paoletti, Adali, Malvermi | Acoustic Characterization of materials | Marco Donzelli | |
Music Informatics | Zanoni, Comanducci | Deep Learning-based Timbre Transfer | Silvio Pol | https://www.politesi.polimi.it/handle/10589/189682 |
Audio signal processing | Antonacci, Pezzoli, Borra | A perceptual evaluation of sound field reconstruction algorithms | Miriam Papagno | https://www.politesi.polimi.it/handle/10589/186341 |
Audio signal processing | Bernardini, Giampiccolo | Characterization of Small-Size Loudspeakers for Mobile Applications | Samuele Buonassisi | https://www.politesi.polimi.it/handle/10589/189746 |
Image forensics | Bestagini, Cannas | Enhanced Amplitude SAR Imagery Splicing Localization through Land Cover Mapping Techniques | Emanuele Intagliata | |
Geophysics | Bestagini, Lipari | Salt Segmentation of Geophysical Images through Explainable CNNs | Francesco Maffezzoli | |
Music informatics | Sarti, Borrelli | Connecting NN to bio-metric signals | Joep Rene Wulms | |
Audio forensics | Bestagini, Borrelli | A metric learning approach for splicing localization based on synthetic speech detection | Francesco Castelli | https://www.politesi.polimi.it/handle/10589/184332 |
Audio forensics | Bestagini, Borrelli | Combining automatic speaker verification and prosody analysis for synthetic speech detection | Luigi Attorresi | https://www.politesi.polimi.it/handle/10589/187094 |
Music informatics | Zanoni, Borrelli | Social interaction based music recommendation system | Carlo Pulvirenti | |
Music informatics | Bestagini, Cuccovillo | Speech fingerprinting and matching for content retrieval | Laura Colzani | https://www.politesi.polimi.it/handle/10589/187212 |
Musical Acoustics | Antonacci, Olivieri | Towards white-box data-driven methods for Near-field Acoustic Holography | Hagar Kafri | |
Video forensics | Bestagini | A CNN-based detector for video frame-rate interpolation | Simone Mariani | https://www.politesi.polimi.it/handle/10589/186433 |
Image/video processing | Bestagini | Audio-video techniques for the analysis of players behaviour in Badminton matches | Samuele Bosi | https://www.politesi.polimi.it/handle/10589/186571 |
Video forensics | Bestagini, Mandelli | Forensic detection of deepfakes generated through video-to-video translation | Carmelo Fascella | https://www.politesi.polimi.it/handle/10589/182988 |
Audio signal processing | Bernardini, Mezza, Giampiccolo | Wave Digital Filter Modeling of Audio Circuits with Hysteresis Nonlinearities using Neural Networks | Oliviero Massi | https://www.politesi.polimi.it/handle/10589/186739 |
Music informatics | Antonacci, Pezzoli, Comanducci | Deep Prior Audio Inpainting | Federico Miotello | |
Audio signal processing | Bestagini, Buccoli | Low-latency speaker recognition | Francesco Salani | |
Video forensics | Bestagini, Bonettini | A Data Driven Approach to Deepfake Detection via Feature Analysis Based on Limited Data | Bingyang Hu | |
Space-time audio | Antonacci, Borrelli, Borra | Beamforming and Speaker Identification through Deep Neural Networks | Matteo Scerbo | https://www.politesi.polimi.it/handle/10589/176160 |
Music informatics | Sarti, Borrelli | Harmonic complexity estimation of jazz music | Giovanni Agosti | |
Audio forensics | Antonacci, Borrelli | A model selection method for room shape classification based on mono speech signals | Gabriele Antonacci | https://www.politesi.polimi.it/handle/10589/179887 |
Audio forensics | Bestagini | Audio splicing detection and localization based on recording device cues | Daniele Ugo Leonzio | https://www.politesi.polimi.it/handle/10589/179424 |
Audio forensics | Bestagini | Speaker-Independent Microphone Identification via Blind Channel Estimation in Noisy Condition | Antonio Giganti | https://www.politesi.polimi.it/handle/10589/179420 |
Audio forensics | Bestagini, Borrelli | Synthetic Speech Detection through Convolutional Neural Networks in Noisy Environments | Eleonora Landini | https://www.politesi.polimi.it/handle/10589/179458 |
Audio forensics | Bestagini, Borrelli, Salvi | Synthetic speech detection based on sentiment analysis | Emanuele Conti | https://www.politesi.polimi.it/handle/10589/177968 |
Multimedia forensics | Bestagini, Salvi, Borrelli | Audio-video deepfake detection through emotion recognition | Jacopo Gino | https://www.politesi.polimi.it/handle/10589/179037 |
Audio signal processing | Sarti, Giampiccolo, Bernardini | Parallel Wave Digital Implementations of Nonlinear Audio Circuits | Natoli Antonino | https://www.politesi.polimi.it/handle/10589/178037 |
Musical Acoustics | Antonacci, Malvermi | Data driven methods for frequency response functions interpolation | Matteo Acerbi | https://www.politesi.polimi.it/handle/10589/170179 |
Audio forensics | Bestagini, Mandelli | Time-Scaling Detection in Audio Recordings | Michele Pilia | https://www.politesi.polimi.it/handle/10589/173711 |
Audio forensics | Bestagini, Borrelli | Speech Intelligibility Parameters Estimation Through Convolutional Neural Networks | Mattia Papa | https://www.politesi.polimi.it/handle/10589/173756 |
Audio forensics | Antonacci | Closed and open set classification of real and AI synthesised speech | Michelangelo Medori | https://www.politesi.polimi.it/handle/10589/170094 |
Audio forensics | Antonacci | An approach to room volume estimation from single-channel speech signals based on neural networks | Castelnuovo Carlo | https://www.politesi.polimi.it/handle/10589/164749 |
Audio forensics | Bestagini | Audio Splicing Detection and Localization Based on Acoustic Cues | Capoferri Davide | https://www.politesi.polimi.it/handle/10589/164950 |
Audio processing | Sarti, Comanducci | Audio frame reconstruction from incomplete observations using Deep Learning techniques | Schils Minh Cédric | https://matheo.uliege.be/handle/2268.2/10138 |
Audio processing | Sarti, Bernardini | Wave Digital Modeling and Simulation of Nonlinear Electromagnetic Circuits | Giampiccolo Riccardo | https://www.politesi.polimi.it/handle/10589/153994 |
Audio processing | Sarti, Bernardini | Antiderivative Antialiasing in Nonlinear Wave Digital Filters | Albertini Davide | https://www.politesi.polimi.it/handle/10589/152934 |
Audio processing | Sarti, Bernardini | Wave Digital Implementation of Nonlinear Audio Circuits based on the Scattering Iterative Method | Proverbio Alessandro | https://www.politesi.polimi.it/handle/10589/152323 |
Audio processing | Antonacci | A system for super resolution vibrometric analysis through convolutional neural networks | Campagnoli Chiara | https://www.politesi.polimi.it/handle/10589/152613 |
Audio processing | Antonacci | Development of a low-cost platform for acoustic and vibrometric analysis on lutherie products, with a special focus on the estimation of the elastic parameters of the tonewood | Villa Luca | https://www.politesi.polimi.it/handle/10589/150531 |
Audio processing | Bestagini | DNN based post-filtering for quality improvement of AMR-WB decoded speech | Gupta Kishan | https://www.politesi.polimi.it/handle/10589/151000 |
Audio processing | Sarti | Studio sull'implementazione degli algoritmi per il musical instruments ed il sound reinforcement basato su un processore multicore | Aretino Michele | https://www.politesi.polimi.it/handle/10589/139079 |
Audio processing | Sarti, Bernardini | Modeling nonlinear 3-terminal devices in the wave digital domain | Vergani Alessio Emanuele | https://www.politesi.polimi.it/handle/10589/133184 |
Forensics | Bestagini | Convolutional and recurrent neural networks for video tampering detection and localization | Cannas Edoardo Daniele | https://www.politesi.polimi.it/handle/10589/149900 |
Forensics | Bestagini | A study on Bagging-Voronoi algorithm for tampering localization | Cereghetti Corinne Elena | https://www.politesi.polimi.it/handle/10589/141725 |
Forensics | Bestagini | JPEG-based forensics through convolutional neural networks | Bonettini Nicolò | https://www.politesi.polimi.it/handle/10589/133727 |
Forensics | Bestagini | Analysis of different footprints for JPEG compression detection | Chen Ke | https://www.politesi.polimi.it/handle/10589/132721 |
Geophysics | Bestagini | Landmine detection on GPR data employing convolutional autoencoder | Testa Giuseppe | https://www.politesi.polimi.it/handle/10589/142106 |
Image and video | Marcon, Paracchini | A novel tomographic approach for an early detection of multiple myeloma progression | Andrea Leggio | |
Image and video | Marcon, Paracchini | Limited angle computed tomography reconstruction with deep learning enhancement | Erbol Kasenov, Girolamo Gerace | |
Image and video | Marcon | Upper body postural assessment for common dentistry visual aids | Trotta Emilio | https://www.politesi.polimi.it/handle/10589/145563 |
Image and video | Tubaro | Real-time tracking of electrode during deep-brain surgery | Dilauro Valerio | https://www.politesi.polimi.it/handle/10589/144685 |
Image and video | Marcon | Analytical estimation of the error on the radius of industrial pipes | Lazzarin Sara | https://www.politesi.polimi.it/handle/10589/144394 |
Image and video | Marcon | 3D reconstruction from stereo video acquired from odontoiatric microscope | Spatafora Leonardo | https://www.politesi.polimi.it/handle/10589/143780 |
Image and video | Marcon | Denoising and classification of hyperspectral X-ray images for food quality assessment | Re Marco | https://www.politesi.polimi.it/handle/10589/142922 |
Image and video | Marcon | A computer vision approach for assessment of dental bracket removal | Behnami Arezoo | https://www.politesi.polimi.it/handle/10589/142362 |
Image and video | Marcon | Sistema per il rilevamento automatico di contaminanti alimentari basato su immagini iperspettrali | Ramoni Francesco | https://www.politesi.polimi.it/handle/10589/135891 |
Image and video | Marcon | Postural assessment in dentistry by computer vision | Pignatelli Nicola | https://www.politesi.polimi.it/handle/10589/135030 |
Multimedia forensics | Bestagini, Mandelli | A Multi-Modal Approach to Forensic Audio-Visual Device Identification | Davide Dal Cortivo | https://www.politesi.polimi.it/handle/10589/175593 |
Music informatics | Sarti, Bernardini, Borrelli, Mezza | Estimating Harmonic Complexity of Chord Sequences using Transformer Networks | Cecilia Morato | |
Music informatics | Zanoni, Comanducci | Modeling Harmonic Complexity in Automatic Music Generation using Conditional Variational Autoencoders | Davide Gioiosa | |
Music informatics | Sarti, Borrelli, Comanducci | Cellular music : a novel music-generation platform based on an evolutionary paradigm | Matteo Manzolini | https://www.politesi.polimi.it/handle/10589/167291 |
Music informatics | Sarti, Borrelli | Music emotion detection. A framework based on electrodermal activities. | Gioele Pozzi | https://www.politesi.polimi.it/handle/10589/152931 |
Music informatics | Sarti, Comanducci | Techniques for mitigating the impact of latency in Networked Music Performance (NMP) through adaptive metronomes | Battello Riccardo | https://www.politesi.polimi.it/handle/10589/152923 |
Music information retrieval | Sarti | Musical instrument recognition: a transfer learning approach | Molgora Andrea | https://www.politesi.polimi.it/handle/10589/147383 |
Music information retrieval | Sarti | Unsupervised domain adaptation for deep learning based acoustic scene classification | Mezza Alessandro Ilic | https://www.politesi.polimi.it/handle/10589/145573 |
Music information retrieval | Antonacci | An investigation of piano transcription algorithm for jazz music | Marzorati Giorgio | https://www.politesi.polimi.it/handle/10589/144745 |
Music information retrieval | Sarti | Automatic playlist generation using recurrent neural network | Irene Rosilde Tatiana | https://www.politesi.polimi.it/handle/10589/142101 |
Music information retrieval | Sarti | A personalized metric for music similarity using Siamese deep neural networks | Sala Federico | https://www.politesi.polimi.it/handle/10589/139078 |
Music information retrieval | Sarti | Learning a personalized similarity metric for musical content | Carloni Luca | https://www.politesi.polimi.it/handle/10589/139076 |
Music information retrieval | Sarti | Beat tracking using recurrent neural network : a transfer learning approach | Fiocchi Davide | https://www.politesi.polimi.it/handle/10589/139073 |
Music information retrieval | Sarti | Python-based framework for managing a base of complex data for music information retrieval | Avocone Giuseppe | https://www.politesi.polimi.it/handle/10589/138449 |
Music information retrieval | Sarti | Individual semantic modeling for music information retrieval | Ansidei Pietro | https://www.politesi.polimi.it/handle/10589/137160 |
Music information retrieval | Sarti | Chord sequences : evaluating the effect of complexity on preference | Foscarin Francesco | https://www.politesi.polimi.it/handle/10589/136448 |
Music information retrieval | Sarti | Audio features compensation based on coding bitrate | Tavella Maria Stella | https://www.politesi.polimi.it/handle/10589/134607 |
Musical Acoustics | Antonacci | Modal analysis and optimization of the top plate of string instruments through a parametric control of their shape | Salvi Davide | https://www.politesi.polimi.it/handle/10589/166557 |
Musical Acoustics | Antonacci, Pezzoli, Malvermi | An approach for Near-field Acoustic Holography based on Convolutional Autoencoders | Olivieri Marco | https://www.politesi.polimi.it/handle/10589/167039 |
Space-time audio | Antonacci, Borra | A parametric approach to virtual miking with distributed microphone arrays | Marco Langè | |
Space-time audio | Antonacci, Pezzoli, Borra, Bernardini | A Deep Prior Approach to Room Impulse Response Interpolation | Davide Perini | https://www.politesi.polimi.it/handle/10589/175583 |
Space-time audio | Antonacci, Comanducci | Interpreting Deep Neural Networks Models for Acoustic Source Localization using Layer-wise Relevance Propagation | Alessandro Montali | https://www.politesi.polimi.it/handle/10589/169239 |
Space-time audio | Antonacci, Borra, Bernardini | Analysis of Uniform Linear Arrays of Differential Microphones | Bertuletti Ivan | https://www.politesi.polimi.it/handle/10589/154604 |
Space-time audio | Sarti | A geometrical method of 3D sound spatialization for virtual reality applications | Iamele Jacopo | https://www.politesi.polimi.it/handle/10589/143770 |
Space-time audio | Antonacci | Convolutional neural networks applied to space-time audio processing applications | Comanducci Luca | https://www.politesi.polimi.it/handle/10589/139077 |
Space-time audio | Canclini | Denoising in the spherical harmonic domain of sound scenes acquired by compact arrays | Borrelli Clara | https://www.politesi.polimi.it/handle/10589/139075 |
Space-time audio | Antonacci | Simulazione di sistemi complessi. Case study : l'altoparlante a tromba | Moscara Francesco | https://www.politesi.polimi.it/handle/10589/139074 |
Space-time audio | Sarti, Bernardini | Steerable differential microphone arrays | Lovatello Jacopo | https://www.politesi.polimi.it/handle/10589/139072 |
Space-time audio | Antonacci | A plenacoustic approach to sound scene manipulation | Picetti Francesco | https://www.politesi.polimi.it/handle/10589/138430 |
Space-time audio | Antonacci | Reconstruction of the soundfield in arbitrary locations using the distributed ray space transform | Pezzoli Mirco | https://www.politesi.polimi.it/handle/10589/136447 |
Space-time audio | Sarti | A method for HRTF personalization : weighted sparse representation synthesis of HRTFs | Zhu Mo | https://www.politesi.polimi.it/handle/10589/135952 |
Space-time audio | Antonacci | Robust parametric spatial audio processing using beamforming techniques | Milano Guendalina | https://www.politesi.polimi.it/handle/10589/134609 |
Space-time audio | Antonacci | Estimation of singing voice quality through microphone in air and contact microphone | Landini Roberta | https://www.politesi.polimi.it/handle/10589/134604 |
Musical Acoustics | Antonacci, Malvermi | Mechanical parameter estimation for vibrometric analysis and development of a low-cost platform for violin making | Federico Simeon | https://www.politesi.polimi.it/handle/10589/170995 |
Space-time audio | Antonacci, Comanducci | 3D audio with irregular microphone setups using deep learning | Davide Mori | https://www.politesi.polimi.it/handle/10589/175608 |
Space-time audio | Antonacci, Comanducci | Personalized Sound Zone Generation using Deep Learning | Roberto Alessandri | https://www.politesi.polimi.it/handle/10589/203852 |