This item is available under a Creative Commons License for non-commercial use only
Electrical and electronic engineering
A novel method for segmenting stereo music recordings into formal musical structures, such as verses and choruses, is presented. The method performs dimensionality reduction on a time-azimuth representation of the audio, which yields a set of time activation sequences, each corresponding to a repeating structural segment. The approach rests on the assumption that each segment type, such as a verse or chorus, has a unique energy distribution across the stereo field. It can be shown that these energy distributions, together with their time activation sequences, are the latent principal components of the time-azimuth representation, and that each time activation sequence represents a structural segment such as a verse or chorus.
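The idea of recovering time activation sequences from a time-azimuth matrix can be illustrated with a small sketch. It is not the authors' implementation. It assumes a synthetic matrix (frames by azimuth bins) in which a "verse" and a "chorus" each have a fixed, distinct energy profile across the stereo field, and it uses a plain SVD-based PCA from NumPy. The names `make_toy_azimugram` and `pca_time_activations` are hypothetical, chosen for this illustration only.

```python
import numpy as np


def pca_time_activations(azimugram, n_components):
    """SVD-based PCA on a (n_frames, n_azimuth_bins) energy matrix.

    Returns the time activation sequences (n_frames, k) and the
    corresponding azimuth energy distributions (k, n_azimuth_bins).
    """
    X = np.asarray(azimugram, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each azimuth bin over time
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    activations = U[:, :n_components] * s[:n_components]
    distributions = Vt[:n_components]
    return activations, distributions


def make_toy_azimugram(seed=0):
    """Synthetic stand-in for an azimugram (an assumption, not real audio)."""
    rng = np.random.default_rng(seed)
    bins = np.arange(32)
    # Two segment types with energy concentrated at different azimuths.
    verse = np.exp(-0.5 * ((bins - 8) / 2.0) ** 2)
    chorus = np.exp(-0.5 * ((bins - 24) / 2.0) ** 2)
    profiles = np.stack([verse, chorus])
    # Song form: verse, chorus, verse, chorus, 50 frames each.
    labels = np.repeat([0, 1, 0, 1], 50)
    X = profiles[labels] + 0.01 * rng.random((labels.size, bins.size))
    return X, labels


X, labels = make_toy_azimugram()
activations, distributions = pca_time_activations(X, n_components=1)

# The sign of a principal component is arbitrary, so the thresholded
# activation matches the true segment labels only up to a label swap.
segment = (activations[:, 0] > 0).astype(int)
```

In this toy case a single component is enough, because only two segment types exist and the first time activation sequence switches sign at each verse/chorus boundary. Real recordings would need more components and some smoothing or thresholding strategy, which this sketch does not attempt.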
Barry, D., Gainza, M., Coyle, E.: Music Structure Segmentation using the Azimugram in conjunction with Principal Component Analysis. Audio Engineering Society, 123rd Convention, October 5–8, 2007, New York, NY, USA.