train.zip
Contains 1,948 audio files of single musical notes played on a range of pianos, for training a deep learning model.

test.zip
Contains 516 audio files of the same nature as the training files. These are for testing the performance of a model only; do not update model parameters with them.

images.zip
Contains 88 ground truth sheet music images, one for each of the 88 notes playable on a piano. Accidentals appear only as sharps: because only single notes are considered, there is no musical context to determine whether a note should be written as a sharp or a flat.

metadata.zip
Contains the annotation files to be used by a dataloader when training a model. There are separate train and test annotations in case you would like two separate dataloaders for the train and test sets. The annotation files align each audio file with its correct ground truth image and also determine which records belong to which set (train or test), so they must be edited if you want to change either the pairing or the split.
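
The role of the annotation files can be illustrated with a minimal sketch that reads one annotation file and pairs each audio file with its ground truth image. The layout of the annotation files is not described here, so the sketch assumes a CSV file with two hypothetical columns, `audio_file` and `image_file`; the function names `load_pairs` and `check_split` are also illustrative and not part of the dataset.

```python
import csv
from pathlib import Path


def load_pairs(annotation_csv, audio_dir, image_dir,
               audio_col="audio_file", image_col="image_file"):
    """Return (audio_path, image_path) pairs listed in one annotation file.

    ASSUMPTION: the annotation file is a CSV with a header row, and the
    column names above are hypothetical. Adjust them to the real headers.
    """
    pairs = []
    with open(annotation_csv, newline="") as f:
        for row in csv.DictReader(f):
            pairs.append((Path(audio_dir) / row[audio_col],
                          Path(image_dir) / row[image_col]))
    return pairs


def check_split(train_pairs, test_pairs):
    """Raise ValueError if an audio file name appears in both sets."""
    train_names = {audio.name for audio, _ in train_pairs}
    test_names = {audio.name for audio, _ in test_pairs}
    overlap = train_names & test_names
    if overlap:
        raise ValueError(f"{len(overlap)} audio file(s) appear in both sets")
```

Calling `load_pairs` once per annotation file gives two independent lists that can back separate train and test dataloaders, and `check_split` is a cheap guard to run after editing the annotations to change the split.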