Toward an Artificial Intelligence-Based Music Tutor

This project envisions using technology to make music education, and specifically learning to play an instrument, more accessible, enjoyable, and comprehensive than current approaches allow. While there is no substitute for human instruction, especially in creative and expressive subjects like music, technology can productively augment existing teaching: it can fill the time gaps between interactions with a human instructor and address knowledge gaps about particular elements and genres of musical expression. This project aims to apply recent breakthroughs in AI to bridge the gap between human- and software-based music education.

Need

The first concepts a person learns when picking up a new instrument are the names of pitches, how those pitches are notated, and how to manipulate the instrument to reliably produce the pitch corresponding to each note. To date, this note- and pitch-based knowledge represents the extent of mainstream research in technology-aided music education. As musicians of any skill level can attest, however, identifying and playing the correct notes is only half the skill set needed to perform a piece of music. The other half involves learning to read and express articulation, dynamics, phrasing, and playing styles. A comprehensive, combined literature and commercial product search reveals a notable lack of products or innovations aimed at assisting students with these aspects of learning to play a musical instrument; addressing that gap is the technological goal of this project.

Approach

The idea is to create an infrastructure and set of tools around an Intelligent Music Tutor that can ingest a musical score or piece of sheet music and parse not only the correct notes, chords, timings, and note durations, but also all expressive musical notation: dynamics (piano, forte, etc.), playing-style indicators (staccato, pizzicato, strummed, etc.), phrasing (crescendos, rallentandos, etc.), instrument modifications (sostenuto, trumpet mute, etc.), and rhythm-style indicators (swing tempo, etc.). The tutor can then intelligently listen to a student playing the piece and provide feedback not only on whether the correct notes and durations were played, but also on how closely the student's musical expression matches what is notated.
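As a concrete illustration of the kind of score representation this parsing implies, the sketch below models a single notated event that carries both pitch/timing information and expressive markings. All class and field names here are hypothetical and chosen purely for illustration; the project does not prescribe this structure.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

# Hypothetical data model for a parsed score; names are illustrative only.

class Dynamic(Enum):
    PIANISSIMO = "pp"
    PIANO = "p"
    MEZZO_FORTE = "mf"
    FORTE = "f"
    FORTISSIMO = "ff"

class Articulation(Enum):
    STACCATO = "staccato"
    LEGATO = "legato"
    PIZZICATO = "pizzicato"
    ACCENT = "accent"

@dataclass
class NoteEvent:
    """One notated event with pitch, timing, and expressive markings."""
    pitch: Optional[str]        # e.g. "C#4"; None for a rest
    onset_beats: float          # position within the piece, in beats
    duration_beats: float       # notated duration, in beats
    dynamic: Optional[Dynamic] = None
    articulations: list[Articulation] = field(default_factory=list)
    phrase_marks: list[str] = field(default_factory=list)  # e.g. "crescendo start"
    tempo_marks: list[str] = field(default_factory=list)   # e.g. "rallentando"

@dataclass
class Score:
    """Digital representation of one instrument's part in a piece."""
    title: str
    tempo_bpm: float
    time_signature: tuple[int, int]
    events: list[NoteEvent]
```

Keeping the expressive markings alongside each note, rather than in a separate layer, is one possible design choice; it makes it straightforward for an assessor to compare what was notated against what was played, event by event.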

Many different tutoring modes can be used to deliver this feedback, ranging from an immediate, real-time graphical interface that constantly updates and offers suggestions as each individual note is played, to comprehensive feedback on entire phrases or musical sections, to an overall score or assessment upon conclusion of a piece. This feedback could even be integrated into curricular training courses that focus on learning and perfecting different aspects of musical expression, or used to generate individually tailored training modules targeting the specific sections of a piece that require additional attention.
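One minimal way to express these feedback granularities in software is sketched below. The mode names and the per-note result format (records with "expected" and "issue" fields) are assumptions made for illustration only, not part of the project's design.

```python
from enum import Enum, auto

# Hypothetical sketch of the feedback granularities described above.

class FeedbackMode(Enum):
    PER_NOTE = auto()     # immediate suggestions as each note is played
    PER_PHRASE = auto()   # summary after each phrase or musical section
    PER_PIECE = auto()    # overall assessment at the end of a piece

def summarize(results: list[dict], mode: FeedbackMode) -> str:
    """Aggregate per-note assessment results at the requested granularity."""
    if mode is FeedbackMode.PER_NOTE:
        # Report only the most recent note so the display can update in real time.
        latest = results[-1]
        return f"{latest['expected']}: {latest['issue'] or 'good'}"
    correct = sum(1 for r in results if r["issue"] is None)
    scope = "phrase" if mode is FeedbackMode.PER_PHRASE else "piece"
    return f"{correct}/{len(results)} notes played correctly in this {scope}"
```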

It is important to note that this technology is not intended to replace traditional human instruction, but rather to augment it. With such a tool, students can progress more quickly than they otherwise would; they can learn new techniques and understand and correct errors in real time, rather than reinforcing bad habits for weeks before a human teacher can identify and correct them; they can explore different genres and styles of music on a single platform, something that typically requires instruction from different teachers; and they can learn on their own time and at their own pace.

Opportunities

A wide range of needs could be addressed by creating dedicated tools based on this research, from standalone software packages all the way to technology integrated into “training instruments,” including existing digital pianos. Beyond that, there is substantial potential for additional applications and future commercial products. Both the research and product-maturation phases of this work lend themselves to applications ranging from the very simple, such as an automatic page-turning and place-keeping app for performing artists; to the intermediate, such as tools that translate songs between musical genres, both for playback and for learning how to play in different styles; to the advanced, such as a tool for the automatic, intelligent creation and transcription of full musical scores, a task so specialized that it is currently performed by only a few very talented arrangers, especially for full orchestrations and symphonic or band compositions.

Outcomes

The short-term outcomes targeted upon completion of this specific project include a full implementation of a Web-Based Music API, a Rhythm and Phrasing Detector, and a minimum viable prototype of the Single-Instrument Live Performance Assessor. The Web API will be used to parse and store a digital representation of the unaltered sheet music, as well as to play back illustrative audio clips of musical segments for a student. The Rhythm Detector will quantize an incoming stream of audio from a microphone into contextually relevant musical phrases and beats. Finally, the Live Performance Assessor will interpret the incoming audio stream and compare it to a known piece of sheet music to provide feedback on the student’s performance. Because research into the detection and contextualization of musical expression will not yet be complete during this period, the assessor tool will only take pitch correctness and note durations into account. Even though this outcome is a long way from the ultimate goal of the project, it nonetheless represents a crucial and non-trivial research step toward that goal.
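The sketch below illustrates the kind of pitch-and-duration-only comparison such a prototype assessor might perform, assuming both the notated score and the transcribed performance are available as lists of note events. The tolerances, field names, and simple greedy matching strategy are illustrative assumptions, not the project's actual design.

```python
# Illustrative sketch of a pitch-and-duration-only performance comparison.
# Field names, tolerances, and matching strategy are assumptions for clarity.

def assess_performance(expected, played, onset_tol=0.25, duration_tol=0.25):
    """Compare played note events against the notated score.

    Both inputs are lists of dicts with 'pitch' (e.g. "C#4"), 'onset_beats',
    and 'duration_beats' keys, sorted by onset. Returns per-note feedback.
    """
    feedback = []
    j = 0
    for note in expected:
        # Advance to the first played note whose onset could match this one.
        while j < len(played) and played[j]["onset_beats"] < note["onset_beats"] - onset_tol:
            j += 1
        if j >= len(played) or played[j]["onset_beats"] > note["onset_beats"] + onset_tol:
            feedback.append({"expected": note["pitch"], "issue": "missed note"})
            continue
        hit = played[j]
        j += 1
        if hit["pitch"] != note["pitch"]:
            feedback.append({"expected": note["pitch"],
                             "issue": f"wrong pitch ({hit['pitch']})"})
        elif abs(hit["duration_beats"] - note["duration_beats"]) > duration_tol:
            feedback.append({"expected": note["pitch"], "issue": "duration off"})
        else:
            feedback.append({"expected": note["pitch"], "issue": None})
    return feedback
```

In a later phase, the same per-note comparison could be extended with additional fields for dynamics, articulation, and phrasing once the expression-detection research is in place.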

Sponsors
LIVE Initiative
Lead PI
Will Hedgecock
Co-PI
Pascal Le Boeuf