Visualize Music using Generative Models of Artificial Intelligence

Project: Visualize Music using Generative Models of Artificial Intelligence
Program: DUIRI - Discovery Undergraduate Interdisciplinary Research Internship
Term: Fall 2024
Status: Accepted
Research Area: Artificial Intelligence, Generative Models, Music, Hearing Impaired
Description: This project belongs to the strategic area of global health. Playing and listening to music is one of the most universal forms of communication and entertainment across cultures. Unfortunately, many people are hearing-impaired and cannot fully enjoy music. By 2050, nearly 2.5 billion people are projected to have some degree of hearing loss, and at least 700 million will require hearing rehabilitation (source: World Health Organization). This project aims to create technologies that visualize music, helping the hearing-impaired understand and enjoy it. The technical process comprises the following steps:
(a) Music is analyzed and classified along multiple dimensions, including instrumentation, emotion, tempo, pitch range, harmony, and dynamics. This analysis yields textual descriptions.
(b) The texts form features that are inputs to machine learning models for classification.
(c) These models predict the genre of the input audio and its associated emotions. The predictions form the basis for text prompts that describe the music.
(d) The prompts are fed into generative models (e.g., DALL-E 2, Stable Diffusion, and Midjourney) to create visual representations, such as images or videos.
(e) The visual representations are continuously updated as the music plays, so that the visual effects mirror the musical changes.
Variations in the prompts produce different styles of images. The images and videos are generated without human intervention, which significantly reduces the cost and time needed to produce visual representations tailored to a specific piece of music.
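Steps (a) through (e) can be illustrated with a minimal Python sketch. It is not the project's implementation. The `SegmentFeatures` fields, the thresholds, the emotion labels, and the default style string are all hypothetical, and a simple rule stands in for the trained classifiers of steps (b) and (c). Genre prediction is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class SegmentFeatures:
    """Hypothetical per-segment analysis result (step a)."""
    start_s: float       # segment start time in seconds
    tempo_bpm: float     # estimated tempo
    mode: str            # "major" or "minor"
    loudness_db: float   # level relative to full scale (0 dB)
    instruments: tuple   # instrument names, e.g. ("piano",)


def describe(seg):
    """Step (a): turn numeric features into a textual description."""
    if seg.tempo_bpm >= 120:
        pace = "fast"
    elif seg.tempo_bpm >= 80:
        pace = "moderate"
    else:
        pace = "slow"
    level = "loud" if seg.loudness_db >= -12 else "soft"
    players = ", ".join(seg.instruments)
    return f"{pace} {level} {seg.mode}-key passage for {players}"


def classify_emotion(seg):
    """Steps (b)-(c): a toy rule standing in for a trained classifier."""
    if seg.mode == "major":
        return "joyful" if seg.tempo_bpm >= 120 else "serene"
    return "tense" if seg.tempo_bpm >= 120 else "melancholic"


def build_prompt(seg, style="watercolor painting"):
    """Steps (c)-(d): compose the text prompt for an image generator."""
    return f"{style}, {classify_emotion(seg)} mood, {describe(seg)}"


def prompts_over_time(segments, style="watercolor painting"):
    """Step (e): one (timestamp, prompt) pair per segment, in playback order."""
    ordered = sorted(segments, key=lambda s: s.start_s)
    return [(seg.start_s, build_prompt(seg, style)) for seg in ordered]


if __name__ == "__main__":
    demo = [
        SegmentFeatures(30.0, 140, "major", -8, ("strings", "brass")),
        SegmentFeatures(0.0, 72, "minor", -20, ("piano",)),
    ]
    for start, prompt in prompts_over_time(demo):
        print(f"{start:6.1f}s  {prompt}")
```

Each prompt would then be sent to an image or video generator at the matching timestamp. That call is left out because the description does not specify which generator interface is used. Changing the `style` argument illustrates how variations in the prompt could produce different styles of images.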
This approach differs from video recordings of music performances, because recordings provide only a passive viewing experience. In contrast, the proposed solution has the advantage of creating a personalized and interactive entertainment experience for the hearing-impaired. The project will conduct extensive user studies to evaluate whether generative models can effectively produce visual representations of music's rich expression, heralding a novel form of entertainment for the hearing-impaired.
Supervisor: Yung-hsiang Lu
Mentor: Purvish Jatin Jajal
Type of work required: The students will (1) conduct a literature survey to understand the state of the art in using generative artificial intelligence for entertainment, (2) develop software that implements the steps listed in the description, (3) evaluate the efficacy of the proposed solution through user studies, and (4) design a website to promote generative art used in entertainment.
Websites: https://ai4musicians.org/
Qualifications: Has taken at least one programming course
Credit Hours: 2
Weekly Hours: 10 (estimated)

This project is not currently accepting applications.