Riffusion, an AI model that generates music from text prompts by constructing a visual representation of sound and converting it to audio for playback, was launched on Thursday. Developed by Seth Forsgren and Hayk Martiros as a side project, it applies visual latent diffusion to sound processing in a novel way, using a fine-tuned version of the Stable Diffusion 1.5 image-synthesis model. Riffusion stores audio as spectrograms: two-dimensional images in which the X-axis represents time (the left-to-right order in which the frequencies are played) and the Y-axis represents the frequency of the sounds.
The color of each pixel, meanwhile, indicates the volume of the sound at that specific instant in time. Because a spectrogram is a kind of image, Stable Diffusion can process it. Forsgren and Martiros trained a custom Stable Diffusion model on example spectrograms paired with descriptions of the sounds or musical genres they represented. With that training, Riffusion can produce fresh music on demand from text prompts that specify the genre of music or sound you want, such as "jazz," "rock," or even keystrokes on a keyboard. Riffusion generates the spectrogram image, converts it to sound using Torchaudio, and then plays it back as audio.
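That last step, turning a generated image back into sound, can be sketched with a minimal NumPy Griffin-Lim-style loop: pixel intensities stand in for spectral magnitudes, and the missing phase is reconstructed iteratively. (Riffusion itself performs this conversion with Torchaudio; every name and parameter below is an illustrative assumption, not Riffusion's actual code.)

```python
import numpy as np

rng = np.random.default_rng(0)

n_fft, hop = 256, 128                      # 50% overlap with a Hann window
n_freq, n_frames = n_fft // 2 + 1, 64
window = np.hanning(n_fft)

# Pixel intensities of a generated spectrogram image stand in for spectral
# magnitudes: rows = frequency bins (Y-axis), columns = time frames (X-axis).
magnitude = rng.random((n_freq, n_frames))

def istft(S):
    """Overlap-add inverse STFT of a complex spectrogram."""
    frames = np.fft.irfft(S, n=n_fft, axis=0) * window[:, None]
    out = np.zeros((S.shape[1] - 1) * hop + n_fft)
    for i in range(S.shape[1]):
        out[i * hop : i * hop + n_fft] += frames[:, i]
    return out

def stft(x):
    """Forward STFT matching istft's framing."""
    cols = [np.fft.rfft(x[i * hop : i * hop + n_fft] * window)
            for i in range(n_frames)]
    return np.stack(cols, axis=1)

# Griffin-Lim: start from a random phase, then alternate between enforcing
# the image-derived magnitudes and projecting through ISTFT/STFT until the
# phase is approximately self-consistent.
phase = np.exp(2j * np.pi * rng.random(magnitude.shape))
for _ in range(32):
    audio = istft(magnitude * phase)
    phase = np.exp(1j * np.angle(stft(audio)))

audio = istft(magnitude * phase)
print(audio.shape)  # (8320,) -- 63 hops of 128 samples plus one full frame
```

The overlapping windows are what make the iteration meaningful: with redundancy between frames, an arbitrary phase is inconsistent, and each pass nudges it toward one that actually fits the magnitudes the image encodes.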
The Riffusion website offers an interactive web tool that lets users engage with the AI model: it creates interpolated spectrograms that are seamlessly stitched together for uninterrupted playback, continuously displaying the current spectrogram on the left side of the page. The developers state on Riffusion's explanation page: "This is the v1.5 Stable Diffusion model with no alterations, merely fine-tuned on images of spectrograms combined with text. By changing the seed, it can produce an endless number of prompt variations. Img2img, inpainting, negative prompts, and interpolation are all the same web user interfaces and techniques that function right out of the box."
Additionally, it can blend several musical styles. Typing "smooth tropical dance jazz," for instance, combines components of those genres into a fresh outcome, fostering innovation through the mixing of forms. Riffusion is not the first AI-driven music generator, of course. Harmonai released Dance Diffusion, an AI-driven generative music model, earlier this year; OpenAI's Jukebox, unveiled in 2020, also relies on a neural network; and websites like Soundraw continuously produce music on demand. Compared with those more organized AI music efforts, Riffusion feels like the side project it is. Although the music it produces fluctuates between intriguing and incomprehensible, it is nevertheless an impressive use of latent diffusion, manipulating audio in a visual domain.