Computational Music Analysis
New technologies and research dedicated to the analysis of music using computers. Towards Music 2.0!
Scooped by Olivier Lartillot!

Is There a Smarter Path to Artificial Intelligence? Some Experts Hope So - The New York Times

A branch of A.I. called deep learning has transformed computer performance in tasks like vision and speech. But meaning, reasoning and common sense remain elusive.
Olivier Lartillot's insight:
Could have an important impact on computational music analysis.

Slightly condensed article:


For the past five years, the hottest thing in artificial intelligence has been a branch known as deep learning. The grandly named statistical technique, put simply, gives computers a way to learn by processing vast amounts of data. Thanks to deep learning, computers can easily identify faces and recognize spoken words, making other forms of humanlike intelligence suddenly seem within reach. Companies like Google, Facebook and Microsoft have poured money into deep learning. Start-ups pursuing everything from cancer cures to back-office automation trumpet their deep learning expertise. And the technology’s perception and pattern-matching abilities are being applied to improve progress in fields such as drug discovery and self-driving cars.

But now some scientists are asking whether deep learning is really so deep after all.

In recent conversations, online comments and a few lengthy essays, a growing number of A.I. experts are warning that the infatuation with deep learning may well breed myopia and overinvestment now — and disillusionment later.

“There is no real intelligence there. And I think that trusting these brute force algorithms too much is a faith misplaced.” The danger is that A.I. will run into a technical wall and eventually face a popular backlash — a familiar pattern in artificial intelligence since that term was coined in the 1950s. With deep learning in particular, the concerns are being fueled by the technology’s limits.

Deep learning algorithms train on a batch of related data — like pictures of human faces — and are then fed more and more data, which steadily improve the software’s pattern-matching accuracy. Although the technique has spawned successes, the results are largely confined to fields where those huge data sets are available and the tasks are well defined, like labeling images or translating speech to text.

The technology struggles in the more open terrains of intelligence — that is, meaning, reasoning and common-sense knowledge. While deep learning software can instantly identify millions of words, it has no understanding of a concept like “justice,” “democracy” or “meddling.”

Researchers have shown that deep learning can be easily fooled. Scramble a relative handful of pixels, and the technology can mistake a turtle for a rifle or a parking sign for a refrigerator.

In a widely read article published early this year, Gary Marcus, a professor at New York University, posed the question: “Is deep learning approaching a wall? As is so often the case, the patterns extracted by deep learning are more superficial than they initially appear.”

If the reach of deep learning is limited, too much money and too many fine minds may now be devoted to it. “We run the risk of missing other important concepts and paths to advancing A.I.”

Amid the debate, some research groups, start-ups and computer scientists are showing more interest in approaches to artificial intelligence that address some of deep learning’s weaknesses. For one, the Allen Institute, a nonprofit lab in Seattle, announced in February that it would invest $125 million over the next three years largely in research to teach machines to generate common-sense knowledge — an initiative called Project Alexandria.

While that program and other efforts vary, their common goal is a broader and more flexible intelligence than deep learning. And they are typically far less data hungry. They often use deep learning as one ingredient among others in their recipe.

“We’re not anti-deep learning. We’re trying to raise the sights of A.I., not criticize tools.”

Those other, non-deep learning tools are often old techniques employed in new ways. At Kyndi, a Silicon Valley start-up, computer scientists are writing code in Prolog, a programming language that dates to the 1970s. It was designed for the reasoning and knowledge representation side of A.I., which processes facts and concepts, and tries to complete tasks that are not always well defined. Deep learning comes from the statistical side of A.I. known as machine learning.

Kyndi has been able to use very little training data to automate the generation of facts, concepts and inferences. The Kyndi system can train on 10 to 30 scientific documents of 10 to 50 pages each. Once trained, Kyndi’s software can identify concepts and not just words.

Kyndi and others are betting that the time is finally right to take on some of the more daunting challenges in A.I. That echoes the trajectory of deep learning, which made little progress for decades before the recent explosion of digital data and ever-faster computers fueled leaps in performance of its so-called neural networks. Those networks are digital layers loosely analogous to biological neurons. The “deep” refers to many layers.

There are other hopeful signs in the beyond-deep-learning camp. Vicarious, a start-up developing robots that can quickly switch from task to task like humans, published promising research in the journal Science last fall. Its A.I. technology learned from relatively few examples to mimic human visual intelligence, using data 300 times more efficiently than deep learning models. The system also broke through the defenses of captchas, the squiggly letter identification tests on websites meant to foil software intruders.

Vicarious, whose investors include Elon Musk, Jeff Bezos and Mark Zuckerberg, is a prominent example of the entrepreneurial pursuit of new paths in A.I.

“Deep learning has given us a glimpse of the promised land, but we need to invest in other approaches.”
Scooped by Olivier Lartillot!

Opinion | A Note to the Classically Insecure - The New York Times

Most people believe they don’t “get” classical music. But in the most important sense, they do.
Olivier Lartillot's insight:
One reason why computational visualisation of music would be of great interest. Excerpts from the article:

Although [my friend] loved classical music, when he listened to it he wasn’t able to perceive anything other than his own emotional reactions. Could it be true? Well, he thought it was. But he was wrong. What my friend was expressing was merely a symptom of a common affliction, one that crosses all intellectual, social and economic classes: the Classical Music Insecurity Complex.

There’s no question that he perceives more than just his own reactions. Lots more. In every piece he listens to he perceives changes, both great and small, in tempo, volume, pitch and instrumentation. He perceives melodies, harmonies and rhythms, and their patterns. He perceives, in short, virtually all the musical ingredients that composers manipulate to stimulate emotional effects, which is precisely why he’s emotionally affected. His “problem” isn’t perception — it’s description. And what he doesn’t know is the jargon, the technical terms for the ingredients and manipulations.

It’s not even essential that he be aware of the specific musical and technical means by which his reactions are being stimulated.

It’s sad but true that many people denigrate and distrust their own reactions to classical music out of fear that they don’t “know enough,” and that other, more sophisticated folks know more. When people leave the movie theater they rarely hesitate to give their opinion of the movie, and it never occurs to them that they don’t have a right to that opinion. And yet after most classical music concerts you can swing your program around from any spot in the lobby and hit a dozen perfectly capable and intelligent people issuing apologetic disclaimers: “Boy, I really loved that — but I’m no expert” or “It sounded pretty awful to me, but I don’t really know anything, so I guess I just didn’t get it.” At least those people showed up. Many others are too intimidated to attend classical concerts at all.

It’s human nature to want to know more, and to try to understand and explain our experiences and reactions. And there’s no denying that the more we know about music, as with cooking or gardening or football, the more levels of enjoyment are available to us, and the better we’re able to recognize great achievement. Do we have to know the Latin names of flowers — or the English names, for that matter — to be moved by the beauty of a garden? No. Do we have to know about blocking schemes and “defensive packages” to be excited when our team scores a touchdown? No. But we find these things … interesting. They add to our appreciation.

The Classical Music Insecurity Complex is a barrier of discomfort. Experience, exposure and familiarity play critical roles in helping to lower that barrier, and a little learning, along with basic explanations of technical (and foreign) terms and concepts, can be of great value. What is not of value, and is in fact completely off-putting and counterproductive, is the kind of introductory concert talk, review or program note that uses technical terms rather than plain English to explain other technical terms and to “describe” musical works. Program notes that use phrases like “the work features a truncated development with chromatic modulations to distant keys and modally inflected motivic cells,” for example, do not exactly help to break down barriers and put people at ease. Perhaps it’s overly optimistic of me, but I still cling to the hope that, with the right approaches and experiences, longtime sufferers will feel sufficiently encouraged to go ahead and jettison the C.M.I. Complex outright. I’d like the legions of actual and potential classical music lovers to believe that they hear more than they can name, and that the very point of listening to great music is to be moved, not to put names on what moves you.
Scooped by Olivier Lartillot!

Meet the man classifying every genre of music on Spotify — all 1,387 of them | Toronto Star

Do you prefer ‘neurostep’ or ‘vapor house’? Spotify’s ‘data alchemist’ is using technology to help identify new musical trends.
Olivier Lartillot's insight:
From the article:

Would you classify Justin Bieber as pop or “pop Christmas”? Is Arcade Fire a rock band or “permanent wave”? How about describing Ed Sheeran as “neo mellow,” Alabama Shakes as “stomp and holler,” or the versatile Grimes as “grave wave,” “metropopolis,” “nu gaze” or all of the above? 

Spotify’s “data alchemist” Glenn McDonald can proudly claim authorship of some of those eccentric descriptors. His company has used a complex algorithm to analyze and categorize upwards of 60 million songs on a molecular level — and the micro-classifications now number 1,387 sub-genres in total. 

Ultimately, these machinations are allowing a new level of specificity to answer that impossible old first-date query: What kind of music are you into?

“The most interesting thing is how much music there is in the world. The number of places where I expected to find five or 10 bands and found 200 is just amazing.” 

Using a tool called machine listening, songs are analyzed by Spotify’s music-intelligence division, the Echo Nest, based on their digital signatures for a number of factors, including tempo, acoustic-ness, energy, danceability, strength of the beat and emotional tone. Taken together, those numbers can identify the distinguishing aural characteristics of different genres and regional sounds.
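As a rough illustration of how such per-track numbers could separate genres, the sketch below represents each song as a vector of feature scores and assigns it to the genre with the closest average profile. The feature names echo those listed in the article, but the values, scales and function names are invented for illustration; this is not the Echo Nest's actual pipeline.

```python
import math

# Illustrative feature names, echoing those mentioned above; real systems
# use many more dimensions and proprietary scales.
FEATURES = ["tempo", "acousticness", "energy", "danceability", "valence"]

def to_vector(track):
    """Turn a dict of feature scores (assumed normalised to 0-1) into a vector."""
    return [track[f] for f in FEATURES]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def nearest_genre(track, centroids):
    """Pick the genre whose average feature vector best matches the track."""
    v = to_vector(track)
    return max(centroids, key=lambda g: cosine_similarity(v, to_vector(centroids[g])))
```

Taken together, even this crude distance measure hints at how "the distinguishing aural characteristics of different genres" can fall out of a handful of numbers.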

McDonald has provided a visual representation of the subtle sonic differences: a map that places music along a spectrum where mechanical sounds reside at the top, organic at the bottom, atmospheric on the left and bouncy on the right. The still-evolving tool can highlight the subtle differences in the hip hop of Quebec and Finland, for instance, or create cross-genre comparisons between such unlikely bedfellows as Thai indie and Spanish new wave. Any curiosity about what a genre called “necrogrind,” “fallen angel” or “electrofox” could possibly sound like can be satisfied by clicking on it to hear a sample.

The process is still imperfect. At one point, the computers confused the sound of the banjo with human singers, simply because not enough banjo samples had been entered. In another instance, Indonesian dangdut was classified as near-identical to American country because, McDonald said, the computers had no protocol to account for twang.

“(People) imagine metal humanoid robots sitting in chairs with silver headphones on, nodding mechanically to songs and making up their robot minds. But the process is totally different. There’s no emotion involved. The machines are not pretending to be people. They’re just trying to find mathematical ways of approximating the effect that humans get from music so the scores can be intelligible and reliable.”

Once the machines have identified sonic similarities, a human touch is required to research the sub-genres or create new ones, a task that often falls to McDonald himself. It can be vexing as sub-genres are divvied into progressively tinier fractions; indie pop, for instance, has been subdivided into shiver pop, gauze pop, etherpop, indie fuzz pop, popgaze and so on.

In analyzing so much musical minutiae, McDonald has noticed that for all this century’s rampant boundary-blurring, distinctive differences still exist between types of music. “I don’t think genres have gone away ... any more than cuisines have disappeared. There are more fusions,” he said. “So just like you can find the Korean burrito place, that doesn’t mean there aren’t also Korean restaurants and Mexican restaurants. Maybe the firmness of the boundaries has gone away. It’s more possible to exist in the spaces between genres as they blend into each other.”

A look at some of Spotify’s more creative genre headings:

Catstep: Vancouver-based EDM label Monstercat is sufficiently unique that its signature sound spawned its own feline-themed subgenre.

Deep sunset lounge: Generally, McDonald uses “deep” to narrow a genre to its most obscure outliers. In this case, the term refers to the post-party chill-out music favoured by “people in the know.”

Epicore: This is McDonald’s shorthand for the type of “big, booming” music that scores seemingly every movie trailer.

Laboratorio: This is the music “you imagine people producing with beakers,” said McDonald, who coined the phrase after seeing Icelandic band Amiina open for Sigur Ros.

Stomp and whittle: One of McDonald’s prouder creations, this genre is reserved for acts skewing slightly less emphatic than “stomp-and-holler” artists like the Lumineers or Alabama Shakes.
Scooped by Olivier Lartillot!

Can you get from 'dog' to 'car' with one pixel? Japanese AI boffins can

Fooling an image classifier is surprisingly easy and suggests novel attacks
Olivier Lartillot's insight:
Knowing that a very large share of current research in Computational Music Analysis (or Music Information Retrieval, MIR, if you prefer) is based on deep learning, it might be unsettling to learn that those systems might not be very robust. The research presented below focuses on images, but could it be applied to music? Suppose, for instance, a model that can supposedly detect different genres of music. If this attack carried over, we could take a piece of music that clearly belongs to a given genre, modify the sound in a way that is not noticeable to human listeners, and suddenly the machine would recognise that piece as belonging to another genre.

(Thanks Davide Andrea Mauro!) 

UPDATE 2: Interesting discussion about this post here:


Excerpts: "It doesn't take much to confuse AI image classifiers: a group from Japan's Kyushu University reckon you can fool them by changing the value of a single pixel in an image. They were working with two objectives: first, to predictably trick a Deep Neural Network (DNN), and second, to automate their attack as far as possible.

In other words, what does it take to get the AI to look at an image of an automobile, and classify it as a dog? The surprising answer: an adversarial perturbation of just one pixel would do the trick – a kind of attack that you'd be unlikely to detect with the naked eye.

The researchers came up with the startling conclusion that a one-pixel attack worked on nearly three-quarters of standard training images. Not only that, but the boffins didn't need to know anything about the inside of the DNN – as they put it, they only needed its “black box” output of probability labels to function. 

The attack was based on a technique called “differential evolution” (DE), an optimisation method which in this case identified the best target for their attack. Editing that one, carefully-chosen pixel meant “each natural image can be perturbed to 2.3 other classes on average”. The best result was an image of a dog in the training set, which the trio managed to trick the DNN into classifying as all nine of the “target” classes – airplane, automobile, bird, cat, deer, frog, horse, ship and truck."
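The search described above can be sketched in a few lines. The snippet below is a toy reconstruction, not the authors' code: it treats the classifier as a black box returning class probabilities, exactly as in the paper's setting, and uses SciPy's differential evolution to pick the one pixel position and colour that most reduce the true class's confidence. All names are illustrative.

```python
# Toy reconstruction of a one-pixel attack via differential evolution.
# `classifier` is any black-box function returning class probabilities;
# only its output is needed, matching the paper's black-box setting.
import numpy as np
from scipy.optimize import differential_evolution

def one_pixel_attack(image, classifier, true_label):
    """Search for one (x, y, r, g, b) perturbation that minimises the
    probability the classifier assigns to the true class."""
    h, w, _ = image.shape

    def perturb(params):
        x, y, r, g, b = params
        adv = image.copy()
        adv[int(x), int(y)] = [r, g, b]  # overwrite exactly one pixel
        return adv

    def fitness(params):
        # Lower is better for the attacker: confidence in the true class.
        return classifier(perturb(params))[true_label]

    bounds = [(0, h - 1), (0, w - 1), (0, 255), (0, 255), (0, 255)]
    result = differential_evolution(fitness, bounds, maxiter=30, popsize=10, seed=0)
    return perturb(result.x)
```

With a real DNN, each fitness call would be one forward pass; the excerpt reports that a search of this kind succeeded on nearly three-quarters of the tested images.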
Scooped by Olivier Lartillot!

Recommendation systems are seriously harming the "biodiversity" of music

Technologies such as recommendation systems are seriously harming the "biodiversity" of music.

Olivier Lartillot's insight:
This article raises a very important issue that the music industry, and especially the people designing the new technologies, should consider very carefully: namely, that technologies such as recommendation systems are seriously harming the "biodiversity" of music.

I think the Music Information Retrieval community, in particular, should take the time to become aware of this important problem, and should develop strategies, including axes of research, to counteract these highly worrying developments. Perhaps research has already been carried out along those lines; if so, it would be important to learn more about it.

Excerpt from the article:

«There is an important reason why music in particular and the arts in general are floundering. That reason is that, with a very few exceptions, no one cares any more. Much has been made of the transition from an analogue to a binary age. Not so much has been made of the even more insidious transition from a binary to an algorithmic age. There is a limited understanding of the algorithms used by Google, Amazon, Facebook and other social media platforms to create content filter bubbles which ensure that we stay in our self-defined comfort zones. Even less attention has been paid to how the algorithms virus has expanded beyond online platforms. 

For example, the Guardian uses editorial algorithms to unashamedly slant its journalism towards the prejudices of its readership, and concert promoters use subjective algorithms to present concerts of familiar and non-challenging repertoire. The problem is that no one cares that this is happening. In fact everyone feels very contented in their own comfort zone with ever faster broadband, ever cheaper streamed content, ever more friends and followers, ever more selfie opportunities and - most importantly - ever fewer challenges to their prejudices. And the media - particularly the classical music media - is quite happy to play along; because keeping your readers in their comfort zone means keeping your readers. 

Today the vast majority no longer care about protecting the arts. And we are all to blame. This article is being written on a free blogging platform provided by Google, the pioneer of algorithmic determination. If it reaches any audience at all it will be because it is favoured by the algorithms of Facebook and Twitter. However, it is unlikely to reach any significant social media audience because my views are not favoured by the online vigilantes who police the borders of classical music's comfort zones. And for the same reason the dissenting views expressed here and elsewhere are unlikely to find their way into the Guardian or Spectator or to be aired on BBC Radio 3's Music Matters. But why should any of this matter? Why should people care when they can watch safe within the comfort zone of their own home an outstanding performance lasting 2 hours 44 minutes of Berlioz's Damnation of Faust by the world-class forces of Simon Rattle and the London Symphony Orchestra recorded in high quality video and audio for free on YouTube?

There is no viable solution because we are all part of the problem. Classical music's biggest challenge is not ageing audiences, disruptive business models, institutionalised discrimination, unsatisfactory concert halls etc etc. The biggest challenge facing classical music is adapting to a society in which no one cares about anything except staying firmly within their own algorithmically defined comfort zone.»
Scooped by Olivier Lartillot!

Spotify buys AI firm Niland to power recommendations, stave off Apple Music advances

In an escalating war with Apple Music, streaming music market leader Spotify on Thursday announced the acquisition of Niland, a small machine learning startup whose technology will help deliver song recommendations to users.
Olivier Lartillot's insight:
from the article:

"In an escalating war with Apple Music, streaming music market leader Spotify on Thursday announced the acquisition of Niland, a small machine learning startup whose technology will help deliver song recommendations to users.

A small startup headquartered in Paris, Niland developed machine learning algorithms to analyze digital music. Prior to Spotify's purchase, Niland offered music search and recommendation services to "innovative music companies" through custom APIs.

For example, Niland marketed its AI and audio processing technology by offering content creators and publishers a specialized audio search engine. Customers were able to upload tracks for processing and receive a list of similar sounding songs. The technology could also be used to surface similar tracks within a particular catalog, making for a powerful recommendation engine.

Going further, Niland's tech can extract complex metadata pertaining to mood, type of voice, instrumentation, genre and tempo. The firm's APIs automatically process and tag these sonic watermarks for keyword searches like "pop female voice" or "jazz saxophone."

Some of the same features went into a music recommendation engine that offered suggestions based on mood, activity, genre, song style and other factors. Spotify is most likely looking to integrate the AI-based engine into its app in the near term.

"Niland has changed the game for how AI technology can optimize music search and recommendation capabilities and shares Spotify's passion for surfacing the right content to the right user at the right time," Spotify said in a statement.

Niland's team will be relocated to Spotify's office in New York, where they will help the streaming giant improve its recommendation and personalization technologies.

The move comes amidst a wider race to deliver the perfect personalized listening experience. Industry rivals are looking for ways to develop customized playlists, and Spotify appears to be investing heavily in intelligent software.

Apple Music, on the other hand, touted human-curated playlists when it launched in 2015. Since then, Apple has integrated its own software-driven recommendation assets, the latest being iOS 10's "My New Music Mix."

News of the Niland acquisition arrives less than a month after Spotify announced the purchase of Mediachain Labs, a blockchain startup that developed technologies for registering, identifying, tracking and managing content across the internet.

The buying spree comes amid rapid growth for Spotify, which in March hit a milestone 50 million paid subscribers. Counting free-to-stream listeners, the service is said to boast more than 100 million users.

By comparison, Apple Music reached 20 million in December, though Apple executive Jimmy Iovine in a recent interview said the product would have "400 million people on it" if a free tier was offered."
Scooped by Olivier Lartillot!

Jean-Claude Risset — Wikipédia


After scientific studies at the École normale supérieure on rue d'Ulm, Jean-Claude Risset passed the agrégation in physics at the age of 23, in 1961. He pursued his research under the supervision of Professor Pierre Grivet at the Faculté des sciences d'Orsay (at the start of this century a component of Université Paris-Sud 11).

Olivier Lartillot's insight:
My quick translation:

"Jean-Claude Risset was a French composer (18 March 1938 – 21 November 2016 in Marseille). A pioneer of computer music who first worked in the US, J.-C. Risset later contributed to its introduction in France (in institutions such as IRCAM). Through his double training, scientific and artistic, he was the first French composer to open the way to computer-synthesized sounds. He remains a major figure in both contemporary music creation and electronic music research. His contributions marked the aesthetics of the 1970s to 1990s.

He studied science at the École Normale Supérieure on rue d'Ulm and, at the same time, music at the Conservatoire National Supérieur in Paris: piano with Robert Trimaille and Huguette Goullon, writing with Suzanne Demarquez, and composition (harmony and counterpoint) with André Jolivet. He obtained a first prize in the UFAM piano competition in 1963. His Prélude for orchestra was premiered the same year at the Maison de la Radio, in Paris.

His PhD thesis focused on the analysis, synthesis and perception of musical sounds. His work accounted for the complexity and diversity of the mechanisms involved in hearing, and he perceived the limits and inadequacies of the models prevailing at that time. His timbre-focused approach has the merit of illuminating preoccupations now central to computer music: uniting two fields of knowledge (the physics of sound, and music), and exploiting a promising new technology, namely the computer. Through a successful transition from “calculators” to computers, Risset contributed to the foundations of what would become computer music.

From 1975 to 1979, Jean-Claude Risset participated with Pierre Boulez in the creation of IRCAM (Institute for Research and Coordination Acoustics/Music). Directing its "Computer" department allowed him to study the integration of computer science into music research. He was a professor at the University of Aix-Marseille from 1979 to 1985, where he chaired the Arts Section of the Higher Council of Universities (1984-1985). After 1985, he headed the Laboratory of Mechanics and Acoustics of the CNRS in Marseille. He was also responsible for the Master ATIAM ("Acoustics, signal processing and computer science applied to music"), which involves several universities and IRCAM.

A musician and composer recognized by the international artistic community, Jean-Claude Risset was also a technician and an undisputed theoretician of computer music."
Scooped by Olivier Lartillot!

Inside Spotify’s Hunt for the Perfect Playlist

Spotify is launching a new playlist service called Discover Weekly that uses your data to serve you songs you might like.
Olivier Lartillot's insight:

Most popular Spotify playlists are made by human curators, “but they live and die by data.”


The playlists are made using an internal Spotify tool called Truffle Pig. Jim Lucchese, CEO of The Echo Nest (which was also acquired by Spotify), refers to Truffle Pig as “Pro Tools for playlists.” It’s part of a version of the Spotify app that’s only available to employees. It lets them build a playlist from almost anything: an artist’s name, a song, a vague adjective or feeling. You tell Truffle Pig you want, say, a twangy alt-country playlist. That’s enough to get started. Then you refine: “Say you want high acousticness with up-tempo tracks that are aggressive up to a certain value. It’ll generate a bunch of candidates, you can listen to them there, and then drop them in and add them to your playlist.”

The Echo Nest’s job within Spotify is to endlessly categorize and organize tracks. The team applies a huge number of attributes to every single song: Is it happy or sad? Is it guitar-driven? Are the vocals spoken or sung? Is it mellow, aggressive, or dancy? On and on the list goes. Meanwhile, the software is also scanning blogs and social networks—ten million posts a day—to see the words people use to talk about music. With all this data combined, The Echo Nest can start to figure out what a “crunk” song sounds like, or what we mean when we talk about “dirty south” music.
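The "refine" step quoted above can be pictured as a simple threshold filter over per-track attributes. This is a hypothetical sketch with invented attribute names and scales, not Spotify's actual tool:

```python
# Hypothetical sketch of a "Truffle Pig"-style refine step: keep the
# candidate tracks that satisfy the curator's attribute thresholds.
# Attribute names and 0-1 scales are invented for illustration.
def refine(candidates, min_acousticness=0.0, min_tempo=0.0, max_aggression=1.0):
    """Filter candidate tracks (dicts of attribute scores) by thresholds."""
    return [
        t for t in candidates
        if t["acousticness"] >= min_acousticness
        and t["tempo"] >= min_tempo
        and t["aggression"] <= max_aggression
    ]
```

A curator's request like "high acousticness, up-tempo, aggression up to a certain value" then maps directly onto the three keyword arguments.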

Scooped by Olivier Lartillot!

Genetic Data Tools Reveal How Pop Music Evolved In The US - The Physics arXiv Blog - Medium
And show that The Beatles didn’t start the American music revolution of 1964
Olivier Lartillot's insight:

From the blog post:


"Despite the keen interest in the evolution of pop music, there is little to back up most claims in the form of hard analytical evidence.


In a new study, number crunching techniques developed to understand genomic data have been used to study the evolution of American pop music. The study found an objective way to categorise musical styles and to measure the way these styles change in popularity over time.


The team started with the complete list of US chart topping songs in the form of the US Billboard Hot 100 from 1960 to 2010. To analyse the music itself, they used 30-second segments of more than 80 per cent of these singles — a total of more than 17,000 songs.


They then analysed each segment for harmonic features such as chord changes and for the quality of timbre, whether guitar or piano or orchestra based, for example. In total, they rated each song in one of 8 different harmonic categories and one of 8 different timbre categories.


They assumed that the specific combination of harmonic and timbre qualities determines the genre of music, whether rock, rap, country and so on. However, the standard definitions of music genres also capture non-musical features such as the age and ethnicity of the performers, as in classic rock or Korean pop and so on.


So the team used an algorithmic technique for finding clusters within networks of data to find objective categories of musical genre that depend only on the musical qualities. This technique threw up 13 separate styles of music.


An interesting question is what these styles represent. To find out, the team analysed the tags associated with each song on the Last-FM music discovery service. Using a technique from bioinformatics called enrichment analysis, they searched for tags that were more commonly associated with songs in each music style and then assumed that these gave a sense of the musical genres involved.


For example, they found that style 1 was associated with soul tags, style 2 with hip hop, style 3 with country music and easy listening, style 4 with jazz and blues and so on.


Finally, they plotted the popularity of each style over time.


The data allows them to settle some long-standing debates among connoisseurs of popular music. One question is whether various practices in the music industry have led to a decline in the cultural variety of new music.


To study this issue, they developed four measures of diversity and tracked how they changed over time. “We found that although all four evolve, two — diversity and disparity — show the most striking changes, both declining to a minimum around 1984, but then rebounding and increasing to a maximum in the early 2000s.”"
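The paper's diversity measures are more elaborate, but a minimal stand-in conveys the idea: treat one year's distribution over the discovered styles as probabilities and compute an "effective number of styles" (the exponential of the Shannon entropy). All numbers below are invented for illustration.

```python
from math import exp, log

def effective_styles(style_counts):
    """Effective number of styles: exp of the Shannon entropy of the
    style-frequency distribution for one slice of the charts (e.g. one year)."""
    total = sum(style_counts)
    h = -sum((c / total) * log(c / total) for c in style_counts if c)
    return exp(h)

# Charts spread evenly over 4 styles are maximally diverse (value 4.0);
# charts dominated by one style collapse toward 1.0.
even_year = effective_styles([25, 25, 25, 25])
skewed_year = effective_styles([97, 1, 1, 1])
```

Plotting such a measure year by year is exactly the kind of curve that shows a dip around 1984 and a rebound afterwards in the study's data.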

No comment yet.
Rescooped by Olivier Lartillot from Digital Music Market!

Spotify’s Secret Weapon: The Echo Nest

Spotify’s Secret Weapon: The Echo Nest | Computational Music Analysis |
By Christopher D Amico from Berklee College of Music's Music Business Journal. Spotify announced in March the acquisition of The Echo Nest, the industry’s leading music intelligence company. The deal signals the rising importance of big data in the music industry. Founded by MIT Media Lab doctoral students Tristan Jehan and Brian Whitman, The Echo Nest provided intelligence to some of the world’s leading music services including Clear Channel’s iHeart radio, Rdio, SiriusXM, and social media networks such as Foursquare, MTV, Twitter, and Yahoo. This might change as the company moves away from being an open source platform, useful to...

Olivier Lartillot's insight:


Spotify announced in March the acquisition of The Echo Nest, the industry’s leading music intelligence company. The deal signals the rising importance of big data in the music industry. Founded by MIT Media Lab doctoral students Tristan Jehan and Brian Whitman, The Echo Nest provided intelligence to some of the world’s leading music services including Clear Channel’s iHeart radio, Rdio, SiriusXM, and social media networks such as Foursquare, MTV, Twitter, and Yahoo.  


Tristan Jehan earned his doctorate in Media Arts and Sciences from MIT in 2005. His academic work combined machine listening and machine learning technologies in teaching computers how to hear and make music. He first earned a Master of Science in Electrical Engineering and Computer Science from the University of Rennes in France, later working on music signal parameter extraction at the Center for New Music and Audio Technologies at U.C. Berkeley. He has worked with leading research and development labs in the U.S. and France as a software and hardware engineer in areas of machine listening and audio analysis.


For purposes of analysis and recommendation, songs are not taken whole but rather are broken down into specific attributes, qualities, and even segments. Acoustic analysis has a major role in the company’s algorithms when it decides what to play next. Listeners expect smooth transitions between songs in playlists, so this involves, in part, the analysis of tempo, key, and overall genre.

By dissecting these particulars, The Echo Nest can create both coherent playlists and applications with which listeners can manipulate music. The latter is especially important, for new consumer devices will inevitably hit the marketplace soon.
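A toy sketch of the "smooth transitions" idea: greedily order a playlist so each song is followed by the remaining song closest in tempo. This is purely illustrative (the song names and BPM values are invented, and a real system would also weigh key, timbre and much more):

```python
def smooth_order(songs):
    """Greedy playlist ordering: start from the first song and always pick
    the remaining song closest in tempo (BPM), keeping transitions smooth.
    `songs` is a list of (title, bpm) pairs."""
    remaining = songs[1:]
    order = [songs[0]]
    while remaining:
        last_bpm = order[-1][1]
        nxt = min(remaining, key=lambda s: abs(s[1] - last_bpm))
        remaining.remove(nxt)
        order.append(nxt)
    return order

playlist = [("A", 120), ("B", 90), ("C", 124), ("D", 92)]
ordered = smooth_order(playlist)   # groups the ~120 BPM and ~90 BPM songs
```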


Paul Lamere, one of their top software developers, has built several of The Echo Nest’s popular web applications. ‘Girl Talk in a Box’ allows interaction with a user’s favorite song by speeding it up, skipping beats, playing it backwards, swinging it, and more. ‘The Infinite Jukebox’, on the other hand, generates a never-ending and ever-changing version of an MP3 song, which it breaks into beats: at every beat there’s a chance that it will jump to a different part of the song that happens to sound very similar to the current beat.
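That beat-jumping mechanism can be sketched in a few lines. This is only a caricature of the Infinite Jukebox (one invented feature value per beat instead of real audio analysis, and all parameter names are made up):

```python
import random

def infinite_jukebox(beat_features, n_beats_to_play=16,
                     jump_prob=0.3, threshold=2.0, seed=1):
    """Walk through the beats of a song; at each beat, sometimes jump to
    another beat whose feature value is very similar, yielding an endless,
    ever-changing rendition. `beat_features` holds one number per beat
    (standing in for a real per-beat timbre/chroma descriptor)."""
    rng = random.Random(seed)
    n = len(beat_features)
    pos = 0
    path = [0]
    for _ in range(n_beats_to_play - 1):
        similar = [j for j in range(n) if j != pos
                   and abs(beat_features[j] - beat_features[pos]) < threshold]
        if similar and rng.random() < jump_prob:
            pos = rng.choice(similar)      # jump to a similar-sounding beat
        else:
            pos = (pos + 1) % n            # otherwise just play the next beat
        path.append(pos)
    return path

# A song whose beat pattern repeats twice offers natural jump points.
path = infinite_jukebox([0, 1, 2, 3, 0, 1, 2, 3], n_beats_to_play=16)
```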

These two applications are for the fun user market and, perhaps, DJs. There is much more behind the scenes at a pro level, such as the use of Spotify’s entire range of streaming data to identify, for example, where listener attention wanes during a song — is it the extended drum or guitar solo, the weak chorus, or what? There has never been such measurable tracking of listening habits and musical tastes, and in a digital world of 0s and 1s data reduction is more necessary than ever.

No comment yet.
Scooped by Olivier Lartillot!

"Deep Listening" attracts the interest of Google, Spotify, Pandora

"Deep Listening" attracts the interest of Google, Spotify, Pandora | Computational Music Analysis |
Deep learning amounts to one of those technologies that several companies could start to implement in the future, in order to improve music streaming.
Olivier Lartillot's insight:



Google, Pandora, and Spotify have recently hired deep learning experts. This branch of A.I. involves training systems called “artificial neural networks” with terabytes of information derived from images, text, and other inputs. It then presents the systems with new information and receives inferences about it in response. A neural network for a music-streaming service could recognize patterns like chord progressions in music without needing music experts to direct machines to look for them. Then it could introduce a listener to a song, album, or artist in accord with their preferences.


The new wave of attention leads back to an academic paper where Ph.D. students Sander Dieleman and Aäron van den Oord collaborated with professor Benjamin Schrauwen to make convolutional neural networks (CNNs) pick up attributes of songs, rather than using them to observe features in images, as engineers have done for years. The trio found that their model “produces sensible recommendations.” What’s more, their experiments showed the system “significantly outperforming the traditional approach.”
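This is nothing like the actual network in the paper, but the two primitives a CNN stacks many times over are easy to show in plain Python: a learned filter slid along the time axis of an audio feature track, a rectifying nonlinearity, and pooling. The kernel below is hand-picked (a real CNN would learn it from data):

```python
def conv1d(signal, kernel):
    """'Valid' 1-D convolution (cross-correlation, as in most CNN toolkits)."""
    n, m = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(m))
            for i in range(n - m + 1)]

def conv_layer(signal, kernel, pool=8):
    """One convolutional 'layer': convolve, rectify (ReLU), then max-pool,
    yielding a coarse map of where the kernel's pattern occurs in time."""
    act = [max(v, 0.0) for v in conv1d(signal, kernel)]
    return [max(act[i:i + pool]) for i in range(0, len(act), pool)]

# A toy feature track containing a rising pattern in two places.
track = [0.0] * 32
track[4:8] = [1, 2, 3, 4]
track[20:24] = [1, 2, 3, 4]
rising = [-1.0, 0.0, 1.0]          # responds to upward slopes
pooled = conv_layer(track, rising)  # high where the pattern occurs, zero elsewhere
```

Stacking many such layers, with kernels learned from raw waveforms or spectrograms, is what lets the Ghent model pick up chord progressions and other song attributes without hand-engineering.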


The paper captured the imagination of academics who work with music and deep learning wonks as well. Microsoft researchers even cited the paper in a recent overview of the deep learning field.


Deep learning stands out from the recommendation systems in place at Spotify, which uses more traditional data analysis. And down the line, it could provide for improvements in key metrics. Spotify currently recommends songs using technology from Echo Nest, which Spotify ended up buying this year. Echo Nest gathers data using two systems: analysis of text on the Internet about specific music, as well as acoustic analysis of the songs themselves. The latter entails directing machines to listen for certain qualities in songs, like tempo, volume, and key.

We “apply knowledge about music listening in general and music theory and try to model every step of the way musicians actually perceive first and then understand and analyze music,” said Tristan Jehan, a cofounder of Echo Nest and now principal scientist at Spotify. That system required lots of domain-specific input on the part of the people who built it.

But the deep learning approach, while also complex, is completely different. “Sander [Dieleman] is just taking a waveform and assumes we don’t know that stuff, but the machine can derive everything, more or less. That’s why it’s a very generic model, and it has a lot of potential.” The idea is to predict what songs listeners might like, even when usage data isn’t available.


The interest goes beyond Spotify. Even Pandora, known for its use of humans in its process of finding songs to play, has been exploring the technique.

Pandora’s musicologists identify attributes in songs based on their knowledge of music. The end product is data that they can feed into complex algorithms — but it fundamentally depends on human beings. The human-generated data feeds into a system befitting a company with a $3.86 billion market cap. Pandora’s data centers retain an arsenal of more than 50 recommendation systems. “No one of these approaches is optimal for every station of every user of Pandora.” 


Meanwhile, deep learning has come in handy for a wide variety of purposes at Google, and employees certainly are investigating its applications in a music-streaming context. “Deep learning represents a complete revolution, literally a revolution, in how machine learning is done.” The trouble is, deep learning on its own might do a good job of detecting similarities among songs, but maximizing outcomes might mean drawing on several kinds of data other than the raw content. “What we see from these deep-learning models, including the best of the best we’ve seen, is there’s still a lot of noise out there in these models.”

And so deep learning might not be a sort of drop-in replacement for music streaming. It can be another tool, and perhaps not only for determining which song to play next. Its capabilities could go beyond that.


“What I do see is that deep learning is allowing us to better understand music and allow us to actually better understand what music is.”

No comment yet.
Scooped by Olivier Lartillot!

Computer becomes a bird enthusiast

Computer becomes a bird enthusiast | Computational Music Analysis |
Program can distinguish among hundreds of species in recorded birdsongs
Olivier Lartillot's insight:


If you’re a bird enthusiast, you can pick out the “chick-a-DEE-dee” song of the Carolina chickadee with just a little practice. But if you’re an environmental scientist faced with parsing thousands of hours of recordings of birdsongs in the lab, you might want to enlist some help from your computer. A new approach to automatic classification of birdsong borrows techniques from human voice recognition software to sort through the sounds of hundreds of species and decides on its own which features make each one unique.


Typically, scientists build one computer program to recognize one species, and then start all over for another species. Training a computer to recognize lots of species in one pass is “a challenge that we’re all facing.”


That challenge is even bigger in the avian world, says Dan Stowell, a computer scientist at Queen Mary University of London who studied human voice analysis before turning his attention to the treetops. “I realized there are quite a lot of unsolved problems in birdsong.” Among the biggest issues: There are hundreds of species with distinct and complex calls—and in tropical hotspots, many of them sing all at once.


Most methods for classifying birdsong rely on a human to define which features separate one species from another. For example, if researchers know that a chickadee’s tweet falls within a predictable range of frequencies, they can program a computer to recognize sounds in that range as chickadee-esque. The computer gets better and better at deciding how to use these features to classify a new sound clip, based on “training” rounds where it examines clips with the species already correctly labeled.


In the new paper, Stowell and his Queen Mary colleague, computer scientist Mark Plumbley, used a different approach, known as unsupervised training. Instead of telling the computer which features of a birdsong are going to be important, they let it decide for itself, so to speak. The computer has to figure out “what are the jigsaw pieces” that make up any birdsong it hears. For example, some of the jigsaw pieces it selects are split-second upsweeps or downsweeps in frequency—the sharp pitch changes that make up a chirp. After seeing correctly labeled examples of which species produce which kinds of sounds, the program can spit out a list—ranked in order of confidence—of the species it thinks are present in a recording.

Their unsupervised approach performed better than the more traditional methods of classification—those based on a set of predetermined features.
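The "jigsaw pieces" idea can be sketched with the simplest unsupervised learner there is. Below, a tiny 2-means clustering in plain Python discovers "upsweep" versus "downsweep" templates from unlabeled 3-point pitch contours. This is illustrative only: the paper's actual feature learning operates on real spectrogram frames at a vastly larger scale.

```python
def kmeans2(frames, n_iter=10):
    """Tiny 2-means with farthest-point initialisation. Each frame is a list
    of numbers (here, a 3-point pitch contour standing in for a spectrogram
    patch); the returned centers are the learned 'jigsaw piece' templates."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    c0 = frames[0]
    c1 = max(frames, key=lambda f: d2(f, c0))   # farthest frame from c0
    centers = [list(c0), list(c1)]
    for _ in range(n_iter):
        groups = [[], []]
        for f in frames:
            groups[0 if d2(f, centers[0]) <= d2(f, centers[1]) else 1].append(f)
        for j in (0, 1):
            if groups[j]:
                centers[j] = [sum(col) / len(groups[j]) for col in zip(*groups[j])]
    return centers

# Unlabeled mixture of rising and falling frequency sweeps.
frames = [[1, 2, 3], [1, 2, 4], [3, 2, 1], [4, 2, 1], [1, 3, 4], [4, 3, 1]]
centers = kmeans2(frames)   # one rising template, one falling template
```

No one told the algorithm that "rising" and "falling" were the relevant categories; it recovered them from the data, which is the essence of the unsupervised approach.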


The new system’s accuracy fell short of beating the top new computer programs that analyzed the same data sets for the annual competition. But the new system deserves credit for applying unsupervised computer learning to the complex world of birdsong for the first time. This approach could be combined with other ways of processing and classifying sound, because it can squeeze out some info that other techniques may miss.


Eighty-five percent accuracy on a choice between more than 500 calls and songs is impressive and shows both the biological community and the computer community what you can do with these large sound archives.


No comment yet.
Scooped by Olivier Lartillot!

Shazam-Like Dolphin System ID's Their Whistles: Scientific American Podcast

Olivier Lartillot's insight:

I am glad to see such popularization of research related to “melodic” pattern identification that generalizes beyond the music context and beyond the human species, and also this interesting link to music identification technologies (like Shazam). Before discussing this further, here is, first of all, how this Scientific American podcast explains in simple terms the computational attempt at mimicking dolphins' melodic pattern identification abilities:


“Shazam-Like Dolphin System ID's Their Whistles: A program uses an algorithm to identify dolphin whistles similar to that of the Shazam app, which identifies music from databases by changes in pitch over time.

Used to be, if you happened on a great tune on the radio, you might miss hearing what it was. Of course, now you can just Shazam it—let your smartphone listen, and a few seconds later, the song and performer pop up. Now scientists have developed a similar tool—for identifying dolphins.

Every dolphin has a unique whistle.  They use their signature whistles like names: to introduce themselves, or keep track of each other. Mothers, for example, call a stray offspring by whistling the calf's ID.

To tease apart who's saying what, researchers devised an algorithm based on the Parsons code, the method that fishes songs out of music databases by tracking changes in pitch over time.

They tested the program on 400 whistles from 20 dolphins. Once a database of dolphin sounds was created, the program identified subsequent dolphins by their sounds nearly as well as humans who eyeballed the whistles' spectrograms.

Seems that in noisy waters, just small bits of key frequency change information may be enough to help Flipper find a friend.”


More precisely, the computer program generates a compact description of each dolphin whistle indicating how the pitch curve progressively ascends and descends. This makes it possible to obtain a description that is characteristic of each dolphin, and to compare these whistle curves to see which curve belongs to which dolphin.
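The Parsons code itself is simple enough to show in full (the pitch values below are invented; the dolphin study's actual contour descriptor is finer-grained):

```python
def parsons_code(pitches):
    """Parsons code of a pitch contour: 'u' (up), 'd' (down), 'r' (repeat)
    for each step; a compact, key-independent signature of a melody or whistle."""
    return ''.join('u' if cur > prev else 'd' if cur < prev else 'r'
                   for prev, cur in zip(pitches, pitches[1:]))

def contour_similarity(a, b):
    """Fraction of matching steps between two Parsons codes."""
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))

code = parsons_code([60, 62, 62, 59, 64])   # "urdu": up, repeat, down, up
```

Matching a new whistle then amounts to comparing its code against a database of known codes, much as the podcast describes.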


But to be more precise, Shazam does not use this kind of approach to identify music. It does not try to detect melodic lines in the music recorded by the user, but takes a series of several-second snapshots of each song, such that each snapshot contains all the complex sound at that particular moment (with the polyphony of instruments). A compact description (a “fingerprint”) of each snapshot is produced, indicating the most important spectral peaks (say, the most prominent pitches of the polyphony). This fingerprint is then compared with those of each song in the music database. Finally, the identified song is the one in the database whose series of fingerprints best fits the series of fingerprints of the user's music query.
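A toy sketch of that fingerprint-and-match pipeline (very schematic: real systems such as Shazam hash *pairs* of peaks together with their time offset, and all the data below is invented):

```python
import hashlib

def fingerprint(spectrum_frames, peaks_per_frame=2):
    """For each snapshot (a list of spectral magnitudes per frequency bin),
    keep only the strongest peak positions and hash them into a short code."""
    prints = []
    for frame in spectrum_frames:
        peaks = sorted(range(len(frame)), key=lambda i: frame[i],
                       reverse=True)[:peaks_per_frame]
        code = hashlib.md5(str(sorted(peaks)).encode()).hexdigest()[:8]
        prints.append(code)
    return prints

def match(query_prints, database):
    """Return the song whose fingerprints share the most codes with the query."""
    return max(database, key=lambda song: len(set(database[song]) & set(query_prints)))

song_a = [[0, 9, 1, 8], [5, 0, 7, 0]]
song_b = [[9, 0, 0, 1], [0, 0, 5, 9]]
database = {"Song A": fingerprint(song_a), "Song B": fingerprint(song_b)}
query = fingerprint([[0, 9, 1, 8]])   # a short excerpt recorded by the user
```

Because only peak *positions* are hashed, the match survives a fair amount of background noise, which is the practical point of fingerprinting.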


Shazam does not model *how* humans identify music. The dolphin whistle comparison program does not model *how* dolphins identify each other. And Shazam and the dolphin whistle ID program do not use similar approaches. But, on the other hand, might we assume that dolphins' and humans' abilities to identify auditory patterns (in whistles, in music) rely on the same core cognitive processes?

No comment yet.
Scooped by Olivier Lartillot!

Google's engineers say that lack of rigor is ruining AI research —

Google's engineers say that lack of rigor is ruining AI research — | Computational Music Analysis |

Knowledge, nominally the goal of scientific research, is currently in second place to “wins,” the practice of beating a benchmark as a way of getting recognized in the AI community. This may be skewing the true nature of progress in AI and contributing to wasted effort and suboptimal performance.

Olivier Lartillot's insight:

From a recent article in Science: “There’s anguish in the field. Many of us feel like we’re operating on alien technology.” The “entire field has become a black box.”

Knowledge, nominally the goal of scientific research, is currently in second place to “wins,” the practice of beating a benchmark as a way of getting recognized in the AI community. This may be skewing the true nature of progress in AI and contributing to wasted effort and suboptimal performance. For example, the authors describe how stripping “bells and whistles” from a translation algorithm made it work better, which highlighted how its creators didn’t know which parts of the AI were doing what. It’s not uncommon for the core of an AI to be “technically flawed.”

“A paper is more likely to be published if the reported algorithm beats some benchmark than if the paper sheds light on the software’s inner workings”. Similarly, Francois Chollet, a computer scientist at Google in California, told the magazine that people rely on “folklore and magic spells,” referring to how AI engineers “adopt pet methods to tune their AIs.”

The authors suggest a range of fixes, all focused on learning which algorithms work best, when, and why. They include deleting parts of an algorithm and seeing what works and what breaks, reporting the performance of an algorithm across different dimensions so that performance improvements in one area don’t mask a drop in another, and “sanity checks,” where an algorithm is tested on counter-factual or alternative data.
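The first of those fixes, deleting parts of an algorithm and seeing what works and what breaks, is commonly called an ablation study. A toy harness shows the shape of it (every stage, score and data point here is invented for illustration):

```python
def score(pred, truth):
    """Toy accuracy: fraction of exact matches."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def run_pipeline(x, stages):
    for stage in stages:
        x = stage(x)
    return x

# A hypothetical 3-stage classifier: 'denoise' matters, 'fancy_trick' is a
# bell-and-whistle that does nothing.
def denoise(xs):     return [round(x) for x in xs]
def fancy_trick(xs): return xs
def classify(xs):    return [x % 2 for x in xs]

stages = [denoise, fancy_trick, classify]
data, truth = [0.1, 0.9, 2.2, 2.8], [0, 1, 0, 1]

baseline = score(run_pipeline(data, stages), truth)
ablation = {s.__name__: score(run_pipeline(data, stages[:i] + stages[i + 1:]), truth)
            for i, s in enumerate(stages)}
```

Here the ablation reveals exactly what the Science article complains is rarely reported: removing `fancy_trick` costs nothing, while removing `denoise` destroys accuracy, so only one of the two "contributions" is real.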

As awareness grows around AI’s impact on our society and as the tech giants continue to concentrate AI capability within their walls, there is as much need to maintain transparency and accountability around the creation of AI as there is around the use of it.
No comment yet.
Scooped by Olivier Lartillot!

Fourier’s transformational thinking

Fourier’s transformational thinking | Computational Music Analysis |
The mathematics of Joseph Fourier, born 250 years ago this week, shows the value of intellectual boldness.
Olivier Lartillot's insight:
From the article:

When you listen to digital music, the harmonies and chords that you hear have probably been reconstructed from a file that stored them as components of different frequencies, broken down by a process known as Fourier analysis. As you listen, the cochleae in your ears repeat the process — separating the sounds into those same sinusoidal components before sending electrical signals to the brain, which puts the components together again. 

Fourier analysis allows complex waveforms to be understood and analysed by breaking them down into simpler signals. And it’s a shining example of the power and value of intellectual boldness.

The roots of the idea go back to the mid-1700s, when the Italian mathematical physicist Joseph-Louis Lagrange and others studied the vibration of strings and the propagation of sound. But it was one of Lagrange’s pupils, Joseph Fourier, who in 1822 truly founded the field that carries his name.

Fourier was born 250 years ago this week, on 21 March 1768. Today, there is virtually no branch of science, technology and engineering that is left untouched by his ideas. Modern versions and analogues of his theory help researchers to analyse their data in almost every discipline, powering everything from YouTube’s videos to machine-learning techniques.

Among the scientists who benefited is Ingrid Daubechies, an applied mathematician, who in the 1980s helped to develop the theory of wavelets, which generalized Fourier analysis and opened up previously inaccessible problems. “He’s one of my heroes,” Daubechies says.

Fourier wanted to understand how heat propagates in a solid object. He discovered the equation that governs this, and showed how to solve it — predicting how the temperature distribution will evolve, starting from the known distribution at an initial time. To do so, he broke the temperature profile down into trigonometric functions, as if it were a sound wave. Crucially, his analysis included functions for which temperature was allowed to have ‘discontinuities’, or abrupt jumps. This possibility horrified mathematicians at the time, who were much more comfortable with smooth curves that promised aesthetic simplicity. Fourier stuck to his guns and, as he developed his ideas, started to win his critics over.

Beyond breaking down a function into frequencies, Fourier created a ‘dual’ profile that encodes all those frequencies, and that became known as the Fourier transform.

Modern incarnations of Fourier analysis include the ‘fast Fourier transform’ and ‘discrete Fourier transform’, which allow faster and more-efficient processing of large amounts of information.
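A bare-bones discrete Fourier transform makes the idea tangible. Below, a square wave, full of exactly the "discontinuities" that horrified Fourier's contemporaries, decomposes into odd harmonics only (plain Python for clarity; real code would use a fast Fourier transform such as the one in numpy.fft):

```python
import cmath

def dft(x):
    """Discrete Fourier transform: X[k] is the complex amount of frequency k
    (cycles per window) present in the sampled signal x."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

# A square wave: abrupt jumps, yet still a sum of smooth sinusoids.
n = 64
square = [1.0 if t < n // 2 else -1.0 for t in range(n)]
spectrum = [abs(c) for c in dft(square)]
# Only odd harmonics (k = 1, 3, 5, ...) carry energy, falling off roughly as 1/k.
```

This naive transform costs O(n²) operations; the fast Fourier transform mentioned above brings that down to O(n log n), which is what makes Fourier analysis practical for audio, video and machine learning.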
No comment yet.
Scooped by Olivier Lartillot!

The Artificial Intelligence Revolution: The exponential rise of Superintelligence, and its consequences..

The Artificial Intelligence Revolution: The exponential rise of Superintelligence, and its consequences.. | Computational Music Analysis |
Part 1 of 2: "The Road to Superintelligence". Artificial Intelligence — the topic everyone in the world should be talking about.
Olivier Lartillot's insight:
Computer-automated music analysis (and music composition) is a particular application of Artificial Intelligence (AI). 

The article linked to this post clarifies three types of “calibers” for AI. Current research in music AI belongs to the first AI Caliber “Artificial Narrow Intelligence” (ANI), or weak AI, in the sense that the machine solely focuses on a particular task and cannot use its expertise for other aspects of music or for tasks outside music. 

The second AI Caliber, “Artificial General Intelligence” (AGI), or strong AI, corresponds to machines that could perform any intellectual task that a human being can.

In my view, a good music analysis, even of a particular aspect of music, requires taking into account a large range of musical factors, because in music everything is intertwined. Hence a good computer automation of music analysis would need to achieve a certain degree of Artificial General Music Intelligence (not necessarily all aspects of human intelligence, but all that is required to fully understand music).

But the main point of the linked article is about the third AI Caliber, Artificial Superintelligence (ASI), where the machine develops a degree of intelligence that is far superior to our human capabilities. The application to music analysis and composition would be extremely exciting. Once the machine were able to reach the same degree of music intelligence as, say, Bach or Boulez — and to flood us with an infinite amount of extremely refined music — it would immediately transcend our mere human capabilities and start developing a kind of Supermusic, somewhat unfathomable to us. It would not be of much interest if we could not grasp anything about it, but if the machine could still let us find ways to approach this artistic monster, by keeping cognitive constraints in the composition and by offering some kind of guiding auditory tools, that would be fantastic.

Well, the linked article is not about music but about the general characteristics of ASI as well as the terrible underlying risks of existential threat.

UPDATE: So basically, as soon as we achieve the technological level to create artificial composers as brilliant as Bach or Boulez, and immediately after to create Supermusic, there is a high risk that another Superintelligence, developed by a particular company or organization, will take the lead, accidentally take control of humankind, suppress all other AI systems (including music) and impose its own objective, such as turning the universe into endless paperclips.


My digest: 

“Human progress moving quicker and quicker as time goes on—is what futurist Ray Kurzweil calls human history’s Law of Accelerating Returns. This happens because more advanced societies have the ability to progress at a faster rate than less advanced societies—because they’re more advanced.

Kurzweil suggests that the progress of the entire 20th century would have been achieved in only 20 years at the rate of advancement in the year 2000—in other words, by 2000, the rate of progress was five times faster than the average rate of progress during the 20th century. He believes another 20th century’s worth of progress happened between 2000 and 2014 and that another 20th century’s worth of progress will happen by 2021, in only seven years. A couple decades later, he believes a 20th century’s worth of progress will happen multiple times in the same year, and even later, in less than one month. All in all, because of the Law of Accelerating Returns, Kurzweil believes that the 21st century will achieve 1,000 times the progress of the 20th century.

If Kurzweil and others who agree with him are correct, then we may be as blown away by 2030 as a 1750 guy was by 2015 and the world in 2050 might be so vastly different than today’s world that we would barely recognize it. This isn’t science fiction. It’s what many scientists smarter and more knowledgeable than you or I firmly believe—and if you look at history, it’s what we should logically predict. 

“singularity” or “technological singularity”: This term has been used in math to describe an asymptote-like situation where normal rules no longer apply. It’s been used in physics to describe a phenomenon like an infinitely small, dense black hole or the point we were all squished into right before the Big Bang. Again, situations where the usual rules don’t apply. In 1993, Vernor Vinge wrote a famous essay in which he applied the term to the moment in the future when our technology’s intelligence exceeds our own—a moment for him when life as we know it will be forever changed and normal rules will no longer apply. Ray Kurzweil then muddled things a bit by defining the singularity as the time when the Law of Accelerating Returns has reached such an extreme pace that technological progress is happening at a seemingly-infinite pace, and after which we’ll be living in a whole new world.

While there are many different types or forms of AI since AI is a broad concept, the critical categories we need to think about are based on an AI’s caliber. There are three major AI caliber categories:

AI Caliber 1) Artificial Narrow Intelligence (ANI): Sometimes referred to as Weak AI, Artificial Narrow Intelligence is AI that specializes in one area. There’s AI that can beat the world chess champion in chess, but that’s the only thing it does. Ask it to figure out a better way to store data on a hard drive, and it’ll look at you blankly. For instance, your phone is a little ANI factory. When you navigate using your map app, receive tailored music recommendations from Pandora, check tomorrow’s weather, talk to Siri, or dozens of other everyday activities, you’re using ANI.

AI Caliber 2) Artificial General Intelligence (AGI): Sometimes referred to as Strong AI, or Human-Level AI, Artificial General Intelligence refers to a computer that is as smart as a human across the board—a machine that can perform any intellectual task that a human being can. Creating AGI is a much harder task than creating ANI, and we’re yet to do it: “a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly, and learn from experience.”

AI Caliber 3) Artificial Superintelligence (ASI): “an intellect that is much smarter than the best human brains in practically every field, including scientific creativity, general wisdom and social skills.” Artificial Superintelligence ranges from a computer that’s just a little smarter than a human to one that’s trillions of times smarter—across the board.

As of now, humans have conquered the lowest caliber of AI—ANI—in many ways, and it’s everywhere. The AI Revolution is the road from ANI, through AGI, to ASI—a road we may or may not survive but that, either way, will change everything.

Nothing will make you appreciate human intelligence like learning about how unbelievably challenging it is to try to create a computer as smart as we are. Building skyscrapers, putting humans in space, figuring out the details of how the Big Bang went down—all far easier than understanding our own brain or how to make something as cool as it. As of now, the human brain is the most complex object in the known universe. [Is it, really? (OL)]

What’s interesting is that the hard parts of trying to build AGI (a computer as smart as humans in general, not just at one narrow specialty) are not intuitively what you’d think they are. Build a computer that can multiply two ten-digit numbers in a split second—incredibly easy. Build one that can look at a dog and answer whether it’s a dog or a cat—spectacularly difficult. Make AI that can beat any human in chess? Done. Make one that can read a paragraph from a six-year-old’s picture book and not just recognize the words but understand the meaning of them? Google is currently spending billions of dollars trying to do it. Hard things—like calculus, financial market strategy, and language translation—are mind-numbingly easy for a computer, while easy things—like vision, motion, movement, and perception—are insanely hard for it. Or, as computer scientist Donald Knuth puts it, “AI has by now succeeded in doing essentially everything that requires ‘thinking’ but has failed to do most of what people and animals do ‘without thinking.'”

Those things that seem easy to us are actually unbelievably complicated, and they only seem easy because those skills have been optimized in us (and most animals) by hundreds of millions of years of animal evolution. When you reach your hand up toward an object, the muscles, tendons, and bones in your shoulder, elbow, and wrist instantly perform a long series of physics operations, in conjunction with your eyes, to allow you to move your hand in a straight line through three dimensions. It seems effortless to you because you have perfected software in your brain for doing it. Same idea goes for why it’s not that malware is dumb for not being able to figure out the slanty word recognition test when you sign up for a new account on a site—it’s that your brain is super impressive for being able to.
On the other hand, multiplying big numbers or playing chess are new activities for biological creatures and we haven’t had any time to evolve a proficiency at them, so a computer doesn’t need to work too hard to beat us.

And everything we just mentioned is still only taking in stagnant information and processing it. To be human-level intelligent, a computer would have to understand things like the difference between subtle facial expressions, the distinction between being pleased, relieved, content, satisfied, and glad, and why Braveheart was great but The Patriot was terrible.

On the hardware side, the raw power needed for AGI is technically available now, in China, and we’ll be ready for affordable, widespread AGI-caliber hardware within 10 years. But raw computational power alone doesn’t make a computer generally intelligent—the next question is, how do we bring human-level intelligence to all that power? The truth is, no one really knows how to make it smart—we’re still debating how to make a computer human-level intelligent and capable of knowing what a dog and a weird-written B and a mediocre movie is. But there are a bunch of far-fetched strategies out there and at some point, one of them will work. Here are the three most common strategies I came across:

1) Plagiarize the brain. The science world is working hard on reverse engineering the brain to figure out how evolution made such a rad thing—optimistic estimates say we can do this by 2030. Once we do that, we’ll know all the secrets of how the brain runs so powerfully and efficiently and we can draw inspiration from it and steal its innovations.

2) Try to make evolution do what it did before but for us this time. If the brain is just too complex for us to emulate, we could try to emulate evolution instead.

3) Make this whole thing the computer’s problem, not ours. This is when scientists get desperate and try to program the test to take itself. But it might be the most promising method we have. The idea is that we’d build a computer whose two major skills would be doing research on AI and coding changes into itself—allowing it to not only learn but to improve its own architecture. We’d teach computers to be computer scientists so they could bootstrap their own development. And that would be their main job—figuring out how to make themselves smarter.

Rapid advancements in hardware and innovative experimentation with software are happening simultaneously, and AGI could creep up on us quickly and unexpectedly for two main reasons: 

1) Exponential growth is intense, and what seems like a snail’s pace of advancement can quickly race upwards.

2) When it comes to software, progress can seem slow, but then one epiphany can instantly change the rate of advancement (kind of like the way science, during the time humans thought the universe was geocentric, was having difficulty calculating how the universe worked, but then the discovery that it was heliocentric suddenly made everything much easier). Or, when it comes to something like a computer that improves itself, we might seem far away but actually be just one tweak of the system away from having it become 1,000 times more effective and zooming upward to human-level intelligence. 

AGI with an identical level of intelligence and computational capacity as a human would still have significant advantages over humans.

AI, which will likely get to AGI by being programmed to self-improve, wouldn’t see “human-level intelligence” as some important milestone—it’s only a relevant marker from our point of view—and wouldn’t have any reason to “stop” at our level. And given the advantages over us that even human intelligence-equivalent AGI would have, it’s pretty obvious that it would only hit human intelligence for a brief instant before racing onwards to the realm of superior-to-human intelligence.

As AI zooms upward in intelligence toward us, we’ll see it as simply becoming smarter, for an animal. Then, when it hits the lowest capacity of humanity—“the village idiot”—we’ll be like, “Oh wow, it’s like a dumb human. Cute!” The only thing is, in the grand spectrum of intelligence, all humans, from the village idiot to Einstein, are within a very small range—so just after hitting village idiot level and being declared to be AGI, it’ll suddenly be smarter than Einstein and we won’t know what hit us. 

Most of our current models for getting to AGI involve the AI getting there by self-improvement. And once it gets to AGI, even systems that formed and grew through methods that didn’t involve self-improvement would now be smart enough to begin self-improving if they wanted to.

Recursive self-improvement: An AI system at a certain level—let’s say human village idiot—is programmed with the goal of improving its own intelligence. Once it does, it’s smarter—maybe at this point it’s at Einstein’s level—so now when it works to improve its intelligence, with an Einstein-level intellect, it has an easier time and it can make bigger leaps. These leaps make it much smarter than any human, allowing it to make even bigger leaps. As the leaps grow larger and happen more rapidly, the AGI soars upwards in intelligence and soon reaches the superintelligent level of an ASI system. This is called an Intelligence Explosion, and it’s the ultimate example of The Law of Accelerating Returns.

There is some debate about how soon AI will reach human-level general intelligence. In one survey of hundreds of scientists, the median estimate for when we’d be more likely than not to have reached AGI was 2040—only 25 years from now, which doesn’t sound that huge until you consider that many of the thinkers in this field think it’s likely that the progression from AGI to ASI will happen very quickly.
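The feedback loop described here can be captured in a toy model (all numbers below are illustrative assumptions, not figures from the article): each self-improvement step multiplies intelligence by a factor that itself grows with the current level, so successive leaps get larger and larger.

```python
# Toy model of recursive self-improvement (illustrative assumptions only):
# each improvement step multiplies intelligence by a factor that itself
# grows with the current level, so progress accelerates over time.

def intelligence_explosion(start=1.0, steps=10, gain=0.1):
    """Return the intelligence level after each self-improvement step."""
    levels = [start]
    for _ in range(steps):
        current = levels[-1]
        # The smarter the system, the bigger the leap it can engineer.
        levels.append(current * (1 + gain * current))
    return levels

levels = intelligence_explosion()
# Each leap is larger than the one before it: growth is super-exponential.
gaps = [b - a for a, b in zip(levels, levels[1:])]
assert all(later > earlier for earlier, later in zip(gaps, gaps[1:]))
```

The `gain` parameter is an arbitrary assumption; the qualitative point is only that when the improvement rate depends on the current level, the gaps between levels shrink in time rather than staying constant.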

Superintelligence of that magnitude is not something we can remotely grasp. In our world, smart means a 130 IQ and stupid means an 85 IQ—we don’t have a word for an IQ of 12,952. What we do know is that humans’ utter dominance on this Earth suggests a clear rule: with intelligence comes power.

What would it mean for a machine to be superintelligent? A key distinction is the difference between speed superintelligence and quality superintelligence. What makes humans so much more intellectually capable than chimps isn’t a difference in thinking speed—it’s that human brains contain a number of sophisticated cognitive modules that enable things like complex linguistic representations, long-term planning, and abstract reasoning, which chimps’ brains lack.

In an intelligence explosion—where the smarter a machine gets, the quicker it’s able to increase its own intelligence, until it begins to soar upwards—a machine might take years to rise from the chimp step to the one above it, but perhaps only hours to jump up a step once it’s on the dark green step two above us, and by the time it’s ten steps above us, it might be jumping up in four-step leaps every second that goes by.

And since we just established that it’s a hopeless activity to try to understand the power of a machine only two steps above us, let’s very concretely state once and for all that there is no way to know what ASI will do or what the consequences will be for us. Anyone who pretends otherwise doesn’t understand what superintelligence means. Evolution has advanced the biological brain slowly and gradually over hundreds of millions of years, and in that sense, if humans birth an ASI machine, we’ll be dramatically stomping on evolution. Or maybe this is part of evolution—maybe the way evolution works is that intelligence creeps up more and more until it hits the level where it’s capable of creating machine superintelligence, and that level is like a tripwire that triggers a worldwide game-changing explosion that determines a new future for all living things.

A large portion of the people who know the most about this topic would agree that 2060 is a very reasonable estimate for the arrival of potentially world-altering ASI. Only 45 years from now.

Bostrom and many others also believe that the most likely scenario is that the very first computer to reach ASI will immediately see a strategic benefit to being the world’s only ASI system. And in the case of a fast takeoff, if it achieved ASI even just a few days before second place, it would be far enough ahead in intelligence to effectively and permanently suppress all competitors. Bostrom calls this a decisive strategic advantage, which would allow the world’s first ASI to become what’s called a singleton—an ASI that can rule the world at its whim forever, whether its whim is to lead us to immortality, wipe us from existence, or turn the universe into endless paperclips.

The singleton phenomenon can work in our favor or lead to our destruction. If the people thinking hardest about AI theory and human safety can come up with a fail-safe way to bring about Friendly ASI before any AI reaches human-level intelligence, the first ASI may turn out friendly. It could then use its decisive strategic advantage to secure singleton status and easily keep an eye on any potential Unfriendly AI being developed. We’d be in very good hands.

But if things go the other way, with the global rush to develop AI reaching the ASI takeoff point before the science of ensuring AI safety is developed, it’s very likely that an Unfriendly ASI emerges as the singleton and we’ll be treated to an existential catastrophe.

As for which way the winds are blowing, there’s a lot more money to be made funding innovative new AI technology than there is in funding AI safety research…

This may be the most important race in human history. There’s a real chance we’re finishing up our reign as the King of Earth—and whether we head next to a blissful retirement or straight to the gallows still hangs in the balance.
Scooped by Olivier Lartillot!

Why we really, really, really like repetition in music

It slays, okay?
Olivier Lartillot's insight:
Nice video:

“Colin is a computer scientist who created a tool called SongSim that runs pop song lyrics through a self-similarity matrix to visualize musical repetition. He also made a chart showing the rise of repetition in music lyrics over time.

Elizabeth Margulis has dedicated her career to music research and runs the music cognition lab at the University of Arkansas. Her book On Repeat: How Music Plays the Mind delves deep into the science behind musical repetition and explores the many ways our brains react to it.”

Actually, this type of visualization can also be performed directly on the music itself, not only on the lyrics. For instance, you can visualize self-similarity matrices related to rhythm, melody, chords, and so on. You can perform this kind of analysis, for instance, using MIRtoolbox.
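To make the idea concrete, here is a minimal sketch (in the spirit of SongSim, but not its actual code) of a lyric self-similarity matrix: cell (i, j) is 1 when word i equals word j, so repeated phrases show up as off-diagonal stripes.

```python
# Sketch of a lyric self-similarity matrix (in the spirit of SongSim,
# but not its actual implementation): cell (i, j) is 1 when word i
# equals word j, so repeated phrases appear as diagonal stripes.

def self_similarity(words):
    n = len(words)
    return [[1 if words[i] == words[j] else 0 for j in range(n)]
            for i in range(n)]

lyrics = "love love me do you know I love you".split()
matrix = self_similarity(lyrics)

# The main diagonal is always 1 (every word matches itself)...
assert all(matrix[i][i] == 1 for i in range(len(lyrics)))
# ...and the repeats of "love" (positions 0, 1, 7) match each other.
assert matrix[0][1] == matrix[0][7] == 1
```

The same construction carries over to audio: replace exact word equality with a similarity measure between per-frame features (e.g. chroma for harmony, onset patterns for rhythm), which is the kind of matrix MIRtoolbox computes.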
Scooped by Olivier Lartillot!

How A.I. Is Creating Building Blocks to Reshape Music and Art

Project Magenta, a team at Google, is crossbreeding sounds from different instruments based on neural networks and building networks that can draw.
Olivier Lartillot's insight:

Enrolling as a graduate student at Indiana University, in Bloomington, not far from where he grew up, Douglas Eck pitched his idea (to build machines that could make their own songs, to mix A.I. and music) to Douglas Hofstadter, the cognitive scientist who wrote the Pulitzer Prize-winning book on minds and machines, “Gödel, Escher, Bach: An Eternal Golden Braid.” Mr. Hofstadter turned him down, adamant that even the latest artificial intelligence techniques were much too primitive.

But over the next two decades, working on the fringe of academia, Mr. Eck kept chasing the idea, and eventually, the A.I. caught up with his ambition. Last spring, a few years after taking a research job at Google, Mr. Eck pitched the same idea he had pitched Mr. Hofstadter all those years ago. The result is Project Magenta, a team of Google researchers who are teaching machines to create not only their own music but also many other forms of art, including sketches, videos and jokes.

The project is part of a growing effort to generate art through a set of A.I. techniques that have only recently come of age. Called deep neural networks, these complex mathematical systems allow machines to learn specific behavior by analyzing vast amounts of data. But these complex systems can also create art. By analyzing a set of songs, for instance, they can learn to build similar sounds. As Mr. Eck says, these systems are at least approaching the point — still many, many years away — when a machine can instantly build a new Beatles song or perhaps trillions of new Beatles songs, each sounding a lot like the music the Beatles themselves recorded, but also a little different.
Scooped by Olivier Lartillot!

I couldn’t tell that this was a robot singing Duke Ellington’s signature song

Sometimes we hear something and can't believe it. That happened to me today. New research from Barcelona's Pompeu Fabra University has trained an AI to sing better than I can after listening to just 35 minutes of audio. Speech generation has been drastically improving over the last few years, with research firms like Baidu and DeepMind constantly one-upping each other in who can make the most realistic robot voice.
Olivier Lartillot's insight:
New research from Barcelona’s Pompeu Fabra University has trained an AI to sing after listening to just 35 minutes of audio. Speech generation has been drastically improving over the last few years, with research firms like Baidu and DeepMind constantly one-upping each other in who can make the most realistic robot voice.

The team used an older approach on top of DeepMind’s new WaveNet voice generator. Instead of learning just from the raw audio, they analyzed the audio and broke it into components: pitch, timbre and aperiodicity (the “breathy component of the voice”). By separating the pitch and timbre components, they can easily manipulate pitch to match any desired melody. The AI can also learn from smaller amounts of audio when the data is broken down.

How long until an interview with a musician or a solo voice track is all it takes for someone to copy their voice? (Couple that with face-stealing technology and you’ve got an AI cover band on your hands.)

Blaauw was actually surprised at how well the neural network, a statistical approximation of how the brain learns, was able to use these components to understand how the voice should sound. They were able to gauge the network’s understanding by having it try to mimic a softer voice—and the network applied things it learned from the regular voice, like notes and phonetic transitions, to the softer one.
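As a heavily simplified illustration of the pitch component (a toy with assumed numbers, not the Pompeu Fabra model, which builds on WaveNet): estimate a tone's fundamental frequency from an autocorrelation peak, then resynthesize at a different pitch while, in principle, everything else stays fixed.

```python
import math

def estimate_f0(signal, sample_rate):
    """Estimate the fundamental frequency via a brute-force autocorrelation peak."""
    best_lag, best_corr = None, -float("inf")
    # Search lags corresponding to fundamentals between 50 and 500 Hz.
    for lag in range(sample_rate // 500, sample_rate // 50):
        corr = sum(signal[i] * signal[i - lag] for i in range(lag, len(signal)))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return sample_rate / best_lag

sr = 8000
# A 220 Hz test tone standing in for a sung note.
tone = [math.sin(2 * math.pi * 220 * t / sr) for t in range(2000)]
f0 = estimate_f0(tone, sr)  # close to 220 Hz

# "Resynthesize" the note one octave higher: same duration, new pitch.
shifted = [math.sin(2 * math.pi * 2 * f0 * t / sr) for t in range(2000)]
```

A real singing synthesizer would of course also model timbre and aperiodicity and use a learned vocoder; the point is only that once pitch is isolated as an explicit parameter, retuning it to any desired melody becomes straightforward.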
Scooped by Olivier Lartillot!

Scarlett, the artificial intelligence that listens to your music

Artificial intelligence applied to music

Olivier Lartillot's insight:
My translation, using Google Translate:

Begun by the French start-up Niland in 2013, the Scarlett application has finally arrived. Unlike Apple Music or Spotify, Scarlett recommends songs based on the music you listen to, not on statistical data.

Launched in 2013, the Parisian start-up Niland specializes in music search engines and recommendation. This year they will release their Scarlett application, after three years of work developing innovative new algorithms. Scarlett is a musical artificial intelligence that helps users discover new music on Soundcloud, based on each user's tastes and the context of each listening session (mood and activity). Thus, the application is unique to each user. "We have no special deal with Soundcloud, except that we have access to their programming interface, which allows us to analyze and stream songs," explains Damien Tardieu. Unlike the algorithms used by Youtube or Spotify, which recommend the songs most listened to by other users, Scarlett's algorithms directly analyze the streamed audio. During each interaction with the application, Scarlett learns to improve the relevance of its recommendations. This is what is called deep learning.

“The technology of deep learning learns to represent the world: how the machine represents the voice or the image, for example.” Here, it is a representation of musical tastes. “With artificial intelligence, our algorithm builds its own representation of the music. It uses a thousand criteria to compare pieces of music with one another.”

"Our recommendation technology is very different from Spotify's. It is based solely on the content of the music, not on play statistics or on information found on the Internet. This allows us to work on very different catalogs for which little or no data is available, such as the Soundcloud catalog, or to recommend young or emerging artists." The application will be available on the Apple Store and Google Play next September. Meanwhile, you can already test it.
Scooped by Olivier Lartillot!

Spotify's Head of Deep Learning Reveals How AI Is Changing the Music Industry

The New York Observer talked with Nicola Montecchio, a music information retrieval scientist who is the head of deep learning for Spotify, who detailed how the company is using deep learning to improve the way we find and listen to music.
Olivier Lartillot's insight:

The article itself:

"Music is undergoing drastic changes all of the time. From styles and genres to devices we use to listen and how content is produced, everything music-related is constantly evolving—a phenomenon that is beginning to accelerate with the research and introduction of new technologies.


Most of us are unaware of this behind-the-scene job, but the industry is inching more and more into the world of artificial intelligence and machine learning. The New York Observer talked with Nicola Montecchio, a music information retrieval scientist who is the head of deep learning for Spotify, who detailed how the company is using deep learning to improve the way we find and listen to music.


What’s your role at Spotify and how is the company utilizing deep learning?

My job here is to apply machine learning techniques to the content of songs. We’re trying to figure out if a song is happy or sad, what are similar sounding songs and things of that sort. The characteristics that we associate with songs are subjective, but we’re trying to infer these subjective characteristics of songs by focusing on the acoustic elements themselves without considering the popularity of the artists.


How does that better user experience?

It’s figuring out users’ interests by seeing what else they and others are interested in, and it’s been working well.

If a new song comes on the platform from an artist that’s not popular, it would be hard to associate that with you using more traditional methods because no one else is listening. But when we rely on the acoustics of the song, we can make better suggestions and direct users to lesser-known artists.


So, is introducing new music to users part of Spotify’s mission?

I would say so, yes. We try to make you discover new music. If there are some unknown artists that sound good and are similar to what you like, we think you should listen to that.


How were you able to curate music in the past? How successful was it?

We already did this some without deep learning, but this allows us to go more in-depth with the understanding of the song. It’s a bit more flexible in a way. We had an intern last summer who did a really nice job, and his idea was to map the acoustic elements of a song to the listening patterns. He was trying to predict what you will listen to using deep learning. It’s also interesting because you can’t predict the popularity from just the audio, so you’re also predicting something about the humanality [sic] of it.


How has deep learning allowed Spotify to grow?

On my side, I think it brings us a lot in terms of accuracy. Then of course we’re serving this into the recommendation engine. It’s surely enabling us to be more diversified—less tied to popularity and more tied to what the song actually sounds like.


How will this technology change the way we find and listen to music in the future?

It will be more about the sound. You’ll be able to search by the content of the music instead of just text information associated with it.  This means you can search for truly similar sounding music by actually searching the sound itself instead of doing what we currently do—search a title, image or artist. That’s what I think will come to the table that’s significantly better than what’s out there.


How will this affect the music industry?

For sure it’s a way to make some other artists more visible that you might not have heard of otherwise. To say how much listening patterns could change is a little bit of a harder of a question, but it should make the field more even and less biased. It will certainly level the playing field.

Scooped by Olivier Lartillot!

How music listening programmes can be easily fooled

Like automatic image recognition systems, music listening programmes are easily fooled by almost undetectable changes in the music they are studying.
Olivier Lartillot's insight:

From the blog post:


“For well over two decades, researchers have sought to build music listening software that can address the deluge of music growing faster than our Spotify-spoilt appetites. From software that can tell you about the music you are hearing in a club and software that can recommend the music you didn't know you wanted to hear. From software that can intelligently accompany you practicing your instrument or act as an automated sound engineer, machine music listening is becoming increasingly prevalent.


One particular area of intense research has been devoted to getting these “computer ears” to recognise generic attributes of music: genres such as blues or disco, moods such as sad or happy, or rhythms like waltz or cha cha. The potential is enormous. Like spiders that crawl the web to make its information accessible to anyone, these machine listeners will be able to do the same for massive collections of music.


So far, we see some systems achieve accuracies that equal that of humans. The appearance of this human-level performance, however, may instead be a classic sign of unintentional cues from the experimentalist - a danger in experimental design recognised for over a century.


Clever Hans taps to an unexpected beat


Hans was a typical horse in Germany at the turn of the 20th century, with an atypical ability in abstract thought. Anyone could ask him to add several numbers, and away he would tap until he reached the correct answer. He could subtract, multiply, divide, factor, and even correctly answer questions written on a slate. His trainer also noted with pride that Hans learned to master new subjects with amazing ease, such as music theory, and the Gregorian calendar. However, give Hans an arithmetic problem to which no one knew the answer, or while he was blindfolded, and he was rendered back to the world of his more oat-minded barn mates.


It turns out that Hans really was clever, but only in the sense that he learned the most carrot-rewarding interpretation of the unconscious cues of the turn of the century German enquirer. It was apparently common to slightly bow one's torso while asking questions of a horse, and then erect oneself at the moment it produced the correct answer. Hans had simply learned to begin tapping his hoof once he saw the torso bow, and stop once it unbowed, a cue so subtle that it eluded detection until properly controlled experiments were conducted. This brings us back to our music listening systems that appear to perform as well as humans.


Blindfolding and commissioning the listening machine


Taking one state of the art system measured to classify seven music rhythms with an 88% accuracy, we find that it only appears so because it has learned the most carrot-rewarding interpretation of the data it has seen: the generic rhythm labels are strongly correlated in the dataset with tempo. As long as a system can accurately estimate tempo, it can appear in this particular dataset to be capable of recognizing rhythm. If we slightly change the music tempi of the test dataset (like blindfolding Hans), our formerly fantastic system begins performing no better than chance.
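The confound is easy to reproduce in miniature (a synthetic illustration with made-up tempi, not the actual system evaluated in the study): a "rhythm" classifier that secretly keys only on tempo scores perfectly on the original data and collapses to chance once the test tempi are shifted.

```python
import random

random.seed(0)

# Synthetic benchmark where the rhythm label is perfectly correlated with
# tempo (made-up numbers): "waltz" clips sit near 90 BPM, "chacha" near 125.
def make_songs(n, tempo_shift=0.0):
    songs = []
    for _ in range(n):
        label = random.choice(["waltz", "chacha"])
        base = 90.0 if label == "waltz" else 125.0
        tempo = base * (1 + tempo_shift) + random.uniform(-5, 5)
        songs.append((tempo, label))
    return songs

def classify(tempo):
    # "Clever Hans" classifier: it never looks at rhythm, only at tempo.
    return "waltz" if tempo < 107.5 else "chacha"

def accuracy(songs):
    return sum(classify(tempo) == label for tempo, label in songs) / len(songs)

original = make_songs(200)                   # tempo cue intact: perfect score
sped_up = make_songs(200, tempo_shift=0.25)  # every tempo raised 25%: ~chance
```

On `original`, accuracy is 100%; on `sped_up`, every "waltz" crosses the tempo threshold and the classifier falls to roughly chance, mirroring the blindfolded-Hans result without the system ever having modeled rhythm at all.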


To gain insight into what a state of the art music genre recognition system has learned, we have employed it as a kind of “professor” for young and naïve (computerised) music composition students. These students come to the professor with their random compositions, and it tells them whether or not their music is, for instance, unlike disco, quite like disco, or a perfect representation of disco. We keep the ones it says are perfect.


This system, which has been measured to recognise ten music genres with 82% accuracy (arguably human performance), confidently labelled 10 compositions as representative of each genre it has learnt (see the video in the link). By a listening experiment, we found that humans could not recognise any of the genres.
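The "professor" procedure can be mimicked in a few lines (a toy with an invented proxy feature, not the actual genre recognizer from the study): naive "student composers" submit random clips, and we keep the one the professor rates as the most perfect disco, which is still just noise.

```python
import random

random.seed(1)

def disco_score(clip):
    # Toy "professor" with an invented proxy feature: the fraction of loud
    # samples. A trained system's internal cues can be just as unrelated
    # to what human listeners actually hear as "disco".
    return sum(1 for x in clip if abs(x) > 0.8) / len(clip)

def random_clip(length=64):
    # A naive "student composer": uniform random noise.
    return [random.uniform(-1.0, 1.0) for _ in range(length)]

# Submit many random compositions and keep the one the professor rates best.
candidates = [random_clip() for _ in range(500)]
best = max(candidates, key=disco_score)
# `best` games the proxy feature far better than a typical clip does,
# yet to a human ear it is indistinguishable from the rest of the noise.
```

The gap between the professor's confidence and human judgment is the whole point: maximizing a classifier's score surfaces whatever shortcut it actually learned, not the concept its label claims.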


It appears then that our “professor” is like Clever Hans.


We’ve used a similar procedure to make another state of the art music listening system label the same pieces of classical and country music as a variety of other genres. It is hard to hear much difference between them.


Just like horses, “horses” are not all bad


We should note that discovering a “horse” in an algorithm is not necessarily its ticket to the algorithmic glue factory. For one, a “horse” might be completely sufficient to meet the needs of a target application. It also provides an opportunity and mechanism to improve the validity of the system evaluation. And finally, discovering a “horse” provides a way to improve the system itself since it identifies the reasons for its behaviour. For that, I think even Clever Hans is clever enough to tap his hoof once for “Ja”!”

Scooped by Olivier Lartillot!

Spotify: Friend or Foe?

Earlier this year, Spotify bought a Boston-based startup called the Echo Nest, which has developed a form of artificial music intelligence—a kind of A.I. hipster that finds cool music for you. The Echo Nest powers Spotify’s automated radio stations and is also behind an in-house programming tool called Truffle Pig, which can be told to sniff out music with combinations of more than fifty parameters, such as “speechiness” and “acoustic-ness.” Now that the Echo Nest is part of Spotify, its team has access to the enormous amount of data generated by Spotify users which show how they consume music.

Scooped by Olivier Lartillot!

Inside Google's Infinite Music Intelligence Machine


In May, Google launched a music service that will challenge Spotify and Pandora for radio domination. We asked Google research scientist Doug Eck how it works.

Olivier Lartillot's insight:


The Beatles have a way of keeping Doug Eck up at night. Specifically, the research scientist at Google grapples with one of the core problems of automated music discovery: With a band whose catalog is as evolutionary and nuanced as The Beatles's, how can computers truly understand the artist and recommend relevant music to fans? For humans, detecting the difference is easy. For machines, it's not so simple.

Solving problems like these resolves only part of a larger, more complex equation at play when Google attempts to help its users discover music. Music discovery is a crucial piece of that puzzle and one that's notoriously challenging to lock into place.

In taking its own stab at music recommendation, Google blends technical solutions like machine listening and collaborative filtering with good, old-fashioned human intuition. Employing both engineers and music editors, the service continually tries to understand what people are listening to, why they enjoy it, and what they might want to hear next.

How Google Music Intelligence Works

Eck's team is focused on the technical side of this equation, relying on a dual-sided machine learning methodology. One component of that is collaborative filtering of the variety employed by Netflix and Amazon to recommend horror flicks and toasters. The other involves machine listening. That is, computers "listen" to the audio and try to pick out specific qualities and details within each song.

Collaborative filtering works wonders for the Amazons of the world. But since this type of If-you-like-that-you'll-also-like-this logic works better for kitchen appliances than it does for art, the system needs a way to learn more about the music itself. To teach it, Eck's team leverages Google's robust infrastructure and machine-listening technology to pick apart the granular qualities of each song.
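For contrast with machine listening, the collaborative-filtering half can be sketched in its simplest item-item form (a generic illustration with invented play data, not Google's actual system): songs are represented by who plays them, and we recommend the unheard song whose listeners overlap most.

```python
import math

# Invented play data: song -> set of users who played it.
plays = {
    "song_a": {"u1", "u2", "u3"},
    "song_b": {"u1", "u2"},
    "song_c": {"u4"},
}

def cosine(listeners_a, listeners_b):
    """Cosine similarity between two songs' listener sets."""
    if not listeners_a or not listeners_b:
        return 0.0
    overlap = len(listeners_a & listeners_b)
    return overlap / math.sqrt(len(listeners_a) * len(listeners_b))

def recommend(liked_song, plays):
    """Return the other song most similar to `liked_song` by shared listeners."""
    candidates = [s for s in plays if s != liked_song]
    return max(candidates, key=lambda s: cosine(plays[liked_song], plays[s]))

rec = recommend("song_a", plays)  # "song_b": it shares listeners u1 and u2
```

The weakness the article points to is visible here: a brand-new song has an empty listener set, so its similarity to everything is zero, which is exactly the long-tail gap that audio-based models are meant to fill.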

"By and large, audio-based models are very good at timbre," says Eck. "So they're very good at recognizing instruments, very good at recognizing things like distorted guitars, very good at recognizing things like whether it's a male or female vocalist."

These are precisely the kinds of details that Pandora relies on human, professionally trained musicians to figure out. The Internet radio pioneer has long employed musicologists to listen to songs and help build out a multipoint, descriptive data set designed to place each track into a broader context and appropriately relate it to other music. For Pandora, the results have been extremely valuable, but mapping out this musical intelligence manually doesn't scale infinitely. Thankfully, machine listening has come a long way in recent years. Much like Google indexes the Web, the company is able to index a massive database of audio, mapping the musical qualities found within. Since it's automated, this part of Google's music recommendation technology can be scaled to a much larger set of data.

"If the four of us decided we were going to record a jazz quartet right here and now and we uploaded it to Play Music, our system will be aware that we're talking about that," explains Eck. "By pulling these audio features out of every track that we work with, it gives us a kind of musical vocabulary that we can work with for doing recommendation even if it’s a very long tail."

Indeed, when it comes to music, the tail has never been longer. The world's selection of recorded music was never finite, but today creating and distributing new songs is virtually devoid of friction and financial cost. However much human intelligence Pandora feeds into its algorithm, its Music Genome Project will never be able to keep up and understand everything. That's where machine learning gets a leg up.

The Limits Of Machine Listening

Still, there's a reason Pandora has more than 70 million active listeners and continues to increase its share of overall radio listening time. Its music discovery engine is very good. It might not know about my friend's band on a small Georgia-based record label, but the underlying map of data that Pandora uses to create stations is still incredibly detailed. When I start a radio station based on Squarepusher, an acclaimed but not particularly popular electronic music artist, the songs it plays are spun for very specific reasons. It plays a track by Aphex Twin because it features "similar electronica roots, funk influences, headnodic beats, the use of chordal patterning, and acoustic drum samples." Then, when I skip to the next song, it informs me that, "We're playing this track because it features rock influences, meter complexity, unsyncopated ensemble rhythms, triple meter style, and use of modal harmonies."

Pandora knows this much about these tracks thanks to those aforementioned music experts who sat down and taught it. Automated machine listening, by comparison, can't get quite as specific. At least, not yet.

"It’s very hard and we haven’t solved the problem with a capital S," says Eck, who has an academic background in automated music analysis. "Nor has anybody else."

Computers might be able to pick out details about timbre, instruments used, rhythm, and other on-the-surface sonic qualities, but they can only dig so deep.

"You can learn a lot from one second of audio. Certainly you can tell if there’s a female voice there or if there’s distorted guitar there. What about when we stretch out and we look what our musical phrase is. What’s happening melodically? Where’s this song going? As we move out and have longer time scale stretches that we’re trying to outline, it becomes very hard to use machines alone to get the answer."

Thanks Algorithms, But The Humans Can Take It From Here

That's where the good, old-fashioned human beings come in. To help flesh out the music discovery and radio experiences in All Access, Google employs music editors who have an intuition that computers have yet to successfully mimic. Heading up this editor-powered side of the equation is Tim Quirk, a veteran of the online music industry who worked at the now-defunct before Napster was a household name.

"Algorithms can tell you what are the most popular tracks in any genre, but an algorithm might not know that 'You Don't Miss Your Water' was sort of the first classic, Southern soul ballad in that particular time signature and that it became the template for a decade's worth of people doing the same thing," says Quirk. "That’s sort of arcane human knowledge."

[Well, we could hope that computers will be able to discover such knowledge automatically in the future. (Olivier)]

Google's blend of human and machine intelligence is markedly different from Pandora's. Rather than hand-feeding tons of advanced musical knowledge directly into its algorithms, Google mostly keeps the human-curated stuff in its own distinct areas, allowing the computers to do the heavy lifting elsewhere. Quirk and his team of music editors are the ones who define the most important artists, songs and albums in a given genre (of which there are hundreds in Google Play Music).

Quirk's team also creates curated playlists and makes specific, hand-picked music recommendations. To the extent that these manually curated parts of the service influence its users' listening behavior, the human intelligence does find its way back into the algorithms. It just loops back around and takes a longer road to get there.
Google's employees aren't the only people feeding intelligence into this semiautomated music machine. Google is also constantly learning from its users. Like Pandora and its many copycats, Google Play Music's Radio feature has thumbs up and thumbs down buttons, which help inform the way the radio stations work over time.
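A thumbs-based feedback loop of the kind described above can be sketched very simply. The multiplicative update rule below is an assumption for illustration only, not Google's or Pandora's actual algorithm.

```python
# Minimal sketch of thumb feedback shaping a station over time.
# The update rule (multiplicative re-weighting) is invented here.

def apply_feedback(weights, song, thumbs_up, factor=1.5):
    """Raise or lower a song's play weight based on one thumb vote."""
    w = dict(weights)
    w[song] = w[song] * factor if thumbs_up else w[song] / factor
    return w

station = {"song_a": 1.0, "song_b": 1.0, "song_c": 1.0}
station = apply_feedback(station, "song_a", thumbs_up=True)
station = apply_feedback(station, "song_c", thumbs_up=False)

# Normalized weights become play probabilities for the station.
total = sum(station.values())
probs = {s: w / total for s, w in station.items()}
print(probs)  # song_a now more likely, song_c less likely
```

Aggregated over millions of listeners, votes like these are one way user behavior can feed intelligence back into an otherwise automated system.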

Some day, computers will be better able to understand not just that I like The Beatles, but why I like The Beatles. They'll know that John is my favorite, which songs on Revolver I skip, and that of all the heavily Beatles-influenced bands in the world, I love Tame Impala, but loathe Oasis. The machines will get smarter, much like a toddler does: by watching and listening. As users spend time with the product and tap buttons, the nuanced details will become more obvious.
Meanwhile, the ability of computers to actually hear what's contained within each song can only improve over time, just as voice recognition continues to do. The promise of Google Play Music is the same thing that made Google successful to begin with: its ability to use massive data to understand who we are and what we want. If anybody can crack the notoriously hard nut of music discovery in a hugely scalable fashion, it's them. Just don't expect them to do it with machines alone.

Scooped by Olivier Lartillot!

The Future of Music Genres Is Here

We’ve always felt ambivalent about the word “genre” at The Echo Nest. On one hand, it’s the most universal shorthand for classifying music, because everyone has a basic understanding of the big, old...
Olivier Lartillot's insight:

The Echo Nest “top terms,” which are the words most commonly used to describe a piece of music, are “far more granular than the big, static genres of the past. We’ve been maintaining an internal list of dynamic genre categories for about 800 different kinds of music. We also know what role each artist or song plays in its genre (whether they are a key artist for that genre, one of the most commonly played, or an up-and-comer).”


The Echo Nest just announced “a bunch of new genre-oriented features,” including:

- A list of nearly 800 genres from the real world of music

- Names and editorial descriptions of every genre

- Essential artists from any genre

- Similar genres to any genre

- Verified explainer links to third-party resources when available

- Genre search by keyword

- Ranked genres associated with artists

- Three radio “presets” for each genre: Core (the songs most representative of the genre); In Rotation (the songs being played most frequently in any genre today); and Emerging (up-and-coming songs within the genre).


Where did these genres come from?


“The Echo Nest’s music intelligence platform continuously learns about music. Most other static genre solutions classify music into rigid, hierarchical relationships, but our system reads everything written about music on the web, and listens to millions of new songs all the time, to identify their acoustic attributes.


“This enables our genres to react to changes in music as they happen. To create dynamic genres, The Echo Nest identifies salient terms used to describe music (e.g., “math rock,” “IDM”, etc.), just as they start to appear. We then model genres as dynamic music clusters — groupings of artists and songs that share common descriptors, and similar acoustic and cultural attributes. When a new genre forms, we know about it, and music fans who listen to our customers’ apps and services will be able to discover it right away, too.
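The "dynamic music clusters" idea quoted above can be illustrated with a toy sketch: group artists whose descriptor sets overlap strongly. Everything here is an assumption for illustration (the descriptors, the Jaccard measure, the 0.5 threshold, the greedy pass); The Echo Nest's real system also folds in acoustic and cultural signals.

```python
# Hedged sketch of "dynamic genre" formation: cluster artists whose
# descriptor sets overlap. All data and thresholds are invented.

def jaccard(a, b):
    """Overlap of two descriptor sets: |A & B| / |A | B|."""
    return len(a & b) / len(a | b)

artists = {
    "artist_1": {"math rock", "angular guitars", "odd meters"},
    "artist_2": {"math rock", "odd meters", "post-rock"},
    "artist_3": {"IDM", "glitch", "breakbeats"},
}

# Greedy single pass: join an artist to the first cluster whose seed
# artist shares enough descriptors, otherwise start a new cluster.
clusters = []
for name, terms in artists.items():
    for cluster in clusters:
        seed_terms = artists[cluster[0]]
        if jaccard(terms, seed_terms) >= 0.5:
            cluster.append(name)
            break
    else:
        clusters.append([name])

print(clusters)  # the two math-rock artists group; the IDM artist stands apart
```

When a new descriptor like “math rock” starts appearing across many artists, a pass like this would spontaneously produce a new cluster, which is the sense in which such genres are "dynamic."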


“Our approach to genres is trend-aware. That means it knows not only what artists and songs fall into a given genre, but also how those songs and artists are trending among actual music fans, within those genres.


“About 260 of these nearly 800 genres are hyper-regional, meaning that they are tied to specific places. Our genre system sees these forms of music as they actually exist; it can help the curious music fan hear the differences, for instance, between Luk Thung, Benga, and Zim music.”
