Computational Music Analysis
New technologies and research dedicated to the analysis of music using computers. Towards Music 2.0!
Scooped by Olivier Lartillot!

Google's new research project: AI systems trained to create original pieces of music, art or video


Google wants to put the art back in artificial intelligence. Magenta will use TensorFlow, the machine-learning engine that Google built and opened up to the public at the end of 2015, to determine whether AI systems can be trained to create original pieces of music, art, or video.

Scooped by Olivier Lartillot!

How Brain Architecture Leads to Abstract Thought

Olivier Lartillot's insight:

Taken from the article:


Using 20 years of functional magnetic resonance imaging (fMRI) data from tens of thousands of brain imaging experiments, computational neuroscientists have created a geometry-based method for massive data analysis to reach a new understanding of how thought arises from brain structure.

The authors say their work paves the way for advances in the identification and treatment of brain disease, as well as in deep learning artificial intelligence (AI) systems. 

fMRI detects changes in neural blood flow, allowing researchers to relate brain activity to a cognitive behavior such as talking. No one had ever tied together the tens of thousands of experiments performed over decades to show how the physical brain could give rise to abstract thought.

Cognitive function and abstract thought exist as an agglomeration of many cortical sources, ranging from those close to the sensory cortices to those far deeper along the brain connectome, or connection wiring diagram.

The work not only demonstrates the basic operational paradigm of cognition, but shows that all cognitive behaviors exist on a hierarchy, starting with the most tangible behaviors such as finger tapping or pain, moving through consciousness, and extending to the most abstract thoughts and activities such as naming. This hierarchy of abstraction is related to the connectome structure of the whole human brain.

For this study, the researchers took a data-science approach. They first defined a physiological directed network of the whole brain, starting at input areas and labeling each brain area with the distance or “depth” from sensory inputs. They then processed the massive repository of fMRI data. The idea was to project the active regions for a cognitive behavior onto the network depth and describe that cognitive behavior in terms of its depth distribution. Although each cognitive behavior showed activity through many network depths, cognition turned out to be far richer: it wasn’t the simple hierarchy that everyone was looking for.

Imagine a balance where the right pan holds brain activity at the shallowest depth and the left pan holds activity in the deepest brain areas, those most removed from inputs. If the balance arm describes the total brain activity for a particular cognitive behavior, the right pan will be lower, creating a negative slope, when most activity is in shallow areas, and the left pan will go lower, creating a positive slope, when most activity is deeper. The balance arm’s slope thus describes the relative shallow-to-deep brain activity for any behavior.

The new geometric algorithm works on this principle, but instead of two pans, it has many. The researchers summed all neural activity for a given behavior over all related fMRI experiments, then analyzed it using the slope algorithm. With a slope identifier, behaviors could now be ordered by their relative depth activity with no human intervention or bias. They ranked slopes for all cognitive behaviors from the fMRI databases from negative to positive and found that they ordered from more tangible to highly abstract. 
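The balance analogy can be sketched in code. The following is a minimal illustration, not the authors' published algorithm: a behavior's summed activity profile over network depths is reduced to a single slope via a least-squares fit, and ranking behaviors by that slope orders them from tangible to abstract. All depth profiles here are invented.

```python
import numpy as np

def activity_slope(depths, activity):
    """Fit a line to total activity as a function of network depth.

    A negative slope means activity concentrates in shallow (sensory-near)
    areas; a positive slope means it concentrates in deeper areas.
    """
    depths = np.asarray(depths, dtype=float)
    activity = np.asarray(activity, dtype=float)
    # Least-squares fit of: activity = slope * depth + intercept
    slope, _intercept = np.polyfit(depths, activity, 1)
    return slope

# Hypothetical depth profiles for two behaviors (depths 0..4)
finger_tapping = activity_slope(range(5), [9, 6, 3, 2, 1])   # shallow-heavy
abstract_naming = activity_slope(range(5), [1, 2, 3, 6, 9])  # deep-heavy

behaviors = {"finger tapping": finger_tapping, "naming": abstract_naming}
# Ranking slopes from negative to positive orders behaviors
# from tangible to abstract, with no human intervention.
ranking = sorted(behaviors, key=behaviors.get)
print(ranking)  # ['finger tapping', 'naming']
```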

This work may have great impact in computer science, especially in deep learning. Deep learning is a computational system employing a multi-layered neural net, and is at the forefront of artificial intelligence (AI) learning algorithms. It bears similarity to the human brain in that higher layers are agglomerations of previous layers, so each neuron in a higher layer carries more information.

But the brain’s processing dynamic is far richer and less constrained because it has recurrent interconnection, sometimes called feedback loops. In current human-made deep learning networks that lack recurrent interconnections, a particular input cannot be related to other recent inputs, so they can’t be used for time-series prediction, control operations, or memory.
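The distinction can be seen in a toy update rule. In the sketch below (layer sizes and random weights are arbitrary, purely for illustration), a feedforward unit responds only to the current input, while a recurrent unit carries a hidden state that mixes in earlier inputs through a feedback connection:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))  # input -> hidden weights
U = rng.normal(size=(4, 4))  # hidden -> hidden weights (the "feedback loop")

def feedforward_step(x):
    # Output depends only on the current input x
    return np.tanh(W @ x)

def recurrent_step(x, h):
    # Output depends on x AND the hidden state h, which summarizes past inputs
    return np.tanh(W @ x + U @ h)

# Feed a nonzero input, then a zero input
h = np.zeros(4)
for x in [np.ones(3), np.zeros(3)]:
    h = recurrent_step(x, h)

# On a zero input the feedforward unit outputs nothing...
print(np.allclose(feedforward_step(np.zeros(3)), 0))  # True
# ...while the recurrent state still reflects the earlier nonzero input,
# which is what makes time-series prediction and memory possible.
print(np.allclose(h, 0))  # False
```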

Their lab is now creating a massively recurrent deep learning network for a more brain-like and superior learning AI. Another interesting outcome of this research will be a new geometric data-science tool, which is likely to find widespread use in other fields where massive data is difficult to view coherently due to data overlap.


Scooped by Olivier Lartillot!

Former Nokia design head named CEO of The Sync Project - Boston Business Journal

Marko Ahtisaari, who served as head of design for the Finnish telecommunications company, was named to head the startup today less than a year after it was launched.
Olivier Lartillot's insight:

The Boston-based startup The Sync Project, focused on studying how music can change human health, aims to combine two emerging trends in technology: the analysis of musical structure now used by services like The Echo Nest, Spotify, and Pandora to identify similarities and differences in music, and the explosion of wearable technology like Fitbits and Apple Watches that allows constant monitoring of health.

Scooped by Olivier Lartillot!

A copyright victory for Marvin Gaye’s family is terrible for the future of music

Even if you love Marvin Gaye and hate Robin Thicke's misogynist and vaguely rapey song.
Olivier Lartillot's insight:

I wonder whether computational music analysis could be used here to ground juries' decisions in objective facts.

Scooped by Olivier Lartillot!

How music listening programmes can be easily fooled

Like automatic image recognition systems, music listening programmes are easily fooled by almost undetectable changes in the music they are studying.
Olivier Lartillot's insight:

From the blog post:


“For well over two decades, researchers have sought to build music listening software that can address the deluge of music growing faster than our Spotify-spoilt appetites. From software that can tell you about the music you are hearing in a club and software that can recommend the music you didn't know you wanted to hear. From software that can intelligently accompany you practicing your instrument or act as an automated sound engineer, machine music listening is becoming increasingly prevalent.


One particular area of intense research has been devoted to getting these “computer ears” to recognise generic attributes of music: genres such as blues or disco, moods such as sad or happy, or rhythms like waltz or cha cha. The potential is enormous. Like spiders that crawl the web to make its information accessible to anyone, these machine listeners will be able to do the same for massive collections of music.


So far, we see some systems achieve accuracies that equal that of humans. The appearance of this human-level performance, however, may instead be a classic sign of unintentional cues from the experimentalist - a danger in experimental design recognised for over a century.


Clever Hans taps to an unexpected beat


Hans was a typical horse in Germany at the turn of the 20th century, with an atypical ability in abstract thought. Anyone could ask him to add several numbers, and away he would tap until he reached the correct answer. He could subtract, multiply, divide, factor, and even correctly answer questions written on a slate. His trainer also noted with pride that Hans learned to master new subjects with amazing ease, such as music theory, and the Gregorian calendar. However, give Hans an arithmetic problem to which no one knew the answer, or while he was blindfolded, and he was rendered back to the world of his more oat-minded barn mates.


It turns out that Hans really was clever, but only in the sense that he learned the most carrot-rewarding interpretation of the unconscious cues of the turn of the century German enquirer. It was apparently common to slightly bow one's torso while asking questions of a horse, and then erect oneself at the moment it produced the correct answer. Hans had simply learned to begin tapping his hoof once he saw the torso bow, and stop once it unbowed, a cue so subtle that it eluded detection until properly controlled experiments were conducted. This brings us back to our music listening systems that appear to perform as well as humans.


Blindfolding and commissioning the listening machine


Taking one state of the art system measured to classify seven music rhythms with an 88% accuracy, we find that it only appears so because it has learned the most carrot-rewarding interpretation of the data it has seen: the generic rhythm labels are strongly correlated in the dataset with tempo. As long as a system can accurately estimate tempo, it can appear in this particular dataset to be capable of recognizing rhythm. If we slightly change the music tempi of the test dataset (like blindfolding Hans), our formerly fantastic system begins performing no better than chance.
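A toy version of this confound: the "rhythm classifier" below does nothing but threshold tempo, yet scores perfectly on a dataset where rhythm labels happen to correlate with tempo, and collapses once the test tempi are shifted. All numbers are invented.

```python
# Toy illustration of the tempo confound: this "rhythm classifier"
# only thresholds tempo (BPM), yet looks perfect on a dataset where
# rhythm labels correlate with tempo. All data are made up.

def classify_rhythm(bpm):
    return "waltz" if bpm < 110 else "cha-cha"

dataset = [(90, "waltz"), (95, "waltz"), (120, "cha-cha"), (128, "cha-cha")]

accuracy = sum(classify_rhythm(bpm) == label for bpm, label in dataset) / len(dataset)
print(accuracy)  # 1.0 -- appears to "recognize rhythm"

# "Blindfolding Hans": time-stretch the same pieces by 25%.
# The rhythms are unchanged, but the tempo shortcut collapses.
shifted = [(int(bpm * 1.25), label) for bpm, label in dataset]
shifted_accuracy = sum(classify_rhythm(bpm) == label for bpm, label in shifted) / len(shifted)
print(shifted_accuracy)  # 0.5 -- chance level on two classes
```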


To gain insight into what a state of the art music genre recognition system has learned, we have employed it as a kind of “professor” for young and naïve (computerised) music composition students. These students come to the professor with their random compositions, and it tells them whether or not their music is, for instance, unlike disco, quite like disco, or a perfect representation of disco. We keep the ones it says are perfect.


This system, which has been measured to recognise ten music genres with 82% accuracy (arguably human performance), confidently labelled 10 compositions as representative of each genre it has learnt (see the video in the link). By a listening experiment, we found that humans could not recognise any of the genres.


It appears then that our “professor” is like Clever Hans.


We’ve used a similar procedure to make another state of the art music listening system label the same pieces of classical and country music as a variety of other genres. It is hard to hear much difference between them.


Just like horses, “horses” are not all bad


We should note that discovering a “horse” in an algorithm is not necessarily its ticket to the algorithmic glue factory. For one, a “horse” might be completely sufficient to meet the needs of a target application. It also provides an opportunity and mechanism to improve the validity of the system evaluation. And finally, discovering a “horse” provides a way to improve the system itself since it identifies the reasons for its behaviour. For that, I think even Clever Hans is clever enough to tap his hoof once for “Ja”!”

Scooped by Olivier Lartillot!

Spotify: Friend or Foe?

Earlier this year, Spotify bought a Boston-based startup called the Echo Nest, which has developed a form of artificial music intelligence—a kind of A.I. hipster that finds cool music for you. The Echo Nest powers Spotify’s automated radio stations and is also behind an in-house programming tool called Truffle Pig, which can be told to sniff out music with combinations of more than fifty parameters, such as “speechiness” and “acoustic-ness.” Now that the Echo Nest is part of Spotify, its team has access to the enormous amount of data generated by Spotify users which show how they consume music.

Scooped by Olivier Lartillot!

New Coursera course on Audio Signal Processing for Music Applications

Audio Signal Processing for Music Applications is a free online class taught by Xavier Serra of Universitat Pompeu Fabra, Barcelona, and Julius O. Smith of Stanford University
Olivier Lartillot's insight:

In this course you will learn about audio signal processing methodologies that are specific for music and of use in real applications. You will learn to analyse, synthesize and transform sounds using the Python programming language.

Audio signal processing is an engineering field that focuses on the computational methods for intentionally altering sounds, methods that are used in many musical applications.


The teachers have tried to put together a course that is interesting and accessible to people from diverse backgrounds while going deep into several signal processing topics. They focus on the spectral processing techniques relevant for the description and transformation of sounds, developing the basic theoretical and practical knowledge with which to analyze, synthesize, transform and describe audio signals in the context of music applications.
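As a taste of the spectral techniques the course covers, here is a minimal short-time Fourier transform in plain NumPy. The course distributes its own open-source software; this independent sketch just shows the core idea of windowing a signal and analyzing each frame.

```python
import numpy as np

def stft(x, win_size=1024, hop=256):
    """Short-time Fourier transform: slide a window over the signal
    and take the FFT of each frame, giving a time-frequency picture."""
    window = np.hanning(win_size)
    frames = [x[i:i + win_size] * window
              for i in range(0, len(x) - win_size + 1, hop)]
    return np.array([np.fft.rfft(f) for f in frames])

# One second of a 440 Hz tone sampled at 44.1 kHz
sr = 44100
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

spec = stft(tone)
magnitudes = np.abs(spec[0])
peak_hz = np.argmax(magnitudes) * sr / 1024
print(round(peak_hz))  # within one frequency bin (~43 Hz) of 440
```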


The course is based on open software and content. The demonstrations and programming exercises are done using Python under Ubuntu, and the references and materials for the course come from open online repositories. They are also distributing with open licenses the software and materials developed for the course.


Course Syllabus

Week 1: Introduction; basic mathematics 

Week 2: Discrete Fourier transform

Week 3: Fourier transform properties

Week 4: Short-time Fourier transform

Week 5: Sinusoidal model

Week 6: Harmonic model

Week 7: Sinusoidal plus residual modeling

Week 8: Sound transformations

Week 9: Sound/music description

Week 10: Concluding topics; beyond audio signal processing


The course assumes some basic background in mathematics and signal processing. Also, since the assignments are done with the programming language Python, some software programming background in any language is most helpful. 


Each week is structured around 6 types of activities:

Theory: video lectures covering the core signal processing concepts.

Demos: video lectures presenting tools and examples that complement the theory.

Programming: video lectures introducing the needed programming skills (using Python) to implement the techniques described in the theory.

Quiz: questionnaire to review the concepts covered.

Assignment: programming exercises to implement and use the methodologies presented.

Advanced topics: videos and written documents that extend the topics covered.

Scooped by Olivier Lartillot!

Inside Google's Infinite Music Intelligence Machine


In May, Google launched a music service that will challenge Spotify and Pandora for radio domination. We asked Google research scientist Doug Eck how it works.

Olivier Lartillot's insight:


The Beatles have a way of keeping Doug Eck up at night. Specifically, the research scientist at Google grapples with one of the core problems of automated music discovery: with a band whose catalog is as evolutionary and nuanced as The Beatles', how can computers truly understand the artist and recommend relevant music to fans? For humans, detecting the difference is easy. For machines, it's not so simple.

Solving problems like these resolves only part of a larger, more complex equation at play when Google attempts to help its users discover music. Music discovery is a crucial piece of that puzzle and one that's notoriously challenging to lock into place.
In taking its own stab at music recommendation, Google blends technical solutions like machine listening and collaborative filtering with good, old-fashioned human intuition. Employing both engineers and music editors, the service continually tries to understand what people are listening to, why they enjoy it, and what they might want to hear next.

How Google Music Intelligence Works

Eck's team is focused on the technical side of this equation, relying on a dual-sided machine learning methodology. One component of that is collaborative filtering of the variety employed by Netflix and Amazon to recommend horror flicks and toasters. The other involves machine listening. That is, computers "listen" to the audio and try to pick out specific qualities and details within each song.

Collaborative filtering works wonders for the Amazons of the world. But since this type of If-you-like-that-you'll-also-like-this logic works better for kitchen appliances than it does for art, the system needs a way to learn more about the music itself. To teach it, Eck's team leverages Google's robust infrastructure and machine-listening technology to pick apart the granular qualities of each song.
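As an illustration of the collaborative-filtering half of the equation (not Google's actual system), item-item filtering can be sketched in a few lines: two songs are similar if the same users play them. The play counts below are invented.

```python
import numpy as np

# Rows = users, columns = songs; entries = play counts (hypothetical data).
plays = np.array([
    [5, 3, 0, 0],   # user A
    [4, 2, 0, 1],   # user B
    [0, 0, 6, 4],   # user C
])
songs = ["Song 1", "Song 2", "Song 3", "Song 4"]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def most_similar(song_idx):
    # Item-item similarity: compare a song's column of play counts
    # against every other song's column.
    sims = [(cosine(plays[:, song_idx], plays[:, j]), j)
            for j in range(plays.shape[1]) if j != song_idx]
    return songs[max(sims)[1]]

print(most_similar(0))  # Song 2: played by the same users as Song 1
```

Note what the sketch cannot do: it says nothing about how the songs sound, which is exactly why the machine-listening side is needed for art rather than kitchen appliances.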
"By and large, audio-based models are very good at timbre," says Eck. "So they're very good at recognizing instruments, very good at recognizing things like distorted guitars, very good at recognizing things like whether it's a male or female vocalist."

These are precisely the kinds of details that Pandora relies on human, professionally trained musicians to figure out. The Internet radio pioneer has long employed musicologists to listen to songs and help build out a multipoint, descriptive data set designed to place each track into a broader context and appropriately relate it to other music. For Pandora, the results have been extremely valuable, but mapping out this musical intelligence manually doesn't scale infinitely. Thankfully, machine listening has come a long way in recent years. Much like Google indexes the Web, the company is able to index a massive database of audio, mapping the musical qualities found within. Since it's automated, this part of Google's music recommendation technology can be scaled to a much larger set of data.

"If the four of us decided we were going to record a jazz quartet right here and now and we uploaded it to Play Music, our system will be aware that we're talking about that," explains Eck. "By pulling these audio features out of every track that we work with, it gives us a kind of musical vocabulary that we can work with for doing recommendation even if it’s a very long tail."

Indeed, when it comes to music, the tail has never been longer. The world's selection of recorded music was never finite, but today creating and distributing new songs is virtually devoid of friction and financial cost. However much human intelligence Pandora feeds into its algorithm, its Music Genome Project will never be able to keep up and understand everything. That's where machine learning gets a leg up.

The Limits Of Machine Listening

Still, there's a reason Pandora has more than 70 million active listeners and continues to increase its share of overall radio listening time. Its music discovery engine is very good. It might not know about my friend's band on a small Georgia-based record label, but the underlying map of data that Pandora uses to create stations is still incredibly detailed. When I start a radio station based on Squarepusher, an acclaimed but not particularly popular electronic music artist, the songs it plays are spun for very specific reasons. It plays a track by Aphex Twin because it features "similar electronica roots, funk influences, headnodic beats, the use of chordal patterning, and acoustic drum samples." Then, when I skip to the next song, it informs me that, "We're playing this track because it features rock influences, meter complexity, unsyncopated ensemble rhythms, triple meter style, and use of modal harmonies."

Pandora knows this much about these tracks thanks to those aforementioned music experts who sat down and taught it. Automated machine listening, by comparison, can't get quite as specific. At least, not yet.

“It’s very hard and we haven’t solved the problem with a capital S,” says Eck, who has an academic background in automated music analysis. “Nor has anybody else.”

Computers might be able to pick out details about timbre, instruments used, rhythm, and other on-the-surface sonic qualities, but they can only dig so deep.

"You can learn a lot from one second of audio. Certainly you can tell if there’s a female voice there or if there’s distorted guitar there. But what about when we stretch out and look at what the musical phrase is? What’s happening melodically? Where’s this song going? As we move out and have longer time-scale stretches that we’re trying to outline, it becomes very hard to use machines alone to get the answer."
Thanks Algorithms, But The Humans Can Take It From Here.
That's where the good, old-fashioned human beings come in. To help flesh out the music discovery and radio experiences in All Access, Google employs music editors who have an intuition that computers have yet to successfully mimic. Heading up this editor-powered side of the equation is Tim Quirk, a veteran of the online music industry who worked at a now-defunct music service before Napster was a household name.

"Algorithms can tell you what are the most popular tracks in any genre, but an algorithm might not know that "You Don't Miss Your Water" was sort of the first classic, Southern soul ballad in that particular time signature and that it became the template for a decade's worth of people doing the same thing," says Quirk. "That’s sort of arcane human knowledge."

[Well, we could hope computers will discover such knowledge automatically in the future. (Olivier)]

Google's blend of human and machine intelligence is markedly different from Pandora's. Rather than hand-feeding tons of advanced musical knowledge directly into its algorithms, Google mostly keeps the human-curated stuff in its own distinct areas, allowing the computers to do the heavy lifting elsewhere. Quirk and his team of music editors are the ones who define the most important artists, songs and albums in a given genre (of which there are hundreds in Google Play Music).

Quirk's team also creates curated playlists and makes specific, hand-picked music recommendations. To the extent that these manually curated parts of the service influence its users' listening behavior, the human intelligence does find its way back into the algorithms. It just loops back around and takes a longer road to get there.
Google's employees aren't the only people feeding intelligence into this semiautomated music machine. Google is also constantly learning from its users. Like Pandora and its many copycats, Google Play Music's Radio feature has thumbs up and thumbs down buttons, which help inform the way the radio stations work over time.

Some day, computers will be better able to understand not just that I like The Beatles, but why I like The Beatles. They'll know that John is my favorite, which songs on Revolver I skip, and that of all the heavily Beatles-influenced bands in the world, I love Tame Impala, but loathe Oasis. The machines will get smarter, much like a toddler does: by watching and listening. As users spend time with the product and tap buttons, the nuanced details will become more obvious.
Meanwhile, the ability of computers to actually hear what's contained within each song can only improve over time, just as voice recognition continues to do. The promise of Google Play Music is the same thing that made Google successful to begin with: its ability to use massive data to understand who we are and what we want. If anybody can crack the notoriously hard nut of music discovery in a hugely scalable fashion, it's them. Just don't expect them to do it with machines alone.



Scooped by Olivier Lartillot!

Google Student Blog: Getting to Know a PhD

Olivier Lartillot's insight:

Cynthia is a PhD student. Her research is in the field of music information retrieval: she is working on technologies to analyze, organize and present the considerable amount of digital music information. She is particularly interested in making sense of music data by obtaining richer perspectives on it by taking into account information from multiple data sources. These can be recordings of multiple interpretations of the same music piece, but also related information in non-audio modalities, such as videos of performing musicians and textual information from collaborative web resources describing songs and their usage contexts.

Scooped by Olivier Lartillot!

The Future of Music Genres Is Here

We’ve always felt ambivalent about the word “genre” at The Echo Nest. On one hand, it’s the most universal shorthand for classifying music, because everyone has a basic understanding of the big, old...
Olivier Lartillot's insight:

The Echo Nest “top terms,” which are the words most commonly used to describe a piece of music, are “far more granular than the big, static genres of the past. We’ve been maintaining an internal list of dynamic genre categories for about 800 different kinds of music. We also know what role each artist or song plays in its genre (whether they are a key artist for that genre, one of the most commonly played, or an up-and-comer).


The Echo Nest just announced “a bunch of new genre-oriented features, including:

- A list of nearly 800 genres from the real world of music

- Names and editorial descriptions of every genre

- Essential artists from any genre

- Similar genres to any genre

- Verified explainer links to third-party resources when available

- Genre search by keyword

- Ranked genres associated with artists

- Three radio “presets” for each genre: Core (the songs most representative of the genre); In Rotation (the songs being played most frequently in any genre today); and Emerging (up-and-coming songs within the genre).


Where did these genres come from?


“The Echo Nest’s music intelligence platform continuously learns about music. Most other static genre solutions classify music into rigid, hierarchical relationships, but our system reads everything written about music on the web, and listens to millions of new songs all the time, to identify their acoustic attributes.


“This enables our genres to react to changes in music as they happen. To create dynamic genres, The Echo Nest identifies salient terms used to describe music (e.g., “math rock,” “IDM”, etc.), just as they start to appear. We then model genres as dynamic music clusters — groupings of artists and songs that share common descriptors, and similar acoustic and cultural attributes. When a new genre forms, we know about it, and music fans who listen to our customers’ apps and services will be able to discover it right away, too.


“Our approach to genres is trend-aware. That means it knows not only what artists and songs fall into a given genre, but also how those songs and artists are trending among actual music fans, within those genres.


“About 260 of these nearly 800 genres are hyper-regional, meaning that they are tied to specific places. Our genre system sees these forms of music as they actually exist; it can help the curious music fan hear the differences, for instance, between Luk Thung, Benga, and Zim music.”
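The Echo Nest has not published its clustering method, but the idea of "groupings of artists and songs that share common descriptors" can be sketched with invented artists and terms, grouping artists whose descriptor sets overlap strongly:

```python
# Sketch of descriptor-based genre clustering: artists whose descriptor
# sets overlap strongly (Jaccard similarity) land in the same cluster.
# All artists and terms below are invented for illustration.

artists = {
    "Artist A": {"math rock", "angular riffs", "odd meters"},
    "Artist B": {"math rock", "odd meters", "instrumental"},
    "Artist C": {"IDM", "glitch", "synthetic textures"},
    "Artist D": {"IDM", "glitch", "breakbeats"},
}

def jaccard(a, b):
    # Fraction of descriptors the two artists share
    return len(a & b) / len(a | b)

def cluster(artists, threshold=0.4):
    groups = []
    for name, terms in artists.items():
        for group in groups:
            if any(jaccard(terms, artists[other]) >= threshold
                   for other in group):
                group.append(name)
                break
        else:
            groups.append([name])  # no match: a new "genre" forms
    return groups

print(cluster(artists))  # [['Artist A', 'Artist B'], ['Artist C', 'Artist D']]
```

Because a new group appears as soon as an artist matches no existing one, this toy mirrors the claim that "when a new genre forms, we know about it."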

Scooped by Olivier Lartillot!

Brainlike Computers, Learning From Experience

The new computing approach is based on the biological nervous system, specifically on how neurons react to stimuli and connect with other neurons to interpret information.
Olivier Lartillot's insight:

Some excerpts from the New York Times paper:


“The new computing approach, already in use by some large technology companies, is based on the biological nervous system, specifically on how neurons react to stimuli and connect with other neurons to interpret information. It allows computers to absorb new information while carrying out a task, and adjust what they do based on the changing signals.


In coming years, the approach will make possible a new generation of artificial intelligence systems that will perform some functions that humans do with ease: see, speak, listen, navigate, manipulate and control. That can hold enormous consequences for tasks like facial and speech recognition, navigation and planning, which are still in elementary stages and rely heavily on human programming.


“We’re moving from engineering computing systems to something that has many of the characteristics of biological computing.” 


Conventional computers are limited by what they have been programmed to do. Computer vision systems, for example, only “recognize” objects that can be identified by the statistics-oriented algorithms programmed into them. An algorithm is like a recipe, a set of step-by-step instructions to perform a calculation.


Until now, the design of computers was dictated by ideas originated by the mathematician John von Neumann about 65 years ago. Microprocessors perform operations at lightning speed, following instructions programmed using long strings of 1s and 0s. They generally store that information separately in what is known, colloquially, as memory, either in the processor itself, in adjacent storage chips or in higher capacity magnetic disk drives.

The data are shuttled in and out of the processor’s short-term memory while the computer carries out the programmed action. The result is then moved to its main memory.


The new processors consist of electronic components that can be connected by wires that mimic biological synapses. Because they are based on large groups of neuron-like elements, they are known as neuromorphic processors.


They are not “programmed.” Rather, the connections between the circuits are “weighted” according to correlations in data that the processor has already “learned.” Those weights are then altered as data flows into the chip, causing them to change their values and to “spike.” That generates a signal that travels to other components and, in reaction, changes the neural network, in essence programming the next actions much the same way that information alters human thoughts and actions.


“Instead of bringing data to computation as we do today, we can now bring computation to data. Sensors become the computer, and it opens up a new way to use computer chips that can be everywhere.”

The new computers, which are still based on silicon chips, will not replace today’s computers, but will augment them, at least for now.


Many computer designers see them as coprocessors, meaning they can work in tandem with other circuits that can be embedded in smartphones and in the giant centralized computers that make up the cloud. Modern computers already consist of a variety of coprocessors that perform specialized tasks, like producing graphics on your cellphone and converting visual, audio and other data for your laptop.”

No comment yet.
Scooped by Olivier Lartillot!

Shazam-Like Dolphin System ID's Their Whistles: Scientific American Podcast

Olivier Lartillot's insight:

I am glad to see such popularization of research on “melodic” pattern identification that generalizes beyond the musical context and beyond the human species, and also this interesting link to music identification technologies (like Shazam). Before discussing this further, here is how the Scientific American podcast explains, in simple terms, the computational attempt to mimic dolphins' melodic pattern identification abilities:


“Shazam-Like Dolphin System ID's Their Whistles: A program uses an algorithm to identify dolphin whistles similar to that of the Shazam app, which identifies music from databases by changes in pitch over time.

Used to be, if you happened on a great tune on the radio, you might miss hearing what it was. Of course, now you can just Shazam it—let your smartphone listen, and a few seconds later, the song and performer pop up. Now scientists have developed a similar tool—for identifying dolphins.

Every dolphin has a unique whistle.  They use their signature whistles like names: to introduce themselves, or keep track of each other. Mothers, for example, call a stray offspring by whistling the calf's ID.

To tease apart who's saying what, researchers devised an algorithm based on the Parsons code, software that fishes songs out of music databases by tracking changes in pitch over time.

They tested the program on 400 whistles from 20 dolphins. Once a database of dolphin sounds was created, the program identified subsequent dolphins by their sounds nearly as well as humans who eyeballed the whistles' spectrograms.

Seems that in noisy waters, just small bits of key frequency change information may be enough to help Flipper find a friend.”


More precisely, the computer program generates a compact description of each dolphin whistle, indicating how the pitch curve progressively ascends and descends. This yields a description characteristic of each dolphin, making it possible to compare whistle curves and determine which curve belongs to which dolphin.
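The contour description referred to here resembles the Parsons code: each whistle is reduced to a string recording whether the pitch goes up, down, or repeats, and the strings are then compared. A minimal sketch, in which a generic string-similarity measure stands in for the researchers' actual matching algorithm:

```python
import difflib

def parsons_code(pitches):
    """Encode a pitch sequence as its contour: U(p), D(own), R(epeat)."""
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        code.append("U" if cur > prev else "D" if cur < prev else "R")
    return "".join(code)

def contour_similarity(a, b):
    """Similarity of two pitch contours in [0, 1], via matching subsequences."""
    return difflib.SequenceMatcher(None, parsons_code(a), parsons_code(b)).ratio()
```

For example, `parsons_code([5, 7, 7, 6, 8])` gives `"URDU"`; two whistles with identical contours score 1.0 even if their absolute pitches differ.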


To be more precise, however, Shazam does not use this kind of approach to identify music. It does not try to detect melodic lines in the music recorded by the user; instead, it takes a series of several-second snapshots of each song, such that each snapshot contains all the complex sound at that particular moment (with the full polyphony of instruments). A compact description (a “fingerprint”) of each snapshot is produced, indicating the most prominent spectral peaks of the polyphony. Each fingerprint is then compared with those of the songs in the music database, and the identified song is the one whose series of fingerprints best matches the series of fingerprints of the user's query.
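A rough sketch of this snapshot-and-peak idea (greatly simplified: real fingerprinting systems hash constellations of peak pairs and are robust to noise, whereas here each frame is reduced to its single strongest frequency bin):

```python
import numpy as np

def fingerprint(signal, frame=256):
    """Per-frame index of the strongest spectral peak: a crude 'fingerprint'."""
    n = len(signal) // frame
    frames = np.reshape(signal[:n * frame], (n, frame))
    spectra = np.abs(np.fft.rfft(frames, axis=1))
    return np.argmax(spectra, axis=1)          # dominant frequency bin per frame

def match_score(query_fp, song_fp):
    """Best count of aligned frame matches over all offsets of query in song."""
    best = 0
    for off in range(len(song_fp) - len(query_fp) + 1):
        best = max(best, int(np.sum(query_fp == song_fp[off:off + len(query_fp)])))
    return best
```

The database song whose fingerprint series yields the highest match score against the query is returned as the identification.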


Shazam does not model *how* humans identify music. The dolphin whistle comparison program does not model *how* dolphins identify each other. And Shazam and the dolphin whistle ID program do not use similar approaches. But on the other hand, might we assume that dolphins' and humans' abilities to identify auditory patterns (in whistles for dolphins, in music for humans) rely on the same core cognitive processes?

No comment yet.
Scooped by Olivier Lartillot!

The Man Behind the Google Brain: Andrew Ng and the Quest for the New AI | Wired Enterprise |

The Man Behind the Google Brain: Andrew Ng and the Quest for the New AI | Wired Enterprise | | Computational Music Analysis |
There's a theory that human intelligence stems from a single algorithm. The idea arises from experiments suggesting that the portion of your brain dedicated to processing sound from your ears could also handle sight for your eyes.
Olivier Lartillot's insight:

My digest:



There’s a theory that human intelligence stems from a single algorithm. The idea arises from experiments suggesting that the portion of your brain dedicated to processing sound from your ears could also handle sight for your eyes. This is possible only while your brain is in the earliest stages of development, but it implies that the brain is — at its core — a general-purpose machine that can be tuned to specific tasks.


In the early days of artificial intelligence, the prevailing opinion was that human intelligence derived from thousands of simple agents working in concert, what MIT’s Marvin Minsky called “The Society of Mind.” To achieve AI, engineers believed, they would have to build and combine thousands of individual computing modules. One agent, or algorithm, would mimic language. Another would handle speech. And so on. It seemed an insurmountable feat.


A new field of computer science research known as Deep Learning seeks to build machines that can process data in much the same way the brain does, and this movement has extended well beyond academia, into big-name corporations like Google and Apple. Google is building one of the most ambitious artificial-intelligence systems to date, the so-called Google Brain.


This movement seeks to meld computer science with neuroscience — something that never quite happened in the world of artificial intelligence. “I’ve seen a surprisingly large gulf between the engineers and the scientists.” Engineers wanted to build AI systems that just worked, but scientists were still struggling to understand the intricacies of the brain. For a long time, neuroscience just didn’t have the information needed to help improve the intelligent machines engineers wanted to build.


What’s more, scientists often felt they “owned” the brain, so there was little collaboration with researchers in other fields. The end result is that engineers started building AI systems that didn’t necessarily mimic the way the brain operated. They focused on building pseudo-smart systems that turned out to be more like a Roomba vacuum cleaner than Rosie the robot maid from the Jetsons.


Deep Learning is a first step in this new direction. Basically, it involves building neural networks — networks that mimic the behavior of the human brain. Much like the brain, these multi-layered computer networks can gather information and react to it. They can build up an understanding of what objects look or sound like.


In an effort to recreate human vision, for example, you might build a basic layer of artificial neurons that can detect simple things like the edges of a particular shape. The next layer could then piece together these edges to identify the larger shape, and then the shapes could be strung together to understand an object. The key here is that the software does all this on its own — a big advantage over older AI models, which required engineers to massage the visual or auditory data so that it could be digested by the machine-learning algorithm.
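This layered idea can be caricatured in a few lines. Note that in a real deep network both layers would be *learned* from data, as the article stresses; here they are hand-coded purely for illustration:

```python
import numpy as np

def layer1_edges(image):
    """First layer: respond to horizontal intensity changes (a crude edge map)."""
    return np.abs(np.diff(image, axis=1))

def layer2_shape_score(edge_map):
    """Second layer: pool the edge evidence into a single 'shape present' score."""
    return float(edge_map.sum())

image = np.zeros((4, 6))
image[:, 2:4] = 1.0                  # a bright vertical bar: two strong edges
score = layer2_shape_score(layer1_edges(image))
```

Each layer consumes the previous layer's output, so higher layers respond to progressively larger structures, which is the essence of the edge-to-shape-to-object hierarchy described above.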


With Deep Learning, you just give the system a lot of data “so it can discover by itself what some of the concepts in the world are.” Last year, one algorithm taught itself to recognize cats after scanning millions of images on the internet. The algorithm didn’t know the word “cat,” but over time, it learned to identify the furry creatures we know as cats, all on its own.


This approach is inspired by how scientists believe that humans learn. As babies, we watch our environments and start to understand the structure of objects we encounter, but until a parent tells us what it is, we can’t put a name to it.


Deep learning algorithms aren’t yet as accurate — or as versatile — as the human brain, but Ng says this will come.


In 2011, the Deep Learning project was launched at Google, and in recent months, the search giant has significantly expanded this effort, acquiring the artificial intelligence outfit founded by University of Toronto professor Geoffrey Hinton, widely known as the godfather of neural networks.


Chinese search giant Baidu has opened its own research lab dedicated to deep learning, vowing to invest heavy resources in this area. And big tech companies like Microsoft and Qualcomm are looking to hire more computer scientists with expertise in neuroscience-inspired algorithms.


Meanwhile, engineers in Japan are building artificial neural nets to control robots. And together with scientists from the European Union and Israel, neuroscientist Henry Markram is hoping to recreate a human brain inside a supercomputer, using data from thousands of real experiments.


The rub is that we still don’t completely understand how the brain works, but scientists are pushing forward in this as well. The Chinese are working on what they call the Brainnetdome, described as a new atlas of the brain, and in the U.S., the Era of Big Neuroscience is unfolding with ambitious, multidisciplinary projects like President Obama’s newly announced (and much criticized) Brain Research Through Advancing Innovative Neurotechnologies Initiative — BRAIN for short.


If we map out how thousands of neurons are interconnected and “how information is stored and processed in neural networks,” engineers will have a better idea of what their artificial brains should look like. The data could ultimately feed and improve Deep Learning algorithms underlying technologies like computer vision, language analysis, and the voice recognition tools offered on smartphones from the likes of Apple and Google.


“That’s where we’re going to start to learn about the tricks that biology uses. I think the key is that biology is hiding secrets well. We just don’t have the right tools to grasp the complexity of what’s going on.”


Right now, engineers design around these issues, so they skimp on speed, size, or energy efficiency to make their systems work. But AI may provide a better answer. “Instead of dodging the problem, what I think biology could tell us is just how to deal with it….The switches that biology is using are also inherently noisy, but biology has found a good way to adapt and live with that noise and exploit it. If we could figure out how biology naturally deals with noisy computing elements, it would lead to a completely different model of computation.”


But scientists aren’t just aiming for smaller. They’re trying to build machines that do things computers have never done before. No matter how sophisticated algorithms are, today’s machines can’t fetch your groceries or pick out a purse or a dress you might like. That requires a more advanced breed of image intelligence and an ability to store and recall pertinent information in a way that’s reminiscent of human attention and memory. If you can do that, the possibilities are almost endless.


“Everybody recognizes that if you could solve these problems, it’s going to open up a vast, vast potential of commercial value.”

No comment yet.
Scooped by Olivier Lartillot!

Inside Spotify’s Hunt for the Perfect Playlist

Inside Spotify’s Hunt for the Perfect Playlist | Computational Music Analysis |
Spotify is launching a new playlist service called Discover Weekly that uses your data to serve you songs you might like.
Olivier Lartillot's insight:

Most popular Spotify playlists are made by human curators, “but they live and die by data.”


The playlists are made using an “internal Spotify tool called Truffle Pig. Jim Lucchese, CEO of The Echo Nest (which was also acquired by Spotify) refers to Truffle Pig as “Pro Tools for playlists.” It’s part of a version of the Spotify app that’s only available to employees. It lets them build a playlist from almost anything: an artist’s name, a song, a vague adjective or feeling. You tell Truffle Pig you want, say, a twangy alt-country playlist. That’s enough to get started. Then you refine: “Say you want high acousticness with up-tempo tracks that are aggressive up to a certain value. It’ll generate a bunch of candidates, you can listen to them there, and then drop them in and add them to your playlist.”

The Echo Nest’s job within Spotify is to endlessly categorize and organize tracks. The team applies a huge number of attributes to every single song: Is it happy or sad? Is it guitar-driven? Are the vocals spoken or sung? Is it mellow, aggressive, or dancy? On and on the list goes. Meanwhile, the software is also scanning blogs and social networks—ten million posts a day—to see the words people use to talk about music. With all this data combined, The Echo Nest can start to figure out what a “crunk” song sounds like, or what we mean when we talk about “dirty south” music.”
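A hypothetical sketch of how such attribute-based playlist filtering might look (“high acousticness with up-tempo tracks that are aggressive up to a certain value”); the attribute names, values, and thresholds are invented, not The Echo Nest's actual schema:

```python
# Invented example catalog: each track carries per-song attribute scores.
tracks = [
    {"title": "Song A", "acousticness": 0.9, "tempo": 140, "aggressiveness": 0.2},
    {"title": "Song B", "acousticness": 0.3, "tempo": 150, "aggressiveness": 0.1},
    {"title": "Song C", "acousticness": 0.8, "tempo": 90,  "aggressiveness": 0.4},
]

def candidates(tracks, min_acoustic=0.7, min_tempo=120, max_aggressive=0.3):
    """Keep tracks that are acoustic and up-tempo, with aggression below a cap."""
    return [t["title"] for t in tracks
            if t["acousticness"] >= min_acoustic
            and t["tempo"] >= min_tempo
            and t["aggressiveness"] <= max_aggressive]
```

Refining a playlist then amounts to adjusting the thresholds and regenerating the candidate list, much as the Truffle Pig description suggests.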

No comment yet.
Scooped by Olivier Lartillot!

Spotify's Head of Deep Learning Reveals How AI Is Changing the Music Industry

Spotify's Head of Deep Learning Reveals How AI Is Changing the Music Industry | Computational Music Analysis |
The New York Observer talked with Nicola Montecchio, a music information retrieval scientist who is the head of deep learning for Spotify, who detailed how the company is using deep learning to imp...
Olivier Lartillot's insight:

The article itself:

"Music is undergoing drastic changes all of the time. From styles and genres to devices we use to listen and how content is produced, everything music-related is constantly evolving—a phenomenon that is beginning to accelerate with the research and introduction of new technologies.


Most of us are unaware of this behind-the-scenes work, but the industry is inching more and more into the world of artificial intelligence and machine learning. The New York Observer talked with Nicola Montecchio, a music information retrieval scientist who is the head of deep learning for Spotify, who detailed how the company is using deep learning to improve the way we find and listen to music.


What’s your role at Spotify and how is the company utilizing deep learning?

My job here is to apply machine learning techniques to the content of songs. We’re trying to figure out if a song is happy or sad, what are similar sounding songs and things of that sort. The characteristics that we associate with songs are subjective, but we’re trying to infer these subjective characteristics of songs by focusing on the acoustic elements themselves without considering the popularity of the artists.


How does that better the user experience?

It’s figuring out users’ interests by seeing what else they and others are interested in, and it’s been working well.

If a new song comes on the platform from an artist that’s not popular, it would be hard to associate that with you using more traditional methods because no one else is listening. But when we rely on the acoustics of the song, we can make better suggestions and direct users to lesser-known artists.


So, is introducing new music to users part of Spotify’s mission?

I would say so, yes. We try to make you discover new music. If there are some unknown artists that sound good and are similar to what you like, we think you should listen to that.


How were you able to curate music in the past? How successful was it?

We already did this some without deep learning, but this allows us to go more in-depth with the understanding of the song. It’s a bit more flexible in a way. We had an intern last summer who did a really nice job, and his idea was to map the acoustic elements of a song to the listening patterns. He was trying to predict what you will listen to using deep learning. It’s also interesting because you can’t predict the popularity from just the audio, so you’re also predicting something about the humanality [sic] of it.


How has deep learning allowed Spotify to grow?

On my side, I think it brings us a lot in terms of accuracy. Then of course we’re serving this into the recommendation engine. It’s surely enabling us to be more diversified—less tied to popularity and more tied to what the song actually sounds like.


How will this technology change the way we find and listen to music in the future?

It will be more about the sound. You’ll be able to search by the content of the music instead of just text information associated with it.  This means you can search for truly similar sounding music by actually searching the sound itself instead of doing what we currently do—search a title, image or artist. That’s what I think will come to the table that’s significantly better than what’s out there.


How will this affect the music industry?

For sure it’s a way to make some other artists more visible that you might not have heard of otherwise. To say how much listening patterns could change is a little bit of a harder of a question, but it should make the field more even and less biased. It will certainly level the playing field.

No comment yet.
Scooped by Olivier Lartillot!

Genetic Data Tools Reveal How Pop Music Evolved In The US

Genetic Data Tools Reveal How Pop Music Evolved In The US - The Physics arXiv Blog - Medium
And show that The Beatles didn’t start the American music revolution of 1964
Olivier Lartillot's insight:

From the blog post:


"Despite the keen interest in the evolution of pop music, there is little to back up most claims in the form of hard analytical evidence.


In a new study, number crunching techniques developed to understand genomic data have been used to study the evolution of American pop music. The study found an objective way to categorise musical styles and to measure the way these styles change in popularity over time.


The team started with the complete list of US chart topping songs in the form of the US Billboard Hot 100 from 1960 to 2010. To analyse the music itself, they used 30-second segments of more than 80 per cent of these singles — a total of more than 17,000 songs.


They then analysed each segment for harmonic features such as chord changes and for the quality of timbre, whether guitar or piano or orchestra based, for example. In total, they rated each song in one of 8 different harmonic categories and one of 8 different timbre categories.


They assumed that the specific combination of harmonic and timbre qualities determines the genre of music, whether rock, rap, country and so on. However, the standard definitions of music genres also capture non-musical features such as the age and ethnicity of the performers, as in classic rock or Korean pop and so on.


So the team used an algorithmic technique for finding clusters within networks of data to find objective categories of musical genre that depend only on the musical qualities. This technique threw up 13 separate styles of music.
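The study's actual method is community detection on a network of data; as a simplified stand-in, here is a sketch that links songs whose pairwise feature similarity exceeds a threshold and returns the connected components of that graph as “styles” (a much cruder clustering than the modularity-based techniques such work typically uses):

```python
from collections import defaultdict

def style_clusters(similarity, threshold=0.5):
    """Group songs into styles: connect songs whose similarity exceeds the
    threshold, then return the connected components of the resulting graph."""
    songs = list(similarity)
    graph = defaultdict(set)
    for a in songs:
        for b in songs:
            if a != b and similarity[a].get(b, 0) > threshold:
                graph[a].add(b)
                graph[b].add(a)
    seen, clusters = set(), []
    for s in songs:
        if s in seen:
            continue
        stack, comp = [s], set()
        while stack:                      # depth-first walk of one component
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        clusters.append(sorted(comp))
    return clusters
```

Each returned cluster plays the role of one objective “style,” defined only by musical similarity rather than by the age or ethnicity of performers.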


An interesting question is what these styles represent. To find out, the team analysed the tags associated with each song on the Last-FM music discovery service. Using a technique from bioinformatics called enrichment analysis, they searched for tags that were more commonly associated with songs in each music style and then assumed that these gave a sense of the musical genres involved.


For example, they found that style 1 was associated with soul tags, style 2 with hip hop, style 3 with country music and easy listening, style 4 with jazz and blues and so on.


Finally, they plotted the popularity of each style over time.


The data allows them to settle some long standing debates among connoisseurs of popular music. One question is whether various practices in the music industry have led to a decline in the cultural variety of new music.


To study this issue, they developed several measures of diversity and tracked how they changed over time. “We found that although all four evolve, two — diversity and disparity — show the most striking changes, both declining to a minimum around 1984, but then rebounding and increasing to a maximum in the early 2000s.”"
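The study defines its own four measures; as a generic illustration of the idea, one simple measure of stylistic diversity is the Shannon entropy of the style distribution among a given year's chart entries:

```python
import math
from collections import Counter

def style_diversity(style_labels):
    """Shannon entropy (in bits) of the style distribution for one year's
    chart entries: higher means hits are spread more evenly across styles."""
    counts = Counter(style_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Tracking such a quantity year by year is how one would observe the kind of decline-and-rebound pattern the authors report.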

No comment yet.
Rescooped by Olivier Lartillot from Digital Music Market!

Spotify’s Secret Weapon: The Echo Nest

Spotify’s Secret Weapon: The Echo Nest | Computational Music Analysis |
By Christopher D Amico from Berklee College of Music's Music Business Journal. Spotify announced in March the acquisition of The Echo Nest, the industry’s leading music intelligence company. The deal signals the rising importance of big data in the music industry. Founded by MIT Media Lab doctoral students Tristan Jehan and Brian Whitman, The Echo Nest provided intelligence to some of the world’s leading music services including Clear Channel’s iHeart radio, Rdio, SiriusXM, and social media networks such as Foursquare, MTV, Twitter, and Yahoo. This might change as the company moves away from being an open source platform, useful to...

Olivier Lartillot's insight:


Spotify announced in March the acquisition of The Echo Nest, the industry’s leading music intelligence company. The deal signals the rising importance of big data in the music industry. Founded by MIT Media Lab doctoral students Tristan Jehan and Brian Whitman, The Echo Nest provided intelligence to some of the world’s leading music services including Clear Channel’s iHeart radio, Rdio, SiriusXM, and social media networks such as Foursquare, MTV, Twitter, and Yahoo.  


Tristan Jehan earned his doctorate in Media Arts and Sciences from MIT in 2005. His academic work combined machine listening and machine learning technologies in teaching computers how to hear and make music. He first earned a Master of Science in Electrical Engineering and Computer Science from the University of Rennes in France, later working on music signal parameter extraction at the Center for New Music and Audio Technologies at U.C. Berkeley. He has worked with leading research and development labs in the U.S. and France as a software and hardware engineer in areas of machine listening and audio analysis.


For purposes of analysis and recommendation, songs are not taken whole but rather are broken down into specific attributes, qualities, and even segments. Acoustic analysis has a major role in the company’s algorithms when it decides what to play next. Listeners expect smooth transitions between songs in playlists, so this involves, in part, the analysis of tempo, key, and overall genre.

By dissecting these particulars, The Echo Nest can both create coherent playlists and applications with which listeners can manipulate music. The latter is especially important,  for new consumer devices will inevitably hit the marketplace soon.


Paul Lamere, one of their top software developers, has built several of The Echo Nest’s popular web applications. ‘Girl Talk in a Box’ allows interaction with a user’s favorite song by speeding it up, skipping beats, playing it backwards, swinging it, and more. ‘The Infinite Jukebox’, on the other hand, will generate a never-ending, ever-changing version of an MP3 song, which it breaks into beats: at every beat there’s a chance that it will jump to a different part of the song that happens to sound very similar to the current beat.
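The Infinite Jukebox mechanic can be sketched as a random walk over beat indices, where `jump_links` (hypothetical here, since the real application derives these links from acoustic beat similarity) maps a beat to other beats that sound like it:

```python
import random

def infinite_playback(n_beats, jump_links, n_steps, p_jump=0.3, seed=0):
    """Random walk over beat indices: usually advance to the next beat, but
    sometimes jump to a similar-sounding beat, so playback never has to end."""
    rng = random.Random(seed)
    path, beat = [], 0
    for _ in range(n_steps):
        path.append(beat)
        if beat in jump_links and rng.random() < p_jump:
            beat = rng.choice(jump_links[beat])   # jump to a similar beat
        else:
            beat = (beat + 1) % n_beats           # otherwise play straight on
    return path
```

Because every jump lands on a beat that sounds like the current one, the seams are (ideally) inaudible, which is what makes the endless version listenable.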

These two applications are for the fun user market and, perhaps, DJs. There is much more behind the scenes at a pro level, such as the use of Spotify’s entire range of streaming data to identify, for example, where listener attention wanes during a song — is it the extended drum or guitar solo, the weak chorus, or what? There has never been such measurable tracking of listening habits and musical tastes, and in a digital world of 0s and 1s, data reduction is more necessary than ever.

No comment yet.
Scooped by Olivier Lartillot!

"Deep Listening" attracts the interest of Google, Spotify, Pandora

"Deep Listening" attracts the interest of Google, Spotify, Pandora | Computational Music Analysis |
Deep learning amounts to one of those technologies that several companies could start to implement in the future, in order to improve music streaming.
Olivier Lartillot's insight:



Google, Pandora, and Spotify have recently hired deep learning experts. This branch of A.I. involves training systems called “artificial neural networks” with terabytes of information derived from images, text, and other inputs. It then presents the systems with new information and receives inferences about it in response. A neural network for a music-streaming service could recognize patterns like chord progressions in music without needing music experts to direct machines to look for them. Then it could introduce a listener to a song, album, or artist in accord with their preferences.


The new wave of attention leads back to an academic paper where Ph.D. students Sander Dieleman and Aäron van den Oord collaborated with professor Benjamin Schrauwen to make convolutional neural networks (CNNs) pick up attributes of songs, rather than using them to observe features in images, as engineers have done for years. The trio found that their model “produces sensible recommendations.” What’s more, their experiments showed the system “significantly outperforming the traditional approach.”


The paper captured the imagination of academics who work with music and deep learning wonks as well. Microsoft researchers even cited the paper in a recent overview of the deep learning field.


Deep learning stands out from the recommendation systems in place at Spotify, which uses more traditional data analysis. And down the line, it could provide for improvements in key metrics. Spotify currently recommends songs using technology from Echo Nest, which Spotify ended up buying this year. Echo Nest gathers data using two systems: analysis of text on the Internet about specific music, as well as acoustic analysis of the songs themselves. The latter entails directing machines to listen for certain qualities in songs, like tempo, volume, and key.

We “apply knowledge about music listening in general and music theory and try to model every step of the way musicians actually perceive first and then understand and analyze music,” said Tristan Jehan, a cofounder of Echo Nest and now principal scientist at Spotify. That system required lots of domain-specific input on the part of the people who built it.

But the deep learning approach, while also complex, is completely different. “Sander [Dieleman] is just taking a waveform and assumes we don’t know that stuff, but the machine can derive everything, more or less. That’s why it’s a very generic model, and it has a lot of potential.” The idea is to predict what songs listeners might like, even when usage data isn’t available.


The interest goes beyond Spotify. Even Pandora, known for its use of humans in its process of finding songs to play, has been exploring the technique.

Pandora’s musicologists identify attributes in songs based on their knowledge of music. The end product is data that they can feed into complex algorithms — but it fundamentally depends on human beings. The human-generated data feeds into a system befitting a company with a $3.86 billion market cap. Pandora’s data centers retain an arsenal of more than 50 recommendation systems. “No one of these approaches is optimal for every station of every user of Pandora.” 


Meanwhile, deep learning has come in handy for a wide variety of purposes at Google, and employees certainly are investigating its applications in a music-streaming context. “Deep learning represents a complete revolution, literally a revolution, in how machine learning is done.” The trouble is, deep learning on its own might do a good job of detecting similarities among songs, but maximizing outcomes might mean drawing on several kinds of data other than the raw content. “What we see from these deep-learning models, including the best of best we’ve seen is there’s still a lot of noise out there in these models.”

And so deep learning might not be a sort of drop-in replacement for music streaming. It can be another tool, and perhaps not only for determining which song to play next. Its capabilities could go beyond that.


“What I do see is that deep learning is allowing us to better understand music and allow us to actually better understand what music is.”

No comment yet.
Scooped by Olivier Lartillot!

Machine Learning Algorithm Studying Fine Art Paintings Sees Things Art Historians Had Never Noticed

Machine Learning Algorithm Studying Fine Art Paintings Sees Things Art Historians Had Never Noticed | Computational Music Analysis |
Artificial intelligence reveals previously unrecognised influences between great artists
Olivier Lartillot's insight:

taken from the article:



The task of classifying pieces of fine art is hugely complex. When examining a painting, an art expert can usually determine its style, its genre, the artist and the period to which it belongs. Art historians often go further by looking for the influences and connections between artists, a task that is even trickier.


So the possibility that a computer might be able to classify paintings and find connections between them at first glance seems laughable. And yet, that is exactly what Babak Saleh and pals have done at Rutgers University in New Jersey.


These guys have used some of the latest image processing and classifying techniques to automate the process of discovering how great artists have influenced each other. They have even been able to uncover influences between artists that art historians have never recognised until now.


The way art experts approach this problem is by comparing artworks according to a number of high-level concepts such as the artist’s use of space, texture, form, shape, colour and so on. Experts may also consider the way the artist uses movement in the picture, harmony, variety, balance, contrast, proportion and pattern. Other important elements can include the subject matter, brushstrokes, meaning, historical context and so on. Clearly, this is a complex business.

So it is easy to imagine that the limited ability computers have for analysing two-dimensional images would make this process more or less impossible to automate. But Saleh and co show how it can be done.


At the heart of their method is a new technique for classifying pictures according to the visual concepts that they contain. These concepts are called classemes and include everything from simple object descriptions such as duck, frisbee, man, or wheelbarrow, to shades of colour, to higher-level descriptions such as dead body, body of water, walking and so on.


Comparing images is then a process of comparing the words that describe them, for which there are a number of well-established techniques.
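One well-established way to compare such word descriptions is cosine similarity over concept vectors; a minimal sketch, with an invented classeme vocabulary standing in for the thousands the paper uses:

```python
import math

def classeme_vector(concepts, vocabulary):
    """Binary vector marking which classemes were detected in a painting."""
    return [1.0 if c in concepts else 0.0 for c in vocabulary]

def cosine(u, v):
    """Cosine similarity of two vectors: 1.0 for identical directions."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

vocab = ["man", "water", "horse", "dead body", "walking"]
p1 = classeme_vector({"man", "water", "walking"}, vocab)
p2 = classeme_vector({"man", "walking"}, vocab)
```

Two paintings sharing many detected classemes score close to 1.0; paintings with disjoint concepts score 0.0, giving a simple numeric basis for the similarity searches described here.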


Saleh and co apply this approach to over 1700 paintings by 66 artists working in 13 different styles. Together, these artists cover the time period from the early 15th century to the late 20th century. To create a ground truth against which to measure their results, they also collate expert opinions on which of these artists have influenced the others.


For each painting, they limit the number of concepts and points of interest generated by their method to 3000 in the interests of efficient computation. This process generates a list of describing words that can be thought of as a kind of vector. The task is then to look for similar vectors using natural language techniques and a machine learning algorithm.


Determining influence is harder though since influence is itself a difficult concept to define. Should one artist be deemed to influence another if one painting has a strong similarity to another? Or should there be a number of similar paintings and if so how many?

So Saleh and co experiment with a number of different metrics. They end up creating two-dimensional graphs with metrics of different kinds on each axis and then plotting the position of all of the artists in this space to see how they are clustered.


The results are interesting. In many cases, their algorithm clearly identifies influences that art experts have already found. The algorithm is also able to identify individual paintings that have influenced others. Most impressive of all is the link the algorithm makes between Frederic Bazille’s Studio 9 Rue de la Condamine (1870) and Norman Rockwell’s Shuffleton’s Barber Shop (1950). “After browsing through many publications and websites, we concluded, to the best of our knowledge, that this comparison has not been made by an art historian before.”


Of course, Saleh and co do not claim that this kind of algorithm can take the place of an art historian. After all, the discovery of a link between paintings in this way is just the starting point for further research about an artist’s life and work.

But it is a fascinating insight into the way that machine learning techniques can throw new light on a subject as grand and well studied as the history of art.


No comment yet.
Scooped by Olivier Lartillot!

Computer becomes a bird enthusiast

Computer becomes a bird enthusiast | Computational Music Analysis |
Program can distinguish among hundreds of species in recorded birdsongs
Olivier Lartillot's insight:


If you’re a bird enthusiast, you can pick out the “chick-a-DEE-dee” song of the Carolina chickadee with just a little practice. But if you’re an environmental scientist faced with parsing thousands of hours of recordings of birdsongs in the lab, you might want to enlist some help from your computer. A new approach to automatic classification of birdsong borrows techniques from human voice recognition software to sort through the sounds of hundreds of species and decides on its own which features make each one unique.


Typically, scientists build one computer program to recognize one species, and then start all over for another species. Training a computer to recognize lots of species in one pass is “a challenge that we’re all facing.”


That challenge is even bigger in the avian world, says Dan Stowell, a computer scientist at Queen Mary University of London who studied human voice analysis before turning his attention to the treetops. “I realized there are quite a lot of unsolved problems in birdsong.” Among the biggest issues: There are hundreds of species with distinct and complex calls—and in tropical hotspots, many of them sing all at once.


Most methods for classifying birdsong rely on a human to define which features separate one species from another. For example, if researchers know that a chickadee’s tweet falls within a predictable range of frequencies, they can program a computer to recognize sounds in that range as chickadee-esque. The computer gets better and better at deciding how to use these features to classify a new sound clip, based on “training” rounds where it examines clips with the species already correctly labeled.


In the new paper, Stowell and his Queen Mary colleague, computer scientist Mark Plumbley, used a different approach, known as unsupervised training. Instead of telling the computer which features of a birdsong are going to be important, they let it decide for itself, so to speak. The computer has to figure out “what are the jigsaw pieces” that make up any birdsong it hears. For example, some of the jigsaw pieces it selects are split-second upsweeps or downsweeps in frequency—the sharp pitch changes that make up a chirp. After seeing correctly labeled examples of which species produce which kinds of sounds, the program can spit out a list—ranked in order of confidence—of the species it thinks are present in a recording.
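The unsupervised idea can be sketched roughly as follows. Everything here is our illustrative assumption, not the authors' actual pipeline: a tiny k-means stands in for the feature learning, two synthetic "species" stand in for real recordings, and the "jigsaw pieces" are cluster centroids over spectral frames.

```python
# A minimal, hypothetical sketch of unsupervised feature learning:
# cluster spectral frames into "jigsaw pieces", then describe each
# recording as a histogram of which pieces it uses.
import numpy as np

def learn_pieces(frames, k=4, iters=20, seed=0):
    """Tiny k-means: frames is (n, d); returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    centroids = frames[rng.choice(len(frames), k, replace=False)]
    for _ in range(iters):
        # assign each frame to its nearest centroid
        d = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = frames[labels == j].mean(0)
    return centroids

def histogram_features(frames, centroids):
    """Represent a recording by how often each learned piece occurs."""
    d = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    hist = np.bincount(d.argmin(1), minlength=len(centroids)).astype(float)
    return hist / hist.sum()

# Synthetic "spectrogram frames" from two pretend species:
# one favours upsweeps, the other downsweeps.
rng = np.random.default_rng(1)
upsweeps = rng.normal([0.0, 1.0], 0.1, size=(50, 2))
downsweeps = rng.normal([1.0, 0.0], 0.1, size=(50, 2))
frames = np.vstack([upsweeps, downsweeps])

centroids = learn_pieces(frames, k=2)
print(histogram_features(upsweeps, centroids))   # mass on one piece
print(histogram_features(downsweeps, centroids)) # mass on the other
```

The histograms would then be fed, together with correctly labeled examples, to an ordinary supervised classifier; only the feature discovery itself is unsupervised.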

Their unsupervised approach performed better than the more traditional methods of classification—those based on a set of predetermined features.


The new system’s accuracy fell short of beating the top new computer programs that analyzed the same data sets for the annual competition. But the new system deserves credit for applying unsupervised computer learning to the complex world of birdsong for the first time. This approach could be combined with other ways of processing and classifying sound, because it can squeeze out some info that other techniques may miss.


Eighty-five percent accuracy on a choice between more than 500 calls and songs is impressive and shows both the biological community and the computer community what you can do with these large sound archives.


No comment yet.
Scooped by Olivier Lartillot!

Different Brain Regions Handle Different Musics

Different Brain Regions Handle Different Musics | Computational Music Analysis |
Functional MRI of the listening brain found that different regions become active when listening to different types of music, and to instrumental versus vocal pieces. Allie Wilkinson reports.
Olivier Lartillot's insight:

"Computer algorithms were used to identify specific aspects of the music, which the researchers were able to match with specific, activated brain areas. The researchers found that vocal and instrumental music get treated differently. While both hemispheres of the brain deal with musical features, the presence of lyrics shifts the processing of musical features to the left auditory cortex.

These results suggest that the brain’s hemispheres are specialized for different kinds of sound processing. A finding revealed by what you might call instrumental analysis."

No comment yet.
Scooped by Olivier Lartillot!

Listen to Pandora, and It Listens Back

Listen to Pandora, and It Listens Back | Computational Music Analysis |
The Internet radio service has started to mine user data for the best ways to target advertising. It can deconstruct your song choices to predict, for example, your political party of choice.
Olivier Lartillot's insight:

“After years of customizing playlists to individual listeners by analyzing components of the songs they like, then playing them tracks with similar traits, the company has started data-mining users’ musical tastes for clues about the kinds of ads most likely to engage them.”

No comment yet.
Scooped by Olivier Lartillot!

“Art, sciences, humanities” theme in Futurium project by European Commission

“Art, sciences, humanities” theme in Futurium project by European Commission | Computational Music Analysis |


Art practice will gain a whole new status and role in future societies. Creativity will be key to harness the new possibilities offered by science and technology, and by the hyper-connected environments that will surround us, in useful directions. Art, science and humanities will connect to help boost this wave of change and creativity in Europe.


Olivier Lartillot's insight:

Here is first of all a bit of background related to this Futurium project from the European Commission:

("Why your vote is crucial")


“If you are interested in policy-making, this is the right place to be! Have a say on eleven compelling themes that will likely shape policy debates in the coming few decades!

They are a synthesis of more than 200 futures co-created by hundreds of "futurizens", including young thinkers as well as renowned scientists from different disciplines, in brainstorming sessions held both online and at actual events all around Europe.

The themes include many insights on how policy-making could evolve in the near future. They can potentially help to guide future policy choices or to steer the direction of research funding; for instance, because they cast new light on the sweeping changes that could occur in areas like jobs and welfare; also by furthering our understanding of new routes to the greater empowerment of human beings; and by exploring the societal impacts of the emergence of super-centenarians.

Everyone can now provide feedback and rate the relevance and timing of the themes.

Which one has the greatest impact? When will these themes become relevant?

Vote and help shape the most compelling options for future policies!”


Below is the theme “Art, sciences, humanities”. All these ideas seem to have important repercussions for music research. It would be splendid to see such ideals having an impact on future European research policies. So if you support these ideas, please vote for this theme in the poll, which closes at the end of the week.


“The challenges facing humanity are revealing themselves as increasingly global and highly interconnected. The next few decades will give us the tools to start mastering this complexity in terms of a deeper understanding, but also in terms of policy and action with more predictability of impacts.

This will result from a combination of thus far unseen Big Data from various sources of evidence (smart grids, mobility data, sensor data, socio-economic data) along with the rise of dynamical modelling and new visualisation, analysis, and synthesis techniques (like narrative). It will also rely on a new alliance between science and society.

The virtualisation of the scientific process and the advent of social networks will allow every scientist to join forces with others in the open global virtual laboratory.  Human performance enhancement and embeddable sensors will enable scientists to perceive and observe processes in the real world in new ways. New ICT tools will allow better understanding of the social processes underlying all societal actions.

Digital games will increasingly be used as training grounds for developing worlds that work – from testing new systems of governance, to new systems of economy, medical and healing applications, industrial applications, educational systems and models – across every aspect of life, work, and culture.

Digital technologies will also empower people to co-create their environments, the products they buy, the science they learn, and the art they enjoy.  Digital media will break apart traditional models of art practice, production, and creativity, making production of previously expensive art forms like films affordable to anyone.

The blurring boundaries between artist and audience will completely disappear as audiences increasingly ‘applaud’ a great work by replying with works of their own, which the originating artist will in turn build upon for new pieces.  Digital media creates a fertile space for a virtuous circle of society-wide creativity and art production.

Art practice will gain a whole new status and role in future societies. Creativity will be key to harness the new possibilities offered by science and technology, and by the hyper-connected environments that will surround us, in useful directions. Art, science and humanities will connect to help boost this wave of change and creativity in Europe.

Key Issues

•How do we engage policy makers and civic society throughout the process of gathering data and analysing evidence on global systems? How do we cross-fertilise sciences, humanities and art?

•How do we ensure reward and recognition in a world of co-creation where everyone can be a scientist or an artist from his/her own desktop? How do we deal with ownership, responsibility and liability?

•How do we keep scientific standards alive as peer-reviewed research and quality standards are challenged by the proliferation of open-access publication? How do we assure the quality and credibility of data and models?

•How do we channel the force of creativity into areas of society that are critical but often slow to change, like healthcare, education, etc.?

•How do we ensure universal access and competency with emerging digital and creative technologies? Greater engagement of citizens in science and the arts? How do we disseminate learning about creativity and the arts to currently underserved populations?

•Equitable benefit distribution: how do we ensure that the benefits of scientific discoveries and innovations are distributed evenly in society?

•Clear, effective communication, across multiple languages: how do we communicate insights from complex systems analyses to people who were not participants in the process in ways that create value shifts and behavioural changes to achieve solutions to global issues?

•Can the development of new narratives and metaphors make scientific results accessible to all humanity to reframe global challenges?

•Can the virtualisation of research and innovation lifecycles, the multidisciplinary collaboration and the cross fertilisation with arts and humanities help improve the impact of research?

•Transformation of education: how might the roles of schools and professional educators evolve in the light of the science and art revolution? What might be the impact on jobs and productivity?

•How do we respond to the increasing demand for data scientists and data analysts?

•How do we cope with unintended and undesirable effects of pervasive digitization of society such as media addictions, IPR and authenticity, counterfeiting, plagiarism, life history theft? How do we build trust in both artists and audiences?

•How do we ensure that supercomputing, simulation and big data are not invasive to privacy and support free will and personal aspirations?

•Can crowd-financing platforms for art initiatives balance the roles in current artistic economies (e.g. arts granting agencies, wealthy patrons)?

•How do we harness digital gaming technologies, and developments in live gaming, to allow users to create imagined worlds that empower them and the communities they live within?”

anita sánchez's curator insight, December 1, 2013 7:29 PM

Very interesting...:)!!!



Olivier Lartillot's comment, December 2, 2013 2:00 AM
Thanks for rescooping, Anita. Please note that it is no longer possible to cast a vote in the Futurium poll, which ended 2 days ago, unfortunately.

Scooped by Olivier Lartillot!

Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It - Wired Science

Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It - Wired Science | Computational Music Analysis |
Olivier Lartillot's insight:

[ Note from curator: Wired already wrote an article about Carlsson and his topological data analysis method 4 years ago.

There are interesting critical comments about this article in Slashdot:

Olivier ]


“It is not sufficient to simply collect and store massive amounts of data; they must be intelligently curated, and that requires a global framework. “We have all the pieces of the puzzle — now how do we actually assemble them so we can see the big picture? You may have a very simplistic model at the tiny local scale, but calculus lets you take a lot of simple models and integrate them into one big picture.” Similarly, modern mathematics — notably geometry — could help identify the underlying global structure of big datasets.


Gunnar Carlsson, a mathematician at Stanford University, is representing cumbersome, complex big data sets as a network of nodes and edges, creating an intuitive map of data based solely on the similarity of the data points; this uses distance as an input that translates into a topological shape or network. The more similar the data points are, the closer they will be to each other on the resulting map; the more different they are, the further apart they will be on the map. This is the essence of topological data analysis (TDA).
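The basic node-and-edge construction can be made concrete in a few lines. This is only the simplest flavour of the idea (the 1-skeleton of what topologists call a Vietoris-Rips complex), not Carlsson's full method; the points and threshold below are invented:

```python
# A hedged sketch: turn pairwise distances into a graph by connecting
# points that lie closer than a chosen threshold.
import math

def similarity_graph(points, threshold):
    """Return edges (i, j) between points closer than threshold."""
    edges = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < threshold:
                edges.append((i, j))
    return edges

# Two tight clusters: edges form within clusters, not between them,
# so the cluster structure is visible in the graph alone.
points = [(0, 0), (0.1, 0), (0, 0.1),     # cluster A
          (5, 5), (5.1, 5), (5, 5.1)]     # cluster B
edges = similarity_graph(points, threshold=1.0)
print(edges)  # only within-cluster pairs
```

The resulting graph is exactly the kind of "map of data based solely on similarity" described above: nearby data points end up connected, distant ones do not.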


TDA is an outgrowth of machine learning, a set of techniques that serves as a standard workhorse of big data analysis. Many of the methods in machine learning are most effective when working with data matrices, like an Excel spreadsheet, but what if your data set doesn’t fit that framework? “Topological data analysis is a way of getting structured data out of unstructured data so that machine-learning algorithms can act more directly on it.”


As with Euler’s bridges, it’s all about the connections. Social networks map out the relationships between people, with clusters of names (nodes) and connections (edges) illustrating how we’re all connected. There will be clusters relating to family, college buddies, workplace acquaintances, and so forth. Carlsson thinks it is possible to extend this approach to other kinds of data sets as well, such as genomic sequences.”

[… and music?!]


 “One can lay the sequences out next to each other and count the number of places where they differ,” he explained. “That number becomes a measure of how similar or dissimilar they are, and you can encode that as a distance function.”
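That distance function is easy to make concrete; the count of mismatched positions is known as a Hamming distance. A small sketch with made-up sequences:

```python
# Carlsson's sequence-comparison example as code: count the positions
# where two equal-length sequences differ (a Hamming distance). The
# sequences here are invented for illustration.
def hamming_distance(a, b):
    """Number of positions where sequences a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("GATTACA", "GACTATA"))  # 2 mismatched positions
```

Any such distance function can serve as the input that topological data analysis translates into a shape or network.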


The idea behind topological data analysis is to reduce large, raw data sets of many dimensions to a compressed representation in lower dimensions, without sacrificing the most relevant topological properties. Ideally, this will reveal the underlying shape of the data. For example, a sphere technically exists in every dimension, but we can perceive only the three spatial dimensions. However, there are mathematical glasses through which one can glean information about these higher-dimensional shapes, Carlsson said. “A shape is an infinite number of points and an infinite amount of distances between those points. But if you’re willing to sacrifice a little roundness, you can represent [a circle] by a hexagon with six nodes and six edges, and it’s still recognizable as a circular shape.”


That is the basis of the proprietary technology Carlsson offers through his start-up venture, Ayasdi, which produces a compressed representation of high dimensional data in smaller bits, similar to a map of London’s tube system. Such a map might not accurately represent the city’s every last defining feature, but it does highlight the primary regions and how those regions are connected. In the case of Ayasdi’s software, the resulting map is not just an eye-catching visualization of the data; it also enables users to interact directly with the data set the same way they would use Photoshop or Illustrator. “It means we won’t be entirely faithful to the data, but if that set at lower representations has topological features in it, that’s a good indication that there are features in the original data also.”


Topological methods are a lot like casting a two-dimensional shadow of a three-dimensional object on the wall: they enable us to visualize a large, high-dimensional data set by projecting it down into a lower dimension. The danger is that, as with the illusions created by shadow puppets, one might be seeing patterns and images that aren’t really there.


It is so far unclear when TDA works and when it might not. The technique rests on the assumption that a high-dimensional big data set has an intrinsic low-dimensional structure, and that it is possible to discover that structure mathematically. Recht believes that some data sets are intrinsically high in dimension and cannot be reduced by topological analysis. “If it turns out there is a spherical cow lurking underneath all your data, then TDA would be the way to go,” he said. “But if it’s not there, what can you do?” And if your dataset is corrupted or incomplete, topological methods will yield similarly flawed results.


Emmanuel Candes, a mathematician at Stanford University, and his then-postdoc, Justin Romberg, were fiddling with a badly mangled image on his computer, the sort typically used by computer scientists to test imaging algorithms. They were trying to find a method for improving fuzzy images, such as the ones generated by MRIs when there is insufficient time to complete a scan. On a hunch, Candes applied an algorithm designed to clean up fuzzy images, expecting to see a slight improvement. What appeared on his computer screen instead was a perfectly rendered image. Candes compares the unlikeliness of the result to being given just the first three digits of a 10-digit bank account number, and correctly guessing the remaining seven digits. But it wasn’t a fluke. The same thing happened when he applied the same technique to other incomplete images.


The key to the technique’s success is a concept known as sparsity, which usually denotes an image’s complexity, or lack thereof. It’s a mathematical version of Occam’s razor: While there may be millions of possible reconstructions for a fuzzy, ill-defined image, the simplest (sparsest) version is probably the best fit. Out of this serendipitous discovery, compressed sensing was born. With compressed sensing, one can determine which bits are significant without first having to collect and store them all.
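The Occam's-razor principle can be illustrated with a toy. This brute-force search over small supports is not a practical compressed-sensing algorithm (real methods use tractable convex optimization); the problem sizes and data are invented:

```python
# Toy illustration of the sparsity principle: among all signals
# consistent with a few linear measurements, prefer the one with the
# fewest nonzero entries.
import itertools
import numpy as np

def sparse_recover(A, y, max_nonzeros=2, tol=1e-8):
    """Find the sparsest x with A @ x = y by trying small supports."""
    n = A.shape[1]
    for k in range(1, max_nonzeros + 1):
        for support in itertools.combinations(range(n), k):
            sub = A[:, support]
            coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
            if np.linalg.norm(sub @ coef - y) < tol:
                x = np.zeros(n)
                x[list(support)] = coef
                return x
    return None

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 8))     # only 4 measurements of an 8-dim signal
x_true = np.zeros(8)
x_true[3] = 2.0                 # the true signal is 1-sparse
y = A @ x_true

x_hat = sparse_recover(A, y)
print(np.allclose(x_hat, x_true))  # True: sparsity resolves the ambiguity
```

The system is underdetermined (infinitely many x satisfy A @ x = y), yet asking for the sparsest consistent solution picks out the right one.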


This approach can even be useful for applications that are not, strictly speaking, compressed sensing problems, such as the Netflix prize. In October 2006, Netflix announced a competition offering a $1 million grand prize to whoever could improve the filtering algorithm for their in-house movie recommendation engine, Cinematch. An international team of statisticians, machine learning experts and computer engineers claimed the grand prize in 2009, but the academic community in general also benefited, since they gained access to Netflix’s very large, high quality data set. Recht was among those who tinkered with it. His work confirmed the viability of applying the compressed sensing approach to the challenge of filling in the missing ratings in the dataset.


Cinematch operates by using customer feedback: Users are encouraged to rate the films they watch, and based on those ratings, the engine must determine how much a given user will like similar films. The dataset is enormous, but it is incomplete: on average, users only rate about 200 movies, out of nearly 18,000 titles. Given the enormous popularity of Netflix, even an incremental improvement in the predictive algorithm results in a substantial boost to the company’s bottom line. Recht found that he could accurately predict which movies customers might be interested in purchasing, provided he saw enough products per person. Between 25 and 100 products were sufficient to complete the matrix.
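One way to see why low-rank structure makes the missing ratings recoverable at all: if the matrix were exactly rank 1, every missing entry would be pinned down by observed ones. This is a toy identity, not Recht's actual method, and the ratings below are invented:

```python
# For a rank-1 matrix M (M[i][j] = u[i] * v[j]), a missing entry is
# determined by three observed ones: M[i][j] = M[i][k] * M[l][j] / M[l][k].
def fill_rank1(M, i, j, l, k):
    """Predict the missing entry M[i][j] of a rank-1 matrix."""
    return M[i][k] * M[l][j] / M[l][k]

# users x movies; None marks the unrated movie
ratings = [[4.0, 5.0, None],
           [8.0, 10.0, 12.0],
           [12.0, 15.0, 18.0]]

predicted = fill_rank1(ratings, i=0, j=2, l=1, k=0)
print(predicted)  # 6.0, consistent with the rank-1 pattern
```

Real rating matrices are only approximately low rank and far sparser, so practical matrix completion solves an optimization problem instead, but the same redundancy is what makes 25 to 100 ratings per person sufficient.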


“We have shown mathematically that you can do this very accurately under certain conditions by tractable computational techniques,” Candes said, and the lessons learned from this proof of principle are now feeding back into the research community.


Recht and Candes may champion approaches like compressed sensing, while Carlsson and Coifman align themselves more with the topological approach, but fundamentally, these two methods are complementary rather than competitive. There are several other promising mathematical tools being developed to handle this brave new world of big, complicated data. Vespignani uses everything from network analysis — creating networks of relations between people, objects, documents, and so forth in order to uncover the structure within the data — to machine learning, and good old-fashioned statistics.


Coifman asserts the need for an underlying global theory on a par with calculus to enable researchers to become better curators of big data. In the same way, the various techniques and tools being developed need to be integrated under the umbrella of such a broader theoretical model. “In the end, data science is more than the sum of its methodological parts,” Vespignani insists, and the same is true for its analytical tools. “When you combine many things you create something greater that is new and different.”

No comment yet.