Computational Music Analysis
New technologies and research dedicated to the analysis of music using computers. Towards Music 2.0!
Scooped by Olivier Lartillot

Inside Google's Infinite Music Intelligence Machine


In May, Google launched a music service that will challenge Spotify and Pandora for radio domination. We asked Google research scientist Doug Eck how it works.

Olivier Lartillot's insight:

"

The Beatles have a way of keeping Doug Eck up at night. Specifically, the research scientist at Google grapples with one of the core problems of automated music discovery: With a band whose catalog is as evolutionary and nuanced as The Beatles's, how can computers truly understand the artist and recommend relevant music to fans? For humans, detecting the difference is easy. For machines, it's not so simple.


Solving problems like these resolves only part of a larger, more complex equation at play when Google attempts to help its users discover music. Music discovery is a crucial piece of that puzzle and one that's notoriously challenging to lock into place.
In taking its own stab at music recommendation, Google blends technical solutions like machine listening and collaborative filtering with good, old-fashioned human intuition. Employing both engineers and music editors, the service continually tries to understand what people are listening to, why they enjoy it, and what they might want to hear next.


How Google Music Intelligence Works


Eck's team is focused on the technical side of this equation, relying on a dual-sided machine learning methodology. One component of that is collaborative filtering of the variety employed by Netflix and Amazon to recommend horror flicks and toasters. The other involves machine listening. That is, computers "listen" to the audio and try to pick out specific qualities and details within each song.


Collaborative filtering works wonders for the Amazons of the world. But since this type of If-you-like-that-you'll-also-like-this logic works better for kitchen appliances than it does for art, the system needs a way to learn more about the music itself. To teach it, Eck's team leverages Google's robust infrastructure and machine-listening technology to pick apart the granular qualities of each song.
"By and large, audio-based models are very good at timbre," says Eck. "So they're very good at recognizing instruments, very good at recognizing things like distorted guitars, very good at recognizing things like whether it's a male or female vocalist."


These are precisely the kinds of details that Pandora relies on human, professionally trained musicians to figure out. The Internet radio pioneer has long employed musicologists to listen to songs and help build out a multipoint, descriptive data set designed to place each track into a broader context and appropriately relate it to other music. For Pandora, the results have been extremely valuable, but mapping out this musical intelligence manually doesn't scale infinitely. Thankfully, machine listening has come a long way in recent years. Much like Google indexes the Web, the company is able to index a massive database of audio, mapping the musical qualities found within. Since it's automated, this part of Google's music recommendation technology can be scaled to a much larger set of data.


"If the four of us decided we were going to record a jazz quartet right here and now and we uploaded it to Play Music, our system will be aware that were talking about that," explains Eck. "By pulling these audio features out of every track that we work with, it gives us a kind of musical vocabulary that we can work with for doing recommendation even if it’s a very long tail."


Indeed, when it comes to music, the tail has never been longer. The world's selection of recorded music was never finite, but today creating and distributing new songs is virtually devoid of friction and financial cost. However much human intelligence Pandora feeds into its algorithm, its Music Genome Project will never be able to keep up and understand everything. That's where machine learning gets a leg up.


The Limits Of Machine Listening


Still, there's a reason Pandora has more than 70 million active listeners and continues to increase its share of overall radio listening time. Its music discovery engine is very good. It might not know about my friend's band on a small Georgia-based record label, but the underlying map of data that Pandora uses to create stations is still incredibly detailed. When I start a radio station based on Squarepusher, an acclaimed but not particularly popular electronic music artist, the songs it plays are spun for very specific reasons. It plays a track by Aphex Twin because it features "similar electronica roots, funk influences, headnodic beats, the use of chordal patterning, and acoustic drum samples." Then, when I skip to the next song, it informs me that, "We're playing this track because it features rock influences, meter complexity, unsyncopated ensemble rhythms, triple meter style, and use of modal harmonies."


Pandora knows this much about these tracks thanks to those aforementioned music experts who sat down and taught it. Automated machine listening, by comparison, can't get quite as specific. At least, not yet.


"It’s very hard and we haven’t solved the problem with a capital S," says Eck, whose has an academic background in automated music analysis. "Nor has anybody else."


Computers might be able to pick out details about timbre, instruments used, rhythm, and other on-the-surface sonic qualities, but they can only dig so deep.


"You can learn a lot from one second of audio. Certainly you can tell if there’s a female voice there or if there’s distorted guitar there. What about when we stretch out and we look what our musical phrase is. What’s happening melodically? Where’s this song going? As we move out and have longer time scale stretches that we’re trying to outline, it becomes very hard to use machines alone to get the answer."
Thanks Algorithms, But The Humans Can Take It From Here.
That's where the good, old-fashioned human beings come in. To help flesh out the music discovery and radio experiences in All Access, Google employs music editors who have an intuition that computers have yet to successfully mimic. Heading up this editor-powered side of the equation is Tim Quirk, a veteran of the online music industry who worked at the now-defunct Listen.com before Napster was a household name.


"Algorithms can tell you what are the most popular tracks in any genre, but an algorithm might not know that "You Don't Miss Your Water" was sort of the first classic, Southern soul ballad in that particular time signature and that it became the template for a decade's worth of people doing the same thing," says Quirk. "That’s sort of arcane human knowledge."


[Well, we could hope that computers will be able to discover such knowledge automatically in the future. (Olivier)]


Google's blend of human and machine intelligence is markedly different from Pandora's. Rather than hand-feeding tons of advanced musical knowledge directly into its algorithms, Google mostly keeps the human-curated stuff in its own distinct areas, allowing the computers to do the heavy lifting elsewhere. Quirk and his team of music editors are the ones who define the most important artists, songs and albums in a given genre (of which there are hundreds in Google Play Music).


Quirk's team also creates curated playlists and makes specific, hand-picked music recommendations. To the extent that these manually curated parts of the service influence its users' listening behavior, the human intelligence does find its way back into the algorithms. It just loops back around and takes a longer road to get there.
Google's employees aren't the only people feeding intelligence into this semiautomated music machine. Google is also constantly learning from its users. Like Pandora and its many copycats, Google Play Music's Radio feature has thumbs up and thumbs down buttons, which help inform the way the radio stations work over time.

Some day, computers will be better able to understand not just that I like The Beatles, but why I like The Beatles. They'll know that John is my favorite, which songs on Revolver I skip, and that of all the heavily Beatles-influenced bands in the world, I love Tame Impala, but loathe Oasis. The machines will get smarter, much like a toddler does: by watching and listening. As users spend time with the product and tap buttons, the nuanced details will become more obvious.
Meanwhile, the ability of computers to actually hear what's contained within each song can only improve over time, just as voice recognition continues to do. The promise of Google Play Music is the same thing that made Google successful to begin with: its ability to use massive data to understand who we are and what we want. If anybody can crack the notoriously hard nut of music discovery in a hugely scalable fashion, it's them. Just don't expect them to do it with machines alone.

 

"

Scooped by Olivier Lartillot

Google Student Blog: Getting to Know a PhD

Olivier Lartillot's insight:

Cynthia is a PhD student. Her research is in the field of music information retrieval: she is working on technologies to analyze, organize and present the considerable amount of digital music information available today. She is particularly interested in making sense of music data by obtaining richer perspectives on it, taking into account information from multiple data sources. These can be recordings of multiple interpretations of the same music piece, but also related information in non-audio modalities, such as videos of performing musicians and textual information from collaborative web resources describing songs and their usage contexts.

Scooped by Olivier Lartillot

The Future of Music Genres Is Here

We’ve always felt ambivalent about the word “genre” at The Echo Nest. On one hand, it’s the most universal shorthand for classifying music, because everyone has a basic understanding of the big, old...
Olivier Lartillot's insight:

The Echo Nest “top terms,” which are the words most commonly used to describe a piece of music, are “far more granular than the big, static genres of the past. We’ve been maintaining an internal list of dynamic genre categories for about 800 different kinds of music. We also know what role each artist or song plays in its genre (whether they are a key artist for that genre, one of the most commonly played, or an up-and-comer).

 

The Echo Nest just announced “a bunch of new genre-oriented features, including:

- A list of nearly 800 genres from the real world of music

- Names and editorial descriptions of every genre

- Essential artists from any genre

- Similar genres to any genre

- Verified explainer links to third-party resources when available

- Genre search by keyword

- Ranked genres associated with artists

- Three radio “presets” for each genre: Core (the songs most representative of the genre); In Rotation (the songs being played most frequently in any genre today); and Emerging (up-and-coming songs within the genre).

 

Where did these genres come from?

 

“The Echo Nest’s music intelligence platform continuously learns about music. Most other static genre solutions classify music into rigid, hierarchical relationships, but our system reads everything written about music on the web, and listens to millions of new songs all the time, to identify their acoustic attributes.

 

“This enables our genres to react to changes in music as they happen. To create dynamic genres, The Echo Nest identifies salient terms used to describe music (e.g., “math rock,” “IDM”, etc.), just as they start to appear. We then model genres as dynamic music clusters — groupings of artists and songs that share common descriptors, and similar acoustic and cultural attributes. When a new genre forms, we know about it, and music fans who listen to our customers’ apps and services will be able to discover it right away, too.
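
[Curator's note: the "dynamic music clusters" described here can be sketched, very loosely, as clustering artists by the descriptive terms used for them on the web. The artists, terms and cluster count below are invented, and this is in no way The Echo Nest's actual pipeline:]

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical artists and the descriptive terms collected for them (made up).
artist_terms = {
    "Artist A": "math rock angular guitars odd meters",
    "Artist B": "idm glitch electronic abstract beats",
    "Artist C": "math rock post rock instrumental guitars",
    "Artist D": "idm ambient electronic textures",
}

names = list(artist_terms)
X = TfidfVectorizer().fit_transform(artist_terms.values())

# Artists that share descriptors end up in the same cluster; each cluster is a
# candidate "genre", which could then be named and tracked over time.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
for name, label in zip(names, labels):
    print(label, name)
```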

 

“Our approach to genres is trend-aware. That means it knows not only what artists and songs fall into a given genre, but also how those songs and artists are trending among actual music fans, within those genres.

 

“About 260 of these nearly 800 genres are hyper-regional, meaning that they are tied to specific places. Our genre system sees these forms of music as they actually exist; it can help the curious music fan hear the differences, for instance, between Luk Thung, Benga, and Zim music.”

Scooped by Olivier Lartillot

Brainlike Computers, Learning From Experience

The new computing approach is based on the biological nervous system, specifically on how neurons react to stimuli and connect with other neurons to interpret information.
Olivier Lartillot's insight:

Some excerpts from the New York Times paper:

 

“The new computing approach, already in use by some large technology companies, is based on the biological nervous system, specifically on how neurons react to stimuli and connect with other neurons to interpret information. It allows computers to absorb new information while carrying out a task, and adjust what they do based on the changing signals.

 

In coming years, the approach will make possible a new generation of artificial intelligence systems that will perform some functions that humans do with ease: see, speak, listen, navigate, manipulate and control. That can hold enormous consequences for tasks like facial and speech recognition, navigation and planning, which are still in elementary stages and rely heavily on human programming.

 

“We’re moving from engineering computing systems to something that has many of the characteristics of biological computing.” 

 

Conventional computers are limited by what they have been programmed to do. Computer vision systems, for example, only “recognize” objects that can be identified by the statistics-oriented algorithms programmed into them. An algorithm is like a recipe, a set of step-by-step instructions to perform a calculation.

 

Until now, the design of computers was dictated by ideas originated by the mathematician John von Neumann about 65 years ago. Microprocessors perform operations at lightning speed, following instructions programmed using long strings of 1s and 0s. They generally store that information separately in what is known, colloquially, as memory, either in the processor itself, in adjacent storage chips or in higher capacity magnetic disk drives.

The data are shuttled in and out of the processor’s short-term memory while the computer carries out the programmed action. The result is then moved to its main memory.

 

The new processors consist of electronic components that can be connected by wires that mimic biological synapses. Because they are based on large groups of neuron-like elements, they are known as neuromorphic processors.

 

They are not “programmed.” Rather the connections between the circuits are “weighted” according to correlations in data that the processor has already “learned.” Those weights are then altered as data flows in to the chip, causing them to change their values and to “spike.” That generates a signal that travels to other components and, in reaction, changes the neural network, in essence programming the next actions much the same way that information alters human thoughts and actions.
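
[Curator's note: purely as an illustration of the "weighted connections that spike" description above, here is a toy leaky integrate-and-fire neuron with a Hebbian-style update; all constants are arbitrary and this does not represent any of the chips discussed:]

```python
import numpy as np

# Toy leaky integrate-and-fire neuron with a Hebbian-style weight update:
# inputs that repeatedly help the neuron fire see their connections strengthened.
rng = np.random.default_rng(1)
weights = rng.uniform(0.1, 0.5, size=4)
potential, threshold, leak, lr = 0.0, 1.0, 0.9, 0.05

for step in range(50):
    inputs = rng.integers(0, 2, size=4)      # incoming binary spikes
    potential = leak * potential + weights @ inputs
    if potential >= threshold:               # the unit "spikes"
        weights += lr * inputs               # strengthen the active inputs
        potential = 0.0                      # reset after the spike

print(weights)
```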

 

“Instead of bringing data to computation as we do today, we can now bring computation to data. Sensors become the computer, and it opens up a new way to use computer chips that can be everywhere.”

The new computers, which are still based on silicon chips, will not replace today’s computers, but will augment them, at least for now.

 

Many computer designers see them as coprocessors, meaning they can work in tandem with other circuits that can be embedded in smartphones and in the giant centralized computers that make up the cloud. Modern computers already consist of a variety of coprocessors that perform specialized tasks, like producing graphics on your cellphone and converting visual, audio and other data for your laptop.”

Scooped by Olivier Lartillot

Shazam-Like Dolphin System ID's Their Whistles: Scientific American Podcast

Olivier Lartillot's insight:

I am glad to see such popularization of research related to “melodic” pattern identification that generalizes beyond the music context and beyond the human species, and also this interesting link to music identification technologies (like Shazam). Before discussing this further, here is first how this Scientific American podcast explains, in simple terms, the computational attempt to mimic dolphins' melodic pattern identification abilities:

 

“Shazam-Like Dolphin System ID's Their Whistles: A program uses an algorithm to identify dolphin whistles similar to that of the Shazam app, which identifies music from databases by changes in pitch over time.

Used to be, if you happened on a great tune on the radio, you might miss hearing what it was. Of course, now you can just Shazam it—let your smartphone listen, and a few seconds later, the song and performer pop up. Now scientists have developed a similar tool—for identifying dolphins.

Every dolphin has a unique whistle.  They use their signature whistles like names: to introduce themselves, or keep track of each other. Mothers, for example, call a stray offspring by whistling the calf's ID.

To tease apart who's saying what, researchers devised an algorithm based on the Parsons code, the software that mammals, I mean that fishes songs from music databases, by tracking changes in pitch over time.

They tested the program on 400 whistles from 20 dolphins. Once a database of dolphin sounds was created, the program identified subsequent dolphins by their sounds nearly as well as humans who eyeballed the whistles' spectrograms.

Seems that in noisy waters, just small bits of key frequency change information may be enough to help Flipper find a friend.”

 

More precisely, the computer program generates a compact description of each dolphin whistle indicating how the pitch curve progressively ascends and descends. This makes it possible to obtain a description that is characteristic of each dolphin, and to compare these whistle curves to see which curve belongs to which dolphin.
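
As a rough sketch (with hypothetical pitch values, and not the researchers' actual program), such a contour description in the spirit of the Parsons code, and a simple way to compare two of them, might look like this:

```python
def contour(pitches, tolerance=5.0):
    """Reduce a pitch track (Hz) to a Parsons-style up/down/repeat string."""
    symbols = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur - prev > tolerance:
            symbols.append("U")
        elif prev - cur > tolerance:
            symbols.append("D")
        else:
            symbols.append("R")
    return "".join(symbols)

def edit_distance(a, b):
    """Plain Levenshtein distance between two contour strings."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (ca != cb))
    return dp[-1]

# Two hypothetical whistles: similar contours give a small distance.
w1 = contour([8000, 8400, 8900, 8700, 8200])
w2 = contour([7900, 8300, 8800, 8800, 8100])
print(w1, w2, edit_distance(w1, w2))
```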

 

But to be more precise, Shazam does not use this kind of approach to identify music. It does not try to detect melodic lines in the music recorded by the user, but takes a series of several-second snapshots of each song, such that each snapshot contains all the complex sound at that particular moment (with the polyphony of instruments). A compact description (a “fingerprint”) of each snapshot is produced, indicating the most important spectral peaks (say, the most prominent pitches of the polyphony). This fingerprint is then compared with those of each song in the music database. Finally, the identified song is the one in the database whose series of fingerprints best fits the series of fingerprints of the user's music query. Here is a simple explanation of how Shazam works: http://laplacian.wordpress.com/2009/01/10/how-shazam-works/
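
A very crude sketch of that snapshot-and-peaks idea (real systems such as Shazam hash pairs of peaks with time offsets and are far more robust; the signal and parameters below are made up) could look like:

```python
import numpy as np

def fingerprints(signal, window=4096, peaks_per_frame=3):
    """Crude landmark fingerprints: the strongest spectral bins in each snapshot."""
    prints = []
    for start in range(0, len(signal) - window, window):
        spectrum = np.abs(np.fft.rfft(signal[start:start + window]))
        top = np.argsort(spectrum)[-peaks_per_frame:]     # strongest frequency bins
        prints.append(tuple(sorted(top)))
    return prints

def match_score(query, reference):
    """Count snapshots whose peak sets coincide (a stand-in for hash matching)."""
    return sum(q == r for q, r in zip(query, reference))

# Hypothetical use: a clean "database" track and a slightly noisy query excerpt.
sr = 11025
t = np.arange(sr * 2) / sr
song = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 660 * t)
query = song[:sr] + 0.01 * np.random.default_rng(0).normal(size=sr)
print(match_score(fingerprints(query), fingerprints(song[:sr])))
```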

 

Shazam does not model *how* humans identify music. The dolphin whistle comparison program does not model *how* dolphins identify each other. And Shazam and the dolphin whistle ID program do not use similar approaches. But on the other hand, might we assume that dolphins' and humans' abilities to identify auditory patterns (in whistles for dolphins, in music for humans) rely on the same core cognitive processes?

Scooped by Olivier Lartillot

The Man Behind the Google Brain: Andrew Ng and the Quest for the New AI | Wired Enterprise | Wired.com

There's a theory that human intelligence stems from a single algorithm. The idea arises from experiments suggesting that the portion of your brain dedicated to processing sound from your ears could also handle sight for your eyes.
Olivier Lartillot's insight:

My digest:

 

"

There’s a theory that human intelligence stems from a single algorithm. The idea arises from experiments suggesting that the portion of your brain dedicated to processing sound from your ears could also handle sight for your eyes. This is possible only while your brain is in the earliest stages of development, but it implies that the brain is — at its core — a general-purpose machine that can be tuned to specific tasks.

 

In the early days of artificial intelligence, the prevailing opinion was that human intelligence derived from thousands of simple agents working in concert, what MIT’s Marvin Minsky called “The Society of Mind.” To achieve AI, engineers believed, they would have to build and combine thousands of individual computing modules. One agent, or algorithm, would mimic language. Another would handle speech. And so on. It seemed an insurmountable feat.

 

A new field of computer science research known as Deep Learning seeks to build machines that can process data in much the same way the brain does, and this movement has extended well beyond academia, into big-name corporations like Google and Apple. Google is building one of the most ambitious artificial-intelligence systems to date, the so-called Google Brain.

 

This movement seeks to meld computer science with neuroscience — something that never quite happened in the world of artificial intelligence. “I’ve seen a surprisingly large gulf between the engineers and the scientists.” Engineers wanted to build AI systems that just worked, but scientists were still struggling to understand the intricacies of the brain. For a long time, neuroscience just didn’t have the information needed to help improve the intelligent machines engineers wanted to build.

 

What’s more, scientists often felt they “owned” the brain, so there was little collaboration with researchers in other fields. The end result is that engineers started building AI systems that didn’t necessarily mimic the way the brain operated. They focused on building pseudo-smart systems that turned out to be more like a Roomba vacuum cleaner than Rosie the robot maid from the Jetsons.

 

Deep Learning is a first step in this new direction. Basically, it involves building neural networks — networks that mimic the behavior of the human brain. Much like the brain, these multi-layered computer networks can gather information and react to it. They can build up an understanding of what objects look or sound like.

 

In an effort to recreate human vision, for example, you might build a basic layer of artificial neurons that can detect simple things like the edges of a particular shape. The next layer could then piece together these edges to identify the larger shape, and then the shapes could be strung together to understand an object. The key here is that the software does all this on its own — a big advantage over older AI models, which required engineers to massage the visual or auditory data so that it could be digested by the machine-learning algorithm.
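
[Curator's note: a purely schematic sketch of that layered idea, with random (unlearned) weights standing in for the filters a real deep network would learn from data:]

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((8, 8))                  # stand-in for pixel input

def layer(x, n_units, rng):
    # In a real network these weights are learned from data; here they are
    # random, just to show how each layer re-describes the previous one.
    w = rng.normal(size=(n_units, x.size))
    return np.maximum(0, w @ x.ravel())     # linear filters + ReLU nonlinearity

edges = layer(image, 32, rng)               # edge-like detectors
shapes = layer(edges, 16, rng)              # combinations of edges
objects = layer(shapes, 4, rng)             # object-level units
print(objects)
```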

 

With Deep Learning, you just give the system a lot of data “so it can discover by itself what some of the concepts in the world are.” Last year, one algorithm taught itself to recognize cats after scanning millions of images on the internet. The algorithm didn’t know the word “cat” but over time, it learned to identify the furry creatures we know as cats, all on its own.

 

This approach is inspired by how scientists believe that humans learn. As babies, we watch our environments and start to understand the structure of objects we encounter, but until a parent tells us what it is, we can’t put a name to it.

 

No, deep learning algorithms aren’t yet as accurate — or as versatile — as the human brain. But he says this will come.

 

In 2011, the Deep Learning project was launched at Google, and in recent months, the search giant has significantly expanded this effort, acquiring the artificial intelligence outfit founded by University of Toronto professor Geoffrey Hinton, widely known as the godfather of neural networks.

 

Chinese search giant Baidu has opened its own research lab dedicated to deep learning, vowing to invest heavy resources in this area. And big tech companies like Microsoft and Qualcomm are looking to hire more computer scientists with expertise in neuroscience-inspired algorithms.

 

Meanwhile, engineers in Japan are building artificial neural nets to control robots. And together with scientists from the European Union and Israel, neuroscientist Henry Markram is hoping to recreate a human brain inside a supercomputer, using data from thousands of real experiments.

 

The rub is that we still don’t completely understand how the brain works, but scientists are pushing forward in this as well. The Chinese are working on what they call the Brainnetdome, described as a new atlas of the brain, and in the U.S., the Era of Big Neuroscience is unfolding with ambitious, multidisciplinary projects like President Obama’s newly announced (and much criticized) Brain Research Through Advancing Innovative Neurotechnologies Initiative — BRAIN for short.

 

If we map out how thousands of neurons are interconnected and “how information is stored and processed in neural networks,” engineers will have a better idea of what their artificial brains should look like. The data could ultimately feed and improve Deep Learning algorithms underlying technologies like computer vision, language analysis, and the voice recognition tools offered on smartphones from the likes of Apple and Google.

 

“That’s where we’re going to start to learn about the tricks that biology uses. I think the key is that biology is hiding secrets well. We just don’t have the right tools to grasp the complexity of what’s going on.”

 

Right now, engineers design around these issues, so they skimp on speed, size, or energy efficiency to make their systems work. But AI may provide a better answer. “Instead of dodging the problem, what I think biology could tell us is just how to deal with it….The switches that biology is using are also inherently noisy, but biology has found a good way to adapt and live with that noise and exploit it. If we could figure out how biology naturally deals with noisy computing elements, it would lead to a completely different model of computation.”

 

But scientists aren’t just aiming for smaller. They’re trying to build machines that do things computers have never done before. No matter how sophisticated algorithms are, today’s machines can’t fetch your groceries or pick out a purse or a dress you might like. That requires a more advanced breed of image intelligence and an ability to store and recall pertinent information in a way that’s reminiscent of human attention and memory. If you can do that, the possibilities are almost endless.

 

“Everybody recognizes that if you could solve these problems, it’s going to open up a vast, vast potential of commercial value."

Scooped by Olivier Lartillot

Literary History, Seen Through Big Data’s Lens

Big Data is pushing into the humanities, as evidenced by new, illuminating computer analyses of literary history.
Olivier Lartillot's insight:

My digest:

 

"

Big Data technology is steadily pushing beyond the Internet industry and scientific research into seemingly foreign fields like the social sciences and the humanities. The new tools of discovery provide a fresh look at culture, much as the microscope gave us a closer look at the subtleties of life and the telescope opened the way to faraway galaxies.

 

“Traditionally, literary history was done by studying a relative handful of texts. What this technology does is let you see the big picture — the context in which a writer worked — on a scale we’ve never seen before.”

 

Some of those tools are commonly described in terms familiar to an Internet software engineer — algorithms that use machine learning and network analysis techniques. For instance, mathematical models are tailored to identify word patterns and thematic elements in written text. The number and strength of links among novels determine influence, much the way Google ranks Web sites.
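
[Curator's note: the "ranked much the way Google ranks Web sites" idea can be sketched with PageRank on a hypothetical influence graph between novels; the titles and links below are invented:]

```python
import networkx as nx

# Hypothetical influence links: an edge A -> B means novel A is judged (from
# shared word patterns and themes) to have influenced novel B.
influence = nx.DiGraph([
    ("Novel A", "Novel B"),
    ("Novel A", "Novel C"),
    ("Novel B", "Novel C"),
    ("Novel D", "Novel C"),
])

# PageRank on the reversed graph: a novel whose descendants are themselves
# influential scores highly, much as a heavily linked-to web page does.
scores = nx.pagerank(influence.reverse())
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```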

 

It is this ability to collect, measure and analyze data for meaningful insights that is the promise of Big Data technology. In the humanities and social sciences, the flood of new data comes from many sources including books scanned into digital form, Web sites, blog posts and social network communications.

 

Data-centric specialties are growing fast, giving rise to a new vocabulary. In political science, this quantitative analysis is called political methodology. In history, there is cliometrics, which applies econometrics to history. In literature, stylometry is the study of an author’s writing style, and these days it leans heavily on computing and statistical analysis. Culturomics is the umbrella term used to describe rigorous quantitative inquiries in the social sciences and humanities.

 

“Some call it computer science and some call it statistics, but the essence is that these algorithmic methods are increasingly part of every discipline now.”

 

Cultural data analysts often adapt biological analogies to describe their work. For example: “Computing and Visualizing the 19th-Century Literary Genome.”

 

Such biological metaphors seem apt, because much of the research is a quantitative examination of words. Just as genes are the fundamental building blocks of biology, words are the raw material of ideas.

 

“What is critical and distinctive to human evolution is ideas, and how they evolve.”

 

Some projects mine the virtual book depository known as Google Books and track the use of words over time, compare related words and even graph them. Google cooperated and built the software for making graphs open to the public. The initial version of Google’s cultural exploration site began at the end of 2010, based on more than five million books, dating from 1500. By now, Google has scanned 20 million books, and the site is used 50 times a minute. For example, type in “women” in comparison to “men,” and you see that for centuries the number of references to men dwarfed those for women. The crossover came in 1985, with women ahead ever since.

 

Researchers tapped the Google Books data to find how quickly the past fades from books. For instance, references to “1880,” which peaked in that year, fell to half by 1912, a lag of 32 years. By contrast, “1973” declined to half its peak by 1983, only 10 years later. “We are forgetting our past faster with each passing year.”

 

Other research approached collective memory from a very different perspective, focusing on what makes spoken lines in movies memorable. Sentences that endure in the public mind are evolutionary success stories, cf. “the fitness of language and the fitness of organisms.” As a yardstick, the researchers used the “memorable quotes” selected from the popular Internet Movie Database, or IMDb, and the number of times that a particular movie line appears on the Web. Then they compared the memorable lines to the complete scripts of the movies in which they appeared — about 1,000 movies. To train their statistical algorithms on common sentence structure, word order and most widely used words, they fed their computers a huge archive of articles from news wires. The memorable lines consisted of surprising words embedded in sentences of ordinary structure. “We can think of memorable quotes as consisting of unusual word choices built on a scaffolding of common part-of-speech patterns.”
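
[Curator's note: a compressed, hypothetical sketch of the two ingredients mentioned (unusual words on a common part-of-speech scaffolding), using NLTK's Brown corpus as a stand-in for the newswire background and a hand-tagged line as the quote to score:]

```python
import math
import nltk
from nltk.corpus import brown

nltk.download("brown", quiet=True)
nltk.download("universal_tagset", quiet=True)

# Background statistics from the Brown corpus (standing in for the newswire
# archive): how common each word and each part-of-speech bigram is.
tagged = brown.tagged_words(tagset="universal")
word_freq = nltk.FreqDist(w.lower() for w, _ in tagged)
pos_bigrams = nltk.FreqDist(nltk.bigrams(t for _, t in tagged))

def memorability(tagged_line):
    """Return (lexical surprise, syntactic commonness) for a pre-tagged line."""
    words = [w.lower() for w, _ in tagged_line]
    tags = [t for _, t in tagged_line]
    surprise = -sum(math.log(word_freq.freq(w) + 1e-9) for w in words) / len(words)
    commonness = sum(math.log(pos_bigrams.freq(b) + 1e-9)
                     for b in nltk.bigrams(tags)) / max(len(tags) - 1, 1)
    return surprise, commonness

# Hypothetical, hand-tagged movie line: a memorable quote should score high on
# lexical surprise while staying ordinary in its part-of-speech scaffolding.
line = [("may", "VERB"), ("the", "DET"), ("force", "NOUN"),
        ("be", "VERB"), ("with", "ADP"), ("you", "PRON")]
print(memorability(line))
```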

 

Quantitative tools in the humanities and the social sciences, as in other fields, are most powerful when they are controlled by an intelligent human. Experts with deep knowledge of a subject are needed to ask the right questions and to recognize the shortcomings of statistical models.

 

“You’ll always need both. But we’re at a moment now when there is much greater acceptance of these methods than in the past. There will come a time when this kind of analysis is just part of the tool kit in the humanities, as in every other discipline.”

Scooped by Olivier Lartillot

The Perfect Classical Music App

A funny thing happened the last time I was taking in a performance of Beethoven’s Fifth Symphony (just a few minutes ago).
Olivier Lartillot's insight:

Digest:

 

"The Orchestra is a flat-out astounding new app produced by Touch Press, the Philharmonia Orchestra and its principal conductor Salonen. At $13.99, it’s not only one of the best albums—you know, a longish compilation of music—you could purchase for someone this holiday season; it’s an app that could easily change how you consume classical music outside of the concert hall. Or how we introduce new listeners to symphonic works in the first place.

 

Salonen’s venture wisely avoids trying to recapture the form or mediated rhythms of those storied successes. Physical copies of recordings, after all, are pretty much dead. Conductors aren’t going to be invited back to occupy whole hours of network TV time ever again. So: on to the app store.

 

The Philharmonia’s success here wasn’t guaranteed merely by its being the first orchestra to upload some videos to a tablet’s app store. Rather, their opening gambit was deeply thought through by people who understand both Mahler and the iPad. Because the best thing about the app is its synchronous way of making you feel and see various musical values at once, you will derive the best experience of The Orchestra by listening only to the musicians, and having the rest of the app’s information delivered visually. The swooping and aggressive harp glissandos that come during the “Princesses Intercede …” movement of Igor Stravinsky’s “Firebird” ballet are exciting enough as pure sound, but this app gets carried right along with the music’s kinetic qualities: The score speeds expressively through each punchy liftoff in 6/8 time, while, above, a bird’s-eye “BeatMap” graphic of the orchestra pulses to signal which instruments are required at each second in order to whip up the overall noise. The presentation of performance video and graphical information is where the app is elevated beyond being a pleasing curiosity and into something that feels legitimately groundbreaking in our appreciation of music—as though there might be a day when they give out Grammys for app-making. You needn’t be totally comfortable reading musical notation in order to find value in looking at a score; at one vivid juncture of Salonen’s own violin concerto, you can read how the drummer at a “heavy rock kit” is advised to “Go crazy.” (And if you can’t read music, there’s a tablature-style reduction that drives home basic information, in a way that will feel familiar to users of GarageBand.)

 

The app lets you into the music from many angles at once, giving new views on the artistry and technical prowess behind the writing, and playing, of some of the world’s greatest music.

 

Best of all, The Orchestra is no techno-utopian attempt to do away with the concert hall. Rather, it’s an invitation for new listeners to get comfortable with the density of informational delight that can be had there. Until these musicians come around to your town, consider dropping $14 to meet them in app form. Think of The Orchestra, and the wave of symphony apps that ought to follow in its wake, the way you once did of “albums”—as exquisitely good advertisements for a product that still manages to best your expectations once you travel to get it in the real world."

Scooped by Olivier Lartillot

The Infinite Jukebox


[Note #1 of Olivier Lartillot, curator: It would be great to add more adaptive Markov modeling on top of that. Cf for instance the Continuator project: http://www.youtube.com/watch?v=ynPWOMzossI]

 

[Note #2 of Olivier Lartillot, curator: I suggest a short name for the upcoming dedicated website: adlib.it! ^^]

 

With The Infinite Jukebox, you can create a never-ending and ever-changing version of any song. The app works by sending your uploaded track over to The Echo Nest, where it is decomposed into individual beats. Each beat is then analyzed and matched to other similar sounding beats in the song. This information is used to create a detailed song graph of paths through similar sounding beats. As the song is played, when the next beat has similar sounding beats there’s a chance that we will branch to a completely different part of the song. Since the branching is to a very similar sounding beat in the song, you (in theory) won’t notice the jump. This process of branching to similar sounding beats can continue forever, giving you an infinitely long version of the song.
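
[Curator's note: a bare-bones sketch of the branching logic described above, assuming beats have already been extracted and given feature vectors (The Echo Nest analysis itself is not reproduced, and all numbers are arbitrary):]

```python
import random
import numpy as np

def build_graph(beat_features, threshold=0.9):
    """Link each beat to other beats that sound similar to it."""
    f = beat_features / np.linalg.norm(beat_features, axis=1, keepdims=True)
    sims = f @ f.T
    n = len(f)
    return {i: [j for j in range(n)
                if j != i and abs(j - i) > 4 and sims[i, j] >= threshold]
            for i in range(n)}

def infinite_play(graph, n_beats, steps=100, branch_prob=0.2, seed=0):
    """Walk through the song, occasionally jumping to a similar-sounding beat."""
    rng = random.Random(seed)
    beat, path = 0, []
    for _ in range(steps):
        path.append(beat)
        if graph[beat] and rng.random() < branch_prob:
            beat = rng.choice(graph[beat])     # branch to a similar beat
        else:
            beat = (beat + 1) % n_beats        # otherwise just keep playing
    return path

# Hypothetical per-beat feature vectors (e.g. timbre and chroma per beat).
features = np.random.default_rng(0).random((64, 12))
print(infinite_play(build_graph(features), n_beats=64)[:20])
```

The dynamic threshold adjustment mentioned below would then amount to lowering `threshold` until every beat keeps at least one outgoing branch.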

 

To accompany the playback, I created a chord diagram that shows the beats of the song along the circumference of the circle, along with chords representing the possible paths from each beat to its similar neighbors. When the song is not playing, you can mouse over any beat and see all of the possible paths for that beat. When the song is playing, the visualization shows the single next potential beat. I was quite pleased at how the visualization turned out. I think it does a good job of helping the listener understand what is going on under the hood, and different songs have very different looks and color palettes. They can be quite attractive.

 

I did have to adapt the Infinite Gangnam Style algorithm for the Infinite Jukebox. Not every song is as self-similar as Psy’s masterpiece, so I have to dynamically adjust the beat-similarity threshold until there are enough pathways in the song graph to make the song infinite. This means that the overall musical quality may vary from song to song depending on the amount of self-similarity in the song.

 

Overall, the results sound good for most songs. I still may do a bit of tweaking on the algorithm to avoid some degenerate cases (you can get stuck in a strange attractor at the end of Karma Police for instance). Give it a try, upload your favorite song and listen to it forever. The Infinite Jukebox.

Olivier Lartillot's comment, December 5, 2012 12:08 PM
Are you starting this blog? The topic sounds interesting!
Scooped by Olivier Lartillot

The Computer as Music Critic


Thanks to advances in computing power, we can analyze music in radically new and different ways. Computers are still far from grasping some of the deep and often unexpected nuances that release our most intimate emotions. However, by processing vast amounts of raw data and performing unprecedented large-scale analyses beyond the reach of teams of human experts, they can provide valuable insight into some of the most basic aspects of musical discourse, including the evolution of popular music over the years. Has there been an evolution? Can we measure it? And if so, what do we observe?

 

In a recent article published in the journal Scientific Reports, authors used computers to analyze 464,411 Western popular music recordings released between 1955 and 2010, including pop, rock, hip-hop, folk and funk. They first looked for static patterns characterizing the generic use of primary musical elements like pitch, timbre and loudness. They then measured a number of general trends for these elements over the years.

 

Common practice in the growing field of music information processing starts by cutting an audio signal into short slices — in our case the musical beat, which is the most relevant and recognizable temporal unit in music (the beat roughly corresponds to the periodic, sometimes unconscious foot-tapping of music listeners).

 

For each slice, computers represented basic musical information with a series of numbers. For pitch, they computed the relative intensity of the notes present in every beat slice, thus accounting for the basic harmony, melody and chords. For timbre, what some call the “color” of a note, they measured the general waveform characteristics of each slice, thus accounting for the basic sonority of a given beat and the combinations of instruments and effects. And for loudness, they calculated the energy of each slice, accounting for sound volume or perceived intensity.

 

They then constructed a music “vocabulary”: they assigned code words to slice-based numbers to generate a “text” that could represent the popular musical discourse of a given year or age. Doing so allowed them to discover static patterns by counting how many different code words appeared in a given year, how often they were used and which were the most common successions of code words at a given point in time.
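
[Curator's note: a schematic sketch of the code-word idea under strong simplifications; the per-beat pitch-class vectors below are random stand-ins for real beat-synchronous audio descriptors:]

```python
import collections
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for beat-synchronous pitch descriptors: one 12-dimensional
# pitch-class intensity vector per beat (real values would come from audio).
beats = rng.random((10_000, 12))

# Discretize each beat into a "code word": here, simply the set of pitch
# classes whose intensity exceeds a fixed threshold.
codewords = [tuple(np.flatnonzero(b > 0.8)) for b in beats]

# Count usage and look at the rank-frequency curve; Zipf-like behaviour would
# appear as a roughly straight line in log-log coordinates.
counts = collections.Counter(codewords)
ranked = sorted(counts.values(), reverse=True)
print(ranked[:5], ranked[-5:])
```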

 

Interestingly, in creating a musical “vocabulary,” they found a well-known phenomenon common in written texts and many other domains: Zipf’s law, which predicts that the most frequent word in a text will appear twice as often as the next most frequent word, three times as often as the third most frequent, and so on. The same thing, they found, goes for music.

 

If we suppose that the most common note combination is used 100 times, the second most common combination will be used 50 times and the third 33 times. Importantly, they found that Zipf’s law held for each year’s vocabulary, from 1955 to 2010, with almost exactly the same “usage ordering” of code words every year. That suggests a general, static rule, one shared with linguistic texts and many other natural and artificial phenomena.

 

Beyond these static patterns, they also found three significant trends over time. Again using pitch code words, they counted the different transitions between note combinations and found that this number decreased over the decades. The analysis also indicated that pop music’s variety of timbre has been decreasing since the 1960s, meaning that artists and composers tend to stick to the same sound qualities — in other words, instruments playing the same notes sound more similar than they once did. Finally, they found that recording levels had consistently increased since 1955, confirming a so-called race toward louder music.

Scooped by Olivier Lartillot

Measuring the goodness or badness of singing, applied...


It is not easy to measure the goodness or badness of singing. There is "no consensus on how to obtain objective measures of singing proficiency in sung melodies".


They devised their own test, using the refrain from a song called Gens du Pays, which people in Quebec commonly sing as part of their ritual to celebrate a birthday. That refrain, they explain, has 32 notes, a vocal range of less than one octave, and a stable tonal centre.


These scientists went to a public park, where they used a clever subterfuge to recruit test subjects: "The experimenter pretended that it was his birthday and that he had made a bet with friends that he could get 100 individuals each to sing the refrain of Gens du Pays for him on this special occasion."


The resulting recordings became the raw material for intensive computer-based analysis, centring on the vowel sounds – the "i" in "mi", for example. Peretz, Giguère and Dalla Bella assessed each performance for pitch stability, number of pitch interval errors, "changes in pitch directions relative to musical notation", interval deviation, number of time errors, temporal variability, and timing consistency.
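
[Curator's note: a hedged sketch of the kind of measures listed above, assuming per-note pitch and onset values have already been extracted; the numbers are invented and this is not the authors' actual procedure:]

```python
import numpy as np

def cents(f_sung, f_target):
    """Deviation of a sung pitch from a reference, in cents."""
    return 1200 * np.log2(np.asarray(f_sung) / np.asarray(f_target))

def interval_errors(sung_hz, target_hz, tolerance=100):
    """Count sung intervals that deviate from the score by more than a semitone."""
    sung = np.diff(cents(sung_hz, sung_hz[0]))
    target = np.diff(cents(target_hz, target_hz[0]))
    return int(np.sum(np.abs(sung - target) > tolerance))

def timing_variability(onsets_s, target_onsets_s):
    """Standard deviation of note-onset deviations, in seconds."""
    return float(np.std(np.asarray(onsets_s) - np.asarray(target_onsets_s)))

# Hypothetical extracted values for a 5-note fragment.
sung_hz = [262, 300, 325, 296, 262]
target_hz = [262, 294, 330, 294, 262]
print(interval_errors(sung_hz, target_hz),
      timing_variability([0.0, 0.52, 1.05, 1.49, 2.02], [0.0, 0.5, 1.0, 1.5, 2.0]))
```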


For comparison, they recorded and assessed several professional singers performing the same snatch of song.


"We found that the majority of individuals can carry a tune with remarkable proficiency. Occasional singers typically sing in time, but are less accurate in pitch as compared to professional singers. When asked to slow down, occasional singers greatly improve in performance, making as few pitch errors as professional singers." Only a very few, they say, were "clearly out of tune".

Scooped by Olivier Lartillot

new Spotify app from Reebok and music intelligence platform The Echo Nest


Music intelligence platform The Echo Nest has partnered with Reebok on a Spotify app for fitness fanatics or casual exercisers to get their groove on and motivate themselves into motion.


Reebok FitList is a free app for Spotify and it provides a few methods for creating the perfect soundtrack to work out to.


Through the new app, users can specify an activity such as jogging, yoga, dancing, walking or training; set an intensity level and the length of the workout. From there, they choose an artist to base the playlist around, and The Echo Nest’s information about every song in the Spotify catalog helps to create a playlist.


The end result includes songs by the chosen artist as well as similar tracks to spice things up and add a little fresh flavour.


Over 350 applications have been built on The Echo Nest platform. As a machine learning system that actively reads about and listens to music everywhere on the web, The Echo Nest opens up dynamic music data for over 5 billion data points on over 30 million songs. This naturally helps developers to re-shape the experience of experimenting with and listening to music.


A four-time National Science Foundation grantee, the Echo Nest was co-founded by two MIT PhDs. Investors include Matrix Partners, Commonwealth Capital Ventures, and three co-founders of MIT Media Lab.

Scooped by Olivier Lartillot

Measuring the Evolution of Contemporary Western Popular Music : Scientific Reports : Nature Publishing Group

[DIGEST:]

 

The study of patterns and long-term variations in popular music could shed new light on relevant issues concerning its organization, structure, and dynamics. More importantly, it addresses valuable questions for the basic understanding of music as one of the main expressions of contemporary culture: Can we identify some of the patterns behind music creation? Do musicians change them over the years? Can we spot differences between new and old music? Is there an ‘evolution’ of musical discourse?

 

Current technologies for music information processing provide a unique opportunity to answer the above questions under objective, empirical, and quantitative premises. Moreover, akin to recent advances in other cultural assets, they allow for unprecedented large-scale analyses. One of the first publicly-available large-scale collections that has been analyzed by standard music processing technologies is the million song dataset. Among others, the dataset includes the year annotations and audio descriptions of 464,411 distinct music recordings (from 1955 to 2010), which roughly corresponds to more than 1,200 days of continuous listening. Such recordings span a variety of popular genres, including rock, pop, hip hop, metal, or electronic. Explicit descriptions available in the dataset cover three primary and complementary musical facets: loudness, pitch, and timbre.

 

By exploiting tools and concepts from statistical physics and complex networks, we unveil a number of statistical patterns and metrics characterizing the general usage of pitch, timbre, and loudness in contemporary western popular music.

 

In order to build a ‘vocabulary’ of musical elements, we encode the dataset descriptions by a discretization of their values, yielding what we call music codewords. Next, to quantify long-term variations of a vocabulary, we need to obtain samples of it at different periods of time. For that we perform a Monte Carlo sampling in a moving window fashion. In particular, for each year, we sample one million beat-consecutive codewords, considering entire tracks and using a window length of 5 years. This procedure, which is repeated 10 times, guarantees a representative sample with a smooth evolution over the years.

 

We first count the frequency of usage of pitch codewords (i.e. the number of times each codeword type appears in a sample). We observe that most used pitch codewords generally correspond to well-known harmonic items, while unused codewords correspond to strange/dissonant pitch combinations. Sorting the frequency counts in decreasing order provides a very clear pattern behind the data: a power law, which indicates that a few codewords are very frequent while the majority are highly infrequent (intuitively, the latter provide the small musical nuances necessary to make a discourse attractive to listeners). Nonetheless, it also implies that there is no characteristic frequency or rank separating most used codewords from largely unused ones (except for the largest rank values due to the finiteness of the vocabulary). Another non-trivial consequence of power-law behavior is that extreme events (i.e. very rare codewords) will certainly show up in a continuous discourse, provided the listening time is sufficient and the pre-arranged dictionary of musical elements is big enough.

 

Importantly, we find this power-law behavior to be invariant across years, with practically the same fit parameters. However, it could well be that, even though the distribution is the same for all years, codeword rankings were changing (e.g. a certain codeword was used frequently in 1963 but became mostly unused by 2005). To assess this possibility we compute the Spearman's rank correlation coefficients for all possible year pairs and find that they are all extremely high. This indicates that codeword rankings practically do not vary with years.
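
[Curator's note: the stability check described here amounts to a rank correlation between yearly codeword counts; a minimal sketch with made-up counts:]

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Stand-in usage counts for the same 1,000 codewords in two different years;
# the second year is a lightly perturbed copy, as if usage barely changed.
counts_1963 = rng.zipf(a=2.0, size=1000)
counts_2005 = counts_1963 + rng.integers(0, 3, size=1000)

rho, _ = spearmanr(counts_1963, counts_2005)
print(round(rho, 3))   # a value close to 1 means the ranking is essentially unchanged
```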

 

Codeword frequency distributions provide a generic picture of vocabulary usage. However, they do not account for discourse syntax, as well as a simple selection of words does not necessarily constitute an intelligible sentence. One way to account for syntax is to look at local interactions or transitions between codewords, which define explicit relations that capture most of the underlying regularities of the discourse and that can be directly mapped into a network or graph. Hence, analogously to language-based analyses, we consider the transition networks formed by codeword successions, where each node represents a codeword and each link represents a transition. The topology of these networks and common metrics extracted from them can provide us with valuable clues about the evolution of musical discourse.

 

All the transition networks we obtain are sparse, meaning that the number of links connecting codewords is of the same order of magnitude as the number of codewords. Thus, in general, only a limited number of transitions between codewords is possible. Such constraints would allow for music recognition and enjoyment, since these capacities are grounded in our ability for guessing/learning transitions and a non-sparse network would increase the number of possibilities in a way that guessing/learning would become unfeasible. Thinking in terms of originality and creativity, a sparse network means that there are still many ‘composition paths’ to be discovered. However, some of these paths could run into the aforementioned guessing/learning tradeoff. Overall, network sparseness provides a quantitative account of music's delicate balance between predictability and surprise.

 

In sparse networks, the most fundamental characteristic of a codeword is its degree, which measures the number of links to other codewords. With pitch networks, this quantity is distributed according to a power law with the same fit parameters for all considered years. We observe important trends in some network metrics, namely the average shortest path length l, the clustering coefficient C, and the assortativity with respect to random Γ. Specifically, l slightly increases from 2.9 to 3.2, values comparable to the ones obtained when randomizing the network links. The values of C show a considerable decrease from 0.65 to 0.45, and are much higher than those obtained for the randomized network. Thus, the small-worldness of the networks decreases with years. This trend implies that the reachability of a pitch codeword becomes more difficult. The number of hops or steps to jump from one codeword to the other (as reflected by l) tends to increase and, at the same time, the local connectivity of the network (as reflected by C) tends to decrease. Additionally, Γ is always below 1, which indicates that the networks are always less assortative than random (i.e. well-connected nodes are less likely to be connected among themselves), a tendency that grows with time if we consider the biggest hubs of the network. The latter suggests that there are fewer direct transitions between ‘referential’ or common codewords. Overall, a joint reduction of the small-worldness and the network assortativity shows a progressive restriction of pitch transitions, with fewer transition options and more defined paths between codewords.
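
[Curator's note: a toy version of these transition-network measurements, using a random codeword sequence and networkx in place of the million-song data:]

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(0)

# Toy codeword sequence: 5,000 beats drawn from 500 possible pitch codewords.
sequence = rng.integers(0, 500, size=5000)

# Each codeword is a node; each observed succession is a transition (link).
G = nx.DiGraph(list(zip(sequence[:-1], sequence[1:])))
U = G.to_undirected()

print("codewords:", U.number_of_nodes(), "transitions:", G.number_of_edges())
print("average shortest path length l:", round(nx.average_shortest_path_length(U), 2))
print("clustering coefficient C:", round(nx.average_clustering(U), 2))
print("degree assortativity:", round(nx.degree_assortativity_coefficient(U), 2))
```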

 

In contrast to pitch, timbre networks are more assortative than random. The values of l fluctuate around 4.8 and C is always below 0.01. Noticeably, both are close to the values obtained with randomly wired networks. This close-to-random topology quantitatively demonstrates that, as opposed to language, timbral contrasts (or transitions) are rarely the basis for a musical discourse. This does not mean that timbre is a meaningless facet. Global timbre properties, like the aforementioned power law and rankings, are clearly important for music categorization tasks (one example is genre classification). Notice however that the evolving characteristics of musical discourse have important implications for artificial or human systems dealing with such tasks. For instance, the homogenization of the timbral palette and general timbral restrictions clearly challenge tasks exploiting this facet. A further example is found with the aforementioned restriction of pitch codeword connectivity, which could hinder song recognition systems (artificial song recognition systems are rooted in pitch codeword-like sequences).

 

Loudness distributions are generally well-fitted by a reversed log-normal function. Plotting them provides a visual account of the so-called loudness race (or loudness war), a terminology that is used to describe the apparent competition to release recordings with increasing loudness, perhaps with the aim of catching potential customers' attention in a music broadcast. The empirical median of the loudness values x grows from −22 dBFS to −13 dBFS, with a least squares linear regression yielding a slope of 0.13 dB/year. In contrast, the absolute difference between the first and third quartiles of x remains constant around 9.5 dB, with a regression slope that is not statistically significant. This shows that, although music recordings become louder, their absolute dynamic variability has been conserved, understanding dynamic variability as the range between higher and lower loudness passages of a recording. However, and perhaps most importantly, one should notice that digital media cannot output signals over 0 dBFS, which severely restricts the possibilities for maintaining the dynamic variability if the median continues to grow.
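
[Curator's note: the quoted slope is just a least-squares fit of median loudness against year; a minimal sketch with made-up yearly medians:]

```python
import numpy as np

# Hypothetical yearly medians of track loudness (dBFS), drifting upward.
years = np.arange(1955, 2011)
rng = np.random.default_rng(0)
median_dbfs = -22 + 0.13 * (years - 1955) + rng.normal(0, 0.5, years.size)

slope, intercept = np.polyfit(years, median_dbfs, deg=1)
print(f"loudness trend: {slope:.2f} dB/year")   # should come out near 0.13
```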

 

Finally, the loudness transition networks have a one-dimensional character, implying that no extreme loudness transitions occur (one rarely finds loudness transitions driving a musical discourse). The very stable metrics obtained for loudness networks imply that, despite the race towards louder music, the topology of loudness transitions is maintained.

 

Some of the conclusions reported here have historically remained conjectures, based on limited resources or framed under subjective, qualitative, and non-systematic premises. With the present work, we gain empirical evidence through a formal, quantitative, and systematic analysis of a large-scale music collection. We encourage the development of further historical databases to quantify the major transitions in the history of music, and to start looking at more subtle evolving characteristics of particular genres or artists, without forgetting the whole wealth of cultures and music styles present in the world.

Scooped by Olivier Lartillot
Scoop.it!

Computer becomes a bird enthusiast

Computer becomes a bird enthusiast | Computational Music Analysis | Scoop.it
Program can distinguish among hundreds of species in recorded birdsongs
Olivier Lartillot's insight:

"

If you’re a bird enthusiast, you can pick out the “chick-a-DEE-dee” song of the Carolina chickadee with just a little practice. But if you’re an environmental scientist faced with parsing thousands of hours of recordings of birdsongs in the lab, you might want to enlist some help from your computer. A new approach to automatic classification of birdsong borrows techniques from human voice recognition software to sort through the sounds of hundreds of species and decides on its own which features make each one unique.

 

Typically, scientists build one computer program to recognize one species, and then start all over for another species. Training a computer to recognize lots of species in one pass is “a challenge that we’re all facing.”

 

That challenge is even bigger in the avian world, says Dan Stowell, a computer scientist at Queen Mary University of London who studied human voice analysis before turning his attention to the treetops. “I realized there are quite a lot of unsolved problems in birdsong.” Among the biggest issues: There are hundreds of species with distinct and complex calls—and in tropical hotspots, many of them sing all at once.

 

Most methods for classifying birdsong rely on a human to define which features separate one species from another. For example, if researchers know that a chickadee’s tweet falls within a predictable range of frequencies, they can program a computer to recognize sounds in that range as chickadee-esque. The computer gets better and better at deciding how to use these features to classify a new sound clip, based on “training” rounds where it examines clips with the species already correctly labeled.

 

In the new paper, Stowell and his Queen Mary colleague, computer scientist Mark Plumbley, used a different approach, known as unsupervised training. Instead of telling the computer which features of a birdsong are going to be important, they let it decide for itself, so to speak. The computer has to figure out “what are the jigsaw pieces” that make up any birdsong it hears. For example, some of the jigsaw pieces it selects are split-second upsweeps or downsweeps in frequency—the sharp pitch changes that make up a chirp. After seeing correctly labeled examples of which species produce which kinds of sounds, the program can spit out a list—ranked in order of confidence—of the species it thinks are present in a recording.
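Stowell and Plumbley's published system reportedly learns its features with a spherical k-means procedure over spectrogram data; the sketch below is a deliberately simplified analogue of that idea (plain k-means from scikit-learn, placeholder audio and labels): learn a codebook of spectral "jigsaw pieces" without using any labels, describe each recording by how often each piece occurs, then train an ordinary classifier on those descriptions.

```python
# Simplified analogue of unsupervised feature learning for birdsong
# (placeholder audio and labels, plain k-means instead of spherical k-means):
# 1) learn a codebook of spectrogram frames without using any labels,
# 2) represent each recording as a histogram of codebook activations,
# 3) train an ordinary supervised classifier on those histograms.
import numpy as np
from scipy.signal import spectrogram
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

sr = 22050

def frames(audio):
    """Column-wise log-spectrogram frames of one recording."""
    _, _, S = spectrogram(audio, fs=sr, nperseg=512, noverlap=256)
    return np.log(S + 1e-10).T                 # shape: (n_frames, n_freq_bins)

# Placeholder corpus: (waveform, species_label) pairs would come from real recordings.
rng = np.random.default_rng(0)
corpus = [(rng.standard_normal(sr * 2), label) for label in [0, 1, 0, 1, 0, 1]]

# Step 1: unsupervised codebook of spectral "jigsaw pieces" (labels never used here).
all_frames = np.vstack([frames(audio) for audio, _ in corpus])
codebook = KMeans(n_clusters=16, n_init=4, random_state=0).fit(all_frames)

# Step 2: one activation histogram per recording.
def describe(audio):
    codes = codebook.predict(frames(audio))
    return np.bincount(codes, minlength=16) / len(codes)

X = np.array([describe(audio) for audio, _ in corpus])
y = np.array([label for _, label in corpus])

# Step 3: supervised classification on top of the learned representation.
clf = RandomForestClassifier(random_state=0).fit(X, y)
print(clf.predict(X))
```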

Their unsupervised approach performed better than the more traditional methods of classification—those based on a set of predetermined features.

 

The new system’s accuracy fell short of beating the top computer programs that analyzed the same data sets for the annual competition. But the new system deserves credit for applying unsupervised machine learning to the complex world of birdsong for the first time. This approach could be combined with other ways of processing and classifying sound, because it can squeeze out some information that other techniques may miss.

 

Eighty-five percent accuracy on a choice between more than 500 calls and songs is impressive and shows both the biological community and the computer community what you can do with these large sound archives.

"

Scooped by Olivier Lartillot
Scoop.it!

Different Brain Regions Handle Different Musics

Different Brain Regions Handle Different Musics | Computational Music Analysis | Scoop.it
Functional MRI of the listening brain found that different regions become active when listening to different types of music and instrumental versus vocals. Allie Wilkinson reports.
Olivier Lartillot's insight:

"Computer algorithms were used to identify specific aspects of the music, which the researchers were able to match with specific, activated brain areas. The researchers found that vocal and instrumental music get treated differently. While both hemispheres of the brain deal with musical features, the presence of lyrics shifts the processing of musical features to the left auditory cortex.

These results suggest that the brain’s hemispheres are specialized for different kinds of sound processing. A finding revealed by what you might call instrumental analysis."

Scooped by Olivier Lartillot
Scoop.it!

Listen to Pandora, and It Listens Back

Listen to Pandora, and It Listens Back | Computational Music Analysis | Scoop.it
The Internet radio service has started to mine user data for the best ways to target advertising. It can deconstruct your song choices to predict, for example, your political party of choice.
Olivier Lartillot's insight:

“After years of customizing playlists to individual listeners by analyzing components of the songs they like, then playing them tracks with similar traits, the company has started data-mining users’ musical tastes for clues about the kinds of ads most likely to engage them.”

Scooped by Olivier Lartillot
Scoop.it!

“Art, sciences, humanities” theme in Futurium project by European Commission

“Art, sciences, humanities” theme in Futurium project by European Commission | Computational Music Analysis | Scoop.it

"... 

Art practice will gain a whole new status and role in future societies. Creativity will be key to harness the new possibilities offered by science and technology, and by the hyper-connected environments that will surround us, in useful directions. Art, science and humanities will connect to help boost this wave of change and creativity in Europe.

..."

Olivier Lartillot's insight:

Here is first of all a bit of background related to this Futurium project from the European Commission:

("Why your vote is crucial") https://ec.europa.eu/digital-agenda/futurium/en/content/get-started

 

“If you are interested in policy-making, this is the right place to be! Have a say on eleven compelling themes that will likely shape policy debates in the coming few decades!

They are a synthesis of more than 200 futures co-created by hundreds of "futurizens", including young thinkers as well as renowned scientists from different disciplines, in brainstorming sessions, both online and actual events all around Europe.

The themes include many insights on how policy-making could evolve in the near future. They can potentially help to guide future policy choices or to steer the direction of research funding; for instance, because they cast new light on the sweeping changes that could occur in areas like jobs and welfare; also by furthering our understanding of new routes to the greater empowerment of human-beings; and by exploring the societal impacts of the emergence of super-centenarians.

Everyone can now provide feedback and rate the relevance and timing of the themes.

Which one has the greatest impact? When will these themes become relevant?

Vote and help shape the most compelling options for future policies!”

 

Below is the theme “Art, sciences, humanities”. All these ideas seem to have important repercussions in music research. It would be splendid to see such ideals having an impact on future European research policies. So if you support these ideas, please vote for this theme in the poll, which closes at the end of the week.

 

“The challenges facing humanity are revealing themselves as increasingly global and highly interconnected. The next few decades will give us the tools to start mastering this complexity in terms of a deeper understanding, but also in terms of policy and action with more predictability of impacts.

This will result from a combination of thus far unseen Big Data from various sources of evidence (smart grids, mobility data, sensor data, socio-economic data) along with the rise of dynamical modelling and new visualisation, analysis, and synthesis techniques (like narrative). It will also rely on a new alliance between science and society.

The virtualisation of the scientific process and the advent of social networks will allow every scientist to join forces with others in the open global virtual laboratory.  Human performance enhancement and embeddable sensors will enable scientists to perceive and observe processes in the real world in new ways. New ICT tools will allow better understanding of the social processes underlying all societal actions.

Digital games will increasingly be used as training grounds for developing worlds that work – from testing new systems of governance, to new systems of economy, medical and healing applications, industrial applications, educational systems and models – across every aspect of life, work, and culture.

Digital technologies will also empower people to co-create their environments, the products they buy, the science they learn, and the art they enjoy.  Digital media will break apart traditional models of art practice, production, and creativity, making production of previously expensive art forms like films affordable to anyone.

The blurring boundaries between artist and audience will completely disappear as audiences increasingly ‘applaud’ a great work by replying with works of their own, which the originating artist will in turn build upon for new pieces.  Digital media creates a fertile space for a virtuous circle of society-wide creativity and art production.

Art practice will gain a whole new status and role in future societies. Creativity will be key to harness the new possibilities offered by science and technology, and by the hyper-connected environments that will surround us, in useful directions. Art, science and humanities will connect to help boost this wave of change and creativity in Europe.

Key Issues

•How do we engage policy makers and civic society throughout the process of gathering data and analysing evidence on global systems? How do we cross-fertilise sciences, humanities and art?

•How do we ensure reward and recognition in a world of co-creation where everyone can be a scientist or an artist from his/her own desktop? How do we deal with ownership, responsibility and liability?

•How do we keep scientific standards alive as peer-reviewed research and quality standards are challenged by the proliferation of open-access publication? How do we assure the quality and credibility of data and models?

•How do we channel the force of creativity into areas of society that are critical but often slow to change, like healthcare, education, etc.?

•How do we ensure universal access and competency with emerging digital and creative technologies? Greater engagement of citizens in science and the arts? How do we disseminate learning about creativity and the arts to currently underserved populations?

•Equitable benefit distribution: how do we ensure that the benefits of scientific discoveries and innovations are distributed evenly in society?

•Clear, effective communication, across multiple languages: how do we communicate insights from complex systems analyses to people who were not participants in the process in ways that create value shifts and behavioural changes to achieve solutions to global issues?

•Can the development of new narratives and metaphors make scientific results accessible to all humanity to reframe global challenges?

•Can the virtualisation of research and innovation lifecycles, the multidisciplinary collaboration and the cross fertilisation with arts and humanities help improve the impact of research?

•Transformation of education: how might the roles of schools and professional educators evolve in the light of the science and art revolution? What might be the impact on jobs and productivity?

•How do we respond to the increasing demand for data scientists and data analysts?

•How do we cope with unintended and undesirable effects of pervasive digitization of society such as media addictions, IPR and authenticity, counterfeiting, plagiarism, life history theft? How do we build trust in both artists and audiences?

•How do we ensure that supercomputing, simulation and big data are not invasive to privacy and support free will and personal aspirations?

•Can crowd-financing platforms for art initiatives balance the roles in current artistic economies (e.g. arts granting agencies, wealthy patrons)?

•How do we harness digital gaming technologies, and developments in live gaming, to allow users to create imagined worlds that empower them and the communities they live within?”

anita sánchez's curator insight, December 1, 2013 4:29 PM

Very interesting...:)!!!

 

 

Olivier Lartillot's comment, December 1, 2013 11:00 PM
Thanks for rescooping, Anita. Please note that it is no longer possible to cast a vote in the Futurium poll, which ended 2 days ago, unfortunately.
NoahData's curator insight, December 19, 2013 2:42 AM

 

The big data technology ecosystem is fast evolving space comprising technologies such as Hadoop, Greenplum, Ambari, Cassandra, Mahout and Zookeeper.

We help you understand this Ecosystem and the myriad technology choices available and help you with a strategic roadmap to help your organization leverage and implement the right best fit solution be it on the Cloud or within your IT infrastructure on clusters.

Through a well phased approach covering proof of concepts, road map recommendations, sourcing strategy, our consultants help you through the entire lifecycle of implementing Big Data technologies.

Scooped by Olivier Lartillot
Scoop.it!

Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It - Wired Science

Scientific Data Has Become So Complex, We Have to Invent New Math to Deal With It - Wired Science | Computational Music Analysis | Scoop.it
Olivier Lartillot's insight:

[ Note from curator: Wired already wrote an article about compressed sensing 4 years ago.

There are interesting critical comments about this article in Slashdot: http://science.slashdot.org/comments.pl?sid=4328305&cid=45105969

Olivier ]

 

“It is not sufficient to simply collect and store massive amounts of data; they must be intelligently curated, and that requires a global framework. “We have all the pieces of the puzzle — now how do we actually assemble them so we can see the big picture? You may have a very simplistic model at the tiny local scale, but calculus lets you take a lot of simple models and integrate them into one big picture.” Similarly, modern mathematics — notably geometry — could help identify the underlying global structure of big datasets.

 

Gunnar Carlsson, a mathematician at Stanford University, is representing cumbersome, complex big data sets as a network of nodes and edges, creating an intuitive map of data based solely on the similarity of the data points; this uses distance as an input that translates into a topological shape or network. The more similar the data points are, the closer they will be to each other on the resulting map; the more different they are, the further apart they will be on the map. This is the essence of topological data analysis (TDA).
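A bare-bones version of the "map from similarity" idea (toy points on a noisy circle, nothing like Ayasdi's actual pipeline): treat data points as nodes and connect those whose pairwise distance falls below a threshold, so the resulting graph reveals the underlying shape of the data.

```python
# Bare-bones "map from similarity" sketch (toy data, not Ayasdi's pipeline):
# nodes are data points, edges connect points closer than a threshold.
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
# Toy data lying near a circle: an underlying one-dimensional shape.
angles = np.linspace(0, 2 * np.pi, 80, endpoint=False)
points = np.c_[np.cos(angles), np.sin(angles)] + rng.normal(0, 0.02, (80, 2))

G = nx.Graph()
G.add_nodes_from(range(len(points)))
for i in range(len(points)):
    for j in range(i + 1, len(points)):
        if np.linalg.norm(points[i] - points[j]) < 0.3:   # similarity threshold
            G.add_edge(i, j)

# A circle-like dataset yields a single ring-shaped connected component.
print(nx.number_connected_components(G), "connected component(s)")
```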

 

TDA is an outgrowth of machine learning, a set of techniques that serves as a standard workhorse of big data analysis. Many of the methods in machine learning are most effective when working with data matrices, like an Excel spreadsheet, but what if your data set doesn’t fit that framework? “Topological data analysis is a way of getting structured data out of unstructured data so that machine-learning algorithms can act more directly on it.”

 

As with Euler’s bridges, it’s all about the connections. Social networks map out the relationships between people, with clusters of names (nodes) and connections (edges) illustrating how we’re all connected. There will be clusters relating to family, college buddies, workplace acquaintances, and so forth. Carlsson thinks it is possible to extend this approach to other kinds of data sets as well, such as genomic sequences.”

[… and music?!]

 

 “One can lay the sequences out next to each other and count the number of places where they differ,” he explained. “That number becomes a measure of how similar or dissimilar they are, and you can encode that as a distance function.”
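For instance, the distance Carlsson describes between two equal-length sequences (counting the positions at which they differ, i.e. a Hamming distance) takes only a couple of lines; the sequences below are made up, purely to make the "distance function" idea concrete.

```python
# Toy example of the distance described above: count the positions at which
# two equal-length sequences differ (a Hamming distance).
def hamming(a, b):
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

print(hamming("GATTACA", "GACTATA"))  # -> 2 (positions 2 and 5 differ)
```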

 

The idea behind topological data analysis is to reduce large, raw data sets of many dimensions to a compressed representation in lower dimensions, without sacrificing the most relevant topological properties. Ideally, this will reveal the underlying shape of the data. For example, a sphere technically exists in every dimension, but we can perceive only the three spatial dimensions. However, there are mathematical glasses through which one can glean information about these higher-dimensional shapes, Carlsson said. “A shape is an infinite number of points and an infinite amount of distances between those points. But if you’re willing to sacrifice a little roundness, you can represent [a circle] by a hexagon with six nodes and six edges, and it’s still recognizable as a circular shape.”

 

That is the basis of the proprietary technology Carlsson offers through his start-up venture, Ayasdi, which produces a compressed representation of high dimensional data in smaller bits, similar to a map of London’s tube system. Such a map might not accurately represent the city’s every last defining feature, but it does highlight the primary regions and how those regions are connected. In the case of Ayasdi’s software, the resulting map is not just an eye-catching visualization of the data; it also enables users to interact directly with the data set the same way they would use Photoshop or Illustrator. “It means we won’t be entirely faithful to the data, but if that set at lower representations has topological features in it, that’s a good indication that there are features in the original data also.”

 

Topological methods are a lot like casting a two-dimensional shadow of a three-dimensional object on the wall: they enable us to visualize a large, high-dimensional data set by projecting it down into a lower dimension. The danger is that, as with the illusions created by shadow puppets, one might be seeing patterns and images that aren’t really there.

 

It is so far unclear when TDA works and when it might not. The technique rests on the assumption that a high-dimensional big data set has an intrinsic low-dimensional structure, and that it is possible to discover that structure mathematically. Recht believes that some data sets are intrinsically high in dimension and cannot be reduced by topological analysis. “If it turns out there is a spherical cow lurking underneath all your data, then TDA would be the way to go,” he said. “But if it’s not there, what can you do?” And if your dataset is corrupted or incomplete, topological methods will yield similarly flawed results.

 

Emmanuel Candes, a mathematician at Stanford University, and his then-postdoc, Justin Romberg, were fiddling with a badly mangled image on his computer, the sort typically used by computer scientists to test imaging algorithms. They were trying to find a method for improving fuzzy images, such as the ones generated by MRIs when there is insufficient time to complete a scan. On a hunch, Candes applied an algorithm designed to clean up fuzzy images, expecting to see a slight improvement. What appeared on his computer screen instead was a perfectly rendered image. Candes compares the unlikeliness of the result to being given just the first three digits of a 10-digit bank account number, and correctly guessing the remaining seven digits. But it wasn’t a fluke. The same thing happened when he applied the same technique to other incomplete images.

 

The key to the technique’s success is a concept known as sparsity, which usually denotes an image’s complexity, or lack thereof. It’s a mathematical version of Occam’s razor: While there may be millions of possible reconstructions for a fuzzy, ill-defined image, the simplest (sparsest) version is probably the best fit. Out of this serendipitous discovery, compressed sensing was born. With compressed sensing, one can determine which bits are significant without first having to collect and store them all.
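A minimal numerical illustration of that sparsity principle (a generic L1-penalised fit on synthetic data, not Candes' actual algorithm): recover a sparse signal from far fewer random measurements than unknowns by preferring the sparsest consistent explanation.

```python
# Minimal compressed-sensing style illustration (a generic L1-penalised fit,
# not Candes' actual algorithm): recover a sparse signal from fewer random
# measurements than unknowns by preferring the sparsest consistent solution.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, m, k = 200, 60, 5                          # unknowns, measurements, non-zeros

x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.normal(0, 1, k)

A = rng.normal(0, 1, (m, n)) / np.sqrt(m)     # random measurement matrix
y = A @ x_true                                # only m measurements of n unknowns

x_hat = Lasso(alpha=1e-3, max_iter=50000, fit_intercept=False).fit(A, y).coef_
print("largest recovery error:", float(np.max(np.abs(x_hat - x_true))))
```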

 

This approach can even be useful for applications that are not, strictly speaking, compressed sensing problems, such as the Netflix prize. In October 2006, Netflix announced a competition offering a $1 million grand prize to whoever could improve the filtering algorithm for their in-house movie recommendation engine, Cinematch. An international team of statisticians, machine learning experts and computer engineers claimed the grand prize in 2009, but the academic community in general also benefited, since they gained access to Netflix’s very large, high quality data set. Recht was among those who tinkered with it. His work confirmed the viability of applying the compressed sensing approach to the challenge of filling in the missing ratings in the dataset.

 

Cinematch operates by using customer feedback: Users are encouraged to rate the films they watch, and based on those ratings, the engine must determine how much a given user will like similar films. The dataset is enormous, but it is incomplete: on average, users only rate about 200 movies, out of nearly 18,000 titles. Given the enormous popularity of Netflix, even an incremental improvement in the predictive algorithm results in a substantial boost to the company’s bottom line. Recht found that he could accurately predict which movies customers might be interested in purchasing, provided he saw enough products per person. Between 25 and 100 products were sufficient to complete the matrix.
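The matrix-completion flavour of that result can be sketched generically (a simple iterated truncated-SVD heuristic on synthetic ratings, not Recht's algorithm): keep the observed entries fixed and repeatedly fill the missing ones with a low-rank reconstruction.

```python
# Generic low-rank matrix completion sketch on synthetic ratings
# (a simple iterated truncated-SVD heuristic, not Recht's algorithm).
import numpy as np

rng = np.random.default_rng(0)
n_users, n_movies, rank = 50, 40, 2

# Synthetic low-rank "true ratings" and a mask of observed entries (~30% known).
R_true = rng.normal(size=(n_users, rank)) @ rng.normal(size=(rank, n_movies))
observed = rng.random((n_users, n_movies)) < 0.3

X = np.where(observed, R_true, 0.0)                      # start: missing entries = 0
for _ in range(200):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    low_rank = (U[:, :rank] * s[:rank]) @ Vt[:rank]       # best rank-2 approximation
    X = np.where(observed, R_true, low_rank)              # keep known, refill missing

error = np.abs(X - R_true)[~observed].mean()
print("mean absolute error on the missing entries:", round(error, 4))
```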

 

“We have shown mathematically that you can do this very accurately under certain conditions by tractable computational techniques,” Candes said, and the lessons learned from this proof of principle are now feeding back into the research community.

 

Recht and Candes may champion approaches like compressed sensing, while Carlsson and Coifman align themselves more with the topological approach, but fundamentally, these two methods are complementary rather than competitive. There are several other promising mathematical tools being developed to handle this brave new world of big, complicated data. Vespignani uses everything from network analysis — creating networks of relations between people, objects, documents, and so forth in order to uncover the structure within the data — to machine learning, and good old-fashioned statistics.

 

Coifman asserts the need for an underlying global theory on a par with calculus to enable researchers to become better curators of big data. In the same way, the various techniques and tools being developed need to be integrated under the umbrella of such a broader theoretical model. “In the end, data science is more than the sum of its methodological parts,” Vespignani insists, and the same is true for its analytical tools. “When you combine many things you create something greater that is new and different.”

Scooped by Olivier Lartillot
Scoop.it!

Music Information Retrieval, a tutorial

Music Information Retrieval, a tutorial | Computational Music Analysis | Scoop.it
George Tzanetakis gives an overview of the techniques, applications and capabilities of music information retrieval systems.
Olivier Lartillot's insight:

Great tutorial by George Tzanetakis about research in computational music analysis (a discipline known as Music Information Retrieval). The tutorial includes an introduction to the engineering techniques commonly used in this research.

 

Here are the discussion topics that you will find:

• Music Information Retrieval
• Connections
• Music Today
• Industry
• Music Collections
• Overview
• Audio Feature Extraction
• Linear Systems and Sinusoids
• Fourier Transform
• Short Time Fourier Transform
• Spectrum and Shape Descriptors
• Mel Frequency Cepstral Coefficients
• Audio Feature Extraction
• Pitch Content
• Pitch Detection
• Time Domain
• AutoCorrelation
• Frequency Domain
• Chroma – Pitch Perception
• Automatic Rhythm Description
• Beat Histograms
• Analysis Overview
• Content-based Similarity Retrieval (or query-by-example)
• Classification
• Multi-tag Annotation
• Stacking
• Polyphonic Audio-Score Alignment
• Dynamic Time Warping
• Query-by-humming
• The MUSART system
• Conclusions

Scooped by Olivier Lartillot
Scoop.it!

Too Loud & It All Sounds The Same? Why Researchers Were Wrong On Pop

Too Loud & It All Sounds The Same? Why Researchers Were Wrong On Pop | Computational Music Analysis | Scoop.it

After scientists earlier this year claimed to have proved that music has been sliding down a path of diminishing returns and really does all sound the same, musicologist Stephen Graham points out why pop music is probably as exciting now as it was in 1955.

Stephen Graham is a musicologist and music critic based at Goldsmith's College, and an Editor at the Journal of Music.

Olivier Lartillot's insight:

I "scooped" about that scientific research criticized here, in a post I wrote a few months ago, available here: 

http://www.scoop.it/t/computational-music-analysis/p/2284347210/measuring-the-evolution-of-contemporary-western-popular-music-scientific-reports-nature-publishing-group

 

Below is a copy (with some short edits) of this new critical essay by Stephen Graham, which you can read in full on thequietus.com (click on the title above).

 

"

Pedants and cranks have been predicting pop music's demise ever since its emergence in its modern form in the 1950s and 1960s. Always informed by a strong dose of cultural prejudice, sometimes these predictions take the form of blunt broadsides about perceived pop ephemerality, sometimes they are accompanied by chauvinistic gnashings of teeth about authenticity, and sometimes they are snuck in under the radar of 'objective' scientific analysis. I want to discuss something that falls most definitely into that insidious last camp.

 

A paper published in Scientific Reports on July 26 this year claims to 'measure the evolution of contemporary western popular music', statistically analysing a 'dataset' of 465,259 songs dating back to 1955 and distributed evenly across those years and widely across various popular music genres. The paper analyses its dataset under three main criteria: pitch, timbre and loudness. The authors of the paper found a growing homogenisation of the first two of these 'primary musical facets', and additionally detected 'growing loudness levels' in pop music of the past fifty-seven years.

 

These findings are not in themselves inherently problematic, although they derive from a deeply flawed methodology, to which I'll come below. What are deeply problematic are the conclusions drawn from them by the authors themselves, first, and by a range of journalists and music critics, second. My argument is that, in themselves, the measurements that the paper puts forward can basically be taken in good faith. However, since these measurements only take into account some elements of music on the one hand, and nothing of musical meaning and context on the other, the authors' attempt to build them into a grand narrative about pop music's evolving 'value' is shaky at best, and revealing of quite ugly cultural prejudices at worst.

 

Loud and homogenous pop?

 

The authors of the paper claim that they 'observe a number of trends in the evolution of contemporary popular music'. 'These point', they say, 'towards less variety in pitch transitions, towards a consistent homogenisation of the timbral palette, and towards louder and, in the end, potentially poorer volume dynamics'. As is their wont when confronted with apparently authoritative scientific 'data', journalists had a field day with the scientist's interpretations. Many used the paper's comparatively neutral conclusions as a springboard for a plethora of sweeping and condescending generalisations.

However, the Scientific Reports paper itself, and, consequently, many of the newspaper and web articles that reported its findings and conclusions with such misplaced ideological glee, suffers from two fundamental and fatal flaws.

 

First, the paper's analytical framework is inadequate. Its claims to authority are hampered by the absence from its supposedly representative dataset of one of the key elements of music, rhythm, whether that be harmonic rhythm, timbral rhythm, melodic rhythm, or the mensural rhythms of tempo and metre. Although the paper uses 'the temporal resolution of the beat' to aid 'discretisation' of its musical facets, the word 'rhythm' does not appear once in an article aspiring to answer questions about the inner nature of musical discourse and musical evolution.

 

Similarly, since we are dealing with popular music, the absence of language from the sample frame, such as that contained in titles, lyrics, slogans or other pertinent materials, is just as deleterious. Finally, harmony is also ignored to a significant degree, since although the paper focuses on timbre and pitch, precise hierarchisations of pitch, such as chord voicings or layering of the musical texture in order to articulate bass, harmony and melody, are excluded from the analysis. Horizontal or consecutive pitch relationships are elevated over vertical or simultaneous ones.

 

A constituent issue of this first fundamental flaw, deriving more from the paper's methodology than its sample frame, relates to the fact that the authors isolate and thus absolutise various musical 'facets' (timbre, pitch and loudness). These 'facets', in themselves, have little business being isolated, since they gain meaning from various contexts; musical, social, cultural or otherwise. An abstracted set of pitches means little besides itself when considered separately from timbre, rhythm, phrasing, use of technology, language and other technical and expressive 'ingredients' of music. Although many valid and valuable analyses have been carried out under precisely this sort of isolationist rubric, the key point is that specific findings about pitch should not be extrapolated into generalised propositions about how music works; data about pitch organisation are just that, and are not in themselves anything more. The analytical framework of the paper thus pivots on the fallacy of misplaced concreteness, where constituent elements of music are seen as more distinct than they really are.

 

The second fundamental flaw of the paper also relates to this point about isolating and decontextualising musical 'facets', to continue to use the authors' terminology. I noted above that facets such as pitch gain meaning once they are situated in musical contexts. Equally important to this 'meaning' are the socio-cultural discourses through which music becomes encoded with conventionalised meanings.

 

Is objective 'data' about musical evolution possible?

 

As Roland Barthes famously wrote, 'a text's unity lies not in its origin, but in its destination'; an aphorism blithely ignored by the authors of the Scientific Reports paper and by the journalists who appropriated the scientists' findings, all of whose assertions about pop music stay at the level of technical design and thus ignore vital emergent phenomena and processes of perception, interpretation and meaning. It is indeed reasonable to attempt to generate 'objective' data about music in order to 'identify some of the patterns behind music creation'. But in doing so analysts must first of all ensure that the analytical framework is as all-encompassing as possible.

 

Second, they must avoid circularity in building conclusions, a pervasive fault of the paper here, where the authors claim that 'our perception of the new would be rooted on these changing characteristics' (i.e. on the criteria utilised in the paper). This is straight up circular reasoning based on an exclusion bias. Music is reduced by the paper to loudness, timbre and pitch, and in doing so the horizon of the 'new' is likewise reduced to these facets. If 'music' is disclosed by the paper, then any possibility of it becoming 'new' must therefore derive from that disclosure. But there's much more to music than is here, and perceptions of the new have to do with a much fuller panoply of musical facets, as well as, of course, shifting patterns of meaning, than they are given credit for here.

 

Third, and finally, it is vital that if analysts or journalists are seeking to draw conclusions about music's meaning and value, then due heed must be paid to the socio-cultural discourses that largely generate music's meaning. Otherwise their analyses will simply serve to perpetuate antiquated ideas about what is and what is not musically worthwhile, and about what music might be seen to 'mean'.

"

Scooped by Olivier Lartillot
Scoop.it!

BBC - Research and Development: Pickin' up good vibrations

BBC - Research and Development: Pickin' up good vibrations | Computational Music Analysis | Scoop.it
One of the universal appeals of music lies in its mysterious ability to manipulate and reflect our emotions. Even the simplest of tunes can evoke strong feelings of joy, fear,...
Olivier Lartillot's insight:

Edited digest:

 

“In the same way as looking at new ways of finding TV programmes by mood, similar research is being applied to music.

 

As you can imagine, getting a computer to understand human emotions has its challenges - three, in fact. The first one is how to numerically define mood. This is a complicated task as not only do people disagree on the mood of a track, but music often expresses a combination of emotions. Over the years, researchers have come up with various models, notably Hevner's clusters which define eight mood categories, and Russell's circumplex model, which represents mood as a point on a two-dimensional plane. Both approaches have their drawbacks, so researchers at QMUL Centre for Digital Music are developing a model which combines the strengths of both. The model will be based on earlier research conducted on the emotional similarity of common keywords.

 

The next challenge is processing the raw digital music into a format that the computer can handle. This should be a small set of numbers that represent what a track sounds like. They are created by running the music through a set of algorithms, each of which produces an array of numbers called 'features'. These features represent different properties of the music, such as the tempo and what key it's written in. They also include statistics about the frequencies, loudness and rhythm of the music. The trick lies in finding the right set of features that describe all the properties of music that are important for expressing emotion.
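As a rough idea of what such a feature vector could look like, here is a sketch using the open-source librosa library (not the BBC/QMUL pipeline; 'track.wav' is a placeholder file name), concatenating tempo, loudness, timbre and pitch-class statistics into one fixed-length description per track:

```python
# Rough sketch of a per-track feature vector (open-source librosa, not the
# BBC/QMUL pipeline; 'track.wav' is a placeholder file name).
import numpy as np
import librosa

y, sr = librosa.load("track.wav", mono=True)

tempo, _ = librosa.beat.beat_track(y=y, sr=sr)        # rhythm: estimated tempo
rms = librosa.feature.rms(y=y)                        # loudness over time
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)    # timbre summary
chroma = librosa.feature.chroma_stft(y=y, sr=sr)      # pitch-class content (key-related)

features = np.concatenate([
    np.atleast_1d(tempo),                 # 1 value
    [rms.mean(), rms.std()],              # 2 values
    mfcc.mean(axis=1), mfcc.std(axis=1),  # 13 + 13 values
    chroma.mean(axis=1),                  # 12 values
])
print(features.shape)                     # one fixed-length vector per track
```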

 

Now for the final challenge. We need to find out exactly how the properties of the music work together to produce different emotions. Even the smartest musicologists struggle with this question, so, rather lazily, this is left to the computer to work out.

 

Machine learning is a method of getting a computer to 'learn' how two things are related by analysing lots of real-life examples. In this case, it is looking at the relationship between musical features and mood. There are a number of algorithms that could be used, but the popular 'support vector machine' (SVM) has been shown to work for this task and can handle both linear and non-linear relationships.
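A minimal sketch of that learning step, assuming feature vectors like the ones above and hypothetical mood labels, using scikit-learn's SVM with a non-linear (RBF) kernel:

```python
# Minimal sketch of the mood-classification step (random stand-in features
# and hypothetical mood labels, not the BBC/QMUL data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 41))            # one feature vector per track
y = rng.integers(0, 4, size=500)          # e.g. 4 mood categories

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# An RBF kernel lets the SVM capture non-linear relationships between features and mood.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

print("test accuracy:", model.score(X_test, y_test))
```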

 

For the learning stage to be successful, the computer will need to be 'trained' using thousands of songs that have accompanying information about the mood of each track. This kind of collection is very hard to come across, and researchers often struggle to find appropriate data sets. Not only that, but the music should cover a wide range of musical styles, moods and instrumentation.

 

Although the Desktop Jukebox is mostly composed of commercial music tracks, it also houses a huge collection of what is known as 'production music'. This is music that has been recorded using session artists, and so is wholly owned by the music publishers who get paid each time the tracks are used. This business model means that they are keen to make their music easy to find and search, so every track is hand-labelled with lots of useful information.

 

Through project partners at I Like Music, the BBC Research and Development Group obtained over 128,000 production music tracks to use in our research. The tracks, which are sourced from over 80 different labels, include music from every genre.

 

The average production music track is described by 40 keywords, of which 16 describe the genre, 12 describe the mood and 5 describe the instrumentation. Over 36,000 different keywords are used to describe the music, the top 100 of which are shown in the tag cloud below. Interestingly, about a third of the keywords only appear once, including such gems as 'kangaroove', 'kazoogaloo', 'pogo-inducing' and 'hyper-bongo'.

 

In order to investigate how useful the keywords are in describing emotion and mood, the relationships between the words were analysed. The method was to calculate the co-occurrence of keyword pairs - that is, how often a pair of words appear together in the description of a music track. The conjecture was that words which appear together often have similar meanings.

Using the top 75 mood keywords, the co-occurrence of each pair in the production music database was calculated to produce a large matrix. In order to make any sense of it, the keywords and the connections between them were visualized. Those with strong connections (that often appeared together) were positioned close to each other, and those with weak connections further apart.
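The co-occurrence computation itself is easy to sketch (toy keyword lists rather than the 128,000-track database); turning the counts into distances and feeding them to a standard layout method such as multidimensional scaling places frequently co-occurring keywords close together:

```python
# Toy sketch of the keyword co-occurrence analysis (made-up keyword lists,
# not the real 128,000-track database).
from itertools import combinations
import numpy as np
from sklearn.manifold import MDS

tracks = [
    ["happy", "upbeat", "energetic"],
    ["sad", "slow", "melancholic"],
    ["happy", "energetic", "dance"],
    ["melancholic", "slow", "calm"],
    ["upbeat", "dance", "happy"],
]

keywords = sorted({k for t in tracks for k in t})
index = {k: i for i, k in enumerate(keywords)}

# Count how often each pair of keywords describes the same track.
co = np.zeros((len(keywords), len(keywords)))
for t in tracks:
    for a, b in combinations(sorted(set(t)), 2):
        co[index[a], index[b]] += 1
        co[index[b], index[a]] += 1

# Turn counts into distances: frequently co-occurring pairs -> small distance.
dist = 1.0 / (1.0 + co)
np.fill_diagonal(dist, 0.0)

# 2-D layout in which strongly connected keywords land close together.
layout = MDS(n_components=2, dissimilarity="precomputed", random_state=0).fit_transform(dist)
for word, (x, y) in zip(keywords, layout):
    print(f"{word:12s} {x:6.2f} {y:6.2f}")
```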

 

The keywords arranged themselves into a logical pattern, where negative emotions were on the left and positive emotions on the right, with energetic emotions on top and lethargic emotions on the bottom. This roughly fits Russell's arousal-valence plane, suggesting that this model may be a suitable way to describe moods in the production music library; however, more research is required before a model is chosen.

 

The BBC Research and Development Group has been working with the University of Manchester to extract features from over 128,000 production music files using the N8 cluster. Once that work is complete, they will be able to start training and testing musical mood classifiers which can automatically label music tracks.”

Jayakumar Singaram's comment, December 21, 2012 5:40 AM
For music synthesis and analysis, we created a tool called Extended CSound. The tool can handle real-time input from a singer or player and also perform analysis on it. Some parameters, such as pitch and tempo, are already implemented, and it is an open platform for implementing many other parameters using Csound signal processing modules.
Jayakumar Singaram's comment, December 21, 2012 5:42 AM
Hi, please check http://sdrv.ms/WhPWiN for Extended CSound and current versions.
Scooped by Olivier Lartillot
Scoop.it!

The Echo Nest in Fortune Magazine

The Echo Nest in Fortune Magazine | Computational Music Analysis | Scoop.it

“Jehan's research focused on teaching computers to capture the sonic elements of music, while Whitman's studied the cultural and social components. In combining the two approaches they created the Echo Nest, one of the most important digital music companies few have heard about.

Starting in 2005, they set about creating a vast database, a music brain that, based on your interest in Kanye West, can suggest you check out rapper Drake. Sound like Pandora? It's similar – but on a massive scale.

A computer program analyzes songs for their fundamental elements such as key and tempo.

[In Pandora, this music analysis is performed manually by humans – (Olivier)]

While Echo Nest's approach is unique, other firms, like Gracenote and Rovi, also compile and market music data. (Apple's iTunes relies on Gracenote, for instance.) Some services, notably Pandora, have built proprietary systems that could compete with Echo Nest.

Scooped by Olivier Lartillot
Scoop.it!

From Japan, a Music Player That Discovers Structure and Rhythm in Songs | Underwire | Wired.com

A cool new music service you've never heard of has hit the web, offering deep insights into the structure and rhythm of popular songs.

 

Created by Japan’s Advanced Industrial Science and Technology Institute, Songle (SONG-lee) analyzes music tracks hosted on the web and reveals the underlying chords, beats, melodies and repeats. Listeners can see how a song is laid out, and jump immediately to the chorus if they choose. They can search for music based on chord, beat and melody structures, such as all songs with the chord progression Am, B, E. There is also a visualization engine synchronized to a song’s core elements.
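As a toy illustration of that kind of query (placeholder chord annotations, not Songle's actual index), finding all songs that contain the progression Am, B, E reduces to a subsequence scan over each song's estimated chord sequence:

```python
# Toy illustration of a chord-progression query (placeholder data, not Songle's index).
def contains_progression(chords, query):
    n = len(query)
    return any(chords[i:i + n] == query for i in range(len(chords) - n + 1))

songs = {
    "song_a": ["Am", "B", "E", "Am", "G"],
    "song_b": ["C", "G", "Am", "F"],
    "song_c": ["E", "Am", "B", "E"],
}

query = ["Am", "B", "E"]
print([name for name, chords in songs.items() if contains_progression(chords, query)])
# -> ['song_a', 'song_c']
```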

 

“This is a showcase for active music listening,” said Songle creator Dr. Masataka Goto, leader of the Media Interaction Group at AIST’s Information Technology Research Group. “Listeners can browse within songs.”


Computer analysis of music has aimed for years at getting inside the structure of songs to create audio fingerprints to help stem piracy and automate listener recommendations – the holy grail of online music retailing – among other things.

 

Goto cautioned against reading too much into Songle at this point, highlighting its educational and entertainment value instead.

 

Users submit links to tracks hosted online and Songle analyzes them in about 5-10 minutes, adding the metadata (but no copy of the song) to its database. The service, which launched in Japan in August, has analyzed about 80,000 tracks so far.

 

Songle has an embeddable player with a visualizer that adds graphics synchronized to a song to any web page using the embed code. It also allows listeners to provide corrections to the estimates created by its music analytics algorithm, potentially improving accuracy.

Scooped by Olivier Lartillot
Scoop.it!

Can the domains of Music Cognition and Music Information Retrieval inform each other?

Can the domains of Music Cognition and Music Information Retrieval inform each other? | Computational Music Analysis | Scoop.it

ISMIR (International Society for Music Information Retrieval) conference is a conference on the processing, searching, organizing and accessing music-related data. It attracts a research community that is intrigued by the revolution in music distribution and storage brought about by digital technology which generated quite some research activity and interest in academia as well as in industry.

 

In this discipline, referred to as Music Information Retrieval (or MIR for short), the topic is not so much to understand and model music (like in the field of music cognition), but to design robust and effective methods to locate and retrieve musical information, including tasks like query-by-humming, music recommendation, music recognition, and genre classification.

 

A common approach in MIR research is to use information-theoretic models to extract information from the musical data, be it the audio recording itself or all kinds of meta-data, such as artist or genre classification. With advanced machine learning techniques, and the availability of so-called ‘ground truth’ data (i.e., annotations made by experts that the algorithm uses to decide on the relevance of the results for a certain query), a model of retrieving relevant musical information is constructed. Overall, this approach is based on the assumption that all relevant information is present in the data and that it can, in principle, be extracted from that data (data-oriented approach).

 

Several alternatives have been proposed, such as models based on perception-based signal processing or mimetic and gesture-based queries. However, with regard to the cognitive aspects of MIR (the perspective of the listener), some information might be implicit or not present at all in the data. Especially in the design of similarity measures (e.g., ‘search for songs that sound like X’) it becomes clear quite quickly that not all required information is present in the data. Elaborating state-of-the-art MIR techniques with recent findings from music cognition seems therefore a natural next step in improving (exploratory) search engines for music and audio (cognition-based approach).

 

A creative paper, discussing the differences and overlaps between the two fields in dialog form, is about to appear in the proceedings of the upcoming ISMIR conference. Emmanuel Bigand, a well-known music cognition researcher, and Jean-Julien Aucouturier, an MIR researcher, wrote a fictitious dialog:


“Mel is a MIR researcher (the audio type) who's always been convinced that his field of research had something to contribute to the study of music cognition … Clarifying what psychologists really think of audio MIR, correcting misconceptions that he himself made about cognition, and maybe, developing a vision of how both fields could work together.”

Scooped by Olivier Lartillot
Scoop.it!

How Big Data Became So Big

How Big Data Became So Big | Computational Music Analysis | Scoop.it

THIS has been the crossover year for Big Data — as a concept, as a term and, yes, as a marketing tool. Big Data has sprung from the confines of technology circles into the mainstream.

 

First, here are a few, well, data points: Big Data was a featured topic this year at the World Economic Forum in Davos, Switzerland, with a report titled “Big Data, Big Impact.” In March, the federal government announced $200 million in research programs for Big Data computing.

 

Rick Smolan, creator of the “Day in the Life” photography series, has a new project in the works, called “The Human Face of Big Data.” The New York Times has adopted the term in headlines like “The Age of Big Data” and “Big Data on Campus.” And a sure sign that Big Data has arrived came just last month, when it became grist for satire in the “Dilbert” comic strip by Scott Adams. “It comes from everywhere. It knows all,” one frame reads, and the next concludes that “its name is Big Data.”

 

The Big Data story is the making of a meme. And two vital ingredients seem to be at work here. The first is that the term itself is not too technical, yet is catchy and vaguely evocative. The second is that behind the term is an evolving set of technologies with great promise, and some pitfalls.

 

Big Data is a shorthand label that typically means applying the tools of artificial intelligence, like machine learning, to vast new troves of data beyond that captured in standard databases. The new data sources include Web-browsing data trails, social network communications, sensor data and surveillance data.

 

The combination of the data deluge and clever software algorithms opens the door to new business opportunities. Google and Facebook, for example, are Big Data companies. The Watson computer from I.B.M. that beat human “Jeopardy” champions last year was a triumph of Big Data computing. In theory, Big Data could improve decision-making in fields from business to medicine, allowing decisions to be based increasingly on data and analysis rather than intuition and experience.

 

“The term itself is vague, but it is getting at something that is real,” says Jon Kleinberg, a computer scientist at Cornell University. “Big Data is a tagline for a process that has the potential to transform everything.”

 

Rising piles of data have long been a challenge. In the late 19th century, census takers struggled with how to count and categorize the rapidly growing United States population. An innovative breakthrough came in time for the 1890 census, when the population reached 63 million. The data-taming tool proved to be machine-readable punched cards, invented by Herman Hollerith; these cards were the bedrock technology of the company that became I.B.M.

 

So the term Big Data is a rhetorical nod to the reality that “big” is a fast-moving target when it comes to data. The year 2008, according to several computer scientists and industry executives, was when the term “Big Data” began gaining currency in tech circles. Wired magazine published an article that cogently presented the opportunities and implications of the modern data deluge.

 

This new style of computing, Wired declared, was the beginning of the Petabyte Age. It was an excellent magazine piece, but the “petabyte” label was too technical to be a mainstream hit — and inevitably, petabytes of data will give way to even bigger bytes: exabytes, zettabytes and yottabytes.

 

Many scientists and engineers at first sneered that Big Data was a marketing term. But good marketing is distilled and effective communication, a valuable skill in any field. For example, the mathematician John McCarthy made up the term “artificial intelligence” in 1955, when writing a pitch for a Rockefeller Foundation grant. His deft turn of phrase was a masterstroke of aspirational marketing.

 

In late 2008, Big Data was embraced by a group of the nation’s leading computer science researchers, the Computing Community Consortium, a collaboration of the government’s National Science Foundation and the Computing Research Association, which represents academic and corporate researchers. The computing consortium published an influential white paper, “Big-Data Computing: Creating Revolutionary Breakthroughs in Commerce, Science and Society.”

 

Its authors were three prominent computer scientists, Randal E. Bryant of Carnegie Mellon University, Randy H. Katz of the University of California, Berkeley, and Edward D. Lazowska of the University of Washington.

Their endorsement lent intellectual credibility to Big Data. Rod A. Smith, an I.B.M. technical fellow and vice president for emerging Internet technologies, says he likes the term because it nudges people’s thinking up from the machinery of data-handling or precise measures of the volume of data.

“Big Data is really about new uses and new insights, not so much the data itself,” Mr. Smith says.
