Digital Humanitie...
Rescooped by Intriguing Networks from e-Xploration!

#Sentiment Analysis Tool Designed to #Predict Veterans’ Suicide Risk I #datascience

The Veterans Administration is funding the Durkheim Project, an effort to use text and sentiment analysis to predict veterans’ suicide risk.

Via luiy
Intriguing Networks's insight:

Important work in an obviously serious application, surely with lessons for DH, though hopefully without such severe ramifications. Very interesting; thanks to e-Xploration for the share.

luiy's curator insight, December 9, 2013 5:02 PM

That’s where computer scientist Chris Poulin and a semantics-based prediction tool enter the picture. Poulin and his company, Patterns & Predictions, had developed a commercial Bayesian analytics tool for predicting events—most notably financial events—based on historical analysis. “You have a stock that went bust on a certain date,” explains Poulin. “What were the forensic features that led up to that stock going bust?”

Ten years ago, as Poulin was launching the company, his best friend committed suicide. “He posted a suicide note, and what turned out to be pre-suicide notes, on social media,” says Poulin. As time went on, Poulin began to consider whether a similar event prediction model could parse the social media behavior of veterans to uncover those who might be about to harm themselves. Later, as a researcher at Dartmouth, Poulin partnered with Paul Thompson, an instructor at the university’s Geisel School of Medicine who specializes in computational linguistics and also lost a friend to suicide, and they took their pitch to the Pentagon. Three years ago, they were awarded a $1.7 million contract by the Defense Advanced Research Projects Agency (DARPA) to combine Thompson’s linguistics work with Poulin’s event-focused text analytics to create a model to predict those with suicidal or other harmful tendencies.


The effort is dubbed the Durkheim Project, after French sociologist Emile Durkheim, known for his 19th-century study of suicide data. The researchers ultimately hope to use opt-in data from veterans’ social media and mobile content to create a real-time predictive analytics tool for suicide risk. While the team behind the project is optimistic about its ability to make predictions with 65 percent accuracy, the challenge at this stage is persuading veterans to join the effort to gain insights into their well-being.

Intriguing Networks's curator insight, December 9, 2013 6:05 PM

Re-shared because it's such an important application, and perhaps it shows how the computational aspects are transferable between apparently unconnected use cases.

Rescooped by Intriguing Networks from Public Datasets - Open Data -!

A Programmer's Guide to #DataMining I #OpenBook #DataScience

Via Joaquín Herrero Pintado, Toni Sánchez, luiy
Intriguing Networks's insight:

Cheers, thanks for this. Handy for all budding DH students.

luiy's curator insight, December 8, 2013 2:51 PM

Table of Contents


This book’s contents are freely available as PDF files. When you click on a chapter title below, you will be taken to a webpage for that chapter. The page contains links for a PDF of that chapter and for any sample Python code and data that chapter requires. Please let me know if you see an error in the book, if some part of the book is confusing, or if you have some other comment. I will use these to revise the chapters.


Chapter 1: Introduction


Finding out what data mining is and what problems it solves. What you will be able to do when you finish this book.


Chapter 2: Get Started with Recommendation Systems


Introduction to social filtering. Basic distance measures including Manhattan distance, Euclidean distance, and Minkowski distance. Pearson Correlation Coefficient. Implementing a basic algorithm in Python.


Chapter 3: Implicit ratings and item-based filtering


A discussion of the types of user ratings we can use. Users can explicitly give ratings (thumbs up, thumbs down, 5 stars, or whatever) or they can rate products implicitly: if they buy an mp3 from Amazon, we can view that purchase as a ‘like’ rating.

Chapter 4: Classification


In previous chapters we used people’s ratings of products to make recommendations. Now we turn to using attributes of the products themselves to make recommendations. This approach is used by Pandora among others.


Chapter 5: Further Explorations in Classification


A discussion on how to evaluate classifiers including 10-fold cross-validation, leave-one-out, and the Kappa statistic. The k Nearest Neighbor algorithm is also introduced.


Chapter 6: Naïve Bayes


An exploration of Naïve Bayes classification methods. Dealing with numerical data using probability density functions.


Chapter 7: Naïve Bayes and unstructured text


This chapter explores how we can use Naïve Bayes to classify unstructured text. Can we classify Twitter posts about a movie as positive or negative reviews? (new version coming November 2013)
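
To give a flavour of the measures listed under Chapter 2, here is a minimal Python sketch of Minkowski distance (which reduces to Manhattan distance at r=1 and Euclidean distance at r=2) and the Pearson Correlation Coefficient. This is not code from the book: the function names, the dictionary layout, and the two users' ratings are all assumptions made up for illustration.

```python
from math import sqrt


def minkowski(ratings_a, ratings_b, r):
    """Minkowski distance over the items both users rated.

    r=1 gives Manhattan distance, r=2 gives Euclidean distance.
    Returns None when the users share no rated items.
    """
    shared = [item for item in ratings_a if item in ratings_b]
    if not shared:
        return None
    total = sum(abs(ratings_a[item] - ratings_b[item]) ** r for item in shared)
    return total ** (1 / r)


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the items both users rated.

    Returns 0.0 when there is no overlap or when either user's
    shared ratings have zero variance.
    """
    shared = [item for item in ratings_a if item in ratings_b]
    n = len(shared)
    if n == 0:
        return 0.0
    xs = [ratings_a[item] for item in shared]
    ys = [ratings_b[item] for item in shared]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denominator = sqrt(var_x * var_y)
    return covariance / denominator if denominator else 0.0


# Hypothetical ratings, invented for this example only.
ann = {"Album A": 5.0, "Album B": 3.0, "Album C": 1.0}
bob = {"Album A": 4.0, "Album B": 2.0, "Album D": 5.0}

manhattan = minkowski(ann, bob, 1)
euclidean = minkowski(ann, bob, 2)
correlation = pearson(ann, bob)
print(manhattan, euclidean, correlation)
```

Only items rated by both users enter the calculation, which is one common convention for sparse rating data; the book may handle missing ratings differently.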