Bits 'n Pieces on Big Data
Innovative information and insight into Big Data (if you like the content, please consider donating to my bitcoin address #3Pjof6N9xRAYXXSPZ4EAFLfHGn51ZdPcxi)
Curated by onur savas
Scooped by onur savas!

Scientific method: Statistical errors

P values, the 'gold standard' of statistical validity, are not as reliable as many scientists assume.
Rescooped by onur savas from Big Data and NoSQL Daily!

Apache Spark for Big Analytics


Via Simon Hunanyan
Simon Hunanyan's curator insight, December 23, 2013 10:09 PM

Spark, an Apache incubator project, is an open source distributed computing framework for advanced analytics in Hadoop. It is reported to run up to 100x faster than MapReduce. Spark includes a machine learning library (MLlib), a graph engine (GraphX), a streaming analytics engine (Spark Streaming), and more.

Currently, Spark supports programming interfaces for Scala, Java, and Python. An R interface is under development and is expected to be released in the first half of 2014.

Rescooped by onur savas from Papers!

Who is Dating Whom: Characterizing User Behaviors of a Large Online Dating Site

Online dating sites have become popular platforms for people to look for potential romantic partners. It is important to understand users' dating preferences in order to make better recommendations on potential dates. The message sending and replying actions of a user are strong indicators of what he/she is looking for in a potential date and reflect the user's actual dating preferences. We study how users' online dating behaviors correlate with various user attributes using a large real-world dataset from a major online dating site in China. Many of our results on user messaging behavior align with notions in social and evolutionary psychology: males tend to look for younger females while females put more emphasis on the socioeconomic status (e.g., income, education level) of a potential date. In addition, we observe that the geographic distance between two users and the photo count of users play an important role in their dating behaviors. Our results show that it is important to differentiate between users' true preferences and random selection. Some user behaviors in choosing attributes in a potential date may largely be a result of random selection. We also find that both males and females are more likely to reply to users whose attributes come closest to the stated preferences of the receivers, and there is significant discrepancy between a user's stated dating preference and his/her actual online dating behavior. These results can provide valuable guidelines to the design of a recommendation engine for potential dates.


Who is Dating Whom: Characterizing User Behaviors of a Large Online Dating Site
Peng Xia, Kun Tu, Bruno Ribeiro, Hua Jiang, Xiaodong Wang, Cindy Chen, Benyuan Liu, Don Towsley

Via Complexity Digest