NEW YORK, NY--(Marketwired - Oct 28, 2013) - STRATA -- At the Strata Conference + Hadoop World 2013, Paxata launched the first self-service Adaptive Data Preparation™ platform that lets business analysts rapidly collect, explore, transform and...
Mohsen Arjmandi's insight:
Paxata is a big data startup that came out of stealth mode on October 28th, secured $8M in Series B funding from Accel, and partnered with Tableau.
Their main focus was on social media analysis, but their new tool is the Adaptive Data Preparation platform, which prunes and processes data using pattern recognition and anomaly detection algorithms and provides services like:
- similarity detection
- graph visualization
- data classification
- "automatically detects and highlights patterns and anomalies within the data so analysts have a visual map to resolve both syntactic and semantic data qualities issues, rapidly improving the quality of large data sets."
Some odd reasons that made Rackspace choose Hortonworks
Mohsen Arjmandi's insight:
Hortonworks' Hadoop distribution is unique in that it uses only the standard release of open-source Apache Hadoop. Whereas other Hadoop distribution vendors (such as Cloudera and MapR) offer features that in many cases bring significant value-added functionality, Hortonworks chooses to support only the most current stable release of the framework.
According to John Engates, CTO of Rackspace, this factored into the company’s decision to run with Hortonworks. “The Hortonworks Data Platform packages the open source Apache version of Hadoop,” he commented. “That aligns with our vision of an open cloud future that eliminates fear of vendor lock-in, and allows customers to confidently invest in a technology for the long term.”
For end users, though MapReduce Java code is common, any programming language can be used with "Hadoop Streaming" to implement the "map" and "reduce" parts of the user's program. Apache Pig and Apache Hive, among other related projects, expose higher-level user interfaces, namely Pig Latin and a SQL variant respectively.
There is support for the S3 file system in Hadoop distributions, and the Hadoop team generates EC2 machine images after every release. From a pure performance perspective, Hadoop on S3/EC2 is inefficient, as the S3 file system is remote and delays returning from every write operation until the data is guaranteed not to be lost. This removes the locality advantages of Hadoop, which schedules work near data to save on network load.