BigData
Scooped by Guru Dharmateja Medasani

Stanford scientists put free text-analysis tool on the web

Using Avro in MapReduce jobs with Hadoop, Pig, Hive - Michael G. Noll

Example MapReduce jobs in Java, Hadoop Streaming, Pig and Hive that read and/or write data in Avro format.

[HIVE-3260] support bucketed mapjoin where the small table has different number of buckets for different partitions - ASF JIRA

How we do Open Data: #1 - choosing development indicators

A recent question from Lorenz Noe caught our eye - how do we choose which indicators to publish in World Development Indicators (WDI), a major part of our Open Data Initiative? It’s a good question, so I thought I’d write a post about that - and we’ll also post something similar in the data help desk.

1. There’s no perfect indicator

There are sometimes gaps in the data

Like many things in life, selecting indicators for the WDI is not an exact science. The intention is to provide good coverage of key development issues, but many of the countries that we work with do not have the quantity - or quality - of data that exists in countries like the United States, for example.

Kick Start Hadoop: How to create partitions on the fly in Hive tables based on data? OR How can you implement Dynamic Partitions in Hive for Larger Tables?

Programming Hive

This comprehensive guide introduces you to Apache Hive, Hadoop’s data warehouse infrastructure. You’ll quickly learn how to use Hive’s SQL dialect—HiveQL—to summarize, query, and analyze large datasets stored in...

explain my data: Spark should be better than MapReduce (if only it worked)

Chart: The top 25 companies for pay and benefits - GeekWire

People looking for high pay and good benefits are finding that working in the tech industry comes with plenty of perks. Glassdoor released its rankings of

Hadoop Tutorial: Map-Reduce on YARN Part 1 -- Overview and Installa...

Full tutorial with source code and pre-configured virtual machine available at http://www.coreservlets.com/hadoop-tutorial/

Downloads website for Solr

Guru Dharmateja Medasani's insight:

Website to download the different versions of Solr

Fatal Error Log - Troubleshooting Guide for Java SE 6 with HotSpot VM

5 Tips for efficient Hive queries - Qubole

Hive on Hadoop makes data processing so straightforward and scalable that we can easily forget to optimize our Hive queries. Well designed tables and queries can greatly improve your query speed and reduce processing cost. This article includes five tips, which are valuable for ad-hoc queries, to save time, as much as for regular ETL …

[HIVE-3171] Bucketed sort merge join doesn't work when multiple files exist for small alias - ASF JIRA

Partitions & Buckets in #Hive

In my previous post, we discussed the map, array and struct data types and their implementation in Hive. Continuing on the Hive theme, this post will introduce partitioning and bucketing as method...

How-to: Configure JDBC Connections in Secure Apache Hadoop Environments

YouCubed - Join the Revolution

The new movement to revolutionize math teaching and learning. YouCubed is a nonprofit providing free and affordable K-12 mathematics resources and professional development for educators and parents.

Meet the algorithm that can learn "everything about anything"

Researchers from the Allen Institute for AI have built a computer system capable of teaching itself many facets of broad concepts by scouring and analyzing search engines using natural language processing and computer vision techniques.

A Million Little Files

My PC-oriented brain says it's easier to work with a million small files than one gigantic file. Hadoop says the opposite -- big files are stored contiguously on disk, so they can be read/written e...
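
One common response to the small-files problem described above is to pack many small files into a single larger one and remember where each piece starts. The sketch below is only a local-filesystem illustration of that idea in Python, not the linked post's method and not an HDFS API; the function name `merge_small_files` and the `(name, offset, length)` index layout are assumptions made for this example. On a real cluster, a Hadoop container format such as SequenceFile would normally play this role.

```python
import os


def merge_small_files(src_dir, dest_path):
    """Concatenate every regular file in src_dir into one file at dest_path.

    Hypothetical helper for illustration only. Returns an index of
    (name, offset, length) tuples so each original file can be located
    inside the merged file. dest_path should live outside src_dir,
    otherwise the output file would be picked up as an input.
    """
    index = []
    offset = 0
    with open(dest_path, "wb") as out:
        # Sort names so the layout of the merged file is deterministic.
        for name in sorted(os.listdir(src_dir)):
            path = os.path.join(src_dir, name)
            if not os.path.isfile(path):
                continue  # skip subdirectories and other non-files
            with open(path, "rb") as f:
                data = f.read()
            out.write(data)
            index.append((name, offset, len(data)))
            offset += len(data)
    return index
```

Reading one original file back is then a matter of seeking to its offset in the merged file and reading `length` bytes.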