Blogs.Oracle.Com - The Data Warehouse Insider
I am less and less often mistaken for a pirate when I mention the R language. While I miss the excuse to wear an eyepatch, I'm glad more people are beginning to explore a statistical language I've been touting for years. When it comes to plotting or running complex statistics in a single line of code, R is a great tool to have. That said, there are plenty of pitfalls for the casual or new user: syntax, learning to write vectorized code, or even just knowing which "apply" function you really should choose.
I want to explore a less often considered aspect of R development: parallelism. Out of the box, a standard R session does its work on a single core, which can seem very limited to someone used to working on compute clusters or even a multicore server. However, there are a few tricks we can use to get the most out of R on everything from a personal workstation to a Hadoop cluster.