Search and NLP Programming
All related to search engines and language processing
Curated by Alberto Paro
Rescooped by Alberto Paro from Search Engine Technologies ( E-Commerce Search, Natural Language Processing, Solr, Lucene, Elasticsearch, etc)

solr-synonyms

Via Carlos Sponchiado (Sponch)

Rescooped by Alberto Paro from JavaScript for Line of Business Applications

Reactive, real-time log search with Play, Akka, AngularJS and Elasticsearch

So, I’ve decided to contribute an Activator Template to TypeSafe (will submit soon, promise!). Having recently become more and more involved in Elasticsearch, I saw a great opportunity to put together a neat “reactive” application combining Play & Akka with the “bonsai cool” percolation feature of Elasticsearch. Then, to put a cherry on top, use AngularJS on the client-side to create a dynamically updating UI.

What I came up with is slightly contrived – a very basic real-time log entry search tool – but I think it provides a really nice base for apps that want to integrate this bunch of technologies.


Via Jan Hesse

Rescooped by Alberto Paro from Scala & Cloud Playing

OpenStack Local LVM instance storage (Folsom)

I’ve been playing with OpenStack on and off since it was released, but recently I had the opportunity to finally build a production cluster. …

Via Wonil Lee Ph.D.

Scooped by Alberto Paro

Content-based image classification in Python

Define the Scope

We're going to write a script to predict whether an image is a check or a driver's license. Then we'll publish the script in a manner suitable for use within your team's software application.

 

Scooped by Alberto Paro

[Free eBook] Analyzing the Analyzers - O'Reilly Media

There has been intense excitement in recent years around activities labeled "data science," "big data," and "analytics." However, the lack of clarity around these terms and, particularly, around the skill sets and capabilities of their practitioners has led to inefficient communication between "data scientists" and the organizations requiring their services. This lack of clarity has frequently led to missed opportunities. To address this issue, we surveyed several hundred practitioners via the Web to explore the varieties of skills, experiences, and viewpoints in the emerging data science community.

Scooped by Alberto Paro

An Architecture for Real-Time Geo-tracking with Python, Celery, RabbitMQ, and More

One of the key challenges with this type of architecture is when you need to get state from the database. Clients end up polling for many real-time tracking scenarios. This presents a problem when, for example, you have 1000s of vehicles confirming state every second. Polling the database at this rate can create an overwhelming and unnecessary amount of traffic. And it doesn’t scale.
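
To make the polling problem concrete, here is a minimal sketch that counts database reads when every client polls once per second, and compares it with a push model where updates are published to subscribers. `StateStore`, `Broker`, and both `simulate_*` functions are hypothetical in-process stand-ins written for this illustration; they are not part of Celery, RabbitMQ, or the linked article.

```python
from collections import defaultdict


class StateStore:
    """Stand-in for the database; counts how many reads it serves."""

    def __init__(self):
        self.state = {}
        self.reads = 0

    def write(self, vehicle_id, position):
        self.state[vehicle_id] = position

    def read(self, vehicle_id):
        self.reads += 1
        return self.state.get(vehicle_id)


class Broker:
    """Hypothetical in-process publish/subscribe hub (stands in for a message queue)."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, vehicle_id, callback):
        self.subscribers[vehicle_id].append(callback)

    def publish(self, vehicle_id, position):
        for callback in self.subscribers[vehicle_id]:
            callback(vehicle_id, position)


def simulate_polling(n_vehicles, n_seconds):
    """One client per vehicle polls the store once per second; returns total reads."""
    store = StateStore()
    for second in range(n_seconds):
        for v in range(n_vehicles):
            store.write(v, (second, v))
        for v in range(n_vehicles):
            store.read(v)
    return store.reads


def simulate_push(n_vehicles, n_seconds):
    """Updates are pushed to subscribers; returns (store reads, updates delivered)."""
    store = StateStore()
    broker = Broker()
    received = []
    for v in range(n_vehicles):
        broker.subscribe(v, lambda vid, pos: received.append((vid, pos)))
    for second in range(n_seconds):
        for v in range(n_vehicles):
            store.write(v, (second, v))
            broker.publish(v, (second, v))
    return store.reads, len(received)
```

In this toy model the polling variant issues one read per vehicle per second, while the push variant delivers the same number of updates without reading the store at all.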

Brent Hoover's curator insight, June 30, 2013 12:44 PM

I have implemented a very similar stack on a very high-volume site; instead of Node/WebSocket we used Erlang/ejabberd, which offers many advantages over the Node approach. Location is certainly still one of the big areas of mobile development that hasn't been fully exploited.

Scooped by Alberto Paro

Introducing Cloudera Search

Powered by Apache Solr™, the enterprise standard for open source search, Cloudera Search integrates with the 100% open source Big Data platform, CDH, to bring scale and reliability for a new generation of search – Big Data search.

Rescooped by Alberto Paro from playframework

Using Guice with Play! Framework 2.1 for easy Dependency Injection | 42 Engineering

Via opensas

Rescooped by Alberto Paro from PDG Web Development

Scala Liftweb : Re render page element without making any request to the server

Lift introduces a powerful feature that enables developers to create rich interactive applications using Comet and Ajax support. Using Comet, it's easy to push content from server to browser by send...


Via Kun Le

Rescooped by Alberto Paro from Scala & Cloud Playing

Play 2.0 mindmap

Via opensas, Wonil Lee Ph.D.

Rescooped by Alberto Paro from playframework

Abstractivate: From imperative to data flow to functional style

Via opensas

Rescooped by Alberto Paro from Solr

Turbocharging Solr Index Replication with BitTorrent « Code as Craft

Many of you probably use BitTorrent to download your favorite ebooks, MP3s, and movies. At Etsy, we use BitTorrent in our production systems for search replication.

Search at Etsy
Search at Etsy has grown significantly over the years. In January of 2009 we started using Solr for search. We used the standard master-slave configuration for our search servers with replication.

All of the changes to the search index are written to the master server. The slaves are read-only copies of master which serve production traffic. The search index is replicated by copying files from the master server to the slave servers. The slave servers poll the master server for updates, and when there are changes to the search index the slave servers will download the changes via HTTP. Our search indexes have grown from 2 GB to over 28 GB over the past 2 years, and copying the index from the master to the slave nodes became a problem.
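
The polling scheme described above can be sketched as a small simulation. The `Master` and `Slave` classes below are hypothetical and written only for illustration; they are not Solr's replication API. They assume each slave remembers the last index version it copied and fetches only the files that changed since then.

```python
class Master:
    """Holds the writable index; every commit bumps the index version."""

    def __init__(self):
        self.version = 0
        self.files = {}        # file name -> (version written, content)
        self.bytes_served = 0  # total bytes sent to all slaves

    def commit(self, changes):
        self.version += 1
        for name, content in changes.items():
            self.files[name] = (self.version, content)

    def changed_since(self, version):
        return [name for name, (v, _) in self.files.items() if v > version]

    def fetch(self, name):
        content = self.files[name][1]
        self.bytes_served += len(content)
        return content


class Slave:
    """Read-only copy that polls the master and downloads changed files."""

    def __init__(self, master):
        self.master = master
        self.version = 0
        self.files = {}
        self.bytes_downloaded = 0

    def poll(self):
        """Return True if new index files were downloaded, False if up to date."""
        if self.master.version == self.version:
            return False
        target = self.master.version
        for name in self.master.changed_since(self.version):
            content = self.master.fetch(name)
            self.files[name] = content
            self.bytes_downloaded += len(content)
        self.version = target
        return True
```

In this model every slave downloads every changed file from the one master, so the bytes the master serves grow with the index size multiplied by the number of slaves.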


Via Steven Casey ☕

Scooped by Alberto Paro

From Solr to elasticsearch

Search is right at the centre of GOV.UK. It’s the main focus of the homepage and it appears in the corner of every single page. Many of our recent and upcoming apps such as licence finder also rely...

Rescooped by Alberto Paro from Big Data Security Analytics

Use rsyslog and ElasticSearch for Powerful Log Aggregation | Puppet Labs

With rsyslog and ElasticSearch, you can aggregate log events to a database for easier, more powerful search. Here's how from our friends at Rackspace.

Via cysap

Rescooped by Alberto Paro from playframework

Name based extractors in Scala 2.11 - Heiko's Blog

Update: Thanks to @xuwei_k and @eed3si9n I have learned that value classes which mix in a universal trait incur the cost of allocation. Therefore I …

Via opensas

Scooped by Alberto Paro

Search theory and big data: Applying the math that sank U-boats to today's intel problems

In simple language, search theory uses advanced math to help calculate where your target may be. In recent years, the discipline has been revitalized to help hunt insurgents, missile sites and improvised explosive devices. Now the Pentagon is exploring whether it can help sort and simplify the massive volumes of data compiled by modern ISR sensors.

 

Scooped by Alberto Paro

Random Forests in Python

Random forest is a highly versatile machine learning method with numerous applications ranging from marketing to healthcare and insurance. It can be used to model the impact of marketing on customer acquisition, retention, and churn or to predict disease risk and susceptibility in patients.

Random forest is capable of regression and classification. It can handle a large number of features, and it's helpful for estimating which of your variables are important in the underlying data being modeled.

This is a post about random forests using Python.
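
As a rough illustration of the idea, the sketch below builds a toy forest using only the standard library. It is not the code from the linked post and not a full random forest: each "tree" is a single-split decision stump, trained on a bootstrap sample with one randomly chosen feature, and the forest predicts by majority vote. All function names (`gini`, `fit_stump`, `fit_forest`, `predict`) are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(rows, labels, feature):
    """Pick the threshold on one feature that minimises weighted Gini impurity."""
    best = None
    values = sorted(set(r[feature] for r in rows))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, t,
                    Counter(left).most_common(1)[0][0],
                    Counter(right).most_common(1)[0][0])
    if best is None:
        # Only one distinct value: always predict the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    _, t, left_label, right_label = best
    return (feature, t, left_label, right_label)


def fit_forest(rows, labels, n_trees=15, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]
```

A library implementation grows deep trees and also reports feature importances; this sketch only shows the bagging and random-feature ingredients.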

Scooped by Alberto Paro

Data Scientists vs. Data Engineers

More and more frequently we see organizations make the mistake of mixing and confusing team roles on a data science or "big data" project - resulting in over-allocation of responsibilities assigned to data scientists. For example, data scientists are often tasked with the role of data engineer leading to a misallocation of human capital. Here the data scientist wastes precious time and energy finding, organizing, cleaning, sorting and moving data. The solution is adding data engineers, among others, to the data science team.

Rescooped by Alberto Paro from Designer's Resources

What’s new for designers, June 2013

The June edition of what's new for web designers and developers includes new web apps, JavaScript resources, Photoshop scripts, web development tools, color pal

Via Mark Strozier
Bootstraptor's comment, June 20, 2013 7:08 AM
thank you for this!
ramil navidad's comment, July 13, 2013 6:10 PM
nice collection....definitely going to check on some of the suggestions
Bogdan Dimitrov's comment, July 15, 2013 5:03 AM
Great article. Thank You for sharing!

Scooped by Alberto Paro

Scoring Engine Via PMML Makes Hadoop Easier

Big data application company Concurrent has introduced Pattern, a free and open source "scoring engine" for data professionals to use when deploying machine-learning applications on Apache Hadoop.

Rescooped by Alberto Paro from playframework

Valid Scala vs. valid HTML : Play views vs. Lift Views - Alexander Chepurnoy

Jan 29th, 2013. Play2 and Lift are major web frameworks in the Scala industry. And …

Via opensas

Rescooped by Alberto Paro from playframework

Version Aware Play Framework Application | 42 Engineering


Via opensas

Rescooped by Alberto Paro from playframework

Using subprojects - Joerg Viola

Decomposing your application into subprojects can reduce build times a lot. Also, it adds structure to your source tree. The …

Via opensas

Rescooped by Alberto Paro from Solr

Apache Lucene and Solr 4.0 alpha | Solr Enterprise Search

Today the Apache Lucene and Solr PMC announced the release of the 4.0 alpha version of the Apache Lucene library and the Apache Solr search server. Compared to 3.6, there were some major changes introduced about which ...

Via Steven Casey ☕