e-Xploration
antropologiaNet, dataviz, collective intelligence, algorithms, social learning, social change, digital humanities
Curated by luiy
Scooped by luiy

artoo.js · The client-side #scraping companion | #ddj

luiy's insight:

Features

- Scrape everything, everywhere: invoke artoo in the JavaScript context of any web page.
- Loaded with helpers: Scrape data quick & easy with powerful methods such as artoo.scrape.
- Data download: Make your browser download the scraped data with artoo.save methods.
- Spiders: Crawl pages through ajax and retrieve accumulated data with artoo's spiders.
- Content expansion: Expand pages' content programmatically thanks to artoo.autoExpand utilities.
- Store: stash persistent data in the localStorage with artoo's handy abstraction.
- Sniffers: hook on XHR requests to retrieve circulating data with a variety of tools.
- Instructions: record the instructions typed into the console and save them for later use.
- jQuery: jQuery is injected alongside artoo in the pages you visit so you can handle the DOM easily.
- Custom bookmarklets: you can use artoo as a framework and easily create custom bookmarklets to execute your code.
- User Interfaces: build parasitic user interfaces easily with a creative usage of Shadow DOM.
- Chrome extension: trying to scrape a nasty page abiding by some sneaky HTML5 rules? Here, have a Chrome extension.

Scooped by luiy

Morph : Get structured #data out of the web | #crawlers #datascience

luiy's insight:

Morph: A Heroku for Scrapers

Get structured data out of the web

- All code and collaboration through GitHub
- Write your scrapers in Ruby, Python, PHP or Perl
- Simple API to grab data
- Schedule scrapers or run manually
- Process isolation via Docker
- Trivial to move scraper code and data from ScraperWiki Classic
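
To make "structured data out of the web" concrete, here is a minimal sketch of what a Python scraper of this kind might look like. It is not taken from Morph's documentation: the sample HTML, the `LinkParser` class, and the `scrape` and `save` helpers are all hypothetical, and the choice of a SQLite file named `data.sqlite` with a table named `data` is an assumption about the storage convention that you should check against the platform's own docs. It uses only the Python standard library.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would fetch HTML over HTTP instead.
SAMPLE_HTML = """
<ul>
  <li><a href="/a">First item</a></li>
  <li><a href="/b">Second item</a></li>
</ul>
"""


class LinkParser(HTMLParser):
    """Collects one {"url", "title"} row for every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            self.rows.append({"url": self._href, "title": title})
            self._href = None


def scrape(html):
    """Turn an HTML string into a list of structured rows."""
    parser = LinkParser()
    parser.feed(html)
    return parser.rows


def save(rows, db_path="data.sqlite"):
    """Upsert rows into a SQLite table (file and table names are assumed)."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS data (url TEXT PRIMARY KEY, title TEXT)"
        )
        conn.executemany(
            "INSERT OR REPLACE INTO data (url, title) VALUES (:url, :title)",
            rows,
        )
    conn.close()


if __name__ == "__main__":
    save(scrape(SAMPLE_HTML))
```

Keying the table on `url` and using `INSERT OR REPLACE` means a scheduled re-run updates existing rows rather than duplicating them, which suits a scraper that is run repeatedly.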
