public marks

PUBLIC MARKS with tags webcrawler & ruby

22 August 2006 22:00

Ariel

by dcancel
A library that lets you extract information from semi-structured documents (such as websites). Ariel uses a small number of labeled examples to generate and learn effective extraction rules.

22 August 2006 01:00

RDig - Ferret based full text search for web sites

by dcancel
RDig provides an HTTP crawler and content extraction utilities to help build site search for web sites or intranets. Internally, it uses Ferret for full text indexing.

PUBLIC TAGS related to tag webcrawler

delicious + opensource + ruby + rubyonrails + search + SearchEngine + 搜索引擎

Active users

dcancel
last mark: 22/08/2006 22:29