From ceae68b94f60005bd3a8a6704320abf2c8e18728 Mon Sep 17 00:00:00 2001
From: Andreas Baumann
Date: Sun, 8 Jul 2012 20:53:48 +0200
Subject: some doc links

---
 docs/LINKS | 11 +++++++++++
 1 file changed, 11 insertions(+)
 (limited to 'docs')

diff --git a/docs/LINKS b/docs/LINKS
index 568183f..afa1082 100644
--- a/docs/LINKS
+++ b/docs/LINKS
@@ -5,8 +5,19 @@
 http://mercator.comm.nsdlib.org/
 authors working for Microsoft now :-)
 heritrix
+crawler4j
 
 mainly dead or unusable:
 
 jspider
 websphinx
+
+Javascript support
+
+phantomjs http://code.google.com/p/phantomjs/
+https://github.com/mikeal/spider
+https://github.com/joshfire/node-crawler
+
+Php
+
+http://www.makeuseof.com/tag/build-basic-web-crawler-pull-information-website/
-- 
cgit v1.2.3-54-g00ecf