Mercator, the "AltaVista" robot
http://mercator.comm.nsdlib.org/
authors working for Microsoft now :-)
Some Java robot frameworks:
heritrix
crawler4j
mainly dead or unusable:
jspider
websphinx
A C++ web robot
http://code.google.com/p/whalebot/
JavaScript support
phantomjs http://code.google.com/p/phantomjs/
https://github.com/mikeal/spider
https://github.com/joshfire/node-crawler
PHP
http://www.makeuseof.com/tag/build-basic-web-crawler-pull-information-website/
Streams
http://www.mr-edd.co.uk/blog/beginners_guide_streambuf
Lua embedding
http://www.ibm.com/developerworks/linux/library/l-embed-lua/