author | Andreas Baumann <abaumann@yahoo.com> | 2014-04-30 16:46:00 +0200 |
---|---|---|
committer | Andreas Baumann <abaumann@yahoo.com> | 2014-04-30 16:46:00 +0200 |
commit | 12c50867c04b2c2a11f5026466bbea02d5406b70 (patch) | |
tree | 4008a8d5e3660d823197f97b3c0b244fa37d3ea1 /TODOS | |
parent | eb3771cafb98451116a4f0ec0e7a371800770de1 (diff) | |
started a robots.txt parser
Diffstat (limited to 'TODOS')
-rwxr-xr-x | TODOS | 4 |
1 files changed, 4 insertions, 0 deletions
@@ -12,3 +12,7 @@
 - content based type detection on Windows
 - port of libmagic?
 - something from Microsoft (around the index service)?
+- robots.txt
+ - handle Sitemap
+- Parse URLs from sitemaps
+