author | Andreas Baumann <mail@andreasbaumann.cc> | 2014-10-09 08:59:02 +0200 |
---|---|---|
committer | Andreas Baumann <mail@andreasbaumann.cc> | 2014-10-09 08:59:02 +0200 |
commit | 7d8b1ff684b412da292e0fc734748975188a0f10 (patch) | |
tree | 2673e3da51cc80bfc38a426048b30a4d71c31d4c /src/crawl/crawl.conf | |
parent | 62c5bb90525baf0d82c23892c2666f611750d63c (diff) | |
First trials with a Google normalizer called from Lua. The current problems are
std::string and the missing wrapper for the URL class.
Also added a local 'tolua', which we will have to hack.
Diffstat (limited to 'src/crawl/crawl.conf')
-rw-r--r-- | src/crawl/crawl.conf | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/src/crawl/crawl.conf b/src/crawl/crawl.conf
index ddc1da6..a524eaf 100644
--- a/src/crawl/crawl.conf
+++ b/src/crawl/crawl.conf
@@ -1,3 +1,7 @@
+local normalizer = GoogleURLNormalizer:new( )
+local baseUrl = normalizer:parseUrl( "http://www.base.com" )
+-- normalizer:normalize( base, "/relativedir/relativefile.html" )
+
 -- global setting
 
 crawler = {
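
The commit message names std::string as the current obstacle when calling the normalizer from Lua. One possible workaround, sketched here under the assumption that the binding generator can map `const char*` to a Lua string but cannot handle `std::string`, is a thin facade that only exposes C strings at the boundary. `StubNormalizer` and `NormalizerFacade` are hypothetical names used for illustration; the stub merely joins the two strings and is not the Google URL canonicalization code, nor necessarily the approach this repository ended up taking.

```cpp
#include <string>

// Hypothetical stand-in for the real normalizer. It only joins a base URL
// and a relative path with exactly one '/' between them; it does NOT
// implement real URL normalization.
class StubNormalizer {
public:
    std::string normalize(const std::string& base, const std::string& rel) const {
        if (rel.empty()) return base;
        const bool baseSlash = !base.empty() && base.back() == '/';
        const bool relSlash = rel.front() == '/';
        if (baseSlash && relSlash) return base + rel.substr(1);
        if (!baseSlash && !relSlash) return base + "/" + rel;
        return base + rel;
    }
};

// Hypothetical facade: the only types crossing the scripting boundary are
// C strings, so the binding layer never has to marshal std::string.
class NormalizerFacade {
public:
    // The returned pointer stays valid until the next call to normalize()
    // on the same object, because it points into m_last.
    const char* normalize(const char* base, const char* rel) {
        m_last = m_impl.normalize(base ? base : "", rel ? rel : "");
        return m_last.c_str();
    }

private:
    StubNormalizer m_impl;
    std::string m_last;  // owns the storage behind the returned pointer
};
```

The trade-off of this design is that the result is only valid until the next call, which is acceptable when the scripting side copies the string immediately, as Lua does when a C string is pushed onto its stack.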