crawler
master
path: root/src
Age
Commit message
Author
2012-08-11
google url normalization works on Windows, test1 must be improved:
Andreas Baumann
2012-08-11
added a file spooling buffer in libfetch rewind input stream, a little
Andreas Baumann
2012-08-11
fixed memory buffering in libfetch fetcher
Andreas Baumann
2012-08-10
started adapting googleurl on Windows, ICU integration pending
Andreas Baumann
2012-08-10
fixed simple url normalizer and its tests
Andreas Baumann
2012-08-10
some more Windows modules
Andreas Baumann
2012-08-10
module loader works on Windows, simple URL normalizer test works
Andreas Baumann
2012-08-10
fixed wrong initialization order in RewindInputStream
Andreas Baumann
2012-08-10
first porting attempts to Windows:
Andreas Baumann
2012-08-09
-
Andreas Baumann
2012-08-09
added spooling to LibFetchRewindInputStream in order to support rewind
Andreas Baumann
2012-08-09
better libmagic buffer detection with increasing buffer on stream
Andreas Baumann
2012-08-08
added a file rewind input stream
Andreas Baumann
2012-08-08
handle sigint
Andreas Baumann
2012-08-08
-
Andreas Baumann
2012-08-08
modularized all other modules
Andreas Baumann
2012-08-08
chain filter and modules with one ctor param work now
Andreas Baumann
2012-08-08
more testing and docu around Type*
Andreas Baumann
2012-08-07
started modularization of URL filters
Andreas Baumann
2012-08-07
allow modules to be linked as static libraries, mainly to be able to
Andreas Baumann
2012-08-07
combined the two url normalizer tests
Andreas Baumann
2012-08-07
more reduction of module code and fixed dependency problem when building
Andreas Baumann
2012-08-07
cleaned up url normalizer tests and made them use module loader
Andreas Baumann
2012-08-07
reduced some code duplication when registering modules
Andreas Baumann
2012-08-06
removed some debug prints
Andreas Baumann
2012-08-06
using typeinfo to find correct destruction function for loadable module objects
Andreas Baumann
2012-08-06
first steps to make URL loader loadable
Andreas Baumann
2012-08-04
cleaned up interface of GoogleURLNormalizer API
Andreas Baumann
2012-08-04
brutal testing and normalization of Google URL, must refactor most things the...
Andreas Baumann
2012-08-04
rearranged google test1 and added a GoogleUrlNormalizer
Andreas Baumann
2012-08-03
tamed some debug output
Andreas Baumann
2012-08-03
basic normalization
Andreas Baumann
2012-08-03
fighting with reverse iterators for url normalization
Andreas Baumann
2012-07-29
-
Andreas Baumann
2012-07-29
somewhat working again
Andreas Baumann
2012-07-29
temporarily removed domain, domain filter is a host filter now
Andreas Baumann
2012-07-29
started to add simple parseUrl implementation
Andreas Baumann
2012-07-28
heavy redesign of URL class, must not contain any parsing logic as
Andreas Baumann
2012-07-28
started to add URL normalizers and testing environment for URLs
Andreas Baumann
2012-07-19
some interface fixes
Andreas Baumann
2012-07-18
fixed memory frontier
Andreas Baumann
2012-07-18
added URLSeen component
Andreas Baumann
2012-07-15
some investment in URL parsing
Andreas Baumann
2012-07-15
started to add URL filters
Andreas Baumann
2012-07-14
some pseudo URL normalization
Andreas Baumann
2012-07-14
first working crawler
Andreas Baumann
2012-07-14
added streamhtmlparser
Andreas Baumann
2012-07-14
first fetch works
Andreas Baumann
2012-07-13
-
Andreas Baumann
2012-07-13
added a test for a libfetch_streambuf
Andreas Baumann