
# Search index with strus

# For now, create an XML file from the content. Later, use a directory iterator
# over 'content' that reads the TOML/YAML headers and the Markdown directly.
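
# A minimal sketch of what such a directory iterator could look like, using
# only find, sort and awk. The helper names (list_posts, front_matter, body)
# are hypothetical and not part of this repository. It assumes each content
# file starts with a front matter block delimited by '---' (YAML) or '+++'
# (TOML) lines, and that file names contain no newlines.

```shell
#!/bin/sh

# Hypothetical helper: list all Markdown files below a content directory
# (default: 'content'), in a stable order.
list_posts() {
    find "${1:-content}" -type f -name '*.md' | sort
}

# Hypothetical helper: print the front matter of one file, i.e. the lines
# between the opening delimiter on line 1 and the matching closing delimiter.
front_matter() {
    awk '
        NR == 1 && ($0 == "---" || $0 == "+++") { delim = $0; inside = 1; next }
        inside && $0 == delim { exit }
        inside { print }
    ' "$1"
}

# Hypothetical helper: print the Markdown body, i.e. everything after the
# closing delimiter (or the whole file if there is no front matter).
body() {
    awk '
        NR == 1 && ($0 == "---" || $0 == "+++") { delim = $0; inside = 1; next }
        inside && $0 == delim { inside = 0; next }
        !inside { print }
    ' "$1"
}
```

# Possible usage: list_posts content | while read -r f; do front_matter "$f"; done
# The header and body would still have to be converted to whatever the
# analyzer expects; that part is not covered here.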

# TODO: this becomes obsolete with a Hugo segmenter which understands
# YAML/TOML/JSON and Markdown:
# remarshal (https://github.com/dbohdan/remarshal)
# pandoc (http://pandoc.org/)
# client-side needs:
# https://github.com/fortnightlabs/snowball-js

./create_xml.sh > posts.xml

xmllint -noout posts.xml

# Test the configuration of the document analysis:

strusAnalyze document.ana posts.xml |& less

# Create the strus search index:

rm -rf storage
mkdir storage
strusCreate -s 'path=storage/wwwandreasbaumanncc; metadata=doclen UINT16, publish_date UINT16'
strusInsert -c 1000 -f 1 -t 1 -s "path=storage/wwwandreasbaumanncc" document.ana posts.xml
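
# The validation and index creation steps above could be wrapped in one
# function, so that a broken posts.xml does not wipe the existing index.
# This is only a sketch: the function name rebuild_index and the variables
# STORAGE_DIR and STORAGE_PATH are hypothetical, while the commands and
# options are copied unchanged from the steps above.

```shell
#!/bin/sh

# Hypothetical variables; the defaults match the paths used above.
STORAGE_DIR=${STORAGE_DIR:-storage}
STORAGE_PATH=${STORAGE_PATH:-$STORAGE_DIR/wwwandreasbaumanncc}

rebuild_index() {
    # Check posts.xml before touching the old index; on failure the
    # existing storage directory is left as it is.
    xmllint -noout posts.xml || return 1

    # Each step only runs if the previous one succeeded.
    rm -rf "$STORAGE_DIR" &&
    mkdir "$STORAGE_DIR" &&
    strusCreate -s "path=$STORAGE_PATH; metadata=doclen UINT16, publish_date UINT16" &&
    strusInsert -c 1000 -f 1 -t 1 -s "path=$STORAGE_PATH" document.ana posts.xml
}
```

# Possible usage: ./create_xml.sh > posts.xml && rebuild_index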