+++
title = "Lucene Index Dumper"
description = "a mini-contribution to [Lucene](http://lucene.apache.org/)"
+++

LuceneAnalyzer is a quick hack for dumping and inspecting a Lucene index. Something for the 'sort-uniq-cut-awk' guys out there. :-)

* release 0.0.4 (for Lucene 3.1)
    * [binaries, version 0.0.4](/luceneanalyzer/luceneanalyzer-0.0.4.tgz)
    * [sources, version 0.0.4](/luceneanalyzer/luceneanalyzer-0.0.4-src.tgz)
* release 0.0.3 (for Lucene 2.x)
    * [binaries, version 0.0.3](/luceneanalyzer/luceneanalyzer-0.0.3.tgz)
    * [sources, version 0.0.3](/luceneanalyzer/luceneanalyzer-0.0.3-src.tgz)

Show global statistics of the index:

```
shell> ./luceneanalyzer -g /dir_to_some_lucene_index
Global Information:
===================
number of documents: 17
total number of features: 955
total number of tokens: 1442
version: 1328361447856
still current: true
maximal document number: 17
has deletions: false
```

Show field information:

```
shell> ./luceneanalyzer -f /dir_to_some_lucene_index
Field Information:
==================
Fields of type 'ALL':
store_0_coordinate
text
...
Fields of type 'INDEXED_WITH_TERMVECTOR':
includes
Fields of type 'TERMVECTOR':
Fields of type 'TERMVECTOR_WITH_OFFSET':
Fields of type 'TERMVECTOR_WITH_POSITION':
Fields of type 'TERMVECTOR_WITH_POSITION_OFFSET':
includes
Fields of type 'UNINDEXED':
store
```

Show information about terms, statistics and positions:

```
shell> ./luceneanalyzer -t -vv /dir_to_some_lucene_index
Terms:
======
cat camera 12[0]
cat connector 3[0],4[0]
cat copier 11[0]
cat electronics 1[0],2[0],3[0],4[0],5[0],6[0],7[0],8[0],9[0],10[0],11[0],12[0],15[0],16[0]
...
ext using 13[415]
text utf 14[3]
text v 8[2]
text va902b 9[1]
text valueselect 7[1]
```

A Git repository is accessible at **git://git.andreasbaumann.cc/LuceneAnalyzer.git** (or at http://git.andreasbaumann.cc/cgit/LuceneAnalyzer/).

In case of questions, contact me via email.
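
As a rough illustration of the 'sort-uniq-cut-awk' style of post-processing mentioned above, here is a minimal sketch that summarizes a term dump. It assumes each term line has the shape `field term postings`, as in the sample output of `-t -vv`, and that every `N[...]` entry in the third column refers to one document. That reading is inferred from the sample, not from the tool's documentation. The function names (`sample_terms`, `terms_per_field`, `doc_freq`) are made up for this example, and the sample lines are copied from the dump above so the script runs without an index.

```shell
#!/bin/sh
# Stand-in for `./luceneanalyzer -t -vv /dir_to_some_lucene_index`:
# a few lines copied from the sample term dump, including its header.
sample_terms() {
cat <<'EOF'
Terms:
======
cat camera 12[0]
cat connector 3[0],4[0]
cat copier 11[0]
text utf 14[3]
text v 8[2]
EOF
}

# Number of terms per field.
# Assumption: column 1 is the field name; lines with fewer than three
# columns (headers, "...") are not term lines and are skipped.
terms_per_field() {
    awk 'NF >= 3 { print $1 }' | sort | uniq -c | awk '{ print $2, $1 }'
}

# Number of postings entries per term, most frequent first.
# Assumption: each "[" in column 3 opens the entry for one document.
doc_freq() {
    awk 'NF >= 3 { n = gsub(/\[/, "[", $3); print $1, $2, n }' |
        sort -k3,3nr -k1,1 -k2,2
}

sample_terms | terms_per_field
sample_terms | doc_freq
```

Against a real index, the same functions could be fed from the tool directly, for example `./luceneanalyzer -t -vv /dir_to_some_lucene_index | doc_freq | head`.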