Search in Solr with special characters

August 15, 2015

I have a problem searching for special characters in Solr. My document has a field "title", and it can contain something like "titanic - 1999" (it has the character "-"). When I try to search in Solr for "-", I receive a 400 error. I've tried to escape the character, for example with "\-". With those changes Solr no longer responds with an error, but it returns 0 results.

How can I search in the Solr admin for a special character (something like "-" or "'")?

Regards

Update: here you can see the current Solr schema: https://gist.github.com/cpalomaresbazuca/6269375

My search field is "title".

Excerpt from schema.xml:

    ...
    <!-- A general text field that has reasonable, generic
         cross-language defaults: it tokenizes with StandardTokenizer,
         removes stop words from case-insensitive "stopwords.txt"
         (empty by default), and down cases. At query time only, it
         also applies synonyms. -->
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    ...
    <field name="title" type="text_general" indexed="true" stored="true"/>

You are using the standard text_general field type for the title attribute. This might not be a good choice. text_general is meant for huge chunks of text (or at least sentences), not for exact matching of names or titles.

The problem here is that text_general uses StandardTokenizerFactory.

    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>

StandardTokenizerFactory does the following:

"A general purpose tokenizer that strips many extraneous characters and sets token types to meaningful values. Token types are only useful for subsequent token filters that are type-aware of the same token types."

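This means the '-' character is ignored as content and only used as a point at which to split the string.

"kong-fu" would be represented as "kong" and "fu". The '-' disappears.

This also explains why select?q=title:\- won't work here: no "-" token was ever indexed, so there is nothing for the query to match.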

Choose a better fitting field type:

Instead of StandardTokenizerFactory you could use solr.WhitespaceTokenizerFactory, which only splits on whitespace and so allows exact matching of words. Making your own field type for the title attribute would be a solution.
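A custom field type along those lines could look roughly like the sketch below. The name "text_title" is made up for illustration and is not part of the stock schema. The lowercase filter is an assumption, added to keep matching case-insensitive the way text_general does.

```xml
<!-- Hypothetical field type; the name "text_title" is an assumption. -->
<fieldType name="text_title" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits on whitespace only, so "titanic - 1999" keeps "-" as its own token. -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Optional: lowercases tokens at index and query time. -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Point the existing title field at the new type. -->
<field name="title" type="text_title" indexed="true" stored="true"/>
```

After changing the field type, the documents have to be reindexed before a query such as select?q=title:\- can match. The trade-off is that punctuation attached to a word (for example a trailing comma) stays part of the token.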

Solr also has a minimal field type called text_ws. Depending on your requirements, it might be enough.
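If text_ws is good enough, the only change needed is the type of the title field. This is a sketch that assumes the example schema's text_ws definition (a whitespace tokenizer with no further filters) is still present in your schema.xml.

```xml
<!-- Assumes the stock text_ws field type is defined in schema.xml. -->
<!-- text_ws has no lowercase filter, so matching is case-sensitive. -->
<field name="title" type="text_ws" indexed="true" stored="true"/>
```

As with any analyzer change, existing documents need to be reindexed for this to take effect.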
