java - UTF-8 - I don't understand this byte sequence


I have a data provider that sends me data which is supposed to be encoded in UTF-8. The data contains this sequence of bytes:

28 49 4e 54 e2 80 99 4c 29 20  (INT’L) => "(INT’L)"

For some reason, when my Java program fetches this data and stores it in the database, the above sequence becomes:

28 49 4e 54 19 4c 29 20        (INT.L) => "(INT\u0019L)"

The Java program is built on top of Hibernate. It first fetches the data from the provider, stores it in an entity, and then the entity is persisted in the database (PostgreSQL).

Why am I losing bytes (e2 80 99 becomes 19)?
How can I avoid this?

Here is the core method used to transfer the data fetched from the provider into the entity:

    import java.sql.Clob;
    // ...

    public static String convertStreamToString(Clob clob) throws SQLException {
        if (clob == null) {
            return "";
        }

        BufferedReader br = null;
        StringBuilder result = new StringBuilder();

        try {
            br = new BufferedReader(new InputStreamReader(clob.getAsciiStream(), Charset.forName("UTF-8")));
            String lig;
            int n = 0;
            while ((lig = br.readLine()) != null) {
                if (n > 0) {
                    result.append("\n");
                }
                result.append(lig);
                n++;
            }
        } catch (IOException ioe) {
            // exception handling code ...
        } catch (SQLException sqlex) {
            // exception handling code ...
        } finally {
            IOUtil.close(br);
        }

        return result.toString();
    }

    // ...

    MyEntity entity = ...
    oracle.sql.NCLOB clob = ...
    entity.setProperty(convertStreamToString(clob));

    @Entity
    class MyEntity {
        @Column(name="prop", length=100000)
        private String prop;

        public void setProperty(String value) {
            this.prop = value;
        }
    }

You are using getAsciiStream() to read the contents of the Clob. As the name says, this method is only usable for ASCII; it breaks non-ASCII characters.
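The byte values in the question are consistent with this. The character ’ is U+2019, whose UTF-8 encoding is e2 80 99, and whose low-order byte is 0x19. The sketch below only illustrates that arithmetic; the class and method names are made up for the example, and the idea that the driver's ASCII stream keeps just the low byte of each character is an assumption, not something the question confirms.

```java
import java.nio.charset.StandardCharsets;

public class QuoteBytes {

    // Hex dump of the UTF-8 encoding of a string, bytes separated by spaces.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Low-order byte of a UTF-16 code unit. ASSUMPTION: this mimics what a
    // lossy "ASCII" view of the character data might do.
    static int lowByte(char c) {
        return c & 0xff;
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("\u2019"));                       // e2 80 99
        System.out.println(Integer.toHexString(lowByte('\u2019')));  // 19
    }
}
```

If that is what happens, the single byte 0x19 is then decoded as UTF-8 by the InputStreamReader, which yields the control character \u0019 seen in the database.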

Use the getCharacterStream() method instead.

    BufferedReader br = null;
    StringBuilder result = new StringBuilder();

    try {
        br = new BufferedReader(clob.getCharacterStream());
        ....
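For completeness, here is one way the whole conversion could look with getCharacterStream(). This is a sketch rather than the original poster's code: the class name ClobText and the method clobToString are hypothetical, it uses try-with-resources instead of the IOUtil helper, and it copies characters in blocks rather than line by line, so line terminators are kept exactly as stored. The main method uses javax.sql.rowset.serial.SerialClob only as an in-memory stand-in for a real database Clob.

```java
import java.io.IOException;
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;
import javax.sql.rowset.serial.SerialClob;

public class ClobText {

    // Reads the full character content of a Clob without any byte-level decoding.
    static String clobToString(Clob clob) throws SQLException, IOException {
        if (clob == null) {
            return "";
        }
        StringBuilder result = new StringBuilder();
        // The Reader already yields characters, so no charset is involved here.
        try (Reader reader = clob.getCharacterStream()) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                result.append(buf, 0, n);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) throws Exception {
        // In-memory Clob standing in for the one returned by the provider.
        Clob clob = new SerialClob("(INT\u2019L) ".toCharArray());
        String text = clobToString(clob);
        System.out.println(text.equals("(INT\u2019L) "));
    }
}
```

Because the Clob hands back characters directly, the UTF-8 charset argument from the original method is no longer needed at this step.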
