2 Replies Latest reply on Nov 6, 2018 5:28 AM by Marko Bauhardt

    ExtractAPI: error converting utf32 to utf16: 10

    Marko Bauhardt

      Hi,

      We are using the Java Tableau SDK, version 10200.17.0328.0755, and its API to append rows to a TDE file.

      Our JVM terminates with the error `converting utf32 to utf16: 10` if the String value we want to write is encoded as UTF-32.

      UTF-16 strings work without any issue.
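
For context, a Java `String` is always held as UTF-16 in memory, so "encoded as UTF-32" here presumably means the value was decoded from UTF-32 data or contains characters outside the Basic Multilingual Plane. The following is a minimal sketch of that situation, not the code from this thread: the class name, method name, and sample bytes are illustrative, and it assumes the JDK in use provides the `UTF-32BE` charset, which is common but not among the charsets every Java platform is required to support.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative helper (hypothetical name): shows what a UTF-32 value looks like
// once it has been decoded into a Java String.
final class Utf32Decode {

    // Decodes big-endian UTF-32 bytes into a Java String (UTF-16 internally).
    // Assumption: the running JDK supports the "UTF-32BE" charset.
    static String decodeUtf32(byte[] bytes) {
        return new String(bytes, Charset.forName("UTF-32BE"));
    }

    public static void main(String[] args) {
        // U+1F600 as one UTF-32BE code unit: 00 01 F6 00
        byte[] raw = {0x00, 0x01, (byte) 0xF6, 0x00};
        String s = decodeUtf32(raw);

        // One code point, but two UTF-16 code units (a surrogate pair)
        System.out.println("UTF-16 code units: " + s.length());
        System.out.println("code points:       " + s.codePointCount(0, s.length()));
        // The same character needs four bytes in UTF-8
        System.out.println("UTF-8 bytes:       " + s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```

A value like this is stored as a surrogate pair in the `String`, which is the kind of character the error message about converting between UTF-32 and UTF-16 may be tripping over.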

       

      I looked at the `DataExtract.log` file to figure out what happens behind the scenes and found the following lines, which create the Tableau table:

         

      ```
      2018-10-09 15:11:29.225 (0x70000e337000): (create table [Extract].[Extract])
      2018-10-09 15:11:29.225 (0x70000e337000): Compiling query with Memory Budget=17179869184 MemoryAvailable=17179869184
      2018-10-09 15:11:29.261 (0x70000e337000): Session1: QueryExecute: OK, Elapsed time:0.036s, Compilation time:0.000s, Execution time:0.036s
      2018-10-09 15:11:29.261 (0x70000e73a000): Session1: QueryExecute:
      2018-10-09 15:11:29.261 (0x70000e73a000): (create column [Extract].[Extract].[utf32_col] ( ( "collation" "en_US" ) ( "compression" "heap" ) ( "factory" "varchar" ) ( "ordinal" "0" ) ( "scale" "2" ) ( "storagewidth" "8" ) ))
      2018-10-09 15:11:29.268 (0x70000e73a000): Compiling query with Memory Budget=17179869184 MemoryAvailable=17179869184
      2018-10-09 15:11:29.307 (0x70000e73a000): Session1: QueryExecute: OK, Elapsed time:0.046s, Compilation time:0.006s, Execution time:0.040s
      2018-10-09 15:12:05.954 (0x70000fb49000): tdeserver: disconnected connection=::1:51126->::1:51128: IPC_SocketConnection::Read(len=16, connection=::1:51126->::1:51128): The connection was closed by the peer in IPC_Socket::Recv(len=16)
      2018-10-09 15:12:05.954 (0x70000fb49000): tdeserver: closing connection=::1:51126->::1:51128
      2018-10-09 15:12:05.954 (0x70000fb49000): tdeserver: closing orphaned session1
      2018-10-09 15:12:05.964 (0x7fffaa99f380): tdeserver: exit (0)
      ```

      So it looks like the table is created with collation `en_US`, but there is no config value for the encoding. I didn't find a method in the API to define the encoding of a Tableau table.

      Tableau Extract API: Tableau::TableDefinition Class Reference

       

      So my question is:

      Does the Extract API from Tableau SDK version 10.2.x support UTF-32 encoded values, or is only UTF-8 supported?

        • 1. Re: ExtractAPI: error converting utf32 to utf16: 10
          Noah Beasley

          Hi Marko,

           

          Are you using Mac OSX to run the Java application? Mac and Windows handle 4-byte UTF-8 Unicode characters differently, which can cause exactly this error in Java with the Tableau SDK.

          If so, using the CHAR_STRING type and the setCharString method to insert the affected strings, instead of UNICODE_STRING and setString respectively, should resolve the error.
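
To find out which values count as "affected", one option is to check each string for supplementary code points (above U+FFFF), since those are the characters that need four bytes in UTF-8 and a surrogate pair in UTF-16. This is only a sketch using the standard JDK; the class and method names are hypothetical and nothing here calls the Tableau SDK.

```java
// Illustrative helper (hypothetical name): flags strings that contain
// characters outside the Basic Multilingual Plane.
final class SupplementaryCheck {

    // Returns true if the string contains at least one code point above U+FFFF,
    // i.e. a character encoded as four bytes in UTF-8.
    static boolean hasSupplementary(String s) {
        return s.codePoints().anyMatch(Character::isSupplementaryCodePoint);
    }

    public static void main(String[] args) {
        String plain = "Extract";
        // U+1F600 built from its code point to keep the source file ASCII-only
        String emoji = "Extract " + new String(Character.toChars(0x1F600));

        System.out.println(hasSupplementary(plain)); // false
        System.out.println(hasSupplementary(emoji)); // true
    }
}
```

A check like this could be used to log or isolate the rows that trigger the error before deciding which column type and setter to use for them.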

           

          If the above does not apply, it would be good to submit a case to Tableau Technical Support with the details, as this could be something new.

          • 2. Re: ExtractAPI: error converting utf32 to utf16: 10
            Marko Bauhardt

            Hi Noah,

            Yes, we are using Mac OSX, but we had the same error on Linux. We used the setCharString method as you suggested:

             

            ```

            row.setCharString(index, myString);

            ```

             

            When we use this method, we get:

             

            ```
            Caused by: com.tableausoftware.TableauException: type mismatch
                at com.tableausoftware.extract.Row.setCharString(Unknown Source)
            ```

             

            So, I will contact technical support as you suggested.

             

            Thanks

            Marko