
    Postgres slow. How to increase Tableau fetching size?

    Javier Seixas

      Hi,

       

      I have Tableau Prep connected to a Postgres database. The connection works fine and I can get data.

       

      The problem appears when I have to read from a table with 500M rows. I know that is quite a lot, but it takes too much time just to get 6M rows.

      I've monitored my server:

      • CPU reaches 0.5%
      • Memory stays at 5%
      • Public bandwidth reaches 1 Mbps

       

      Taking a look at the Postgres logs, I could see the queries that Tableau Prep sends to Postgres. They are the following:

       

      2018-10-11 14:04:59.225 UTC [23772] user@db LOG:  statement: BEGIN;declare "SQL_CUR0000015C2C420CB0" cursor with hold for SELECT "d034277e-6303-4d1a-ab2f-af86d8"."id_variable_by_station" AS "id_variable_by_station",
            "d034277e-6303-4d1a-ab2f-af86d8"."tiempo" AS "tiempo",
            "d034277e-6303-4d1a-ab2f-af86d8"."valor" AS "valor",
            "d034277e-6303-4d1a-ab2f-af86d8"."year" AS "year",
            "d034277e-6303-4d1a-ab2f-af86d8"."month" AS "month",
            "d034277e-6303-4d1a-ab2f-af86d8"."id_station" AS "id_station",
            "d034277e-6303-4d1a-ab2f-af86d8"."id_variable" AS "id_variable",
            3 AS "dbead302-044c-435f-9a77-3f7a9e32c70b"
          FROM "public"."samples" "d034277e-6303-4d1a-ab2f-af86d8"
          WHERE ("d034277e-6303-4d1a-ab2f-af86d8"."id_station" = 19);fetch 2048 in "SQL_CUR0000015C2C420CB0"
      
      2018-10-11 14:05:00.176 UTC [23772] user@db LOG:  statement: SAVEPOINT _EXEC_SVP_0000015C2C420CB0
      2018-10-11 14:05:00.411 UTC [23772] user@db LOG:  statement: fetch 626 in "SQL_CUR0000015C2C420CB0"
      2018-10-11 14:05:00.648 UTC [23772] user@db LOG:  statement: RELEASE _EXEC_SVP_0000015C2C420CB0
      2018-10-11 14:05:00.886 UTC [23772] user@db LOG:  statement: SAVEPOINT _EXEC_SVP_0000015C2C420CB0
      2018-10-11 14:05:01.121 UTC [23772] user@db LOG:  statement: fetch 2048 in "SQL_CUR0000015C2C420CB0"
      2018-10-11 14:05:01.595 UTC [23772] user@db LOG:  statement: RELEASE _EXEC_SVP_0000015C2C420CB0
      2018-10-11 14:05:01.833 UTC [23772] user@db LOG:  statement: SAVEPOINT _EXEC_SVP_0000015C2C420CB0
      2018-10-11 14:05:02.068 UTC [23772] user@db LOG:  statement: fetch 626 in "SQL_CUR0000015C2C420CB0"
      2018-10-11 14:05:02.308 UTC [23772] user@db LOG:  statement: RELEASE _EXEC_SVP_0000015C2C420CB0
      
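      In case anyone wants to reproduce this, one way to make these statements show up in the Postgres log is to enable statement logging (a minimal sketch, assuming superuser access; 'all' is very verbose, so reset it afterwards):

      -- Log every statement the server receives, so the DECLARE/FETCH pattern
      -- coming from Tableau Prep becomes visible in the Postgres log (superuser only).
      ALTER SYSTEM SET log_statement = 'all';
      SELECT pg_reload_conf();

      -- Restore the default once you are done, to avoid flooding the log.
      ALTER SYSTEM RESET log_statement;
      SELECT pg_reload_conf();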

       

      What I see is that Tableau Prep creates a cursor and uses it to fetch the data. It fetches 2048 rows, then 626, then 2048 again, and so on. My hypothesis is that this is not optimal behaviour, since it requires a ton of queries, with the corresponding round-trip latency. Besides, the server load stays almost flat, as I said above, and the connection can reach up to 9 Mbps, nine times more than it is getting. So I conclude that the server is not limiting performance; Tableau Prep is.
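      To check how much of this is per-fetch round-trip overhead rather than query cost, the same DECLARE/FETCH pattern can be replayed by hand in psql (a rough sketch; the table and column names come from the query above, and the fetch sizes are only for comparison):

      -- \timing is a psql meta-command that reports how long each statement takes.
      \timing on

      BEGIN;
      -- The same shape of cursor Tableau Prep declares, without the generated aliases.
      DECLARE test_cur CURSOR WITH HOLD FOR
          SELECT "id_variable_by_station", "tiempo", "valor", "year", "month", "id_station", "id_variable"
          FROM "public"."samples"
          WHERE "id_station" = 19;

      FETCH 2048 FROM test_cur;    -- roughly Tableau Prep's default batch size
      FETCH 100000 FROM test_cur;  -- a much larger batch, to compare rows per second

      CLOSE test_cur;
      COMMIT;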

       

      My question is: how can I increase the fetch count in Tableau Prep, so that each query brings back more rows?

       

      EDIT:

       

      After some research, I've found that I can improve performance by customizing the .tdc file. I created this .tdc file:

       

      <?xml version='1.0' encoding='utf-8' ?>
      <connection-customization class='postgres' enabled='true' version='8.10'>
      <vendor name='postgres'/>
      <driver name='postgres'/>
      <customizations>
      <customization name='CAP_ODBC_FETCH_BUFFERS_SIZE_MASSIVE' value='yes' />
      </customizations>
      </connection-customization>
      

       

       

      And now I get a new fetch count of 10699, which is quite a bit better:

       

      2018-10-11 17:57:31.565 UTC [25273] user@db LOG:  statement: fetch 10699 in "SQL_CUR00000201FC5864B0"
      

       

      But I would still like to improve it, because the server can handle a lot more.

       

      How can I improve it? I guess I should customize the ODBC connection (as explained here), especially the part that says:

      Row/Resultset Buffers - Prefer larger buffer sizes to improve the performance of fetching numerous rows, which can greatly improve the speed of creating extracts. This is sometimes called the cache size or response size.

      I haven't found which properties in the .tdc file would let me do that.

       

      Any help?

       

      EDIT 2:

       

      On macOS I get a fetch count of approximately 6500 with the same .tdc file, so I suppose it varies with the driver.

      So again, how can I configure the driver to increase that count?

        • 1. Re: Postgres slow. How to increase Tableau fetching size?
          Javier Seixas

          After digging into the problem for a few more hours, I found the solution, and my .tdc file now looks like this:

          <?xml version='1.0' encoding='utf-8' ?>
          <connection-customization class='postgres' enabled='true' version='8.10'>
          <vendor name='postgres'/>
          <driver name='postgres'/>
          <customizations>
          <customization name='odbc-connect-string-extras' value='Fetch=1000892' />
          <customization name='CAP_ODBC_FETCH_BUFFERS_RESIZABLE' value='yes' />
          </customizations>
          </connection-customization>
          

           

          Notice that the key to the solution is the tag...

           

          <customization name='odbc-connect-string-extras' value='Fetch=1000892' />
          

           

          ... where I can specify the fetch count I want returned.
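          Since every batch the driver fetches has to fit in client memory, a rough way to sanity-check a Fetch value is to estimate the average row width and multiply it by the fetch count (just a sketch; the column list comes from the query in the logs above, and the LIMIT is arbitrary):

           -- Rough average row width in bytes for the columns Tableau Prep selects.
           -- avg_row_bytes * Fetch approximates the client-side buffer per batch
           -- (pg_column_size on a whole row includes some header overhead, so it
           -- slightly overestimates).
           SELECT avg(pg_column_size(t.*)) AS avg_row_bytes
           FROM (
               SELECT "id_variable_by_station", "tiempo", "valor", "year", "month", "id_station", "id_variable"
               FROM "public"."samples"
               LIMIT 10000
           ) AS t;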

           

          This post showed me that the odbc-connect-string-extras tag exists, and after having no luck finding documentation for the Postgres ODBC connector options, I found the Redshift ODBC connector options and its Fetch parameter.

           

          I hope this saves other developers hours of searching.

           

          I also hope the Tableau team will improve the documentation in the future to save us those hours of searching.
