3 Replies Latest reply on Jan 11, 2017 1:25 AM by jens.bruckmann

    Most efficient way of handling large JSON

    Corentin Martel

      Hello,

       

      I want to analyze the logs of a website. I have one log file per day for about a month, stored as JSON files of about 500-600 MB each.

       

      I am wondering what the most efficient way to handle this would be, as I don't think I can open all my logs at once as JSON with 32 GB of RAM.

       

      If I load the logs into a database (MongoDB?) and then connect to it from Tableau, will that be more efficient? Or should I just limit my analysis to one week at a time to avoid running out of memory?

        • 1. Re: Most efficient way of handling large JSON

          Hey Corentin,

           

          There are a few ways you could go with this:

           

          1. Create a Web Data Connector to connect to the logs hosted on the site, then create an extract from it.

           

          2. Migrate the logs to a database and connect Tableau to it.

           

          3. Analyze the data one week at a time.

           

          Honestly, I would go with option 1 or 2. Narrowing your scope to one week at a time is likely a band-aid that may not support the level of analysis you eventually want to do. The sooner you migrate to a database, the better positioned you are for the future. Also, once your data is in a database, you can filter to connect to only certain subsets of it, which can speed up your analysis.
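
          As a rough illustration of option 2, here is a minimal Python sketch that streams daily log files into a database in small batches and then queries only a subset of days. Everything specific in it is an assumption, not something from the original logs: it assumes each file is newline-delimited JSON (one record per line), that the file name identifies the day, and that records have hypothetical `timestamp` and `url` fields. SQLite is used only because it ships with Python; it stands in for whichever database you actually choose.

```python
import json
import sqlite3
from pathlib import Path


def iter_records(path):
    """Yield one parsed record per non-empty line, without loading the whole file."""
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_logs(log_dir, db_path, batch_size=10_000):
    """Insert every *.json log in log_dir into SQLite in small batches."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS logs (day TEXT, ts TEXT, url TEXT)")
    for path in sorted(Path(log_dir).glob("*.json")):
        batch = []
        for rec in iter_records(path):
            # Assumption: the file name (without extension) identifies the day.
            batch.append((path.stem, rec.get("timestamp"), rec.get("url")))
            if len(batch) >= batch_size:
                conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", batch)
                batch.clear()
        if batch:
            conn.executemany("INSERT INTO logs VALUES (?, ?, ?)", batch)
        conn.commit()
    # An index on the day column keeps date-range filters cheap.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_logs_day ON logs (day)")
    conn.commit()
    return conn


def hits_per_url(conn, first_day, last_day):
    """Count hits per URL for a range of days, touching only that subset."""
    query = (
        "SELECT url, COUNT(*) FROM logs WHERE day BETWEEN ? AND ? "
        "GROUP BY url ORDER BY COUNT(*) DESC, url"
    )
    return conn.execute(query, (first_day, last_day)).fetchall()
```

          With this shape, memory use is bounded by `batch_size` rather than by the size of a file, and a call such as `hits_per_url(conn, "2017-01-01", "2017-01-07")` reads one week without loading the rest of the month.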

           

          -Diego

          • 2. Re: Most efficient way of handling large JSON
            Corentin Martel

            After looking everywhere for an answer, I came to the same conclusion. Thanks!

             

            Corentin

            • 3. Re: Most efficient way of handling large JSON
              jens.bruckmann

              Hi,

              I have the same issue with my JSON files. I am able to get data for 2 months, but when I ask for more than 2 months I do not get any data. Here is what I do:

               

              1. Created a web data connector to connect to my JSON file (the connector takes a start date and an end date).

              2. Created an extract in Tableau and stored it.

               

              But for some reason I only have the option to do a full refresh every time, not an incremental one (the JSON files are already extracted).

               

              Any idea how to do an incremental refresh on a JSON web data connector?
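
              For reference, the general idea behind an incremental refresh, independent of Tableau, is to remember the newest date already in the extract and fetch only records after it. The Python sketch below illustrates just that idea. It is not a web data connector and does not use any Tableau API; the `day` field and both function names are hypothetical.

```python
from datetime import date


def records_to_fetch(all_records, last_loaded):
    """Return only records dated after last_loaded (None means a first, full load)."""
    if last_loaded is None:
        return list(all_records)
    return [r for r in all_records if date.fromisoformat(r["day"]) > last_loaded]


def incremental_refresh(extract, all_records):
    """Append records newer than anything in extract; return how many were added."""
    # The newest day already stored acts as the increment marker.
    last_loaded = max((date.fromisoformat(r["day"]) for r in extract), default=None)
    new_records = records_to_fetch(all_records, last_loaded)
    extract.extend(new_records)
    return len(new_records)
```

              The first call on an empty extract behaves like a full refresh; later calls add only the days that arrived since the previous one.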