3 Replies Latest reply on Dec 9, 2013 8:07 PM by Richard Wall

    Tableau extract - join two CSVs with python

    Sheron Madhushanka
      Hi Community,
      Wonder if you can help. I'm trying to do a simple join of two CSV files with a common key using Tableau Extract module on Python.
      This is the order table
      OrderIDProductIDQty
      1a5
      2a6
      3b7
      4c5
      5d4

        this is the product table
       
      Order ID
      Product IDCostUnit Price
      a12.9913.55
      b5.559.51
      c19.5125.25
      d15.5019.25

       

      This is my python code:

       

      import csv,os,datetime
      import dataextract as tde

      #Step 1: Create the Extract File and open the .csv
      try:
          tdefile = tde.Extract('Join.tde')
      except:
          os.remove('Join.tde')
          tdefile = tde.Extract('Join.tde')

      csvReader1 =csv.reader(open('order.csv','rb'), delimiter=',', quotechar='"')
      csvReader2=csv.reader(open('product.csv','rb'), delimiter=',', quotechar='"')

      #Step 2: Create the tableDef
      tableDef = tde.TableDefinition()
      tableDef.addColumn('Order ID', tde.Type.INTEGER)
      tableDef.addColumn('Product ID', tde.Type.CHAR_STRING)
      tableDef.addColumn('Qty', tde.Type.INTEGER)
      tableDef.addColumn('Cost', tde.Type.DOUBLE)
      tableDef.addColumn('Unit Price', tde.Type.DOUBLE)

      #Step 3: Create the table in the image of the tableDef
      table = tdefile.addTable('Extract', tableDef)

      #Step 4: Loop through the csv, for each row create two new rows,
      #   put the data into those rows and insert them into the table
      newrow = tde.Row(tableDef)
      csvReader1.next()
      csvReader2.next()
      #Skip the first line since it has the headers
      for line1 in csvReader1:
          for line2 in csvReader2:
              if line1[1]==line2[0]:
                  newrow.setInteger(0,int(line1[0]))
                  newrow.setCharString(1,str(line1[1]))
                  newrow.setInteger(2,int(line1[2]))
                  newrow.setDouble(3,float(line2[1]))
                  newrow.setDouble(4,float(line2[2]))
          else:
              newrow.setInteger(0,int(line1[0]))
              newrow.setCharString(1,str(line1[1]))
              newrow.setInteger(2,int(line1[2]))
              newrow.setDouble(3,0)
              newrow.setDouble(4,0)
          table.insert(newrow)
       
      #Step 5: Close the tde
      tdefile.close()

       

      I'm unable to join the table to have the fields Unit price and cost displayed after the join.


      Any help would be much appreciated.

       

      Thanks

      Sheron