5 Replies Latest reply on May 8, 2013 8:07 AM by Mark Jackson

    Charting internet / social media buzz on a topic

    Mark Jackson

Drawing inspiration from Andy Cotgreave's "Gartner BI Twitternalysis" blog post, I wrote a quick Ruby script that extracts all mentions of a search term from socialmention.com's JSON API and appends them to a CSV file, which Tableau then uses as a data source. At some point I will probably enhance it to pull more detail about each user from Twitter / Facebook (e.g. number of followers / friends). I run it every hour via the Windows Task Scheduler.

[Image: Social Media Mentions of Piedmont Hospital.jpg]
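One way to set up the hourly run with the Windows Task Scheduler is from a command prompt; the task name and script path below are placeholders, not from the original post:

```shell
schtasks /Create /SC HOURLY /TN "SocialMentionPull" /TR "ruby M:\Tableau\Scripts\socialmention.rb"
```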

       

      require 'rubygems'
      require 'json'
      require 'uri'
      require 'net/http'
      require 'csv'
      require 'time'

      file = 'M:/Tableau/Data Extracts/socialmention.csv'
      search_str = 'Tableau' # CUSTOMIZE YOUR SEARCH STRING HERE

       

      def news_search(query, from_ts)
        base_url = "http://api2.socialmention.com/search?q="
        url = "#{base_url}#{query.gsub(' ', '+')}&f=json&t=all&from_ts=#{from_ts}"

        resp = Net::HTTP.get_response(URI.parse(url))
        data = resp.body

        # convert the returned JSON data into a native Ruby hash
        result = JSON.parse(data)

        # the API reports failures via an 'Error' key
        raise "web service error" if result.has_key? 'Error'

        result
      end
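For reference, the response shape the script relies on looks roughly like this. The payload below is made up for illustration and only includes a few of the keys the script reads:

```ruby
require 'json'

# A made-up payload with the structure the script expects:
# a top-level 'items' array whose entries carry the fields
# that get written to the CSV.
sample = <<-PAYLOAD
{
  "items": [
    {"id": "abc123", "user_id": "42", "timestamp": 1367996400,
     "title": "Tableau rocks", "source": "twitter", "sentiment": "positive"}
  ]
}
PAYLOAD

result = JSON.parse(sample)
result['items'].each do |v|
  puts "#{v['id']} from #{v['source']} (#{v['sentiment']})"
end
```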

       

      fileExists = File.exist?(file)

      ids = []
      dates = []
      from_ts = 86400 * 5 # default: look back five days

      if fileExists
        csv_data = CSV.read(file)
        headers = csv_data.shift.map {|i| i.to_s }
        string_data = csv_data.map {|row| row.map {|cell| cell.to_s } }
        array_of_hashes = string_data.map {|row| Hash[*headers.zip(row).flatten] }

        # each element is already a hash keyed by column name
        array_of_hashes.each do |row|
          ids << row['id']
          dates << row['date']
        end

        from_ts = Time.now.to_i - Time.parse(dates.max).to_i # TIME SINCE LAST ENTRY
      end
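The `Hash[*headers.zip(row).flatten]` trick above pairs each header with its cell, then flattens the pairs into an argument list for `Hash[]`. In isolation (sample values are made up):

```ruby
headers = ['id', 'date']
row     = ['abc123', '2013-05-08 08:07:00']

# zip pairs each header with its cell:
#   [['id', 'abc123'], ['date', '2013-05-08 08:07:00']]
# flatten turns those pairs into a flat argument list for Hash[]:
#   Hash['id', 'abc123', 'date', '2013-05-08 08:07:00']
record = Hash[*headers.zip(row).flatten]
puts record['id']   # abc123
```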

       

      result = news_search(search_str, from_ts)

      x = 0
      CSV.open(file, "a") do |csv|
        unless fileExists # CREATE FILE HEADER
          csv << %w{ id user_id date title description link source type embed image sentiment user user_link user_image favicon }
        end

        result['items'].each do |v|
          if ids.include?(v['id'])
            puts "#{v['id']} already included"
          else
            csv << ["#{v['id']}",
                    "#{v['user_id']}",
                    "#{Time.at(v['timestamp'])}",
                    "#{v['title']}",
                    "#{v['description']}",
                    "#{v['link']}",
                    "#{v['source']}",
                    "#{v['type']}",
                    "#{v['embed']}",
                    "#{v['image']}",
                    "#{v['sentiment']}",
                    "#{v['user']}",
                    "#{v['user_link']}",
                    "#{v['user_image']}",
                    "#{v['favicon']}"]
            puts "#{v['id']} added to CSV"
            x += 1
          end
        end
      end

      puts "Added #{x} new records to CSV"
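For the follower / friend enhancement mentioned at the top, a starting point might be a small helper that pulls the counts out of a Twitter user lookup. The `followers_count` and `friends_count` fields are real Twitter user-object fields, but everything else here (function name, sample body, and how it would bolt onto the CSV) is a sketch, and real use of the Twitter API requires OAuth:

```ruby
require 'json'

# Hypothetical helper: given the JSON body of a Twitter user lookup
# (e.g. users/show.json?screen_name=...), extract the counts we would
# append as extra CSV columns.
def user_counts(body)
  user = JSON.parse(body)
  { 'followers' => user['followers_count'],
    'friends'   => user['friends_count'] }
end

# Sample body standing in for a live API response
sample = '{"screen_name": "tableau", "followers_count": 1500, "friends_count": 300}'
counts = user_counts(sample)
puts counts.inspect
```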