3 Replies Latest reply on Feb 16, 2018 7:05 AM by Justin Doster

    How do I compare all strings within a given dimension value

    Justin Doster

      I am working on a problem that has me vexed, and maybe you can help.  My data comes from SAN switches, which carry data traffic between servers in a data center.  In a switch environment, individual ports are configured into zones to isolate data traffic.  An example of a zone might be two ports, one being a server and the other a storage array.  Best practice dictates that each zone has a single host in it (called a single initiator).  It is possible for a single zone to have multiple hosts with the same name (in the case of multiple host bus adapters in the server).  This is not ideal, but not invalid.  What I want to find is zones that contain more than one host, where the hosts are not the same.  Below is an example of the scenario I am looking for.  This actual scenario comes from the attached .tsv file.

       

      FC Switch   Vendor         Zone Name                         Zoneset        FC Host Name      Host Type
      LSH9513-32  Cisco Systems  v300_wdca1cvd_02_p0_vmax2721_2e0  LSH_A_iSeries  wdca1cvd          SAN Host
      LSH9513-32  Cisco Systems  v300_wdca1cvd_02_p0_vmax2721_2e0  LSH_A_iSeries  VMAX-2721         SAN Array
      LSH9513-32  Cisco Systems  v300_wdca1cvd_02_p0_vmax2721_2e0  LSH_A_iSeries  c0507603f0fd0122  SAN Host

                                                  


      As you can see, on the "LSH9513-32" switch, in the "v300_wdca1cvd_02_p0_vmax2721_2e0" zone, there are two records with a Host Type of "SAN Host", and they have different FC Host Names.  This is an example of a single zone with multiple initiators.  My problem is that my data sets are large, and every field in my data is a string.  How do I find the condition where, within a single zone name, there exists more than one "SAN Host" record and the "FC Host Name" values are different?

      Ultimately I want to build a visualization that shows the user the total number of zones, the number of zones with a single initiator, the number of zones with multiple initiators with the same host name, and finally the number of zones with multiple initiators that are different (and then display a list of the zones with multiples).
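
      To make the grouping rule concrete, here is a minimal Python sketch of the logic described above. It is only an illustration under some assumptions: rows are dictionaries keyed by the column names from the sample, a zone is identified by the pair (FC Switch, Zone Name), host names are compared as exact strings, and the function name `classify_zones` and the extra "no_host" bucket are hypothetical additions of the sketch.

```python
from collections import defaultdict

def classify_zones(rows):
    """Bucket each (switch, zone) pair by its 'SAN Host' records.

    Hypothetical helper: assumes each row is a dict with the keys
    'FC Switch', 'Zone Name', 'FC Host Name' and 'Host Type'.
    """
    zones = set()
    hosts = defaultdict(list)  # (switch, zone) -> names of SAN Host records
    for row in rows:
        key = (row["FC Switch"].strip(), row["Zone Name"].strip())
        zones.add(key)
        if row["Host Type"].strip() == "SAN Host":
            hosts[key].append(row["FC Host Name"].strip())

    result = {"single": [], "multi_same": [], "multi_different": [], "no_host": []}
    for key in sorted(zones):
        names = hosts.get(key, [])
        if not names:
            result["no_host"].append(key)          # assumption: zones without any host
        elif len(names) == 1:
            result["single"].append(key)           # single initiator
        elif len(set(names)) == 1:
            result["multi_same"].append(key)       # several records, one host name
        else:
            result["multi_different"].append(key)  # several distinct host names
    return result

# The three sample records from the post.
zone = "v300_wdca1cvd_02_p0_vmax2721_2e0"
rows = [
    {"FC Switch": "LSH9513-32", "Zone Name": zone, "FC Host Name": "wdca1cvd", "Host Type": "SAN Host"},
    {"FC Switch": "LSH9513-32", "Zone Name": zone, "FC Host Name": "VMAX-2721", "Host Type": "SAN Array"},
    {"FC Switch": "LSH9513-32", "Zone Name": zone, "FC Host Name": "c0507603f0fd0122", "Host Type": "SAN Host"},
]
result = classify_zones(rows)
total_zones = sum(len(v) for v in result.values())
```

      With the sample records, the zone lands in the "multi_different" bucket. The length of each bucket gives the counts for the visualization, and the "multi_different" list is the list of zones to display.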

       

      If anyone can help me I would be very appreciative.