The Cross Join Collection

    Introduction

     

    WHY A CROSS JOIN COLLECTION?

    Because it is a powerful tool to dynamically reshape and create data which otherwise is "too hard" to do.

    My hope is that various helpers can use these techniques when helping others and themselves.

     

    WHAT IS A CROSS JOIN?

    A CROSS JOIN produces rows which combine each row from the first table with each row from the second table. This is also known as the Cartesian product, cross product, or Cartesian join.

     

    INNER JOIN vs CROSS JOIN

    Both do more or less the same! [Table A] a INNER JOIN [Table B] b ON 1=1 produces also a Cartesian product. The only difference is that conditions in JOINs are executed before conditions in WHERE, where our CROSS JOIN logic is written.

     

    CROSS JOIN with Tableau's join dialog  explains how to

    imitate a cross join (all-to-all relationship) via Tableau's equi-join dialog.

     

    EXPLICIT vs IMPLICIT CROSS JOIN

    An explicit cross join uses the keyword CROSS JOIN. An implicit cross join is made by separating tables with a comma. I recommend using explicit cross join when possible for readability. The examples below use comma, because they are written in Jet SQL, which doesn't support explicit cross joins. Jet SQL is was used for Excel, Text File, and Access data sources.

     

    TABLE-LESS LOOKUP TABLE

    If the lookup table for the cross join has few rows and isn't readily available, it is often easier to create a table-less table with constants and union. When connecting to a database, we often have no other choice than to use a table-less table. Below is the syntax for table-less tables in various SQL systems:

     

    Jet SQL:  (select top 1 'A' as [columnname] from [table A] union all select top 1 'B' from [table A] ) t

    SQL Server:  (select 'A' columnname union all select 'B') t

    Oracle:  (select 'A' columnname from dual union all select 'B' from dual) t

     

    Examples:  Re: Cross Join Collection with just one table

     

    CROSS JOIN TIPS

    The tips below are kept as simple as possible to make them as easy to understand as possible.

    See results of each tip in attached companion workbook.

     

     

     

    1 Periods Intersecting Calendar Periods

    Generate one row for each calendar period (lookup) per period (data).

     

     

    tip01data$tip01lookup$
    _Task___Start Date____End Date___ ___Year___Month___Month Start____Month End__
    A2013-02-252013-04-25201322013-02-012013-02-28
    B2013-03-112013-03-28201332013-03-012013-03-31
    C2013-03-182013-04-04201342013-04-012013-04-30
    _
    cartesian product
    non- results de-emphasized
    TaskStart DateEnd DateYearMonthMonth StartMonth End
    A2013-02-252013-04-25201322013-02-012013-02-28
    A2013-02-252013-04-25201332013-03-012013-03-31
    A2013-02-252013-04-25201342013-04-012013-04-30
    B2013-03-112013-03-28201322013-02-012013-02-28
    B2013-03-112013-03-28201332013-03-012013-03-31
    B2013-03-112013-03-28201342013-04-012013-04-30
    C2013-03-182013-04-04201322013-02-012013-02-28
    C2013-03-182013-04-04201332013-03-012013-03-31
    C2013-03-182013-04-04201342013-04-012013-04-30

     

    Query

     

    SELECT *
    FROM [tip01data$] d, [tip01lookup$] l
    WHERE d.[Start Date] <= l.[Month End]
    and d.[End Date] >= l.[Month Start]
     

     

     

    See also

    Count distinct records that fit multiple date criteria

    Identify dates between a range

    Rolling Backlog Calculations

    Rolling backlog calculation

    Divide a value between months

    Maximum occupancy of departure lounge/calculation from Gantt chart

    Re: How to get counts of active enrollments (active between two dates)

    Measure activity between dates

    Re: Tableau - Monthwise Data Distribution

    Cross Join Collection with just one table

    Any Workaround to Bring Date Bins to a Calculation Field

    Re: Displaying crowd density by hours with data in start and end time format

    Re: Alarm data with open/close timestamp. Need duration running per day

     

     

     

    2 Periods Intersecting Calendar Dates

    Generate one row for each calendar date (lookup) in period (data).

     

    Warning 1: The more days one has between one's calendar dates, the more activity one risks to miss, and thus get misleading results that can lead to wrong decisions.

     

    Warning 2: Including every single day in the calendar dates lookup table will not miss any day. The cost of doing this is that the number of returned rows may become too many.

     

     

    tip02data$  tip02lookup$ 
    _Task___Start Date____End Date___ _Quarter End
    A2013-02-252013-04-252013-03-31
    B2013-03-112013-07-252013-06-30
    C2013-05-202014-01-232013-09-30
    2013-12-31
    _
    cartesian product
    non- results de-emphasized
    TaskStart DateEnd DateQuarter End
    A2013-02-252013-04-252013-03-31
    A2013-02-252013-04-252013-06-30
    A2013-02-252013-04-252013-09-30
    A2013-02-252013-04-252013-12-31
    B2013-03-112013-07-252013-03-31
    B2013-03-112013-07-252013-06-30
    B2013-03-112013-07-252013-09-30
    B2013-03-112013-07-252013-12-31
    C2013-05-202014-01-232013-03-31
    C2013-05-202014-01-232013-06-30
    C2013-05-202014-01-232013-09-30
    C2013-05-202014-01-232013-12-31

     

    Query

     

    SELECT *
    FROM [tip02data$] d, [tip02lookup$] l
    WHERE l.[Quarter End] BETWEEN d.[Start Date] and d.[End Date]
     

     

     

    See also

    Count In-flight Events

    Finding data points between two date dimensions

    Backlog History Data

    Calculating Headcount and Attrition - Day/Week/Month/Year

    Measure activity between dates

    Re: How to reference a column's value in a calculated field?   One row per day in period

    Re: Headcount, when given the Arrival & Departure Dates   One row per day in period

     

     

     

    3 Unpivot Columns

    Generate one row for each non-empty cell in columns to unpivot.

     

     

    tip03data$  tip03lookup$ 
      Product___USA___Europe____Asia__ _Market
    A10500050000Asia
    B118000230000340000_Europe
    C43000400000100000_USA
    D23200060000_
    _
    cartesian product
    non- results de-emphasized
    ProductUSAEuropeAsia_Market
    A10500050000Asia
    A10500050000Europe
    A10500050000USA
    B118000230000340000_Asia
    B118000230000340000_Europe
    B118000230000340000_USA
    C43000400000100000_Asia
    C43000400000100000_Europe
    C43000400000100000_USA
    D23200060000_Asia
    D23200060000_Europe
    D23200060000_USA

     

    Query

     

    SELECT d.[Product]
    , l.[Market]
    , IIF(l.[Market] = 'USA',    d.[USA],
      IIF(l.[Market] = 'Europe', d.[Europe],
      IIF(l.[Market] = 'Asia',   d.[Asia]
      ))) AS [Sales]
    FROM [tip03data$] d, [tip03lookup$] l
    WHERE 1=1
    AND NOT (d.[USA]    is null and l.[Market] = 'USA')
    AND NOT (d.[Europe] is null and l.[Market] = 'Europe')
    AND NOT (d.[Asia]   is null and l.[Market] = 'Asia')
     

     

     

    See also

    Multiple Measures on an Independent Date?

    I need to count occurences in my data set.

    Tableau & SQL Group

    Displaying the COUNT of a unique value for multiple columns

    Compare two data-sets for changes

    Calculations based on two columns in a row

    Descriptive statistics for measures (min, max, count, average, median)

    Side by Side Open/Closed by Month

    create quick filter that filters 2 fields

    Table that shows MIN MAX and AVG for several measures

     

     

     

    4 Unpivot Column With Multiple Unknown Items

    Generate one row for each item separated by a comma (or other separator) in a column with values which are not known beforehand.

     

     

    tip04data$  tip04lookup$ 
      Customer_Values__ _Pos
    A1,10,31
    B3,12
    3
    4
    5
    6
    _
    cartesian product
    non- results de-emphasized
    ProductValuesPos
    A1,10,31
    A1,10,32
    A1,10,33
    A1,10,34
    A1,10,35
    A1,10,36
    B3,11
    B3,12
    B3,13
    B3,14
    B3,15
    B3,16

     

    Query

     

    SELECT d.[Customer]
    , d.[Values]
    , clng(mid$(d.[Values], l.Pos, instr(l.Pos, d.[Values] + ',', ',') - l.Pos)) as [Value]
    FROM [tip04data$] d, [tip04lookup$] l
    WHERE l.Pos <= len(d.[Values])
    AND mid$(',' + d.[Values], l.Pos, 1) = ','
     

     

     

    See also

    Comma Separated value within a Column

    how do i count words in a column that contains sentences?

    multi valued column

    How to find the number of occurrences of all sub-strings in all strings of a particular dimension?

    How to fix this cross join?

     

     

     

    5 Unpivot Column With Multiple Known Items

    Generate one row for each item separated by a comma (or other separator) in a column with values known beforehand.

     

     

    tip05data$  tip05lookup$ 
    __Project__Employees_ _Employee
    1Abraham Lincoln; Benjamin Franklin; Thomas Edison_Abraham Lincoln
    2Albert Einstein; Thomas EdisonAlbert Einstein
    3Abraham Lincoln; Benjamin Franklin_Benjamin Franklin_
    Thomas Edison
    _
    cartesian product
    non- results de-emphasized
    ProjectEmployeesEmployee
    1Abraham Lincoln; Benjamin Franklin; Thomas EdisonAbraham Lincoln
    1Abraham Lincoln; Benjamin Franklin; Thomas EdisonAlbert Einstein
    1Abraham Lincoln; Benjamin Franklin; Thomas EdisonBenjamin Franklin
    1Abraham Lincoln; Benjamin Franklin; Thomas EdisonThomas Edison
    2Albert Einstein; Thomas EdisonAbraham Lincoln
    2Albert Einstein; Thomas EdisonAlbert Einstein
    2Albert Einstein; Thomas EdisonBenjamin Franklin
    2Albert Einstein; Thomas EdisonThomas Edison
    3Abraham Lincoln; Benjamin FranklinAbraham Lincoln
    3Abraham Lincoln; Benjamin FranklinAlbert Einstein
    3Abraham Lincoln; Benjamin FranklinBenjamin Franklin
    3Abraham Lincoln; Benjamin FranklinThomas Edison

     

    Query

     

    SELECT *
    FROM [tip05data$] d, [tip05lookup$] l
    WHERE d.[Employees] LIKE '%' + l.[Employee] + '%'
     

     

     

    See also

    How do i parse out the dollar amount from this text string  LIKE

    Counting distinct strings within an array for each row  LIKE

    How to sum unique words in string  LIKE

    Multiple/overlapping values in a case statement?  LIKE

    How Can I Display Records Where Multiple Criteria Are Met?  INSTR and INNER JOIN acting as CROSS JOIN

    Filter multiple values  LIKE

    Dynamic tableau filtering with flattened string array  LIKE

    multi valued column  LIKE

    Automatically pull out common words from blocks of text?  LIKE

     

     

    6 Self Cross Join

    This example shows how we can use a self cross join to search for duplicate customers.

     

     

    tip06data$__tip06data$
    .ID.Name_Birth Date_Address.ID.Name_Birth Date_Address
    1Leo da Vinci1452-04-1510 Inventori St_1Leo da Vinci1452-04-1510 Inventori St_
    2Isaac Newton1642-12-257 Physics St2Isaac Newton1642-12-257 Physics St
    3Leonardo da Vinci1452-04-1510 Inventori St3Leonardo da Vinci1452-04-1510 Inventori St
    _
    cartesian product
    non- results de-emphasized
    IDNameBirth DateAddressIDNameBirth DateAddress
    1Leo da Vinci1452-04-1510 Inventori St1Leo da Vinci1452-04-1510 Inventori St
    1Leo da Vinci1452-04-1510 Inventori St2Isaac Newton1642-12-257 Physics St
    1Leo da Vinci1452-04-1510 Inventori St3Leonardo da Vinci1452-04-1510 Inventori St
    2Isaac Newton1642-12-257 Physics St1Leo da Vinci1452-04-1510 Inventori St
    2Isaac Newton1642-12-257 Physics St2Isaac Newton1642-12-257 Physics St
    2Isaac Newton1642-12-257 Physics St3Leonardo da Vinci1452-04-1510 Inventori St
    3Leonardo da Vinci1452-04-1510 Inventori St1Leo da Vinci1452-04-1510 Inventori St
    3Leonardo da Vinci1452-04-1510 Inventori St2Isaac Newton1642-12-257 Physics St
    3Leonardo da Vinci1452-04-1510 Inventori St3Leonardo da Vinci1452-04-1510 Inventori St

     

    Query

     

    SELECT d1.*
    FROM [tip06data$] d1, [tip06data$] d2
    WHERE d1.[ID] <> d2.[ID]
    and d1.[Birth Date] = d2.[Birth Date]
    and d1.[Address] = d2.[Address]
     

     

     

    See also

    audit data from table, looking for differences

    Re: Identifying Overlapping Dates

    Re: Flag for Overlapping Time Period

     

     

     

    7 All Combinations

    The Cartesian product is commonly used to include categories without data in a viz.

    The example below shows how one can make data for a calendar with this technique.

     

     

    tip07data$___tip07lookup$_
    _Calendar Year_____Day of Year_
    20121
    20132
    ...
    365
    366
    _
    cartesian product
    non- results de-emphasized
    Calendar YearDay of Year
    20121
    20122
    2012...
    2012365
    2012366
    20131
    20132
    2013...
    2013365
    2013366

     

    Query

     

    SELECT d.[Calendar Year]
    , l.[Day of Year]
    , dateadd('d', l.[Day of Year], dateserial(d.[Calendar Year], 1, 0)) AS [Calendar Date]
    FROM [tip07data$] d, [tip07lookup$] l
    WHERE d.[Calendar Year] =
    datepart('yyyy', dateadd('d', l.[Day of Year], dateserial(d.[Calendar Year], 1, 0)))
     

     

    Comment

    The condition on line 5 removes day 366 in non-leap years.

     

    See also

    Need help with a SQL query to create dummy data

    The Calendar Workbook Collection

     

     

     

    8 Dynamic Constants

    Cross joining a table with 1 row makes it possible to add various variables/constants which are easy to use in calculations and filters in the query and Tableau calculations. The 1 row table can be done with a subquery or a table with constants. The subquery will be more dynamic.

     

     

    tip08data$tip08data$ subquery
    _User__Posts_____MIN__MAX__AVG__COUNT_
    A45_545235
    B22_
    C5_
    D15_
    E28_
    _
    cartesian product
    non- results de-emphasized
    UserPosts_MINMAXAVGCOUNT
    A45_545235
    B22_545235
    C5_545235
    D15_545235
    E28_545235

     

    Query

     

    SELECT *
    FROM [tip08data$] d, (
      select
        min(posts)  as [MIN]
      , max(posts)  as [MAX]
      , avg(posts)  as [AVG]
      , count(posts) as [COUNT]
      from [tip08data$]
      ) l
     

     

     

    See also

    Show Only Values From Max Date in Dataset

     

     

     

    9 Character Count

    Besides statistics, this technique might be useful to identify letters that might need to be corrected.

     

     

    tip09data$___tip09lookup$_
    _Text_____Pos_
    _Tableau Software_1
    ...
    16
    _
    cartesian product
    non- results de-emphasized
    TextPos
    Tableau Software1
    Tableau Software2
    Tableau Software3
    Tableau Software4
    Tableau Software5
    Tableau Software6
    Tableau Software7
    Tableau Software8
    Tableau Software9
    Tableau Software10
    Tableau Software11
    Tableau Software12
    Tableau Software13
    Tableau Software14
    Tableau Software15
    Tableau Software16

     

    Query

     

    SELECT mid$(d.Text, l.Pos, 1) as [Character]
    FROM [tip09data$] d, [tip09lookup$] l
    WHERE l.Pos <= len(d.Text)
     

     

    Comment

    The condition on line 3 improves performance by limiting number of rows to what is needed.

     

    See also

    Comma Separated value within a Column

     

     

     

    Other Techniques

    There were some techniques I chose not to include as tip of their own. These techniques are still valuable for those who want to learn various cross join techniques and therefore listed below:

     

    Re: measure open and closed cases for week

    Re: Turnover Rate (Hiring Date and Terms Date)

     

     

     

    Related

    Support cross product / cross join / Cartesian join / Cartesian product in new connection interface

    Support custom WHERE clause in new connection interface

    Date period dimension type with built-in row split

    Is it possible to incorporate data reshaping into the Tableau Extract Engine?

    Built-in 2-step SQL engine

    Knowledge Base: Installing the Tableau Add-In for Reshaping Data in Excel

    Knowledge Base: Working with Survey Data

    Knowledge Base: Using a Market Basket Analysis

    Tableau Search On "Cross Join"

    FAQ:  Open & Close Dates

     

    -