
The snapshot above shows a simple join in Alteryx, along with the description provided by Alteryx Designer. The two input datasets (colored in green) are joined on a set of ‘ON’ criteria, that is, one or more dimensions the datasets share. In the example I discuss in the video, I joined both CSV files by playerID; this ID serves as the unique identifier for joining selected dimensions to the left (L) table. The Join tool (colored in purple) ingests the datasets and executes the join. The output is three distinct data cohorts: L, J, and R.
- L: Data from the L node, or what I like to call the L dataset; these are the records from the left dataset that did not join on the ‘ON’ criteria specified in the join
- J: Data from the J node, or what I like to call the J dataset; these are the records from both datasets that matched on the ‘ON’ criteria specified in the join
- R: Data from the R node, or what I like to call the R dataset; these are the records from the right dataset that did not join on the ‘ON’ criteria specified in the join
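If it helps to see the three cohorts outside of Alteryx, here is a minimal sketch in plain Python. This is not how Alteryx implements the Join tool; the `join_tool` function and the sample player and salary records are made up purely for illustration, with playerID as the join key.

```python
def join_tool(left, right, key):
    """Split two lists of dict records into (L, J, R) cohorts.

    A rough imitation of the three Join outputs, not Alteryx's actual code.
    """
    left_keys = {row[key] for row in left}
    right_keys = {row[key] for row in right}
    # L: left records whose key has no match on the right
    l_out = [row for row in left if row[key] not in right_keys]
    # R: right records whose key has no match on the left
    r_out = [row for row in right if row[key] not in left_keys]
    # J: every matching left/right pair, merged into a single record
    j_out = [
        {**lrow, **rrow}
        for lrow in left
        for rrow in right
        if lrow[key] == rrow[key]
    ]
    return l_out, j_out, r_out


# Hypothetical sample data standing in for the two CSV files
players = [
    {"playerID": "p1", "name": "Ann"},
    {"playerID": "p2", "name": "Bob"},
]
salaries = [
    {"playerID": "p2", "salary": 100},
    {"playerID": "p3", "salary": 200},
]

l_out, j_out, r_out = join_tool(players, salaries, "playerID")
print("L:", l_out)  # unmatched left records
print("J:", j_out)  # matched records
print("R:", r_out)  # unmatched right records
```

In this toy data, p1 only exists on the left, p3 only exists on the right, and p2 is the one record that joins.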
With the help of the Union tool, you can union cohorts of data to mirror the output you would expect from a conventional SQL join. For example, a SQL left join returns the records that matched, along with the unmatched records from the base table (the left table, also known as table A). Using a Union tool, you can union the results from the J node and the L node to return that same result as a single dataset. This covers a simple left join; check out the image below (grabbed off the net) for a visual explanation of the other types of SQL joins.
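Before moving on to the other join types, here is a small sketch of that union idea in plain Python. The `union` helper and the sample records are hypothetical, and I am assuming that fields missing from one input come through as nulls (None here); treat it as an illustration rather than a copy of the Union tool.

```python
def union(*datasets):
    """Stack lists of dict records; fields a record lacks become None."""
    columns = []
    for data in datasets:
        for row in data:
            for col in row:
                if col not in columns:
                    columns.append(col)
    return [
        {col: row.get(col) for col in columns}
        for data in datasets
        for row in data
    ]


# Hypothetical Join outputs for two small inputs keyed on playerID
l_rows = [{"playerID": "p1", "name": "Ann"}]                 # left only
j_rows = [{"playerID": "p2", "name": "Bob", "salary": 100}]  # matched
r_rows = [{"playerID": "p3", "salary": 200}]                 # right only

left_join = union(j_rows, l_rows)           # like SQL LEFT JOIN
right_join = union(j_rows, r_rows)          # like SQL RIGHT JOIN
full_outer = union(j_rows, l_rows, r_rows)  # like SQL FULL OUTER JOIN

print(left_join)
```

The J output on its own corresponds to an inner join, so the three nodes cover the common SQL join types between them.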

We also look at some other useful tools, like Append and Find and Replace, to show how they can perform some really powerful data manipulation in your workflow. While we’re only working with datasets in the thousands of records, these tools become very powerful as you start working with databases that run into the millions of records.
I’m hoping this quick walk-through helps you through your data journey.
Take a look at the walk-through video below:
