SQL to Alteryx

Reference comparison between Alteryx functions and SQL and pandas. My perspective on the functionality of these tools.

SQL to Alteryx

The idea of this post is to map constructs from SQL to their equivalent node in Alteryx.

SQL Alteryx Remarks
WHERE Filter In Alteryx, we have access to both matched and unmatched datasets. By match we mean where the condition was met.
ORDER BY Sort
WINDOW FUNCTIONS LEAD LAG Multi-row formula
Forumla such as Multiplication Formula
Formual among multiple fields Multi-field formula In SQL we'd reference each field individually. In Alteryx, we use the expression builder
ROW_NUMBER Record ID
DISTINCT Unique In Alteryx, we are also provided access to the non-unique values as output. Calculating non-unique values in SQL would require another step of subtracting the output of DISTINCT from the original dataset.
LIMIT Sample In Alteryx, the Sample tool also provides other functionality that would require additional steps in SQL. For e.g. Last N, 1 of every N, first N% etc. It also allows grouping, so features like Group by a field and then take the first N rows of each are performed in a single click.
Unknown Random % Sample Random sampling by number or percentage
Not Present Auto Field This detects the optimal field types and lengths. This usually takes a long time to do in Python. Pandas does make it possible to autodetect by setting a parameter when reading in a DataFrame from a file, but it takes a lot longer, so the Alteryx implementation is pretty optimized
NVL and CASE along with aggregate functions Imputation
GROUP BY and Unknown Tile This is a much more advance tool and performs a lot of different grouping functionality that is not natively supported by SQL or would take multiple steps to achieve. This essentially performs a lot of common clustering operations with ease.
Unknown Generate Rows This allows generation of data and looks like a visual representation of a for loop in procedural programming with the addition of using database rows as input for concatentation.

I can see, that through splitting data output to both matched/unmatched, Alteryx is trying to reduce the number of steps a human would need to do by anticipating that ahead of time.