SQL to Alteryx

Reference comparison between Alteryx functions and SQL and pandas. My perspective on the functionality of these tools.

The idea of this post is to map constructs from SQL to their equivalent node in Alteryx.

SQL	Alteryx	Remarks
WHERE	Filter	In Alteryx, we have access to both matched and unmatched datasets. By match we mean where the condition was met.
ORDER BY	Sort
WINDOW FUNCTIONS LEAD LAG	Multi-row formula
Forumla such as Multiplication	Formula
Formual among multiple fields	Multi-field formula	In SQL we'd reference each field individually. In Alteryx, we use the expression builder
ROW_NUMBER	Record ID
DISTINCT	Unique	In Alteryx, we are also provided access to the non-unique values as output. Calculating non-unique values in SQL would require another step of subtracting the output of DISTINCT from the original dataset.
LIMIT	Sample	In Alteryx, the Sample tool also provides other functionality that would require additional steps in SQL. For e.g. Last N, 1 of every N, first N% etc. It also allows grouping, so features like Group by a field and then take the first N rows of each are performed in a single click.
Unknown	Random % Sample	Random sampling by number or percentage
Not Present	Auto Field	This detects the optimal field types and lengths. This usually takes a long time to do in Python. Pandas does make it possible to autodetect by setting a parameter when reading in a DataFrame from a file, but it takes a lot longer, so the Alteryx implementation is pretty optimized
NVL and CASE along with aggregate functions	Imputation
GROUP BY and Unknown	Tile	This is a much more advance tool and performs a lot of different grouping functionality that is not natively supported by SQL or would take multiple steps to achieve. This essentially performs a lot of common clustering operations with ease.
Unknown	Generate Rows	This allows generation of data and looks like a visual representation of a for loop in procedural programming with the addition of using database rows as input for concatentation.

I can see, that through splitting data output to both matched/unmatched, Alteryx is trying to reduce the number of steps a human would need to do by anticipating that ahead of time.