Acing Apache Spark Scenario-based Question Series-5 using PySpark DataFrames

Apr 28, 2023

Joining two tables vertically when no common column is involved in PySpark, using the monotonically_increasing_id() and zipWithIndex() functions.

In this scenario we will discuss different ways to concatenate two tables vertically in PySpark when they have no common join column.

As per the Databricks documentation, I was able to find only two ways to do this using built-in functions: monotonically_increasing_id() and zipWithIndex().

First, let me depict the input and the required output DataFrames:

Let's check the code …

This code might look clumsy, but it serves the purpose.

Note: If anyone has a better approach to generalizing this code, I am happy to embed it in my script.

That’s all for now…Happy Learning….

Please do clap and Subscribe to my profile…Don’t forget to Comment…

Written by Sairamdgr8 -- An Aspiring Full Stack Data Engineer