Dogukan UluData Engineering End-to-End Project — PostgreSQL, Airflow, Docker, PandasRepositorySep 19, 202311Sep 19, 202311
Simardeep SinghBuilding a Data Streaming Pipeline: Leveraging Kafka, Spark, Airflow, and DockerIn our rapidly evolving digital age, data engineering has emerged as the backbone of the modern data-driven world. We’re surrounded by an…Nov 6, 20239Nov 6, 20239
InData Engineer ThingsbyStephen David-WilliamsData pipelines with Python and SQL — Part 1What data pipelines are and how Python & SQL are involvedSep 1, 20235Sep 1, 20235
InDev GeniusbyHaq NawazSetup PySpark locally & build your first ETL pipeline with PySparkUsing Python, Jupyter NotebookOct 2, 20223Oct 2, 20223
AGKETL Pipeline using AWS DataPipeline, EMR, S3 and PySparkWhen I was learning AWS DataPipeline, I didn’t find much resources to create AWS Data Pipeline using Pipeline Definition json and…Sep 30, 2022Sep 30, 2022
InITNEXTbyRamses Alexander Coraspe ValdezBuilding Real-time communication with Apache Spark through Apache LivyDockerizing and Consuming an Apache Livy environmentJun 12, 20221Jun 12, 20221
InTDS ArchivebyEdwin TanHow to Test PySpark ETL Data PipelineValidate big data pipeline with Great ExpectationsDec 6, 20221Dec 6, 20221
InTDS ArchivebyNicholas LeongData Engineering — How to Build a Gmail Data Pipeline on Apache AirflowHack your Gmail InboxJul 8, 20191Jul 8, 20191