This is not a tutorial about Airflow or Docker, but an explanation of how to set up a less demanding Docker environment to run Airflow locally. The "full" proposed version for running Airflow inside a Docker container is highly resource-intensive and puts a heavy load on a computer or laptop (the cooling fan of my laptop was always on). For more information about the full version, I advise you to see the Data Engineering Zoomcamp mentioned above and this article (in Portuguese) by Leandro Bueno.

Introduction to Airflow and Docker

1.1 Apache Airflow

Apache Airflow is one of the best-known tools in the data engineering world, so I will not spend long explaining it. It is an open-source data orchestration tool that lets you build full end-to-end pipelines by connecting several processes in Directed Acyclic Graphs (DAGs). It connects and organizes tasks that manage data, and it is not a data streaming tool, as stated on the official Airflow website: "Airflow is not a data streaming solution. Tasks do not move data from one to the other (though tasks can exchange metadata!)".

Airbnb developed Airflow in 2014, released it as a free tool in 2015, and donated it to the Apache Foundation the following year.
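To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use Airflow's API. It only illustrates how declaring dependencies between tasks determines a valid run order, and why a cycle makes the graph unusable. The task names (`extract`, `transform`, `load`, `report`) and the helper functions are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for what an orchestrator does internally.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def execution_order(deps):
    """Return one run order in which every task comes after its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


def is_valid_dag(deps):
    """Return True if the dependency graph has no cycles."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

A graph such as `{"a": {"b"}, "b": {"a"}}` makes `is_valid_dag` return `False`, because neither task could ever start. That is the "acyclic" requirement in practice.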