This may be repetitive for some users, but I found that it is a little difficult for most people to get started with Apache Spark (this post will focus on PySpark) on their local machine.

This tutorial will only work on Windows >= 10. I will assume you know what Apache Spark is, and what PySpark is too, but if you have questions, don't mind asking them below. The $ symbol means "run this in the shell" (but don't copy the symbol).

Make sure you have Java 8 or higher installed on your computer. Of course, you will also need Python (I recommend Python >= 3.5 from Anaconda).

Select the latest Spark release, a prebuilt package for Hadoop, and download it directly. If you want Hive support or other fancy stuff, you will have to build your Spark distribution on your own -> Build Spark.

Unzip it and move it to your /opt folder:

$ tar -xzf spark-2.2.0-bin-hadoop2.7.tgz
$ mv spark-2.2.0-bin-hadoop2.7 /opt/spark-2.2.0

Create a symbolic link:

$ ln -s /opt/spark-2.2.0 /opt/spark

Finally, tell your bash (or zsh, etc.) where to find Spark.
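The last step above can be sketched as follows; a minimal example assuming Spark was symlinked at /opt/spark as shown earlier (adjust the path if yours differs). Append these lines to your ~/.bashrc (or ~/.zshrc):

```shell
# Minimal environment setup for PySpark, assuming the /opt/spark symlink
# created in the step above. Append to ~/.bashrc (or ~/.zshrc).
export SPARK_HOME=/opt/spark          # root of the Spark distribution
export PATH="$SPARK_HOME/bin:$PATH"   # makes spark-submit and pyspark findable
export PYSPARK_PYTHON=python3         # which Python interpreter Spark should use
```

After reloading your shell configuration (source ~/.bashrc), running pyspark should drop you into a Python REPL with a SparkSession already available.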