
Installing PySpark

PySpark can be installed on your local machine for learning. This chapter covers installation on Windows, Mac, and Linux, along with setting up a virtual environment.

Prerequisites

  • Python 3.8 or later.
  • Java 8 or later (Spark runs on the JVM).
  • 8+ GB RAM recommended for local testing.

Installing Java

Download and install OpenJDK (Eclipse Temurin) from adoptium.net, then verify the installation:
java --version

Create and Activate Virtual Environment

python -m venv pyspark-env
source pyspark-env/bin/activate # Mac/Linux
pyspark-env\Scripts\activate # Windows
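Before installing anything, you can sanity-check that the virtual environment is actually active from inside Python (this check is a convenience, not part of the official steps):

```python
import sys

# Inside an activated venv, sys.prefix points at the environment
# directory (e.g. .../pyspark-env), while sys.base_prefix still
# points at the base system Python installation.
print(sys.prefix)

# True when running inside a virtual environment
print(sys.prefix != sys.base_prefix)
```

If the first line does not mention your `pyspark-env` directory, re-run the activation command for your platform.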

Install PySpark

pip install pyspark
This single command installs PySpark together with a bundled Spark distribution, so no separate Spark download is required.

Verify Installation

Run Python and import PySpark:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("Test").getOrCreate()
print(spark.version)
If the version number prints without errors, PySpark is ready.


Two Minute Drill
  • Install Java (OpenJDK) before PySpark.
  • Use a virtual environment.
  • Install PySpark with `pip install pyspark`.
  • Test with `SparkSession.builder.getOrCreate()`.

Need more clarification?

Drop us an email at career@quipoinfotech.com