
Work With Multiple Spark Installations

The notes on this page are fragmentary and immature thoughts of the author. Please read with your own judgement!

spark-submit and spark-shell

Overwriting the PATH environment variable, so that the desired Spark installation's `bin/` directory comes first, before invoking spark-submit and/or spark-shell often resolves the issue.
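A minimal sketch of the idea, assuming Spark is installed under `/opt/spark-3.1.1-bin-hadoop3.2` (the same location used elsewhere on this page; adjust to your own install):

:::shell
# Point SPARK_HOME at the installation you want to use.
export SPARK_HOME=/opt/spark-3.1.1-bin-hadoop3.2
# Prepend its bin/ directory so it shadows any other spark-submit
# or spark-shell that appears later on PATH.
export PATH="$SPARK_HOME/bin:$PATH"
:::

After this, `which spark-submit` should point into `$SPARK_HOME/bin`, confirming that the intended installation is picked up.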

Spark in Jupyter/Lab Notebooks

Removing or resetting the environment variable HADOOP_CONF_DIR resolves the issue.

:::python
import os

# Clear HADOOP_CONF_DIR so Spark does not pick up a Hadoop/cluster
# configuration belonging to a different installation.
os.environ["HADOOP_CONF_DIR"] = ""

# Point findspark at the specific Spark installation to use.
import findspark

findspark.init("/opt/spark-3.1.1-bin-hadoop3.2/")

from pyspark.sql import SparkSession, DataFrame

spark = SparkSession.builder.appName("PySpark_Notebook") \
    .enableHiveSupport().getOrCreate()
...
:::