It is recommended to use python-build-standalone instead of conda-pack to build portable Python environments. Please refer to Packaging Python Dependencies for PySpark Using Python-Build-Standalone for more details.
- All packages in a virtual environment must be managed by conda (rather than pip) so that the environment can be packed using conda-pack.
- When using a conda-pack virtual environment with PySpark, the `pyspark` package that comes with Spark is automatically injected into PYTHONPATH, so users do not have to install `pyspark` into the virtual environment themselves. In fact, the `pyspark` that comes with Spark is always used when you submit a PySpark application with a conda-pack virtual environment, even if a local copy is installed in that environment. For more discussion, please refer to this issue.
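The tips above can be sketched as a pair of command builders: one that packs a conda-managed environment with `conda pack`, and one that ships the resulting archive with `spark-submit`. This is only an illustrative sketch. The environment name `my_env`, the archive name, the alias `environment`, the script `app.py`, and the choice of YARN cluster mode are all assumptions that do not come from the text above. Note that `pyspark` is deliberately not installed into the environment, since Spark supplies its own copy.

```python
from __future__ import annotations

import shlex


def build_pack_command(env_name: str, archive: str) -> list[str]:
    """Build the conda-pack command that archives a conda-managed environment."""
    # `-n` selects the environment by name, `-o` sets the output archive path.
    return ["conda", "pack", "-n", env_name, "-o", archive]


def build_submit_command(
    archive: str, app: str, alias: str = "environment"
) -> list[str]:
    """Build a spark-submit command that uses the Python inside the archive.

    Assumes YARN cluster mode (hypothetical choice for this sketch).
    """
    # The archive is unpacked on each node under the directory named by `alias`.
    python = f"./{alias}/bin/python"
    return [
        "spark-submit",
        "--master", "yarn",
        "--deploy-mode", "cluster",
        # Point the driver (application master) and executors at the packed Python.
        "--conf", f"spark.yarn.appMasterEnv.PYSPARK_PYTHON={python}",
        "--conf", f"spark.executorEnv.PYSPARK_PYTHON={python}",
        # `archive#alias` asks Spark to extract the archive into `alias`.
        "--archives", f"{archive}#{alias}",
        app,
    ]


# Print the commands instead of running them, so the sketch has no side effects.
print(shlex.join(build_pack_command("my_env", "my_env.tar.gz")))
print(shlex.join(build_submit_command("my_env.tar.gz", "app.py")))
```

The printed commands can be pasted into a shell once the names are adjusted for your cluster. Other deployment modes may need different settings for selecting the Python interpreter.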
References
Pack a Conda Virtual Environment
https://conda.github.io/conda-pack/
https://conda.github.io/conda-pack/cli.html