Spark Issue: _pickle.PicklingError: args[0] from __newobj__ args has the wrong class

Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!

Please refer to Spark Issue: Task Not Serializable for a similar serialization issue in Spark/Scala.

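Symptom

A PySpark job that uses a UDF or pandas UDF fails with the following error.

_pickle.PicklingError: args[0] from __newobj__ args has the wrong class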

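Cause

Spark pickles a UDF together with the objects it references in order to ship it to the executors, and some objects cannot be pickled this way.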

For example, if you have the following import

from nltk.corpus import stopwords

then calling the following inside a UDF or pandas UDF might cause this issue.

stopwords.words("english")

Solution

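Call stopwords.words("english") once outside the UDF or pandas UDF, store the result in a global variable, and reference that variable inside the UDF. The result is a plain list of strings, so the UDF no longer references the stopwords object itself.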
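A minimal sketch of this pattern is given below. The names STOP_WORDS, remove_stop_words and make_spark_udf are hypothetical, and the stop-word set is hard-coded so that the snippet runs without NLTK data or a Spark installation. In a real job the set would be built once on the driver from stopwords.words("english"), as noted in the comments.

```python
# Sketch only: STOP_WORDS is hard-coded here so the snippet runs without NLTK data.
# In a real job it would be computed once on the driver, outside any UDF:
#     from nltk.corpus import stopwords
#     STOP_WORDS = frozenset(stopwords.words("english"))
STOP_WORDS = frozenset({"the", "a", "is", "of"})


def remove_stop_words(text):
    """Drop stop words from a whitespace-separated string (hypothetical helper)."""
    # Only the plain frozenset is referenced here, never the nltk `stopwords` object.
    return " ".join(w for w in text.split() if w.lower() not in STOP_WORDS)


def make_spark_udf():
    """Wrap the helper as a Spark UDF.

    PySpark is imported lazily so the rest of the snippet runs without Spark.
    """
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    return udf(remove_stop_words, StringType())
```

Assuming a DataFrame df with a string column named text (both hypothetical), the UDF could then be applied as df.withColumn("clean", make_spark_udf()(df["text"])).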

References

About Python: Spark-Submit fails with a "pickling error": "_pickle.PicklingError: args[0] from __newobj__ args has the wrong class"

_pickle.PicklingError: args[0] from __newobj__ args has the wrong class from cloudpickle.py