Things on this page are fragmentary and immature notes/thoughts of the author. Please read with your own judgement!
Please refer to Spark Issue: _Pickle.Picklingerror: Args[0] from Newobj Args Has the Wrong Class for a similar serialization issue in PySpark.
Error Message¶
org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: ...
Possible Causes¶
Some object sent to works from the driver is not serializable.
Solutions¶
Don’t send the non-serializable object to workers.
Use a serializable version if you do want to send the object to workders.