
I'm running Spark on EKS with Fargate. I don't quite understand why the memory limit of the Spark executor pod is always different from what I expect. When I set:

spark.executor.memory = 6g
spark.executor.memoryOverhead = 0.10
spark.memory.offHeap.size = 0

the resource limit of the Spark executor pod is actually 8601Mi. I don't understand why this is the case: (6144 * 0.10) + 6144 = 6758.4. What else is occupying the remaining 1842.6 MiB of memory?

If I change the spark-submit parameters to:

spark.executor.pyspark.memory = 1g
spark.executor.memory = 6g
spark.executor.memoryOverhead = 0.10
spark.memory.offHeap.size = 0

then the resource limit of the Spark executor pod is 9625Mi, and again 9625 - ((6144 * 0.10) + 6144 + 1024) = 1842.6.

The larger the executor memory value, the greater the difference becomes.

1 Answer


This is because, for non-JVM jobs on Spark on Kubernetes, the memory overhead factor (not spark.executor.memoryOverhead) defaults to 0.4 rather than 0.10. That accounts for the extra memory: 6144 + floor(6144 * 0.4) = 6144 + 2457 = 8601 MiB, which is exactly the pod limit you are seeing.

From the Spark on Kubernetes documentation, "Configuration" section (the description of spark.kubernetes.memoryOverheadFactor):

This sets the Memory Overhead Factor that will allocate memory to non-JVM memory, which includes off-heap memory allocations, non-JVM tasks, various systems processes, and tmpfs-based local directories when spark.kubernetes.local.dirs.tmpfs is true. For JVM-based jobs this value will default to 0.10 and 0.40 for non-JVM jobs. This is done as non-JVM tasks need more non-JVM heap space and such tasks commonly fail with "Memory Overhead Exceeded" errors. This preempts this error with a higher default. This will be overridden by the value set by spark.driver.memoryOverheadFactor and spark.executor.memoryOverheadFactor explicitly.
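
To make the numbers concrete, here is a small illustrative sketch in Python of how the pod limit appears to be computed for a non-JVM job. The function name is made up, and the use of floor() is an assumption inferred from the observed limits (the overhead seems to be rounded down to whole MiB):

from math import floor

def pod_memory_limit_mib(executor_memory_mib, pyspark_memory_mib=0,
                         overhead_factor=0.4, offheap_mib=0):
    # executor heap + pyspark memory + overhead (a factor of the heap) + off-heap
    overhead = floor(executor_memory_mib * overhead_factor)
    return executor_memory_mib + pyspark_memory_mib + overhead + offheap_mib

print(pod_memory_limit_mib(6144))        # 8601 -> matches the 8601Mi pod limit
print(pod_memory_limit_mib(6144, 1024))  # 9625 -> matches the 9625Mi pod limit

If you want the 10% overhead you were expecting, the quoted documentation says the default can be overridden explicitly. Something along these lines should work (spark.executor.memoryOverheadFactor is available from Spark 3.3; on older versions spark.kubernetes.memoryOverheadFactor plays the same role):

--conf spark.executor.memoryOverheadFactor=0.1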
