I am running a Jenkins Controller in Kubernetes and have noticed that the controller has been restarting a lot.
kgp jkmaster-0
NAME         READY   STATUS    RESTARTS   AGE
jkmaster-0   1/1     Running   8          30m
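For reference, the restart reason can be confirmed from the pod's last terminated state (assuming the pod is in the current namespace):

# Show reason, exit code and timestamps of the last container termination
kubectl describe pod jkmaster-0 | grep -A 7 "Last State"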
The memory allocation for the pod is as follows:
Limits:
  memory: 2500M
Requests:
  cpu: 300m
  memory: 1G
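For completeness, the limit the JVM actually sees can be read from the cgroup inside the container (a quick check, assuming the image has a shell; the file name differs between cgroup v1 and v2):

# Prints the container memory limit in bytes (cgroup v2 first, then v1)
kubectl exec jkmaster-0 -- sh -c 'cat /sys/fs/cgroup/memory.max 2>/dev/null || cat /sys/fs/cgroup/memory/memory.limit_in_bytes'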
As long as the controller is idle, I don't see any spikes. But as soon as I start spawning jobs, memory spikes occur, and each spike results in an OOM error and a restart.
kgp jkmaster-0
NAME         READY   STATUS      RESTARTS   AGE
jkmaster-0   0/1     OOMKilled   3          3h8m
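The OOMKilled reason and the corresponding exit code 137 can also be pulled straight from the container status (assuming the pod has a single container):

# Last termination reason and exit code of the first container
kubectl get pod jkmaster-0 -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason} {.status.containerStatuses[0].lastState.terminated.exitCode}'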
In order to look into this further, I would like to generate a heap dump, so what I have done is add the following:
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/srv/jenkins/
to JAVA_OPTS. I am expecting that the next time the Jenkins controller hits an OOM, it should generate a heap dump under /srv/jenkins/, but there is none. Any idea if there is something I have missed?
There is no file of the form java_pid<pid>.hprof under /srv/jenkins/ after a restart.
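To rule out the flags simply not being picked up, the running JVM and the dump directory can be checked (a sketch, assuming jcmd is present in the image and the Jenkins JVM is PID 1):

# Confirm the heap-dump flags are active in the live JVM
kubectl exec jkmaster-0 -- jcmd 1 VM.flags | tr ' ' '\n' | grep -i HeapDump
# Confirm /srv/jenkins exists and is writable by the Jenkins user
kubectl exec jkmaster-0 -- sh -c 'ls -ld /srv/jenkins && touch /srv/jenkins/.writetest && rm /srv/jenkins/.writetest'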
All JAVA_OPTS
JAVA_OPTS: -Djava.awt.headless=true -XX:InitialRAMPercentage=10.0 -XX:MaxRAMPercentage=60.0 -server -XX:NativeMemoryTracking=summary -XX:+UseG1GC -XX:+ExplicitGCInvokesConcurrent -XX:+ParallelRefProcEnabled -XX:+UseStringDeduplication \
-XX:+UnlockDiagnosticVMOptions -XX:G1SummarizeRSetStatsPeriod=1 -XX:+PrintFlagsFinal -Djenkins.install.runSetupWizard=false -Dhudson.DNSMultiCast.disabled=true \
-Dhudson.slaves.NodeProvisioner.initialDelay=5000 -Dsecurerandom.source=file:/dev/urandom \
-Xlog:gc:file=/srv/jenkins/gc-%t.log -Xlog:gc*=debug -XX:+AlwaysPreTouch -XX:+DisableExplicitGC \
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/srv/jenkins/ -Dhudson.model.ParametersAction.keepUndefinedParameters=true -Dhudson.model.DownloadService.noSignatureCheck=true
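With the 2500M limit and -XX:MaxRAMPercentage=60.0, the effective maximum heap works out to roughly 1.5 GB; since -XX:+PrintFlagsFinal is already set, the resolved value can be checked in the container log (assuming the JVM writes its startup output to stdout):

# The resolved MaxHeapSize appears in the startup flag dump
kubectl logs jkmaster-0 | grep -iw MaxHeapSize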

Unless you have mounted /srv/jenkins to a hostPath: or a PVC, it is very likely the Pod bounce is resetting the root FS in your container, taking any heap dump with it; make sure /srv/jenkins is mounted on a PVC. Also be aware that OOMKilled is something the kubelet does to your container, and not something that the JVM does to itself. That process was kill -9-ed (in fact, I don't know of any "warning shot" k8s offers the Pod); if you are interested in having the JVM participate in the OOM triage, you'll want to lower the Xmx below the Pod's resource boundary, so the JVM exhausts itself before k8s steps in with a more violent outcome.
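If /srv/jenkins is not already on persistent storage, a minimal sketch of the mount, assuming the controller is the jkmaster StatefulSet and the container is named jenkins (adapt names and sizes to your setup):

# StatefulSet excerpt: keep /srv/jenkins on a PVC so a dump survives the Pod bounce
spec:
  template:
    spec:
      containers:
        - name: jenkins
          volumeMounts:
            - name: jenkins-dumps
              mountPath: /srv/jenkins
  volumeClaimTemplates:
    - metadata:
        name: jenkins-dumps
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 5Gi

On the heap side, something like -Xmx1800m (or a MaxRAMPercentage comfortably below the 2500M limit) leaves headroom for metaspace, threads and native memory, so the JVM hits OutOfMemoryError and writes the dump before the kubelet sends SIGKILL.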