SOCKServer freezes after a few creates #98

@GindaChen

The following will make the SOCK server freeze indefinitely:

> ol worker -d -o limits.installer_mem_mb=250,server_mode="sock",mem_pool_mb=500

# Create a sandbox from echo a few times...
# /path/to/echo refers to the test-registry/echo
> curl -X POST http://localhost:5000/create -d '{"parent": "", "leaf": true, "code": "/path/to/echo"}'
> curl -X POST http://localhost:5000/create -d '{"parent": "", "leaf": false, "code": "/path/to/echo"}'

Then create two more sandboxes with sandbox 1 as the parent:

> curl -X POST http://localhost:5000/create -d '{"parent": "1", "leaf": false, "code": "/path/to/echo"}'
> curl -X POST http://localhost:5000/create -d '{"parent": "1", "leaf": true, "code": "/path/to/echo"}'

Both creates fail, and the worker freezes after logging:

2020/03/04 20:35:44 POST /create
2020/03/04 20:35:44 Parsed Args: map[code:/root/SOCKexp/use-sock/test-registry/echo leaf:false parent:1]
2020/03/04 20:35:44 <sandboxes>.Create(<SB 1>, false, /root/SOCKexp/use-sock/test-registry/echo, /root/SOCKexp/use-sock/default-ol/worker/scratch/dir-1008, <installs=[], imports=[], mem-limit-mb=50>)=8... [SOCK POOL sandboxes]
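
The reproduction steps above can be scripted so that a frozen worker shows up as a timeout instead of a hung terminal. This is a minimal sketch, not part of OpenLambda: the helper names (`create_payload`, `post_create`, `run_repro`) and the 10-second timeout are assumptions chosen for illustration, and `/path/to/echo` is the same placeholder used in the curl commands.

```python
import json
import urllib.error
import urllib.request

WORKER_URL = "http://localhost:5000/create"  # endpoint from the curl commands
ECHO_PATH = "/path/to/echo"                  # placeholder, as in the report

# The four create requests from the report, in order: (parent, leaf).
REPRO_STEPS = [("", True), ("", False), ("1", False), ("1", True)]


def create_payload(parent, leaf, code=ECHO_PATH):
    """Build the JSON body that the curl commands send with -d."""
    return json.dumps({"parent": parent, "leaf": leaf, "code": code}).encode()


def post_create(payload, timeout_s=10.0):
    """POST one create request and return a short status string.

    The timeout is an assumption: it only keeps the client from blocking
    forever if the worker stops responding.
    """
    req = urllib.request.Request(WORKER_URL, data=payload, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            body = resp.read().decode(errors="replace").strip()
            return f"HTTP {resp.status}: {body}"
    except urllib.error.HTTPError as err:
        # The worker answered, but with an error status.
        return f"HTTP {err.code}"
    except OSError as err:
        # Covers connection refused, resets, and timeouts.
        return f"no response ({err})"


def run_repro():
    """Send the four creates in order and print what came back."""
    for parent, leaf in REPRO_STEPS:
        status = post_create(create_payload(parent, leaf))
        print(f"parent={parent!r} leaf={leaf}: {status}")
```

With a worker started as shown at the top of the report, calling `run_repro()` sends the same four requests. A request that reports "no response" after the timeout would correspond to the freeze described here.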

If we then try to kill the worker with Ctrl-C, it does not shut down cleanly. It freezes after these logs:

^C2020/03/04 20:33:00 received kill signal, cleaning up
2020/03/04 20:33:00 Destroy() [SB 1]
Traceback (most recent call last):
Traceback (most recent call last):
  File "sock2.py", line 175, in <module>
  File "sock2.py", line 175, in <module>
Traceback (most recent call last):
Traceback (most recent call last):
  File "sock2.py", line 175, in <module>
  File "sock2.py", line 175, in <module>
    main()
  File "sock2.py", line 171, in main
    main()
  File "sock2.py", line 171, in main
    start_container()
  File "sock2.py", line 136, in start_container
    main()
  File "sock2.py", line 171, in main
    start_container()
  File "sock2.py", line 136, in start_container
    main()
  File "sock2.py", line 171, in main
    exec(code)
  File "<string>", line 1, in <module>
    start_container()
  File "sock2.py", line 136, in start_container
    exec(code)
  File "<string>", line 1, in <module>
  File "sock2.py", line 52, in web_server
    start_container()
  File "sock2.py", line 136, in start_container
    exec(code)
  File "<string>", line 1, in <module>
  File "sock2.py", line 52, in web_server
    tornado.ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 863, in start
  File "sock2.py", line 63, in fork_server
    tornado.ioloop.IOLoop.instance().start()
    exec(code)
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 863, in start
  File "<string>", line 1, in <module>
    client, info = file_sock.accept()
  File "/usr/lib/python3.6/socket.py", line 205, in accept
  File "sock2.py", line 63, in fork_server
    client, info = file_sock.accept()
  File "/usr/lib/python3.6/socket.py", line 205, in accept
    fd, addr = self._accept()
KeyboardInterrupt
    event_pairs = self._impl.poll(poll_timeout)
KeyboardInterrupt
    fd, addr = self._accept()
    event_pairs = self._impl.poll(poll_timeout)
KeyboardInterrupt
KeyboardInterrupt
2020/03/04 20:33:00 ...returns <SB 5>, <nil> [SOCK POOL sandboxes]
2020/03/04 20:33:00 Save ID '5' to map
2020/03/04 20:33:00 parent.fork returned connection refused [SOCK POOL sandboxes]
2020/03/04 20:33:00 Destroy() [SB 6]
2020/03/04 20:33:00 CG ref count decremented to 0 [SOCK 6]
2020/03/04 20:33:00 parent.fork returned connection refused [SOCK POOL sandboxes]
2020/03/04 20:33:00 Destroy() [SB 7]
2020/03/04 20:33:00 CG ref count decremented to 1 [SOCK 1]
2020/03/04 20:33:00 Destroy() [SB 2]
2020/03/04 20:33:00 CG ref count decremented to 0 [SOCK 7]
2020/03/04 20:33:00 killed PIDs [] in CG [SOCK 6]
2020/03/04 20:33:00 unmount and remove dirs [SOCK 6]
2020/03/04 20:33:00 CG ref count decremented to 0 [SOCK 2]
2020/03/04 20:33:00 killed PIDs [] in CG [SOCK 7]
2020/03/04 20:33:00 unmount and remove dirs [SOCK 7]
2020/03/04 20:33:00 waiting for 1 procs in cg-2 to die [CGROUP POOL default-ol-sandboxes]
2020/03/04 20:33:00 ...returns <nil>, Fork from parent Sandbox failed [SOCK POOL sandboxes]
2020/03/04 20:33:00 Request Handler Failed: Fork from parent Sandbox failed
2020/03/04 20:33:00 ...returns <nil>, Fork from parent Sandbox failed [SOCK POOL sandboxes]
2020/03/04 20:33:00 Request Handler Failed: Fork from parent Sandbox failed
2020/03/04 20:33:00 killed PIDs [28427] in CG [SOCK 2]
2020/03/04 20:33:00 unmount and remove dirs [SOCK 2]
2020/03/04 20:33:01 Destroy() [SB 3]
2020/03/04 20:33:01 CG ref count decremented to 0 [SOCK 3]
2020/03/04 20:33:01 killed PIDs [] in CG [SOCK 3]
2020/03/04 20:33:01 unmount and remove dirs [SOCK 3]
2020/03/04 20:33:01 Destroy() [SB 4]
2020/03/04 20:33:01 CG ref count decremented to 0 [SOCK 4]
2020/03/04 20:33:01 killed PIDs [] in CG [SOCK 4]
2020/03/04 20:33:01 unmount and remove dirs [SOCK 4]
2020/03/04 20:33:01 make sure all memory is free [SOCK POOL sandboxes]
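
To narrow down where shutdown stalls, it may help to list the sandboxes that logged `Destroy()` but never logged `unmount and remove dirs`. The sketch below does that over a few lines copied from the log above. It assumes, from the shape of the log only and not from the worker source, that `unmount and remove dirs` marks the end of a sandbox's teardown. The function name `unfinished_destroys` is hypothetical.

```python
import re

# Patterns taken from the log lines in this report.
DESTROY_RE = re.compile(r"Destroy\(\) \[SB (\d+)\]")
REMOVED_RE = re.compile(r"unmount and remove dirs \[SOCK (\d+)\]")


def unfinished_destroys(log_text):
    """Return IDs that logged Destroy() but no 'unmount and remove dirs'.

    Assumption: the unmount line is the last step of a sandbox teardown.
    """
    destroyed = DESTROY_RE.findall(log_text)
    removed = set(REMOVED_RE.findall(log_text))
    return [sb for sb in destroyed if sb not in removed]


# A few lines copied from the shutdown log above.
SAMPLE = """\
2020/03/04 20:33:00 Destroy() [SB 1]
2020/03/04 20:33:00 Destroy() [SB 6]
2020/03/04 20:33:00 CG ref count decremented to 0 [SOCK 6]
2020/03/04 20:33:00 CG ref count decremented to 1 [SOCK 1]
2020/03/04 20:33:00 killed PIDs [] in CG [SOCK 6]
2020/03/04 20:33:00 unmount and remove dirs [SOCK 6]
"""

print(unfinished_destroys(SAMPLE))  # prints ['1']
```

Run over the full shutdown log, the same check also reports only sandbox 1: its cgroup ref count drops to 1 and never reaches 0, which may be a useful place to start looking.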
