comparison roundup/backends/sessions_dbm.py @ 6565:2c2dbfc332ba
Try to handle multiple connections better.
The session database is a hot spot. When multiple requests (e.g. 20)
come in at the same time, session database contention can become severe.
The original code didn't retry session database access when the open
failed. This resulted in errors at the client.
The second pass delayed 0.01 seconds and retried. It was better, but we
still had multi-second stalls. I think the first request got in,
everybody else backed up and then retried at the same time, so they
stepped on each other again. With logging I would see many counters go all
the way down to low single digits or to -1, indicating failure.
This pass uses random.randint to generate delays from 0 to 0.125 seconds in 5ms
increments. This performs better in testing. I rarely saw a counter
less than 13 (2 failed retries). Current logging starts after 6
failures and counts down until success or failure.
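The staggered retry described above can be sketched in isolation. This is a minimal illustration, not the code from sessions_dbm.py: the function name `open_with_jitter` and the `open_func`, `step`, `max_steps` and `sleep` parameters are hypothetical and exist only so the loop can be run without a real dbm file. The counter follows the changeset: it starts at 15 and the exception is re-raised once it has dropped below 0.

```python
import random
import time


def open_with_jitter(open_func, retries=15, step=0.005, max_steps=25,
                     sleep=time.sleep):
    """Call open_func() until it succeeds or the retries run out.

    After each OSError, sleep a random multiple of `step` seconds
    (0 to max_steps steps, so 0 to 0.125 s with the defaults) so that
    competing callers do not all retry at the same moment.
    """
    retries_left = retries
    while True:
        try:
            return open_func()
        except OSError:
            if retries_left < 0:
                # Retries are used up: re-raise the last error.
                raise
            # Randomized delay to spread out the retries.
            sleep(random.randint(0, max_steps) * step)
            retries_left -= 1
```

A caller would pass a zero-argument callable, for example `open_with_jitter(lambda: dbm.open(path, mode))`, where `dbm`, `path` and `mode` are whatever the surrounding code already has in scope.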
| author | John Rouillard <rouilj@ieee.org> |
|---|---|
| date | Thu, 16 Dec 2021 20:02:00 -0500 |
| parents | bef1e42be04c |
| children | b4d0b48b3096 |
| 6564:21c7c2041a4b | 6565:2c2dbfc332ba |
|---|---|
| 4 Yes, it's called "sessions" - because originally it only defined a session | 4 Yes, it's called "sessions" - because originally it only defined a session |
| 5 class. It's now also used for One Time Key handling too. | 5 class. It's now also used for One Time Key handling too. |
| 6 """ | 6 """ |
| 7 __docformat__ = 'restructuredtext' | 7 __docformat__ = 'restructuredtext' |
| 8 | 8 |
| 9 import os, marshal, time | 9 import os, marshal, time, logging, random |
| 10 | 10 |
| 11 from roundup.anypy.html import html_escape as escape | 11 from roundup.anypy.html import html_escape as escape |
| 12 | 12 |
| 13 from roundup import hyperdb | 13 from roundup import hyperdb |
| 14 from roundup.i18n import _ | 14 from roundup.i18n import _ |
| 130 | 130 |
| 131 # open the database with the correct module | 131 # open the database with the correct module |
| 132 dbm = __import__(db_type) | 132 dbm = __import__(db_type) |
| 133 | 133 |
| 134 retries_left = 15 | 134 retries_left = 15 |
| 135 logger = logging.getLogger('roundup.hyperdb.backend.sessions') | |
| 135 while True: | 136 while True: |
| 136 try: | 137 try: |
| 137 handle = dbm.open(path, mode) | 138 handle = dbm.open(path, mode) |
| 138 break | 139 break |
| 139 except OSError: | 140 except OSError as e: |
| 140 # Primarily we want to catch and retry: | 141 # Primarily we want to catch and retry: |
| 141 # [Errno 11] Resource temporarily unavailable retry | 142 # [Errno 11] Resource temporarily unavailable retry |
| 142 # FIXME: make this more specific | 143 # FIXME: make this more specific |
| 144 if retries_left < 10: | |
| 145 logger.warning('dbm.open failed, retrying %s left: %s'%(retries_left,e)) | |
| 143 if retries_left < 0: | 146 if retries_left < 0: |
| 144 # We have used up the retries. Reraise the exception | 147 # We have used up the retries. Reraise the exception |
| 145 # that got us here. | 148 # that got us here. |
| 146 raise | 149 raise |
| 147 else: | 150 else: |
| 148 # delay retry a bit | 151 # stagger retry to try to get around thundering herd issue. |
| 149 time.sleep(0.01) | 152 time.sleep(random.randint(0,25)*.005) |
| 150 retries_left = retries_left - 1 | 153 retries_left = retries_left - 1 |
| 151 continue # the while loop | 154 continue # the while loop |
| 152 return handle | 155 return handle |
| 153 | 156 |
| 154 def commit(self): | 157 def commit(self): |
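The `FIXME: make this more specific` comment in the hunk above suggests that only the "Resource temporarily unavailable" case should be retried. One possible way to narrow the check is sketched below. It is an assumption, not part of this changeset: the helper name `is_retryable` and the set `RETRYABLE_ERRNOS` are hypothetical, and it presumes the exception raised by `dbm.open` carries a meaningful `errno` attribute, which may not hold for every dbm backend.

```python
import errno

# Assumption: these are the error codes worth retrying. The changeset
# only mentions "[Errno 11] Resource temporarily unavailable".
RETRYABLE_ERRNOS = {errno.EAGAIN, errno.EWOULDBLOCK}


def is_retryable(exc):
    """Return True if exc is an OSError with a 'try again' errno."""
    return isinstance(exc, OSError) and exc.errno in RETRYABLE_ERRNOS
```

Inside the `except OSError as e:` branch, a guard such as `if not is_retryable(e): raise` would then let unrelated failures (for example a missing directory) surface immediately instead of being retried.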
