changeset 8543:1ffa1f42e1da

refactor: rework mime type comparison and clean code rest.py: accept application/* as match for application/json in non /binary_context rest path. allow defining default mime type to return when file/message is missing mime type. Make it a class variable to it can be changed from text/plain to text/markdown or whatever. extract code from determine_output_format() to create create_valid_content_types() method which returns a list of matching mime types for a given type/subtype. Eliminate mostly duplicate return statements by introducing a variable to specify valid mime types in error message. rest_common.py: Fix error messages that now return application/* as valid mime type. CHANGES.txt upgrading.txt rest.txt: top level notes and corrections. Also correct rst syntax on earlier change.
author John Rouillard <rouilj@ieee.org>
date Tue, 24 Mar 2026 21:30:47 -0400
parents a4f017ae1477
children e738377b4ffe
files CHANGES.txt doc/rest.txt doc/upgrading.txt roundup/rest.py test/rest_common.py
diffstat 5 files changed, 160 insertions(+), 70 deletions(-) [+]
line wrap: on
line diff
--- a/CHANGES.txt	Tue Mar 24 16:56:38 2026 -0400
+++ b/CHANGES.txt	Tue Mar 24 21:30:47 2026 -0400
@@ -65,7 +65,10 @@
 - fix bug in 2.5.0 where roundup-admin import (or importtable) fails
   to properly set the next available id for the class. (John Rouillard
   broke it and fixed it 8-))
-  
+- refactor mime detection/handling in the rest interface. Better
+  supports some mime types, ads default mime type for files without a
+  mime type (e.g. message contents). Cleaner code. (John Rouillard)
+
 Features:
 
 - add support for authorized changes. User can be prompted to enter
--- a/doc/rest.txt	Tue Mar 24 16:56:38 2026 -0400
+++ b/doc/rest.txt	Tue Mar 24 21:30:47 2026 -0400
@@ -945,12 +945,13 @@
 representation of the binary data.
 
 If you want it returned with a ``Content-Type: image/png`` header,
-you can use ``image/png`` or ``*/*`` in the Accept header.
+you can use ``image/png``, ``image/*``, or ``*/*`` in the Accept header.
 
 For message files, you can use
 ``https://.../demo/rest/data/msg/23/binary_content`` with ``Accept:
 application/octet-stream; q=0.5, application/json; q=0.4, image/png;
-q=0.495, text/*``. It will return the plain text of the message.
+q=0.495, text/*``. It will return the plain text (text/plain mime
+type) of the message.
 
 Most message files are not stored with a mime type. Getting
 ``https://.../demo/rest/data/msg/23/type`` returns::
@@ -972,10 +973,16 @@
 can be used to access any text style mime type (``text/plain``,
 ``text/x-rst``, ``text/markdown``, ``text/html``, ...) or an empty
 mime type. If the item's type is not empty, it will be used as the
-Content-Type (similar to ``*/*``). Otherwise ``text/*`` will be the
-Content-Type. If your tracker supports markup languages
-(e.g. markdown), you should set the mime type (e.g. ``text/markdown``)
-when storing your message.
+Content-Type (similar to ``*/*``). Otherwise the default mime type
+``text/plain`` will be the Content-Type. The default mime type can be
+changed using ``interfaces.py`` in your tracker. So you can change the
+default ``text/plain`` to ``text/markdown`` using::
+
+   from roundup.rest import RestfulInstance
+   RestfulInstance.default_text_file_mimetype = "text/markdown"
+
+If your tracker supports markup languages (e.g. markdown), you should
+set the mime type (e.g. ``text/markdown``) when storing your message.
 
 Note that the header ``X-Content-Type-Options: nosniff`` is returned
 with a non javascript or xml binary_content response to prevent the
--- a/doc/upgrading.txt	Tue Mar 24 16:56:38 2026 -0400
+++ b/doc/upgrading.txt	Tue Mar 24 21:30:47 2026 -0400
@@ -255,7 +255,7 @@
 newer.
 
 To add this to your tracker, change the ``page.html`` (for TAL
-based trackers) or ``layout/navigation.html (for jinja2 trackers).
+based trackers) or ``layout/navigation.html`` (for jinja2 trackers).
 
 For TAL trackers, replace the ``required`` parameter by finding
 the following password input in the tracker's ``html/page.html``
@@ -279,6 +279,29 @@
 
   <input class="form-control form-control-sm" spellcheck="false" type="password" name="__login_password" placeholder='password' {{ "required" if not db.config.WEB_LOGIN_EMPTY_PASSWORDS }}>
 
+REST interface changes (info)
+-----------------------------
+
+The default type of a message file (the msg class) is now
+``text/plain``.  This can be change to a different mime type using
+`the interfaces.py file`_. See `the rest documentation on getting file
+content for details <rest.html#getting-message-and-files-content>`_.
+
+Roundup supports a wildcard mime type. So ``text/*`` would match any
+text resource. The response to this request should have a full mime
+type like ``text/plain`` or ``text/csv`` etc. However Roundup used to
+return ``text/*`` which makes it difficult for the client to interpret
+the data. This release should do a better job of returning a proper
+subtype replacing the ``*``.
+
+The mime type of ``application/*`` resolves to ``application/json`` by
+default. Using it before resulted in an error.
+
+Malformed mime types assigned to files/messages could cause a crash,
+now they are just ignored.
+
+There have been some internal refactorings and improvements in the
+REST code that will make it a bit faster.
 
 .. index:: Upgrading; 2.4.0 to 2.5.0
 
--- a/roundup/rest.py	Tue Mar 24 16:56:38 2026 -0400
+++ b/roundup/rest.py	Tue Mar 24 21:30:47 2026 -0400
@@ -441,6 +441,7 @@
     __default_patch_op = "replace"  # default operator for PATCH method
     __accepted_content_type = {
         "application/json": "json",
+        "application/*": "json"     # json is preferred over application/xml
     }
     __default_accept_type = "json"
 
@@ -449,6 +450,11 @@
 
     api_version = None
 
+    # to allow override from interfaces.py.
+    # Used for msg class. Could
+    # be useful to set to text/markdown.
+    default_text_file_mimetype = "text/plain"
+
     # allow 10M row response - can change using interfaces.py
     # limit is 1 less than this size.
     max_response_row_size = 10000001
@@ -2227,6 +2233,78 @@
             # human may read it
         ), None)
 
+    def create_valid_content_types(self, mime_type):
+        """Return a list of mime types matching the submitted mime_type.
+
+           For a MIME type of type/subtype return a list that matches:
+
+              type/subtype
+              type/*
+              application/octet-stream
+              */*
+
+           If mime_type for the file is empty return:
+
+               ["text/plain", "text/*", "application/octet-stream", */*]
+
+           Issue messages (msg) often are missing a mime type. So we
+           treat them text/plain. text/plain is the value of
+           RestfulInterface.default_text_file_mimetype. This can be
+           changed.
+
+           The first mime type in the list is usually used as the
+           Content-Type in the http response.
+
+        """
+        valid_binary_content_types = []
+
+        if mime_type:
+            # strip off MIME parameters. We may do param matching
+            # in the future but not now.
+            mime_type, *_more = mime_type.split(';')
+            # remove spaces in case they snuck in.
+            mime_type = mime_type.strip()
+            # Put the mime_type first. If the user is requesting a
+            # specific type/subtype we will match on first item.
+            # Also first item is returned as Content-type/accept_type.
+            valid_binary_content_types.append(mime_type)
+
+            # For mime type main_type/subtype add main_type/*
+            try:
+                (main_type, _subtype) = mime_type.split('/', 2)
+            except ValueError:
+                # malformed mime type, 0, 2+ '/' chars
+                # don't add anything
+                pass
+            else:
+                valid_binary_content_types.append(main_type + "/*")
+
+        if not mime_type:
+            # allow text/* as msg items can have empty type field
+            # also match text/* for text/plain, text/x-rst,
+            # text/markdown, text/html etc.
+            #
+            # Ideally we would know the subtype.
+            #
+            # text/plain works for most messages, but
+            # text/markdown could be valid for a modified tracker.
+            #
+            # Value can be overridden by interfaces.py, but consider
+            # adding config setting for the default file mime type.
+            valid_binary_content_types.append(self.default_text_file_mimetype)
+            valid_binary_content_types.append("text/*")
+
+        # Octet-stream should be allowed for any content.
+        # If mime_type unparsable, the file will get the
+        # Content-Type of application/octet-stream as this will
+        # be the first element of the returned list.
+        valid_binary_content_types.append("application/octet-stream")
+
+        # */* should be allowed for any content.
+        valid_binary_content_types.append("*/*")
+
+        return valid_binary_content_types
+
     def determine_output_format(self, uri):
         """Returns tuple of:
 
@@ -2287,10 +2365,10 @@
                         _("Content type '%(requested)s' requested in URL is "
                           "not available.\nAcceptable types: "
                           "%(acceptable)s\n") %
-                        { "requested": ext_type,
-                          "acceptable": ", ".join(sorted(
-                              set(self.__accepted_content_type.values())))}))
-            
+                        {"requested": ext_type,
+                         "acceptable": ", ".join(sorted(
+                             set(self.__accepted_content_type.values())))}))
+
             # strip extension so uri makes sense.
             # E.G. .../issue.json -> .../issue
             uri = uri[:-(len(ext_type) + 1)]
@@ -2317,12 +2395,12 @@
             return (self.__default_accept_type, uri, None)
 
         accept_type = ""
-        valid_binary_content_types = []
+        valid_binary_content_types = None
         if uri.endswith("/binary_content"):
             request_path = uri
             request_class, request_id = request_path.split('/')[-3:-1]
             try:
-                designator_type = self.db.getclass(
+                declared_type = self.db.getclass(
                     request_class).get(request_id, "type")
             except (KeyError, IndexError):
                 # class (KeyError) or
@@ -2331,28 +2409,14 @@
                 # The 400/404 error will be thrown by other code.
                 return (None, uri, None)
 
-            if designator_type:
-                # put this first as we usually require exact mime
-                # type match and this will be matched most often.
-                # Also for text/* Accept header it will be returned.
-                valid_binary_content_types.append(designator_type)
-
-            if not designator_type or designator_type.startswith('text/'):
-                # allow text/* as msg items can have empty type field
-                # also match text/* for text/plain, text/x-rst,
-                # text/markdown, text/html etc.
-                valid_binary_content_types.append("text/*")
-
-            # Octet-stream should be allowed for any content.
-            # client.py sets 'X-Content-Type-Options: nosniff'
-            # for file downloads (sendfile) via the html interface,
-            # so we should be able to set it in this code as well.
-            valid_binary_content_types.append("application/octet-stream")
+            valid_binary_content_types = self.create_valid_content_types(
+                declared_type)
 
         for part in accept_header:
             if accept_type:
                 # we accepted the best match, stop searching for
-                # lower quality matches.
+                # lower quality matches. A break happens in every clause
+                # that sets accept_type. This is just a safety.
                 break
 
             # check for structured rest return types (json xml)
@@ -2384,25 +2448,24 @@
                                part[1]['version'])
                     return (None, uri,
                            self.error_obj(406, msg))
+                break
 
-            if part[0] == "*/*":
-                if valid_binary_content_types:
-                    self.client.setHeader("X-Content-Type-Options", "nosniff")
-                    accept_type = valid_binary_content_types[0]
-                else:
+            # client.py sets 'X-Content-Type-Options: nosniff'
+            # for file downloads (sendfile) via the html interface,
+            # so we should be able to set it in this code as well.
+            if not valid_binary_content_types:
+                if part[0] == "*/*":
                     accept_type = "json"
+                    break
 
             # check type of binary_content
-            if part[0] in valid_binary_content_types:
+            elif part[0] in valid_binary_content_types:
                 self.client.setHeader("X-Content-Type-Options", "nosniff")
-                accept_type = part[0]
-            # handle text wildcard
-            if ((part[0] in 'text/*') and
-                "text/*" in valid_binary_content_types):
-                self.client.setHeader("X-Content-Type-Options", "nosniff")
-                # use best choice of mime type, try not to use
-                # text/* if there is a real text mime type/subtype.
-                accept_type = valid_binary_content_types[0]
+                # try to return most exact MIME type if request
+                # includes wildcard.
+                accept_type = part[0] if part[0].find('*') == -1 \
+                    else valid_binary_content_types[0]
+                break
 
         # accept_type will be empty only if there is an Accept header
         # with invalid values.
@@ -2410,27 +2473,21 @@
             return (accept_type, uri, None)
 
         if valid_binary_content_types:
-            return (None, uri,
-                    self.error_obj(
-                        406,
-                        _("Requested content type(s) '%(requested)s' not "
-                          "available.\nAcceptable mime types are: */*, "
-                          "%(acceptable)s") %
-                        {"requested":
-                         self.client.request.headers.get('Accept'),
-                         "acceptable": ", ".join(sorted(
-                             valid_binary_content_types))}))
+            report_acceptable_types = valid_binary_content_types
+        else:
+            report_acceptable_types = list(self.__accepted_content_type.keys())
+            report_acceptable_types.append("*/*")
 
         return (None, uri,
                 self.error_obj(
                     406,
                     _("Requested content type(s) '%(requested)s' not "
-                      "available.\nAcceptable mime types are: */*, "
+                      "available.\nAcceptable mime types are: "
                       "%(acceptable)s") %
                     {"requested":
                      self.client.request.headers.get('Accept'),
                      "acceptable": ", ".join(sorted(
-                         self.__accepted_content_type.keys()))}))
+                         report_acceptable_types))}))
 
     def dispatch(self, method, uri, input_payload):
         """format and process the request"""
--- a/test/rest_common.py	Tue Mar 24 16:56:38 2026 -0400
+++ b/test/rest_common.py	Tue Mar 24 21:30:47 2026 -0400
@@ -1909,7 +1909,7 @@
         (output_type, uri, error) = dof("/rest/data/issue")
 
         self.assertEqual(self.server.client.response_code, 406)
-        self.assertIn(b"Requested content type(s) 'application/zot; version=1; q=0.5' not available.\nAcceptable mime types are: */*, application/json",
+        self.assertIn(b"Requested content type(s) 'application/zot; version=1; q=0.5' not available.\nAcceptable mime types are: */*, application/*, application/json",
                       s2b(error['error']['msg']))
 
         # simulate: /rest/data/issue works, multiple acceptable output, one
@@ -1985,7 +1985,7 @@
         self.assertEqual(self.server.client.response_code, 400)
         self.assertEqual(output_type, None)
         self.assertEqual(uri, "/rest/data/issue")
-        self.assertIn('Unable to parse Accept Header. Invalid media type: Xyzzy I am not a mime. Acceptable types: */*, application/json', error['error']['msg'])
+        self.assertIn('Unable to parse Accept Header. Invalid media type: Xyzzy I am not a mime. Acceptable types: */*, application/*, application/json', error['error']['msg'])
 
         # test 5 accept mimetype is ok, param is not
         headers={"accept": "*/*; foo",
@@ -2007,7 +2007,7 @@
         self.assertEqual(self.server.client.response_code, 400)
         self.assertEqual(output_type, None)
         self.assertEqual(uri, "/rest/data/issue")
-        self.assertIn('Unable to parse Accept Header. Invalid param: foo. Acceptable types: */*, application/json', error['error']['msg'])
+        self.assertIn('Unable to parse Accept Header. Invalid param: foo. Acceptable types: */*, application/*, application/json', error['error']['msg'])
 
         # test 6: test paths:
         #
@@ -2144,7 +2144,7 @@
                  "output_type": None,
                  "uri": "/rest/data/file/1/binary_content",
                  "error": {'error': 
-                           {'status': 406, 'msg': "Requested content type(s) 'image/svg+html' not available.\nAcceptable mime types are: */*, application/octet-stream, image/png"}},
+                           {'status': 406, 'msg': "Requested content type(s) 'image/svg+html' not available.\nAcceptable mime types are: */*, application/octet-stream, image/*, image/png"}},
                  "has_nosniff": False,
                 }),
             (# use wildcard accept and get back msg's actual mime type
@@ -2199,16 +2199,16 @@
                 {"path": "/rest/data/msg/2/binary_content",
                  "accept": "*/*",
                  "response_code": "",
-                 "output_type": "text/*",
+                 "output_type": "text/plain",
                  "uri": "/rest/data/msg/2/binary_content",
                  "error": None,
                  "has_nosniff": True,
                 }),
-            (# use text/* and get back text/*
+            (# use text/* and get back text/plain
                 {"path": "/rest/data/msg/2/binary_content",
                  "accept": "text/*",
                  "response_code": "",
-                 "output_type": "text/*",
+                 "output_type": "text/plain",
                  "uri": "/rest/data/msg/2/binary_content",
                  "error": None,
                  "has_nosniff": True,
@@ -2221,7 +2221,7 @@
                  "uri": "/rest/data/msg/2/binary_content",
                  "error": {'error': 
                            {'status': 406, 'msg':
-                            "Requested content type(s) 'text/markdown' not available.\nAcceptable mime types are: */*, application/octet-stream, text/*"}},
+                            "Requested content type(s) 'text/markdown' not available.\nAcceptable mime types are: */*, application/octet-stream, text/*, text/plain"}},
                  "has_nosniff": False,
                 }),
             (# use error accept and get back error
@@ -2232,7 +2232,7 @@
                  "uri": "/rest/data/msg/1/binary_content",
                  "error": {'error':
                            {'status': 400, 'msg':
-                            'Unable to parse Accept Header. Invalid media type: q=2. Acceptable types: */*, application/json'}},
+                            'Unable to parse Accept Header. Invalid media type: q=2. Acceptable types: */*, application/*, application/json'}},
                  "has_nosniff": False,
                  }),
             (# use text/* but override with extension of .json get back json
@@ -2358,7 +2358,7 @@
         print(results)
         json_dict = json.loads(b2s(results))
         self.assertEqual(self.server.client.response_code, 406)
-        self.assertIn("Requested content type(s) 'application/zot; version=1; q=0.5' not available.\nAcceptable mime types are: */*, application/json",
+        self.assertIn("Requested content type(s) 'application/zot; version=1; q=0.5' not available.\nAcceptable mime types are: */*, application/*, application/json",
                       json_dict['error']['msg'])
 
         # simulate: /rest/data/issue works, multiple acceptable output, one
@@ -2439,7 +2439,7 @@
         print(results)
         self.assertEqual(self.server.client.response_code, 400)
         json_dict = json.loads(b2s(results))
-        self.assertIn('Unable to parse Accept Header. Invalid media type: Xyzzy I am not a mime. Acceptable types: */*, application/json', json_dict['error']['msg'])
+        self.assertIn('Unable to parse Accept Header. Invalid media type: Xyzzy I am not a mime. Acceptable types: */*, application/*, application/json', json_dict['error']['msg'])
 
         # test 5 accept mimetype is ok, param is not
         headers={"accept": "*/*; foo",
@@ -2463,7 +2463,7 @@
         print(results)
         self.assertEqual(self.server.client.response_code, 400)
         json_dict = json.loads(b2s(results))
-        self.assertIn('Unable to parse Accept Header. Invalid param: foo. Acceptable types: */*, application/json', json_dict['error']['msg'])
+        self.assertIn('Unable to parse Accept Header. Invalid param: foo. Acceptable types: */*, application/*, application/json', json_dict['error']['msg'])
 
     @skip_on_py2
     def testBadJson(self):

Roundup Issue Tracker: http://roundup-tracker.org/