
StdioClientTransport missing explicit UTF-8 charset in InputStreamReader (same issue as #295, but on client side) #898

@KawaRyoji

Bug description

StdioClientTransport has the same encoding mismatch that #295 reported and #826 fixed for StdioServerTransportProvider, but the fix was applied only to the server side. The client transport still does not specify UTF-8 explicitly when reading from the subprocess.

In startInboundProcessing, the InputStreamReader is created without specifying a charset:

https://github.com/modelcontextprotocol/java-sdk/blob/main/mcp-core/src/main/java/io/modelcontextprotocol/client/transport/StdioClientTransport.java#L249

try (BufferedReader processReader = new BufferedReader(new InputStreamReader(process.getInputStream()))) {

Similarly, in startErrorProcessing:

https://github.com/modelcontextprotocol/java-sdk/blob/main/mcp-core/src/main/java/io/modelcontextprotocol/client/transport/StdioClientTransport.java#L182-L183

try (BufferedReader processErrorReader = new BufferedReader(
        new InputStreamReader(process.getErrorStream()))) {

Meanwhile, startOutboundProcessing already correctly specifies UTF-8:

os.write(jsonMessage.getBytes(StandardCharsets.UTF_8));
os.write("\n".getBytes(StandardCharsets.UTF_8));

This is the same inconsistency that #295 reported for StdioServerTransportProvider and that #826 fixed, but only on the server side: the client writes UTF-8 and reads with whatever the JVM default charset happens to be.
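A minimal sketch of what an explicit-charset read could look like, mirroring the UTF-8 already used on the write path. The class and method names (Utf8ReaderSketch, readLines) are hypothetical, and a ByteArrayInputStream stands in for process.getInputStream(); this is an illustration, not the SDK's actual patch.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the MCP Java SDK.
class Utf8ReaderSketch {

    // Reads newline-delimited messages, always decoding as UTF-8
    // regardless of Charset.defaultCharset().
    static List<String> readLines(InputStream in) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for process.getInputStream(): one JSON line with Japanese text.
        String json = "{\"text\":\"\u3053\u3093\u306B\u3061\u306F\"}";
        byte[] wire = (json + "\n").getBytes(StandardCharsets.UTF_8);

        List<String> lines = readLines(new ByteArrayInputStream(wire));
        // True on any JVM default charset, because the decode charset is explicit.
        System.out.println(lines.get(0).equals(json));
    }
}
```

The same two-argument InputStreamReader constructor would apply to the error stream reader in startErrorProcessing.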

Steps to reproduce

  1. Start a JVM with default charset set to something other than UTF-8 (e.g., -Dfile.encoding=COMPAT on Windows with Japanese locale, which resolves to MS932/Shift_JIS)
  2. Connect to an MCP server via StdioClientTransport
  3. Call a tool that returns multi-byte UTF-8 characters (e.g., Japanese, Chinese, Korean, emoji) in its response
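Before running the steps above, it may help to confirm that step 1 actually took effect, since JDK 18 and later default to UTF-8 unless -Dfile.encoding=COMPAT is passed. A small check along these lines could be used; the class name DefaultCharsetCheck is hypothetical, and the native.encoding property is only populated on Java 17 and later (it prints null on older runtimes).

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic for illustration; not part of the MCP Java SDK.
class DefaultCharsetCheck {

    // Summarizes the charset settings that the one-argument
    // InputStreamReader constructor would pick up.
    static String describe() {
        return "defaultCharset=" + Charset.defaultCharset()
                + ", file.encoding=" + System.getProperty("file.encoding")
                + ", native.encoding=" + System.getProperty("native.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

If defaultCharset reports UTF-8, the bug will not reproduce on that JVM.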

Expected behavior

Multi-byte characters in the server's JSON-RPC response should be decoded correctly, since the MCP stdio transport specification requires UTF-8.

Actual behavior

The InputStreamReader uses Charset.defaultCharset() instead of UTF-8. When the default charset is not UTF-8, the response bytes are decoded with the wrong charset, which corrupts multi-byte characters. The corruption can also break the JSON structure itself, resulting in a JsonParseException:

com.fasterxml.jackson.core.JsonParseException: Unexpected character ('\' (code 92)): was expecting double-quote to start field name

For example, with MS932 as the default charset, the last byte of certain UTF-8 characters (0x8B, etc.) is interpreted as an MS932 lead byte, which then consumes the following byte as its trail byte. That following byte can be a character the JSON parser depends on, such as \ (0x5C). Once it is absorbed into a bogus double-byte character, the parser state shifts and JSON parsing fails entirely.
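The byte-swallowing effect can be reproduced without a subprocess. The sketch below uses an illustrative payload, not one taken from the report: a JSON line containing か (U+304B, UTF-8 bytes E3 81 8B) followed by a JSON \n escape. It decodes the same bytes once as UTF-8 and once as windows-31j (the JDK's name for MS932). The class name Ms932MisdecodeDemo is hypothetical, and the MS932 branch is skipped if the runtime does not ship that charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo for illustration; not part of the MCP Java SDK.
class Ms932MisdecodeDemo {

    // JSON text {"text":"か\n"}, where \n is the two-character JSON escape.
    // In UTF-8 the bytes around the escape are E3 81 8B 5C 6E.
    static final String JSON = "{\"text\":\"\u304B\\n\"}";

    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] wire = JSON.getBytes(StandardCharsets.UTF_8);

        // Correct path: decoding with the charset the bytes were written in.
        String ok = decode(wire, StandardCharsets.UTF_8);
        System.out.println("UTF-8 decode intact: " + ok.equals(JSON));

        // Wrong path: 0x8B is an MS932 lead byte, so it pairs with 0x5C
        // and the backslash disappears into a double-byte character.
        if (Charset.isSupported("windows-31j")) {
            String wrong = decode(wire, Charset.forName("windows-31j"));
            System.out.println("MS932 decode still has backslash: " + wrong.contains("\\"));
        } else {
            System.out.println("windows-31j not available on this runtime");
        }
    }
}
```

With the backslash gone, what the parser sees is no longer the JSON the server sent, which is consistent with the parse failure described above.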

Environment

  • MCP Java SDK version: 1.1.1
  • Java version: 21
  • OS: Windows 11 (Japanese locale, default charset MS932 with -Dfile.encoding=COMPAT)
