VJN uses yuv420p10le as an intermediate, causing subsampling and potentially distorted colors #52

@WeatherWonders

VJN always converts the upscaled video to the pixel format yuv420p10le before encoding to the output format. This causes chroma subsampling.
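To illustrate why a 4:2:0 intermediate is lossy even when the final output is 4:4:4, here is a minimal sketch of chroma subsampling on a tiny plane. It assumes a plain 2x2 box average and nearest-neighbour upsampling; the real scalers in VapourSynth and FFmpeg use other filters and chroma siting, and the function names are made up for this example.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (simple box filter, an assumption)."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def upsample_nearest(plane):
    """Restore full resolution by repeating every sample in a 2x2 block."""
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A 4x4 chroma plane: fine detail in the top half, flat 2x2 blocks in the bottom half.
chroma = [
    [0, 255, 0, 255],
    [255, 0, 255, 0],
    [64, 64, 192, 192],
    [64, 64, 192, 192],
]

small = subsample_420(chroma)       # 2x2 plane, one sample per 2x2 block
restored = upsample_nearest(small)  # back to 4x4, but the fine detail is gone

for row in restored:
    print(row)
```

The flat blocks in the bottom half survive the round trip, but the alternating detail in the top half collapses to a uniform 127.5. Converting back to gbrp or yuv444p afterwards cannot recover it.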

If the input format is VP9 with pixel format gbrp, this conversion also distorts the colors significantly: the contrast is increased, and saturated colors get brighter or darker. I have seen the same problem occur with FFmpeg directly unless it is given just the right color conversion filter.
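I have not confirmed what causes the distortion in VJN. As a general illustration only, one well-known way for contrast to increase during a conversion is a range mismatch, where full-range (0-255) samples are treated as limited-range (16-235) and expanded again. The sketch below shows that arithmetic for 8-bit values; the function name is hypothetical and this is not taken from VJN's code.

```python
def expand_limited_to_full(value):
    """Map an 8-bit sample from limited range (16-235) to full range (0-255), with clipping."""
    scaled = (value - 16) * 255 / 219
    return max(0, min(255, round(scaled)))


# Samples that were already full range, wrongly expanded a second time.
for original in (0, 32, 128, 224, 255):
    print(original, "->", expand_limited_to_full(original))
```

Dark values are pushed darker and bright values brighter, and anything below 16 or above 235 is clipped. A mismatch in the RGB/YUV matrix coefficients is a separate possible cause and would shift saturated colors in a different way.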

The problem occurs regardless of the output format (e.g. VP9 gbrp, the H.264 preset, or the FFV1 preset) and regardless of the upscaling model.

Examples

Preview of the input VP9 gbrp video (frame extracted with FFmpeg):

Input image

VJN's output as VP9 gbrp video:

Distorted upscale

An upscale from ComfyUI using the same model in PTH format:

Correct upscale

If I first convert the input to yuv444p10le, VJN's output is not distorted but still appears to be chroma-subsampled:

Correct upscale

Console

Note the "Input #0" stream line, where FFmpeg reports the video piped from VSPipe as yuv420p10le.
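For anyone who wants to check their own logs, the following is a rough sketch that pulls the pixel format out of FFmpeg's "Stream ... Video:" lines. It assumes the line layout seen in this log, which is not a stable interface; the helper name is made up, and querying the file with ffprobe would be more robust.

```python
import re

# Matches e.g. "Stream #0:0: Video: rawvideo (...), yuv420p10le(progressive), ..."
# Assumption: the pixel format is the first word after the first comma.
STREAM_RE = re.compile(r"Stream #(\d+:\d+): Video: [^,]*, (\w+)")


def video_pix_fmts(log_text):
    """Return (stream id, pixel format) pairs for every video stream line in the log."""
    return STREAM_RE.findall(log_text)


sample_log = """\
  Stream #0:0: Video: rawvideo (Y3[11][10] / 0xA0B3359), yuv420p10le(progressive), 2880x2160, 29.97 fps, 29.97 tbr, 29.97 tbn
  Stream #1:0: Video: vp9 (Profile 1), gbrp(pc, gbr/unknown/unknown, progressive), 1440x1080, SAR 1:1 DAR 4:3, 29.97 fps, 29.97 tbr, 1k tbn
"""

for stream_id, pix_fmt in video_pix_fmts(sample_log):
    print(stream_id, pix_fmt)
```

On the two input lines from the log below, this reports yuv420p10le for the piped stream and gbrp for the source file.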

Generating TensorRT engine with command: python\VSPipe.exe -c y4m --arg "slot=1" --arg "video_path=D:\Vegas Studio\Renders\New Order - Test.webm" --start 0 --end 1 "C:\Small\VideoJaNai\current\backend\animejanai\core\animejanai_encode.vpy" -p .
run_animejanai slot 1 chain_1
chain_conf {'min_px': 0.0, 'max_px': inf, 'min_resolution': '0x0', 'max_resolution': '0x0', 'min_fps': 0.0, 'max_fps': inf, 'models': [{'resize_factor_before_upscale': 100.0, 'resize_height_before_upscale': 0.0, 'name': '2xLiveActionV1_SPAN_490000'}], 'rife': False, 'rife_factor_numerator': 2, 'rife_factor_denominator': 1, 'rife_model': 422, 'rife_ensemble': False, 'rife_scene_detect_threshold': 1.0, 'final_resize_height': 0.0, 'final_resize_factor': 100.0, 'tensorrt_engine_settings': ''}
upscale2x: scaling 2x from 1440x1080 with engine=2xLiveActionV1_SPAN_490000; num_streams=4
trt_settings --bf16 --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --inputIOFormats=fp32:chw --outputIOFormats=fp32:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference
upscale2x: scaling 2x from 1440x1080 with engine=2xLiveActionV1_SPAN_490000; num_streams=4
trt_settings --bf16 --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --inputIOFormats=fp32:chw --outputIOFormats=fp32:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference
Script evaluation done in 1.11 seconds
Frame: 2/2
Output 2 frames in 0.45 seconds (4.40 fps)
Upscaling with command: python\VSPipe.exe -c y4m --arg "slot=1" --arg "video_path=D:\Vegas Studio\Renders\New Order - Test.webm" "C:\Small\VideoJaNai\current\backend\animejanai\core\animejanai_encode.vpy" - | "C:\Users\Weather\AppData\Roaming\VideoJaNai\ffmpeg\ffmpeg.exe" -y -i pipe: -i "D:\Vegas Studio\Renders\New Order - Test.webm" -map 0:v -c:v libvpx-vp9 -lossless 1 -pix_fmt gbrp -max_interleave_delta 0 -map 1:t? -map 1:a?  -map 1:s? -c:t copy -c:a copy -c:s copy "D:\Vegas Studio\Renders\New Order - Test Upscaled.mkv"
ffmpeg version 2025-03-17-git-5b9356f18e-essentials_build-www.gyan.dev Copyright (c) 2000-2025 the FFmpeg developers
  built with gcc 14.2.0 (Rev1, Built by MSYS2 project)
  configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-libfreetype --enable-libfribidi --enable-libharfbuzz --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-dxva2 --enable-d3d11va --enable-d3d12va --enable-ffnvcodec --enable-libvpl --enable-nvdec --enable-nvenc --enable-vaapi --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
  libavutil      59. 60.100 / 59. 60.100
  libavcodec     61. 33.102 / 61. 33.102
  libavformat    61.  9.107 / 61.  9.107
  libavdevice    61.  4.100 / 61.  4.100
  libavfilter    10.  9.100 / 10.  9.100
  libswscale      8. 13.102 /  8. 13.102
  libswresample   5.  4.100 /  5.  4.100
  libpostproc    58.  4.100 / 58.  4.100
run_animejanai slot 1 chain_1
chain_conf {'min_px': 0.0, 'max_px': inf, 'min_resolution': '0x0', 'max_resolution': '0x0', 'min_fps': 0.0, 'max_fps': inf, 'models': [{'resize_factor_before_upscale': 100.0, 'resize_height_before_upscale': 0.0, 'name': '2xLiveActionV1_SPAN_490000'}], 'rife': False, 'rife_factor_numerator': 2, 'rife_factor_denominator': 1, 'rife_model': 422, 'rife_ensemble': False, 'rife_scene_detect_threshold': 1.0, 'final_resize_height': 0.0, 'final_resize_factor': 100.0, 'tensorrt_engine_settings': ''}
upscale2x: scaling 2x from 1440x1080 with engine=2xLiveActionV1_SPAN_490000; num_streams=4
trt_settings --bf16 --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --inputIOFormats=fp32:chw --outputIOFormats=fp32:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference
upscale2x: scaling 2x from 1440x1080 with engine=2xLiveActionV1_SPAN_490000; num_streams=4
trt_settings --bf16 --minShapes=input:1x3x8x8 --optShapes=input:1x3x1080x1920 --maxShapes=input:1x3x1080x1920 --inputIOFormats=fp32:chw --outputIOFormats=fp32:chw --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference
Input #0, yuv4mpegpipe, from 'pipe:':
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream #0:0: Video: rawvideo (Y3[11][10] / 0xA0B3359), yuv420p10le(progressive), 2880x2160, 29.97 fps, 29.97 tbr, 29.97 tbn
Input #1, matroska,webm, from 'D:\Vegas Studio\Renders\New Order - Test.webm':
  Metadata:
    ENCODER         : Lavf60.13.100
  Duration: 00:00:02.00, start: 0.000000, bitrate: 15867 kb/s
  Stream #1:0: Video: vp9 (Profile 1), gbrp(pc, gbr/unknown/unknown, progressive), 1440x1080, SAR 1:1 DAR 4:3, 29.97 fps, 29.97 tbr, 1k tbn
    Metadata:
      ENCODER         : Lavc60.26.100 libvpx-vp9
      DURATION        : 00:00:02.002000000
Stream mapping:
  Stream #0:0 -> #0:0 (rawvideo (native) -> vp9 (libvpx-vp9))
[libvpx-vp9 @ 000001e5d9c935c0] v1.15.0-68-g349820a50
Output #0, matroska, to 'D:\Vegas Studio\Renders\New Order - Test Upscaled.mkv':
  Metadata:
    encoder         : Lavf61.9.107
  Stream #0:0: Video: vp9 (VP90 / 0x30395056), gbrp(pc, gbr/unknown/unknown, progressive), 2880x2160, q=2-31, 29.97 fps, 1k tbn
    Metadata:
      encoder         : Lavc61.33.102 libvpx-vp9
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    0 fps=0.0 q=0.0 size=       0KiB time=N/A bitrate=N/A speed=N/A    
frame=    4 fps=0.9 q=0.0 size=       1KiB time=00:00:00.13 bitrate=  32.2kbits/s speed=0.0287x    
frame=    9 fps=1.7 q=0.0 size=    5120KiB time=00:00:00.30 bitrate=139670.5kbits/s speed=0.0581x    
frame=   12 fps=2.1 q=0.0 size=    5120KiB time=00:00:00.40 bitrate=104752.8kbits/s speed=0.0704x    
Output 60 frames in 6.91 seconds (8.68 fps)
frame=   15 fps=2.4 q=0.0 size=    5120KiB time=00:00:00.50 bitrate=83802.3kbits/s speed=0.0808x    
frame=   18 fps=2.7 q=0.0 size=    5120KiB time=00:00:00.60 bitrate=69835.2kbits/s speed=0.0895x    
frame=   21 fps=2.9 q=0.0 size=    5120KiB time=00:00:00.70 bitrate=59858.8kbits/s speed=0.0971x    
frame=   25 fps=3.2 q=0.0 size=    5120KiB time=00:00:00.83 bitrate=50281.3kbits/s speed=0.108x    
frame=   27 fps=3.3 q=0.0 size=   10752KiB time=00:00:00.90 bitrate=97769.3kbits/s speed=0.109x    
frame=   30 fps=3.4 q=0.0 size=   10752KiB time=00:00:01.00 bitrate=87992.4kbits/s speed=0.114x    
frame=   32 fps=3.4 q=0.0 size=   10752KiB time=00:00:01.06 bitrate=82492.9kbits/s speed=0.115x    
frame=   35 fps=3.6 q=0.0 size=   10752KiB time=00:00:01.16 bitrate=75422.1kbits/s speed=0.119x    
frame=   36 fps=3.5 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.116x    
frame=   36 fps=3.3 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.111x    
frame=   36 fps=3.2 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.106x    
frame=   36 fps=3.0 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.101x    
frame=   36 fps=2.9 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.097x    
frame=   36 fps=2.8 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.0932x    
frame=   36 fps=2.7 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.0896x    
frame=   36 fps=2.6 q=0.0 size=   10752KiB time=00:00:01.20 bitrate=73327.0kbits/s speed=0.0862x    
[out#0/matroska @ 000001e5d9d56740] video:25623KiB audio:0KiB subtitle:0KiB other streams:0KiB global headers:0KiB muxing overhead: 0.004326%
frame=   60 fps=4.3 q=0.0 Lsize=   25624KiB time=00:00:02.00 bitrate=104850.5kbits/s speed=0.142x    
