
Conversation

@borigas
Contributor

@borigas borigas commented Mar 5, 2023

Closes #638

  • Adds test for webseed with padding between files
  • Takes padding into account when building HTTP requests from blocks (see the sketch after this list)
  • Fixes issue where List<BlockInfo> bundle is non-sequential and non-adjacent, leading to requests with negative byte ranges, negative endIndex, and extra requests for ranges that are already downloaded
    • I couldn't isolate this into a reproducible test that I could share, but by pre-populating some of the destination files, this would happen more often than not
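
For context, a minimal sketch of the two behaviours described above (my own illustration, not MonoTorrent code; `TorrentFile`, `http_ranges`, and `contiguous_runs` are invented names). It assumes BEP 47-style padding files and shows (a) mapping a contiguous request onto per-file HTTP ranges while skipping padding, and (b) splitting non-adjacent blocks into separate runs so a negative byte range can never be produced:

```python
from dataclasses import dataclass

@dataclass
class TorrentFile:
    path: str
    length: int
    is_padding: bool
    torrent_offset: int  # absolute offset of this file's first byte within the torrent

def http_ranges(files, request_offset, request_length):
    """Yield (path, first_byte, last_byte) HTTP ranges covering one contiguous request."""
    request_end = request_offset + request_length
    for f in files:
        file_end = f.torrent_offset + f.length
        if file_end <= request_offset or f.torrent_offset >= request_end:
            continue                    # file lies outside the requested range
        if f.is_padding:
            continue                    # padding bytes are all zero; never fetch them over HTTP
        start = max(request_offset, f.torrent_offset) - f.torrent_offset
        end = min(request_end, file_end) - f.torrent_offset
        yield f.path, start, end - 1    # HTTP Range headers use inclusive end offsets

def contiguous_runs(blocks):
    """Group (offset, length) block requests into contiguous runs; a gap or an
    out-of-order block starts a new run instead of producing a negative range."""
    run = []
    for offset, length in sorted(blocks):
        if run and offset != run[-1][0] + run[-1][1]:
            yield run
            run = []
        run.append((offset, length))
    if run:
        yield run
```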

@alanmcgovern
Owner

Takes padding into account when building HTTP requests from blocks

From eyeballing the current implementation, it looks like it should be broken. However, the test you've added is passing for me anyway. When you said "I couldn't isolate this into a reproducible test that I could share", did you mean you can't figure out how to replicate the out-of-order requests, or you can't figure out how to replicate the lack of padding resulting in corrupt/broken files?

Either way, I'm curious about why those requests are out of order, so I'll examine the code a bit more closely and see if I can identify a test case which will clearly catch this :)

@borigas
Contributor Author

borigas commented Jul 30, 2024

Sorry, I've forgotten lots about this problem and my notes in #638 are much better than my memory at this point, but I'll do my best.

I was downloading files that were similar to existing files already on disk (some identical, some with similar content), so I was prepopulating the destination with those existing files, letting it checksum and clean up the files that didn't match, then downloading the missing pieces, as a way to save bandwidth. I don't remember if the issue required prepopulating some destination files or if I was able to repro without that step.

The issue had some randomness to it. I could download the same torrent repeatedly. Sometimes, it would finish. Sometimes, it would seem to get stuck. Once it was stuck, it would stay stuck. Restarting the process wouldn't unstick it. Looking at the state once stuck, it was requesting a negative byte range because the bundles were out of order.

What I meant about not being able to share a repro is that I wasn't able to consistently repro this from a unit test. I had to download a handful of times to get to a stuck state and then reuse that state to test my changes. I think the unit test I committed failed sometimes, but passed more often when run without existing files. My changes consistently recover from the out-of-order requests, but I don't remember being able to figure out why or how the bundles got out of order in the first place.

Please treat most of this comment's details with healthy skepticism. I don't remember a lot about this. What I can say confidently is that this change did fix the issue I was having. I've downloaded more than 10,000 torrents without an issue since this change. Before this change, when a web seed was the only seed available, it would get stuck frequently.

Can you elaborate on what looks broken in the current implementation?

@alanmcgovern
Owner

Ooh, I figured it out yesterday and have several tweaks to address it! There are two issues.

Firstly, the web request code has several places where it's improperly handling padding. Secondly, the piece picking code is supposed to request a single sequential block of data. However, if you change the internals of the code to force HTTP peers to request 1 block at a time, you can reliably trigger this bug.

What should happen is that a single block is requested, then fully processed, then the next is requested. What actually happens is that the code incorrectly checks that fewer than MaxRequests are requested, but it should check for "CanRequestMorePieces".

Also, to notice the bug you must have a piece length smaller than 8 blocks of data (128kB). With a piece length of 32kB you can make the engine choose 8 blocks, which is both blocks in 4 different pieces, and then it fails exactly as you describe. I'll push the branch even though it's not fully fixed yet!
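
As a back-of-the-envelope illustration of that failure mode (using the numbers from the comment above, not code taken from the MonoTorrent source): with 16kB blocks and a 32kB piece length, an 8-block request spans 4 whole pieces, and if those pieces are handed out in descending order, deriving a single byte range from the first and last block comes out negative:

```python
BLOCK = 16 * 1024
PIECE = 32 * 1024
blocks_per_piece = PIECE // BLOCK            # 2 blocks per piece

# Suppose the picker hands back pieces 7, 6, 5, 4 (out of order).
picked_pieces = [7, 6, 5, 4]
blocks = [(p * PIECE + b * BLOCK, BLOCK)
          for p in picked_pieces
          for b in range(blocks_per_piece)]  # 8 blocks across 4 pieces

start = blocks[0][0]                         # offset of the first block handed out
end = blocks[-1][0] + blocks[-1][1]          # end of the last block handed out
print(end - start)                           # -65536: a negative byte range
```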

@borigas
Contributor Author

borigas commented Jul 31, 2024

It's great that you were able to figure out a consistent repro and fix the root cause. I really appreciate you taking a look! Thanks!

@alanmcgovern
Owner

Thanks for taking the time to dig into the issue and develop a preliminary fix. Unfortunately I went with a slightly different approach.

In short: fetching 16kB at a time can be problematic for web servers, so MonoTorrent attempts to download max(2MB, piece length) bytes per request from webseeds. When creating the request, the piece picking logic selects a sequential run of blocks so that it can be converted into a single HTTP request, thus minimising load on the server.

If the requested byte range covers many small files then one request per file will be needed, but there's no way to optimise around that :)
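
To make the sizing rule above concrete, a tiny sketch (my own illustration, not the actual MonoTorrent implementation) of how the per-request size would be chosen:

```python
TWO_MB = 2 * 1024 * 1024

def webseed_request_bytes(piece_length: int) -> int:
    """Bytes to ask a webseed for in one request: max(2MB, piece length)."""
    return max(TWO_MB, piece_length)

print(webseed_request_bytes(32 * 1024))        # small pieces -> 2 MB per request
print(webseed_request_bytes(8 * 1024 * 1024))  # large pieces -> one whole piece
```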

As such, while your approach is (broadly speaking) the best as it ensures any random assortment of block requests will always be fetched correctly, it does have the downside that bugs (such as the ones I addressed) will result in far more HTTP requests than intended.

What I do intend to do is to rebase your branch on master and extract some of the logic and tweak it so it raises an error when block requests are out of sync, which should make future issues/regressions much easier to diagnose!


Development

Successfully merging this pull request may close these issues.

WebSeed: Pieces get out of order and create an http request with a negative end
