@aalhossary
Member

When a file is downloaded, two metadata files are created alongside it: one recording its size and one recording its hash (where possible).
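For illustration only, a minimal sketch of how a size metadata file could be written and checked. The class name, method names, and the `.size` suffix are assumptions made for this example and are not taken from the BioJava code in this PR.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not the BioJava API: all names here are assumptions.
final class SizeSidecarSketch {

    // Assumed naming convention: "<file>.size" stored next to the downloaded file.
    static Path sizeFileFor(Path downloaded) {
        return downloaded.resolveSibling(downloaded.getFileName() + ".size");
    }

    // Record the size the server announced (for example, from Content-Length).
    static void writeExpectedSize(Path downloaded, long expectedBytes) throws IOException {
        Files.write(sizeFileFor(downloaded),
                Long.toString(expectedBytes).getBytes(StandardCharsets.UTF_8));
    }

    // Returns true if the file exists and its length matches the recorded size.
    // If no size file exists there is nothing to compare, so the file is accepted.
    static boolean isSizeValid(Path downloaded) throws IOException {
        if (!Files.isRegularFile(downloaded)) {
            return false;
        }
        Path sizeFile = sizeFileFor(downloaded);
        if (!Files.exists(sizeFile)) {
            return true;
        }
        String recorded = new String(Files.readAllBytes(sizeFile), StandardCharsets.UTF_8).trim();
        return Long.parseLong(recorded) == Files.size(downloaded);
    }
}
```

A caller would write the size file when the download starts and call `isSizeValid` before reusing a cached file; a mismatch suggests an incomplete download.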

I applied this validation framework to the files I know of (structure files, SCOP installations, ECOD installations).
If anybody knows of other places where downloaded files can be validated, please advise.

I have not implemented the hash check yet, because I don't know of any place where a hash file is distributed alongside the resource file. If anybody knows of an example, please advise.

@sbliven proposed downloading to a temporary file and then moving it to the destination file. I found that FileDownloadUtils.downloadFile() already does this, so there is no need to do it ourselves. Again, if anybody knows of code that downloads files without using FileDownloadUtils.downloadFile(), please advise.
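As a rough sketch of the general download-to-temporary-file-then-move pattern, using only `java.nio.file`. This is not the body of FileDownloadUtils.downloadFile(); the class and method names below are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration of the pattern, not BioJava's implementation.
final class TempThenMoveSketch {

    static void download(InputStream source, Path destination) throws IOException {
        // Create the temporary file in the destination directory so the final
        // move stays on the same file system.
        Path dir = destination.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "download", ".tmp");
        try {
            // createTempFile already created the file, so REPLACE_EXISTING is required.
            Files.copy(source, tmp, StandardCopyOption.REPLACE_EXISTING);
            // The destination only appears once the full content has been written.
            Files.move(tmp, destination, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Cleans up after a failed copy; does nothing after a successful move.
            Files.deleteIfExists(tmp);
        }
    }
}
```

With this pattern an interrupted transfer leaves at most a stray temporary file, never a truncated file at the destination path.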

fixes #980.

Currently, we validate the file size only.
We could validate the content using any hashing function later.
Contributor

@josemduarte josemduarte left a comment

Thank you and sorry about the delays. I can't find much time for biojava lately.

I think the size validation is a good idea and general enough. However, the hash validation is not general: it depends on each resource providing a hash, and there is no standard way of doing that at the moment. I'd advise removing the hash validation entirely. There's no point in keeping it around if it won't be applicable.

Typos, wording, and version correction
The expected hashing algorithms are MD5, SHA1, and SHA256
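For reference, a minimal sketch of computing a file digest with the JDK's `java.security.MessageDigest`, whose standard algorithm names for the three algorithms above are `MD5`, `SHA-1`, and `SHA-256`. The class and method names are hypothetical and do not come from this PR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not the BioJava API.
final class FileHashSketch {

    // Streams the file through the digest and returns the lowercase hex string.
    static String hexDigest(Path file, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // Compares against an expected hex digest, ignoring case and surrounding whitespace.
    static boolean matches(Path file, String algorithm, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return hexDigest(file, algorithm).equalsIgnoreCase(expectedHex.trim());
    }
}
```

The expected value would have to come from a hash file published by the resource provider, which is the part that has no standard source today.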
Contributor

@josemduarte josemduarte left a comment

Thanks for the changes, and again apologies for the delays. OK, I think we could keep this as is for the next release. Please note a couple of minor issues with logging and the version number.

Contributor

@josemduarte josemduarte left a comment

LGTM thanks

@aalhossary aalhossary merged commit 1a256ed into biojava:master Jan 26, 2023
@aalhossary aalhossary deleted the add_file_download_validation branch January 26, 2023 22:48
Development

Successfully merging this pull request may close these issues.

Detect incomplete or corrupted downloaded files