Image Data Downloading and Processing#64

Open
MattsonCam wants to merge 7 commits into main from images_metadata


@MattsonCam
Member

This PR downloads and processes the JUMP image metadata. The processed metadata can then be filtered to select the desired images, which a Python utility downloads from AWS S3.
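As a rough illustration of the filtering step, here is a minimal sketch that assumes the processed metadata has been loaded as a list of dictionaries with one channel URL per row. The column names (plate, well, channel, url), the bucket name, and the helper filter_metadata are hypothetical and are not taken from this PR.

```python
from typing import Iterable


def filter_metadata(rows: Iterable[dict], **allowed_values) -> list[dict]:
    """Keep rows whose value in each named column is one of the allowed values."""
    allowed = {column: set(values) for column, values in allowed_values.items()}
    return [
        row
        for row in rows
        if all(row.get(column) in values for column, values in allowed.items())
    ]


# Hypothetical metadata rows; real JUMP metadata has different columns and URLs.
metadata = [
    {"plate": "P1", "well": "A01", "channel": "DNA", "url": "s3://example-bucket/P1/A01_DNA.tiff"},
    {"plate": "P1", "well": "A01", "channel": "RNA", "url": "s3://example-bucket/P1/A01_RNA.tiff"},
    {"plate": "P1", "well": "A01", "channel": "ER", "url": "s3://example-bucket/P1/A01_ER.tiff"},
    {"plate": "P2", "well": "B02", "channel": "DNA", "url": "s3://example-bucket/P2/B02_DNA.tiff"},
]

# Select only the DNA and RNA images from plate P1.
selected = filter_metadata(metadata, plate=["P1"], channel=["DNA", "RNA"])
urls = [row["url"] for row in selected]
```

The resulting list of URLs is what a downloader would consume. If the metadata is held in a pandas DataFrame instead, boolean indexing with isin would play the same role.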

I have an example of how to do this, but I haven't included it in this PR yet. When you review this PR, please let me know where you think the example would fit best. Similarly, I named the utility 4.download_images_from_metadata.py for now, but I'd appreciate any feedback on where you think it belongs. It will likely be used for both the anomalyze and nuclear speckle prediction projects in the future.

Also, I think I am processing the image metadata correctly, but please let me know if I'm not.

Cameron Mattson added 7 commits March 13, 2026 14:38
Changed to:
- Output the metadata as a parquet file instead of a CSV
- Use one channel URL per row
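To illustrate what "one channel URL per row" could look like, here is a minimal sketch that reshapes wide rows (one URL column per channel) into long rows. The column names (Metadata_Plate, Metadata_Well, URL_DNA, URL_RNA), the bucket name, and the helper to_long_format are assumptions for illustration only; the PR's actual processing may differ.

```python
def to_long_format(wide_rows, channel_columns, id_columns):
    """Turn rows with one URL column per channel into one row per channel URL.

    channel_columns maps a channel name to the wide column that holds its URL.
    id_columns lists the columns copied unchanged onto every output row.
    """
    long_rows = []
    for row in wide_rows:
        for channel, column in channel_columns.items():
            url = row.get(column)
            if not url:
                # Skip channels that have no URL recorded for this image.
                continue
            long_row = {key: row[key] for key in id_columns}
            long_row["channel"] = channel
            long_row["url"] = url
            long_rows.append(long_row)
    return long_rows


# Hypothetical wide-format metadata.
wide = [
    {
        "Metadata_Plate": "P1",
        "Metadata_Well": "A01",
        "URL_DNA": "s3://example-bucket/P1/A01_DNA.tiff",
        "URL_RNA": "s3://example-bucket/P1/A01_RNA.tiff",
    },
    {
        "Metadata_Plate": "P1",
        "Metadata_Well": "A02",
        "URL_DNA": "s3://example-bucket/P1/A02_DNA.tiff",
        "URL_RNA": "",
    },
]

long_rows = to_long_format(
    wide,
    channel_columns={"DNA": "URL_DNA", "RNA": "URL_RNA"},
    id_columns=["Metadata_Plate", "Metadata_Well"],
)
```

With pandas, DataFrame.melt followed by DataFrame.to_parquet would be one way to do the same reshape and write the result as parquet.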
Add a downloader for JUMP pilot TIFF images that builds download jobs from
metadata, validates required columns, and maps S3 URLs to local output paths.
Supports dry-run previews, optional overwrite behavior, parallel or serial
execution, progress logging, and a structured summary of downloaded/skipped/failed files.
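The behavior described in this commit message can be sketched roughly as follows. This is not the code from the PR: the required column names, the DownloadJob class, the function names, the "planned" summary key for dry runs, and the injected fetch callable are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from pathlib import Path
from urllib.parse import urlparse

REQUIRED_COLUMNS = {"channel", "url"}  # hypothetical column names


@dataclass(frozen=True)
class DownloadJob:
    url: str
    dest: Path


def s3_url_to_local_path(url: str, output_dir: Path) -> Path:
    """Map s3://bucket/key to output_dir/bucket/key."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"Not a valid S3 URL: {url!r}")
    return output_dir / parsed.netloc / parsed.path.lstrip("/")


def build_jobs(rows, output_dir: Path) -> list[DownloadJob]:
    """Validate required columns and build one job per metadata row."""
    jobs = []
    for index, row in enumerate(rows):
        missing = REQUIRED_COLUMNS - set(row)
        if missing:
            raise KeyError(f"Row {index} is missing columns: {sorted(missing)}")
        jobs.append(DownloadJob(row["url"], s3_url_to_local_path(row["url"], output_dir)))
    return jobs


def run_jobs(jobs, fetch, *, dry_run=False, overwrite=False, max_workers=1):
    """Run jobs serially or in threads and return a summary of URLs by status.

    fetch(url, dest) performs the actual transfer and is supplied by the caller.
    """
    summary = {"planned": [], "downloaded": [], "skipped": [], "failed": []}

    def handle(job):
        if job.dest.exists() and not overwrite:
            return "skipped", job.url
        if dry_run:
            # Preview only: report the job without touching the filesystem.
            return "planned", job.url
        try:
            job.dest.parent.mkdir(parents=True, exist_ok=True)
            fetch(job.url, job.dest)
        except Exception:
            return "failed", job.url
        return "downloaded", job.url

    if max_workers > 1:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            results = list(pool.map(handle, jobs))
    else:
        results = [handle(job) for job in jobs]

    for status, url in results:
        summary[status].append(url)
    return summary
```

Passing fetch in as a parameter keeps the sketch independent of any particular S3 client. In practice it could wrap a call such as boto3's S3 client download_file, with the bucket and key taken from the parsed URL.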
Changed channel names of files based on the url names, and added the
data
@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.
