feat(nvd): use go to upload NVD conversion to gcs upon conversion #5099

Open
jess-lowe wants to merge 17 commits into google:master from jess-lowe:refactor/nvd-use-gcs

@jess-lowe
Contributor

This PR adds support for uploading converted records to GCS immediately, instead of saving them locally and syncing afterwards. It uses the helper functions defined in https://github.com/google/osv.dev/pull/4984/changes.

The decision about whether and when data is saved, uploaded, or downloaded is now made at a much higher level, which lets the CVEToOSV function focus solely on converting the record and returning the vulnerability.

jess-lowe requested review from another-rex and michaelkedar and removed request for another-rex on March 20, 2026 03:09

Contributor

We should keep the gcs-tools repo generic, covering only uploads to GCS, and not put CVE-specific logic in here.

Contributor

If uploading to GCS is going to take a while, I would even put the multithreading / concurrency logic in here.
E.g. provide a function that spins up X workers, and a "gcs client" that just contains a channel.

Callers can then pass the client into their own code to upload.

Probably for a separate PR though.

Contributor Author

> We should keep the gcs-tools repo generic, covering only uploads to GCS, and not put CVE-specific logic in here.

Moved these into their own package under conversion/writer.

Contributor Author

> If uploading to GCS is going to take a while, I would even put the multithreading / concurrency logic in here. E.g. provide a function that spins up X workers, and a "gcs client" that just contains a channel.
>
> Callers can then pass the client into their own code to upload.
>
> Probably for a separate PR though.

Uploading vulnerability records is too nuanced for a generic helper, so it has its own implementation in writer.VulnWorker. For the NVD data, though, the upload happens in the same thread that converts the record.

jess-lowe requested a review from another-rex on March 20, 2026 05:44
2 participants