NVDump: Because Paying for CVE Data is Just Paying for Someone Else’s curl Command
Have you ever needed a local CVE dataset for a project, a detection pipeline, or just to stop paying some vendor for data that the U.S. government literally gives away at no cost? You open your browser, find the NVD, stare at the API docs for twenty minutes, and close the tab. Maybe you go back to asking IT for database access to some commercial threat intel platform that charges four figures a month to serve you data that NVD publishes openly.
There is a better way. The NVD REST API v2.0 is free, well-documented, and lets you pull the entire CVE catalog going back to 1999. The catch is that it has rate limits, pagination, and some quirks around date windowing that will bite you if you just fire off a naive bulk request. This write-up walks through NVDump, a PowerShell script that handles all of that cleanly, writes yearly cache snapshots to disk, and merges everything into a single local catalog JSON file you can use however you want.
If you work in CTI, this feeds directly into enrichment pipelines: look up a CVE ID and get a description back. If you are on the offensive side, the use case is just as practical. You have a scoped target running a stack of services. You want to know what CVEs have been published against those products without hammering a vendor API mid-engagement or waiting on someone else’s tooling. With a local catalog you query offline, filter descriptions by keyword or product name, and walk into the engagement already knowing what the surface looks like. (The catalog stores descriptions only, so filtering by CVSS score would mean extending the script to keep the metrics too.) A simple Python script looping through cve_catalog.json and matching against your target’s service list is all it takes. NVDump gets you the data. What you do with it is your business.
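The matching loop described above could look something like the following. This is a minimal sketch, not part of NVDump: the function name match_services, the placeholder CVE IDs, and the product names are all made up for illustration, and a plain substring match will produce false positives on short product names.

```python
import json  # used when loading the real file, see the comment below


def match_services(catalog: dict, services: list[str]) -> dict[str, list[str]]:
    """Map each service name to the CVE IDs whose description mentions it."""
    hits = {name: [] for name in services}
    for cve_id, description in catalog["cves"].items():
        text = description.lower()
        for name in services:
            if name.lower() in text:
                hits[name].append(cve_id)
    return hits


# Stand-in for: catalog = json.load(open("cve_catalog.json", encoding="utf-8"))
# The IDs and descriptions below are invented sample data.
sample_catalog = {
    "source": "NVD",
    "cves": {
        "CVE-0000-0001": "A heap buffer overflow in ExampleHTTPd allows remote code execution.",
        "CVE-0000-0002": "An authentication bypass in SampleFTP allows unauthenticated access.",
    },
}

result = match_services(sample_catalog, ["examplehttpd", "nginx"])
print(result)
```

Swap the in-memory sample for the real file and the service list for whatever your scoping document says is running on the target.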
The full script is on GitHub at wong-hau-pepelu/NVDump. Clone it, schedule it, and you’re done.
Why PowerShell
Analysts live in Windows environments. PowerShell is already there, it handles JSON natively, it has decent HTTP support via Invoke-RestMethod, and you don’t need to install anything. If you want to port this to Python later, the logic translates directly. But for a script you want to hand to another analyst and have them run without setting up a virtual environment, PowerShell wins.
If you’re on a Mac, PowerShell is available there too via Homebrew (brew install powershell).
The Problem with Naive Bulk Pulls
The NVD API has a hard limit on how many results it will return per request (2,000 max), it caps publication-date ranges at 120 consecutive days per query, and it rate-limits aggressively: around 50 requests per rolling 30-second window with an API key. If you try to pull everything in one shot, a few things happen: you hit pagination and only get the first 2,000 records, you get 429’d and your script dies, or your giant query window is rejected or times out.
The solution is to split each year into bounded date windows, paginate each window to exhaustion, sleep between requests, and merge results as you go. That’s the core of NVDump.
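To make "paginate each window to exhaustion" concrete, here is a small Python sketch run against a stub instead of the live API. The stub fake_fetch and the function name paginate are hypothetical; the field names totalResults, resultsPerPage, and vulnerabilities are the ones the script reads from the NVD response. A real loop would also sleep between requests.

```python
def paginate(fetch_page, results_per_page=2000):
    """Request pages until startIndex reaches totalResults."""
    start_index = 0
    items = []
    request_count = 0
    while True:
        page = fetch_page(start_index, results_per_page)
        request_count += 1
        items.extend(page["vulnerabilities"])
        returned = page["resultsPerPage"]
        start_index += returned
        # Stop when everything is fetched, or when a page comes back empty
        # (guards against looping forever on an unexpected response).
        if returned == 0 or start_index >= page["totalResults"]:
            break
    return items, request_count


def fake_fetch(start_index, results_per_page, total=4500):
    """Stub standing in for one HTTP request to the API."""
    count = max(0, min(results_per_page, total - start_index))
    return {
        "totalResults": total,
        "resultsPerPage": count,
        "vulnerabilities": list(range(start_index, start_index + count)),
    }


records, requests_made = paginate(fake_fetch)
print(len(records), requests_made)
```

With 4,500 results and a page size of 2,000, the stub is called three times: two full pages and one partial page.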
Getting an API Key
Technically optional, but practically required. Without a key you get throttled to about 5 requests per 30 seconds. With a key you get 50. Register at https://nvd.nist.gov/developers/request-an-api-key. It takes a few minutes and it’s free.
The Script
param(
    [int]$StartYear = 2024,
    [int]$EndYear = 2024,
    [int]$DaysPerWindow = 120,
    [int]$ResultsPerPage = 2000,
    [int]$SleepSeconds = 2,
    [int]$MaxWindows = 0,
    [switch]$OverwriteExisting
)
The parameters let you control the year range, how wide each date window is, how many results to request per page, and how long to sleep between requests. MaxWindows is useful for test runs where you just want to verify the script works before letting it rip on a full decade of CVEs.
OverwriteExisting is important: by default the script preserves any description you’ve manually curated in the catalog. Pass this flag if you want NVD descriptions to overwrite your local edits.
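For a feel of how the sleep setting relates to the rate limits quoted earlier, here is a bit of back-of-the-envelope arithmetic in Python. It is a rough sketch that ignores request latency (which only adds slack), and the helper name min_sleep_seconds is made up for illustration.

```python
def min_sleep_seconds(requests_per_window: int, window_seconds: float = 30.0) -> float:
    """Smallest fixed pause between requests that stays inside the budget."""
    return window_seconds / requests_per_window


with_key = min_sleep_seconds(50)     # 50 requests per 30 seconds
without_key = min_sleep_seconds(5)   # 5 requests per 30 seconds

print(with_key, without_key)
```

That works out to 0.6 seconds with a key and 6 seconds without. The default SleepSeconds of 2 is comfortable with a key, but without one it would allow roughly 15 requests per 30 seconds, three times the unauthenticated budget, so keyless runs would need something like -SleepSeconds 6 or more.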
Date Windowing
function Get-YearWindows([int]$year, [int]$daysPerWindow) {
    $windows = @()
    $cursor = [DateTimeOffset]::ParseExact(
        ("{0}-01-01T00:00:00.000Z" -f $year), 'yyyy-MM-ddTHH:mm:ss.fffZ',
        [System.Globalization.CultureInfo]::InvariantCulture)
    $yearEnd = [DateTimeOffset]::ParseExact(
        ("{0}-12-31T23:59:59.999Z" -f $year), 'yyyy-MM-ddTHH:mm:ss.fffZ',
        [System.Globalization.CultureInfo]::InvariantCulture)
    while ($cursor -le $yearEnd) {
        $windowEnd = $cursor.AddDays($daysPerWindow).AddMilliseconds(-1)
        if ($windowEnd -gt $yearEnd) { $windowEnd = $yearEnd }
        $windows += [pscustomobject]@{ Start = $cursor; End = $windowEnd }
        $cursor = $windowEnd.AddMilliseconds(1)
    }
    return $windows
}
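This slices a full year into non-overlapping windows of at most 120 days (the DaysPerWindow default) with millisecond-precise boundaries: each window ends one millisecond before the next begins, so no CVE is fetched twice and none falls through a gap. The last window of the year is simply shorter. Parsing the boundaries as explicit UTC timestamps with InvariantCulture is there to keep your local timezone and regional settings from quietly corrupting the timestamps, which is the kind of bug that takes two hours to find at 11pm.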
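If you want to sanity-check the windowing without touching PowerShell, the same logic ports to Python in a few lines. This is a sketch of a port, not code from NVDump; the function name year_windows is an assumption.

```python
from datetime import datetime, timedelta, timezone

ONE_MS = timedelta(milliseconds=1)


def year_windows(year: int, days_per_window: int = 120):
    """Split a calendar year into contiguous, non-overlapping UTC windows."""
    cursor = datetime(year, 1, 1, tzinfo=timezone.utc)
    year_end = datetime(year, 12, 31, 23, 59, 59, 999000, tzinfo=timezone.utc)
    windows = []
    while cursor <= year_end:
        # Each window ends 1 ms before the next one starts, clamped to year end.
        window_end = min(cursor + timedelta(days=days_per_window) - ONE_MS, year_end)
        windows.append((cursor, window_end))
        cursor = window_end + ONE_MS
    return windows


windows = year_windows(2024)
for start, end in windows:
    print(start.isoformat(timespec="milliseconds"), "->", end.isoformat(timespec="milliseconds"))
```

For 2024 this yields four windows: three full 120-day windows starting January 1, April 30, and August 28, plus a short tail from December 26 to December 31.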
Calling the API with Retry Logic
function Invoke-NvdWindow {
    param(
        [DateTimeOffset]$WindowStart,
        [DateTimeOffset]$WindowEnd,
        [int]$ResultsPerPage,
        [int]$SleepSeconds,
        [string]$ApiKey
    )
    # $apiBase, Get-Iso8601, and Get-NvdDescription are not shown in this excerpt
    $headers = @{ apiKey = $ApiKey }
    $startIndex = 0
    $records = @{}
    $requestCount = 0
    $maxRetries = 5
    do {
        $params = @{
            pubStartDate = Get-Iso8601 $WindowStart
            pubEndDate = Get-Iso8601 $WindowEnd
            noRejected = $null
            resultsPerPage = $ResultsPerPage
            startIndex = $startIndex
        }
        # build query string manually so noRejected renders as a bare flag
        $queryParts = @()
        foreach ($entry in $params.GetEnumerator()) {
            if ($null -eq $entry.Value) {
                $queryParts += [System.Uri]::EscapeDataString($entry.Key)
            } else {
                $queryParts += ('{0}={1}' -f
                    [System.Uri]::EscapeDataString($entry.Key),
                    [System.Uri]::EscapeDataString([string]$entry.Value))
            }
        }
        $uri = $apiBase + '?' + ($queryParts -join '&')
        $response = $null
        $attempt = 0
        # retry with capped exponential backoff; the final failure is rethrown
        while ($attempt -lt $maxRetries) {
            $attempt += 1
            try {
                $response = Invoke-RestMethod -Method Get -Uri $uri `
                    -Headers $headers -TimeoutSec 60
                $requestCount += 1
                break
            } catch {
                if ($attempt -ge $maxRetries) { throw }
                $delay = [Math]::Min(30, [Math]::Pow(2, $attempt) + $SleepSeconds)
                Write-Warning ("Attempt {0}/{1} failed: {2}. Retrying in {3}s..." `
                    -f $attempt, $maxRetries, $_.Exception.Message, $delay)
                Start-Sleep -Seconds $delay
            }
        }
        # keep only the fields the catalog needs, keyed by CVE ID
        foreach ($item in @($response.vulnerabilities)) {
            if (-not $item.cve) { continue }
            $id = [string]$item.cve.id
            if (-not $id) { continue }
            $records[$id] = [ordered]@{
                description = Get-NvdDescription $item.cve
                published = [string]$item.cve.published
                lastModified = [string]$item.cve.lastModified
                source = 'NVD API'
            }
        }
        # advance by however many records this page actually returned
        $totalResults = [int]$response.totalResults
        $startIndex += [int]$response.resultsPerPage
        Start-Sleep -Seconds $SleepSeconds
    } while ($startIndex -lt $totalResults)
    return [pscustomobject]@{ Records = $records; RequestCount = $requestCount }
}
A few things worth noting here. The noRejected parameter is a bare flag in the NVD API, meaning it takes no value in the query string: just the key name alone, no equals sign, no argument. Helpers that build query strings from key-value pairs tend to handle that badly, typically rendering it as noRejected= or dropping the key altogether. The manual query string building handles it correctly by checking for null values and rendering the key without an equals sign.
What the flag actually does is straightforward. We are only interested in published CVEs. Rejected CVEs are rejected for a reason (duplicates, invalid submissions, entries withdrawn by the original submitter). They carry no CVSS score, no patch reference, no operational value. Threat actors don’t tap into them, vendors don’t patch against them, and they have no place in a catalog built for intelligence or offensive research. noRejected keeps the noise out.
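The bare-flag trick is not PowerShell-specific. If you port the script, the same null-check approach could be sketched in Python like this, using only urllib.parse.quote from the standard library. The function name build_query is an assumption for illustration.

```python
from urllib.parse import quote


def build_query(params: dict) -> str:
    """Build a query string where a None value renders as a bare key."""
    parts = []
    for key, value in params.items():
        if value is None:
            # Bare flag: key only, no equals sign.
            parts.append(quote(key, safe=""))
        else:
            parts.append(f"{quote(key, safe='')}={quote(str(value), safe='')}")
    return "&".join(parts)


query = build_query({
    "pubStartDate": "2024-01-01T00:00:00.000Z",
    "noRejected": None,
    "resultsPerPage": 2000,
    "startIndex": 0,
})
print(query)
# → pubStartDate=2024-01-01T00%3A00%3A00.000Z&noRejected&resultsPerPage=2000&startIndex=0
```

Note that noRejected appears with no equals sign, while the colons in the timestamp are percent-encoded just as EscapeDataString does in the PowerShell version.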
Also, the retry logic uses exponential backoff capped at 30 seconds. If NVD is having a mood and rate-limiting you, this keeps the script alive instead of dying on request 47 of 200!
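To see what that backoff formula actually produces, here is the delay calculation from the catch block redone in Python with the script's defaults (five attempts, SleepSeconds of 2). The function name retry_delay is made up; the arithmetic mirrors the Min(30, 2^attempt + SleepSeconds) expression in the script.

```python
def retry_delay(attempt: int, sleep_seconds: int = 2, cap: int = 30) -> int:
    """Delay in seconds after a failed attempt (attempts are 1-based)."""
    return min(cap, 2 ** attempt + sleep_seconds)


max_retries = 5
# Only attempts 1..4 sleep; a failure on the 5th attempt is rethrown instead.
delays = [retry_delay(attempt) for attempt in range(1, max_retries)]
print(delays, sum(delays))
# → [4, 6, 10, 18] 38
```

With the defaults, a request that keeps failing waits 38 seconds in total before the script gives up, and the 30-second cap never actually kicks in. It only matters if you raise the retry count or the sleep value.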
The Catalog Structure
The output catalog looks like this:
{
"source": "NVD",
"updated_utc": "2026-04-03T12:00:00Z",
"cves": {
"CVE-2024-1234": "A heap buffer overflow in...",
"CVE-2024-5678": "An authentication bypass in..."
}
}
Flat, simple, readable. The cves object is keyed by CVE ID so lookups are O(1). You can load this into Python with json.load() in two lines and query it by CVE ID directly. No database required. No vendor API key required. No monthly invoice.
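As a quick illustration of the lookup, here is a self-contained Python sketch. It writes a small sample catalog to a temporary file so it runs anywhere; in practice you would point it at your real cve_catalog.json. The sample entries are the placeholders from the structure shown above.

```python
import json
import os
import tempfile

sample = {
    "source": "NVD",
    "updated_utc": "2026-04-03T12:00:00Z",
    "cves": {
        "CVE-2024-1234": "A heap buffer overflow in...",
        "CVE-2024-5678": "An authentication bypass in...",
    },
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cve_catalog.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(sample, handle)

    # The actual "two lines": open the file and load it.
    with open(path, encoding="utf-8") as handle:
        catalog = json.load(handle)

# Dictionary lookups by CVE ID; .get() avoids a KeyError on unknown IDs.
description = catalog["cves"].get("CVE-2024-1234")
missing = catalog["cves"].get("CVE-2024-9999", "not in local catalog")
print(description)
print(missing)
```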
Writing the Catalog
foreach ($cve in $yearMap.Keys) {
    $desc = [string]$yearMap[$cve].description
    if (-not $desc) { continue }
    if ($catalog.cves.Contains($cve)) {
        if ($OverwriteExisting) {
            $catalog.cves[$cve] = $desc
            $totalUpdated += 1
        } else {
            $totalSkipped += 1
        }
    } else {
        $catalog.cves[$cve] = $desc
        $totalAdded += 1
    }
}
New CVEs get added. Existing ones get preserved unless you explicitly pass -OverwriteExisting. This means you can manually curate descriptions for CVEs where NVD’s text is unhelpful (“An issue was discovered in…”) without losing your edits every time you sync.
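The add, update, or skip behaviour is easy to check in isolation. Below is a Python rendering of the same merge rules, offered as a sketch rather than a port of the actual script; the function name merge_year and the sample entries are hypothetical.

```python
def merge_year(catalog_cves: dict, year_map: dict, overwrite_existing: bool = False):
    """Merge one year's records into the catalog; returns (added, updated, skipped)."""
    added = updated = skipped = 0
    for cve_id, record in year_map.items():
        desc = record.get("description") or ""
        if not desc:
            continue  # nothing useful to store
        if cve_id in catalog_cves:
            if overwrite_existing:
                catalog_cves[cve_id] = desc
                updated += 1
            else:
                skipped += 1  # keep the locally curated text
        else:
            catalog_cves[cve_id] = desc
            added += 1
    return added, updated, skipped


catalog_cves = {"CVE-0000-0001": "Hand-curated description"}
year_map = {
    "CVE-0000-0001": {"description": "An issue was discovered in..."},
    "CVE-0000-0002": {"description": "An authentication bypass in..."},
    "CVE-0000-0003": {"description": ""},
}

counts = merge_year(catalog_cves, year_map)
print(counts, catalog_cves["CVE-0000-0001"])
```

Run without the overwrite flag, the curated entry survives and only the new CVE is added. Pass overwrite_existing=True and the NVD text replaces it.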
Scheduling It
The NVD publishes new CVEs continuously. To keep your catalog current, schedule NVDump to run daily via Windows Task Scheduler:
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
-Argument '-NonInteractive -File "C:\path\to\NVDump.ps1" -StartYear 2026 -EndYear 2026'
$trigger = New-ScheduledTaskTrigger -Daily -At 2am
Register-ScheduledTask -TaskName 'NVDump' -Action $action -Trigger $trigger
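Set the year range to the current year only for daily runs, and remember that the year in the task arguments above is hard-coded, so it needs updating each January. Run a historical backfill manually once to seed your catalog going back as far as you need.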
Running It
Pull the last two years and seed your catalog:
.\NVDump.ps1 -StartYear 2023 -EndYear 2024
Test it on a small window first:
.\NVDump.ps1 -StartYear 2024 -EndYear 2024 -MaxWindows 2
Full historical backfill going back to 1999 (grab a coffee, this takes a while):
.\NVDump.ps1 -StartYear 1999 -EndYear 2024
What You End Up With
A cve_catalog.json file sitting on your disk with clean CVE descriptions keyed by ID, year-by-year cache snapshots in nvd_cache/ so you don’t have to re-pull data you already have, and a script you can hand to any analyst on your team without explaining Python environments or API wrappers.
You own the data. It updates on your schedule. And it cost you nothing except the time it took to read this post.
The vendor selling you CVE database access is not doing anything you cannot do yourself on a Tuesday afternoon.