Noticing Changes in NASA NODIS

Comparison of NASA Policy Documents Across Two Time Points

2025-01-20 to 2025-03-07

What is NASA's NODIS site?

NODIS is the publicly accessible "NASA Online Document Information System". It holds the directives that tell NASA staff what the policies are for a wide variety of situations across legal, human resources, property management, procurement, financial management, etc. It is common for these documents to have "COMPLIANCE IS MANDATORY FOR NASA EMPLOYEES" printed across the top. While it is great for government transparency that these are public, it can be hard to notice changes. Revisions sometimes get noted in the PDFs themselves, but there isn't a great way to see from the master list which directives have recently been changed or deleted. This page is an attempt to measure those normally hard-to-detect changes.

Why this time period?

At the start of the second Trump administration in 2025, NODIS went offline entirely for many days, which was unusual. I was curious what changes might have occurred across that gap, but didn't want to read and manually compare 260+ PDFs. No one does. Instead, I wrote some Python code to collect and analyze the PDFs available at two different points in time. I figured it would be a good excuse to learn about the Internet Archive and build some skills in accessing websites programmatically and identifying changes in PDFs.

Methods

For the 2025-03-07 time point, PDFs were downloaded programmatically from the links to directives listed on the NODIS directive master list webpage.

For the earlier 2025-01-20 time point, the Internet Archive's Wayback Machine was used to find a snapshot of the NODIS master list of directives as close to 2025-01-20 as possible. From that master list, links were followed programmatically until they landed on an actual PDF of the directive, and then the available snapshot of that page closest to 2025-01-20 without going over was downloaded. One complication was that many links in the master list were not direct links to the directive PDF. Frequently a link resolved to another link, which showed a table of contents for the directive in HTML form, which in turn required clicking one or two more links with differing naming schemes before landing on the PDF itself. Selenium was used to navigate these sequences of links programmatically and reach the final download URL for the full directive PDF.

For directives that exist in the master lists of both time points, pdf2image was used to create images of both PDFs, and diff-pdf was then used to determine whether there were any changes and to output a PNG with the changes marked in red. Those marked-up images appear below each directive with a detected change when the green arrow on the directive's bar is clicked.
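The "closest snapshot without going over" rule can be sketched roughly as follows. This is a minimal illustration, not the code used for this page: it assumes a list of 14-digit Wayback Machine capture timestamps has already been retrieved somehow, and the function names, example timestamps, and example URL are all hypothetical. The only external fact it relies on is the usual Wayback snapshot URL pattern of prefix, timestamp, then original URL.

```python
from datetime import datetime

# Standard Wayback Machine snapshot URL prefix.
WAYBACK_PREFIX = "https://web.archive.org/web"
# Wayback capture timestamps look like YYYYMMDDhhmmss.
TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def closest_snapshot_not_after(timestamps, target):
    """Return the latest capture timestamp that is <= target, or None.

    Both `timestamps` and `target` are 14-digit strings. Hypothetical
    helper: the list of captures is assumed to be fetched elsewhere.
    """
    target_dt = datetime.strptime(target, TIMESTAMP_FORMAT)
    candidates = [
        ts for ts in timestamps
        if datetime.strptime(ts, TIMESTAMP_FORMAT) <= target_dt
    ]
    # Fixed-width digit strings sort chronologically, so max() is the latest.
    return max(candidates, default=None)


def snapshot_url(timestamp, original_url):
    """Build the archived-page URL for one capture of original_url."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


# Made-up capture times for illustration only.
captures = ["20250110083000", "20250119221500", "20250122040000"]
chosen = closest_snapshot_not_after(captures, "20250120235959")
```

Here `chosen` is the 2025-01-19 capture: the 2025-01-22 capture is closer in absolute time to some targets, but it falls after the cutoff and so is never eligible. Returning `None` when nothing qualifies leaves it to the caller to record the directive as unavailable rather than silently picking a later snapshot.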

Limitations

This page contains only a quick, experimental analysis with no promise of accuracy! For official current status, always refer to the official NASA NODIS website. Note that some of the directives in the category "exists in time 1 but pdf not available" can still be downloaded manually even if they were not successfully downloaded and analyzed programmatically. For example, some can be manually downloaded as Microsoft Word documents but not as PDFs. Clicking on the green arrow near the name of a directive will bring up metadata generated during the programmatic analysis, including links to both the live version on NODIS and the historic snapshot from the Internet Archive.

Quick Conclusions

13 policy directives were deleted entirely, many of them focused on preventing or responding to discrimination. Two of the deleted directives were replaced by directives with a different name in what is basically a version bump. Changes could be detected programmatically, and 54 directive PDFs had changes that were "large", meaning the changes were not confined to the top part of the PDF, where a change suggests merely a new applicable date range. Figuring out whether those changes are semantically large still requires reading them. However, the side-by-side image comparisons of the PDFs, with red marks where changes occurred, speed up that process.
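The distinction between "large" and "small" changes could be expressed along these lines. This is only a sketch under stated assumptions: the input format (changed pixel rows per page), the function name, and especially the `header_fraction` cutoff of 0.25 are hypothetical, since the exact threshold used for "the top part of the PDF" is not given here.

```python
def classify_change(changed_rows_by_page, page_height, header_fraction=0.25):
    """Label a directive "no change", "changed small", or "changed large".

    changed_rows_by_page: dict mapping a 0-based page index to a list of
        pixel row indices where the two PDF renderings differ.
    page_height: height of a rendered page in pixels.
    header_fraction: ASSUMED share of the first page treated as the
        "top part"; the real cutoff is not documented in this write-up.
    """
    if not any(changed_rows_by_page.values()):
        return "no change"

    header_limit = page_height * header_fraction
    for page, rows in changed_rows_by_page.items():
        for row in rows:
            # Any difference outside the top of the first page counts as large.
            if page > 0 or row >= header_limit:
                return "changed large"
    return "changed small"
```

For example, differences only at rows 10 and 40 of the first page of a 1000-pixel-tall rendering would be labeled "changed small", while a single differing row on any later page would make the directive "changed large". A label of "changed large" says nothing about meaning; it only says the differences are not limited to the header area.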

Summary Counts

Directives organized by what happened to them between time points

Below are sections for each of the labels used to categorize the directive PDFs: deleted, added, changed large, changed small, changed, no change, exists in time 1 but pdf not available, exists in master list time 1, exists in master list time 2. Note that a directive can appear in more than one category. For example, a directive might appear in "exists in master list time 1", "exists in master list time 2", "changed", and "changed small".
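As a rough illustration of how one directive can carry several labels at once, the list-membership labels might be derived like this. It is a sketch, not the actual analysis code: it assumes directives are matched between the two master lists by name, and the function name and directive names are placeholders.

```python
def membership_labels(directive, time1_names, time2_names):
    """Return the master-list labels that apply to one directive.

    Assumes directives are matched across time points by name, so a
    renamed directive shows up as one "deleted" plus one "added".
    """
    in_time1 = directive in time1_names
    in_time2 = directive in time2_names

    labels = []
    if in_time1:
        labels.append("exists in master list time 1")
    if in_time2:
        labels.append("exists in master list time 2")
    if in_time1 and not in_time2:
        labels.append("deleted")
    if in_time2 and not in_time1:
        labels.append("added")
    return labels


# Placeholder directive names, not real NODIS entries.
time1 = {"DIRECTIVE-A", "DIRECTIVE-B"}
time2 = {"DIRECTIVE-B", "DIRECTIVE-C"}
```

With these placeholder lists, DIRECTIVE-A gets two labels (time 1 and deleted), DIRECTIVE-B gets two (time 1 and time 2), and DIRECTIVE-C gets two (time 2 and added). The change labels would then be added on top for directives present in both lists.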