PDF A11y Auditor

Automated Accessibility Checks for Downloadable PDFs

Learn about the technical architecture, features, and the motivation behind the PDF A11y Auditor tool.

Development and License

Developed by Dr. Harald Hutter.
License: MIT License.
https://a11y-pdf-audit.fly.dev/

Open Source:

View the source code, contribute, or report issues in the GitHub repository.

Purpose and Idea

The a11y PDF Audit is a modular web application that checks websites for accessible PDF files. It crawls a given URL, downloads the PDFs it discovers, validates them with VeraPDF, and generates structured HTML and PDF reports.
VeraPDF is a purpose-built, open source, file-format validator covering all PDF/A and PDF/UA parts and conformance levels.
๐Ÿ” VeraPDF - Industry Supported PDF/A Validation

German Federal Monitoring Agency for Accessibility in Information Technology

The Federal Monitoring Agency for Accessibility of Information Technology (BFIT-Bund) began its work in autumn 2019. It was established on the basis of Section 13(3) of the Disability Equality Act (BGG). As the federal monitoring body, BFIT-Bund performs the tasks assigned to Germany by the European Union (EU) directive on the monitoring, review and reporting of digital services provided by public bodies (Article 8 of Directive (EU) 2016/2102).

Many people do not know that PDFs have to be barrier-free. Misunderstandings persist: some people argue, for example, that a PDF is not a website. The requirement is clear, however: PDFs must be just as accessible as the pages that link to them. I would like to make that clear.

Main Features (Some are New in Version 1.2.0)

Limitations and Issues

VeraPDF vs. axesCheck (PAC): There is a known discrepancy between VeraPDF (used by this tool) and axesCheck/PAC regarding ISO 14289-1:2014 (PDF/UA-1), specifically rule 7.5 (Tables).

Solution: The "ScreenReadable" Profile

To bridge the gap between strict ISO validators and real-world screen reader behavior (as seen with JAWS, NVDA, or axesCheck), this tool runs a dual audit. It first tests each PDF against the strict PDF/UA-1 standard and then against a custom ScreenReadable profile, which ignores visual font metrics and strict matrix checks.
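
The relationship between the two verdicts can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the tool runs a second validation with a custom profile, whereas this sketch simply post-filters a list of failed rule identifiers from one strict run. The rule identifiers in `EXCLUDED_RULES` are placeholders, not the real excluded rules, and `dual_audit` and `DualAuditResult` are hypothetical names.

```python
from dataclasses import dataclass

# Placeholder identifiers for illustration only; the real ScreenReadable
# profile defines its own list of excluded rules.
EXCLUDED_RULES = frozenset({"font-metrics-example", "table-matrix-example"})


@dataclass(frozen=True)
class DualAuditResult:
    strict_compliant: bool   # passes the strict PDF/UA-1 check
    screen_readable: bool    # passes once the excluded rules are ignored
    relaxed_failures: tuple  # failures that remain under the relaxed profile


def dual_audit(failed_rules, excluded=EXCLUDED_RULES):
    """Derive both verdicts from the rule IDs that failed the strict check."""
    failed = list(failed_rules)
    remaining = tuple(rule for rule in failed if rule not in excluded)
    return DualAuditResult(
        strict_compliant=not failed,
        screen_readable=not remaining,
        relaxed_failures=remaining,
    )
```

A document that fails only excluded rules is reported as not strictly compliant but still screen-readable, which is the case the second profile is meant to surface.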

Performance & Infrastructure

This instance is hosted on a high-efficiency containerized environment. To provide advanced AI features on minimal hardware, we use a Smart-Resource-Architecture:

Note: AI reconstruction of large PDFs (50+ pages) may take several minutes. Our server is configured with a 1000-second timeout so that even complex documents can finish processing.

For heavy enterprise use, you can deploy your own instance using our Docker image.

Quality and Testing

Tool | Purpose | Status / Result
✅ flake8 | Formatting & Style Checking | No critical issues found.
⭐ pylint | Code Quality / Docstrings Review | Score: > 9.63 / 10 points.
🔒 bandit | Security Analysis | No high severity findings.
🌿 radon cc | Cyclomatic Complexity Tests | Mainly A-level functions.
🚀 PageSpeed Insights | Provides suggestions on how the page may be improved | Performance (LCP=0.2s) 100, Accessibility 100, Best Practices 100, SEO 100
🔍 Screaming Frog SEO Spider | Used for crawling up to 500 URLs at a time | All reported issues solved.