How-To Guides | March 23, 2026 | Leo

How ScamVerify Document Analysis Catches Fake PDFs

What ScamVerify Document Analysis Does

ScamVerify™ document analysis is an AI-powered tool that examines uploaded PDFs, images, and documents for signs of fraud, manipulation, and connections to known scam operations. The system extracts text content, identifies embedded entities (phone numbers, URLs, email addresses), and checks each entity against 8 million+ threat records spanning FTC complaints, URLhaus malicious domains, ThreatFox indicators of compromise, and community reports.

Upload any document at scamverify.ai/document-checker. The analysis runs in seconds and delivers a plain-English risk assessment.

How It Works: The Analysis Pipeline

Step 1: Document Ingestion

Upload a PDF, image (JPG, PNG), or other document file. The system accepts any kind of document you want to verify: invoices, contracts, offer letters, insurance documents, government correspondence, and more.

For image-based documents (scanned PDFs, photographs of documents), the system uses optical character recognition (OCR) powered by GPT-4o to extract text from the image. This means even photographed documents, scanned paper records, and image-only PDFs can be analyzed.
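As a rough illustration of the ingestion step, the sketch below identifies an uploaded file by its leading bytes rather than by its extension. The function names `sniff_document_type` and `always_needs_ocr` are hypothetical and are not part of ScamVerify. The byte signatures themselves are the standard ones for PDF, PNG, and JPEG.

```python
from typing import Optional

# Standard leading "magic bytes" for the formats the article names.
SIGNATURES = {
    b"%PDF-": "pdf",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def sniff_document_type(data: bytes) -> Optional[str]:
    """Return 'pdf', 'png', or 'jpeg' based on the file header, else None."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return None


def always_needs_ocr(kind: Optional[str]) -> bool:
    """Images have no text layer, so they always go through OCR.

    A PDF may or may not contain a text layer (image-only PDFs do not),
    so this helper deliberately answers False for 'pdf' and leaves that
    decision to a later check.
    """
    return kind in {"png", "jpeg"}
```

Checking the header instead of the filename matters here because a scam attachment can carry a misleading extension.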

Step 2: Text Extraction

The AI extracts all readable text from the document, including:

  • Body text from paragraphs, clauses, and descriptions
  • Headers and footers including letterhead information
  • Tables with structured data
  • Fine print and footnotes
  • Watermarks and stamps (when text-based)

The extracted text forms the foundation for all subsequent analysis. Extraction aims to capture every readable word in the document, including fine print that is easy to skim past.

Step 3: Entity Identification

The AI identifies specific entities embedded in the document:

| Entity Type | What Is Extracted | What Is Checked |
|---|---|---|
| Phone numbers | All formats (with/without country code, dashes, dots, parentheses) | FTC complaint database (2.9M+ phone summaries) |
| URLs and domains | Links, website references, QR code destinations | URLhaus (74,032 domains), ThreatFox (60,758 IOCs) |
| Email addresses | All email addresses in the document | Domain reputation, known fraud domains |
| Company names | Identified organizations | Cross-reference with reported entities |
| Financial details | Bank routing numbers, account numbers, payment instructions | Pattern analysis for known fraud schemes |
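To make entity identification concrete, here is a minimal sketch that pulls phone numbers, URLs, and email addresses out of extracted text with regular expressions. It is an assumption-laden stand-in: ScamVerify describes its extraction as AI-based, and the patterns, the North American phone format, and the names `extract_entities` and `normalize_phone` are all hypothetical.

```python
import re

# Illustrative patterns only; real-world extraction needs far more cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s<>()]+", re.IGNORECASE)
# North American numbers with optional +1, parentheses, dashes, dots, spaces.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\+?1[\s.-]?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}(?!\d)"
)


def normalize_phone(raw):
    """Reduce any matched format to a bare 10-digit string."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop the country code
    return digits


def extract_entities(text):
    """Return de-duplicated, sorted phones, URLs, and emails found in text."""
    return {
        "phones": sorted({normalize_phone(m) for m in PHONE_RE.findall(text)}),
        "urls": sorted({u.rstrip(".,;") for u in URL_RE.findall(text)}),
        "emails": sorted({e.lower() for e in EMAIL_RE.findall(text)}),
    }
```

Normalizing before lookup is the important design choice: "(800) 555-0199" and "+1 800.555.0199" should resolve to the same database key.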

Step 4: Threat Database Cross-Reference

Each extracted entity is checked against ScamVerify's threat intelligence databases:

FTC Phone Complaints. Phone numbers found in the document are checked against 2.9 million+ FTC phone complaint summaries. If a phone number in an invoice has hundreds of "Do Not Call" complaints, that is a significant red flag.

URLhaus Malicious Domains. URLs and domains in the document are checked against 74,032 known malicious domains tracked for malware distribution. A document linking to a URLhaus-flagged domain indicates a clear threat.

ThreatFox IOCs. Domains and URLs are also checked against 60,758 indicators of compromise from ThreatFox, which tracks malware command-and-control infrastructure, credential harvesting endpoints, and botnet infrastructure.

Community Reports. Entities are checked against ScamVerify's community-reported data, capturing threats that have been flagged by other users but may not yet appear in federal databases.
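The cross-reference step can be pictured as a set of lookups keyed by normalized entity. The sketch below uses tiny in-memory tables as hypothetical stand-ins for the FTC, URLhaus, and ThreatFox data; the table contents, the function names, and the result shape are assumptions for illustration, not ScamVerify's actual storage or API.

```python
from urllib.parse import urlparse

# Hypothetical sample data standing in for the real threat databases.
FTC_COMPLAINT_COUNTS = {"8005550199": 312}
URLHAUS_DOMAINS = {"malware-example.test"}
THREATFOX_DOMAINS = {"c2-example.test"}


def domain_of(url):
    """Extract the lowercase hostname from a URL ('' if there is none)."""
    return (urlparse(url).hostname or "").lower()


def cross_reference(phones, urls):
    """Return one match record per entity found in any threat table."""
    matches = []
    for phone in phones:
        count = FTC_COMPLAINT_COUNTS.get(phone, 0)
        if count:
            matches.append(
                {"entity": phone, "source": "FTC", "detail": f"{count} complaints"}
            )
    domain_tables = (("URLhaus", URLHAUS_DOMAINS), ("ThreatFox", THREATFOX_DOMAINS))
    for url in urls:
        domain = domain_of(url)
        for source, table in domain_tables:
            if domain in table:
                matches.append({"entity": url, "source": source, "detail": domain})
    return matches
```

An entity that produces no match is simply absent from these tables, which is different from being confirmed safe.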

Step 5: Content Pattern Analysis

Beyond entity checking, the AI analyzes the document's content for patterns associated with fraud:

  • Urgency language such as "immediate action required," "your account will be suspended," or "deadline expires today"
  • Payment pressure including demands for wire transfers, gift cards, cryptocurrency, or other non-reversible payment methods
  • Authority impersonation where the document claims to be from a government agency, bank, or well-known company
  • Information harvesting where the document requests sensitive data (SSN, bank account, passwords) in ways that legitimate organizations do not
  • Inconsistencies between the claimed sender and the document's actual characteristics
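A keyword-based approximation of these content checks is sketched below. It is deliberately simpler than the AI analysis the article describes: the category names, the phrase lists, and the function `flag_content` are hypothetical, and plain phrase matching will miss paraphrased or obfuscated wording.

```python
import re

# Hypothetical phrase lists drawn from the examples in the list above.
FRAUD_PATTERNS = {
    "urgency": [
        r"immediate action required",
        r"account will be suspended",
        r"deadline expires today",
    ],
    "payment_pressure": [r"wire transfers?", r"gift cards?", r"cryptocurrency"],
    "information_harvesting": [
        r"social security number",
        r"\bSSN\b",
        r"\bpasswords?\b",
    ],
}


def flag_content(text):
    """Map each triggered category to the phrases that triggered it."""
    flags = {}
    for category, patterns in FRAUD_PATTERNS.items():
        hits = [
            m.group(0).lower()
            for p in patterns
            if (m := re.search(p, text, re.IGNORECASE))
        ]
        if hits:
            flags[category] = hits
    return flags
```

Returning the matched phrases, rather than a bare yes/no, lets a reviewer see why a document was flagged.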

Step 6: Risk Assessment

The analysis produces a risk assessment that combines all findings into a clear, actionable evaluation:

  • Entity matches listing any phone numbers, URLs, or domains that appear in threat databases
  • Content flags highlighting language patterns associated with fraud
  • Overall risk level synthesizing all signals into a single assessment
  • Specific recommendations for next steps based on the findings
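One way to picture how these four outputs fit together is a small scoring function. The weights, thresholds, level names, and recommendation wording below are assumptions chosen for illustration; the article does not publish how ScamVerify weighs its signals.

```python
from dataclasses import dataclass, field


@dataclass
class RiskAssessment:
    level: str
    entity_matches: list
    content_flags: list
    recommendations: list = field(default_factory=list)


def assess_risk(entity_matches, content_flags):
    """Combine database matches and content flags into one assessment.

    Assumed weighting: a threat-database match counts three times as much
    as a single content flag, so any match alone yields 'high'.
    """
    score = 3 * len(entity_matches) + len(content_flags)
    if score >= 3:
        level = "high"
        advice = ["Do not pay or reply; contact the claimed sender "
                  "through a channel you already trust."]
    elif score >= 1:
        level = "medium"
        advice = ["Verify the sender independently before acting."]
    else:
        level = "low"
        advice = ["No known indicators found; a clean result does not "
                  "guarantee authenticity."]
    return RiskAssessment(level, list(entity_matches), list(content_flags), advice)
```

For example, `assess_risk([], ["urgency"])` yields a medium level under these assumed thresholds, while a single database match is enough for high.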

What Documents Should You Check?

High Priority

| Document Type | Why Check It | Common Fraud |
|---|---|---|
| Invoices from new vendors | 71% of organizations face payments fraud | Altered bank details |
| Wire transfer instructions | $446M in real estate wire fraud annually | Modified routing numbers |
| Job offer letters | 118% increase in fake postings since 2020 | Identity theft via onboarding |
| Government notices | 684,045 impersonation complaints | Fake IRS, SSA, court documents |
| Insurance documents | $308B+ annual insurance fraud | Fake policies and cards |

Worth Checking

  • Contracts and agreements before signing
  • Tax documents from unfamiliar preparers
  • Shipping and customs notifications with payment demands
  • Loan and mortgage documents from non-traditional lenders
  • Any document requesting sensitive personal information

When to Be Especially Cautious

  • The document arrived unexpectedly
  • It demands immediate action or payment
  • It requests information via non-standard channels
  • The formatting or quality differs from previous correspondence
  • Contact information in the document does not match known, verified information

What Document Analysis Does NOT Do

Transparency matters. ScamVerify document analysis has specific capabilities and limitations:

It does not detect all types of manipulation. Visual alterations to documents (changed logos, modified images, recolored elements) require forensic analysis beyond text extraction. The system focuses on identifying connections to known threat infrastructure and fraudulent content patterns.

It does not replace legal review. For contracts, agreements, and legal documents, a qualified attorney should review the terms. AI analysis identifies fraud indicators, not unfavorable legal terms.

It does not validate document authenticity. The system cannot confirm that a document is genuinely from the claimed sender. It can identify indicators of fraud, but a clean result does not guarantee authenticity. Always verify through independent channels.

It does not scan for malware. Document analysis extracts and evaluates content. For malware detection (embedded executables, malicious macros), use dedicated antivirus software. For more on PDF malware, read our guide on PDF malware in 2026.

How It Compares to Manual Verification

| Check | Manual | ScamVerify AI |
|---|---|---|
| Read full document content | 5-15 minutes per document | Seconds |
| Look up every phone number in FTC database | Impractical for multiple numbers | Automatic, 2.9M+ records |
| Check every URL against threat databases | Requires multiple tools | Automatic, 134K+ threat indicators |
| Identify urgency and manipulation language | Subjective, easy to miss | Pattern matching against known fraud |
| Cross-reference entities across databases | Requires specialized tools | Simultaneous multi-database check |

Manual verification remains valuable for context that AI cannot assess: Does this invoice match previous invoices from this vendor? Do you actually have an account with this company? Was this document expected? AI analysis and human judgment work best together.

Getting Started

  1. Go to scamverify.ai/document-checker
  2. Upload a PDF, image, or document file
  3. Wait a few seconds for the AI analysis to complete
  4. Review the risk assessment and entity check results
  5. Take action based on the findings

Free registered accounts include document analysis as part of the standard lookup allocation. Paid subscribers receive higher limits. For details on plans and pricing, visit scamverify.ai/pricing.

FAQ

What file types can I upload?

ScamVerify document analysis accepts PDFs, JPG/JPEG images, PNG images, and other common document formats. Scanned documents and photographs of documents are supported through OCR text extraction.

Is my uploaded document stored?

ScamVerify processes your document for analysis and stores the results for your account history. The analysis results are accessible from your dashboard. Review the privacy policy for full details on data handling.

Can I check a document without creating an account?

Document analysis is available to registered users. Creating an account is free and takes under a minute. Free accounts include document checks as part of the 5 free lookups.

How accurate is the analysis?

Entity matching is only as complete as the underlying databases: 2.9M+ FTC phone summaries, 74K+ URLhaus domains, and 60K+ ThreatFox IOCs. If a phone number or URL in the document appears in these databases, the match is an exact lookup rather than an inference. An entity that does not appear may simply not have been reported yet. Content pattern analysis uses AI inference and provides probabilistic assessments rather than binary results. A clean analysis result reduces risk but does not eliminate it. Always combine AI analysis with independent verification.

Can I use this for business document verification?

Yes. The B2B API includes document analysis endpoints for businesses that need to verify documents at scale. See the API documentation at docs.scamverify.ai for integration details.

Photo by Carlos Muza on Unsplash
