Building a Domain Security Analyzer — The ScanReport Data Model


In the last post, we built the main Scanner class that coordinates all our security checks. It returns a comprehensive results dictionary. But working with nested dictionaries is tedious — you're constantly writing things like results['ssl'].get('success') and results['ssl'].get('valid', False).

Today we build the ScanReport class — a wrapper that makes our data pleasant to work with.

Today's Goal

By the end of this post, you'll understand:

  1. Why wrap dictionaries in classes — The difference between raw data and convenient interfaces

  2. What the @property decorator does — Methods that behave like attributes

  3. How each property simplifies data access — Replacing verbose checks with simple booleans

  4. Why to_dict() matters — Converting back for serialization


The Problem with Raw Dictionaries

Our Scanner returns this structure:

{
    'domain': 'github.com',
    'scan_time': '2024-12-27T15:30:45+00:00',
    'dns': {
        'a_records': {'success': True, 'records': ['20.207.73.82']},
        'mx_records': {'success': True, 'records': [...]},
        'spf': {'success': True, 'record': 'v=spf1 ...'},
        'dmarc': {'success': True, 'record': 'v=DMARC1; p=reject ...'}
    },
    'ssl': {
        'success': True,
        'valid': True,
        'days_until_expiry': 405,
        ...
    },
    'headers': {
        'success': True,
        'headers': {
            'Strict-Transport-Security': {'present': True, 'value': '...'},
            ...
        }
    }
}

Every time we want to check something, we write defensive code:

# Is SSL valid?
if results['ssl'].get('success') and results['ssl'].get('valid', False):
    days = results['ssl']['days_until_expiry']

# Does SPF exist?
if results['dns']['spf'].get('success', False):
    print("Has SPF")

# Which headers are missing?
missing = []
if results['headers'].get('success'):
    for name, data in results['headers'].get('headers', {}).items():
        if not data['present']:
            missing.append(name)

This is:

  • Verbose — Lots of code to express simple questions

  • Error-prone — Easy to forget a .get() and crash on missing keys

  • Repetitive — Same checks written over and over

  • Hard to read — The intent gets buried in boilerplate
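
To make the "error-prone" point concrete, here is a minimal sketch. The failed-scan dictionary is an assumption about what a failed SSL check might contain (only success and error keys); the real scanner output may differ.

```python
# Hypothetical result for a failed SSL check: no 'valid' or
# 'days_until_expiry' keys are present (assumed shape, for illustration).
results = {'ssl': {'success': False, 'error': 'Connection timed out'}}

# Defensive access: .get() returns None instead of raising.
days_safe = results['ssl'].get('days_until_expiry')

# Direct access: one forgotten .get() raises KeyError.
try:
    days_unsafe = results['ssl']['days_until_expiry']
    crashed = False
except KeyError:
    crashed = True

print(days_safe, crashed)
```

The two lookups differ by a few characters, which is why the mistake is so easy to make.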


The Solution: A Data Model Class

Instead of accessing raw dictionaries, we wrap them in a class with convenient properties:

# Before (raw dictionary)
if results['ssl'].get('success') and results['ssl'].get('valid', False):
    days = results['ssl']['days_until_expiry']

# After (ScanReport class)
if report.has_valid_ssl:
    days = report.ssl_days_remaining

The ScanReport class:

  • Stores the raw data internally

  • Provides clean properties for common questions

  • Handles edge cases and defaults

  • Makes code readable and maintainable


The Complete Code

Here's the full models/report.py file:

"""Data models for scan reports."""

from datetime import datetime


class ScanReport:
    """
    Holds the results of a domain security scan.

    Provides easy access to different parts of the scan
    and methods to check overall status.
    """

    def __init__(self, scan_results):
        """
        Initialize report from scanner results.

        Args:
            scan_results: dict returned by Scanner.scan()
        """
        self.domain = scan_results['domain']
        self.scan_time = scan_results['scan_time']
        self.dns = scan_results['dns']
        self.ssl = scan_results['ssl']
        self.headers = scan_results['headers']

    @property
    def has_valid_ssl(self):
        """Check if SSL certificate is valid."""
        return self.ssl.get('success') and self.ssl.get('valid', False)

    @property
    def ssl_days_remaining(self):
        """Get days until SSL certificate expires, or None if invalid."""
        if self.has_valid_ssl:
            return self.ssl.get('days_until_expiry')
        return None

    @property
    def has_spf(self):
        """Check if SPF record exists."""
        return self.dns['spf'].get('success', False)

    @property
    def has_dmarc(self):
        """Check if DMARC record exists."""
        return self.dns['dmarc'].get('success', False)

    @property
    def missing_headers(self):
        """Get list of missing security headers."""
        if not self.headers.get('success'):
            return []

        missing = []
        for name, data in self.headers.get('headers', {}).items():
            if not data['present']:
                missing.append(name)
        return missing

    @property
    def present_headers(self):
        """Get list of present security headers."""
        if not self.headers.get('success'):
            return []

        present = []
        for name, data in self.headers.get('headers', {}).items():
            if data['present']:
                present.append(name)
        return present

    def to_dict(self):
        """Convert report to dictionary (useful for JSON/database)."""
        return {
            'domain': self.domain,
            'scan_time': self.scan_time,
            'dns': self.dns,
            'ssl': self.ssl,
            'headers': self.headers
        }

And the package's __init__.py:

"""Data models package."""

from .report import ScanReport

__all__ = ['ScanReport']

Now let's understand every piece.


Part 1: The Module Structure

"""Data models for scan reports."""

from datetime import datetime

The Docstring

"""Data models for scan reports."""

This describes the file's purpose. A "data model" is a class that represents and organizes data — it gives structure and convenience methods to raw information.

The Import

from datetime import datetime

We import datetime even though we don't use it directly in the current code. It's here because:

  1. Future properties might need it (parsing scan_time string back to datetime)

  2. It signals that time-related operations belong in this module

  3. Removing unused imports is easy; remembering to add them later is harder

In production code, you might remove truly unused imports. For learning projects, having them ready is fine.
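
As a sketch of the first reason, a future property could turn the scan_time string back into a datetime. This is not part of the current class; it only assumes the timestamp keeps the ISO format shown in the results structure earlier.

```python
from datetime import datetime

# Same timestamp format as the 'scan_time' value shown earlier.
scan_time = '2024-12-27T15:30:45+00:00'

# fromisoformat() understands the '+00:00' UTC offset and returns
# a timezone-aware datetime.
parsed = datetime.fromisoformat(scan_time)

print(parsed.year, parsed.tzinfo is not None)
```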


Part 2: The Class Definition and Docstring

class ScanReport:
    """
    Holds the results of a domain security scan.

    Provides easy access to different parts of the scan
    and methods to check overall status.
    """

What This Class Represents

A ScanReport is a view into scan results. It doesn't perform scanning — it just makes existing results easier to work with.

Think of it like a book's table of contents. The book (raw data) has all the content. The table of contents (ScanReport) helps you find what you need quickly.

The Class Docstring

The docstring explains:

  • What it holds — "results of a domain security scan"

  • What it provides — "easy access" and "methods to check overall status"

Anyone reading the code or using help(ScanReport) immediately understands its purpose.


Part 3: The __init__ Method

    def __init__(self, scan_results):
        """
        Initialize report from scanner results.

        Args:
            scan_results: dict returned by Scanner.scan()
        """
        self.domain = scan_results['domain']
        self.scan_time = scan_results['scan_time']
        self.dns = scan_results['dns']
        self.ssl = scan_results['ssl']
        self.headers = scan_results['headers']

What __init__ Does

The constructor takes the raw results dictionary from Scanner.scan() and unpacks it into instance attributes.

Why Unpack Into Separate Attributes?

We could store the entire dictionary:

def __init__(self, scan_results):
    self.data = scan_results  # Just store everything

But unpacking gives us:

1. Clear interface

report.domain      # Obviously the domain
report.ssl         # Obviously SSL data

vs.

report.data['domain']  # Extra layer
report.data['ssl']     # Nesting

2. IDE autocomplete

When you type report., your IDE shows domain, scan_time, dns, ssl, headers. With report.data, you'd just see data — no visibility into the structure.

3. Validation opportunity

If a required field is missing, unpacking fails immediately:

self.domain = scan_results['domain']  # KeyError if missing

This catches bugs early. With self.data = scan_results, you'd only discover missing fields later when accessing them.
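
A minimal sketch of that fail-fast behavior. MiniReport is a trimmed, hypothetical stand-in for ScanReport, used here only to keep the example short.

```python
class MiniReport:
    """Trimmed stand-in for ScanReport (hypothetical, illustration only)."""

    def __init__(self, scan_results):
        # Direct indexing raises KeyError right here if a key is absent.
        self.domain = scan_results['domain']
        self.ssl = scan_results['ssl']


incomplete = {'domain': 'github.com'}  # 'ssl' is missing

try:
    MiniReport(incomplete)
    error = None
except KeyError as exc:
    error = exc.args[0]  # name of the missing key

print(error)
```

The error surfaces at construction time and names the missing key, rather than appearing later in unrelated display code.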

The Attributes

| Attribute | Type | Contains |
|---|---|---|
| self.domain | str | The domain name ("github.com") |
| self.scan_time | str | ISO format timestamp |
| self.dns | dict | All DNS check results |
| self.ssl | dict | SSL certificate check results |
| self.headers | dict | HTTP security headers results |

Part 4: Understanding the @property Decorator

Before we look at each property, let's understand what @property does.

Methods vs Properties

A method is a function that belongs to a class. You call it with parentheses:

report.to_dict()  # Method call — note the ()

A property looks like an attribute but runs code behind the scenes:

report.has_valid_ssl  # Property access — no ()

The @property Decorator

A decorator modifies how a function behaves. The @property decorator turns a method into a property:

# Without @property
def get_has_valid_ssl(self):
    return self.ssl.get('success') and self.ssl.get('valid', False)

# Usage: report.get_has_valid_ssl()  # Need parentheses

# With @property
@property
def has_valid_ssl(self):
    return self.ssl.get('success') and self.ssl.get('valid', False)

# Usage: report.has_valid_ssl  # No parentheses — looks like an attribute

Why Use Properties?

1. Cleaner syntax

# Method style
if report.get_has_valid_ssl():
    days = report.get_ssl_days_remaining()

# Property style
if report.has_valid_ssl:
    days = report.ssl_days_remaining

Properties read more naturally — like accessing data rather than calling functions.

2. Computed values look like data

The user doesn't need to know whether ssl_days_remaining is stored or calculated. They just access it:

report.domain              # Stored attribute
report.ssl_days_remaining  # Computed property — same syntax!

3. You can add logic later

Start with a simple attribute:

self.domain = scan_results['domain']

Later, add validation without changing how users access it:

@property
def domain(self):
    return self._domain.lower()  # Always lowercase

External code using report.domain doesn't need to change.
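
One detail the snippet leaves implicit: the property reads self._domain, so __init__ has to store the value under that private name. A property without a setter cannot be assigned through self.domain. A minimal sketch, using a hypothetical stand-in class rather than the real ScanReport:

```python
class Report:
    """Minimal stand-in for ScanReport (hypothetical, illustration only)."""

    def __init__(self, scan_results):
        # Store under a private name; the property below exposes it.
        self._domain = scan_results['domain']

    @property
    def domain(self):
        return self._domain.lower()  # Always lowercase


report = Report({'domain': 'GitHub.COM'})
print(report.domain)
```

Callers still write report.domain, exactly as they did when it was a plain attribute.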


Part 5: The has_valid_ssl Property

    @property
    def has_valid_ssl(self):
        """Check if SSL certificate is valid."""
        return self.ssl.get('success') and self.ssl.get('valid', False)

What It Does

Returns True if:

  1. The SSL check succeeded (success is True)

  2. The certificate is valid (valid is True)

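Otherwise it returns a falsy value: False in most cases, or None if the success key is missing (Python's and returns its first falsy operand). Either way, an if check treats the result as false.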

Breaking Down the Logic

self.ssl.get('success') and self.ssl.get('valid', False)

self.ssl.get('success')

Gets the success key from the SSL results dictionary. If the key doesn't exist, returns None.

For SSL, success means "we were able to check the certificate." This is True even if the certificate is invalid — we successfully determined it was invalid.

self.ssl.get('valid', False)

Gets the valid key, defaulting to False if missing. The second parameter to .get() is the default value.

Why default to False? If the key is missing, we can't confirm the certificate is valid. Safer to assume it isn't.

The and operator

Both conditions must be true:

  • success must be truthy (the check ran)

  • valid must be truthy (the certificate passed)

The Truth Table

| success | valid | Result | Meaning |
|---|---|---|---|
| True | True | True | Certificate is valid |
| True | False | False | Certificate exists but is invalid |
| False | (any) | False | Couldn't check (timeout, no HTTPS, etc.) |
| None | (any) | None (falsy) | Key missing; shouldn't happen, but handled |
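
The rows of the truth table can be checked directly. One subtlety: Python's and returns its first falsy operand, so when the success key is missing the expression evaluates to None rather than False. Both are falsy, so an if check behaves the same. The sketch below applies the property's expression to plain dictionaries.

```python
def has_valid_ssl(ssl):
    # Same expression as the property, applied to a plain dict.
    return ssl.get('success') and ssl.get('valid', False)


cases = [
    {'success': True, 'valid': True},
    {'success': True, 'valid': False},
    {'success': False, 'valid': True},
    {},  # 'success' key missing
]

results = [has_valid_ssl(case) for case in cases]
print(results)  # [True, False, False, None]

# Wrapping in bool() would normalize every outcome to True or False.
normalized = [bool(r) for r in results]
print(normalized)  # [True, False, False, False]
```

If a strict True/False matters (for example, when the value ends up in JSON), wrapping the expression in bool() is one option.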

Usage

if report.has_valid_ssl:
    print("✓ Valid SSL certificate")
else:
    print("✗ SSL issue detected")

One simple boolean instead of nested dictionary checks.


Part 6: The ssl_days_remaining Property

    @property
    def ssl_days_remaining(self):
        """Get days until SSL certificate expires, or None if invalid."""
        if self.has_valid_ssl:
            return self.ssl.get('days_until_expiry')
        return None

What It Does

Returns the number of days until the SSL certificate expires, or None if we can't determine that.

The Logic

if self.has_valid_ssl:
    return self.ssl.get('days_until_expiry')
return None

Why check has_valid_ssl first?

If the certificate is invalid (expired, self-signed, wrong domain), "days remaining" doesn't make sense. An expired certificate has negative days, but a self-signed one might have 365 days — that doesn't mean it's good for 365 days.

By checking validity first, we only return days for certificates we actually trust.

Reusing has_valid_ssl

Notice we call self.has_valid_ssl — our own property. Properties can use other properties. This keeps logic centralized:

# If we later change what "valid" means, we only update has_valid_ssl
# ssl_days_remaining automatically uses the new definition
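
A sketch of that centralization. Report is a trimmed stand-in for ScanReport and StrictReport is a hypothetical variant; neither name is part of the project. Changing only has_valid_ssl changes what ssl_days_remaining returns.

```python
class Report:
    """Trimmed stand-in for ScanReport (hypothetical, illustration only)."""

    def __init__(self, ssl):
        self.ssl = ssl

    @property
    def has_valid_ssl(self):
        return self.ssl.get('success') and self.ssl.get('valid', False)

    @property
    def ssl_days_remaining(self):
        if self.has_valid_ssl:
            return self.ssl.get('days_until_expiry')
        return None


class StrictReport(Report):
    """Hypothetical variant: 'valid' also requires at least one day left."""

    @property
    def has_valid_ssl(self):
        days = self.ssl.get('days_until_expiry', 0)
        return bool(super().has_valid_ssl) and days > 0


ssl = {'success': True, 'valid': True, 'days_until_expiry': 0}
print(Report(ssl).ssl_days_remaining)        # 0
print(StrictReport(ssl).ssl_days_remaining)  # None
```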

Usage

days = report.ssl_days_remaining

if days is None:
    print("Cannot determine certificate expiry")
elif days < 0:
    print(f"Certificate expired {abs(days)} days ago!")
elif days < 30:
    print(f"Warning: Certificate expires in {days} days")
else:
    print(f"Certificate valid for {days} more days")

Part 7: The has_spf Property

    @property
    def has_spf(self):
        """Check if SPF record exists."""
        return self.dns['spf'].get('success', False)

What It Does

Returns True if the domain has an SPF record, False otherwise.

The Logic

self.dns['spf'].get('success', False)

self.dns['spf']

Access the SPF results within DNS results. We use direct access ([]) rather than .get() because the SPF key should always exist — our scanner always returns it.

.get('success', False)

Check if the lookup succeeded. Default to False if the key is missing (defensive programming).

Why This Works

Our DNS scanner returns:

# SPF found
{'success': True, 'record': 'v=spf1 include:...'}

# SPF not found
{'success': False, 'error': 'No SPF record found'}

So success directly tells us whether the record exists.

Usage

if report.has_spf:
    print("✓ SPF record configured")
else:
    print("✗ No SPF record — email spoofing protection missing")

Part 8: The has_dmarc Property

    @property
    def has_dmarc(self):
        """Check if DMARC record exists."""
        return self.dns['dmarc'].get('success', False)

What It Does

Returns True if the domain has a DMARC record, False otherwise.

The Logic

Identical pattern to has_spf:

self.dns['dmarc'].get('success', False)

We access the DMARC results and check if the lookup succeeded.

Why Separate Properties for SPF and DMARC?

They're both DNS TXT records for email security, but they serve different purposes:

  • SPF — "Who is allowed to send email for this domain?"

  • DMARC — "What should receivers do when someone violates the rules?"

A domain might have SPF but not DMARC, or vice versa. Checking them separately lets us give specific recommendations:

if not report.has_spf:
    print("Add SPF record to specify authorized mail servers")

if not report.has_dmarc:
    print("Add DMARC record to specify enforcement policy")

Part 9: The missing_headers Property

    @property
    def missing_headers(self):
        """Get list of missing security headers."""
        if not self.headers.get('success'):
            return []

        missing = []
        for name, data in self.headers.get('headers', {}).items():
            if not data['present']:
                missing.append(name)
        return missing

What It Does

Returns a list of security header names that the domain is missing.

The Logic — Step by Step

Step 1: Check if headers scan succeeded

if not self.headers.get('success'):
    return []

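If we couldn't check headers (connection failed, SSL error), we can't say which are missing. Return an empty list rather than guessing. Note that an empty list is therefore ambiguous: it can mean "nothing is missing" or "the scan failed". Callers that need to tell the two apart should also check self.headers.get('success').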

Step 2: Initialize empty list

missing = []

We'll collect missing header names here.

Step 3: Loop through headers

for name, data in self.headers.get('headers', {}).items():

The headers results look like:

{
    'success': True,
    'headers': {
        'Strict-Transport-Security': {'present': True, 'value': '...'},
        'Content-Security-Policy': {'present': False, 'value': None},
        ...
    }
}

We loop through each header name and its data.

.get('headers', {}) — Get the headers dict, default to empty dict if missing. This prevents errors if the structure is unexpected.

.items() — Returns key-value pairs for iteration.

Step 4: Check if missing

if not data['present']:
    missing.append(name)

If present is False, this header is missing. Add its name to our list.

Step 5: Return the list

return missing
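
The five steps can also be written as a list comprehension. This is an equivalent sketch, not the project's code: it is a standalone function that takes the headers result dict, and the sample header value is made up for illustration.

```python
def missing_headers(headers):
    """Function form of the property, taking the headers result dict."""
    if not headers.get('success'):
        return []
    return [
        name
        for name, data in headers.get('headers', {}).items()
        if not data['present']
    ]


# Sample data in the shape shown above (values are illustrative).
sample = {
    'success': True,
    'headers': {
        'Strict-Transport-Security': {'present': True, 'value': 'max-age=31536000'},
        'Content-Security-Policy': {'present': False, 'value': None},
    },
}

print(missing_headers(sample))              # ['Content-Security-Policy']
print(missing_headers({'success': False}))  # []
```

The explicit loop in the class is easier to step through when learning; the comprehension is the more compact form of the same logic.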

Example Output

For a site missing some headers:

report.missing_headers
# ['Content-Security-Policy', 'Permissions-Policy']

For a well-configured site:

report.missing_headers
# []

Usage

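missing = report.missing_headers

if not report.headers.get('success'):
    # An empty list here would not mean "all present"
    print("✗ Could not check security headers")
elif missing:
    print(f"Missing {len(missing)} security headers:")
    for header in missing:
        print(f"  ✗ {header}")
else:
    print("✓ All security headers present")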

Part 10: The present_headers Property

    @property
    def present_headers(self):
        """Get list of present security headers."""
        if not self.headers.get('success'):
            return []

        present = []
        for name, data in self.headers.get('headers', {}).items():
            if data['present']:
                present.append(name)
        return present

What It Does

Returns a list of security header names that the domain has.

The Logic

Mirror image of missing_headers:

if data['present']:  # Note: True instead of False
    present.append(name)

Everything else is identical — same safety checks, same iteration.

Why Have Both Properties?

Different use cases:

missing_headers — For recommendations

print("You should add these headers:")
for header in report.missing_headers:
    print(f"  - {header}")

present_headers — For positive feedback

print("Good job on these headers:")
for header in report.present_headers:
    print(f"  ✓ {header}")

Counting

total = 6  # We check 6 headers
present = len(report.present_headers)
missing = len(report.missing_headers)
print(f"Headers: {present}/{total}")

Could We Compute One from the Other?

Yes, but it would be slower and more complex:

# Computing missing from present (not recommended)
all_headers = ['Strict-Transport-Security', 'Content-Security-Policy', ...]
missing = [h for h in all_headers if h not in report.present_headers]

This requires:

  • Maintaining a list of all headers

  • Set difference logic

  • Extra computation each time

Having both properties is clearer, and each one needs only a single pass over the headers dictionary with no separate master list to maintain.
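
If iterating twice ever became a concern, a third option is a helper that fills both lists in one pass. split_headers is a hypothetical name, not part of the project, and the sample data is illustrative.

```python
def split_headers(headers):
    """Return (present, missing) header names in one pass (hypothetical helper)."""
    present, missing = [], []
    if not headers.get('success'):
        return present, missing
    for name, data in headers.get('headers', {}).items():
        # Append the name to whichever list matches its 'present' flag.
        (present if data['present'] else missing).append(name)
    return present, missing


sample = {
    'success': True,
    'headers': {
        'Strict-Transport-Security': {'present': True, 'value': '...'},
        'Content-Security-Policy': {'present': False, 'value': None},
    },
}

present, missing = split_headers(sample)
print(present, missing)
```

For a handful of headers the difference is negligible, so the two separate properties remain a reasonable choice.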


Part 11: The to_dict Method

    def to_dict(self):
        """Convert report to dictionary (useful for JSON/database)."""
        return {
            'domain': self.domain,
            'scan_time': self.scan_time,
            'dns': self.dns,
            'ssl': self.ssl,
            'headers': self.headers
        }

What It Does

Converts the ScanReport back to a plain dictionary.

Why We Need This

Properties are great for Python code, but some situations need plain dictionaries:

1. JSON serialization

import json

report = ScanReport(scanner.scan())

# This fails — can't serialize a custom class
json.dumps(report)  # TypeError!

# This works
json.dumps(report.to_dict())  # Valid JSON

2. Database storage

Most databases and ORMs expect dictionaries or primitive types, not custom objects.

3. API responses

Web frameworks typically return dictionaries that get converted to JSON.

4. Passing to other systems

External code might not have the ScanReport class definition. Plain dictionaries are universal.
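
A sketch of the round trip these cases rely on: export with to_dict(), serialize to JSON, then rebuild a report from the parsed data. Report is a trimmed, hypothetical stand-in for ScanReport that keeps only two fields.

```python
import json


class Report:
    """Trimmed stand-in for ScanReport (hypothetical, illustration only)."""

    def __init__(self, scan_results):
        self.domain = scan_results['domain']
        self.ssl = scan_results['ssl']

    def to_dict(self):
        return {'domain': self.domain, 'ssl': self.ssl}


original = Report({'domain': 'github.com', 'ssl': {'success': True, 'valid': True}})

# Serialize to a JSON string, then rebuild a report from the parsed data.
payload = json.dumps(original.to_dict())
restored = Report(json.loads(payload))

print(restored.domain, restored.ssl == original.ssl)
```

Because to_dict() returns the same shape that __init__ accepts, a stored report can be loaded back without any extra conversion code.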

Why Not Store the Original Dictionary?

We unpacked the dictionary into attributes in __init__. Now we're reconstructing it. Seems wasteful?

Actually, this is intentional:

1. Validation happened

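When we unpacked, we verified that all required top-level keys exist (domain, scan_time, dns, ssl, headers). The reconstructed dict is guaranteed to contain all five, although the nested dictionaries are not checked.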

2. Normalization happened

Future versions might clean or transform data during __init__. The output dict reflects that.

3. Computed values could be added

We could extend to_dict() to include computed properties:

def to_dict(self):
    return {
        'domain': self.domain,
        'scan_time': self.scan_time,
        'dns': self.dns,
        'ssl': self.ssl,
        'headers': self.headers,
        # Computed fields
        'has_valid_ssl': self.has_valid_ssl,
        'missing_headers': self.missing_headers,
    }
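
One caveat worth knowing: to_dict() as written returns the same nested dict objects the report holds, not copies. The sketch below uses a trimmed, hypothetical stand-in class to show the effect, and copy.deepcopy as one way to get an independent snapshot.

```python
import copy


class Report:
    """Trimmed stand-in for ScanReport (hypothetical, illustration only)."""

    def __init__(self, scan_results):
        self.ssl = scan_results['ssl']

    def to_dict(self):
        return {'ssl': self.ssl}  # same dict object, not a copy


report = Report({'ssl': {'success': True, 'valid': True}})

# Mutating the exported dict also changes the report's data.
exported = report.to_dict()
exported['ssl']['valid'] = False
shared = report.ssl['valid']  # False: both names point at one dict

# A deep copy is independent of the report.
report.ssl['valid'] = True
snapshot = copy.deepcopy(report.to_dict())
snapshot['ssl']['valid'] = False
independent = report.ssl['valid']  # still True

print(shared, independent)
```

For read-only uses such as json.dump this sharing is harmless; it only matters if the exported dictionary is modified afterwards.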

Usage

# Save to JSON file
import json

report = ScanReport(scanner.scan())

with open('report.json', 'w') as f:
    json.dump(report.to_dict(), f, indent=2)

# Send as API response (Flask example)
from flask import Flask, jsonify

from models import ScanReport
from scanner import Scanner

app = Flask(__name__)

@app.route('/scan/<domain>')
def scan_domain(domain):
    scanner = Scanner(domain)
    report = ScanReport(scanner.scan())
    return jsonify(report.to_dict())

Part 12: The Package __init__.py

"""Data models package."""

from .report import ScanReport

__all__ = ['ScanReport']

What It Does

Makes ScanReport easy to import:

# With __init__.py exporting ScanReport
from models import ScanReport

# Without it
from models.report import ScanReport  # More verbose

The Pattern

Same pattern as our scanner package:

  1. Define classes/functions in specific files (report.py)

  2. Import them in __init__.py

  3. Export via __all__

This keeps files focused while providing a clean public interface.


Part 13: Putting It All Together

Let's see how ScanReport improves our code:

Before (Raw Dictionaries)

from scanner import Scanner

scanner = Scanner("github.com")
results = scanner.scan()

# Check SSL
if results['ssl'].get('success') and results['ssl'].get('valid', False):
    days = results['ssl'].get('days_until_expiry')
    if days is not None and days < 30:
        print(f"Warning: Certificate expires in {days} days")

# Check email security
if results['dns']['spf'].get('success', False):
    print("Has SPF")
if results['dns']['dmarc'].get('success', False):
    print("Has DMARC")

# Check headers
if results['headers'].get('success'):
    missing = []
    for name, data in results['headers'].get('headers', {}).items():
        if not data['present']:
            missing.append(name)
    if missing:
        print(f"Missing headers: {missing}")

After (ScanReport)

from scanner import Scanner
from models import ScanReport

scanner = Scanner("github.com")
report = ScanReport(scanner.scan())

# Check SSL
days = report.ssl_days_remaining
if days is not None and days < 30:
    print(f"Warning: Certificate expires in {days} days")

# Check email security
if report.has_spf:
    print("Has SPF")
if report.has_dmarc:
    print("Has DMARC")

# Check headers
if report.missing_headers:
    print(f"Missing headers: {report.missing_headers}")

The after version is:

  • Shorter — Less boilerplate

  • Clearer — Intent is obvious (has_valid_ssl vs nested .get() calls)

  • Safer — Edge cases handled inside the properties

  • Maintainable — Change logic in one place, not everywhere


Part 14: How This Fits Into the Project

Here's our updated project structure:

src/
├── scanner/
│   ├── __init__.py         # Exports Scanner
│   ├── scanner.py          # Main Scanner class
│   ├── dns_scanner.py      # DNS checks
│   ├── ssl_scanner.py      # SSL checks
│   └── headers_scanner.py  # Header checks
├── models/
│   ├── __init__.py         # Exports ScanReport
│   └── report.py           # ScanReport class
└── analyzer.py             # Entry point (coming next)

The flow is:

Scanner.scan()
    ↓
Returns raw dict
    ↓
ScanReport(dict)
    ↓
Clean properties for display/scoring

Summary: What We've Learned

  1. Data model classes wrap raw data — They provide a clean interface over dictionaries

  2. @property makes methods look like attributes — Users access report.has_valid_ssl without parentheses

  3. Properties centralize logic — Edge cases and defaults are handled once, not everywhere

  4. Properties can use other properties (ssl_days_remaining uses has_valid_ssl)

  5. to_dict() enables serialization — Convert back to dictionaries for JSON, databases, APIs

  6. Package __init__.py provides clean imports — from models import ScanReport works


What's Next

We now have:

  • Scanner package — Performs all security checks

  • Models package — Makes results easy to work with

In the next post, we'll build the CLI module (cli.py) — the display layer that takes a ScanReport and prints a beautiful, formatted report to the terminal. All those print statements that used to be scattered everywhere will live in one place, completely separate from the scanning and data logic.


This is Part 11 of the Domain Security Analyzer series.

Find the code on GitHub.