2026-04-27-exam-sprint-report-logging-design.md 5.6 KB

Exam Sprint Report Logging Design

Goal

Enrich business logs for the two exam-sprint report types, OUTLOOK and ACHIEVEMENT, so production issues can be traced across the full report lifecycle: submission, PDF generation, storage, query, download, and cleanup.

Scope

The logging enhancement covers these report APIs and application flows:

  • Asynchronous creation: /api/exam-sprint/outlook-reports and /api/exam-sprint/achievement-reports
  • Synchronous creation: /api/exam-sprint/outlook-reports/sync and /api/exam-sprint/achievement-reports/sync
  • Report status query: /api/exam-sprint/reports/{reportId}
  • PDF download: /api/exam-sprint/reports/{reportId}/download
  • Background generation pipeline and expired report cleanup

The implementation should not log full request payloads, rendered HTML, PDF bytes, storage credentials, or account keys.

Recommended Approach

Use focused, structured business logs at application and pipeline boundaries. This keeps the log volume controlled while exposing the information needed to diagnose common failures.

The primary log fields should be:

  • reportId
  • reportType
  • generationStatus
  • stage
  • durationMs where a stage has meaningful elapsed time
  • storageObjectKey and fileName after upload or download lookup
  • concise failure reason and exception type for failures
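
As a minimal sketch of how these fields could be assembled into one log message, the helper below builds a key=value line in a fixed field order. The class name ReportLogLine, the key=value layout, and the stage value used in the usage note are assumptions for illustration only; the design does not prescribe a message format.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper (not part of the existing code base): formats the primary log fields. */
final class ReportLogLine {
    // LinkedHashMap keeps insertion order, so reportId and reportType always come first.
    private final Map<String, Object> fields = new LinkedHashMap<>();

    private ReportLogLine() {
    }

    /** Every line starts with the two fields required on all report logs. */
    static ReportLogLine of(String reportId, String reportType) {
        ReportLogLine line = new ReportLogLine();
        line.fields.put("reportId", reportId);
        line.fields.put("reportType", reportType);
        return line;
    }

    /** Adds an optional field; null values are skipped so absent data is not printed as "null". */
    ReportLogLine with(String key, Object value) {
        if (value != null) {
            fields.put(key, value);
        }
        return this;
    }

    /** Renders the fields as space-separated key=value pairs. */
    String format() {
        StringJoiner joiner = new StringJoiner(" ");
        fields.forEach((key, value) -> joiner.add(key + "=" + value));
        return joiner.toString();
    }
}
```

For example, ReportLogLine.of("r-1", "OUTLOOK").with("stage", "PDF_GENERATION").with("durationMs", 120L).format() yields reportId=r-1 reportType=OUTLOOK stage=PDF_GENERATION durationMs=120. Values that may contain spaces, such as a failure reason, would need quoting or truncation, which this sketch leaves out.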

Components

DefaultExamSprintReportApplicationService

Add logs around public application operations:

  • Asynchronous report submission:
    • INFO after the report is persisted and submitted to the dispatcher.
    • WARN if dispatch fails and the report is marked failed.
  • Synchronous report creation:
    • INFO when synchronous generation starts.
    • INFO when synchronous generation succeeds and a download URL can be returned.
    • WARN when the generated report is not downloadable.
  • Report query:
    • INFO with report status and whether a download URL is included.
    • INFO when an expired report is marked expired during query.
  • Report download:
    • INFO on download start and success.
    • WARN when the report is expired, not successful, has no storage key, or storage content is missing.
  • Expired report cleanup:
    • INFO at cleanup summary level with scanned, expired, storage-cleared, and failed counts.
    • WARN for individual cleanup failures while preserving retry behavior.
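
The download cases listed above can be sketched as a small classification step that picks both the outcome and its log level. The enum, the classifier, its parameters, and the order of the checks are hypothetical; the sketch uses java.util.logging levels (WARNING standing in for WARN) only to stay self-contained, and the project's actual logging framework may differ.

```java
import java.util.logging.Level;

/** Hypothetical download outcomes, each paired with the log level proposed in this design. */
enum DownloadOutcome {
    SUCCESS(Level.INFO),
    EXPIRED(Level.WARNING),
    NOT_SUCCESSFUL(Level.WARNING),
    NO_STORAGE_KEY(Level.WARNING),
    STORAGE_CONTENT_MISSING(Level.WARNING);

    private final Level level;

    DownloadOutcome(Level level) {
        this.level = level;
    }

    Level level() {
        return level;
    }
}

/** Hypothetical classifier; the check order below is an assumption, not existing behavior. */
final class DownloadOutcomeClassifier {
    private DownloadOutcomeClassifier() {
    }

    static DownloadOutcome classify(boolean expired,
                                    String generationStatus,
                                    String storageObjectKey,
                                    boolean storageContentFound) {
        if (expired) {
            return DownloadOutcome.EXPIRED;
        }
        if (!"SUCCESS".equals(generationStatus)) {
            return DownloadOutcome.NOT_SUCCESSFUL;
        }
        if (storageObjectKey == null || storageObjectKey.trim().isEmpty()) {
            return DownloadOutcome.NO_STORAGE_KEY;
        }
        if (!storageContentFound) {
            return DownloadOutcome.STORAGE_CONTENT_MISSING;
        }
        return DownloadOutcome.SUCCESS;
    }
}
```

Logging the outcome name alongside reportId and reportType would let a reader tell the four WARN cases apart without parsing free-form text.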

ExamSprintReportGenerationPipeline

Add stage-based logs around generation:

  • INFO when generation starts for a pending report.
  • INFO when a report is skipped because it is missing, not pending, or expired.
  • INFO when status changes to PROCESSING.
  • INFO after HTML rendering with elapsed time and HTML length.
  • INFO after PDF generation with elapsed time and byte size.
  • INFO after storage upload with elapsed time, storage object key, and file name.
  • INFO after final success with total elapsed time.
  • ERROR when any generation stage fails, including report id, report type, stage, failure reason, and exception.
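
One way to get consistent stage logs is to wrap each stage in a small timing helper, sketched below under stated assumptions. StageTimer, StageFailedException, and the stage names in the usage note are hypothetical, and the helper collects formatted lines in a list instead of calling a real logger so the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Hypothetical wrapper exception that remembers which stage failed. */
final class StageFailedException extends RuntimeException {
    final String stage;

    StageFailedException(String stage, Exception cause) {
        super("stage " + stage + " failed", cause);
        this.stage = stage;
    }
}

/** Hypothetical helper: runs one pipeline stage, timing it and recording a log line. */
final class StageTimer {
    // Stand-in for a logger; a real pipeline would log at INFO or ERROR instead.
    final List<String> lines = new ArrayList<>();
    private final String prefix;

    StageTimer(String reportId, String reportType) {
        this.prefix = "reportId=" + reportId + " reportType=" + reportType;
    }

    <T> T run(String stage, Callable<T> work) {
        long startNanos = System.nanoTime();
        try {
            T result = work.call();
            long durationMs = (System.nanoTime() - startNanos) / 1_000_000;
            lines.add("INFO " + prefix + " stage=" + stage + " durationMs=" + durationMs);
            return result;
        } catch (Exception e) {
            // Record the failed stage, exception type, and reason, then rethrow so the
            // caller can still mark the report as failed.
            lines.add("ERROR " + prefix + " stage=" + stage
                    + " exceptionType=" + e.getClass().getSimpleName()
                    + " failureReason=" + e.getMessage());
            throw new StageFailedException(stage, e);
        }
    }
}
```

A pipeline could then write String html = timer.run("HTML_RENDER", () -> renderHtml(report)), where renderHtml is a placeholder for the existing rendering call, and catch StageFailedException once at the top level to persist the failed status.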

Infrastructure Components

Keep infrastructure logs minimal. Application and pipeline logs should be enough for the normal diagnosis path. Add infrastructure-level logging only if a component has a meaningful failure boundary that is otherwise invisible.

For this change, avoid verbose logs in renderers, PDF generator, and Azure storage unless implementation reveals an unlogged failure path.

Data Flow

  1. The controller delegates to DefaultExamSprintReportApplicationService.
  2. The application service validates payload shape and creates a pending ExamSprintReport.
  3. For asynchronous generation, it persists the report and dispatches the reportId.
  4. For synchronous generation, it persists the report and calls ExamSprintReportGenerationPipeline.generate(reportId) directly.
  5. The pipeline moves the report to PROCESSING, renders HTML, generates PDF bytes, uploads the PDF, and marks the report SUCCESS.
  6. If an exception occurs, the pipeline marks the report FAILED and logs the failed stage.
  7. Query and download operations log current status and download availability without logging sensitive content.

Error Handling

Generation failures should continue to use the existing behavior: catch the exception, persist a failed report with failureReason, and return the failed report where applicable.

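The logs should make failure causes observable without changing external API behavior. Expected business states that prevent a request from being served, such as a download attempt on an expired or unavailable report, should use WARN. Routine state handling, such as marking a report expired during a query or skipping an expired report in the pipeline, stays at INFO. Unexpected generation exceptions should use ERROR.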

Testing Strategy

Add or update unit tests where feasible to verify that logging changes do not alter behavior:

  • Existing application service tests should continue to pass for asynchronous creation, synchronous creation, query, download, and cleanup flows.
  • Existing pipeline tests should continue to pass for success and failure flows.
  • If the project already uses log-capture testing utilities, add focused assertions for at least one successful generation log and one failed generation log. Otherwise, avoid brittle log text assertions and rely on behavior-preserving tests.
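
If no log-capture utility is available, a focused assertion can be sketched with a capturing handler, as below. This sketch assumes java.util.logging purely to stay self-contained (its SEVERE and WARNING levels stand in for ERROR and WARN); the project's actual logging framework is not stated in this design, and CapturingHandler is a hypothetical test helper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical test helper: keeps every published record in memory for assertions. */
final class CapturingHandler extends Handler {
    final List<LogRecord> records = new ArrayList<>();

    @Override
    public void publish(LogRecord record) {
        records.add(record);
    }

    @Override
    public void flush() {
        // Nothing is buffered, so there is nothing to flush.
    }

    @Override
    public void close() {
        // No resources to release.
    }

    /** Attaches a new handler to the logger and stops records reaching parent handlers. */
    static CapturingHandler attachTo(Logger logger) {
        CapturingHandler handler = new CapturingHandler();
        logger.setUseParentHandlers(false);
        logger.setLevel(Level.ALL);
        logger.addHandler(handler);
        return handler;
    }

    /** True if any captured record has this level and a message containing the fragment. */
    boolean contains(Level level, String fragment) {
        return records.stream().anyMatch(r ->
                r.getLevel().equals(level)
                        && r.getMessage() != null
                        && r.getMessage().contains(fragment));
    }
}
```

Asserting on stable fragments such as reportId=r-1 or a stage name, instead of the whole message text, keeps such tests from breaking when wording changes.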

Acceptance Criteria

  • Both OUTLOOK and ACHIEVEMENT report generation paths emit logs that include reportId and reportType.
  • Generation logs identify the stage where rendering, PDF generation, or upload fails.
  • Query and download logs distinguish expired, unavailable, missing-storage, and successful download cases.
  • Logs do not include full payloads, HTML content, PDF bytes, storage credentials, or account keys.
  • Existing tests for report creation, generation, query, download, and cleanup continue to pass.