log metrics around processing durations #217

Merged
merged 1 commit into from
Jun 4, 2025

Conversation

iuwqyir (Collaborator) commented Jun 3, 2025

TL;DR

Added performance metrics for database operations and Kafka publishing.

What changed?

  • Added Prometheus histogram metrics for key operations in the metrics package:
    • staging_insert_duration_seconds
    • main_storage_insert_duration_seconds
    • publish_duration_seconds
    • staging_delete_duration_seconds
    • get_block_numbers_to_commit_duration_seconds
    • get_staging_data_duration_seconds
  • Instrumented the Committer and Poller components with timing measurements
  • Added timing for the getBlockNumbersToCommit() function
  • Added timing for staging data retrieval in getSequentialBlockDataToCommit()
  • Added timing for main storage insertion, Kafka publishing, and staging data deletion in the commit() function
  • Added timing for staging data insertion in the handleWorkerResults() function
  • All timing logs use a consistent format with the "metric" field for easier filtering
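
The measure, log, and observe pattern described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Observer` interface is declared locally to mirror the `Observe(float64)` method of Prometheus histograms, and `recorder`, `timeOp`, the log line layout, and the value of the "metric" field are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Observer mirrors the single method Prometheus histograms expose for
// recording a value. It is declared locally so the sketch needs no
// third-party imports.
type Observer interface {
	Observe(seconds float64)
}

// recorder is a hypothetical in-memory stand-in for a histogram.
type recorder struct {
	samples []float64
}

func (r *recorder) Observe(seconds float64) {
	r.samples = append(r.samples, seconds)
}

// timeOp runs op, prints the elapsed time with a "metric" field, and records
// the duration in seconds. The duration is recorded even when op fails.
func timeOp(metric string, obs Observer, op func() error) error {
	start := time.Now()
	err := op()
	elapsed := time.Since(start)
	// Hypothetical log layout; the real logger and field values may differ.
	fmt.Printf("level=debug metric=%s duration=%s\n", metric, elapsed)
	obs.Observe(elapsed.Seconds())
	return err
}

func main() {
	hist := &recorder{}
	err := timeOp("staging_insert_duration", hist, func() error {
		time.Sleep(5 * time.Millisecond) // stand-in for a storage call
		return errors.New("insert failed")
	})
	fmt.Println("error:", err, "samples:", len(hist.samples))
}
```

Recording the duration before returning the error means failed operations show up in the histogram as well, which matters when slow calls and failing calls tend to be the same calls.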

How to test?

  1. Run the application with debug logging enabled
  2. Monitor the logs for entries with a "metric" field
  3. Verify that duration metrics are being reported for all operations
  4. Check the Prometheus endpoint to confirm metrics are being exposed correctly
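
Step 2 can be partly automated. The sketch below assumes the application emits one JSON object per log line, which this PR does not state; `metricLines` and the sample log lines are hypothetical. It keeps only the lines that carry a "metric" key.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// metricLines returns the log lines that parse as JSON objects and carry a
// "metric" key. Lines that are not valid JSON objects are skipped.
func metricLines(logs string) []string {
	var out []string
	scanner := bufio.NewScanner(strings.NewReader(logs))
	for scanner.Scan() {
		line := scanner.Text()
		var fields map[string]interface{}
		if err := json.Unmarshal([]byte(line), &fields); err != nil {
			continue
		}
		if _, ok := fields["metric"]; ok {
			out = append(out, line)
		}
	}
	return out
}

func main() {
	// Hypothetical log lines; the real field names and values may differ.
	logs := `{"level":"debug","metric":"staging_insert_duration","duration":"12ms"}
{"level":"info","message":"poller started"}
plain text line
{"level":"debug","metric":"publish_duration","duration":"48ms"}`
	for _, line := range metricLines(logs) {
		fmt.Println(line)
	}
}
```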

Why make this change?

This change adds detailed performance metrics for database operations and Kafka publishing, which will help identify bottlenecks in the data processing pipeline. By measuring the duration of key operations, we can better understand where performance issues might be occurring and optimize accordingly. The consistent logging format with the "metric" field makes it easier to filter and analyze these performance metrics in log aggregation tools.

Summary by CodeRabbit

  • New Features
    • Added detailed performance metrics and timing logs for key data handling and publishing operations, including staging and main storage inserts, publishing to Kafka, and data deletion.
    • Introduced Prometheus histogram metrics to monitor operation durations, enhancing observability of system performance.

coderabbitai bot commented Jun 3, 2025

Walkthrough

The changes introduce Prometheus histogram metrics to measure the duration of key data handling and publishing operations. Instrumentation is added throughout the orchestrator components to record, log, and observe the elapsed time for operations such as inserting, publishing, deleting, and retrieving data. No changes to control flow or public interfaces are made.

Changes

Files/Paths | Change Summary
internal/metrics/metrics.go | Added new Prometheus histogram metrics for operation durations (insert, publish, delete, etc.).
internal/orchestrator/committer.go | Instrumented key methods to measure, log, and observe durations for storage and publish ops.
internal/orchestrator/poller.go | Added timing, logging, and metric observation for staging data insertion in worker results.

Sequence Diagram(s)

sequenceDiagram
    participant Poller
    participant Metrics
    participant Storage

    Poller->>Storage: InsertStagingData()
    activate Poller
    Note right of Poller: Start timer
    Storage-->>Poller: Result
    Note right of Poller: Stop timer, log duration
    Poller->>Metrics: StagingInsertDuration.Observe(duration)
    deactivate Poller
sequenceDiagram
    participant Committer
    participant Metrics
    participant Storage
    participant Publisher

    Committer->>Storage: getBlockNumbersToCommit()
    activate Committer
    Note right of Committer: Start timer
    Storage-->>Committer: BlockNumbers
    Note right of Committer: Stop timer, log duration
    Committer->>Metrics: GetBlockNumbersToCommitDuration.Observe(duration)
    deactivate Committer

    Committer->>Storage: getSequentialBlockDataToCommit()
    activate Committer
    Note right of Committer: Start timer
    Storage-->>Committer: BlockData
    Note right of Committer: Stop timer, log duration
    Committer->>Metrics: GetStagingDataDuration.Observe(duration)
    deactivate Committer

    Committer->>Storage: InsertMainStorage()
    activate Committer
    Note right of Committer: Start timer
    Storage-->>Committer: InsertResult
    Note right of Committer: Stop timer, log duration
    Committer->>Metrics: MainStorageInsertDuration.Observe(duration)
    deactivate Committer

    Committer->>Publisher: PublishBlockDataAsync()
    activate Committer
    Note right of Committer: Start timer
    Publisher-->>Committer: PublishResult
    Note right of Committer: Stop timer, log duration
    Committer->>Metrics: PublishDuration.Observe(duration)
    deactivate Committer

    Committer->>Storage: DeleteStagingData()
    activate Committer
    Note right of Committer: Start timer
    Storage-->>Committer: DeleteResult
    Note right of Committer: Stop timer, log duration
    Committer->>Metrics: StagingDeleteDuration.Observe(duration)
    deactivate Committer

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 23884ad and 65f8943.

📒 Files selected for processing (3)
  • internal/metrics/metrics.go (1 hunks)
  • internal/orchestrator/committer.go (3 hunks)
  • internal/orchestrator/poller.go (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
internal/orchestrator/poller.go (1)
internal/metrics/metrics.go (1)
  • StagingInsertDuration (111-115)
🔇 Additional comments (7)
internal/orchestrator/poller.go (1)

238-251: Excellent timing instrumentation implementation.

The timing measurement correctly captures the complete duration of the InsertStagingData operation, including error handling. The consistent logging format with the "metric" field will facilitate easy filtering in log aggregation tools as intended.

internal/orchestrator/committer.go (5)

83-87: Excellent use of defer pattern for timing measurement.

Using defer with an anonymous function ensures the timing is captured regardless of how the function exits, including early returns. This is a best practice for instrumentation.
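
A minimal sketch of that defer pattern, under the assumption that the timed function has at least one early return. The function `blockNumbers` and the `observed` slice are hypothetical stand-ins for the real method and histogram.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// observed collects recorded durations in seconds; a hypothetical stand-in
// for a histogram's Observe method.
var observed []float64

// blockNumbers is a hypothetical function with an early return. The deferred
// closure runs on every exit path, so both paths are timed.
func blockNumbers(fail bool) ([]int, error) {
	start := time.Now()
	defer func() {
		// time.Since is evaluated here, when the function exits.
		observed = append(observed, time.Since(start).Seconds())
	}()

	if fail {
		return nil, errors.New("storage unavailable") // early return is still timed
	}
	return []int{1, 2, 3}, nil
}

func main() {
	_, err := blockNumbers(true)
	nums, _ := blockNumbers(false)
	fmt.Println(err, nums, len(observed))
}
```

The anonymous function matters because Go evaluates the arguments of a deferred call at the point of the `defer` statement. Writing `defer record(time.Since(start))` directly would capture a near-zero duration, whereas wrapping the call in a closure postpones `time.Since` until the function returns.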


126-129: Proper timing implementation for staging data retrieval.

The timing correctly measures the duration of fetching staging data and follows the consistent logging pattern with the "metric" field.


181-187: Correct timing measurement for main storage insertion.

The instrumentation properly captures the duration of the critical main storage operation with appropriate logging and metrics collection.


189-195: Proper async operation timing implementation.

The timing correctly captures the publishStart time before launching the goroutine, then measures the actual duration of the async publishing operation inside the goroutine. This accurately measures the real publishing duration rather than just the goroutine launch overhead.
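
The async timing described here might look roughly like the sketch below. Only the name `publishStart` comes from the review comment; `publishAsync`, `timedPublishSeconds`, and the use of a `sync.WaitGroup` to wait for completion are assumptions made so the example can run on its own.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// publishAsync launches publish in a goroutine and reports how long the
// publish took, measured from a start time captured before the launch.
func publishAsync(publish func(), report func(time.Duration)) *sync.WaitGroup {
	wg := &sync.WaitGroup{}
	wg.Add(1)
	publishStart := time.Now() // captured before the goroutine is launched
	go func() {
		defer wg.Done()
		publish()
		report(time.Since(publishStart)) // measured after publishing finishes
	}()
	return wg
}

// timedPublishSeconds runs a simulated publish that sleeps for the given
// number of milliseconds and returns the recorded duration in seconds.
func timedPublishSeconds(sleepMillis int) float64 {
	var seconds float64
	wg := publishAsync(
		func() { time.Sleep(time.Duration(sleepMillis) * time.Millisecond) },
		func(d time.Duration) { seconds = d.Seconds() },
	)
	wg.Wait() // report has run before Done, so reading seconds is safe
	return seconds
}

func main() {
	fmt.Printf("publish took %.3fs\n", timedPublishSeconds(20))
}
```

Measuring inside the goroutine records the full publish time. Taking `time.Since(publishStart)` right after the `go` statement would instead record only the cost of launching the goroutine.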


198-203: Appropriate timing for staging data cleanup.

The timing measurement correctly captures the duration of the staging data deletion operation, completing the comprehensive instrumentation of the commit process.

internal/metrics/metrics.go (1)

111-146: Well-designed Prometheus histogram metrics for operation durations.

The new metrics follow excellent practices:

  • Consistent naming convention with clear "_duration_seconds" suffix
  • Descriptive help text for each operation
  • Appropriate use of prometheus.DefBuckets for duration measurements
  • Comprehensive coverage of all major operations in the data processing pipeline

These metrics will provide valuable insights for performance monitoring and bottleneck identification.
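
To make the bucket choice concrete, the sketch below counts samples into cumulative buckets using the same upper bounds as prometheus.DefBuckets. It illustrates bucket semantics only and is not the library's implementation; `defBuckets` and `cumulativeCounts` are hypothetical names.

```go
package main

import "fmt"

// defBuckets mirrors the upper bounds (in seconds) of prometheus.DefBuckets.
var defBuckets = []float64{.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10}

// cumulativeCounts returns, for each bucket upper bound, how many samples are
// less than or equal to that bound. Samples above the last bound count only
// toward the implicit +Inf bucket, whose count equals len(samples).
func cumulativeCounts(samples []float64) []int {
	counts := make([]int, len(defBuckets))
	for _, s := range samples {
		for i, upper := range defBuckets {
			if s <= upper {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	// Hypothetical durations in seconds; the last one exceeds every bound.
	samples := []float64{0.003, 0.04, 0.3, 12}
	fmt.Println(cumulativeCounts(samples))
}
```

Because the largest default bound is 10 seconds, any operation slower than that is visible only in the +Inf bucket. If the instrumented operations regularly exceed that, custom buckets may give better resolution.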

Collaborator Author

iuwqyir commented Jun 3, 2025

iuwqyir force-pushed the 06-04-log_metrics_around_processing_durations branch from 2e4cbfa to e88726a, June 3, 2025 21:57
iuwqyir marked this pull request as ready for review, June 3, 2025 21:57
iuwqyir force-pushed the 06-04-log_metrics_around_processing_durations branch 2 times, most recently from 273702b to c17f155, June 4, 2025 12:21
iuwqyir force-pushed the 06-04-track_latest_block_immediately_on_start branch from 18cd55e to ab704ec, June 4, 2025 12:21
iuwqyir changed the base branch from 06-04-track_latest_block_immediately_on_start to graphite-base/217, June 4, 2025 13:33
iuwqyir force-pushed the 06-04-log_metrics_around_processing_durations branch from c17f155 to a4ab66f, June 4, 2025 13:33
graphite-app bot changed the base branch from graphite-base/217 to main, June 4, 2025 13:34
iuwqyir force-pushed the 06-04-log_metrics_around_processing_durations branch from a4ab66f to 65f8943, June 4, 2025 13:34
iuwqyir merged commit 8e3d75c into main, Jun 4, 2025
6 checks passed
iuwqyir deleted the 06-04-log_metrics_around_processing_durations branch, June 4, 2025 13:41
2 participants