Skip to content

[ENH] Configure the cache to have a hostPath and mountPath for log service. #4483

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 8, 2025

Conversation

rescrv
Copy link
Contributor

@rescrv rescrv commented May 7, 2025

Description of changes

This adds defaults to values.yaml that make k8s instantiate the log with a cache dir mounted.

Test plan

CI + eyes.

Documentation Changes

N/A

…rvice.

This adds defaults to values.yaml that make k8s instantiate the log with a cache dir mounted.
@rescrv rescrv requested a review from jasonvigil May 7, 2025 21:15
Copy link

github-actions bot commented May 7, 2025

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

Copy link
Contributor

propel-code-bot bot commented May 7, 2025

This PR adds cache configuration for the Rust log service in the Kubernetes deployment by introducing hostPath and mountPath parameters in values.yaml. It also bumps the Helm chart version accordingly from 0.1.38 to 0.1.39.

This summary was automatically generated by @propel-code-bot

Copy link

github-actions bot commented May 7, 2025

✅ The Helm chart's version was changed. Your changes to the chart will be published upon merge to main.

@@ -51,6 +51,10 @@ rustLogService:
image:
repository: 'rust-log-service'
tag: 'latest'
env:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[DataTypeCheck]

The env: key on line 54 implies an empty map ({}). However, other services in this file (e.g., sysdb, logService) define env as a list of name/value pairs. If Helm templates expect a list (e.g., for iteration with range), this type mismatch can cause errors or unexpected behavior. For an empty list of environment variables, it should be env: [] to maintain consistency and ensure correct template processing.

Suggested change
env:
env: []

Committable suggestion

Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this suggestion. Though, why do we need the env field here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yaml cargo culting on my part. I'll remove.

@@ -51,6 +51,10 @@ rustLogService:
image:
repository: 'rust-log-service'
tag: 'latest'
env:
cache:
hostPath: '/local/cache/chroma-log-service'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[ArchitectureDecision]

Using hostPath for the cache volume (line 56) ties the cache data to a specific node. In a multi-node Kubernetes cluster, if the pod is rescheduled to a different node, the cache data will be lost or become inconsistent. For production deployments or multi-node environments requiring persistent cache, consider using PersistentVolumeClaim (PVC) for more robust and portable cache storage. hostPath is generally more suitable for single-node clusters or local development setups where node affinity is guaranteed and simplicity is prioritized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree with this suggestion, but understand that we have made this trade-off for performance reasons.

Copy link
Collaborator

@HammadB HammadB May 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think we can have cache coherency across deploys another way we can make that change. it’s possible to not use hostpath but still mount the local disk. We should probably do that, i think it was just some clunkiness on our deployed env that made it hard

Alternatively if you have a system design that doesn’t need cache coherency would love to hear about it.

@rescrv rescrv merged commit 6339856 into main May 8, 2025
70 checks passed
@rescrv rescrv deleted the rescrv/dangling-mount branch May 8, 2025 16:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants