H2.0 runner memory optimization spike #5256
Comments
So the harvester never made it past the compare stage. It ran for 5.1 hours, and the memory usage climbed slowly and consistently. You can see that usage here.
To get the largest sources working, we will probably need at least 4 GB. However, most jobs probably don't need that much. I'd like to have a working session next week to consider implementing t-shirt sizing of harvest sources so we can size the jobs accordingly. The logic will be a bit more complex, but not prohibitively so.
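As a rough illustration of the t-shirt sizing idea, jobs could pick a memory allocation from the source's approximate record count. The thresholds, labels, and function name below are all illustrative assumptions, not the actual implementation:

```python
# Hypothetical t-shirt sizing for harvest jobs: map a source's record
# count to a memory allocation. Thresholds here are assumptions.
SIZES = [
    (1_000, "S", "512M"),    # small sources
    (10_000, "M", "1G"),     # typical sources
    (100_000, "L", "2G"),    # large sources
]

def job_memory(record_count: int) -> str:
    """Return the memory allocation for a harvest job of record_count records."""
    for threshold, _label, memory in SIZES:
        if record_count <= threshold:
            return memory
    return "4G"  # XL: the largest sources (IOOS-scale) need at least 4 GB

print(job_memory(500))      # -> 512M
print(job_memory(250_000))  # -> 4G
```

The bucket boundaries would need to be calibrated against real per-source memory measurements from this spike.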
Unfortunately, this job has still not finished. Worse, it hasn't progressed: its memory usage hasn't changed significantly since 4pm yesterday. New Relic logs show that the task never made it past the external records prep. However, as of this writing the task is still running and holding memory. I've opened #5261 to follow up on this spike, as I don't believe the current solution is tenable for a large source like IOOS. We will discuss in office hours whether the proposed sketch is appropriate, along with other possible optimizations.
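One optimization worth considering for the compare stage (this is a sketch under assumptions, not the harvester's actual code): stream external records through a per-record content hash instead of holding the full source in memory, so only identifiers and digests accumulate. All names below are hypothetical:

```python
import hashlib
import json
from typing import Iterable, Iterator

def record_hashes(records: Iterable[dict]) -> Iterator[tuple[str, str]]:
    """Yield (identifier, content-hash) pairs one record at a time,
    so the compare stage never holds every full record in memory."""
    for record in records:
        payload = json.dumps(record, sort_keys=True).encode()
        yield record["identifier"], hashlib.sha256(payload).hexdigest()

def diff(external: Iterable[dict], internal: dict[str, str]) -> dict:
    """Compare streamed external records against stored (id -> hash) pairs."""
    changes = {"create": [], "update": [], "delete": []}
    seen = set()
    for rid, digest in record_hashes(external):
        seen.add(rid)
        if rid not in internal:
            changes["create"].append(rid)
        elif internal[rid] != digest:
            changes["update"].append(rid)
    changes["delete"] = [rid for rid in internal if rid not in seen]
    return changes
```

With this shape, memory grows with the number of identifiers rather than the size of the serialized records, which is the failure mode the New Relic graph suggests.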
Purpose
We want to optimize memory usage on cloud.gov, but we're not sure what the current process requires from the system.
Given the above question, testing is needed to provide factual knowledge for deciding future steps.
1 day of effort has been allocated; once complete, findings will be demonstrated and specific future actions will be decided.
Acceptance Criteria
WHEN 1 day expires
THEN the amount of required memory to successfully harvest the source is known
AND a memory increase recommendation is made
AND any optimization/fixes are proposed (if necessary)
Background
See https://datagov-harvest-admin-dev.app.cloud.gov/harvest_source/554d15db-6080-4441-b4b2-d045451d6967, currently crashing regularly.
May want to investigate (if memory usage is high) what a "typical" source's memory requirements are, and consider an S/M/L approach.
May relate or be blocked by #5254
Sketch
Start at 2G, and increase as failures occur. Report success when import starts. Review code for possible optimizations.
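Since cloud.gov runs on Cloud Foundry, the starting allocation could be expressed in the app manifest; this is a hypothetical fragment (the application name is an assumption), bumped as OOM failures occur:

```yaml
# Hypothetical Cloud Foundry manifest fragment for the harvest runner.
# Start at 2G per the sketch; increase (e.g. to 4G) if the job is OOM-killed.
applications:
  - name: harvest-runner   # app name assumed, not verified
    memory: 2G
```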