eager scaling strategy for ScaledJob does not work as documented (or intended?) #6416

Open
@chinery

Report

This form prompts me to be clear and concise; I will try to be very clear, but I fear this will not be very concise (apologies).

I was trying to understand the difference between the default and eager scaling strategies of ScaledJob (see https://keda.sh/docs/2.16/reference/scaledjob-spec/#scalingstrategy)

In short

  • I do not think the documentation on the eager strategy is correct, either about the behaviour of default or the behaviour of eager
  • I think the implementation of eager may be bugged, but it's hard to tell what the intended behaviour is, since the behaviour the docs describe for eager seems to be what default already does

The documented behaviour

  • the ScaledJob spec has this phrase “The number of the scale” – the number of jobs that will be created on a given poll
  • a key point is that the scaling behaviour differs from a ScaledObject: for example, with a Deployment, if you have 5 items on the queue (in progress), you need to set the number of replicas to 5 (setting it lower would shut down running pods). But jobs are not managed after creation, so if there are 5 jobs running and 5 jobs on the queue, then (normally) the correct number of jobs to create is zero (except when jobs are consumed from the queue, in which case the accurate strategy is required)
  • when scaling strategy is set to default, this is calculated as maxScale - runningJobCount,
    where maxScale = min(scaledJob.MaxReplicaCount(), divideWithCeil(queueLength, targetAverageValue))
  • the section about the eager scaling strategy does not exactly explain how it differs, only that it makes up for an issue you might hit with default. There is an example where the maximum replica count is 10, the target average value is 1, and the sequence is: submit 3 jobs, poll, submit another 3 jobs, poll. It gives this table:

With the default scaling strategy, we are supposed to see the metrics changes in the following table:

|             | initial | incoming 3 messages | after poll | incoming 3 messages | after poll |
|-------------|---------|---------------------|------------|---------------------|------------|
| queueLength | 0       | 3                   | 3          | 6                   | 6          |
| runningJobs | 0       | 0                   | 3          | 3                   | 3          |
  • the final column, to my understanding, is incorrect. After the second poll, using the formulas above:
    maxScale = min(10, ceil(6 / 1)) = 6
    so "the number of the scale" = maxScale - runningJobCount = 6 - 3 = 3
    so 3 new jobs will be created, meaning the total of running jobs is now 6
    which is working as intended (a small sketch of this arithmetic follows this list)
  • The second table in that section goes on to show that it is actually the eager strategy which has 6 running jobs after the poll – I'll come to what eager actually does in a later section but I believe this is incorrect also.
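
To make that arithmetic concrete, here is a minimal standalone sketch of the documented default formula (my own paraphrase of the docs, not KEDA code; the function name is mine):

```go
package main

import (
	"fmt"
	"math"
)

// defaultJobsToCreate reproduces the documented default strategy:
//   maxScale = min(maxReplicaCount, ceil(queueLength / targetAverageValue))
// and the number of jobs to create is maxScale - runningJobCount.
func defaultJobsToCreate(queueLength, targetAverageValue float64, maxReplicaCount, runningJobCount int64) int64 {
	maxScale := int64(math.Ceil(queueLength / targetAverageValue))
	if maxScale > maxReplicaCount {
		maxScale = maxReplicaCount
	}
	return maxScale - runningJobCount
}

func main() {
	// Second poll of the docs example: queueLength=6, targetAverageValue=1,
	// maxReplicaCount=10, runningJobs=3.
	fmt.Println(defaultJobsToCreate(6, 1, 10, 3)) // 3 → three new jobs → 6 running in total
}
```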

The intended behaviour

The documentation also suggests reading the initial suggestion here: #5114

I don't want to offend or misconstrue anyone here, so please don't take any of this as criticism, just trying to untangle the web – please correct me if I've misunderstood anything.

It seems to me that @junekhan may have misunderstood the behaviour of "the number of the scale", and thought that it would scale like a Deployment (where, in the example above, a scale of 3 would mean only 3 running jobs after the poll, instead of 3 new jobs). My evidence is this comment:

> junekhan commented on Oct 27, 2023:
> Getting back to ScaledJob, let's imagine a case with 3 running pods and another 3 messages standing in line, and each of them takes 3 hours or even longer to run. Does it sound better if we empty the queue and run 6 pods in parallel within our affordable limit which is 10 replicas?

But this is already the behaviour of default. @zroubalik replied saying this behaviour should be added; the pull request was later made by @junekhan and the documentation added by @zroubalik.

It's possible that some miscommunication happened here, so I also wanted to work out what the eager strategy does, in case I misunderstood the intention, and it is simply the documentation that needs updating.

The actual behaviour

Here I will try to narrate the sequence of logic through the code that explains how the two strategies work. I hope you can follow it – I have tried to include only the relevant details: function names, parameter names, return value names, code/pseudocode behaviour, and some commentary (in italics). The function names link to the code with line numbers. I will also include the example values from earlier.

  • in checkScalers
    isActive, isError, scaleTo, maxScale := h.isScaledJobActive(ctx, obj)
    • in isScaledJobActive
      isActive, queueLength, maxValue, maxFloatValue := scaledjob.IsScaledJobActive(scalersMetrics, scaledJob.Spec.ScalingStrategy.MultipleScalersCalculation, scaledJob.MinReplicaCount(), scaledJob.MaxReplicaCount())
      • in IsScaledJobActive
        sum/max/avg over each metric:
            queueLength = metric.QueueLength
            maxValue = metric.MaxValue
        
        (an aside: where do the metrics get their values for QueueLength/MaxValue?)
        - in CalculateQueueLengthAndMaxValue
        for each metric:
            queueLength += metricValue
        targetAverageValue = getTargetAverageValue(metricSpecs)
        averageLength := queueLength / targetAverageValue
        maxValue = min(averageLength, maxReplicaCount)
        
        (getTargetAverageValue gets the target value from the trigger, so for our example targetAverageValue=1, queueLength=6, maxReplicaCount=10, and so maxValue=6. Worth noting that queueLength is not divided by targetAverageValue; it is the raw length)
      • (back inside IsScaledJobActive)
        maxValue = min(maxValue, maxReplicaCount)
        return isActive, ceilToInt64(queueLength), ceilToInt64(maxValue), maxValue
    • (so IsScaledJobActive returns queueLength=6 and maxValue=6)
    • (and isScaledJobActive returns them in this order: isActive, isError, queueLength, maxValue)
  • (checkScalers assigns these to isActive, isError, scaleTo, maxScale, so scaleTo=queueLength=6, maxScale=maxValue=6)
    h.scaleExecutor.RequestJobScale(ctx, obj, isActive, isError, scaleTo, maxScale)
    • RequestJobScale
      effectiveMaxScale, scaleTo := e.getScalingDecision(scaledJob, runningJobCount, scaleTo, maxScale, pendingJobCount, logger)
      • getScalingDecision
        (this is where it forks based on scaling strategy)
        effectiveMaxScale, scaleTo = NewScalingStrategy(logger, scaledJob).GetEffectiveMaxScale(maxScale, runningJobCount-minReplicaCount, pendingJobCount, scaledJob.MaxReplicaCount(), scaleTo)
        and the definition of GetEffectiveMaxScale: GetEffectiveMaxScale(maxScale, runningJobCount, pendingJobCount, maxReplicaCount, scaleTo int64) (int64, int64)
        (example: maxScale=6, runningJobCount=3, minReplicaCount=0, pendingJobCount=0, scaledJob.MaxReplicaCount()=10, scaleTo=6)
        • default
          return maxScale - runningJobCount, scaleTo
          (so this returns (3, 6))
        • eager
          return min(maxReplicaCount-runningJobCount-pendingJobCount, maxScale), maxReplicaCount
          (so this returns (min(7, 6), 10)=(6, 10))
      • return effectiveMaxScale, scaleTo
    • (finally RequestJobScale calls e.createJobs)
    • e.createJobs(ctx, logger, scaledJob, scaleTo, effectiveMaxScale)
      with signature: createJobs(..., scaleTo int64, maxScale int64) (so effectiveMaxScale is now maxScale)
      • and this does:
      if maxScale <= 0: return
      if scaleTo > maxScale: scaleTo = maxScale
      generate scaleTo jobs
      
      so for our example values,
      - default: maxScale = 3, scaleTo = 6, so this generates 3 jobs
      - eager: maxScale = 6, scaleTo = 10, so this generates 6 jobs
      (a standalone sketch replaying this calculation follows this list)
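
To check my reading, here is that sketch: it replays the example for both strategies. It is my own paraphrase of the logic narrated above, not the actual KEDA functions; the helper names and the poll loop are mine, and minReplicaCount is 0 as in the example.

```go
package main

import "fmt"

// effectiveMaxScale paraphrases the strategy branch described above
// (minReplicaCount is 0, so runningJobCount is passed as-is):
//   default: (maxScale - runningJobCount, scaleTo)
//   eager:   (min(maxReplicaCount - runningJobCount - pendingJobCount, maxScale), maxReplicaCount)
func effectiveMaxScale(strategy string, maxScale, runningJobCount, pendingJobCount, maxReplicaCount, scaleTo int64) (int64, int64) {
	if strategy == "eager" {
		capacity := maxReplicaCount - runningJobCount - pendingJobCount
		if capacity < maxScale {
			return capacity, maxReplicaCount
		}
		return maxScale, maxReplicaCount
	}
	return maxScale - runningJobCount, scaleTo
}

// jobsToCreate paraphrases createJobs: bail out if maxScale <= 0, clamp scaleTo to maxScale.
func jobsToCreate(scaleTo, maxScale int64) int64 {
	if maxScale <= 0 {
		return 0
	}
	if scaleTo > maxScale {
		scaleTo = maxScale
	}
	return scaleTo
}

func main() {
	const (
		maxReplicaCount    int64 = 10
		targetAverageValue int64 = 1
	)
	for _, strategy := range []string{"default", "eager"} {
		var running, queueLength int64
		// three polls: +3 messages, +3 messages, then no new messages
		for poll, incoming := range []int64{3, 3, 0} {
			queueLength += incoming                      // jobs are long-running, so messages stay on the queue, as in the docs example
			maxScale := queueLength / targetAverageValue // ceil omitted; exact for these values
			if maxScale > maxReplicaCount {
				maxScale = maxReplicaCount
			}
			scaleTo := queueLength // the raw queue length, as returned by isScaledJobActive
			effMax, scaleTo := effectiveMaxScale(strategy, maxScale, running, 0, maxReplicaCount, scaleTo)
			created := jobsToCreate(scaleTo, effMax)
			running += created
			fmt.Printf("%-7s poll %d: created %d, running %d\n", strategy, poll+1, created, running)
		}
	}
}
```

Worked through by hand, this gives default → 3, 3, 0 jobs created across the three polls (6 running in total) and eager → 3, 6, 1 (10 running in total), which matches the numbers below.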

After the second poll in our example, the eager strategy will have 9 jobs. On the third poll, assuming no new jobs, it will create 1 more job and hit the maximum, since that is maxReplicaCount-runningJobCount.

I am not sure what scaleTo is doing in this calculation. It is set to the raw queue length, unmodified by targetAverageValue, maxReplicas, or runningJobs. I can't immediately see any scenario where scaleTo < maxScale (for the default branch, since maxScale is at most ceil(queueLength / targetAverageValue), that would seem to require a targetAverageValue below 1; for eager, scaleTo is maxReplicaCount, which the effective max can never exceed), meaning createJobs will always just use the value of maxScale for the number of jobs to create.

Regardless, my conclusion about the behaviour of the eager strategy is that it does what @JorTurFer asked for in the discussion: it scales up until it hits the maximum whenever the queue is non-empty. However, the rate of scaling depends on the number of items in the queue. I'm still not sure whether this is the intended behaviour – I think it could be achieved more directly with a scaling strategy like
if maxScale > 0 return maxReplicaCount else 0
and there wouldn't be a slow ramp up, but perhaps the ramp is desirable.
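
To illustrate, here is a rough sketch of that alternative in the shape of the GetEffectiveMaxScale signature narrated above (my own illustration, not KEDA code; the subtraction of running and pending jobs is my addition so the total never exceeds maxReplicaCount):

```go
package main

import "fmt"

// eagerAllAtOnce: if anything is on the queue, fill all remaining capacity
// in a single poll instead of ramping up over several polls.
func eagerAllAtOnce(maxScale, runningJobCount, pendingJobCount, maxReplicaCount int64) (effectiveMaxScale, scaleTo int64) {
	if maxScale <= 0 {
		return 0, 0 // queue is empty: create no jobs
	}
	return maxReplicaCount - runningJobCount - pendingJobCount, maxReplicaCount
}

func main() {
	// Second poll of the example: maxScale=6, running=3, pending=0, cap=10.
	effMax, scaleTo := eagerAllAtOnce(6, 3, 0, 10)
	fmt.Println(effMax, scaleTo) // 7 10 → createJobs would clamp to 7 and reach the cap in one poll
}
```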

Expected Behavior

Expected default to have 3 running jobs, and eager to have 6 running jobs

Actual Behavior

default has 6 running jobs, eager has 9 running jobs

Steps to Reproduce the Problem

See above

Logs from KEDA operator

No response

KEDA Version

2.16.0

Kubernetes Version

None

Platform

None

Scaler Details

No response

Anything else?

No response
