Skip to content

Callculation::resizeMatricesExtend() overwrites all rows in the smaller matrix. #4451

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
mklepaczewski opened this issue Apr 24, 2025 · 3 comments · May be fixed by #4474
Open

Callculation::resizeMatricesExtend() overwrites all rows in the smaller matrix. #4451

mklepaczewski opened this issue Apr 24, 2025 · 3 comments · May be fixed by #4474

Comments

@mklepaczewski
Copy link

The bug

Calling Callculation::resizeMatricesExtend() with matrices of different sizes overwrites all rows in the smaller matrix with the value found in the last row. I haven't found the expected behaviour of the method (comments of the method don't specify it), but I think the idea was to extend the smaller matrix with values found in the last row, not to overwrite all of it with the value found in the last row.

Example:

<?php
declare(strict_types=1);

require __DIR__ . '/vendor/autoload.php';

use PhpOffice\PhpSpreadsheet\Calculation\Calculation;

// Sample matrices to test with
$matrix1 = [[1], [3]];
$matrix2 = [[5], [8], [11]];

// Use reflection to make the protected method accessible
$calculation = new Calculation();
$reflectionMethod = new ReflectionMethod(Calculation::class, 'resizeMatricesExtend');
$reflectionMethod->setAccessible(true);

// Call the method using reflection
$reflectionMethod->invokeArgs($calculation, [&$matrix1, &$matrix2, count($matrix1), 1, count($matrix2), 1]);

echo "Matrix 1 after:\n";
print_r($matrix1);

(Presumably) Expected output:

$matrix1 = [ [1], [3], [3] ];

Actual output

$matrix1 = [ [3], [3], [3] ];

Relevant code:

if ($matrix1Rows < $matrix2Rows) {
$x = $matrix1[$matrix1Rows - 1];
for ($i = 0; $i < $matrix2Rows; ++$i) {
$matrix1[$i] = $x;
}
}

Note that the analogous code for columns does not overwrite the previous columns, it extends them with the last value:

if ($matrix1Columns < $matrix2Columns) {
for ($i = 0; $i < $matrix1Rows; ++$i) {
$x = $matrix1[$i][$matrix1Columns - 1];
for ($j = $matrix1Columns; $j < $matrix2Columns; ++$j) {
$matrix1[$i][$j] = $x;
}
}
}

Proposed change

            if ($matrix1Rows < $matrix2Rows) {
                $x = $matrix1[$matrix1Rows - 1];
                for ($i = matrix1Rows ; $i < $matrix2Rows; ++$i) {
                    $matrix1[$i] = $x;
                }
            }

Please note that analogous change should be done for case where $matrix2Rows < $matrix1Rows.

I can create a PR if the change is accepted.

Why it matters

We found that some formulas that work for us in Excel result in #CALC! in PhpSpreadsheet. As an example, the following type of formulas result in #CALC!: =FILTER(Products!M:M;(Products!K:K=D5)*(Products!L:L=E5)). Products!K:K=D5 and Products!L:L=E5 generate matrices of different sizes (for whatever reason). PhpSpreadsheet extends the smaller matrix and overwrites all of its values. In effect, FILTER() fails to match any rows and results in #CALC!

Expected result

I don't know if the above behaviour is expected, but I suspect it's a bug. I haven't found any description of this behaviour. If it's not a bug, documenting it might be a good idea.

@oleibman
Copy link
Collaborator

Yes, please go ahead and create a PR with your changes and new test cases, and I will evaluate it. It would be preferable for your test cases to run in the context of a spreadsheet (e.g. your #CALC! formulas) rather than, or at least in addition to, your Reflection example. Feel free to add as many comments as you think the method requires.

@oleibman
Copy link
Collaborator

oleibman commented May 13, 2025

I have looked into this some more, and will probably create a PR on my own. There's a lot going on here.

Extending the last row or column doesn't really make sense to me. I think this was probably intended to expand a 1*1 matrix to the size of is corresponding operand. I could easily be wrong, but let's go with that for now. In that case, what we want to do is repeat the last row (if there is one row) or column (if there is one column), or otherwise null-fill. This seems to not break anything, and get a reasonable result for your test case.

I admit that I'm not entirely sure what a reasonable result for your test case is. If you use the entire column, PhpSpreadsheet will wind up using 1,048,576-element arrays, which, given enough time and memory (which you probably don't have), will probably work. So I reformulated your formula as:

=FILTER(Products!M1:M9;(Products!K1:K8=D5)*(Products!L1:L9=E5))

Note that the K array has 1 fewer entry than M or L. This matches your symptom (CALC error) with the old logic, but the new logic handles it "correctly". At least it probably meets your expectations. But ... using that formula in Excel results in an error (I've seen VALUE and N/A and I think CALC but I can't remember how to duplicate that)! So perhaps the original result was correct, for the wrong reason. I need time to think about how this should really work.

@mklepaczewski
Copy link
Author

mklepaczewski commented May 13, 2025

I'm sorry for not creating the PR yet. Life happened ;-)

I admit that I'm not entirely sure what a reasonable result for your test case is.

I don't know if compatibility with Excel is a goal of this project, but as a user, that would be my expectation. If I remember correctly, Excel handles it by repeating the last row/column. What I'm sure of is that overwriting the smaller matrix is not the expected result.

Please don't worry about my case. We'll manage it on our end.

Thank you for the hard work you put into the project!

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue May 17, 2025
Fix PHPOffice#4451. The reporter's analysis is correct - the method has a bug. The solution is relatively easy. Nevertheless, proving that the fix works as desired is tricky. The submitter's Reflection test is good, but it would be better to find one within Excel. My additional tests are contrived, for reasons explained in the test member. So, I will leave this as a work in progress while I search for something better. If that search doesn't succeed, I will nevertheless merge this in about a month.
@oleibman oleibman linked a pull request May 17, 2025 that will close this issue
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

2 participants