-
Notifications
You must be signed in to change notification settings - Fork 316
fix: fix rows returned when both start_index and page_size are provided #2181
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
79e5fd9
810e8dd
c9883a5
ab099c2
90115a3
6c77ada
d75c190
a41be58
e33e02d
1689211
34a9243
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1682,6 +1682,78 @@ def test_result_with_start_index(self): | |
tabledata_list_request[1]["query_params"]["maxResults"], page_size | ||
) | ||
|
||
def test_result_with_start_index_multi_page(self): | ||
# When there are multiple pages of response and the user has set | ||
# start_index, we should supply start_index to the server in the first | ||
# request. However, in the subsequent requests, we will pass only | ||
# page_token but not start_index, because the server only allows one | ||
# of them. | ||
from google.cloud.bigquery.table import RowIterator | ||
|
||
query_resource = { | ||
"jobComplete": True, | ||
"jobReference": {"projectId": self.PROJECT, "jobId": self.JOB_ID}, | ||
"schema": {"fields": [{"name": "col1", "type": "STRING"}]}, | ||
"totalRows": "7", | ||
} | ||
|
||
# Although the result has 7 rows, the response only returns 6, because | ||
# start_index is 1. | ||
tabledata_resource_1 = { | ||
"totalRows": "7", | ||
"pageToken": "page_token_1", | ||
"rows": [ | ||
{"f": [{"v": "abc"}]}, | ||
{"f": [{"v": "def"}]}, | ||
{"f": [{"v": "ghi"}]}, | ||
], | ||
} | ||
tabledata_resource_2 = { | ||
"totalRows": "7", | ||
"pageToken": None, | ||
"rows": [ | ||
{"f": [{"v": "jkl"}]}, | ||
{"f": [{"v": "mno"}]}, | ||
{"f": [{"v": "pqe"}]}, | ||
], | ||
} | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please go through the rest of this test and provide some additional comments. Clarify why we are using
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Done.
There are comments above the code block explaining that these are the first and the second requests that the client sends to the server. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am gonna drop this comment. |
||
connection = make_connection( | ||
query_resource, tabledata_resource_1, tabledata_resource_2 | ||
) | ||
client = _make_client(self.PROJECT, connection=connection) | ||
resource = self._make_resource(ended=True) | ||
job = self._get_target_class().from_api_repr(resource, client) | ||
|
||
start_index = 1 | ||
page_size = 3 | ||
|
||
result = job.result(page_size=page_size, start_index=start_index) | ||
|
||
self.assertIsInstance(result, RowIterator) | ||
self.assertEqual(result.total_rows, 7) | ||
|
||
rows = list(result) | ||
|
||
self.assertEqual(len(rows), 6) | ||
self.assertEqual(len(connection.api_request.call_args_list), 3) | ||
|
||
# First call has both startIndex and maxResults. | ||
tabledata_list_request_1 = connection.api_request.call_args_list[1] | ||
self.assertEqual( | ||
tabledata_list_request_1[1]["query_params"]["startIndex"], start_index | ||
) | ||
self.assertEqual( | ||
tabledata_list_request_1[1]["query_params"]["maxResults"], page_size | ||
) | ||
|
||
# Second call only has maxResults. | ||
tabledata_list_request_2 = connection.api_request.call_args_list[2] | ||
self.assertFalse("startIndex" in tabledata_list_request_2[1]["query_params"]) | ||
self.assertEqual( | ||
tabledata_list_request_2[1]["query_params"]["maxResults"], page_size | ||
) | ||
|
||
def test_result_error(self): | ||
from google.cloud import exceptions | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we mention sending
page_token
, but nowhere in the test do we confirm thatpage_token
is set or that it has been sent, etc.In fact, while it is correct, with this language in place, it seems weird that in
tabledata_resource_2
,pagetoken
is set toNone
. The above language implies that we might send something besidesNone
.Should we revise the language slightly?
Do we mean some other argument/attribute besides
page_token
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the last message, the page token will be empty because there is no following page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also,
tabledata_resource_2
is what the server returns, not what we pass to the server. The process of sending the page token happens withinresult = job.result(page_size=page_size, start_index=start_index)
and I don't think it's realistic to expose that in unit test.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am gonna drop this comment.