Skip to content

Refactor get_github_auth_token #71

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 16, 2021

Conversation

coni2k
Copy link
Contributor

@coni2k coni2k commented Jan 12, 2021

I realized that it's possible to use multiple auth tokens for GitHub, which is a life-saver. Thanks for that ✌
Now generate script takes much less time to run.

While I was using multiple tokens, I noticed that get_github_auth_token function doesn't take the shortest wait time but just uses the wait time of the last token. Although not sure how much this would improve the process but I wanted to try refactoring it.

Here is the summary of this update:

  • Introduced _CACHED_GITHUB_TOKENS global variable that stores token, token_obj and reset data as a list
  • get_github_auth_token returns an object from this list
  • While searching for the next active token, it ignores the already expired tokens
  • If all tokens are expired, it finds the minimum wait time and goes to sleep
  • That block is in a while loop and so it just goes back to start to find the next active token after the sleep

One more small update that I could do is to eliminate the use of _CACHED_GITHUB_TOKEN global variable, which is used in get_first_commit_time function. But that would mean one more "rate limit" check. If you think that's okay, I could update that part as well?

I keep sending these PRs to able to sync with my fork but since we don't have prior discussions about these updates, if it doesn't align with your work, feel free to ignore them as usual 🙏

@coni2k coni2k force-pushed the refactor-get_github_auth_token branch from 4167302 to 0e88035 Compare January 12, 2021 18:37
github_auth_token = os.getenv('GITHUB_AUTH_TOKEN')
assert github_auth_token, 'GITHUB_AUTH_TOKEN needs to be set.' # TODO Validate earlier?
for token in github_auth_token.split(','):
token_obj = github.Github(token)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are recreating all token objects after every repo, this is not needed, also feels unnecessary to add reset value in there, since it needs to be rechecked. Also get_github_auth_token_info is weird now, it is just set the reset value, not getting any info.

If you want to just do the minimum wait time, you can do in the loop by capturing least value of wait_time while iterating in the loop. that is it, no other change needed.

for token in tokens:	            
    token_obj = github.Github(token)
    near_expiry, wait_time = get_github_token_info(token_obj)
    if not near_expiry:	
        _CACHED_GITHUB_TOKEN = token	        
        _CACHED_GITHUB_TOKEN_OBJ = token_obj
        return token_obj

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, only kept "min_wait_time" part and removed the other changes

@inferno-chromium inferno-chromium merged commit 00e963d into ossf:main Jan 16, 2021
@coni2k coni2k deleted the refactor-get_github_auth_token branch January 18, 2021 18:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants