Skip to content

Move at::{Refcounted,}MapAllocator to c10 #109881

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 16 commits into from

Conversation

peterbell10
Copy link
Collaborator

@peterbell10 peterbell10 commented Sep 22, 2023

Stack from ghstack (oldest at bottom):

libshm.so depends on the torch library exclusively for at::RefcountedMapAllocator,
so it makes sense to move it to c10 along with the other memory allocators.

This means libshm.so only depends on c10 and we don't need to relink
libshm.so for every ATen change.

libshm.so depends on the torch library exclusively for
`at::RefcountedMapAllocator`, so it makes sense to move it to c10 along with the
other other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
@pytorch-bot
Copy link

pytorch-bot bot commented Sep 22, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109881

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

✅ No Failures

As of commit 2e1530c with merge base 625a3b1 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
@peterbell10 peterbell10 added the ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR label Oct 4, 2023
@peterbell10 peterbell10 marked this pull request as ready for review October 4, 2023 13:39
@peterbell10 peterbell10 requested a review from albanD October 4, 2023 13:39

C10_API std::string NewProcessWideShmHandle();

class C10_API MapAllocator {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These diffs are a mess, but it's mostly just a copy paste of the aten files with aten changed to c10. However, with the linter changes, it's too big of a diff for git to recognize as a simple rename. This is clearer in the commit-level diffs.

Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ho very nice cleanup!

I'll verify that all build are good and then merge this.

@albanD
Copy link
Collaborator

albanD commented Oct 4, 2023

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
@pytorchmergebot
Copy link
Collaborator

Successfully rebased gh/peterbell10/619/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/109881)

pytorchmergebot pushed a commit that referenced this pull request Oct 4, 2023
libshm.so depends on the torch library exclusively for
`at::RefcountedMapAllocator`, so it makes sense to move it to c10 along with the
other other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

ghstack-source-id: 198c9ee
Pull Request resolved: #109881
@peterbell10 peterbell10 added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 5, 2023
@albanD
Copy link
Collaborator

albanD commented Oct 9, 2023

I can't import this so yolo

@pytorchbot merge -r

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
@pytorchmergebot
Copy link
Collaborator

Successfully rebased gh/peterbell10/619/orig onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via ghstack checkout https://github.com/pytorch/pytorch/pull/109881)

pytorchmergebot pushed a commit that referenced this pull request Oct 9, 2023
libshm.so depends on the torch library exclusively for
`at::RefcountedMapAllocator`, so it makes sense to move it to c10 along with the
other other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

ghstack-source-id: 7b78056
Pull Request resolved: #109881
@peterbell10
Copy link
Collaborator Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@kit1980
Copy link
Contributor

kit1980 commented Oct 13, 2023

@pytorchbot revert -m "breaking internal builds, undefined symbol: _ZN3c1022RefcountedMapAllocator6decrefEv" -c ghfirst

@kit1980
Copy link
Contributor

kit1980 commented Oct 13, 2023

@peterbell10 You probably need someone from Meta to help re-landing this.

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@peterbell10 your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Oct 13, 2023
This reverts commit 68a1219.

Reverted #109881 on behalf of https://github.com/kit1980 due to breaking internal builds, undefined symbol: _ZN3c1022RefcountedMapAllocator6decrefEv ([comment](#109881 (comment)))
@peterbell10 peterbell10 reopened this Oct 13, 2023
preprocessor_flags = ['-DC10_BUILD_MAIN_LIB'],
preprocessor_flags = [
'-DC10_BUILD_MAIN_LIB',
'-DHAVE_MMAP=1',
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undefined symbol: _ZN3c1022RefcountedMapAllocator6decrefEv"

I had the same error in the bazel build when HAVE_MMAP wasn't being defined. Is it possible there is another build definition that needs the flag added?

If so, does that mean we have 4 completely unrelated build definitions: CMake, bazel, buck (oss), buck (internal)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterbell10 unfortunately it's even more than 4: more than 1 internal buck.

But they are not completely unrelated: all of them are supposed to use build.bzl files (like https://github.com/pytorch/pytorch/blob/main/c10/build.bzl), but it's only implemented for some things so far.

For this particular case, likely https://github.com/pytorch/pytorch/blob/main/c10/ovrsource_defs.bzl needs to be updated. But someone from Meta should work with you on this.

`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
peterbell10 added a commit that referenced this pull request Oct 13, 2023
libshm.so depends on the torch library exclusively for
`at::RefcountedMapAllocator`, so it makes sense to move it to c10 along with the
other other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

ghstack-source-id: 96dea8e
Pull Request resolved: #109881
`libshm.so` depends on the torch library exclusively for `at::RefcountedMapAllocator`,
 so it makes sense to move it to c10 along with the other memory allocators.

This means `libshm.so` only depends on `c10` and we don't need to relink
`libshm.so` for every ATen change.

[ghstack-poisoned]
deps = select({
deps = [
"//third-party/rt",
] + select({
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kit1980 would you be able to help on the meta-internal side? I think this might fix it but not idea if it works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can try but not today.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kit1980 any updates on this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping @kit1980

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll look today.

Copy link
Contributor

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@kit1980
Copy link
Contributor

kit1980 commented Dec 20, 2023

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Copy link
Collaborator

Rebase failed due to Command git -C /home/runner/work/pytorch/pytorch rebase refs/remotes/origin/viable/strict gh/peterbell10/619/orig returned non-zero exit code 1

Rebasing (1/1)
Auto-merging BUILD.bazel
Auto-merging aten/src/ATen/CMakeLists.txt
CONFLICT (modify/delete): aten/src/ATen/MapAllocator.cpp deleted in 11e28cb2f63 (Move at::{Refcounted,}MapAllocator to c10) and modified in HEAD.  Version HEAD of aten/src/ATen/MapAllocator.cpp left in tree.
Auto-merging build_variables.bzl
Auto-merging c10/CMakeLists.txt
Auto-merging c10/ovrsource_defs.bzl
Auto-merging caffe2/CMakeLists.txt
Auto-merging cmake/Codegen.cmake
error: could not apply 11e28cb2f63... Move at::{Refcounted,}MapAllocator to c10
hint: Resolve all conflicts manually, mark them as resolved with
hint: "git add/rm <conflicted_files>", then run "git rebase --continue".
hint: You can instead skip this commit: run "git rebase --skip".
hint: To abort and get back to the state before "git rebase", run "git rebase --abort".
Could not apply 11e28cb2f63... Move at::{Refcounted,}MapAllocator to c10

Raised by https://github.com/pytorch/pytorch/actions/runs/7279909983

@kit1980
Copy link
Contributor

kit1980 commented Dec 20, 2023

@peterbell10 May I ask you to fix the conflicts?
Or even better create a new non-ghstack PR - it will simplify things for me (it's a single PR stack anyway).

@peterbell10
Copy link
Collaborator Author

Okay, rebased and resubmitted as #116204

@facebook-github-bot facebook-github-bot deleted the gh/peterbell10/619/head branch December 24, 2023 15:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ciflow/periodic Trigger jobs ran periodically on master (periodic.yml) on the PR ciflow/trunk Trigger trunk jobs on your pull request Merged open source Reverted topic: not user facing topic category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants