gh-133009: fix UAF in xml.etree.ElementTree.Element.__deepcopy__ #133010

Merged
picnixz merged 8 commits into python:main
May 10, 2025

Conversation

picnixz
Member

@picnixz picnixz commented Apr 26, 2025

Member

@serhiy-storchaka serhiy-storchaka left a comment

This is not enough. "tag", "text" and "tail" can be set to a different value during deep copying. If that was the last reference, the deepcopy() argument can be destroyed, and the reference can become dangling.

The safe way is to increase the reference count of the deepcopy() argument before calling any code that can release the GIL.
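
The hazard described above can be modelled in pure Python, as a rough sketch rather than the actual C fix. `Holder` and `Payload` are hypothetical stand-ins for an element and the object stored in one of its attributes, a weak reference plays the role of a borrowed C pointer, and the attribute assignment stands in for user code that runs in the middle of the copy. The behaviour relies on CPython's immediate reference counting.

```python
import weakref


class Payload:
    """Hypothetical stand-in for an object stored in tag/text/tail."""


class Holder:
    """Hypothetical stand-in for an element that owns one attribute."""

    def __init__(self):
        self.tag = Payload()


def survives_replacement(hold_strong_ref):
    holder = Holder()
    # The weak reference observes the object without owning it,
    # much like a borrowed pointer in C.
    probe = weakref.ref(holder.tag)
    # Taking a strong reference first models the "incref" step.
    kept = holder.tag if hold_strong_ref else None
    # Models user code that replaces the attribute mid-copy.
    holder.tag = Payload()
    alive = probe() is not None
    del kept  # models the matching "decref"
    return alive


print(survives_replacement(False))
print(survives_replacement(True))
```

On CPython the first call reports that the old object is gone as soon as the attribute is replaced, while the second shows it staying alive for as long as the extra reference is held.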

@picnixz
Member Author

picnixz commented Apr 26, 2025

I actually tried to do something with tag etc, but I wasn't able to crash the interpreter.

@serhiy-storchaka
Member

String literals are interned and saved in the constants list. You need to use something having a single reference. Try 'tag'.upper().
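
A small illustration of that difference, assuming CPython: the exact numbers returned by `sys.getrefcount` are an implementation detail and vary between versions, so the sketch only compares them. The helper name `refcounts` is made up for this example.

```python
import sys


def refcounts():
    # A compile-time constant: also referenced by the code object's
    # constants (and possibly immortal on recent CPython versions).
    literal = 'tag'
    # Built at run time: referenced only by this local variable.
    fresh = 'tag'.upper()
    return sys.getrefcount(literal), sys.getrefcount(fresh)


literal_count, fresh_count = refcounts()
print(literal_count > fresh_count)
```

Because the run-time string has no other owner, dropping its last reference frees it immediately, which is what a reproducer for this kind of bug needs.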

@picnixz
Member Author

picnixz commented Apr 26, 2025

> The safe way is to increase the reference count of the deepcopy() argument before calling any code that can release the GIL.

I've actually removed this code because I thought it wasn't needed, but I'll check with an evil non-interned string tomorrow.

@picnixz
Member Author

picnixz commented Apr 27, 2025

I'll try to add more tests later as well.

@picnixz picnixz requested a review from serhiy-storchaka May 7, 2025 12:36
@@ -899,6 +905,8 @@ deepcopy(elementtreestate *st, PyObject *object, PyObject *memo)

if (Py_REFCNT(object) == 1) {
if (PyDict_CheckExact(object)) {
// Exact dictionaries do not execute arbitrary code as it's
Member Author

@serhiy-storchaka is this assumption correct? namely, here I don't need to incref object temporarily right?

Member

PyDict_Next() does not use __iter__, so this comment is redundant. PyDict_Next() does not call any user code.

@picnixz picnixz added the needs backport to 3.14 bugs and security fixes label May 7, 2025
@picnixz
Member Author

picnixz commented May 9, 2025

Actually, it's much trickier than I thought. Even with the INCREF/DECREF and the additional checks, the following still crashes, not during deepcopy() but during deallocation:

import xml.etree.ElementTree as ET
from copy import deepcopy

class Evil(ET.Element):
    def __deepcopy__(self, memo):
        # Grow the parent's children while the parent is being deep-copied.
        root.append(ET.Element('y'))
        root.append(ET.Element('z'))
        return self

Y = Evil('y')
root = ET.Element('a')
root.extend([Evil('x'), ET.Element('t'), Y])
c = deepcopy(root)
print(list(c))
print("ok")
assert 0

So I'll need a bit more work.

@serhiy-storchaka
Member

#133010 (comment) should help. It is good that you already have a reproducer.

@serhiy-storchaka
Member

Ah, you have a different issue -- growing children, not attributes. The solution should be the same -- either ignore new items or resize the array in process.
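
The "ignore new items" option can be sketched with a pure-Python model. This is not the C code from the PR: `Node`, `Evil` and `copy_children_ignoring_new` are hypothetical names, and the model only shows the idea of fixing the number of children to copy before any user code runs, while still re-checking the bound in case the list shrinks.

```python
from copy import deepcopy


class Node:
    """Hypothetical minimal tree node; not the real Element type."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []

    def __deepcopy__(self, memo):
        clone = Node(self.tag)
        clone.children = copy_children_ignoring_new(self, memo)
        return clone


def copy_children_ignoring_new(node, memo):
    expected = len(node.children)  # snapshot taken before user code runs
    copied = []
    for index in range(expected):
        if index >= len(node.children):  # the list shrank during the copy
            break
        copied.append(deepcopy(node.children[index], memo))
    return copied


class Evil(Node):
    """Child whose __deepcopy__ grows the parent that is being copied."""

    def __init__(self, tag, victim):
        super().__init__(tag)
        self.victim = victim

    def __deepcopy__(self, memo):
        self.victim.children.append(Node('y'))
        self.victim.children.append(Node('z'))
        return self


root = Node('a')
root.children.extend([Evil('x', root), Node('t'), Evil('y', root)])
clone = deepcopy(root)
print(len(clone.children), len(root.children))
```

In this model the copy ends up with the three children that existed when copying started, while the original has grown to seven. The alternative mentioned above, resizing the destination as the source grows, would copy the appended children as well.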

@picnixz
Member Author

picnixz commented May 9, 2025

> The solution should be the same -- either ignore new items or resize the array in process.

Yes, actually I found a way to fix it in the meantime (but the same can be said)

Member

@serhiy-storchaka serhiy-storchaka left a comment

LGTM. 👍

root = ET.Element('a')
evil = X('x')
root.extend([evil, ET.Element('y')])
if is_python_implementation():
Member

We could also make the C implementation raise RuntimeError. It is fine either way.

Member Author

@picnixz picnixz May 10, 2025

I'll postpone this for a future PR as I want to backport this one to 3.13 and 3.14 first.

@picnixz picnixz merged commit 116a9f9 into python:main May 10, 2025
39 checks passed
@miss-islington-app

Thanks @picnixz for the PR 🌮🎉.. I'm working now to backport this PR to: 3.13, 3.14.
🐍🍒⛏🤖

@picnixz picnixz deleted the fix/xml/uaf-deepcopy-133009 branch May 10, 2025 07:32
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 10, 2025
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request May 10, 2025
@bedevere-app

bedevere-app bot commented May 10, 2025

GH-133805 is a backport of this pull request to the 3.14 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.14 bugs and security fixes label May 10, 2025
@bedevere-app

bedevere-app bot commented May 10, 2025

GH-133806 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label May 10, 2025
picnixz added a commit that referenced this pull request May 10, 2025
…y__` (GH-133010) (#133806)

gh-133009: fix UAF in `xml.etree.ElementTree.Element.__deepcopy__` (GH-133010)
(cherry picked from commit 116a9f9)

Co-authored-by: Bénédikt Tran <[email protected]>
picnixz added a commit that referenced this pull request May 10, 2025
…y__` (GH-133010) (#133805)

gh-133009: fix UAF in `xml.etree.ElementTree.Element.__deepcopy__` (GH-133010)
(cherry picked from commit 116a9f9)

Co-authored-by: Bénédikt Tran <[email protected]>