
Stop leaving sockets in CLOSE_WAIT on failed TLS connections #3634


Merged
merged 1 commit into from
Jun 25, 2025
3 changes: 3 additions & 0 deletions modules/proto_tls/proto_tls.c
@@ -622,6 +622,9 @@ static int proto_tls_send(const struct socket_info* send_sock,
 	return rlen;
 con_release:
 	sh_log(c->hist, TCP_SEND2MAIN, "send 1, (%d)", c->refcnt);
+	/* close the fd if this process is not meant to own it */
+	if (c->proc_id != process_no)
Member

@jes, are you sure about this fix? The con_release label is reached (from above) only when no existing connection was found, so the connection was started locally by this process. If the conn (and fd) are local to the process, I assume c->proc_id == process_no, right? In that case the condition you added is always false and the close is never executed. Or maybe I'm missing something here :D

Contributor Author

I worked on this a long time ago but in my notes it says that c->proc_id was 0 in the offending cases. So maybe the problem is just that c->proc_id wasn't getting set correctly?
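
To make the two readings above concrete, here is a toy model of the added guard. It is only a sketch: `struct conn_model` and `guard_closes_fd` are hypothetical names, not the real OpenSIPS types, and the values 7 and 0 are arbitrary. It shows that the guard never fires when the connection records the current process as its owner, and does fire when `proc_id` was left at 0 in a process whose number is non-zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one field the guard reads;
 * this is NOT the real OpenSIPS connection struct. */
struct conn_model {
    int proc_id; /* process recorded as the owner of the connection */
};

/* Mirrors the condition added in the diff:
 * the fd is closed only when proc_id differs from process_no. */
static bool guard_closes_fd(const struct conn_model *c, int process_no)
{
    assert(c != NULL);
    return c->proc_id != process_no;
}
```

With `proc_id == 7` and `process_no == 7` the function returns false (the reviewer's case); with `proc_id == 0` and `process_no == 7` it returns true (the case recorded in the author's notes).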

Member

@jes, I had time to dig a bit more into this. The only valid case (according to your description and also to the code) is this one here https://github.com/OpenSIPS/opensips/blob/master/modules/proto_tls/proto_tls.c#L556, when the TCP conn is created (and we have an fd), but the TLS init fails. Here we should close the fd before jumping to con_release. IMHO this is the proper fix, meaning the fix in the right spot.
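
A minimal sketch of the idea, assuming a POSIX system: close the descriptor at the point where TLS setup fails, before taking the release path. Everything here is illustrative and not taken from proto_tls.c: `fake_tls_init`, `fd_is_open` and `connect_and_init` are hypothetical helpers, and a pipe stands in for the TCP socket. A descriptor that stays open after the peer has closed is what leaves a real socket in CLOSE_WAIT.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the TLS setup step: fails when `fail` is non-zero. */
static int fake_tls_init(int fd, int fail)
{
    (void)fd;
    return fail ? -1 : 0;
}

/* Returns 1 if fd is still open in this process, 0 if it is closed. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Opens a descriptor and runs the fake TLS init on it.
 * When close_on_fail is set, the fd is closed at the failure site.
 * The fd number is reported through out_fd so the caller can inspect it. */
static int connect_and_init(int fail, int close_on_fail, int *out_fd)
{
    int p[2];

    assert(out_fd != NULL);
    if (pipe(p) < 0)
        return -1;
    close(p[1]);          /* only one end is needed for the demo */
    *out_fd = p[0];

    if (fake_tls_init(p[0], fail) < 0) {
        if (close_on_fail)
            close(p[0]);  /* the suggested spot: close before releasing */
        return -1;        /* the real code would jump to con_release here */
    }
    return 0;
}
```

Calling `connect_and_init(1, 0, &fd)` leaves `fd` open after the failure (the leak), while `connect_and_init(1, 1, &fd)` leaves it closed; `fd_is_open(fd)` can be used to check either case.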

+		close(fd);
 	tcp_conn_release(c, (rlen < 0)?0:1);
 	return rlen;
 }