Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-36094: Update smtplib.py #23635

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

bpo-36094: Update smtplib.py #23635

wants to merge 1 commit into from

Conversation

prrvchr
Copy link

@prrvchr prrvchr commented Dec 4, 2020

@prrvchr prrvchr requested a review from a team as a code owner December 4, 2020 10:04
@the-knights-who-say-ni
Copy link

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

Recognized GitHub username

We couldn't find a bugs.python.org (b.p.o) account corresponding to the following GitHub usernames:

@prrvchr

This might be simply due to a missing "GitHub Name" entry in one's b.p.o account settings. This is necessary for legal reasons before we can look at this contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

@@ -1039,7 +1041,7 @@ def _get_socket(self, host, port, timeout):
self._print_debug('connect:', (host, port))
new_socket = super()._get_socket(host, port, timeout)
new_socket = self.context.wrap_socket(new_socket,
server_hostname=self._host)
server_hostname=host)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be preferable if server_hostname was maintained as a separate parameter from host for the reasons outlined in msg376344.

The work-around that I am currently employing for this issue involves:

  • Calling smtp = smtplib.SMTP(…) without a host parameter in order to avoid initially establishing a connection
  • Setting smtp._host = server_hostname in order to set the parameter to the ssl.SSLContext.wrap_socket() call
  • Calling smtp.connect(address, …) where address is an IP address previously obtained from a call to socket.getaddrinfo(server_hostname, …)

This approach is necessary because TLS server certificates rarely contain IP address information that would be required for certificate validation, the expectation being that validation be performed against the server's hostname.

The solution that you propose here will break this work-around, as self._host is no longer passed as server_hostname; the ability to provide separate parameters for establishing a socket connection (i.e. address) and the corresponding server_hostname validation will have been removed.

Copy link

@amcgregor amcgregor Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an additional concern that users of my own library fell prey to (frequently, you may notice) in attempting to patch the problem from the side of my library: passing in a hostname establishes a connection, calling self.connect(host, port) quite early on, preventing invocation of set_debuglevel prior to the connection attempt.

Other than "it's always done this, historically" or hypothetical convenience, is there an actual reason for an object initializer / constructor to establish TCP connections outbound? I.e. actually do something, procedurally, and not just prepare the object for subsequent method invocation?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be preferable if server_hostname was maintained as a separate parameter from host for the reasons outlined in msg376344.

It would be urgent to finally do what the documentation says:

class smtplib.SMTP(host='', port=0, local_hostname=None, [timeout, ]source_address=None)........If the optional host and port parameters are given, the SMTP connect() method is called with those parameters during initialization

There is never any question of connecting with an ip address

Then either you have to change the documentation, or it's the API?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be urgent to finally do what the documentation says…

I cannot identify the discrepancy between the implementation and the documentation to which you appear to be alluding; can you elaborate?

There is never any question of connecting with an ip address

Providing either a String containing a hostname (e.g. example.com) or a String containing an address (e.g. 1.2.3.4) is a fairly well established behaviour when dealing with network socket connections. My desire to provide the host in IP address form, which necessitates the work-around that I initially described, is specifically highlighted within the socket - Socket families documentation:

If you use a hostname in the host portion of IPv4/v6 socket address, the program may show a nondeterministic behavior, as Python uses the first address returned from the DNS resolution. The socket address will be resolved differently into an actual IPv4/v6 address, depending on the results from DNS resolution and/or the host configuration. For deterministic behavior use a numeric address in host portion.

Basically, I'm proposing a solution along the lines of:

…
def _get_socket(self, host, port, timeout, server_hostname=None):
    if server_hostname is None:
        server_hostname = host
    …
    new_socket = self.context.wrap_socket(new_socket, server_hostname=server_hostname)
…

with support for specifying server_hostname being plumbed through the various function call layers such that it is exposed by the SMTP library's public API.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot identify the discrepancy between the implementation and the documentation to which you appear to be alluding; can you elaborate?

I want to integrate smtplib in LibreOffice / OpenOffice.
For that, I have a wizard allowing to configure the connection (host, port, connection mode: not secure, SSL or TLS, authentication method: without, login, OAuth2 ...) using Mozilla ISPDB technology ...
This wizard needs to load the smtplib.SMTP or smtplib.SMTP_SSL class without connection for 2 reasons:

Allow to adjust the debug level before connecting.

Get the code and the message returned by the connect () method to allow better debugging if necessary...

For the moment it is not possible to instantiate the smtplib.SMTP or smtplib.SMTP_SSL class without providing the host parameter without having an error (see bug , contrary to what the documentation says: host is optional ...)

There is never any question of connecting with an ip address

I understand the need to have a deterministic connection mode on the ip address, but in this case it is necessary to modify the API, because it does not offer anything like method or property to allow it... , and assign a private attribute (self._host) is not very pythonic...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see; basically you're saying that this bug should be addressed. I totally agree; sorry, I presumed that was a given.

Indeed, I believe that it is necessary to expand the API (e.g. to introduce a server_hostname argument to the various connection methods) in order to correctly handle support for TLS validation, initially introduced in a5768f72 and which broke the ability to establish a connection outside of the scope of the initialisation function.

I am by no means advocating that my work-around, of manually overriding the value of self._host, is a good one, but it does at least allow me to provide a viable solution. My initial comment on your proposed changes was to highlight the comment I made on the original bug report and to pre-empt the next bug report I would otherwise be forced to create in order to highlight that it is not valid to assume that server_hostname should be the same as host in every scenario.

Copy link
Author

@prrvchr prrvchr Dec 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I admit that my proposed modifications cannot be used if we want to have a deterministic connection mode.

But it would be good to definitely solve both problems: initialization without host parameter and deterministic connection.

If the API change is needed then do it, as we can't be happy with how it works right now ...

Edit:
I am ready to make a new pull request incorporating the changes you suggest.
It remains to be decided where self._host should be set to its value in the connect() procedure when it is not supplied at initialization.
We must also plan to update the documentation to integrate the new optional parameter server_hostname

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be most welcome if you could fix these issues. I do not mind if my current work-around is broken by any proposed changes (i.e. I should not be modifying the internal implementation detail represented by self._host). All I ask is for it to be possible to perform TLS validation in a scenario where server_hostname may not be assumed to be equal to host. I am perfectly happy to fix-up my current implementation (e.g. to adopt the new API parameter).

@github-actions
Copy link

github-actions bot commented Jan 7, 2021

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale Stale PR or inactive for long period of time. label Jan 7, 2021
@github-actions github-actions bot removed the stale Stale PR or inactive for long period of time. label Aug 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants