@@ -306,3 +306,326 @@ added matters. To illustrate::
       ``7bit``, non-ASCII binary data is CTE encoded using the ``unknown-8bit``
       charset. Otherwise the original source header is used, with its existing
       line breaks and any (RFC invalid) binary data it may contain.
+
+
+.. note::
+
+   The remainder of the classes documented below are included in the standard
+   library on a :term:`provisional basis <provisional package>`. Backwards
+   incompatible changes (up to and including removal of the feature) may occur
+   if deemed necessary by the core developers.
+
+
+.. class:: EmailPolicy(**kw)
+
+   This concrete :class:`Policy` provides behavior that is intended to be fully
+   compliant with the current email RFCs. These include (but are not limited
+   to) :rfc:`5322`, :rfc:`2047`, and the current MIME RFCs.
+
+   This policy adds new header parsing and folding algorithms. Instead of
+   simple strings, headers are custom objects with custom attributes depending
+   on the type of the field. The parsing and folding algorithms fully implement
+   :rfc:`2047` and :rfc:`5322`.
+
+   In addition to the settable attributes listed above that apply to all
+   policies, this policy adds the following attributes:
+
+   .. attribute:: refold_source
+
+      If the value for a header in the ``Message`` object originated from a
+      :mod:`~email.parser` (as opposed to being set by a program), this
+      attribute indicates whether or not a generator should refold that value
+      when transforming the message back into stream form. The possible values
+      are:
+
+      ========  ===============================================================
+      ``none``  all source values use original folding
+
+      ``long``  source values that have any line that is longer than
+                ``max_line_length`` will be refolded
+
+      ``all``   all values are refolded
+      ========  ===============================================================
+
+      The default is ``long``.
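As a rough illustration (not part of the original text), a policy with a different ``refold_source`` can be derived from an existing one with ``clone()``. The sketch assumes the module-level ``email.policy.default`` instance described later in this section; the variable names are made up for the example.

```python
from email import policy

# Derive a policy that never refolds headers that came from a parser.
keep_source_folding = policy.default.clone(refold_source="none")

# Derive a policy that refolds every header when the message is serialized.
refold_everything = policy.default.clone(refold_source="all")

# The original policy object is left untouched by clone().
print(policy.default.refold_source)
print(keep_source_folding.refold_source)
print(refold_everything.refold_source)
```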
+
+   .. attribute:: header_factory
+
+      A callable that takes two arguments, ``name`` and ``value``, where
+      ``name`` is a header field name and ``value`` is an unfolded header field
+      value, and returns a string-like object that represents that header. A
+      default ``header_factory`` is provided that understands some of the
+      :rfc:`5322` header field types. (Currently address fields and date
+      fields have special treatment, while all other fields are treated as
+      unstructured. This list will be completed before the extension is marked
+      stable.)
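The following sketch calls the default factory directly to show the "string-like object" it returns. It assumes the ``email.policy.default`` instance documented below; the header values are invented for the example.

```python
from email import policy

factory = policy.default.header_factory

subject = factory("Subject", "Quarterly report")
date = factory("Date", "Fri, 09 Nov 2001 01:08:47 +0000")
to = factory("To", "Alice Example <alice@example.com>")

# Every result is string-like and remembers its field name.
print(isinstance(subject, str), subject.name, str(subject))

# Date and address fields carry extra, type-specific attributes,
# while an unstructured field such as Subject does not.
print(hasattr(date, "datetime"), hasattr(to, "addresses"))
print(hasattr(subject, "datetime"))
```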
+
+   The class provides the following concrete implementations of the abstract
+   methods of :class:`Policy`:
+
+   .. method:: header_source_parse(sourcelines)
+
+      The implementation of this method is the same as that for the
+      :class:`Compat32` policy.
+
+   .. method:: header_store_parse(name, value)
+
+      The name is returned unchanged. If the input value has a ``name``
+      attribute and it matches *name* ignoring case, the value is returned
+      unchanged. Otherwise the *name* and *value* are passed to
+      ``header_factory``, and the resulting custom header object is returned as
+      the value. In this case a ``ValueError`` is raised if the input value
+      contains CR or LF characters.
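The three cases described above can be sketched as follows. The example calls the method directly on ``email.policy.default``, which is an assumption made for brevity; in normal use the method is invoked for you when a header is assigned on a message.

```python
from email import policy

# A plain string is turned into a custom header object.
name, value = policy.default.header_store_parse("Subject", "Hello")
print(name, value.name, str(value))

# A value that already is a header object with a matching name
# (ignoring case) is passed through unchanged.
_, same = policy.default.header_store_parse("subject", value)
print(same is value)

# Embedded line breaks are rejected rather than silently stored.
try:
    policy.default.header_store_parse("Subject", "Hello\r\nBcc: x@example.com")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```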
+
+   .. method:: header_fetch_parse(name, value)
+
+      If the value has a ``name`` attribute, it is returned unmodified.
+      Otherwise the *name*, and the *value* with any CR or LF characters
+      removed, are passed to the ``header_factory``, and the resulting custom
+      header object is returned. Any surrogateescaped bytes get turned into
+      the unicode unknown-character glyph.
+
+   .. method:: fold(name, value)
+
+      Header folding is controlled by the :attr:`refold_source` policy setting.
+      A value is considered to be a 'source value' if and only if it does not
+      have a ``name`` attribute (having a ``name`` attribute means it is a
+      header object of some sort). If a source value needs to be refolded
+      according to the policy, it is converted into a custom header object by
+      passing the *name* and the *value* with any CR and LF characters removed
+      to the ``header_factory``. Folding of a custom header object is done by
+      calling its ``fold`` method with the current policy.
+
+      Source values are split into lines using :meth:`~str.splitlines`. If
+      the value is not to be refolded, the lines are rejoined using the
+      ``linesep`` from the policy and returned. The exception is lines
+      containing non-ASCII binary data. In that case the value is refolded
+      regardless of the ``refold_source`` setting, which causes the binary data
+      to be CTE encoded using the ``unknown-8bit`` charset.
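A minimal sketch of the folding rules for source values (plain strings without a ``name`` attribute). It assumes the ``default`` and ``SMTP`` policy instances documented below and a ``max_line_length`` left at its default; the sample values are invented.

```python
from email import policy

# A short source value is not refolded: its lines are rejoined with the
# policy's linesep and returned.
short_default = policy.default.fold("Subject", "Hello")
short_smtp = policy.SMTP.fold("Subject", "Hello")
print(repr(short_default))
print(repr(short_smtp))

# A source value longer than max_line_length is refolded under the
# default refold_source setting of 'long'.
long_value = " ".join(["word"] * 40)
folded = policy.default.fold("Subject", long_value)
print(folded)

# With refold_source='none' the same value keeps its original single line.
no_refold = policy.default.clone(refold_source="none")
unfolded = no_refold.fold("Subject", long_value)
print(len(unfolded.splitlines()))
```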
+
+   .. method:: fold_binary(name, value)
+
+      The same as :meth:`fold` if :attr:`cte_type` is ``7bit``, except that
+      the returned value is bytes.
+
+      If :attr:`cte_type` is ``8bit``, non-ASCII binary data is converted back
+      into bytes. Headers with binary data are not refolded, regardless of the
+      ``refold_source`` setting, since there is no way to know whether the
+      binary data consists of single byte characters or multibyte characters.
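For a pure-ASCII value the only visible difference between the two methods is the return type. The sketch below covers just that simple case, using the ``SMTP`` instance documented below; it does not exercise the binary-data handling.

```python
from email import policy

as_text = policy.SMTP.fold("Subject", "Hello")
as_bytes = policy.SMTP.fold_binary("Subject", "Hello")

print(repr(as_text))   # a str
print(repr(as_bytes))  # the same header line as bytes
```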
+
+The following instances of :class:`EmailPolicy` provide defaults suitable for
+specific application domains. Note that in the future the behavior of these
+instances (in particular the ``HTTP`` instance) may be adjusted to conform even
+more closely to the RFCs relevant to their domains.
+
+.. data:: default
+
+   An instance of ``EmailPolicy`` with all defaults unchanged. This policy
+   uses the standard Python ``\n`` line endings rather than the RFC-correct
+   ``\r\n``.
+
+.. data:: SMTP
+
+   Suitable for serializing messages in conformance with the email RFCs.
+   Like ``default``, but with ``linesep`` set to ``\r\n``, which is RFC
+   compliant.
+
+.. data:: HTTP
+
+   Suitable for serializing headers for use in HTTP traffic. Like
+   ``SMTP`` except that ``max_line_length`` is set to ``None`` (unlimited).
+
+.. data:: strict
+
+   Convenience instance. The same as ``default`` except that
+   ``raise_on_defect`` is set to ``True``. This allows any policy to be made
+   strict by writing::
+
+      somepolicy + policy.strict
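The differences between the predefined instances, and the effect of adding ``strict``, can be checked with a short sketch such as this one (``strict_smtp`` is just a name chosen for the example):

```python
from email import policy

print(repr(policy.default.linesep))
print(repr(policy.SMTP.linesep))
print(policy.HTTP.max_line_length)

# Adding policy.strict to another policy keeps that policy's own settings
# and turns on raise_on_defect.
strict_smtp = policy.SMTP + policy.strict
print(repr(strict_smtp.linesep), strict_smtp.raise_on_defect)
```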
+
+With all of these :class:`EmailPolicies <.EmailPolicy>`, the effective API of
+the email package is changed from the Python 3.2 API in the following ways:
+
+   * Setting a header on a :class:`~email.message.Message` results in that
+     header being parsed and a custom header object created.
+
+   * Fetching a header value from a :class:`~email.message.Message` results
+     in that header being parsed and a custom header object created and
+     returned.
+
+   * Any custom header object, or any header that is refolded due to the
+     policy settings, is folded using an algorithm that fully implements the
+     RFC folding algorithms, including knowing where encoded words are required
+     and allowed.
+
+From the application view, this means that any header obtained through the
+:class:`~email.message.Message` is a custom header object with custom
+attributes, whose string value is the fully decoded unicode value of the
+header. Likewise, a header may be assigned a new value, or a new header
+created, using a unicode string, and the policy will take care of converting
+the unicode string into the correct RFC encoded form.
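From the application side, the round trip described above might look like the following sketch. It assumes a ``Message`` created with ``policy.default``; the subject text is invented, and the exact encoded form produced on serialization is deliberately not spelled out here.

```python
from email import policy
from email.message import Message

msg = Message(policy=policy.default)

# Assign a header using an ordinary unicode string.
msg["Subject"] = "Café menu"

# Fetching returns a custom header object whose string value is the
# decoded unicode text.
fetched = msg["Subject"]
print(fetched.name, str(fetched))

# Serializing lets the policy produce an RFC encoded (ASCII-only) form.
wire = msg.as_string()
print(wire)
```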
+
+The custom header objects and their attributes are described below. All custom
+header objects are string subclasses, and their string value is the fully
+decoded value of the header field (the part of the field after the ``:``).
+
+
+.. class:: BaseHeader
+
+   This is the base class for all custom header objects. It provides the
+   following attributes:
+
+   .. attribute:: name
+
+      The header field name (the portion of the field before the ':').
+
+   .. attribute:: defects
+
+      A possibly empty list of :class:`~email.errors.MessageDefect` objects
+      that record any RFC violations found while parsing the header field.
+
+   .. method:: fold(*, policy)
+
+      Return a string containing :attr:`~email.policy.Policy.linesep`
+      characters as required to correctly fold the header according
+      to *policy*. A :attr:`~email.policy.Policy.cte_type` of
+      ``8bit`` will be treated as if it were ``7bit``, since strings
+      may not contain binary data.
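A small sketch of these attributes on a header object. The object is obtained from the default ``header_factory`` (an assumption made for convenience; headers fetched from a message behave the same way), and ``defects`` is copied into a list so the example does not depend on its concrete container type.

```python
from email import policy

header = policy.default.header_factory("Subject", "Weekly report")

header_name = header.name
header_defects = list(header.defects)  # empty for a well-formed value
print(header_name, header_defects)

# fold() is keyword-only and uses the line separator of the given policy.
folded = header.fold(policy=policy.SMTP)
print(repr(folded))
```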
+
+
+.. class:: UnstructuredHeader
+
+   The class used for any header that does not have a more specific
+   type. (The :mailheader:`Subject` header is an example of an
+   unstructured header.) It does not have any additional attributes.
+
+
+.. class:: DateHeader
+
+   The value of this type of header is a single date and time value. The
+   primary example of this type of header is the :mailheader:`Date` header.
+
+   .. attribute:: datetime
+
+      A :class:`~datetime.datetime` encoding the date and time from the
+      header value.
+
+      The ``datetime`` will be a naive ``datetime`` if the value either does
+      not have a specified timezone (which would be a violation of the RFC) or
+      if the timezone is specified as ``-0000``. This timezone value indicates
+      that the date and time is to be considered to be in UTC, but with no
+      indication of the local timezone in which it was generated. (This
+      contrasts with ``+0000``, which indicates a date and time that really is
+      in the UTC ``0000`` timezone.)
+
+      If the header value contains a valid timezone that is not ``-0000``, the
+      ``datetime`` will be an aware ``datetime`` having a
+      :class:`~datetime.tzinfo` set to the :class:`~datetime.timezone`
+      indicated by the header value.
+
+   A ``datetime`` may also be assigned to a :mailheader:`Date` type header.
+   The resulting string value will use a timezone of ``-0000`` if the
+   ``datetime`` is naive, and the appropriate UTC offset if the ``datetime`` is
+   aware.
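The naive/aware distinction can be sketched as follows. The header objects are built with the default ``header_factory`` rather than through a message, which is a shortcut taken for this example; the timestamps are arbitrary.

```python
import datetime
from email import policy

factory = policy.default.header_factory

# -0000 means "UTC, local zone unknown" and yields a naive datetime;
# +0000 yields an aware datetime with a zero UTC offset.
unknown_zone = factory("Date", "Fri, 09 Nov 2001 01:08:47 -0000")
utc_zone = factory("Date", "Fri, 09 Nov 2001 01:08:47 +0000")
print(unknown_zone.datetime.tzinfo)
print(utc_zone.datetime.utcoffset())

# datetime objects can be supplied instead of strings.
naive = datetime.datetime(2001, 11, 9, 1, 8, 47)
aware = naive.replace(tzinfo=datetime.timezone(datetime.timedelta(hours=-5)))
from_naive = factory("Date", naive)
from_aware = factory("Date", aware)
print(str(from_naive))
print(str(from_aware))
```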
+
+
+.. class:: AddressHeader
+
+   This class is used for all headers that can contain addresses, whether they
+   are supposed to be singleton addresses or a list.
+
+   .. attribute:: addresses
+
+      A list of :class:`.Address` objects listing all of the addresses that
+      could be parsed out of the field value.
+
+   .. attribute:: groups
+
+      A list of :class:`.Group` objects. Every address in :attr:`.addresses`
+      appears in one of those group objects. Addresses that are not
+      syntactically part of a group are represented by ``Group`` objects whose
+      ``display_name`` is ``None``.
+
+   In addition to addresses in string form, any combination of
+   :class:`.Address` and :class:`.Group` objects, singly or in a list, may be
+   assigned to an address header.
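A sketch of how ``addresses`` and ``groups`` relate for a field that mixes a single address with a group. Importing ``Address`` and ``Group`` from ``email.headerregistry`` is an assumption about where these classes live in released Python versions; the addresses are invented.

```python
from email import policy
from email.headerregistry import Address, Group

factory = policy.default.header_factory

to = factory(
    "To",
    "Alice Example <alice@example.com>, "
    "friends: bob@example.com, carol@example.com;",
)

# All parsed addresses, flattened.
print([a.addr_spec for a in to.addresses])

# The same addresses organised by group; the ungrouped address sits in a
# Group whose display_name is None.
print([(g.display_name, len(g.addresses)) for g in to.groups])

# Address and Group objects can be supplied instead of a string.
cc = factory("Cc", [Address("Dave", "dave", "example.com"),
                    Group("team", [Address("", "erin", "example.com")])])
print(str(cc))
```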
+
+
+.. class:: Address(display_name='', username='', domain='', addr_spec=None)
+
+   The class used to represent an email address. The general form of an
+   address is::
+
+      [display_name] <username@domain>
+
+   or::
+
+      username@domain
+
+   where each part must conform to specific syntax rules spelled out in
+   :rfc:`5322`.
+
+   As a convenience *addr_spec* can be specified instead of *username* and
+   *domain*, in which case *username* and *domain* will be parsed from the
+   *addr_spec*. An *addr_spec* must be a properly RFC quoted string; if it is
+   not, ``Address`` will raise an error. Unicode characters are allowed and
+   will be properly encoded when serialized. However, per the RFCs, unicode is
+   *not* allowed in the username portion of the address.
+
+   .. attribute:: display_name
+
+      The display name portion of the address, if any, with all quoting
+      removed. If the address does not have a display name, this attribute
+      will be an empty string.
+
+   .. attribute:: username
+
+      The ``username`` portion of the address, with all quoting removed.
+
+   .. attribute:: domain
+
+      The ``domain`` portion of the address.
+
+   .. attribute:: addr_spec
+
+      The ``username@domain`` portion of the address, correctly quoted
+      for use as a bare address (the second form shown above). This
+      attribute is not mutable.
+
+   .. method:: __str__()
+
+      The ``str`` value of the object is the address quoted according to
+      :rfc:`5322` rules, but with no Content Transfer Encoding of any non-ASCII
+      characters.
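The two ways of constructing an ``Address`` can be sketched like this. The import from ``email.headerregistry`` is an assumption about where the class lives in released Python versions, and the names and addresses are invented.

```python
from email.headerregistry import Address

# Built from separate parts: the first general form shown above.
alice = Address(display_name="Alice Example", username="alice",
                domain="example.com")
print(str(alice))
print(alice.addr_spec)

# Built from an addr_spec: username and domain are parsed out of it.
bob = Address(addr_spec="bob@example.com")
print(bob.username, bob.domain)
print(str(bob))

# addr_spec is read-only.
try:
    alice.addr_spec = "someone@example.org"
    changed = True
except AttributeError:
    changed = False
print(changed)
```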
+
+
+.. class:: Group(display_name=None, addresses=None)
+
+   The class used to represent an address group. The general form of an
+   address group is::
+
+      display_name: [address-list];
+
+   As a convenience for processing lists of addresses that consist of a mixture
+   of groups and single addresses, a ``Group`` may also be used to represent
+   single addresses that are not part of a group by setting *display_name* to
+   ``None`` and passing a list containing just that address as *addresses*.
+
+   .. attribute:: display_name
+
+      The ``display_name`` of the group. If it is ``None`` and there is
+      exactly one ``Address`` in ``addresses``, then the ``Group`` represents a
+      single address that is not in a group.
+
+   .. attribute:: addresses
+
+      A possibly empty tuple of :class:`.Address` objects representing the
+      addresses in the group.
+
+   .. method:: __str__()
+
+      The ``str`` value of a ``Group`` is formatted according to :rfc:`5322`,
+      but with no Content Transfer Encoding of any non-ASCII characters. If
+      ``display_name`` is ``None`` and there is a single ``Address`` in
+      ``addresses``, the ``str`` value will be the same as the ``str`` of
+      that single ``Address``.
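The string forms described above can be sketched as follows, again assuming the classes are importable from ``email.headerregistry``; the group names and addresses are invented.

```python
from email.headerregistry import Address, Group

bob = Address(addr_spec="bob@example.com")
carol = Address("Carol", "carol", "example.com")

# A named group containing two addresses.
friends = Group("friends", [bob, carol])
print(str(friends))

# A "group" with no display name and one address stands for that address.
solo = Group(None, [bob])
print(str(solo))

# A named group may also be empty.
empty = Group("undisclosed-recipients")
print(str(empty))
```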