PEP: 643
Title: Metadata for Package Source Distributions
Author: Paul Moore <p.f.moore@gmail.com>
BDFL-Delegate: Paul Ganssle <paul@ganssle.io>
Discussions-To: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577
Status: Final
Type: Standards Track
Topic: Packaging
Content-Type: text/x-rst
Created: 24-Oct-2020
Post-History: 24-Oct-2020, 01-Nov-2020, 02-Nov-2020, 14-Nov-2020
Resolution: https://discuss.python.org/t/pep-643-metadata-for-package-source-distributions/5577/53


Abstract
========

Python package metadata is stored in the distribution file in a standard
format, defined in the `Core Metadata Specification`_. However, for
source distributions, while the format of the data is defined, there has
traditionally been a lot of inconsistency in what data is recorded in
the source distribution. See `this discussion thread
<https://discuss.python.org/t/why-isnt-source-distribution-metadata-trustworthy-can-we-make-it-so/2620>`_
for more background on the issue.

As a result, metadata consumers are unable to rely on the data available
from source distributions, and need to use the (costly) :pep:`517` build
mechanisms to extract metadata.

This PEP defines a standard that allows build backends to reliably store
package metadata in the source distribution, while still retaining the
necessary flexibility to handle metadata fields that have to be calculated
at build time.


Motivation
==========

There are a number of issues with the way that metadata is currently
stored in source distributions:

* The details of how to store metadata, while standardised, are not
  easy to find.
* The specification requires an old metadata version, and has not been
  updated in line with changes to the core metadata spec.
* There is no way in the spec to distinguish between "this field has been
  omitted because its value will not be known until build time" and "this
  field does not have a value".
* The core metadata specification allows most fields to be optional,
  meaning that the previous issue affects nearly every metadata field.

This PEP proposes an update to the metadata specification to allow
recording of fields which are expected to be "filled in later", and
updates the source distribution specification to clarify that backends
should record sdist metadata using that version of the spec (or later).


Rationale
=========

This PEP allows projects to define source distribution metadata values
as being "dynamic". In this context, saying that a field is "dynamic"
means that the value has not been fixed at the time that the source
distribution was generated. Dynamic values will be supplied by the build
backend at the time when the wheel is generated, and could depend on
details of the build environment.

:pep:`621` has a similar concept of "dynamic" values that will be
"filled in later", so we use the same term here by analogy.


Specification
=============

This PEP defines the relationship between metadata values specified in a
source distribution, and the corresponding values in wheels built from
it. It requires build backends to clearly mark any fields which will
*not* simply be copied unchanged from the sdist to the wheel.

In addition, this PEP makes the `PyPA Specifications`_ document the
canonical location for the specification of the source distribution
format (collecting the information in :pep:`517` and in this PEP).

A new field, ``Dynamic``, will be added to the `Core Metadata Specification`_.
This field will be multiple use, and each occurrence will contain the name
of another core metadata field.
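
As a non-normative illustration, a consumer could read the ``Dynamic``
entries with the standard library ``email`` parser, since core metadata
uses an email-header style format. The sample metadata, the project name
and the ``dynamic_fields`` helper below are hypothetical, and lower-casing
the field names for comparison is an assumption of this sketch rather
than a requirement of this PEP.

.. code-block:: python

    from email.parser import HeaderParser

    # Hypothetical sdist metadata (PKG-INFO), for illustration only.
    PKG_INFO = (
        "Metadata-Version: 2.2\n"
        "Name: example-project\n"
        "Version: 1.0\n"
        "Summary: An illustrative project\n"
        "Dynamic: Requires-Dist\n"
        "Dynamic: Requires-Python\n"
    )


    def dynamic_fields(metadata_text):
        """Return the names listed in ``Dynamic`` entries, lower-cased."""
        msg = HeaderParser().parsestr(metadata_text)
        # ``Dynamic`` is multiple use, so collect every occurrence.
        return {name.strip().lower() for name in msg.get_all("Dynamic", [])}


    fields = dynamic_fields(PKG_INFO)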

When the ``Dynamic`` field is found in the metadata of a source
distribution, the following rules apply:

1. If a field is *not* marked as ``Dynamic``, then the value of the field
   in any wheel built from the sdist MUST match the value in the sdist.
   If the field is not in the sdist, and not marked as ``Dynamic``, then
   it MUST NOT be present in the wheel.
2. If a field is marked as ``Dynamic``, it may contain any valid value in
   a wheel built from the sdist (including not being present at all).
3. Backends MUST NOT mark a field as ``Dynamic`` if they can determine that
   it was generated from data that will not change at build time.

Backends MAY record the value they calculated for a field they mark as
``Dynamic`` in a source distribution. Consumers, however, MUST NOT treat
this value as canonical, but MAY use it as a hint about what the final
value in a wheel could be.
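
The relationship between sdist and wheel metadata described by rules 1
and 2 can be sketched as a simple check. This is a minimal,
non-normative sketch: the function name and sample metadata are
hypothetical, only header fields are compared (not the description
body), and treating "match" as order-insensitive equality of the
recorded values is an assumption of the sketch. It also skips
``Metadata-Version`` and ``Dynamic`` themselves, which is likewise an
assumption rather than something this PEP states.

.. code-block:: python

    from email.parser import HeaderParser


    def check_wheel_against_sdist(sdist_text, wheel_text):
        """Return the non-dynamic fields whose wheel values differ from the sdist."""
        sdist = HeaderParser().parsestr(sdist_text)
        wheel = HeaderParser().parsestr(wheel_text)
        dynamic = {n.strip().lower() for n in sdist.get_all("Dynamic", [])}
        # Assumption of this sketch: ignore the bookkeeping fields.
        skip = {"metadata-version", "dynamic"}
        names = {k.lower() for k in sdist.keys()} | {k.lower() for k in wheel.keys()}
        violations = []
        for name in sorted(names - skip - dynamic):
            # Rule 1: present with the same value(s) in both, or absent from both.
            if sorted(sdist.get_all(name, [])) != sorted(wheel.get_all(name, [])):
                violations.append(name)
        return violations


    # Hypothetical metadata, for illustration only.
    SDIST = (
        "Metadata-Version: 2.2\n"
        "Name: example-project\n"
        "Version: 1.0\n"
        "Summary: Demo\n"
        "Dynamic: Requires-Dist\n"
    )
    WHEEL_OK = (
        "Metadata-Version: 2.2\n"
        "Name: example-project\n"
        "Version: 1.0\n"
        "Summary: Demo\n"
        "Requires-Dist: requests\n"
    )
    WHEEL_BAD = (
        "Metadata-Version: 2.2\n"
        "Name: example-project\n"
        "Version: 1.0\n"
        "Summary: Changed\n"
        "License: MIT\n"
    )

    ok = check_wheel_against_sdist(SDIST, WHEEL_OK)
    bad = check_wheel_against_sdist(SDIST, WHEEL_BAD)

In this sketch, ``Requires-Dist`` may appear in the wheel because the
sdist marks it as ``Dynamic``, whereas a changed ``Summary`` or a newly
added ``License`` is reported.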

In any context other than a source distribution, if a field is marked as
``Dynamic``, that indicates that the value was generated at wheel build
time and may not match the value in the sdist (or in other builds of the
project). Backends are not required to record this information, though,
and consumers MUST NOT assume that the lack of a ``Dynamic`` marking has
any significance, except in a source distribution.

The fields ``Name`` and ``Version`` MUST NOT be marked as ``Dynamic``.

As it adds a new metadata field, this PEP updates the core metadata
format to version 2.2.

Source distributions SHOULD use the latest version of the core metadata
specification that was available when they were created.

Build backends are strongly encouraged to only mark fields as
``Dynamic`` when absolutely necessary, and to encourage projects to
avoid backend features that require the use of ``Dynamic``. Projects
should prefer to use environment markers on static values to adapt to
details of the install location.


Backwards Compatibility
=======================

As this proposal increments the core metadata version, it is compatible
with existing source distributions, which will use an older metadata
version. Tools can determine whether a source distribution conforms to
this PEP by checking the metadata version.
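
A tool might perform that check along the following lines. This is a
minimal sketch, not a reference implementation: the helper name is
hypothetical, and treating a missing or malformed ``Metadata-Version``
as "does not conform" is an assumption of the sketch.

.. code-block:: python

    from email.parser import HeaderParser


    def conforms_to_pep_643(metadata_text):
        """Return True if the metadata declares version 2.2 or later."""
        msg = HeaderParser().parsestr(metadata_text)
        raw = msg.get("Metadata-Version", "")
        try:
            major, minor = (int(part) for part in raw.strip().split("."))
        except ValueError:
            # Missing or malformed version: assume the older behaviour.
            return False
        return (major, minor) >= (2, 2)


    new_style = conforms_to_pep_643("Metadata-Version: 2.2\nName: x\nVersion: 1\n")
    old_style = conforms_to_pep_643("Metadata-Version: 2.1\nName: x\nVersion: 1\n")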


Security Implications
=====================

As this specification is purely for the storage of data that is intended
to be publicly available, there are no security implications.


How to Teach This
=================

This is a data storage format for project metadata, and so will not
typically be visible to end users. There is therefore no need to teach
users how to use the format. Developers wanting to reference the
metadata will be able to find the details in the `PyPA Specifications`_.


Rejected Ideas
==============

1. Rather than marking fields as ``Dynamic``, fields should be assumed
   to be dynamic unless explicitly marked as ``Static``.

   This is logically equivalent to the current proposal, but it implies
   that fields being dynamic is the norm. Packaging tools can be much
   more efficient in the presence of metadata that is known to be static,
   so the PEP chooses to make dynamic fields the exception, and require
   backends to "opt in" to making a field dynamic.

   In addition, if dynamic is the default, then in future, as more
   and more metadata becomes static, metadata files will include an
   increasing number of ``Static`` declarations.

2. Rather than having a ``Dynamic`` field, add a special value that
   indicates that a field is "not yet defined".

   Again, this is logically equivalent to the current proposal. It makes
   "being dynamic" an explicit choice, but requires a special value. As
   some fields can contain arbitrary text, choosing such a value is
   somewhat awkward (although likely not a problem in practice). There
   does not seem to be enough benefit to this approach to make it worth
   using instead of the proposed mechanism.

3. Special handling of ``Requires-Python``.

   Early drafts of the PEP needed special discussion of ``Requires-Python``,
   because the lack of environment markers for this field meant that it might
   be difficult to require it to be static. The final form of the PEP no longer
   needs this, as the idea of a whitelist of fields allowed to be dynamic was
   dropped.

4. Restrict the use of ``Dynamic`` to a minimal "whitelist" of
   permitted fields.

   This approach was likely to prove extremely difficult for setuptools
   to implement in a backward compatible way, due to the dynamic nature
   of the setuptools interface. Instead, the proposal now allows most
   fields to be dynamic, but encourages backends to avoid dynamic values
   unless essential.


Open Issues
===========

None

References
==========

.. _Core Metadata Specification: https://packaging.python.org/specifications/core-metadata/
.. _PyPA Specifications: https://packaging.python.org/specifications/

Copyright
=========

This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.