Here is my proposed PEP to drop .pyo files from Python. Thanks to Barry's
work in PEP 3147 this really shouldn't have much impact on users' code
(then again, bytecode files are basically an implementation detail so it
should hardly impact anyone directly).


One thing I would appreciate is if people have more motivation for this.
While the maintainer of importlib in me wants to see this happen, the core
developer in me thinks the arguments are a little weak. So if people can
provide more reasons why this is a good thing that would be appreciated.




PEP: 487
Title: Elimination of PYO files
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon <brett@python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Feb-2015
Post-History:


Abstract
========


This PEP proposes eliminating the concept of PYO files from Python.
To continue the support of the separation of bytecode files based on
their optimization level, this PEP proposes extending the PYC file
name to include the optimization level in the bytecode repository
directory (i.e., the ``__pycache__`` directory).




Rationale
=========


As of today, bytecode files come in two flavours: PYC and PYO. A PYC
file is the bytecode file generated and read from when no
optimization level is specified at interpreter startup (i.e., ``-O``
is not specified). A PYO file represents the bytecode file that is
read/written when **any** optimization level is specified (i.e., when
``-O`` is specified, including ``-OO``). This means that while PYC
files clearly delineate the optimization level used when they were
generated -- namely no optimizations beyond the peepholer -- the same
is not true for PYO files. Put in terms of optimization levels and
the file extension:


   - 0: ``.pyc``
   - 1 (``-O``): ``.pyo``
   - 2 (``-OO``): ``.pyo``


The reuse of the ``.pyo`` file extension for both level 1 and 2
optimizations means that there is no clear way to tell what
optimization level was used to generate the bytecode file. In terms
of reading PYO files, this can lead to an interpreter using a mixture
of optimization levels with its code if the user was not careful to
make sure all PYO files were generated using the same optimization
level (typically done by blindly deleting all PYO files and then
using the ``compileall`` module to compile all-new PYO files [1]_).
This issue is only compounded when people optimize Python code beyond
what the interpreter natively supports, e.g., using the astoptimizer
project [2]_.
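
As a minimal sketch of the ambiguity described above (the helper
``legacy_suffix`` is a hypothetical name used only for illustration, not an
existing API), the current mapping from optimization level to file extension
can be written out so that the collision between levels 1 and 2 is explicit:

```python
def legacy_suffix(optimize):
    """Return the bytecode suffix used today for an optimization level.

    Hypothetical helper: it only mirrors the table above (0 -> .pyc,
    1 and 2 -> .pyo) and is not part of any real module.
    """
    return '.pyc' if optimize == 0 else '.pyo'


suffixes = {level: legacy_suffix(level) for level in (0, 1, 2)}

# Levels 1 and 2 map to the same suffix, so the level that produced a
# given .pyo file cannot be recovered from its name alone.
ambiguous = suffixes[1] == suffixes[2]
print(suffixes, ambiguous)
```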


In terms of writing PYO files, the need to delete all PYO files
every time one either changes the optimization level they want to use
or is unsure of what optimization was used the last time PYO files
were generated leads to unnecessary file churn.


As for distributing bytecode-only modules, having to distribute both
``.pyc`` and ``.pyo`` files is unnecessary for the common use-case
of code obfuscation and smaller file deployments.




Proposal
========


To eliminate the ambiguity that PYO files present, this PEP proposes
eliminating the concept of PYO files and their accompanying ``.pyo``
file extension. To allow for the optimization level to be unambiguous
as well as to avoid having to regenerate optimized bytecode files
needlessly in the ``__pycache__`` directory, the optimization level
used to generate a PYC file will be incorporated into the bytecode
file name. Currently bytecode file names are created by
``importlib.util.cache_from_source()``, approximately using the
following expression defined by PEP 3147 [3]_, [4]_, [5]_::


     '{name}.{cache_tag}.pyc'.format(name=module_name,
                                     cache_tag=sys.implementation.cache_tag)


This PEP proposes to change the expression to::


     '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
             name=module_name,
             cache_tag=sys.implementation.cache_tag,
             optimization=str(sys.flags.optimize))


The "opt-" prefix was chosen so as to provide a visual separator
from the cache tag. The placement of the optimization level after
the cache tag was chosen to preserve lexicographic sort order of
bytecode file names based on module name and cache tag which will
not vary for a single interpreter. The "opt-" prefix was chosen over
"o" so as to be somewhat self-documenting. The "opt-" prefix was
chosen over "O" so as to not have any confusion with "0" while being
so close to the interpreter version number.


A period was chosen over a hyphen as a separator so as to distinguish
clearly that the optimization level is not part of the interpreter
version as specified by the cache tag. It also lends itself to the use
of the period in the file name to delineate semantically different
concepts.


For example, the bytecode file name of ``importlib.cpython-35.pyc``
would become ``importlib.cpython-35.opt-0.pyc``. If ``-OO`` had been
passed to the interpreter then instead of
``importlib.cpython-35.pyo`` the file name would be
``importlib.cpython-35.opt-2.pyc``.
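
The naming scheme above can be sketched as a small function. This is only an
illustration of the proposed expression, not the importlib implementation;
the function name ``bytecode_file_name`` and its parameters are assumptions
made for the example:

```python
import sys


def bytecode_file_name(module_name, optimization=None, cache_tag=None):
    """Build a bytecode file name following the proposed expression.

    Hypothetical helper: when not given, the cache tag and optimization
    level default to those of the running interpreter.
    """
    if cache_tag is None:
        cache_tag = sys.implementation.cache_tag
    if optimization is None:
        optimization = sys.flags.optimize
    return '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
        name=module_name,
        cache_tag=cache_tag,
        optimization=str(optimization))


# Reproduce the two file names used in the example above.
print(bytecode_file_name('importlib', 0, 'cpython-35'))
print(bytecode_file_name('importlib', 2, 'cpython-35'))
```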




Implementation
==============


importlib
---------


As ``importlib.util.cache_from_source()`` is the API that exposes
bytecode file paths as well as being directly used by importlib, it
requires the most critical change. As of Python 3.4, the function's
signature is::


   importlib.util.cache_from_source(path, debug_override=None)


This PEP proposes changing the signature in Python 3.5 to::


   importlib.util.cache_from_source(path, debug_override=None, *,
                                    optimization=None)


The introduced ``optimization`` keyword-only parameter will control
what optimization level is specified in the file name. If the
argument is ``None`` then the current optimization level of the
interpreter will be assumed. Any argument given for ``optimization``
will be passed to ``str()`` and the result must satisfy
``str.isalnum()``, else ``ValueError`` will be raised (this prevents
invalid characters being used in the file name). It is expected that
beyond Python's own 0-2 optimization levels, third-party code will use
a hash of optimization names to specify the optimization level, e.g.
``hashlib.sha256(','.join(['dead code elimination', 'constant
folding']).encode('utf-8')).hexdigest()``.
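
A minimal sketch of the validation rule and of the hashing idea, assuming
the behaviour described in the paragraph above. The names
``optimization_tag`` and ``third_party_level`` are hypothetical and exist
only for this example; note that ``hashlib.sha256()`` needs bytes, hence the
explicit encoding step:

```python
import hashlib


def optimization_tag(optimization):
    """Convert an optimization argument to the text used in the file name.

    Hypothetical helper mirroring the rule above: the value goes through
    str() and must be alphanumeric, otherwise ValueError is raised.
    """
    tag = str(optimization)
    if not tag.isalnum():
        raise ValueError('{!r} is not an alphanumeric value'.format(tag))
    return tag


def third_party_level(optimization_names):
    """Derive a level from a list of optimization names by hashing them."""
    joined = ','.join(optimization_names)
    return hashlib.sha256(joined.encode('utf-8')).hexdigest()


level = third_party_level(['dead code elimination', 'constant folding'])
print(optimization_tag(2))
print(optimization_tag(level))
```

A hex digest contains only the characters 0-9 and a-f, so it always passes
the ``str.isalnum()`` check.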


The ``debug_override`` parameter will be deprecated. As the parameter
expects a boolean, the integer value of the boolean will be used as
if it had been provided as the argument to ``optimization`` (a
``None`` argument will mean the same as for ``optimization``). A
deprecation warning will be raised when ``debug_override`` is given a
value other than ``None``, but there are no plans for the complete
removal of the parameter at this time (but removal will be no later
than Python 4).
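
Read literally, the paragraph above could be sketched as follows. This is
an assumption-laden illustration of the draft wording only (the function
name ``resolve_optimization`` is hypothetical), not a statement of how
importlib behaves:

```python
import sys
import warnings


def resolve_optimization(debug_override=None, optimization=None):
    """Pick the optimization text for the file name, per the draft wording.

    Hypothetical helper: a non-None debug_override triggers a
    DeprecationWarning and its integer value is used as the optimization.
    """
    if debug_override is not None:
        warnings.warn(
            "the debug_override parameter is deprecated; "
            "use 'optimization' instead",
            DeprecationWarning, stacklevel=2)
        optimization = int(bool(debug_override))
    if optimization is None:
        # None means "whatever level the interpreter is running at".
        optimization = sys.flags.optimize
    return str(optimization)
```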


The various module attributes for importlib.machinery which relate to
bytecode file suffixes will be updated [7]_. The
``DEBUG_BYTECODE_SUFFIXES`` and ``OPTIMIZED_BYTECODE_SUFFIXES`` will
both be documented as deprecated and set to the same value as
``BYTECODE_SUFFIXES`` (removal of ``DEBUG_BYTECODE_SUFFIXES`` and
``OPTIMIZED_BYTECODE_SUFFIXES`` is not currently planned, but will be
no later than Python 4).


The various finders and loaders will also be updated as necessary,
but updating the previously mentioned parts of importlib should be all
that is required.




Rest of the standard library
----------------------------


The various functions exposed by the ``py_compile`` and
``compileall`` modules will be updated as necessary to make sure
they follow the new bytecode file name semantics [6]_, [1]_.




Compatibility Considerations
============================


Any code directly manipulating bytecode files from Python 3.2 on
will need to consider the impact of this change on their code (prior
to Python 3.2 -- including all of Python 2 -- there was no
``__pycache__`` which already necessitates bifurcating bytecode file
handling support). If code was setting the ``debug_override``
argument to ``importlib.util.cache_from_source()`` then care will be
needed if they want the path to a bytecode file with an optimization
level of 2. Otherwise only code **not** using
``importlib.util.cache_from_source()`` will need updating.
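
As a sketch of the approach recommended above, code can ask
``importlib.util.cache_from_source()`` for the path instead of assembling it
by hand. The source path below is a made-up example, and the
``optimization`` keyword is only available on interpreters that implement
this proposal, so the call is guarded; the exact file name returned depends
on the interpreter in use:

```python
import importlib.util

# Hypothetical source file; it does not need to exist for this call.
source_path = '/tmp/project/spam.py'

# Path for the optimization level the interpreter is running with.
default_path = importlib.util.cache_from_source(source_path)

# Path for an explicit level, guarded in case the keyword is unsupported.
try:
    level2_path = importlib.util.cache_from_source(source_path,
                                                   optimization=2)
except TypeError:
    level2_path = None

print(default_path)
print(level2_path)
```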


As for people who distribute bytecode-only modules, they will have
to choose which optimization level they want their bytecode files to
be since distributing a ``.pyo`` file with a ``.pyc`` file will no
longer be of any use. Since people typically only distribute bytecode
files for code obfuscation purposes or smaller distribution size
then only having to distribute a single ``.pyc`` should actually be
beneficial to these use-cases.




Rejected Ideas
==============


N/A




Open Issues
===========


Formatting of the optimization level in the file name
-----------------------------------------------------


Using the "opt-" prefix and placing the optimization level between
the cache tag and file extension is not critical. Other options which
were considered are:


* ``importlib.cpython-35.o0.pyc``
* ``importlib.cpython-35.O0.pyc``
* ``importlib.cpython-35.0.pyc``
* ``importlib.cpython-35-O0.pyc``
* ``importlib.O0.cpython-35.pyc``
* ``importlib.o0.cpython-35.pyc``
* ``importlib.0.cpython-35.pyc``


These were initially rejected because they would change the
sort order of bytecode files, could be confused with the cache tag,
or were not self-documenting enough.
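
The sort-order concern can be illustrated with a small sketch. The module
name ``spam`` and the two cache tags are made-up values; the point is only
that placing the optimization level after the cache tag keeps files grouped
by cache tag when sorted, while placing it before the cache tag does not:

```python
cache_tags = ('cpython-35', 'cpython-34')
levels = (2, 0)

# Proposed layout: optimization level after the cache tag.
proposed = sorted('spam.{}.opt-{}.pyc'.format(tag, level)
                  for tag in cache_tags for level in levels)

# One of the alternatives: optimization level before the cache tag.
alternative = sorted('spam.O{}.{}.pyc'.format(level, tag)
                     for tag in cache_tags for level in levels)

# Extract the cache tag from each sorted name to see the grouping.
proposed_tags = [name.split('.')[1] for name in proposed]
alternative_tags = [name.split('.')[2] for name in alternative]

print(proposed_tags)     # grouped by cache tag
print(alternative_tags)  # cache tags interleaved
```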




References
==========


.. [1] The compileall module
    (https://docs.python.org/3/library/compileall.html#module-compileall)


.. [2] The astoptimizer project
    (https://pypi.python.org/pypi/astoptimizer)


.. [3] ``importlib.util.cache_from_source()``
    (https://docs.python.org/3.5/library/importlib.html#importlib.util.cache_from_source)


.. [4] Implementation of ``importlib.util.cache_from_source()`` from
    CPython 3.4.3rc1
    (https://hg.python.org/cpython/file/038297948389/Lib/importlib/_bootstrap.py#l437)


.. [5] PEP 3147, PYC Repository Directories, Warsaw
    (http://www.python.org/dev/peps/pep-3147)


.. [6] The py_compile module
    (https://docs.python.org/3/library/py_compile.html#module-py_compile)


.. [7] The importlib.machinery module
    (https://docs.python.org/3/library/importlib.html#module-importlib.machinery)




Copyright
=========


This document has been placed in the public domain.




..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    coding: utf-8
    End:


  • Guido van Rossum at Feb 27, 2015 at 6:01 pm
    I'm in a good mood today and I think this is a great idea! That's not to
    say that I'm accepting it as-is (I haven't read it fully) but I expect that
    there are very few downsides and it won't break much. (There's of course
    always going to be someone who always uses -O and somehow depends on the
    existence of .pyo files, but they should have seen it coming with
    __pycache__ and the new version-specific extensions. :-)


    --
    --Guido van Rossum (python.org/~guido)
  • Brett Cannon at Feb 27, 2015 at 6:06 pm

    On Fri, Feb 27, 2015 at 1:02 PM Guido van Rossum wrote:


    I'm in a good mood today and I think this is a great idea!

    Does that mean if you were in a bad mood this would be a bad idea? ;)



    That's not to say that I'm accepting it as-is (I haven't read it fully)
    but I expect that there are very few downsides and it won't break much.

    There is a section in the PEP discussing backwards-compatibility. Basically
    the potential breakage seems fairly minimal to me.



    (There's of course always going to be someone who always uses -O and
    somehow depends on the existence of .pyo files, but they should have seen
    it coming with __pycache__ and the new version-specific extensions. :-)

    Yep! PEP 3147 makes this much easier to do without breaking the world.


    -Brett



    On Fri, Feb 27, 2015 at 9:06 AM, Brett Cannon wrote:

    Here is my proposed PEP to drop .pyo files from Python. Thanks to Barry's
    work in PEP 3147 this really shouldn't have much impact on user's code
    (then again, bytecode files are basically an implementation detail so it
    shouldn't impact hardly anyone directly).

    One thing I would appreciate is if people have more motivation for this.
    While the maintainer of importlib in me wants to see this happen, the core
    developer in me thinks the arguments are a little weak. So if people can
    provide more reasons why this is a good thing that would be appreciated.


    PEP: 487
    Title: Elimination of PYO files
    Version: $Revision$
    Last-Modified: $Date$
    Author: Brett Cannon <brett@python.org>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 20-Feb-2015
    Post-History:

    Abstract
    ========

    This PEP proposes eliminating the concept of PYO files from Python.
    To continue the support of the separation of bytecode files based on
    their optimization level, this PEP proposes extending the PYC file
    name to include the optimization level in bytecode repository
    directory (i.e., the ``__pycache__`` directory).


    Rationale
    =========

    As of today, bytecode files come in two flavours: PYC and PYO. A PYC
    file is the bytecode file generated and read from when no
    optimization level is specified at interpreter startup (i.e., ``-O``
    is not specified). A PYO file represents the bytecode file that is
    read/written when **any** optimization level is specified (i.e., when
    ``-O`` is specified, including ``-OO``). This means that while PYC
    files clearly delineate the optimization level used when they were
    generated -- namely no optimizations beyond the peepholer -- the same
    is not true for PYO files. Put in terms of optimization levels and
    the file extension:

    - 0: ``.pyc``
    - 1 (``-O``): ``.pyo``
    - 2 (``-OO``): ``.pyo``

    The reuse of the ``.pyo`` file extension for both level 1 and 2
    optimizations means that there is no clear way to tell what
    optimization level was used to generate the bytecode file. In terms
    of reading PYO files, this can lead to an interpreter using a mixture
    of optimization levels with its code if the user was not careful to
    make sure all PYO files were generated using the same optimization
    level (typically done by blindly deleting all PYO files and then
    using the `compileall` module to compile all-new PYO files [1]_).
    This issue is only compounded when people optimize Python code beyond
    what the interpreter natively supports, e.g., using the astoptimizer
    project [2]_.

    In terms of writing PYO files, the need to delete all PYO files
    every time one either changes the optimization level they want to use
    or are unsure of what optimization was used the last time PYO files
    were generated leads to unnecessary file churn.

    As for distributing bytecode-only modules, having to distribute both
    ``.pyc`` and ``.pyo`` files is unnecessary for the common use-case
    of code obfuscation and smaller file deployments.


    Proposal
    ========

    To eliminate the ambiguity that PYO files present, this PEP proposes
    eliminating the concept of PYO files and their accompanying ``.pyo``
    file extension. To allow for the optimization level to be unambiguous
    as well as to avoid having to regenerate optimized bytecode files
    needlessly in the `__pycache__` directory, the optimization level
    used to generate a PYC file will be incorporated into the bytecode
    file name. Currently bytecode file names are created by
    ``importlib.util.cache_from_source()``, approximately using the
    following expression defined by PEP 3147 [3]_, [4]_, [5]_::

    '{name}.{cache_tag}.pyc'.format(name=module_name,

    cache_tag=sys.implementation.cache_tag)

    This PEP proposes to change the expression to::

    '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
    name=module_name,
    cache_tag=sys.implementation.cache_tag,
    optimization=str(sys.flags.optimize))

    The "opt-" prefix was chosen so as to provide a visual separator
    from the cache tag. The placement of the optimization level after
    the cache tag was chosen to preserve lexicographic sort order of
    bytecode file names based on module name and cache tag which will
    not vary for a single interpreter. The "opt-" prefix was chosen over
    "o" so as to be somewhat self-documenting. The "opt-" prefix was
    chosen over "O" so as to not have any confusion with "0" while being
    so close to the interpreter version number.

    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends to the use of
    the period in the file name to delineate semantically different
    concepts.

    For example, the bytecode file name of ``importlib.cpython-35.pyc``
    would become ``importlib.cpython-35.opt-0.pyc``. If ``-OO`` had been
    passed to the interpreter then instead of
    ``importlib.cpython-35.pyo`` the file name would be
    ``importlib.cpython-35.opt-2.pyc``.


    Implementation
    ==============

    importlib
    ---------

    As ``importlib.util.cache_from_source()`` is the API that exposes
    bytecode file paths as while as being directly used by importlib, it
    requires the most critical change. As of Python 3.4, the function's
    signature is::

    importlib.util.cache_from_source(path, debug_override=None)

    This PEP proposes changing the signature in Python 3.5 to::

    importlib.util.cache_from_source(path, debug_override=None, *,
    optimization=None)

    The introduced ``optimization`` keyword-only parameter will control
    what optimization level is specified in the file name. If the
    argument is ``None`` then the current optimization level of the
    interpreter will be assumed. Any argument given for ``optimization``
    will be passed to ``str()`` and must have ``str.isalnum()`` be true,
    else ``ValueError`` will be raised (this prevents invalid characters
    being used in the file name). It is expected that beyond Python's own
    0-2 optimization levels, third-party code will use a hash of
    optimization names to specify the optimization level, e.g.
    ``hashlib.sha256(','.join(['dead code elimination', 'constant
    folding'])).hexdigest()``.

    The ``debug_override`` parameter will be deprecated. As the parameter
    expects a boolean, the integer value of the boolean will be used as
    if it had been provided as the argument to ``optimization`` (a
    ``None`` argument will mean the same as for ``optimization``). A
    deprecation warning will be raised when ``debug_override`` is given a
    value other than ``None``, but there are no plans for the complete
    removal of the parameter as this time (but removal will be no later
    than Python 4).

    The various module attributes for importlib.machinery which relate to
    bytecode file suffixes will be updated [7]_. The
    ``DEBUG_BYTECODE_SUFFIXES`` and ``OPTIMIZED_BYTECODE_SUFFIXES`` will
    both be documented as deprecated and set to the same value as
    ``BYTECODE_SUFFIXES`` (removal of ``DEBUG_BYTECODE_SUFFIXES`` and
    ``OPTIMIZED_BYTECODE_SUFFIXES`` is not currently planned, but will be
    not later than Python 4).

    All various finders and loaders will also be updated as necessary,
    but updating the previous mentioned parts of importlib should be all
    that is required.


    Rest of the standard library
    ----------------------------

    The various functions exposed by the ``py_compile`` and
    ``compileall`` functions will be updated as necessary to make sure
    they follow the new bytecode file name semantics [6]_, [1]_.


    Compatibility Considerations
    ============================

    Any code directly manipulating bytecode files from Python 3.2 on
    will need to consider the impact of this change on their code (prior
    to Python 3.2 -- including all of Python 2 -- there was no
    __pycache__ which already necessitates bifurcating bytecode file
    handling support). If code was setting the ``debug_override``
    argument to ``importlib.util.cache_from_source()`` then care will be
    needed if they want the path to a bytecode file with an optimization
    level of 2. Otherwise only code **not** using
    ``importlib.util.cache_from_source()`` will need updating.

    As for people who distribute bytecode-only modules, they will have
    to choose which optimization level they want their bytecode files to
    be since distributing a ``.pyo`` file with a ``.pyc`` file will no
    longer be of any use. Since people typically distribute bytecode
    files only for code obfuscation or a smaller distribution size,
    having to distribute just a single ``.pyc`` file should actually
    benefit these use cases.


    Rejected Ideas
    ==============

    N/A


    Open Issues
    ===========

    Formatting of the optimization level in the file name
    -----------------------------------------------------

    Using the "opt-" prefix and placing the optimization level between
    the cache tag and file extension is not critical. Other options which
    were considered are:

    * ``importlib.cpython-35.o0.pyc``
    * ``importlib.cpython-35.O0.pyc``
    * ``importlib.cpython-35.0.pyc``
    * ``importlib.cpython-35-O0.pyc``
    * ``importlib.O0.cpython-35.pyc``
    * ``importlib.o0.cpython-35.pyc``
    * ``importlib.0.cpython-35.pyc``

    These were initially rejected because they would change the sort
    order of bytecode files, could be ambiguous with the cache tag, or
    were not self-documenting enough.
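
The sort-order argument can be sketched as follows. Both helper functions are hypothetical and exist only to build strings in the two layouts being compared; the second cache tag, `cpython-36`, is made up for illustration.

```python
def bytecode_name(module, cache_tag, optimization):
    """Hypothetical helper: name in the draft's proposed "opt-" format."""
    return "{}.{}.opt-{}.pyc".format(module, cache_tag, optimization)


def level_first_name(module, cache_tag, optimization):
    """One of the listed alternatives, with the level before the cache tag."""
    return "{}.O{}.{}.pyc".format(module, optimization, cache_tag)


combos = [(tag, level)
          for tag in ("cpython-35", "cpython-36")
          for level in (0, 1)]

# With the level after the cache tag, files for one interpreter sort together.
proposed = sorted(bytecode_name("importlib", t, lvl) for t, lvl in combos)
# With the level first, files for different interpreters interleave.
alternative = sorted(level_first_name("importlib", t, lvl) for t, lvl in combos)

for name in proposed + alternative:
    print(name)
```

In the proposed layout the sorted names group by cache tag, while the level-first alternative alternates between the two tags.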


    References
    ==========

    .. [1] The compileall module
    (https://docs.python.org/3/library/compileall.html#module-compileall)

    .. [2] The astoptimizer project
    (https://pypi.python.org/pypi/astoptimizer)

    .. [3] ``importlib.util.cache_from_source()``
    (
    https://docs.python.org/3.5/library/importlib.html#importlib.util.cache_from_source
    )

    .. [4] Implementation of ``importlib.util.cache_from_source()`` from
    CPython 3.4.3rc1
    (
    https://hg.python.org/cpython/file/038297948389/Lib/importlib/_bootstrap.py#l437
    )

    .. [5] PEP 3147, PYC Repository Directories, Warsaw
    (http://www.python.org/dev/peps/pep-3147)

    .. [6] The py_compile module
    (https://docs.python.org/3/library/py_compile.html#module-py_compile)

    .. [7] The importlib.machinery module
    (
    https://docs.python.org/3/library/importlib.html#module-importlib.machinery
    )


    Copyright
    =========

    This document has been placed in the public domain.


    ..
    Local Variables:
    mode: indented-text
    indent-tabs-mode: nil
    sentence-end-double-space: t
    fill-column: 70
    coding: utf-8
    End:


    _______________________________________________
    Import-SIG mailing list
    Import-SIG at python.org
    https://mail.python.org/mailman/listinfo/import-sig

    --
    --Guido van Rossum (python.org/~guido)
  • Ethan Furman at Feb 27, 2015 at 6:12 pm

    On 02/27/2015 09:06 AM, Brett Cannon wrote:


    PEP: 487

    +1. Great idea.


    --
    ~Ethan~


  • Barry Warsaw at Feb 27, 2015 at 6:28 pm
    This looks great Brett, thanks for pushing it forward. I think it's a
    perfectly natural and consistent extension to PEP 3147.


    Some comments inlined.


    On Feb 27, 2015, at 05:06 PM, Brett Cannon wrote:

    Rationale
    =========

    - 0: ``.pyc``
    - 1 (``-O``): ``.pyo``
    - 2 (``-OO``): ``.pyo``

    This is all the rationale I need. :)

    The "opt-" prefix was chosen so as to provide a visual separator
    from the cache tag. The placement of the optimization level after
    the cache tag was chosen to preserve lexicographic sort order of
    bytecode file names based on module name and cache tag which will
    not vary for a single interpreter. The "opt-" prefix was chosen over
    "o" so as to be somewhat self-documenting. The "opt-" prefix was
    chosen over "O" so as to not have any confusion with "0" while being
    so close to the interpreter version number.

    I get it, and the examples you include in the open questions are helpful, but I
    still don't like "opt-". We'll no doubt bikeshed on this until Guido
    decides but looking at the examples below I'd be okay with 'O<level>'. Did
    you consider 'opt<level>', e.g. importlib.cpython-35.opt0.pyc ?

    Compatibility Considerations
    ============================

    Just as PEP 3147 had to make backward compatibility concessions to .pyc files
    living outside __pycache__ (which I think is still supported, right?) I think
    you'll have to do the same for traditional .pyo files, at least for Python
    3.5. You won't have to *write* such files, but if they exist and the
    corresponding optimization level pyc file isn't present in __pycache__, you'll
    have to load them.


    It might in fact make sense to add some language to this PEP saying that in
    Python 3.6, support for old-style .pyc and .pyo files will be removed.


    Cheers,
    -Barry
  • Ethan Furman at Feb 27, 2015 at 6:36 pm

    On 02/27/2015 10:28 AM, Barry Warsaw wrote:
    from the PEP:
    The "opt-" prefix was chosen so as to provide a visual separator
    from the cache tag. The placement of the optimization level after
    the cache tag was chosen to preserve lexicographic sort order of
    bytecode file names based on module name and cache tag which will
    not vary for a single interpreter. The "opt-" prefix was chosen over
    "o" so as to be somewhat self-documenting. The "opt-" prefix was
    chosen over "O" so as to not have any confusion with "0" while being
    so close to the interpreter version number.
    I get it, and the examples you include in the open questions are helpful, but I
    still don't like "opt-". We'll no doubt bikeshed on this until Guido
    decides but looking at the examples below I'd be okay with 'O<level>'. Did
    you consider 'opt<level>', e.g. importlib.cpython-35.opt0.pyc ?

    I can live with either '.opt-N.' or just '.optN.' but all the others I thought were horrid.



    Compatibility Considerations
    ============================
    Just as PEP 3147 had to make backward compatibility concessions to .pyc files
    living outside __pycache__ (which I think is still supported, right?) I think
    you'll have to do the same for traditional .pyo files, at least for Python
    3.5. You won't have to *write* such files, but if they exist and the
    corresponding optimization level pyc file isn't present in __pycache__, you'll
    have to load them.

    It might in fact make sense to add some language to this PEP saying that in
    Python 3.6, support for old-style .pyc and .pyo files will be removed.

    +1


    --
    ~Ethan~


  • Brett Cannon at Feb 27, 2015 at 7:26 pm

    On Fri, Feb 27, 2015 at 1:28 PM Barry Warsaw wrote:


    This looks great Brett, thanks for pushing it forward. I think it's a
    perfectly natural and consistent extension to PEP 3147.

    Some comments inlined.
    On Feb 27, 2015, at 05:06 PM, Brett Cannon wrote:

    Rationale
    =========

    - 0: ``.pyc``
    - 1 (``-O``): ``.pyo``
    - 2 (``-OO``): ``.pyo``
    This is all the rationale I need. :)
    The "opt-" prefix was chosen so as to provide a visual separator
    from the cache tag. The placement of the optimization level after
    the cache tag was chosen to preserve lexicographic sort order of
    bytecode file names based on module name and cache tag which will
    not vary for a single interpreter. The "opt-" prefix was chosen over
    "o" so as to be somewhat self-documenting. The "opt-" prefix was
    chosen over "O" so as to not have any confusion with "0" while being
    so close to the interpreter version number.
    I get it, and the examples you include in the open questions are helpful, but I
    still don't like "opt-". We'll no doubt bikeshed on this until Guido
    decides but looking at the examples below I'd be okay with 'O<level>'. Did
    you consider 'opt<level>', e.g. importlib.cpython-35.opt0.pyc ?

    Nope, and I'll think about it and at least add it as a possibility.



    Compatibility Considerations
    ============================
    Just as PEP 3147 had to make backward compatibility concessions to .pyc
    files
    living outside __pycache__ (which I think is still supported, right?)



    Unfortunately yes.



    I think
    you'll have to do the same for traditional .pyo files, at least for Python
    3.5. You won't have to *write* such files, but if they exist and the
    corresponding optimization level pyc file isn't present in __pycache__,
    you'll
    have to load them.

    It might in fact make sense to add some language to this PEP saying that in
    Python 3.6, support for old-style .pyc and .pyo files will be removed.

    Ah, but you see the magic number changed in Python 3.5 for matrix
    multiplication, so pre-existing .pyo files won't even load, so they will
    have to be regenerated regardless. I will mention that in the PEP.
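
To make the magic-number point concrete, here is a small sketch of the kind of check that makes stale bytecode unusable. It relies on `importlib.util.MAGIC_NUMBER`, which exists in the standard library; the helper name `has_current_magic` and the two fake headers are made up for illustration, and this is not a description of how the import system itself performs the check.

```python
import importlib.util


def has_current_magic(pyc_bytes):
    """Return True if the data starts with this interpreter's magic number."""
    magic = importlib.util.MAGIC_NUMBER
    return pyc_bytes[:len(magic)] == magic


# Fake headers: only the leading magic bytes matter for this check.
current_header = importlib.util.MAGIC_NUMBER + b"\x00" * 12
stale_header = b"\x00\x00\r\n" + b"\x00" * 12

print(has_current_magic(current_header))
print(has_current_magic(stale_header))
```

A file written by an older interpreter fails this comparison regardless of its extension, which is why it has to be regenerated.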
  • Donald Stufft at Feb 27, 2015 at 7:28 pm

    On Feb 27, 2015, at 2:26 PM, Brett Cannon wrote:



    Ah, but you see the magic number changed in Python 3.5 for matrix multiplication, so pre-existing .pyo files won't even load, so they will have to be regenerated regardless. I will mention that in the PEP.

    Some people ship .pyc only code, do people also ship .pyo only code?


    ---
    Donald Stufft
    PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA


  • Brett Cannon at Feb 27, 2015 at 7:40 pm

    On Fri, Feb 27, 2015 at 2:28 PM Donald Stufft wrote:




    Some people ship .pyc only code, do people also ship .pyo only code?

    Definitely possible as is shipping both .pyc and .pyo files.
  • Eric Snow at Feb 28, 2015 at 12:30 am

    On Fri, Feb 27, 2015 at 12:26 PM, Brett Cannon wrote:
    On Fri, Feb 27, 2015 at 1:28 PM Barry Warsaw wrote:
    I get it, and the examples you include in the open questions are helpful, but I
    still don't like "opt-". We'll no doubt bikeshed on this until Guido
    decides but looking at the examples below I'd be okay with 'O<level>'. Did
    you consider 'opt<level>', e.g. importlib.cpython-35.opt0.pyc ?

    Nope, and I'll think about it and at least add it as a possibility.

    Keep in mind that the optimization "level" isn't constrained to just digits:


    importlib.cpython-35.opt-b01603b27537a88c593d429923081a813f66eaef7360a5040507b90e85d285b0.pyc


    vs.


    importlib.cpython-35.optb01603b27537a88c593d429923081a813f66eaef7360a5040507b90e85d285b0.pyc


    I think the hyphen helps in that case.


    -eric
  • Eric Snow at Feb 28, 2015 at 12:30 am

    On Fri, Feb 27, 2015 at 5:30 PM, Eric Snow wrote:
    importlib.cpython-35.opt-b01603b27537a88c593d429923081a813f66eaef7360a5040507b90e85d285b0.pyc

    vs.

    importlib.cpython-35.optb01603b27537a88c593d429923081a813f66eaef7360a5040507b90e85d285b0.pyc

    BTW, that hash comes from the hashlib example in the PEP. :)


    -eric
  • Nick Coghlan at Feb 28, 2015 at 4:50 pm

    On 28 February 2015 at 03:06, Brett Cannon wrote:
    Here is my proposed PEP to drop .pyo files from Python. Thanks to Barry's
    work in PEP 3147 this really shouldn't have much impact on user's code (then
    again, bytecode files are basically an implementation detail so it shouldn't
    impact hardly anyone directly).

    Some specific technical questions/suggestions:


    * Can we make "opt-0" implied so normal pyc file names don't change at all?


    * I'd like to see a description of the impact on compileall (which may
    be "no impact", but I'd like the PEP to explicitly say that if so)

    One thing I would appreciate is if people have more motivation for this.
    While the maintainer of importlib in me wants to see this happen, the core
    developer in me thinks the arguments are a little weak. So if people can
    provide more reasons why this is a good thing that would be appreciated.

    For that aspect, I'd suggest pitching the PEP as aiming primarily at
    separating the two optimisation levels (so stripped PYO files don't
    overwrite normal ones) and then simply eliminating the pyo extension
    entirely as being redundant since the new mechanism will also make it
    possible to distinguish optimised files from unoptimised ones.


    The first is the user facing benefit of the change (e.g. it lets us
    precompile all three levels in distro packages), while the latter is
    just a nice import maintainer facing side-effect.


    This perspective would likely be further strengthened if the "opt-0"
    case were taken as the implied default rather than being explicit in
    the filename.


    Regards,
    Nick.


    --
    Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
  • Antoine Pitrou at Feb 28, 2015 at 4:57 pm

    On Fri, 27 Feb 2015 17:06:59 +0000 Brett Cannon wrote:

    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends to the use of
    the period in the file name to delineate semantically different
    concepts.

    Indeed but why would other implementations have to mimic CPython here?
    Perhaps the whole idea of differing "optimization" levels doesn't make
    sense for them.


    Regards


    Antoine.
  • Nick Coghlan at Feb 28, 2015 at 5:13 pm

    On 1 March 2015 at 02:57, Antoine Pitrou wrote:
    On Fri, 27 Feb 2015 17:06:59 +0000
    Brett Cannon wrote:
    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends to the use of
    the period in the file name to delineate semantically different
    concepts.
    Indeed but why would other implementations have to mimic CPython here?
    Perhaps the whole idea of differing "optimization" levels doesn't make
    sense for them.

    Could Numba potentially use it for JIT priming?


    (I'd ask for PyPy as well, but I don't know if we have any PyPy devs
    on the import-sig list)


    Cheers,
    Nick.


    --
    Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
  • Antoine Pitrou at Feb 28, 2015 at 5:16 pm

    On Sun, 1 Mar 2015 03:13:20 +1000 Nick Coghlan wrote:
    On 1 March 2015 at 02:57, Antoine Pitrou wrote:
    On Fri, 27 Feb 2015 17:06:59 +0000
    Brett Cannon wrote:
    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends to the use of
    the period in the file name to delineate semantically different
    concepts.
    Indeed but why would other implementations have to mimic CPython here?
    Perhaps the whole idea of differing "optimization" levels doesn't make
    sense for them.
    Could Numba potentially use it for JIT priming?

    We'll probably want something like that one day, but we wouldn't
    necessarily use the same file structure - Numba currently works at the
    function level, not at the module level.


    In other words, the PEP is entirely neutral for us.


    Regards


    Antoine.
  • Brett Cannon at Feb 28, 2015 at 9:16 pm

    On Sat, Feb 28, 2015 at 11:57 AM Antoine Pitrou wrote:


    On Fri, 27 Feb 2015 17:06:59 +0000
    Brett Cannon wrote:
    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends to the use of
    the period in the file name to delineate semantically different
    concepts.
    Indeed but why would other implementations have to mimic CPython here?
    Perhaps the whole idea of differing "optimization" levels doesn't make
    sense for them.

    Directly it might not, but if they support the AST module along with
    passing AST nodes to compile() then they would implicitly support
    optimizations for bytecode through custom loaders.


    I also checked PyPy and IronPython 3 and they both support -O.


    But an implementation that chose to skip the ast module and not support -O
    is the best argument to support Nick's ask to not specify the optimization
    if it is 0 (although I'm not saying that's enough to sway me to change the
    PEP).
  • Nick Coghlan at Mar 1, 2015 at 12:48 am

    On 1 Mar 2015 07:16, "Brett Cannon" wrote:

    On Sat, Feb 28, 2015 at 11:57 AM Antoine Pitrou wrote:

    On Fri, 27 Feb 2015 17:06:59 +0000
    Brett Cannon wrote:
    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends to the use of
    the period in the file name to delineate semantically different
    concepts.
    Indeed but why would other implementations have to mimic CPython here?
    Perhaps the whole idea of differing "optimization" levels doesn't make
    sense for them.

    Directly it might not, but if they support the AST module along with
    passing AST nodes to compile() then they would implicitly support
    optimizations for bytecode through custom loaders.
    I also checked PyPy and IronPython 3 and they both support -O.

    But an implementation that chose to skip the ast module and not support
    -O is the best argument to support Nick's ask to not specify the
    optimization if it is 0 (although I'm not saying that's enough to sway me
    to change the PEP).


    I was only +0 on that particular idea myself, so I agree it's better to
    keep things consistent. However, the PEP should explicitly define what
    happens if the empty string (rather than None) is passed in. Since we need
    to define a standard way of handling that anyway, it could be a reasonable
    API for suppressing the new name segment entirely (even if CPython doesn't
    make use of it outside the test suite).
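
A quick way to see the behaviour being suggested, assuming an interpreter (Python 3.5 or later) whose `importlib.util.cache_from_source()` accepts the keyword-only `optimization` argument and treats the empty string as "no optimization segment". That matches my understanding of released CPython, but it is not something this thread settles, so no expected output is shown. The file name `spam.py` is a placeholder.

```python
import importlib.util
import os

# Empty string: ask for a name with no optimization segment at all.
plain = os.path.basename(
    importlib.util.cache_from_source("spam.py", optimization=""))

# Explicit level: the segment is included in the name.
level_two = os.path.basename(
    importlib.util.cache_from_source("spam.py", optimization=2))

print(plain)
print(level_two)
```

Comparing the two printed names shows whether the empty string suppresses the extra segment on a given interpreter.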


    Cheers,
    Nick.

  • Brett Cannon at Mar 1, 2015 at 4:02 pm
    On Sat, Feb 28, 2015 at 7:48 PM Nick Coghlan wrote:

    On 1 Mar 2015 07:16, "Brett Cannon" wrote:


    On Sat, Feb 28, 2015 at 11:57 AM Antoine Pitrou wrote:

    On Fri, 27 Feb 2015 17:06:59 +0000
    Brett Cannon wrote:
    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends to the use of
    the period in the file name to delineate semantically different
    concepts.
    Indeed but why would other implementations have to mimic CPython here?
    Perhaps the whole idea of differing "optimization" levels doesn't make
    sense for them.

    Directly it might not, but if they support the AST module along with
    passing AST nodes to compile() then they would implicitly support
    optimizations for bytecode through custom loaders.
    I also checked PyPy and IronPython 3 and they both support -O.

    But an implementation that chose to skip the ast module and not support
    -O is the best argument to support Nick's ask to not specify the
    optimization if it is 0 (although I'm not saying that's enough to sway me
    to change the PEP).

    I was only +0 on that particular idea myself, so I agree it's better to
    keep things consistent. However, the PEP should explicitly define what
    happens if the empty string (rather than None) is passed in. Since we need
    to define a standard way of handling that anyway, it could be a reasonable
    API for suppressing the new name segment entirely (even if CPython doesn't
    make use of it outside the test suite).

    Fair enough. It also provides a way to get to the old file name if it's
    desirable for some reason.


    I still have the option in the Open Issues section to see what it brings up
    in further discussions.
  • Brett Cannon at Feb 28, 2015 at 9:08 pm

    On Sat, Feb 28, 2015 at 11:50 AM Nick Coghlan wrote:

    On 28 February 2015 at 03:06, Brett Cannon wrote:
    Here is my proposed PEP to drop .pyo files from Python. Thanks to Barry's
    work in PEP 3147 this really shouldn't have much impact on user's code (then
    again, bytecode files are basically an implementation detail so it shouldn't
    impact hardly anyone directly).
    Some specific technical questions/suggestions:

    * Can we make "opt-0" implied so normal pyc file names don't change at all?

    Sure, but why specifically? EIBTI makes me not want to have some optional
    bit in the file name just to make life a little easier for someone who
    didn't use cache_from_source().



    * I'd like to see a description of the impact on compileall (which may
    be "no impact", but I'd like the PEP to explicitly say that if so)

    Are you talking about the command-line interface? If so then no, it makes
    no special difference beyond the fact that .pyo files won't be put in the
    legacy locations if you run the interpreter with -O and -OO.



    One thing I would appreciate is if people have more motivation for this.
    While the maintainer of importlib in me wants to see this happen, the core
    developer in me thinks the arguments are a little weak. So if people can
    provide more reasons why this is a good thing that would be appreciated.
    For that aspect, I'd suggest pitching the PEP as aiming primarily at
    separating the two optimisation levels (so stripped PYO files don't
    overwrite normal ones) and then simply eliminating the pyo extension
    entirely as being redundant since the new mechanism will also make it
    possible to distinguish optimised files from unoptimised ones.

    The first is the user facing benefit of the change (e.g. it lets us
    precompile all three levels in distro packages), while the latter is
    just a nice import maintainer facing side-effect.

    I'll add a sentence mentioning it allows all optimization levels to be
    compiled and available at once.



    This perspective would likely be further strengthened if the "opt-0"
    case were taken as the implied default rather than being explicit in
    the filename.

    Is that really so important? When was the last time you looked in a
    __pycache__ directory?
  • Barry Warsaw at Mar 2, 2015 at 10:38 pm
    On Feb 28, 2015, at 09:08 PM, Brett Cannon wrote:

    On Sat, Feb 28, 2015 at 11:50 AM Nick Coghlan wrote:
    * Can we make "opt-0" implied so normal pyc file names don't change at all?
    Sure, but why specifically? EIBTI makes me not want to have some optional
    bit in the file name just to make life a little easier for someone who
    didn't use cache_from_source().

    I'd rather like opt-0 to be implied too, just because I think it will be the
    common case and it's less clutter, but I could be convinced that for
    consistency, opt-0 should be explicit.


    Just like with old .pyo files, you'll still have to support *loading* implicit
    opt-0 __pycache__ .pyc files. Even if the bytecode has to be regenerated for
    Python 3.5, you can't guarantee what tool will be generating it. So for
    backward compatibility with third party tools, I think you still have to
    support loading the old file names for 3.5, but only if the new name doesn't
    exist.
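
The lookup order described here could be sketched like this. The function `pick_bytecode_path` and the two file names are hypothetical; the sketch only shows the "new name first, old name as fallback" rule and says nothing about how the real finders would implement it.

```python
import os
import tempfile


def pick_bytecode_path(new_style, legacy):
    """Hypothetical lookup order: prefer the new name, fall back to the old."""
    if os.path.exists(new_style):
        return new_style
    if os.path.exists(legacy):
        return legacy
    return None


with tempfile.TemporaryDirectory() as tmp:
    new_name = os.path.join(tmp, "spam.cpython-35.opt-0.pyc")
    old_name = os.path.join(tmp, "spam.cpython-35.pyc")

    # Only the old-style file exists, so it is the one that gets picked.
    open(old_name, "wb").close()
    only_legacy = pick_bytecode_path(new_name, old_name)

    # Once the new-style file exists, it takes priority.
    open(new_name, "wb").close()
    both_present = pick_bytecode_path(new_name, old_name)
```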


    Cheers,
    -Barry


  • Brett Cannon at Mar 1, 2015 at 4:05 pm
    Here is the latest draft. I think the biggest bit is the expanded section
    of the Open Issues with a few more formatting proposals and Nick's
    suggestion to let the common case of no optimizations lead to no level
    being specified in the file name (I also changed the potential PEP # as 487
    got snagged). Otherwise a sentence about getting to generate all
    optimization levels upfront and the empty string suppressing the inclusion
    of the optimization level are the other substantive changes.


    PEP: 488
    Title: Elimination of PYO files
    Version: $Revision$
    Last-Modified: $Date$
    Author: Brett Cannon <brett@python.org>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 20-Feb-2015
    Post-History:


    Abstract
    ========


    This PEP proposes eliminating the concept of PYO files from Python.
    To continue the support of the separation of bytecode files based on
    their optimization level, this PEP proposes extending the PYC file
    name to include the optimization level in the bytecode repository
    directory (i.e., the ``__pycache__`` directory).




    Rationale
    =========


    As of today, bytecode files come in two flavours: PYC and PYO. A PYC
    file is the bytecode file generated and read from when no
    optimization level is specified at interpreter startup (i.e., ``-O``
    is not specified). A PYO file represents the bytecode file that is
    read/written when **any** optimization level is specified (i.e., when
    ``-O`` is specified, including ``-OO``). This means that while PYC
    files clearly delineate the optimization level used when they were
    generated -- namely no optimizations beyond the peepholer -- the same
    is not true for PYO files. Put in terms of optimization levels and
    the file extension:


       - 0: ``.pyc``
       - 1 (``-O``): ``.pyo``
       - 2 (``-OO``): ``.pyo``


    The reuse of the ``.pyo`` file extension for both level 1 and 2
    optimizations means that there is no clear way to tell what
    optimization level was used to generate the bytecode file. In terms
    of reading PYO files, this can lead to an interpreter using a mixture
    of optimization levels with its code if the user was not careful to
    make sure all PYO files were generated using the same optimization
    level (typically done by blindly deleting all PYO files and then
    using the ``compileall`` module to compile all-new PYO files [1]_).
    This issue is only compounded when people optimize Python code beyond
    what the interpreter natively supports, e.g., using the astoptimizer
    project [2]_.
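
    The ambiguity described above can be restated as a small sketch. The
    helper name ``legacy_bytecode_suffix`` is hypothetical and only mirrors
    the level-to-extension list given earlier; it is not part of any
    library.

```python
def legacy_bytecode_suffix(optimize):
    """Return the pre-PEP bytecode suffix for an optimization level.

    Hypothetical helper: it only restates the level-to-extension list.
    """
    if optimize not in (0, 1, 2):
        raise ValueError("unknown optimization level: %r" % (optimize,))
    return ".pyc" if optimize == 0 else ".pyo"


# Levels 1 and 2 collapse onto the same suffix, so the file name alone
# cannot say which level produced a given .pyo file.
suffixes = {level: legacy_bytecode_suffix(level) for level in (0, 1, 2)}
```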


    In terms of writing PYO files, the need to delete all PYO files
    every time one either changes the optimization level they want to use
    or is unsure of what optimization was used the last time PYO files
    were generated leads to unnecessary file churn. The change proposed
    by this PEP also allows for **all** optimization levels to be
    pre-compiled for bytecode files ahead of time, something that is
    currently impossible thanks to the reuse of the ``.pyo`` file
    extension for multiple optimization levels.
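
    As a rough illustration of pre-compiling every level, the sketch below
    compiles one source file at levels 0, 1 and 2. It assumes an
    interpreter that already gives each level its own file name, which is
    what this PEP proposes; under the old scheme levels 1 and 2 would both
    target the same ``.pyo`` file. ``compile_all_levels`` is a
    hypothetical helper, while ``py_compile.compile()`` and its
    ``optimize`` argument are existing standard library API.

```python
import os
import py_compile
import tempfile


def compile_all_levels(source_path, levels=(0, 1, 2)):
    """Compile source_path once per optimization level; return {level: path}."""
    return {
        level: py_compile.compile(source_path, doraise=True, optimize=level)
        for level in levels
    }


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "example.py")
    with open(source, "w") as f:
        f.write("assert True\n")
    compiled = compile_all_levels(source)
    # With a distinct name per level, no compiled file overwrites another.
    distinct = len(set(compiled.values())) == len(compiled)
```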


    As for distributing bytecode-only modules, having to distribute both
    ``.pyc`` and ``.pyo`` files is unnecessary for the common use-case
    of code obfuscation and smaller file deployments.




    Proposal
    ========


    To eliminate the ambiguity that PYO files present, this PEP proposes
    eliminating the concept of PYO files and their accompanying ``.pyo``
    file extension. To allow for the optimization level to be unambiguous
    as well as to avoid having to regenerate optimized bytecode files
    needlessly in the ``__pycache__`` directory, the optimization level
    used to generate a PYC file will be incorporated into the bytecode
    file name. Currently bytecode file names are created by
    ``importlib.util.cache_from_source()``, approximately using the
    following expression defined by PEP 3147 [3]_, [4]_, [5]_::


         '{name}.{cache_tag}.pyc'.format(name=module_name,
                                         cache_tag=sys.implementation.cache_tag)


    This PEP proposes to change the expression to::


         '{name}.{cache_tag}.opt-{optimization}.pyc'.format(
                 name=module_name,
                 cache_tag=sys.implementation.cache_tag,
                 optimization=str(sys.flags.optimize))


    The "opt-" prefix was chosen so as to provide a visual separator
    from the cache tag. The placement of the optimization level after
    the cache tag was chosen to preserve lexicographic sort order of
    bytecode file names based on module name and cache tag which will
    not vary for a single interpreter. The "opt-" prefix was chosen over
    "o" so as to be somewhat self-documenting. The "opt-" prefix was
    chosen over "O" so as to not have any confusion with "0" while being
    so close to the interpreter version number.


    A period was chosen over a hyphen as a separator so as to distinguish
    clearly that the optimization level is not part of the interpreter
    version as specified by the cache tag. It also lends itself to the
    use of the period in the file name to delineate semantically
    different concepts.


    For example, the bytecode file name of ``importlib.cpython-35.pyc``
    would become ``importlib.cpython-35.opt-0.pyc``. If ``-OO`` had been
    passed to the interpreter then instead of
    ``importlib.cpython-35.pyo`` the file name would be
    ``importlib.cpython-35.opt-2.pyc``.
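
    The two file names in the example can be reproduced with a minimal
    sketch of the proposed expression. The function name
    ``proposed_bytecode_name`` is hypothetical; the cache tag is passed in
    explicitly so that the result does not depend on the interpreter
    running the code.

```python
def proposed_bytecode_name(module_name, cache_tag, optimization):
    """Build a bytecode file name following the proposed expression."""
    return "{name}.{cache_tag}.opt-{optimization}.pyc".format(
        name=module_name,
        cache_tag=cache_tag,
        optimization=str(optimization),
    )


no_opt = proposed_bytecode_name("importlib", "cpython-35", 0)
double_o = proposed_bytecode_name("importlib", "cpython-35", 2)
```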




    Implementation
    ==============


    importlib
    ---------


    As ``importlib.util.cache_from_source()`` is the API that exposes
    bytecode file paths as well as being directly used by importlib, it
    requires the most critical change. As of Python 3.4, the function's
    signature is::


       importlib.util.cache_from_source(path, debug_override=None)


    This PEP proposes changing the signature in Python 3.5 to::


       importlib.util.cache_from_source(path, debug_override=None, *,
                                        optimization=None)


    The introduced ``optimization`` keyword-only parameter will control
    what optimization level is specified in the file name. If the
    argument is ``None`` then the current optimization level of the
    interpreter will be assumed. Any argument given for ``optimization``
    will be passed to ``str()`` and must have ``str.isalnum()`` be true,
    else ``ValueError`` will be raised (this prevents invalid characters
    being used in the file name). If the empty string is passed in for
    ``optimization`` then the addition of the optimization will be
    suppressed, reverting to the file name format which predates this
    PEP.
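
    The rules for the ``optimization`` argument can be sketched as
    follows. This is an illustration of the draft text only, not the
    importlib implementation: ``draft_cache_name`` is a hypothetical
    function that works on a bare module name and cache tag instead of a
    path. Note that the empty string has to be checked before the
    ``str.isalnum()`` test, since ``''.isalnum()`` is false.

```python
import sys


def draft_cache_name(module_name, cache_tag, optimization=None):
    """Sketch of the file-name rules described for ``optimization``."""
    if optimization is None:
        # None means "use the interpreter's current optimization level".
        optimization = sys.flags.optimize
    optimization = str(optimization)
    if optimization == "":
        # The empty string suppresses the optimization part entirely.
        return "{}.{}.pyc".format(module_name, cache_tag)
    if not optimization.isalnum():
        # Rejects characters that should not end up in a file name.
        raise ValueError("{!r} is not alphanumeric".format(optimization))
    return "{}.{}.opt-{}.pyc".format(module_name, cache_tag, optimization)
```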


    It is expected that beyond Python's own 0-2 optimization levels,
    third-party code will use a hash of optimization names to specify
    the optimization level, e.g.
    ``hashlib.sha256(','.join(['dead code elimination', 'constant
    folding']).encode()).hexdigest()``.
    While this might lead to long file names, it is assumed that most
    users never look at the contents of the ``__pycache__`` directory
    and so this won't be an issue.
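
    A runnable form of such a hash might look like the sketch below. The
    optimization names are placeholders taken from the example; the only
    addition is encoding the joined string, since ``hashlib`` operates on
    bytes. A hex digest consists solely of digits and the letters a-f, so
    it satisfies the ``str.isalnum()`` requirement.

```python
import hashlib

# Placeholder optimization names; any stable list of strings would do.
optimizations = ["dead code elimination", "constant folding"]

# hashlib works on bytes, so the joined names are encoded first.
level = hashlib.sha256(",".join(optimizations).encode("utf-8")).hexdigest()
```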


    The ``debug_override`` parameter will be deprecated. As the parameter
    expects a boolean, the integer value of the boolean will be used as
    if it had been provided as the argument to ``optimization`` (a
    ``None`` argument will mean the same as for ``optimization``). A
    deprecation warning will be raised when ``debug_override`` is given a
    value other than ``None``, but there are no plans for the complete
    removal of the parameter at this time (but removal will be no later
    than Python 4).


    The various module attributes for ``importlib.machinery`` which relate
    to bytecode file suffixes will be updated [7]_. The
    ``DEBUG_BYTECODE_SUFFIXES`` and ``OPTIMIZED_BYTECODE_SUFFIXES``
    attributes will both be documented as deprecated and set to the same
    value as ``BYTECODE_SUFFIXES`` (removal of
    ``DEBUG_BYTECODE_SUFFIXES`` and ``OPTIMIZED_BYTECODE_SUFFIXES`` is not
    currently planned, but will happen no later than Python 4).


    The various finders and loaders will also be updated as necessary,
    but updating the previously mentioned parts of importlib should be
    all that is required.




    Rest of the standard library
    ----------------------------


    The various functions exposed by the ``py_compile`` and
    ``compileall`` modules will be updated as necessary to make sure
    they follow the new bytecode file name semantics [6]_, [1]_. The CLI
    for the ``compileall`` module will not be directly affected (the
    ``-b`` flag will be implicitly affected, as it will no longer
    generate ``.pyo`` files when ``-O`` is specified).




    Compatibility Considerations
    ============================


    Any code directly manipulating bytecode files from Python 3.2 on
    will need to consider the impact of this change on their code (prior
    to Python 3.2 -- including all of Python 2 -- there was no
    ``__pycache__`` which already necessitates bifurcating bytecode file
    handling support). If code was setting the ``debug_override``
    argument to ``importlib.util.cache_from_source()`` then care will be
    needed if they want the path to a bytecode file with an optimization
    level of 2. Otherwise only code **not** using
    ``importlib.util.cache_from_source()`` will need updating.
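
    For code that currently assembles bytecode paths by hand, a minimal
    sketch of relying on importlib instead is shown below. Both functions
    exist in ``importlib.util`` as of Python 3.4; the source path is a
    made-up example and the file does not need to exist.

```python
import importlib.util
import os

# Made-up source path, used purely for illustration.
source = os.path.join("pkg", "spam.py")

# Ask importlib for the bytecode path rather than assembling it by hand.
bytecode_path = importlib.util.cache_from_source(source)

# source_from_cache() is the inverse mapping, bytecode path to source path.
round_trip = importlib.util.source_from_cache(bytecode_path)
```

    Code written this way picks up whatever file name format the running
    interpreter uses, so it should not need changing when the format does.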


    As for people who distribute bytecode-only modules (i.e., use a
    bytecode file instead of a source file), they will have to choose
    which optimization level they want their bytecode files to be since
    distributing a ``.pyo`` file with a ``.pyc`` file will no longer be
    of any use. Since people typically only distribute bytecode files for
    code obfuscation purposes or smaller distribution size, only having
    to distribute a single ``.pyc`` should actually be beneficial
    to these use-cases. And since the magic number for bytecode files
    changed in Python 3.5 to support PEP 465 there is no need to support
    pre-existing ``.pyo`` files [8]_.




    Rejected Ideas
    ==============


    N/A




    Open Issues
    ===========


    Formatting of the optimization level in the file name
    -----------------------------------------------------


    Using the "opt-" prefix and placing the optimization level between
    the cache tag and file extension is not critical. All options which
    have been considered are:


    * ``importlib.cpython-35.opt-0.pyc``
    * ``importlib.cpython-35.opt0.pyc``
    * ``importlib.cpython-35.o0.pyc``
    * ``importlib.cpython-35.O0.pyc``
    * ``importlib.cpython-35.0.pyc``
    * ``importlib.cpython-35-O0.pyc``
    * ``importlib.O0.cpython-35.pyc``
    * ``importlib.o0.cpython-35.pyc``
    * ``importlib.0.cpython-35.pyc``


    Apart from the first, these were initially rejected because they
    would change the sort order of bytecode files, were potentially
    ambiguous with the cache tag, or were not self-documenting enough.
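
    The sort-order concern can be illustrated with a small sketch. It
    assumes two interpreters with the cache tags ``cpython-34`` and
    ``cpython-35`` sharing one ``__pycache__`` directory, and compares
    the proposed layout with one of the listed alternatives that puts the
    level before the cache tag. ``tag_order`` is a hypothetical helper.

```python
tags = ["cpython-34", "cpython-35"]
levels = [0, 1, 2]

# Level after the cache tag (the proposed layout).
after_tag = sorted(
    "importlib.{}.opt-{}.pyc".format(tag, level)
    for tag in tags for level in levels
)

# Level before the cache tag (one of the listed alternatives).
before_tag = sorted(
    "importlib.O{}.{}.pyc".format(level, tag)
    for tag in tags for level in levels
)


def tag_order(names):
    """Return the cache tag found in each name, in list order."""
    return [next(tag for tag in tags if tag in name) for name in names]
```

    With the level after the cache tag, a sorted listing keeps all files
    for one interpreter together; with the level first, files for the two
    interpreters alternate.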




    Not specifying the optimization level when it is at 0
    -----------------------------------------------------


    It has been suggested that for the common case of optimization
    level 0, the entire part of the file name relating to the
    optimization level be left out. This would allow for file names of
    ``.pyc`` files to go unchanged, potentially leading to fewer
    backwards-compatibility issues.


    It would also allow a potentially redundant bit of information to be
    left out of the file name if an implementation of Python did not
    allow for optimizing bytecode. This would only occur, though, if the
    interpreter didn't support ``-O`` **and** didn't implement the ast
    module, else users could implement their own optimizations.


    Arguments against allowing this are "explicit is better than
    implicit" and "special cases aren't special enough to break the
    rules". There are also currently no Python 3 interpreters that don't
    support ``-O``, so a potential Python 3 implementation which doesn't
    allow bytecode optimization is entirely theoretical at the moment.




    References
    ==========


    .. [1] The compileall module
        (https://docs.python.org/3/library/compileall.html#module-compileall)


    .. [2] The astoptimizer project
        (https://pypi.python.org/pypi/astoptimizer)


    .. [3] ``importlib.util.cache_from_source()``
        (https://docs.python.org/3.5/library/importlib.html#importlib.util.cache_from_source)


    .. [4] Implementation of ``importlib.util.cache_from_source()`` from
        CPython 3.4.3rc1
        (https://hg.python.org/cpython/file/038297948389/Lib/importlib/_bootstrap.py#l437)


    .. [5] PEP 3147, PYC Repository Directories, Warsaw
        (http://www.python.org/dev/peps/pep-3147)


    .. [6] The py_compile module
        (https://docs.python.org/3/library/py_compile.html#module-py_compile)


    .. [7] The importlib.machinery module
        (https://docs.python.org/3/library/importlib.html#module-importlib.machinery)


    .. [8] ``importlib.util.MAGIC_NUMBER``
        (https://docs.python.org/3/library/importlib.html#importlib.util.MAGIC_NUMBER)




    Copyright
    =========


    This document has been placed in the public domain.




    ..
        Local Variables:
        mode: indented-text
        indent-tabs-mode: nil
        sentence-end-double-space: t
        fill-column: 70
        coding: utf-8
        End: