
separate source code and deployment configurations #1082

Closed
zzzeek opened this issue Aug 24, 2022 · 19 comments
Labels
bug Something isn't working configuration

Comments

@zzzeek
Member

zzzeek commented Aug 24, 2022

this is the modern version of #708, where people called for alembic configuration to be in pyproject.toml. it's not appropriate for the whole alembic.ini to be moved to this file, and the actual problem to be solved is that Alembic's configuration model is somewhat wrong, hence we were unable to do #708 as it stands without first fixing configuration.

First let's lay out some of what exists, that is impacted by this:

  1. the alembic.ini file has source code pathing information in it, as well as database URLs to connect to databases, and python logging configurations
  2. alembic.ini is generated by the "alembic init" command that is intended to lay out a working config within a project
  3. the parts of alembic.ini that deal with file paths are consumed by the Config / ScriptDirectory components, which then load up the env.py file.
  4. Then the parts of alembic.ini that connect to databases and do logging are consumed within the env.py file, and users can in fact customize env.py so that logging / database config comes from somewhere else entirely.

So from that, we can summarize Alembic's configuration is in practice at least two separate categories:

  • the first is what I will call source code configuration, which is all the path related stuff: script_location, prepend_sys_path, version_locations. These attributes are consumed by Alembic when any commands are run, and in particular it needs to consume these variables in order for it to start working with the user's environment, which includes the env.py. these variables should ALWAYS be in pyproject.toml.
  • the second is what I will call deployment configuration, which is the live runtime stuff: database URLs, logging configuration. These are things that are configured on a production server in a way that is specific to a particular deployment. these variables should NEVER be in pyproject.toml
  • The "deployment configuration" in alembic.ini has always been implicitly optional. That is, from day one, we said, although not very clearly and sort of by "read the source, Luke" methods, please customize env.py to use your own deployment configuration scheme. That is we even included a "pylons" template that illustrated, "hey the live database URL doesn't have to come from alembic.ini". The use of logging.fileConfig() inside of env.py illustrated, "hey, the logging config doesn't have to come from alembic.ini".

See this is the bug. alembic.ini has two kinds of configuration, and then through vague handwavy means we sort of gave people a way to "move" the deployment configuration out of it, into their own system, which is what people do (at least, developers working in professional deployment situations. beginners, I have no idea what they do).
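As an illustration of "moving" deployment configuration out of alembic.ini, here is a minimal sketch of a helper an env.py could use. The function name and the MYAPP_DATABASE_URL variable are hypothetical; only `config.get_main_option("sqlalchemy.url")` in the trailing comment is what the stock env.py templates already call.

```python
import os
from typing import Mapping, Optional


def resolve_database_url(
    environ: Mapping[str, str] = os.environ,
    ini_url: Optional[str] = None,
) -> str:
    """Prefer the deployment's own setting; fall back to the alembic.ini value."""
    # MYAPP_DATABASE_URL is a hypothetical variable name, not an Alembic feature.
    url = environ.get("MYAPP_DATABASE_URL") or ini_url
    if not url:
        raise RuntimeError("no database URL configured for this deployment")
    return url


# Inside env.py this might be used as follows (config is context.config):
#   url = resolve_database_url(ini_url=config.get_main_option("sqlalchemy.url"))
```

With something like this in place, the sqlalchemy.url line in alembic.ini becomes an optional default rather than the source of truth.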

So given all that, here are more wrinkles:

  • we have an alembic init command that has to generate a working app config. So we can't just say, "hey sure, use pyproject.toml if you want". our init program has to support this, and to that end, supporting just one way would be best, with regards to the automatic generation of a project. Supporting existing deployments with alembic.ini is of course something we'd never change.
  • our Config object refers explicitly to a Python ConfigParser and its API is based on ConfigParser, which is not necessarily compatible with pyproject.toml - there are of course a lot of ways to work around this, but not sure of the details. We would need to at the very least port the notions of get_section_option / get_main_option as well as setters to a configurational model that accommodates the toml format. such as, "main options" are now officially "source code / toml" features and "section options" are "deployment / alembic.ini" features (edit: not exactly. the DB URL is in "main" but is deployment config, and "post_write_hooks" is a section but belongs in pyproject.toml).

So given all that, here are proposals:

  1. this is all major release stuff - alembic 1.9 or greater

  2. existing alembic projects should not be impacted at all - the existing alembic.ini format should always work, no plans to remove it

  3. Config API should be changed to have separate notions of source config and deployment config - I would deprecate all "main_option" / "section_option" language entirely and replace it with "source_config_option" / "deployment_option" (or whatever term). These don't match up exactly.

    • everything in [alembic] except for database url can be pyproject.toml, that is, source_config_option
    • [post_write_hooks] should be pyproject.toml, that is, source_config_section or some concept like that
    • [loggers] and all logging config stays as configparser
    • database.url is in alembic.ini "main", but we classify this as a deployment option
    • --name option, that is, https://alembic.sqlalchemy.org/en/latest/cookbook.html#multiple-environments. I don't know what to do here. --name should probably look in pyproject.toml and then alembic.ini for the named sections using resolution similar to the default resolution.
  4. if Config.file_config remains, it returns the alembic.ini config, but this would not have any data from pyproject.toml inside of it

  5. alembic init will be switched to render into pyproject.toml directly for source code options - that is we remove all the "main" stuff from alembic.ini. I don't want to have to document two styles. note that this includes:

    • generate pyproject.toml if it doesn't exist
    • render our pyproject section into an existing pyproject.toml file
    • detect if our sections are already present in pyproject.toml and don't add if so, emit a message
    • use a mako template for the pyproject section itself, but not the whole .toml file, since we need to render inside an existing file
  6. even in the new way, there is still an alembic.ini file - with default deployment configuration used by env.py. it can be removed if one's env.py does not need it

  7. alembic will consume source code options from pyproject.toml then fall back to alembic.ini
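The fallback order in proposal 7 could look roughly like the sketch below. The function name is hypothetical, the `[tool.alembic]` table name is an assumption, and `pyproject` stands in for already-parsed TOML data; none of this is an existing Alembic API.

```python
from configparser import ConfigParser
from typing import Any, Mapping


def get_source_option(
    name: str,
    pyproject: Mapping[str, Any],
    ini: ConfigParser,
    ini_section: str = "alembic",
    default: Any = None,
) -> Any:
    """Look in [tool.alembic] of pyproject.toml first, then in alembic.ini."""
    tool_alembic = pyproject.get("tool", {}).get("alembic", {})
    if name in tool_alembic:
        return tool_alembic[name]
    if ini.has_option(ini_section, name):
        return ini.get(ini_section, name)
    return default


# Already-parsed pyproject.toml data (a plain dict for this sketch).
pyproject = {"tool": {"alembic": {"script_location": "migrations"}}}

ini = ConfigParser()
ini.read_string(
    "[alembic]\n"
    "script_location = alembic\n"
    "prepend_sys_path = .\n"
)

print(get_source_option("script_location", pyproject, ini))   # migrations
print(get_source_option("prepend_sys_path", pyproject, ini))  # .
```

An existing project with only alembic.ini passes an empty mapping for `pyproject` and sees no change in behavior, which is the point of proposal 2.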

so that's what I have. I personally don't have time to own the effort on this, however whoever is doing it I will have a lot of very detailed review comments etc. because getting this wrong will just create exponentially more work later.

@zzzeek
Member Author

zzzeek commented Aug 24, 2022

looking at how --name works, implementation wise, actually porting the toml reader to read out keys/sections into the ConfigParser() is likely the easiest way to do this. but not totally sure there isn't some format consideration that makes this infeasible

@zzzeek
Member Author

zzzeek commented Aug 24, 2022

hmm nope, I think we have to rip out ConfigParser and rework Config to have its own internal data structure. then we have consumers that can consume both toml and configparser into the internal structure.

 def set_main_option(self, name: str, value: str) -> None:

for things like "list of directories", toml can represent the list directly, so that "str value" is not appropriate.

So we need to port ConfigParser to our own internal solution, for things like named sections. We then need consumers for pyproject and configparser that are separate and based on a schema. Things like the directory splitting we are doing at https://github.com/sqlalchemy/alembic/blob/main/alembic/script/base.py#L166 becomes local to the configparser consumer, because pyproject.toml gives us that directly.
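A rough sketch of that shape, with one internal structure and two format-specific consumers. `SourceConfig` and both consumer functions are hypothetical names, the `[tool.alembic]` table is an assumption, and the space/comma split is only a stand-in for whatever separator rule the configparser side uses.

```python
import re
from configparser import ConfigParser
from dataclasses import dataclass, field
from typing import Any, List, Mapping

try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:  # older interpreters need the tomli backport
    import tomli as tomllib


@dataclass
class SourceConfig:
    """Hypothetical internal structure, independent of any file format."""

    script_location: str
    version_locations: List[str] = field(default_factory=list)


def from_configparser(parser: ConfigParser, section: str = "alembic") -> SourceConfig:
    # The .ini format only has strings, so list splitting lives in this consumer.
    raw = parser.get(section, "version_locations", fallback="")
    return SourceConfig(
        script_location=parser.get(section, "script_location"),
        version_locations=[p for p in re.split(r",\s*|\s+", raw) if p],
    )


def from_toml(data: Mapping[str, Any]) -> SourceConfig:
    # TOML already yields a real list, so no splitting is needed here.
    table = data["tool"]["alembic"]
    return SourceConfig(
        script_location=table["script_location"],
        version_locations=list(table.get("version_locations", [])),
    )


ini = ConfigParser()
ini.read_string(
    "[alembic]\n"
    "script_location = migrations\n"
    "version_locations = migrations/versions other/versions\n"
)

toml_data = tomllib.loads(
    "[tool.alembic]\n"
    'script_location = "migrations"\n'
    'version_locations = ["migrations/versions", "other/versions"]\n'
)

# Both consumers produce the same internal structure.
assert from_configparser(ini) == from_toml(toml_data)
```

The rest of the code would then only ever see `SourceConfig`, so the "str value" typing problem stays confined to the configparser consumer.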

this is a big job

@CaselIT
Member

CaselIT commented Aug 24, 2022

  • use a mako template for the pyproject section itself, but not the whole .toml file, since we need to render inside an existing file

Not sure I would go down this path. I think toml files are written like json files, i.e. the file is read, modifications are made to python objects, then the object is serialized, rewriting the toml file (iirc comments are kept)

@zzzeek
Member Author

zzzeek commented Aug 24, 2022

how do different templates produce different configurations then?

@CaselIT
Member

CaselIT commented Aug 24, 2022

I guess we would have to manually construct the python dict that represents the tool.alembic section of the toml in some way
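One way this might look, as a sketch only: each template would supply its own dict, a tiny renderer turns it into a `[tool.alembic]` table, and the table is appended to the existing file text only if no such table is already present. Both function names are hypothetical, the renderer handles nothing beyond strings and lists of strings with plain bare keys, and appending text (rather than re-serializing the whole file) is a swapped-in choice so that existing comments survive with only the standard library.

```python
import json
from typing import Dict, List, Tuple, Union

try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:  # older interpreters need the tomli backport
    import tomli as tomllib

Value = Union[str, List[str]]


def render_alembic_table(options: Dict[str, Value]) -> str:
    """Render a flat [tool.alembic] table of strings and string lists."""
    lines = ["[tool.alembic]"]
    for key, value in options.items():
        if isinstance(value, str):
            # JSON string quoting is used as TOML basic-string quoting;
            # good enough for ordinary paths, not a general TOML writer.
            rendered = json.dumps(value, ensure_ascii=False)
        else:
            items = ", ".join(json.dumps(v, ensure_ascii=False) for v in value)
            rendered = f"[{items}]"
        lines.append(f"{key} = {rendered}")
    return "\n".join(lines) + "\n"


def add_alembic_table(existing: str, options: Dict[str, Value]) -> Tuple[str, bool]:
    """Append the table unless the pyproject.toml text already has one."""
    parsed = tomllib.loads(existing)
    if "alembic" in parsed.get("tool", {}):
        return existing, False  # already configured: leave the file alone
    prefix = existing
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"
    if prefix:
        prefix += "\n"  # blank line before the new table
    return prefix + render_alembic_table(options), True


existing = '# project metadata\n[project]\nname = "demo"\n'
options = {
    "script_location": "migrations",
    "version_locations": ["migrations/versions"],
}

updated, changed = add_alembic_table(existing, options)
assert changed and updated.startswith("# project metadata")  # comment kept
assert tomllib.loads(updated)["tool"]["alembic"] == options
assert add_alembic_table(updated, options) == (updated, False)  # second run is a no-op
```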

@CaselIT
Member

CaselIT commented Aug 24, 2022

So from that, we can summarize Alembic's configuration is in practice at least two separate categories:

I never thought about this, but I actually agree. In fact, at work we have a different means of configuring the engine and the logging, while all the other "source_config_option" values are taken from alembic.ini

@gbdlin

gbdlin commented Mar 23, 2023

Maybe it is worth changing the configuration templates now, so they no longer create an alembic.ini file that mixes deployment configuration, instead leaving it fully in the env.py?

@Kostiantyn-Salnykov

The ideal solution for me as a SE:

  1. Move logging configuration from alembic.ini to 🐍 runtime (usually a loggers.py where dictConfig is initialized; this is a really dynamic solution, because you can change any logging configuration depending on ENVs). Then set up the alembic logger with all handlers/formatters/filters. (For example, gunicorn provides this possibility via the gunicorn logger name.)
  2. Move out any connections / security stuff from alembic.ini / env.py. The DB URL must be constructed in 🐍 runtime and depend on (os.environ, pydantic.BaseSettings, SSM, SecretsManager, etc.).
  3. Move other configurations (path options, names, hooks) from alembic.ini to pyproject.toml under [tool.alembic] and [tool.alembic.hooks], simply because such options belong in pyproject.toml.
  4. Remove alembic.ini from repository 🙏.
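For step 1, a minimal sketch of what such a loggers.py could contain, using the standard library's `logging.config.dictConfig`. The function name and the MYAPP_LOG_LEVEL variable are hypothetical; "alembic" is the logger name Alembic itself logs under.

```python
import logging
import logging.config
import os


def configure_logging(level: str = "WARNING") -> None:
    """Configure the alembic logger at runtime instead of via alembic.ini."""
    logging.config.dictConfig(
        {
            "version": 1,
            "disable_existing_loggers": False,
            "formatters": {
                "generic": {"format": "%(levelname)-5.5s [%(name)s] %(message)s"},
            },
            "handlers": {
                "console": {
                    "class": "logging.StreamHandler",
                    "formatter": "generic",
                },
            },
            "loggers": {
                "alembic": {
                    "level": level,
                    "handlers": ["console"],
                    "propagate": False,
                },
            },
        }
    )


# MYAPP_LOG_LEVEL is a hypothetical environment variable name.
configure_logging(os.environ.get("MYAPP_LOG_LEVEL", "INFO"))
logging.getLogger("alembic").info("logging configured outside alembic.ini")
```

An env.py set up this way would call `configure_logging()` in place of the `fileConfig(config.config_file_name)` call the default templates generate.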

@Danipulok
Contributor

Any update on this issue?

@CaselIT
Member

CaselIT commented Feb 9, 2024

Nothing outside what's in this issue

@edwardgalligan

To follow up on @Danipulok 's question

Context
@zzzeek seems to be the main @sqlalchemy team member that's contributed (& opened) to this ticket (& generally the main active contributor to this project anyway). Their last update was Aug 24 2022 at which point Alembic was on 1.9.0 using SQLAlchemy 1.3.0. SQLAlchemy 2.0 is now released (not sure how much impact that project has on this one) though Alembic is still on 1.x; I presume this change would warrant a bump to 2.x.

In that context, in those intervening years, is there:

  • any branches / forks spiking this work
  • any changes to the codebase or to the modern Python tooling landscape that reduces the scale of the original assessment of "this is a big job"
  • any concrete roadmap/plans to implement?

I see this commit went in very recently which may or may not relate to this (I haven't quite parsed that addition to see if it's purely related to this repo's package management or intended to affect the end user's repo package pyproject.toml usage).

Or in other words: if someone wanted to contribute some PRs toward this initiative, would they be duplicating any existing efforts?

@CaselIT
Member

CaselIT commented Feb 27, 2025

if someone wanted to contribute some PRs toward this initiative, would they be duplicating any existing efforts?

not as far as we are aware.

Before opening a PR it may make sense to sketch the rough idea here to verify that the implementation is on the right track

@kassett

kassett commented Apr 1, 2025

How do we feel about adoption for people who don't customize alembic too much?

@HideyoshiNakazone

Don't want to be the guy that necrobumps, but I believe this is an important issue: most Python tools are moving toward formalizing their configurations using pyproject.toml, and that seems like the right direction, especially considering the Python Foundation intention as outlined in PEP 621, particularly the section on Allowing tools to add/extend data.

I find the solution proposed by @Kostiantyn-Salnykov to be very reasonable, and I kindly ask @zzzeek to take another look at this issue.

@zzzeek
Member Author

zzzeek commented Apr 25, 2025

Sorry what am I looking at exactly? The steps to implement this are fully written out in my original comment.

@HideyoshiNakazone

Sorry, I think I should have asked whether an implementation of one of the proposed solutions is planned in the project backlog, instead of just asking you to "look into it". My apologies.

If there is nothing planned in the project backlog, do you think contributions would be of interest to the project?

@zzzeek
Member Author

zzzeek commented Apr 26, 2025

I think this is backlog right now. there's no technical challenge to this so much as presenting it correctly, since I don't think it's feasible to add pyproject.toml support without updating all the tutorials / templates to use this style, and at the same time making sure nothing goes wrong with the millions of existing projects out there. if you just want to bolt on pyproject support, you can even do that on your end by building a short commandline wrapper for the one we have that injects pyproject values into the config (though we also have someone who wants to merge a refactor of commandline too).

@zzzeek
Member Author

zzzeek commented May 13, 2025

particularly the section on Allowing tools to add/extend data.

That note seems to be part of the rejected portion of the pep and seems to involve tools being allowed to extend configuration that's not within a [tool] section

@sqla-tester
Collaborator

Mike Bayer has proposed a fix for this issue in the main branch:

WIP: allow pep 621 configuration (only docs so far) https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/5860
