Commit fdca351

Pyproject.toml support for DockerSettings (#3292)
* WIP pyproject toml support
* Docstrings
* Some docs
* Tests
* More tests
* Add link
* Fix install order mistake in docs
* Add support for installing local projects
* Add option to disable automatic requirements detection
* Fix some more conditions
* Local project installation docs
* Small docs and exception updates
* Remove wrong link
* Add enum to check
* Disable automatic detection by default
* Improve docs and log
* Adjust test to new default
1 parent 166e993 commit fdca351

6 files changed: 373 additions, 53 deletions

docs/book/how-to/containerization/containerization.md

Lines changed: 65 additions & 28 deletions
@@ -245,65 +245,102 @@ ZenML offers several ways to specify dependencies for your Docker containers:
 
 ### Python Dependencies
 
+By default, ZenML automatically installs all packages required by your active ZenML stack.
+
+{% hint style="warning" %}
+In future versions, if none of the `replicate_local_python_environment`, `pyproject_path` or `requirements` attributes on `DockerSettings` are specified, ZenML will try to automatically find a `requirements.txt` or `pyproject.toml` file inside your current source root and install packages from the first one it finds. You can disable this behavior by setting `disable_automatic_requirements_detection=True`. If
+you already want this automatic detection in current versions of ZenML, set `disable_automatic_requirements_detection=False`.
+{% endhint %}
+
 1. **Replicate Local Environment**:
+```python
+docker_settings = DockerSettings(replicate_local_python_environment=True)
+
+
+@pipeline(settings={"docker": docker_settings})
+def my_pipeline(...):
+    ...
+```
+
+This will run `pip freeze` to get a list of the installed packages in your local Python environment and will install them in the Docker image. This ensures that the same
+exact dependencies will be installed.
+{% hint style="warning" %}
+This does not work when you have a local project installed. To install local projects, check out the `Install Local Projects` section below.
+{% endhint %}
+2. **Specify a `pyproject.toml` file**:
 
 ```python
-# Use pip freeze (outputs a requirements file with exact package versions)
-from zenml.config import DockerSettings, PythonEnvironmentExportMethod
-docker_settings = DockerSettings(
-    replicate_local_python_environment=PythonEnvironmentExportMethod.PIP_FREEZE
-)
-# Or as a string
-docker_settings = DockerSettings(replicate_local_python_environment="pip_freeze")
-
-# Or use poetry (requires Poetry to be installed)
-docker_settings = DockerSettings(
-    replicate_local_python_environment=PythonEnvironmentExportMethod.POETRY_EXPORT
-)
-# Or as a string
-docker_settings = DockerSettings(replicate_local_python_environment="poetry_export")
-
-# Use custom command (provide a list of command arguments)
-docker_settings = DockerSettings(replicate_local_python_environment=[
-    "poetry", "export", "--extras=train", "--format=requirements.txt"
-])
+docker_settings = DockerSettings(pyproject_path="/path/to/pyproject.toml")
+
+@pipeline(settings={"docker": docker_settings})
+def my_pipeline(...):
+    ...
 ```
 
-This feature allows you to easily replicate your local Python environment in the Docker container, ensuring that your pipeline runs with the same dependencies.
-2. **Specify Requirements Directly**:
+By default, ZenML will try to export the dependencies specified in the `pyproject.toml` by trying to run `uv export` and `poetry export`.
+If neither of these commands works for your `pyproject.toml` file or you want to customize the command (for example to install certain
+extras), you can specify a custom command using the `pyproject_export_command` attribute. This command must output a list of requirements following the format of the [requirements file](https://pip.pypa.io/en/stable/reference/requirements-file-format/). The command can contain a `{directory}` placeholder which will be replaced with the directory in which the `pyproject.toml` file is stored.
+
+```python
+from zenml.config import DockerSettings
+
+docker_settings = DockerSettings(pyproject_export_command=[
+    "uv",
+    "export",
+    "--extra=train",
+    "--format=requirements-txt",
+    "--directory={directory}",
+])
+
+
+@pipeline(settings={"docker": docker_settings})
+def my_pipeline(...):
+    ...
+```
+3. **Specify Requirements Directly**:
 
 ```python
 docker_settings = DockerSettings(requirements=["torch==1.12.0", "torchvision"])
 ```
-3. **Use Requirements File**:
+4. **Use Requirements File**:
 
 ```python
 docker_settings = DockerSettings(requirements="/path/to/requirements.txt")
 ```
-4. **Specify ZenML Integrations**:
+5. **Specify ZenML Integrations**:
 
 ```python
 from zenml.integrations.constants import PYTORCH, EVIDENTLY
 
 docker_settings = DockerSettings(required_integrations=[PYTORCH, EVIDENTLY])
 ```
-5. **Control Stack Requirements**:\
+6. **Control Stack Requirements**:
 By default, ZenML installs the requirements needed by your active stack. You can disable this behavior if needed:
 
 ```python
 docker_settings = DockerSettings(install_stack_requirements=False)
 ```
 
-{% hint style="info" %}
-You can combine these methods but do make sure that your list of requirements does not overlap with ones specified explicitly in the Docker settings to avoid version conflicts.
-{% endhint %}
+7. **Install Local Projects**:
+If your code requires the installation of some local code files as a Python package, you can specify a command
+that installs it as follows:
+```python
+docker_settings = DockerSettings(local_project_install_command="pip install . --no-deps")
+```
+
+{% hint style="warning" %}
+Installing a local Python package only works if your code files are included in the Docker image, so make sure you have
+`allow_including_files_in_images=True` in your Docker settings. If you want to instead use the [code download functionality](#source-code-management)
+to avoid building new Docker images for each pipeline run, you can follow [this example](https://github.com/zenml-io/zenml-patterns/tree/main/docker-local-pkg).
+{% endhint %}
 
 Depending on the options specified in your Docker settings, ZenML installs the requirements in the following order (each step optional):
 
 1. The packages installed in your local Python environment
 2. The packages required by the stack (unless disabled by setting `install_stack_requirements=False`)
 3. The packages specified via the `required_integrations`
-4. The packages specified via the `requirements` attribute
+4. The packages defined in the `pyproject.toml` file specified by the `pyproject_path` attribute
+5. The packages specified via the `requirements` attribute
 
 ### System Packages

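The installation order listed at the end of the documentation change above can be sketched in plain Python. This is only an illustration under stated assumptions: `SketchSettings` and `installation_steps` are hypothetical names, not part of ZenML, and the sketch models just the ordering and the "each step optional" rule, not how ZenML actually builds the image.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class SketchSettings:
    """Hypothetical stand-in for the relevant DockerSettings attributes."""

    replicate_local_python_environment: bool = False
    install_stack_requirements: bool = True
    required_integrations: List[str] = field(default_factory=list)
    pyproject_path: Optional[str] = None
    requirements: Union[None, str, List[str]] = None


def installation_steps(settings: SketchSettings) -> List[str]:
    """Return the requirement sources in the documented install order."""
    steps: List[str] = []
    if settings.replicate_local_python_environment:
        steps.append("local_environment")  # 1. pip freeze of the local env
    if settings.install_stack_requirements:
        steps.append("stack")  # 2. requirements of the active stack
    if settings.required_integrations:
        steps.append("integrations")  # 3. required_integrations
    if settings.pyproject_path:
        steps.append("pyproject")  # 4. exported from pyproject.toml
    if settings.requirements:
        steps.append("requirements")  # 5. requirements attribute
    return steps


example = SketchSettings(
    pyproject_path="pyproject.toml", requirements=["torch==1.12.0"]
)
print(installation_steps(example))
```

Because later steps are installed after earlier ones, a version pinned via `requirements` is installed after whatever the `pyproject.toml` export produced.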
src/zenml/config/build_configuration.py

Lines changed: 9 additions & 0 deletions
@@ -135,6 +135,9 @@ def should_include_files(
         Returns:
             Whether files should be included in the image.
         """
+        if self.settings.local_project_install_command:
+            return True
+
         if self.should_download_files(code_repository=code_repository):
             return False
 
@@ -153,6 +156,9 @@ def should_download_files(
         Returns:
             Whether files should be downloaded in the image.
         """
+        if self.settings.local_project_install_command:
+            return False
+
         if self.should_download_files_from_code_repository(
             code_repository=code_repository
         ):
@@ -176,6 +182,9 @@ def should_download_files_from_code_repository(
         Returns:
             Whether files should be downloaded from the code repository.
         """
+        if self.settings.local_project_install_command:
+            return False
+
         if (
             code_repository
             and self.settings.allow_download_from_code_repository

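The three early returns above give `local_project_install_command` precedence over every download path. A minimal standalone sketch of that precedence follows; the function signatures are simplified and hypothetical, and the final fall-through to `allow_including_files_in_images` is an assumption, since the rest of the real methods is not visible in this diff.

```python
from typing import Optional


def should_download_files(
    local_project_install_command: Optional[str],
    download_available: bool,
) -> bool:
    """Downloading is ruled out once a local project must be installed."""
    if local_project_install_command:
        return False
    return download_available


def should_include_files(
    local_project_install_command: Optional[str],
    download_available: bool,
    allow_including_files_in_images: bool,
) -> bool:
    """Files are always included when a local project must be installed."""
    if local_project_install_command:
        return True
    if should_download_files(local_project_install_command, download_available):
        return False
    # Assumption: without a download path, the setting decides.
    return allow_including_files_in_images


print(should_include_files("pip install . --no-deps", True, True))
print(should_include_files(None, True, True))
```

The first call reports that files are included even though downloading would otherwise be possible, which mirrors the intent of the added guards.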
src/zenml/config/docker_settings.py

Lines changed: 97 additions & 16 deletions
@@ -91,12 +91,21 @@ class DockerSettings(BaseSettings):
     --------------------------------
     Depending on the configuration of this object, requirements will be
     installed in the following order (each step optional):
-    - The packages installed in your local python environment
+    - The packages installed in your local python environment (extracted using
+      `pip freeze`)
     - The packages required by the stack unless this is disabled by setting
-      `install_stack_requirements=False`.
+      `install_stack_requirements=False`
     - The packages specified via the `required_integrations`
+    - The packages defined inside a pyproject.toml file given by the
+      `pyproject_path` attribute.
     - The packages specified via the `requirements` attribute
 
+    If none of `replicate_local_python_environment`, `pyproject_path` or
+    `requirements` are specified, ZenML will try to automatically find a
+    requirements.txt or pyproject.toml file in your current source root
+    and install packages from the first one it finds. You can disable this
+    behavior by setting `disable_automatic_requirements_detection=True`.
+
     Attributes:
         parent_image: Full name of the Docker image that should be
             used as the parent for the image that will be built. Defaults to
@@ -137,10 +146,29 @@ class DockerSettings(BaseSettings):
             packages.
         python_package_installer_args: Arguments to pass to the python package
             installer.
-        replicate_local_python_environment: If not `None`, ZenML will use the
-            specified method to generate a requirements file that replicates
-            the packages installed in the currently running python environment.
-            This requirements file will then be installed in the Docker image.
+        disable_automatic_requirements_detection: If set to True, ZenML will
+            not automatically detect requirements.txt files or pyproject.toml
+            files in your source root.
+        replicate_local_python_environment: If set to True, ZenML will run
+            `pip freeze` to gather the requirements of the local Python
+            environment and then install them in the Docker image.
+        pyproject_path: Path to a pyproject.toml file. If given, the
+            dependencies will be exported to a requirements.txt
+            formatted file using the `pyproject_export_command` and then
+            installed inside the Docker image.
+        pyproject_export_command: Command to export the dependencies inside a
+            pyproject.toml file to a requirements.txt formatted file. If not
+            given and ZenML needs to export the requirements anyway, `uv export`
+            and `poetry export` will be tried to see if one of them works. This
+            command can contain a `{directory}` placeholder which will be
+            replaced with the directory in which the pyproject.toml file is
+            stored.
+            **Note**: This command will be run before any code files are copied
+            into the image. It is therefore not possible to install a local
+            project using this command. This command should exclude any local
+            projects, and you can specify a `local_project_install_command`
+            instead which will be run after the code files are copied into the
+            image.
         requirements: Path to a requirements file or a list of required pip
             packages. During the image build, these requirements will be
             installed using pip. If you need to use a different tool to
@@ -149,13 +177,16 @@ class DockerSettings(BaseSettings):
         required_integrations: List of ZenML integrations that should be
             installed. All requirements for the specified integrations will
             be installed inside the Docker image.
-        required_hub_plugins: DEPRECATED/UNUSED.
         install_stack_requirements: If `True`, ZenML will automatically detect
             if components of your active stack are part of a ZenML integration
             and install the corresponding requirements and apt packages.
             If you set this to `False` or use custom components in your stack,
             you need to make sure these get installed by specifying them in
             the `requirements` and `apt_packages` attributes.
+        local_project_install_command: Command to install a local project in
+            the Docker image. This is run after the code files are copied into
+            the image, and it is therefore only possible when code is included
+            in the image, not downloaded at runtime.
         apt_packages: APT packages to install inside the Docker image.
         environment: Dictionary of environment variables to set inside the
             Docker image.
@@ -170,14 +201,6 @@ class DockerSettings(BaseSettings):
             from a code repository if possible.
         allow_download_from_artifact_store: If `True`, code can be downloaded
             from the artifact store.
-        build_options: DEPRECATED, use parent_image_build_config.build_options
-            instead.
-        dockerignore: DEPRECATED, use build_config.dockerignore instead.
-        copy_files: DEPRECATED/UNUSED.
-        copy_global_config: DEPRECATED/UNUSED.
-        source_files: DEPRECATED. Use allow_including_files_in_images,
-            allow_download_from_code_repository and
-            allow_download_from_artifact_store instead.
     """
 
     parent_image: Optional[str] = None
@@ -191,14 +214,18 @@ class DockerSettings(BaseSettings):
         PythonPackageInstaller.PIP
     )
     python_package_installer_args: Dict[str, Any] = {}
+    disable_automatic_requirements_detection: bool = True
     replicate_local_python_environment: Optional[
-        Union[List[str], PythonEnvironmentExportMethod]
+        Union[List[str], PythonEnvironmentExportMethod, bool]
     ] = Field(default=None, union_mode="left_to_right")
+    pyproject_path: Optional[str] = None
+    pyproject_export_command: Optional[List[str]] = None
     requirements: Union[None, str, List[str]] = Field(
         default=None, union_mode="left_to_right"
     )
     required_integrations: List[str] = []
     install_stack_requirements: bool = True
+    local_project_install_command: Optional[str] = None
     apt_packages: List[str] = []
     environment: Dict[str, Any] = {}
     user: Optional[str] = None
@@ -221,6 +248,8 @@ class DockerSettings(BaseSettings):
         "copy_global_config",
         "source_files",
         "required_hub_plugins",
+        "build_options",
+        "dockerignore",
     )
 
     @model_validator(mode="before")
@@ -305,6 +334,58 @@ def _validate_skip_build(self) -> "DockerSettings":
 
         return self
 
+    @model_validator(mode="after")
+    def _validate_code_files_included_if_installing_local_project(
+        self,
+    ) -> "DockerSettings":
+        """Ensures that files are included when installing a local package.
+
+        Raises:
+            ValueError: If files are not included in the Docker image
+                when trying to install a local package.
+
+        Returns:
+            The validated settings values.
+        """
+        if (
+            self.local_project_install_command
+            and not self.allow_including_files_in_images
+        ):
+            raise ValueError(
+                "Files must be included in the Docker image when trying to "
+                "install a local python package. You can do so by setting "
+                "the `allow_including_files_in_images` attribute of your "
+                "DockerSettings to `True`."
+            )
+
+        return self
+
+    @model_validator(mode="after")
+    def _deprecate_replicate_local_environment_commands(
+        self,
+    ) -> "DockerSettings":
+        """Deprecates some values for `replicate_local_python_environment`.
+
+        Returns:
+            The validated settings values.
+        """
+        if isinstance(
+            self.replicate_local_python_environment,
+            (str, list, PythonEnvironmentExportMethod),
+        ):
+            logger.warning(
+                "Specifying a command (`%s`) for "
+                "`DockerSettings.replicate_local_python_environment` is "
+                "deprecated. If you want to replicate your exact local "
+                "environment using `pip freeze`, set "
+                "`DockerSettings.replicate_local_python_environment=True`. "
+                "If you want to export requirements from a pyproject.toml "
+                "file, use `DockerSettings.pyproject_path` and "
+                "`DockerSettings.pyproject_export_command` instead."
+            )
+
+        return self
+
     model_config = ConfigDict(
         # public attributes are immutable
         frozen=True,

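The `pyproject_export_command` docstring above describes a `{directory}` placeholder that is replaced with the directory containing the `pyproject.toml` file. A minimal sketch of such a substitution is shown below. `render_export_command` is a hypothetical helper written for illustration, not ZenML's implementation, and it assumes POSIX-style paths.

```python
from pathlib import PurePosixPath
from typing import List


def render_export_command(command: List[str], pyproject_path: str) -> List[str]:
    """Replace `{directory}` in each argument with the pyproject.toml directory.

    Hypothetical helper: the real substitution logic is not part of this diff.
    """
    directory = str(PurePosixPath(pyproject_path).parent)
    return [part.replace("{directory}", directory) for part in command]


command = [
    "uv",
    "export",
    "--extra=train",
    "--format=requirements-txt",
    "--directory={directory}",
]
print(render_export_command(command, "/app/project/pyproject.toml"))
```

Keeping the command as a list of arguments, as the `Optional[List[str]]` type of the attribute suggests, avoids shell quoting issues when the directory contains spaces.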
src/zenml/pipelines/build_utils.py

Lines changed: 8 additions & 0 deletions
@@ -80,6 +80,11 @@ def requires_included_code(
     for step in deployment.step_configurations.values():
         docker_settings = step.config.docker_settings
 
+        if docker_settings.local_project_install_command:
+            # When installing a local package, we need to include the code
+            # files in the container image.
+            return True
+
         if docker_settings.allow_download_from_artifact_store:
             return False
 
@@ -136,6 +141,9 @@ def code_download_possible(
         Whether code download is possible for the deployment.
     """
     for step in deployment.step_configurations.values():
+        if step.config.docker_settings.local_project_install_command:
+            return False
+
         if step.config.docker_settings.allow_download_from_artifact_store:
             continue

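Both functions above iterate over all steps and short-circuit on the first step that sets `local_project_install_command`, so a single such step is enough to force included code and rule out code download for the whole deployment. A reduced sketch of that "any step" check, using a hypothetical `StepDockerSettings` stand-in rather than ZenML's own classes:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StepDockerSettings:
    """Hypothetical stand-in for a step's Docker settings."""

    local_project_install_command: Optional[str] = None


def any_step_installs_local_project(steps: Iterable[StepDockerSettings]) -> bool:
    """True if at least one step needs a local project installed."""
    return any(step.local_project_install_command for step in steps)


steps = [
    StepDockerSettings(),
    StepDockerSettings(local_project_install_command="pip install . --no-deps"),
]
print(any_step_installs_local_project(steps))
```

The sketch covers only the new guard; the remaining checks on artifact store and code repository downloads in the real functions are outside the lines shown in this diff.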