|
| 1 | +--- |
| 2 | +title: assisted-installer-bare-metal-validations |
| 3 | +authors: |
| 4 | + - "@dhellmann" |
| 5 | + - "@stbenjam" |
| 6 | +reviewers: |
| 7 | + - "@romfreiman" |
| 8 | + - "@yevgeny-shnaidman" |
| 9 | +approvers: |
| 10 | + - "@hardys" |
| 11 | + - "@crawford" |
| 12 | + - "@markmc" |
| 13 | +creation-date: 2020-06-24 |
| 14 | +last-updated: 2020-06-26 |
| 15 | +status: implementable |
| 16 | +--- |
| 17 | + |
| 18 | +# Assisted Installer Bare Metal Validations |
| 19 | + |
| 20 | +## Release Signoff Checklist |
| 21 | + |
| 22 | +- [x] Enhancement is `implementable` |
| 23 | +- [ ] Design details are appropriately documented from clear requirements |
| 24 | +- [ ] Test plan is defined |
| 25 | +- [ ] Graduation criteria for dev preview, tech preview, GA |
| 26 | +- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/) |
| 27 | + |
| 28 | +## Summary |
| 29 | + |
| 30 | +The assisted installer supports a user-provisioned infrastructure |
| 31 | +installation workflow for bare-metal clusters, but configures the |
| 32 | +installed cluster with certain capabilities that are currently only |
| 33 | +available using the `baremetal` platform in an installer-provisioned |
| 34 | +infrastructure workflow. |
| 35 | + |
| 36 | +This enhancement describes how we can modify the installer to allow |
| 37 | +bare-metal user-provisioned infrastructure clusters, including those |
| 38 | +produced with the assisted installer, to include those |
| 39 | +installer-provisioned infrastructure capabilities. |
| 40 | + |
| 41 | +## Motivation |
| 42 | + |
| 43 | +The [connected-assisted-installer |
| 44 | +enhancement](./connected-assisted-installer.md) describes how an agent |
| 45 | +running on several bare-metal hosts receive a command from the |
| 46 | +assisted installer application to begin the installation of that |
| 47 | +host. One of the first tasks that each host performs upon receiving |
| 48 | +this instruction is to download, from the assisted installer, the |
| 49 | +appropriate Ignition file for that host's role. These ignition files |
| 50 | +are generated by the assisted installer earlier in the process using |
| 51 | +the `openshift-install create ignition-configs` command. |
| 52 | + |
| 53 | +This installer workflow mirrors the [bare-metal user-provisioned |
| 54 | +infrastructure |
| 55 | +workflow](https://github.com/openshift/installer/blob/master/docs/user/metal/install_upi.md) |
| 56 | +in that the the OpenShift installer is used to generate assets for the |
| 57 | +cluster but the `openshift-install create cluster` command is not used |
| 58 | +to manage the deployment. However, the assisted installer does need to |
| 59 | +enable some `baremetal` installer-provisioned infrastructure platform |
| 60 | +features on the installed cluster. Therefore, we need to consider how |
| 61 | +to make those features available while using a user-provisioned |
| 62 | +infrastructure workflow. Currently, the assisted installer is using |
| 63 | +the [workaround of providing dummy |
| 64 | +data](https://github.com/filanov/bm-inventory/blob/54d56712b59d306125ee68d2014c98cf1e995e75/internal/installcfg/installcfg.go#L143-L190) |
| 65 | +for fields that are currently required by the `baremetal` platform but |
| 66 | +that are unused in this use-case. |
| 67 | + |
| 68 | +### Goals |
| 69 | + |
| 70 | +- Allow the use of certain installer-provisioned infrastructure |
| 71 | + features on bare-metal user-provisioned infrastructure clusters |
| 72 | +- Avoid making any incompatible changes to the bare-metal |
| 73 | + installer-provisioned infrastructure workflow |
| 74 | + |
| 75 | +### Non-Goals |
| 76 | + |
| 77 | +- Removing or relaxing any validation rules for the |
| 78 | + installer-provisioned infrastructure installer. |
| 79 | +- Recapitulate the test plans for the assisted installer. |
| 80 | +- Recapitulate the [approach for making parts of the bare metal |
| 81 | + infrastructure automation optional](./baremetal-provisioning-optional.md). |
| 82 | + |
| 83 | +## Proposal |
| 84 | + |
| 85 | +The [`baremetal.Platform` fields in |
| 86 | +`install-config.yaml`](https://github.com/openshift/installer/blob/master/docs/user/metal/install_ipi.md#install-config) |
| 87 | +can be considered to be configuring three distinct capabilities: |
| 88 | + |
| 89 | +1. Provision a libvirt VM with the bootstrap configuration, and use |
| 90 | + IPMI/PXE to provision a set of hosts as control plane nodes. |
| 91 | +2. Configure the `baremetal-operator` to run on the installed cluster |
| 92 | + for day 2 machine-management operations such as power control and |
| 93 | + automated provisioning of additional workers. |
| 94 | +3. Configure [cluster-hosted control-plane |
| 95 | + load-balancing](https://github.com/openshift/installer/blob/master/docs/design/baremetal/networking-infrastructure.md) |
| 96 | + using `keepalived` and `haproxy`. |
| 97 | + |
| 98 | +The proposal is to move where some of the the input validation happens |
| 99 | +to allow the latter 2 capabilities to be configured for |
| 100 | +user-provisioned infrastructure, without requiring configuration that |
| 101 | +is only relevant to the first capability. |
| 102 | + |
| 103 | +### User Stories [optional] |
| 104 | + |
| 105 | +#### Story 1 |
| 106 | + |
| 107 | +As a developer building the assisted installer tools, I want to be |
| 108 | +able to run the OpenShift installer to generate assets for a bare |
| 109 | +metal cluster without invoking validations that only apply to the |
| 110 | +installer-provisioned infrastructure workflows so I can use the assets |
| 111 | +in user-provisioned workflows. |
| 112 | + |
| 113 | +#### Story 2 |
| 114 | + |
| 115 | +As a developer of the OpenShift installer, I want to be able to add |
| 116 | +validation rules for bare metal clusters correctly based on whether |
| 117 | +the cluster is being built using the installer-provisioned |
| 118 | +infrastructure workflow or the user-provisioned workflow so I can |
| 119 | +apply only the rules necessary for each case. |
| 120 | + |
| 121 | +### Implementation Details/Notes/Constraints [optional] |
| 122 | + |
| 123 | +All of the validation is currently performed during early phases of |
| 124 | +the installer for the `InstallConfig` target. The user-provisioned |
| 125 | +workflow used by the assisted installer does not run the phase for the |
| 126 | +`Cluster` target (it only ever runs up to the `IgnitionConfigs` |
| 127 | +target), so if we move some of the validation into the later phase we |
| 128 | +will have a better separation of the parts needed to build the assets |
| 129 | +for the user-provisioned workflow and the assisted installer from the |
| 130 | +parts needed to run the installer for installer-provisioned |
| 131 | +infrastructure for bare metal. A secondary benefit is that the |
| 132 | +validation for the `baremetal` platform will more closely match the |
| 133 | +validation for cloud platforms. |
| 134 | + |
| 135 | +The install config inputs are currently validated as part of |
| 136 | +generating the `InstallConfig` asset in |
| 137 | +[InstallConfig.finish()](https://github.com/openshift/installer/blob/master/pkg/asset/installconfig/installconfig.go#L141), |
| 138 | +which invokes some logic that applies to all platforms in |
| 139 | +[ValidateInstallConfig()](https://github.com/openshift/installer/blob/master/pkg/types/validation/installconfig.go#L40) |
| 140 | +before running the platform-specific logic from |
| 141 | +[validatePlatform()](https://github.com/openshift/installer/blob/master/pkg/types/validation/installconfig.go#L341). For |
| 142 | +the `baremetal` platform that results in calling |
| 143 | +[ValidatePlatform()](https://github.com/openshift/installer/blob/master/pkg/types/baremetal/validation/platform.go#L218). |
| 144 | + |
| 145 | +The |
| 146 | +[PlatformProvisionCheck](https://github.com/openshift/installer/blob/master/pkg/asset/installconfig/platformprovisioncheck.go) |
| 147 | +and |
| 148 | +[PlatformCredsCheck](https://github.com/openshift/installer/blob/master/pkg/asset/installconfig/platformcredscheck.go) |
| 149 | +assets are both dependencies for the `Cluster` asset, which means they |
| 150 | +are evaluated later in the installation process. Both already have |
| 151 | +platform-specific logic, although not for the `baremetal` platform. |
| 152 | + |
| 153 | +The assisted installer does not need the `bootstrapProvisioningIP`, |
| 154 | +`provisioningNetworkCIDR`, `provisioningDHCPRange`, or |
| 155 | +`provisioningNetworkInterface` fields. The values in those fields are |
| 156 | +used to configure the provisioning tools used for the |
| 157 | +installer-provisioned infrastructure workflow, so those validation |
| 158 | +checks could move to a new function in the |
| 159 | +`pkg/types/baremetal/validation` package that can be invoked from |
| 160 | +[PlatformProvisionCheck.Generate()](https://github.com/openshift/installer/blob/master/pkg/asset/installconfig/platformprovisioncheck.go#L37). |
| 161 | + |
| 162 | +There are two validation rules for the `clusterOSImage` and |
| 163 | +`bootstrapOSImage` fields, to ensure the value is a valid URL and to |
| 164 | +ensure the URL exists. Those checks can be split between the two |
| 165 | +phases, so that the URL format validation is done in |
| 166 | +`ValidateInstallConfig()` and the existence check is moved to |
| 167 | +`PlatformProvisionCheck`. |
| 168 | + |
| 169 | +The assisted installer does not need the `libvirtURI` or host BMC |
| 170 | +credentials, so the checks for those values could move to a new public |
| 171 | +function in the `pkg/types/baremetal/validation` package that can be |
| 172 | +invoked from |
| 173 | +[PlatformCredsCheck.Generate()](https://github.com/openshift/installer/blob/master/pkg/asset/installconfig/platformcredscheck.go#L60). |
| 174 | + |
| 175 | +Changing the BMC credential validation in the installer is only part |
| 176 | +of the work required to make `BareMetalHost` resources work without |
| 177 | +credentials. The `baremetal-operator` also [needs to be |
| 178 | +updated](https://github.com/metal3-io/baremetal-operator/pull/563) to |
| 179 | +not treat the missing credentials as an error. |
| 180 | + |
| 181 | +### Risks and Mitigations |
| 182 | + |
| 183 | +Moving some of the validation steps means that invalid data will be |
| 184 | +caught later in the process. Most users will run the `create cluster` |
| 185 | +command directly, so they will not see a significant change. When the |
| 186 | +installer runs under Hive, or if the user runs the phases individually |
| 187 | +so they can inject extra manifests, the `create cluster` step may |
| 188 | +happen much later in the process. Bad data will still be caught before |
| 189 | +the bootstrap VM is launched, though, so it should be relatively quick |
| 190 | +in the overall installation process. |
| 191 | + |
| 192 | +There are no security implications for this change. |
| 193 | + |
| 194 | +## Design Details |
| 195 | + |
| 196 | +### Test Plan |
| 197 | + |
| 198 | +Besides the existing and new unit tests, the existing e2e tests for |
| 199 | +the `baremetal` platform will continue to test the full set of |
| 200 | +validation rules. The test suite for the assisted installer will |
| 201 | +verify that no extra validation rules are applied. |
| 202 | + |
| 203 | +### Graduation Criteria |
| 204 | + |
| 205 | +**Note:** *Section not required until targeted at a release.* |
| 206 | + |
| 207 | +Define graduation milestones. |
| 208 | + |
| 209 | +These may be defined in terms of API maturity, or as something else. Initial proposal |
| 210 | +should keep this high-level with a focus on what signals will be looked at to |
| 211 | +determine graduation. |
| 212 | + |
| 213 | +Consider the following in developing the graduation criteria for this |
| 214 | +enhancement: |
| 215 | +- Maturity levels - `Dev Preview`, `Tech Preview`, `GA` |
| 216 | +- Deprecation |
| 217 | + |
| 218 | +Clearly define what graduation means. |
| 219 | + |
| 220 | +#### Examples |
| 221 | + |
| 222 | +These are generalized examples to consider, in addition to the aforementioned |
| 223 | +[maturity levels][maturity-levels]. |
| 224 | + |
| 225 | +##### Dev Preview -> Tech Preview |
| 226 | + |
| 227 | +- Ability to utilize the enhancement end to end |
| 228 | +- End user documentation, relative API stability |
| 229 | +- Sufficient test coverage |
| 230 | +- Gather feedback from users rather than just developers |
| 231 | + |
| 232 | +##### Tech Preview -> GA |
| 233 | + |
| 234 | +- More testing (upgrade, downgrade, scale) |
| 235 | +- Sufficient time for feedback |
| 236 | +- Available by default |
| 237 | + |
| 238 | +**For non-optional features moving to GA, the graduation criteria must include |
| 239 | +end to end tests.** |
| 240 | + |
| 241 | +## Implementation History |
| 242 | + |
| 243 | +Major milestones in the life cycle of a proposal should be tracked in `Implementation |
| 244 | +History`. |
| 245 | + |
| 246 | +## Alternatives |
| 247 | + |
| 248 | +### Add a new platform |
| 249 | + |
| 250 | +We could add a new platform to the installer to provide a clear home |
| 251 | +for all of the logic that is specific to the assisted installer. This |
| 252 | +would be a more invasive change, and might end up being confusing |
| 253 | +since the installer would either still need to configure the cluster |
| 254 | +with the `baremetal` platform or the `machine-api-operator` would need |
| 255 | +to understand a new platform and configure the correct machine API |
| 256 | +implementation. |
| 257 | + |
| 258 | +### Add flag(s) to install-config.yml |
| 259 | + |
| 260 | +We could add a flag or other input field to `install-config.yml` to |
| 261 | +control the validation behavior. For example, |
| 262 | +<https://github.com/openshift/installer/pull/3794> proposes a list of |
| 263 | +validation rules to skip. There are two reasons not to do this. First, |
| 264 | +it would expose the validation rules through the installer's public |
| 265 | +API, even though we would only want the assisted installer to use the |
| 266 | +feature. Second, it places the burden of managing the set of |
| 267 | +validations on the wrapper in the assisted installer, instead of on |
| 268 | +the author of validation rules in the installer source code. We know |
| 269 | +which validations should apply in each case, so there is no need to |
| 270 | +make the validations configurable. |
| 271 | + |
| 272 | +### Add an environment variable |
| 273 | + |
| 274 | +An environment variable is a hidden API control for the installer, |
| 275 | +which makes using it and debugging its use more difficult. It also |
| 276 | +implies that there may be checks for the variable in several places in |
| 277 | +the code where the existing validation rules are, which spreads the |
| 278 | +knowledge of which validation rules apply in different scenarios |
| 279 | +throughout the installer code base, instead of using the existing |
| 280 | +dependency structure to manage them. |
0 commit comments