Commit 63c9880

baremetal: discuss options for managing baremetal-operator
The Bare Metal Operator is currently launched by the Machine API Operator when the cluster is running on the baremetal infrastructure platform. Discuss options for evolving this design so that the Machine API Operator no longer needs to have quite so much bare metal specific knowledge.
1 parent 10df34f commit 63c9880

1 file changed: 166 additions, 0 deletions

@@ -0,0 +1,166 @@
---
title: baremetal-operator-management
authors:
  - "@markmc"
reviewers:
  - "@abhinavdahiya"
  - "@dhellmann"
  - "@enxebre"
  - "@eparis"
  - "@hardys"
  - "@sadasu"
  - "@smarterclayton"
  - "@stbenjam"
approvers:
  - TBD
creation-date: 2020-02-13
last-updated: 2020-02-13
status: provisional
see-also:
  - https://github.com/openshift/enhancements/pull/200
replaces:
superseded-by:
---

# Managing the Bare Metal Operator

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)

## Open Questions

1. Should the Bare Metal Operator be managed by the Cluster Version
   Operator or the Machine API Operator?
2. How should upgrades be handled?
3. Which release should this change target?

## Summary

The Bare Metal Operator provides the bare metal machine management
capabilities needed by the Machine API provider on the `baremetal`
platform. There is no equivalent of this component on other
platforms.

The Bare Metal Operator is currently managed by the Machine API
Operator in a fashion that requires the Machine API Operator to have
significant bare metal specific knowledge. This can be resolved in one
of two ways:

1. The Machine API Operator adds a generic operator management
   framework (similar to some Cluster Version Operator capabilities)
   for platform-specific operators, and the Bare Metal Operator
   integrates with this framework.
2. The Cluster Version Operator gains the ability to manage
   operators that are platform-specific, and the Bare Metal Operator
   moves to being managed by the Cluster Version Operator.

## Motivation

In order to bring up a cluster using the baremetal platform:

1. The installer needs to capture bare metal host information and
   provisioning configuration for later use by baremetal-operator
2. Something needs to install the CRDs for these bare metal specific
   resources created by the installer
3. Something needs to launch baremetal-operator

Currently (1) is achieved by the installer creating:

* A Provisioning resource manifest
* Manifests for BareMetalHost resources and their associated secrets

and these manifests are applied by the cluster-bootstrap component
towards the end of the cluster bootstrapping process.

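To make the installer's output in (1) more concrete, the following is a minimal sketch of how a BareMetalHost manifest and its associated credentials secret might be assembled. It is illustrative only and is not the installer's actual code: the namespace, the host values, the secret naming scheme, and the helper names `bmc_secret`, `bare_metal_host` and `host_manifests` are all assumptions, and the field names are those of the `metal3.io/v1alpha1` BareMetalHost API as best understood here, so they should be checked against the baremetal-operator documentation. The Provisioning resource is omitted because its schema is not described in this document.

```python
import base64
import json

# Assumption: illustrative namespace; the real one may differ.
NAMESPACE = "openshift-machine-api"


def bmc_secret(host_name, username, password):
    """Build a Secret manifest holding the BMC credentials for one host."""
    def encode(value):
        # Secret data values are base64-encoded strings.
        return base64.b64encode(value.encode("utf-8")).decode("ascii")

    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": f"{host_name}-bmc-secret", "namespace": NAMESPACE},
        "type": "Opaque",
        "data": {"username": encode(username), "password": encode(password)},
    }


def bare_metal_host(host_name, bmc_address, boot_mac):
    """Build a BareMetalHost manifest that references the host's secret."""
    return {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "BareMetalHost",
        "metadata": {"name": host_name, "namespace": NAMESPACE},
        "spec": {
            "online": True,
            "bootMACAddress": boot_mac,
            "bmc": {
                "address": bmc_address,
                "credentialsName": f"{host_name}-bmc-secret",
            },
        },
    }


def host_manifests(hosts):
    """Return the secret and host manifests for every described host."""
    manifests = []
    for host in hosts:
        manifests.append(bmc_secret(host["name"], host["username"], host["password"]))
        manifests.append(bare_metal_host(host["name"], host["bmc_address"], host["boot_mac"]))
    return manifests


# Hypothetical host description using documentation-only address ranges.
example_hosts = [
    {
        "name": "master-0",
        "username": "admin",
        "password": "example-password",
        "bmc_address": "ipmi://192.0.2.10",
        "boot_mac": "00:00:5e:00:53:01",
    }
]
example_manifests = host_manifests(example_hosts)
print(json.dumps(example_manifests, indent=2))
```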
This resource creation step does not succeed until step (2)
completes - i.e. the relevant CRDs are applied - and this is currently
done by the Cluster Version Operator (CVO) as it applies the manifests
for the Machine API Operator (MAO).

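The ordering dependency between (1) and (2) can be illustrated with a toy simulation. This is a sketch under stated assumptions rather than a description of how cluster-bootstrap is implemented: `FakeApiServer`, `apply_with_retry` and `simulate_cvo` are hypothetical names, the resource names are placeholders, and the retry-until-success behaviour is assumed from the statement that resource creation does not succeed until the CRDs are applied.

```python
class FakeApiServer:
    """Toy stand-in for an API server that tracks registered CRD kinds."""

    def __init__(self):
        self.known_kinds = set()
        self.resources = []

    def register_crd(self, kind):
        self.known_kinds.add(kind)

    def create(self, manifest):
        """Store the manifest, or return False if its kind has no CRD yet."""
        if manifest["kind"] not in self.known_kinds:
            return False
        self.resources.append(manifest)
        return True


def apply_with_retry(api, manifests, before_each_attempt, max_attempts=5):
    """Retry pending manifests until all are created or attempts run out.

    Returns the number of attempts used, or None if some never applied.
    """
    pending = list(manifests)
    for attempt in range(1, max_attempts + 1):
        before_each_attempt(attempt)
        # Keep only the manifests the API server still rejects.
        pending = [m for m in pending if not api.create(m)]
        if not pending:
            return attempt
    return None


api = FakeApiServer()
# Placeholder manifests standing in for what the installer generates.
installer_manifests = [
    {"kind": "Provisioning", "metadata": {"name": "example-provisioning"}},
    {"kind": "BareMetalHost", "metadata": {"name": "master-0"}},
]


def simulate_cvo(attempt):
    # Assumption: the CRDs happen to be applied just before the third attempt.
    if attempt == 3:
        api.register_crd("Provisioning")
        api.register_crd("BareMetalHost")


attempts_used = apply_with_retry(api, installer_manifests, simulate_cvo)
print(f"resources created after {attempts_used} attempts")
```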
Finally, (3) happens when the MAO detects that it is running on the
baremetal infrastructure platform and instantiates the Bare Metal
Operator (BMO) deployment.

There are two problems emerging with this design:

* A sense that launching the BMO is outside the scope of the MAO,
  particularly since the MAO does not manage operators on any other
  platform

* An expanding need for bare metal specific manifests - for example, new
  CRDs used to drive new BMO capabilities - to be installed early in
  cluster bring-up, which means introducing yet more bare metal
  specific concerns into the MAO

Steps (2) and (3) are aspects of cluster bring-up for which the CVO is
clearly well suited. However, the CVO has no concept of managing
operators that are specific to an infrastructure platform. Therefore,
using the CVO capabilities in this case would mean installing bare
metal CRDs and launching the Bare Metal Operator even on platforms
where it is not needed.

In contrast, the way (2) and (3) are currently achieved by the MAO is
not at all generic and results in the MAO having substantial built-in
knowledge of the bare metal platform. A more generic mechanism, where
the MAO could discover and apply any required manifests from the BMO,
would mean adding operator management capabilities that look very
much like the CVO's capabilities.

### Goals

Allow bare metal machine management capabilities to be fully
encapsulated in the Bare Metal Operator.

Ensure that the Bare Metal Operator is only deployed on the
`baremetal` platform.

### Non-Goals

## Proposal

A choice should be made between the two options listed above:

1. The Machine API Operator adds a generic framework for
   platform-specific operators - e.g. a platform would be able to
   specify an operator image, and the Machine API Operator would
   extract and apply the necessary manifests from that operator image.
2. The Cluster Version Operator adds the ability to have Cluster
   Operators that are specific to certain platforms - e.g. an operator
   image would be marked as conditional on a platform name, and the
   Cluster Version Operator would ignore it unless it detected that the
   cluster was configured with that platform.

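The selection logic behind the two options might look roughly like the sketch below. Nothing here reflects an existing Machine API Operator or Cluster Version Operator interface: the `OperatorImage` type, its `platform` marker, the `PLATFORM_OPERATORS` registry and both functions are hypothetical, and the sketch only covers choosing which operator to manage, not extracting or applying manifests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperatorImage:
    name: str
    # Hypothetical marker: None means the operator applies on every platform,
    # otherwise it names the single platform the operator is conditional on.
    platform: Optional[str] = None


def operators_to_manage(images, cluster_platform):
    """Option 2: ignore operators marked for a different platform."""
    return [
        image for image in images
        if image.platform is None or image.platform == cluster_platform
    ]


# Option 1: hypothetical registry mapping a platform to the operator image
# whose manifests the Machine API Operator would extract and apply.
PLATFORM_OPERATORS = {"baremetal": "baremetal-operator"}


def platform_operator_for(cluster_platform):
    """Option 1: return the operator image a platform specifies, if any."""
    return PLATFORM_OPERATORS.get(cluster_platform)


# Illustrative release payload containing one platform-conditional operator.
payload = [
    OperatorImage("machine-api-operator"),
    OperatorImage("baremetal-operator", platform="baremetal"),
]

for platform in ("aws", "baremetal"):
    managed = [image.name for image in operators_to_manage(payload, platform)]
    print(platform, "->", managed, "| platform operator:", platform_operator_for(platform))
```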
### Risks and Mitigations

## Design Details

### Test Plan

### Graduation Criteria

### Upgrade / Downgrade Strategy

### Version Skew Strategy

## Implementation History

## Drawbacks

## Alternatives

## References

- "/enhancements/baremetal/baremetal-provisioning-config.md"
- https://github.com/openshift/machine-api-operator/pull/302
- https://github.com/metal3-io/baremetal-operator/issues/227
- https://github.com/openshift/enhancements/pull/90
- https://github.com/openshift/enhancements/pull/102
