Added documentation on RBAC and service account #14
Conversation
src/jekyll/running-on-kubernetes.md
Outdated
#### Driver
The Spark driver pod uses a Kubernetes service account to access the Kubernetes API server to create and watch executor pods. The service account used by the driver pod must have the appropriate permissions for the driver to be able to do its work. Specifically, at minimum, the service account must have the [`edit`](https://kubernetes.io/docs/admin/authorization/rbac/#user-facing-roles) [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) granted. By default, the driver pod is automatically assigned the `default` service account in the same namespace if no service account is specified when the pod gets created. Depending on the version and setup of Kubernetes deployed, this `default` service account may or may not have the `edit` role granted under the default Kubernetes [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) policies. Sometimes users may need to specify a custom service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account to be used by the driver pod through the configuration property `spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>`. For example, to make the driver pod use the `spark` service account, a user simply adds the following option to the `spark-submit` command:
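For illustration, a `spark-submit` invocation carrying that option might look like the following sketch; only the `serviceAccountName` property is from the doc text, while the master URL, class, and jar path are placeholder assumptions:

```
# Hypothetical sketch: the master URL, main class, and jar are placeholders;
# the serviceAccountName conf is the property described above.
bin/spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
  --class <main-class> \
  <application-jar>
```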
Largely LGTM
" in the same namespace " -> in the namespace specified using --kubernetes.namespace.
Can you break this into multiple lines?
Done.
Thanks!
The doc looks great. Thanks for writing this!
@foxish do you have any more comments? If not, I will merge.
src/jekyll/running-on-kubernetes.md
Outdated
The Spark driver pod uses a Kubernetes service account to access the Kubernetes API server to create and watch executor pods. The service account used by the driver pod must have the appropriate permission for the driver to be able to do its work.
Specifically, at minimum, the service account must have the [`edit`](https://kubernetes.io/docs/admin/authorization/rbac/#user-facing-roles) [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) granted. By default, the driver pod is automatically assigned the `default` service account in the namespace specified by `--kubernetes-namespace`, if no service account is specified when the pod gets created.
Must have "edit" on pods and services?
Having the default "edit" role is a bit more than we need right?
Made it more specific that a role that allows creating pods and secrets is needed.
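A namespaced `Role` along those lines, scoped to just what the driver needs rather than the broader built-in `edit` role, might be sketched as the following manifest; the role name, namespace, and exact verb list are illustrative assumptions, not from the PR:

```
# Hypothetical sketch: grants the driver permissions on pods and secrets only.
# Names, namespace, and verbs are placeholders to be adjusted per deployment.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default
  name: spark-driver-role
rules:
- apiGroups: [""]
  resources: ["pods", "secrets"]
  verbs: ["get", "list", "watch", "create", "delete"]
```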
src/jekyll/running-on-kubernetes.md
Outdated
```
kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
```
Note that a `Role` can only be used to grant access to resources (like pods) within a single namespace, whereas a `ClusterRole` can be used to grant access to cluster-scoped resources (like nodes) as well as namespaced resources (like pods) across all namespaces. For Spark on Kubernetes, since the driver always creates executor pods in the same namespace, a `Role` is sufficient, although users may use a `ClusterRole` instead. For more information on RBAC authorization and how to configure Kubernetes service accounts for pods, please refer to [Using RBAC Authorization](https://kubernetes.io/docs/admin/authorization/rbac/) and [Configure Service Accounts for Pods](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
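A namespace-scoped alternative to the `ClusterRoleBinding` above might look like the following sketch, using a `RoleBinding` so the grant stays within one namespace; the service account and binding names are placeholder assumptions:

```
# Hypothetical sketch: create a service account and bind the built-in "edit"
# ClusterRole to it within a single namespace via a RoleBinding, instead of
# granting it cluster-wide with a ClusterRoleBinding. Names are placeholders.
kubectl create serviceaccount spark --namespace=default
kubectl create rolebinding spark-role-binding --clusterrole=edit \
  --serviceaccount=default:spark --namespace=default
```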
replace "here" with a short description of the link.
Done.
src/jekyll/running-on-kubernetes.md
Outdated
#### Resource Staging Server
The Resource Staging Server (RSS) watches Spark driver pods to detect completed Spark applications so it knows when to safely delete resource bundles of the applications. When running as a pod in the same Kubernetes cluster as the Spark applications, by default (`spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` defaults to `true`), the RSS uses the default Kubernetes service account token located at `/var/run/secrets/kubernetes.io/serviceaccount/token` and the CA certificate located at `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt`.
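In configuration-file terms, the default behavior described above corresponds to something like the following fragment; the property name and in-pod mount paths come from the doc text, and the comments restate the stated defaults rather than adding new settings:

```
# Default per the doc text above: the RSS reuses its pod's service account.
spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials  true
# With this default, credentials are read from the standard in-pod
# service account mount (paths are inside the RSS pod, not the local machine):
#   token:   /var/run/secrets/kubernetes.io/serviceaccount/token
#   CA cert: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
```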
Would help to make it clearer that the location is within the pod, and is the default location that k8s uses. Unclear if it's the local machine being referred to here.
Done.
LGTM after comments
Ok to merge
@@ -290,6 +290,55 @@ the command may then look like the following:

## Advanced

### Configuring Kubernetes Roles and Service Accounts |
We should specify that this applies to clusters with RBAC. Some clusters use custom authorizers.
Done.
Fixes apache-spark-on-k8s/spark#500.