Skip to content

Added documentation on RBAC and service account #14

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 24, 2017

Conversation

liyinan926
Copy link
Member


#### Driver

The Spark driver pod uses a Kubernetes service account to access the Kubernetes API server to create and watch executor pods. The service account used by the driver pod must have the appropriate permission for the driver to be able to do its work. Specifically, at minimum, the service account must have the [`edit`](https://kubernetes.io/docs/admin/authorization/rbac/#user-facing-roles) [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) granted. By default, the driver pod is automatically assigned the `default` service account in the same namespace if no service account is specified when the pod gets created. Depending on the version and setup of Kubernetes deployed, this `default` service account may or may not have the `edit` role granted under the default Kubernetes [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/) policies. Sometimes users may need to specify a custom service account that has the right role granted. Spark on Kubernetes supports specifying a custom service account to be used by the driver pod through the configuration property `spark.kubernetes.authenticate.driver.serviceAccountName=<service account name>`. For example to make the driver pod to use the `spark` service account, a user simply adds the following option to the `spark-submit` command:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Largely LGTM
" in the same namespace " -> in the namespace specified using --kubernetes.namespace.

Can you break this into multiple lines?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@foxish
Copy link
Member

foxish commented Sep 21, 2017

Thanks!
Can you insert linebreaks? it'll make it easier to comment on individual lines as well.

Copy link
Member

@kimoonkim kimoonkim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc looks great. Thanks for writing this!

@liyinan926
Copy link
Member Author

@foxish do you have any more comments? If not, I will merge.


The Spark driver pod uses a Kubernetes service account to access the Kubernetes API server to create and watch executor pods. The service account used by the driver pod must have the appropriate permission for the driver to be able to do its work.

Specifically, at minimum, the service account must have the [`edit`](https://kubernetes.io/docs/admin/authorization/rbac/#user-facing-roles) [`Role` or `ClusterRole`](https://kubernetes.io/docs/admin/authorization/rbac/#role-and-clusterrole) granted. By default, the driver pod is automatically assigned the `default` service account in the namespace specified by `--kubernetes-namespace`, if no service account is specified when the pod gets created.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Must have "edit" on pods and services?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the default "edit" role is a bit more than we need right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it more specific that a role that allows creating pods and secrets is needed.

kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default
```

Note that a `Role` can only be used to grant access to resources (like pods) within a single namespace, whereas a `ClusterRole` can be used to grant access to cluster-scoped resources (like nodes) as well as namespaced resources (like pods) across all namespaces. For Spark on Kubernetes, since the driver always creates executor pods in the same namespace, a `Role` is sufficient, although users may use a `ClusterRole` instead. For more information on RBAC authorization and how to configure Kubernetes service accounts for pods, please refer to [here](https://kubernetes.io/docs/admin/authorization/rbac/) and [here](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

replace here with short description of link.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


#### Resource Staging Server

The Resource Staging Server (RSS) watches Spark driver pods to detect completed Spark applications so it knows when to safely delete resource bundles of the applications. When running as a pod in the same Kubernetes cluster as the Spark applications, by default (`spark.kubernetes.authenticate.resourceStagingServer.useServiceAccountCredentials` defaults to `true`), the RSS uses the default Kubernetes service account token located at `/var/run/secrets/kubernetes.io/serviceaccount/token` and the CA certificate located at `/var/run/secrets/kubernetes.io/serviceaccount/ca.crt`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would help to make it clearer that the location is within the pod, and is the default location that k8s uses. Unclear if it's the local machine being referred to here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@foxish
Copy link
Member

foxish commented Sep 22, 2017

LGTM after comments

Copy link
Member

@foxish foxish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok to merge

@@ -290,6 +290,55 @@ the command may then look like the following:

## Advanced

### Configuring Kubernetes Roles and Service Accounts
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should specify that this applies to clusters with RBAC. Some clusters use custom authorizers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@foxish foxish merged commit 7397244 into apache-spark-on-k8s:master Sep 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants