
Commit 55bf2ca

kimoonkim authored and foxish committed
Add initial integration test code (#3)
* Made test code compile
* Clean up pom.xml
* Unpack distro in place
* Clean up redundant resources
* Avoid buggy truezip
* Cleaned up docker image builder
* Builds some docker images
* Drop http jar support for now
* Clean up
* Use spark-submit
* Tests pass
* Add hacks for dockerfiles and entrypoint.sh
* Address review comments
* Clean up pom.xml
* Add instructions in README.md
* Switch to container.image property keys
* Define version properties in pom.xml
* Fix a bug in pom.xml
* Clean up and fix README.md
* Fix README.md
* Remove unnecessary close
* Clean up
1 parent 48600c0 commit 55bf2ca

File tree

15 files changed: 1145 additions, 2 deletions


README.md

Lines changed: 90 additions & 2 deletions
The previous two-line README ("# spark-integration" / "Integration tests for Spark") is replaced with:

---
layout: global
title: Spark on Kubernetes Integration Tests
---

# Running the Kubernetes Integration Tests

Note that the integration test framework is currently being heavily revised and
is subject to change.

Note that currently the integration tests only run with Java 8.

Running the integration tests requires a Spark distribution package tarball that
contains Spark jars, submission clients, etc. You can download a tarball from
http://spark.apache.org/downloads.html. Or, you can create a distribution from
source code using `make-distribution.sh`. For example:

```
$ git clone git@github.com:apache/spark.git
$ cd spark
$ ./dev/make-distribution.sh --tgz \
    -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver
```

The above command will create a tarball like spark-2.3.0-SNAPSHOT-bin.tgz in the
top-level dir. For more details, see the related section in
[building-spark.md](https://github.com/apache/spark/blob/master/docs/building-spark.md#building-a-runnable-distribution).

The integration tests also need a local path to the directory that
contains `Dockerfile`s. In the main spark repo, the path is
`/spark/resource-managers/kubernetes/docker/src/main/dockerfiles`.

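A quick sanity check of these two inputs can save a debugging cycle. A minimal sketch, assuming the example paths used in the commands below (the `spark-base/entrypoint.sh` expectation is taken from the pom.xml in this commit):

```sh
# Hypothetical paths; substitute your own checkout locations.
$ ls spark/spark-2.3.0-SNAPSHOT-bin.tgz
$ ls spark/resource-managers/kubernetes/docker/src/main/dockerfiles
# The pom.xml in this commit chmods spark-base/entrypoint.sh, so it
# should exist under the Dockerfiles directory.
$ ls spark/resource-managers/kubernetes/docker/src/main/dockerfiles/spark-base/entrypoint.sh
```
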
Once you prepare the inputs, the integration tests can be executed with Maven or
your IDE. Note that when running tests from an IDE, the `pre-integration-test`
phase must be run every time the Spark main code changes. When running tests
from the command line, the `pre-integration-test` phase should automatically be
invoked if the `integration-test` phase is run.

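If you work from an IDE, one way to re-run the `pre-integration-test` phase (which unpacks the distribution, copies the Dockerfiles, and downloads Minikube, per the pom.xml in this commit) is to invoke the phase from the command line. A sketch, assuming the same property values as the examples below:

```sh
$ mvn pre-integration-test \
    -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
    -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles
```
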
With Maven, the integration test can be run using the following command:

```
$ mvn clean integration-test \
    -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
    -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles
```

# Running against an arbitrary cluster

In order to run against any cluster, use the following:

```sh
$ mvn clean integration-test \
    -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
    -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
    -DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://<master> -Dspark.docker.test.driverImage=<driver-image> -Dspark.docker.test.executorImage=<executor-image>"
```

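For example, with a hypothetical API server address and locally built image tags (every value below is a placeholder, not a project default):

```sh
$ mvn clean integration-test \
    -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
    -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
    -DextraScalaTestArgs="-Dspark.kubernetes.test.master=k8s://https://192.168.99.100:8443 -Dspark.docker.test.driverImage=spark-driver:latest -Dspark.docker.test.executorImage=spark-executor:latest"
```
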
# Preserve the Minikube VM

The integration tests make use of
[Minikube](https://github.com/kubernetes/minikube), which fires up a virtual
machine and sets up a single-node Kubernetes cluster within it. By default the VM
is destroyed after the tests are finished. If you want to preserve the VM, e.g.
to reduce the running time of tests during development, you can pass the
property `spark.docker.test.persistMinikube` to the test process:

```
$ mvn clean integration-test \
    -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
    -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
    -DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true
```

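The build downloads a Minikube binary into the `target` directory (see the `download-maven-plugin` executions in the pom.xml below). If you persisted the VM, you can inspect it with that binary; a sketch, assuming a Linux host:

```sh
# Path comes from the pom.xml's outputDirectory; use darwin-amd64 on macOS.
# The downloaded file may need its exec bit set by hand.
$ chmod +x target/minikube-bin/linux-amd64/minikube
$ target/minikube-bin/linux-amd64/minikube status
```
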
# Reuse the previous Docker images

The integration tests build a number of Docker images, which takes some time.
By default, the images are built every time the tests run. You may want to skip
re-building those images during development if the distribution package has not
changed since the last run. To do so, pass the property
`spark.docker.test.skipBuildImages` to the test process. This works only if you
also set the property `spark.docker.test.persistMinikube` in the previous run,
since the Docker daemon runs inside the Minikube environment. Here is an
example:

```
$ mvn clean integration-test \
    -Dspark-distro-tgz=spark/spark-2.3.0-SNAPSHOT-bin.tgz \
    -Dspark-dockerfiles-dir=spark/resource-managers/kubernetes/docker/src/main/dockerfiles \
    "-DextraScalaTestArgs=-Dspark.docker.test.persistMinikube=true -Dspark.docker.test.skipBuildImages=true"
```

integration-test/pom.xml

Lines changed: 250 additions & 0 deletions
<?xml version="1.0" encoding="UTF-8"?>
<!--
  ~ Licensed to the Apache Software Foundation (ASF) under one or more
  ~ contributor license agreements. See the NOTICE file distributed with
  ~ this work for additional information regarding copyright ownership.
  ~ The ASF licenses this file to You under the Apache License, Version 2.0
  ~ (the "License"); you may not use this file except in compliance with
  ~ the License. You may obtain a copy of the License at
  ~
  ~    http://www.apache.org/licenses/LICENSE-2.0
  ~
  ~ Unless required by applicable law or agreed to in writing, software
  ~ distributed under the License is distributed on an "AS IS" BASIS,
  ~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  ~ See the License for the specific language governing permissions and
  ~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <artifactId>spark-kubernetes-integration-tests_2.11</artifactId>
  <groupId>spark-kubernetes-integration-tests</groupId>
  <version>0.1-SNAPSHOT</version>
  <properties>
    <commons-lang3.version>3.5</commons-lang3.version>
    <commons-logging.version>1.1.1</commons-logging.version>
    <docker-client.version>5.0.2</docker-client.version>
    <download-maven-plugin.version>1.3.0</download-maven-plugin.version>
    <exec-maven-plugin.version>1.4.0</exec-maven-plugin.version>
    <extraScalaTestArgs></extraScalaTestArgs>
    <guava.version>18.0</guava.version>
    <jsr305.version>1.3.9</jsr305.version>
    <kubernetes-client.version>3.0.0</kubernetes-client.version>
    <log4j.version>1.2.17</log4j.version>
    <scala.version>2.11.8</scala.version>
    <scala.binary.version>2.11</scala.binary.version>
    <scala-maven-plugin.version>3.2.2</scala-maven-plugin.version>
    <scalatest.version>2.2.6</scalatest.version>
    <scalatest-maven-plugin.version>1.0</scalatest-maven-plugin.version>
    <slf4j-log4j12.version>1.7.24</slf4j-log4j12.version>
    <sbt.project.name>kubernetes-integration-tests</sbt.project.name>
    <spark-distro-tgz>YOUR-SPARK-DISTRO-TARBALL-HERE</spark-distro-tgz>
    <spark-dockerfiles-dir>YOUR-DOCKERFILES-DIR-HERE</spark-dockerfiles-dir>
    <test.exclude.tags></test.exclude.tags>
  </properties>
  <packaging>jar</packaging>
  <name>Spark Project Kubernetes Integration Tests</name>

  <dependencies>
    <dependency>
      <groupId>commons-logging</groupId>
      <artifactId>commons-logging</artifactId>
      <version>${commons-logging.version}</version>
    </dependency>
    <dependency>
      <groupId>com.google.code.findbugs</groupId>
      <artifactId>jsr305</artifactId>
      <version>${jsr305.version}</version>
    </dependency>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
      <scope>test</scope>
      <!-- For compatibility with Docker client. Should be fine since this is just for tests. -->
      <version>${guava.version}</version>
    </dependency>
    <dependency>
      <groupId>com.spotify</groupId>
      <artifactId>docker-client</artifactId>
      <version>${docker-client.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>io.fabric8</groupId>
      <artifactId>kubernetes-client</artifactId>
      <version>${kubernetes-client.version}</version>
    </dependency>
    <dependency>
      <groupId>log4j</groupId>
      <artifactId>log4j</artifactId>
      <version>${log4j.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-lang3</artifactId>
      <version>${commons-lang3.version}</version>
    </dependency>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest_${scala.binary.version}</artifactId>
      <version>${scalatest.version}</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-log4j12</artifactId>
      <version>${slf4j-log4j12.version}</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>net.alchim31.maven</groupId>
        <artifactId>scala-maven-plugin</artifactId>
        <version>${scala-maven-plugin.version}</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>${exec-maven-plugin.version}</version>
        <executions>
          <execution>
            <id>unpack-spark-distro</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>exec</goal>
            </goals>
            <configuration>
              <workingDirectory>${project.build.directory}</workingDirectory>
              <executable>/bin/sh</executable>
              <arguments>
                <argument>-c</argument>
                <argument>rm -rf spark-distro; mkdir spark-distro-tmp; cd spark-distro-tmp; tar xfz ${spark-distro-tgz}; mv * ../spark-distro; cd ..; rm -rf spark-distro-tmp</argument>
              </arguments>
            </configuration>
          </execution>
          <execution>
            <!-- TODO: Remove this hack once the upstream is fixed -->
            <id>copy-dockerfiles-if-missing</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>exec</goal>
            </goals>
            <configuration>
              <workingDirectory>${project.build.directory}/spark-distro</workingDirectory>
              <executable>/bin/sh</executable>
              <arguments>
                <argument>-c</argument>
                <argument>test -d dockerfiles || cp -pr ${spark-dockerfiles-dir} dockerfiles</argument>
              </arguments>
            </configuration>
          </execution>
          <execution>
            <!-- TODO: Remove this hack once upstream is fixed by SPARK-22777 -->
            <id>set-exec-bit-on-docker-entrypoint-sh</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>exec</goal>
            </goals>
            <configuration>
              <workingDirectory>${project.build.directory}/spark-distro/dockerfiles</workingDirectory>
              <executable>/bin/chmod</executable>
              <arguments>
                <argument>+x</argument>
                <argument>spark-base/entrypoint.sh</argument>
              </arguments>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>com.googlecode.maven-download-plugin</groupId>
        <artifactId>download-maven-plugin</artifactId>
        <version>${download-maven-plugin.version}</version>
        <executions>
          <execution>
            <id>download-minikube-linux</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>wget</goal>
            </goals>
            <configuration>
              <url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-linux-amd64</url>
              <outputDirectory>${project.build.directory}/minikube-bin/linux-amd64</outputDirectory>
              <outputFileName>minikube</outputFileName>
            </configuration>
          </execution>
          <execution>
            <id>download-minikube-darwin</id>
            <phase>pre-integration-test</phase>
            <goals>
              <goal>wget</goal>
            </goals>
            <configuration>
              <url>https://storage.googleapis.com/minikube/releases/v0.22.0/minikube-darwin-amd64</url>
              <outputDirectory>${project.build.directory}/minikube-bin/darwin-amd64</outputDirectory>
              <outputFileName>minikube</outputFileName>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <!-- Triggers scalatest plugin in the integration-test phase instead of
             the test phase. -->
        <groupId>org.scalatest</groupId>
        <artifactId>scalatest-maven-plugin</artifactId>
        <version>${scalatest-maven-plugin.version}</version>
        <configuration>
          <reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
          <junitxml>.</junitxml>
          <filereports>SparkTestSuite.txt</filereports>
          <argLine>-ea -Xmx3g -XX:ReservedCodeCacheSize=512m ${extraScalaTestArgs}</argLine>
          <stderr/>
          <systemProperties>
            <log4j.configuration>file:src/test/resources/log4j.properties</log4j.configuration>
            <java.awt.headless>true</java.awt.headless>
          </systemProperties>
          <tagsToExclude>${test.exclude.tags}</tagsToExclude>
        </configuration>
        <executions>
          <execution>
            <id>test</id>
            <goals>
              <goal>test</goal>
            </goals>
            <configuration>
              <!-- The negative pattern below prevents integration tests such as
                   KubernetesSuite from running in the test phase. -->
              <suffixes>(?&lt;!Suite)</suffixes>
            </configuration>
          </execution>
          <execution>
            <id>integration-test</id>
            <phase>integration-test</phase>
            <goals>
              <goal>test</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

</project>
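
The `unpack-spark-distro` execution above packs several shell steps into one `-c` argument. Expanded into a plain script for readability (an equivalent sketch, with `$SPARK_DISTRO_TGZ` standing in for the `${spark-distro-tgz}` Maven property), it does the following:

```sh
#!/bin/sh
# Runs in ${project.build.directory}, i.e. target/.
rm -rf spark-distro            # drop any previously unpacked distro
mkdir spark-distro-tmp
cd spark-distro-tmp
tar xfz "$SPARK_DISTRO_TGZ"    # the tarball holds a single top-level dir
mv * ../spark-distro           # give that dir a fixed, known name
cd ..
rm -rf spark-distro-tmp
```

This is why the distribution is "unpacked in place" under a stable `target/spark-distro` path regardless of the versioned directory name inside the tarball.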
integration-test/src/test/resources/log4j.properties

Lines changed: 31 additions & 0 deletions
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
#    http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# Set everything to be logged to the file target/integration-tests.log
log4j.rootCategory=INFO, file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.append=true
log4j.appender.file.file=target/integration-tests.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss.SSS} %t %p %c{1}: %m%n

# Ignore messages below warning level from a few verbose libraries.
log4j.logger.com.sun.jersey=WARN
log4j.logger.org.apache.hadoop=WARN
log4j.logger.org.eclipse.jetty=WARN
log4j.logger.org.mortbay=WARN
log4j.logger.org.spark_project.jetty=WARN
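
With this configuration the suites log to `target/integration-tests.log` rather than the console, so a convenient way to watch a run in progress is to tail that file from the `integration-test` module directory:

```sh
$ tail -f target/integration-tests.log
```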
