Solution: Cascading 2.2 Platform Test Does Not Run

Cascading 2.2 adds a feature that lets you write platform-independent unit tests: the tests run on whichever platform is on the classpath. To use it, have your test class extend PlatformTestCase. The getPlatform() method abstracts over the local and Hadoop test platforms.

import org.junit.Test;

import cascading.PlatformTestCase;
import cascading.flow.FlowDef;
import cascading.tap.SinkMode;
import cascading.tap.Tap;
import cascading.tuple.Fields;

public class RetainOrderTest extends PlatformTestCase {

    @Test
    public void retainField() throws Exception {

        // base directory for test data, read from the test.dir system property
        String testDir = System.getProperty("test.dir");

        // input of the job
        String sourcePath = testDir + "/input.txt";
        Tap source = getPlatform().getDelimitedFile(Fields.ALL, true, ",", "", sourcePath, SinkMode.KEEP);

        // output of the job
        String sinkPath = testDir + "/retain.txt";
        Tap sink = getPlatform().getDelimitedFile(Fields.ALL, true, ",", "/", sinkPath, SinkMode.REPLACE);

        // create the job definition, and run it
        FlowDef flowDef = RetainOrder.retainField(source, sink);
        getPlatform().getFlowConnector().connect(flowDef).complete();
    }
}

PlatformTestCase is not part of cascading-core, so you have to add the cascading-platform Maven dependency. The User Guide mentions that you need the tests classifier, so add this to the pom.xml file.

        <dependency>
            <groupId>cascading</groupId>
            <artifactId>cascading-platform</artifactId>
            <classifier>tests</classifier>
            <version>2.2.0</version>
        </dependency>


Even if the code compiles, you still need another library at runtime for the CascadingTestCase class, so add this to Maven as well. This is not specifically mentioned in the User Guide.

        <dependency>
            <groupId>cascading</groupId>
            <artifactId>cascading-core</artifactId>
            <classifier>tests</classifier>
            <version>2.2.0</version>
        </dependency>

Running the test now does nothing, and there is no error message. This is because there is no test platform on the classpath, so Cascading thinks there is nothing to do.
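Because the failure is silent, it can help to check the classpath directly. The following is a minimal sketch, not part of Cascading: the class PlatformClasspathCheck and its helper are hypothetical, and the two platform class names are my assumption about where the 2.2 test platforms live, so verify them against your jars. It only reports whether each class can be found.

```java
public class PlatformClasspathCheck {

    // ASSUMPTION: these are the test platform classes shipped in the
    // cascading-local and cascading-hadoop "tests" jars; verify against your jars.
    static final String[] ASSUMED_PLATFORM_CLASSES = {
        "cascading.platform.local.LocalPlatform",
        "cascading.platform.hadoop.HadoopPlatform"
    };

    // Returns true if the named class can be located, without initializing it.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, PlatformClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : ASSUMED_PLATFORM_CLASSES) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "missing"));
        }
    }
}
```

Run it with the same classpath as the tests. If both names are reported missing, that would be consistent with the test doing nothing.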

Finally, to get the test to run, add the local Cascading platform, the Hadoop Cascading platform, or both to the classpath using the following Maven dependencies in the pom.xml.

        <dependency>
            <groupId>cascading</groupId>
            <artifactId>cascading-hadoop</artifactId>
            <classifier>tests</classifier>
            <version>2.2.0</version>
        </dependency>

        <dependency>
            <groupId>cascading</groupId>
            <artifactId>cascading-local</artifactId>
            <classifier>tests</classifier>
            <version>2.2.0</version>
        </dependency>


Now your test should run on either environment.

