public class ComputeResponse
extends java.lang.Object
NOTE:
- Elasticsearch is NOT used in practice: speed issues arose when combining ES and Spark that appear to be specific to the ES-Spark connector. The ES code is left in place in anticipation that those issues are resolved.
- Even when rdd.count() calls are embedded in logger.debug statements, Spark still computes them (a count() is an action, and the debug argument is evaluated before the call). They are therefore commented out in the code below; uncomment them for rdd.count() debugging.
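The second note reflects a general Java pitfall: the argument to logger.debug is built before the method is called, so an embedded rdd.count() runs even when debug logging is disabled. A minimal stdlib sketch of the problem and the lazy-argument fix, using a Supplier and a counter in place of a real logger and Spark action (all names here are illustrative, not part of ComputeResponse):

```java
import java.util.function.Supplier;

public class LazyDebugDemo {
    static boolean debugEnabled = false;
    static int evaluations = 0;

    // Stand-in for an expensive Spark action such as rdd.count()
    static long expensiveCount() {
        evaluations++;
        return 42L;
    }

    // Eager logging: the message string (and the count inside it) is
    // computed before this method runs, regardless of the log level
    static void debugEager(String msg) {
        if (debugEnabled) System.out.println(msg);
    }

    // Lazy logging: the Supplier is only invoked when debug is enabled
    static void debugLazy(Supplier<String> msg) {
        if (debugEnabled) System.out.println(msg.get());
    }

    public static void main(String[] args) {
        debugEager("count = " + expensiveCount()); // count computed even though debug is off
        debugLazy(() -> "count = " + expensiveCount()); // count NOT computed
        System.out.println(evaluations); // prints 1: only the eager call evaluated
    }
}
```

Commenting the calls out entirely, as the class does, avoids the cost unconditionally; wrapping them in a Supplier (or an isDebugEnabled() check) avoids it only when debug is off.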
| Constructor and Description |
|---|
| ComputeResponse(org.apache.hadoop.fs.FileSystem fileSys) |
| Modifier and Type | Method and Description |
|---|---|
| void | performQuery() : Method to read in data from an allowed input source/format and perform the query |
| void | performQuery(org.apache.spark.api.java.JavaRDD&lt;org.apache.hadoop.io.MapWritable&gt; inputRDD) : Method to perform the query given an input RDD of MapWritables |
| org.apache.spark.api.java.JavaRDD&lt;org.apache.hadoop.io.MapWritable&gt; | readData() : Method to read in the data from an allowed input format, filter, and return an RDD of MapWritable data elements |
| org.apache.spark.api.java.JavaRDD&lt;org.apache.hadoop.io.MapWritable&gt; | readDataES() : Method to read in the data from Elasticsearch, filter, and return an RDD of MapWritable data elements |
public ComputeResponse(org.apache.hadoop.fs.FileSystem fileSys) throws PIRException
Throws:
- PIRException

public void performQuery() throws java.io.IOException, PIRException
Throws:
- java.io.IOException
- PIRException

public org.apache.spark.api.java.JavaRDD&lt;org.apache.hadoop.io.MapWritable&gt; readData() throws java.io.IOException, PIRException
Throws:
- java.io.IOException
- PIRException

public org.apache.spark.api.java.JavaRDD&lt;org.apache.hadoop.io.MapWritable&gt; readDataES() throws java.io.IOException, PIRException
Throws:
- java.io.IOException
- PIRException

public void performQuery(org.apache.spark.api.java.JavaRDD&lt;org.apache.hadoop.io.MapWritable&gt; inputRDD) throws PIRException
Throws:
- PIRException