org.apache.pig.impl.builtin
Class RandomSampleLoader

java.lang.Object
  extended by org.apache.pig.impl.builtin.SampleLoader
      extended by org.apache.pig.impl.builtin.RandomSampleLoader
All Implemented Interfaces:
LoadFunc

public class RandomSampleLoader
extends SampleLoader

A loader that samples the data. This loader can subsume any loader that can handle starting in the middle of a record. Loaders that can handle this should implement the SamplableLoader interface.


Nested Class Summary
 
Nested classes/interfaces inherited from interface org.apache.pig.LoadFunc
LoadFunc.RequiredField, LoadFunc.RequiredFieldList, LoadFunc.RequiredFieldResponse
 
Field Summary
 
Fields inherited from class org.apache.pig.impl.builtin.SampleLoader
loader, numSamples, skipInterval
 
Constructor Summary
RandomSampleLoader(String funcSpec, String ns)
          Construct with a class of loader to use.
 
Method Summary
 LoadFunc.RequiredFieldResponse fieldsToRead(LoadFunc.RequiredFieldList requiredFieldList)
          Indicate to the loader fields that will be needed.
 void setNumSamples(int n)
           
 
Methods inherited from class org.apache.pig.impl.builtin.SampleLoader
bindTo, bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple, computeSamples, determineSchema, getNext, getNumSamples
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RandomSampleLoader

public RandomSampleLoader(String funcSpec,
                          String ns)
Construct with a class of loader to use.

Parameters:
funcSpec - func spec of the loader to use.
ns - Number of samples per map to collect. Arguments are passed as strings instead of actual types (FuncSpec, int) because FuncSpec only supports string arguments to UDF constructors.
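
The string-only constructor convention described above can be illustrated with a minimal sketch. It is not the Pig implementation: the class StringArgLoaderSketch, its fields, and its accessors are hypothetical stand-ins, and "PigStorage" is used only as an example value for funcSpec. The sketch assumes the sample count arrives as decimal text and is parsed once in the constructor.

```java
import java.util.Objects;

// Hypothetical stand-in for a loader whose constructor accepts only strings.
// It does not depend on Pig and does not reproduce RandomSampleLoader's logic.
final class StringArgLoaderSketch {
    private final String funcSpec;  // textual spec of the wrapped loader
    private final int numSamples;   // parsed from the string argument

    StringArgLoaderSketch(String funcSpec, String ns) {
        this.funcSpec = Objects.requireNonNull(funcSpec, "funcSpec");
        // Assumption: ns holds a decimal integer; bad input raises NumberFormatException.
        this.numSamples = Integer.parseInt(ns.trim());
    }

    String getFuncSpec() {
        return funcSpec;
    }

    int getNumSamples() {
        return numSamples;
    }

    public static void main(String[] args) {
        StringArgLoaderSketch sketch = new StringArgLoaderSketch("PigStorage", "100");
        System.out.println(sketch.getFuncSpec() + " " + sketch.getNumSamples());
    }
}
```

Converting in the constructor means a malformed count fails when the loader is created rather than later, when samples are being collected.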
Method Detail

setNumSamples

public void setNumSamples(int n)
Overrides:
setNumSamples in class SampleLoader

fieldsToRead

public LoadFunc.RequiredFieldResponse fieldsToRead(LoadFunc.RequiredFieldList requiredFieldList)
                                            throws FrontendException
Description copied from interface: LoadFunc
Indicate to the loader which fields will be needed. This can be useful for loaders that access data stored in a columnar format, where indicating the columns to be accessed ahead of time will save scans. If the loader function cannot make use of this information, it is free to ignore it.

Specified by:
fieldsToRead in interface LoadFunc
Overrides:
fieldsToRead in class SampleLoader
Parameters:
requiredFieldList - RequiredFieldList indicating which columns will be needed.
Throws:
FrontendException
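
The idea behind fieldsToRead can be sketched as follows, assuming a simple comma-delimited record format. The class ProjectionSketch is hypothetical and deliberately avoids the real Pig types: an int[] of column indexes stands in for LoadFunc.RequiredFieldList, and a boolean stands in for LoadFunc.RequiredFieldResponse.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader-like class that remembers which columns are needed
// and materializes only those when parsing a record.
final class ProjectionSketch {
    // Requested column indexes; null means "no hint given, keep every column".
    private int[] requiredColumns;

    // Stand-in for fieldsToRead: stores the hint and reports that it will be honored.
    boolean fieldsToRead(int[] columns) {
        requiredColumns = (columns == null) ? null : columns.clone();
        return true;
    }

    // Splits a comma-delimited line and keeps only the requested columns.
    List<String> parse(String line) {
        String[] parts = line.split(",", -1);
        List<String> out = new ArrayList<>();
        if (requiredColumns == null) {
            for (String part : parts) {
                out.add(part);
            }
            return out;
        }
        for (int c : requiredColumns) {
            // Columns outside the record are represented as null.
            out.add(c >= 0 && c < parts.length ? parts[c] : null);
        }
        return out;
    }

    public static void main(String[] args) {
        ProjectionSketch loader = new ProjectionSketch();
        loader.fieldsToRead(new int[] {0, 2});
        System.out.println(loader.parse("a,b,c"));
    }
}
```

A loader that cannot benefit from the hint could simply leave requiredColumns unset and return every column, which matches the "free to ignore it" wording above.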


Copyright © ${year} The Apache Software Foundation