org.apache.pig.impl.builtin
Class RandomSampleLoader
java.lang.Object
org.apache.pig.impl.builtin.SampleLoader
org.apache.pig.impl.builtin.RandomSampleLoader
- All Implemented Interfaces:
- LoadFunc
public class RandomSampleLoader
- extends SampleLoader
A loader that samples the data. This loader can subsume any loader that
can handle starting in the middle of a record. Loaders that can
do so should implement the SamplableLoader interface.
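As an illustration of collecting a fixed number of random samples from a stream of records, the following is a minimal sketch of reservoir sampling. It is an assumption chosen for illustration, not necessarily the strategy this class or SampleLoader uses. The class name ReservoirSketch and the method sample are hypothetical and are not part of Pig.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch, not a Pig class: reservoir sampling over any Iterable.
public class ReservoirSketch {

    // Returns up to k records, each input record being kept with equal probability.
    public static <T> List<T> sample(Iterable<T> records, int k, Random rng) {
        List<T> reservoir = new ArrayList<>();
        if (k <= 0) {
            return reservoir;
        }
        long seen = 0;
        for (T rec : records) {
            seen++;
            if (reservoir.size() < k) {
                // Fill the reservoir with the first k records.
                reservoir.add(rec);
            } else {
                // Pick a slot in [0, seen); the record is kept only if the slot is below k.
                long slot = (long) (rng.nextDouble() * seen);
                if (slot < k) {
                    reservoir.set((int) slot, rec);
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        // Prints three of the ten values; which ones depends on the seed.
        System.out.println(sample(input, 3, new Random(42L)));
    }
}
```

The sample size stays bounded at k regardless of input length, which is the property a per-map sample count relies on.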
Methods inherited from class org.apache.pig.impl.builtin.SampleLoader
bindTo, bytesToBag, bytesToCharArray, bytesToDouble, bytesToFloat, bytesToInteger, bytesToLong, bytesToMap, bytesToTuple, computeSamples, determineSchema, getNext, getNumSamples
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
RandomSampleLoader
public RandomSampleLoader(String funcSpec,
String ns)
- Construct with a class of loader to use.
- Parameters:
funcSpec
- func spec of the loader to use.
ns
- Number of samples per map to collect.
Arguments are passed as strings instead of actual types (FuncSpec, int)
because FuncSpec only supports string arguments to
UDF constructors.
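The string-only constructor pattern described above can be sketched as follows. This is a self-contained stand-in, not the Pig implementation: the class StringArgLoaderSketch and its accessors are hypothetical, and parsing the count with Integer.parseInt is an assumption about how a string argument might be converted.

```java
// Hypothetical stand-in, not a Pig class: mirrors the (String, String) constructor shape.
public class StringArgLoaderSketch {
    private final String loaderSpec; // kept as a string, as a func spec would be
    private int numSamples;

    public StringArgLoaderSketch(String funcSpec, String ns) {
        this.loaderSpec = funcSpec;
        // Assumption: the sample count arrives as text and is parsed here.
        this.numSamples = Integer.parseInt(ns.trim());
    }

    // Allows the count to be changed after construction.
    public void setNumSamples(int n) {
        this.numSamples = n;
    }

    public int getNumSamples() {
        return numSamples;
    }

    public String getLoaderSpec() {
        return loaderSpec;
    }

    public static void main(String[] args) {
        StringArgLoaderSketch s =
                new StringArgLoaderSketch("org.apache.pig.builtin.PigStorage", "100");
        System.out.println(s.getLoaderSpec() + " " + s.getNumSamples());
    }
}
```

A non-numeric second argument makes Integer.parseInt throw NumberFormatException, so a caller building the arguments as strings would need to pass a valid integer literal.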
setNumSamples
public void setNumSamples(int n)
- Overrides:
setNumSamples
in class SampleLoader
fieldsToRead
public LoadFunc.RequiredFieldResponse fieldsToRead(LoadFunc.RequiredFieldList requiredFieldList)
throws FrontendException
- Description copied from interface:
LoadFunc
- Indicate to the loader the fields that will be needed. This can be useful for
loaders that access data stored in a columnar format, where indicating the
columns to be accessed ahead of time will save scans. If the loader
function cannot make use of this information, it is free to ignore it.
- Specified by:
fieldsToRead
in interface LoadFunc
- Overrides:
fieldsToRead
in class SampleLoader
- Parameters:
requiredFieldList
- RequiredFieldList indicating which columns will be needed.
- Throws:
FrontendException
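The idea behind the required-fields hint can be sketched as a simple projection: given the column positions that will be needed, only those values are returned. The class ProjectionSketch and its project method are hypothetical and do not use LoadFunc.RequiredFieldList or any other Pig type. The sketch also works on a delimited text record, so it shows only the projection step, not the scan savings a columnar format would get.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch, not a Pig class: projects a record onto the required columns.
public class ProjectionSketch {

    // Returns the values at the requested column indexes, or null where a column is missing.
    public static List<String> project(String line, String delimiter,
                                       List<Integer> requiredColumns) {
        // Quote the delimiter so it is matched literally; -1 keeps trailing empty fields.
        String[] fields = line.split(Pattern.quote(delimiter), -1);
        List<String> out = new ArrayList<>();
        for (int idx : requiredColumns) {
            out.add(idx >= 0 && idx < fields.length ? fields[idx] : null);
        }
        return out;
    }

    public static void main(String[] args) {
        // Only columns 0 and 2 are needed downstream.
        System.out.println(project("a,b,c,d", ",", Arrays.asList(0, 2)));
    }
}
```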
Copyright © ${year} The Apache Software Foundation