org.apache.pig.piggybank.evaluation.util.apachelogparser
Class DateExtractor

java.lang.Object
  extended by org.apache.pig.EvalFunc<String>
      extended by org.apache.pig.piggybank.evaluation.util.apachelogparser.DateExtractor

public class DateExtractor
extends EvalFunc<String>

DateExtractor has four constructors, one for each supported combination of formats and time zone. The incomingDateFormat ("dd/MMM/yyyy:HH:mm:ss Z" by default) is used to parse the date string that gets passed in from the log. The outgoingDateFormat ("yyyy-MM-dd" by default) is used to format the returned string. Use the constructor that matches the values you need to override.

Any pattern supported by SimpleDateFormat can be used. For example, if you were starting with the default incoming format and wanted to extract just the year, you would use the single-string constructor DateExtractor("yyyy").

From Pig Latin you need to define an alias to use a non-default format, and then call the function through that alias:

    define MyDateExtractor org.apache.pig.piggybank.evaluation.util.apachelogparser.DateExtractor("yyyy-MM");
    A = FOREACH row GENERATE MyDateExtractor(dayTime);

If a string cannot be parsed, null is returned and an error message is printed to stderr. By default, DateExtractor uses the GMT time zone. Use the three-parameter constructor to override the time zone.
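
The parse-then-format behavior described above can be sketched in plain Java. This is a minimal illustration, not the piggybank implementation: it assumes java.text.SimpleDateFormat for both formats, and the class name DateExtractorSketch, the method convert, and the sample timestamp are hypothetical names chosen for this example.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical stand-in for the conversion DateExtractor performs.
public class DateExtractorSketch {

    // Parses raw with incomingPattern and re-formats it with outgoingPattern
    // in the given time zone. Returns null if raw does not match incomingPattern.
    static String convert(String raw, String incomingPattern,
                          String outgoingPattern, String timeZoneId) {
        SimpleDateFormat incoming = new SimpleDateFormat(incomingPattern, Locale.US);
        SimpleDateFormat outgoing = new SimpleDateFormat(outgoingPattern, Locale.US);
        outgoing.setTimeZone(TimeZone.getTimeZone(timeZoneId));
        try {
            Date parsed = incoming.parse(raw);
            return outgoing.format(parsed);
        } catch (ParseException e) {
            // Mirrors the documented behavior: report on stderr, return null.
            System.err.println("Unable to parse date: " + raw);
            return null;
        }
    }

    public static void main(String[] args) {
        String incoming = "dd/MMM/yyyy:HH:mm:ss Z"; // documented default
        String raw = "20/Jun/2008:18:14:17 -0700";  // made-up log timestamp

        // Default outgoing format, GMT: 18:14 at -0700 is 01:14 the next day in GMT.
        System.out.println(convert(raw, incoming, "yyyy-MM-dd", "GMT")); // 2008-06-21

        // Year only, as with DateExtractor("yyyy").
        System.out.println(convert(raw, incoming, "yyyy", "GMT"));       // 2008

        // Unparseable input yields null.
        System.out.println(convert("not a date", incoming, "yyyy-MM-dd", "GMT")); // null
    }
}
```

The first call shows why the time zone matters: the log entry is dated 20 June locally, but the extracted date is 21 June because the output is expressed in GMT.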


Field Summary
 
Fields inherited from class org.apache.pig.EvalFunc
log, pigLogger, reporter, returnType
 
Constructor Summary
DateExtractor()
          forms the formats based on default incomingDateFormat and default outgoingDateFormat
DateExtractor(String outgoingDateString)
          forms the formats based on passed outgoingDateString and the default incomingDateFormat
DateExtractor(String incomingDateString, String outgoingDateString)
          forms the formats based on passed incomingDateString and outgoingDateString
DateExtractor(String incomingDateString, String outgoingDateString, String timeZoneID)
          forms the formats based on passed incomingDateString and outgoingDateString, and expresses dates in the time zone identified by timeZoneID
 
Method Summary
 String exec(Tuple input)
          This callback method must be implemented by all subclasses.
 List<FuncSpec> getArgToFuncMapping()
           
 
Methods inherited from class org.apache.pig.EvalFunc
finish, getLogger, getPigLogger, getReporter, getReturnType, getSchemaName, isAsynchronous, outputSchema, progress, setPigLogger, setReporter, warn
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DateExtractor

public DateExtractor()
forms the formats based on default incomingDateFormat and default outgoingDateFormat

DateExtractor

public DateExtractor(String outgoingDateString)
forms the formats based on passed outgoingDateString and the default incomingDateFormat

Parameters:
outgoingDateString - outgoingDateFormat is based on outgoingDateString

DateExtractor

public DateExtractor(String incomingDateString,
                     String outgoingDateString)
forms the formats based on passed incomingDateString and outgoingDateString

Parameters:
incomingDateString - incomingDateFormat is based on incomingDateString
outgoingDateString - outgoingDateFormat is based on outgoingDateString

DateExtractor

public DateExtractor(String incomingDateString,
                     String outgoingDateString,
                     String timeZoneID)
forms the formats based on passed incomingDateString and outgoingDateString, and expresses dates in the time zone identified by timeZoneID

Parameters:
incomingDateString - incomingDateFormat is based on incomingDateString
outgoingDateString - outgoingDateFormat is based on outgoingDateString
timeZoneID - time zone id in which dates should be expressed.
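
The effect of timeZoneID can be illustrated with a short Java sketch. It is an approximation under stated assumptions rather than the class's actual code: it uses java.text.SimpleDateFormat and java.util.TimeZone, assumes the ID is interpreted the way TimeZone.getTimeZone interprets it, and the names TimeZoneSketch and dayIn are hypothetical.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper showing how the output time zone changes the extracted day.
public class TimeZoneSketch {

    // Formats raw (in the documented default incoming format) as yyyy-MM-dd
    // in the time zone named by timeZoneId.
    static String dayIn(String raw, String timeZoneId) {
        SimpleDateFormat incoming = new SimpleDateFormat("dd/MMM/yyyy:HH:mm:ss Z", Locale.US);
        SimpleDateFormat outgoing = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        outgoing.setTimeZone(TimeZone.getTimeZone(timeZoneId));
        try {
            return outgoing.format(incoming.parse(raw));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable date: " + raw, e);
        }
    }

    public static void main(String[] args) {
        String raw = "20/Jun/2008:18:14:17 -0700"; // made-up log timestamp

        System.out.println(dayIn(raw, "GMT"));                 // 2008-06-21
        System.out.println(dayIn(raw, "America/Los_Angeles")); // 2008-06-20
    }
}
```

The same instant falls on different calendar days depending on the zone, which is the case the three-parameter constructor is meant to handle.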

Method Detail

exec

public String exec(Tuple input)
            throws IOException
Description copied from class: EvalFunc
This callback method must be implemented by all subclasses. This is the method that will be invoked on every Tuple of a given dataset. Since the dataset may be divided up in a variety of ways, the programmer should not make assumptions about state that is maintained between invocations of this method.

Specified by:
exec in class EvalFunc<String>
Parameters:
input - the Tuple to be processed.
Returns:
the date formatted according to outgoingDateFormat, or null if the input string cannot be parsed.
Throws:
IOException

getArgToFuncMapping

public List<FuncSpec> getArgToFuncMapping()
                                   throws FrontendException
Overrides:
getArgToFuncMapping in class EvalFunc<String>
Returns:
A List containing FuncSpec objects representing the Function class which can handle the inputs corresponding to the schema in the objects
Throws:
FrontendException


Copyright © ${year} The Apache Software Foundation