public class FileStreams
extends java.lang.Object
FileStreams is a connector for integrating with file system objects.
File stream operations include:
textFileWriter
directoryWatcher
textFileReader
Modifier and Type | Method and Description |
---|---|
static TStream<java.lang.String> | directoryWatcher(TopologyElement te, Supplier<java.lang.String> directory) Declare a stream containing the absolute pathname of newly created file names from watching directory. |
static TStream<java.lang.String> | directoryWatcher(TopologyElement te, Supplier<java.lang.String> directory, java.util.Comparator<java.io.File> comparator) Declare a stream containing the absolute pathname of newly created file names from watching directory. |
static TStream<java.lang.String> | textFileReader(TStream<java.lang.String> pathnames) Declare a stream containing the lines read from the files whose pathnames correspond to each tuple on the pathnames stream. |
static TStream<java.lang.String> | textFileReader(TStream<java.lang.String> pathnames, Function<java.lang.String,java.lang.String> preFn, BiFunction<java.lang.String,java.lang.Exception,java.lang.String> postFn) Declare a stream containing the lines read from the files whose pathnames correspond to each tuple on the pathnames stream. |
static TSink<java.lang.String> | textFileWriter(TStream<java.lang.String> contents, Supplier<java.lang.String> basePathname) Write the contents of a stream to files. |
static TSink<java.lang.String> | textFileWriter(TStream<java.lang.String> contents, Supplier<java.lang.String> basePathname, Supplier<org.apache.edgent.connectors.file.runtime.IFileWriterPolicy<java.lang.String>> policy) Write the contents of a stream to files subject to the control of a file writer policy. |

public static TStream<java.lang.String> directoryWatcher(TopologyElement te, Supplier<java.lang.String> directory)
Declare a stream containing the absolute pathname of newly created file names from watching directory.
This is the same as directoryWatcher(te, directory, null).
Parameters:
te - topology element whose topology the watcher will be added to
directory - name of the directory to watch
Returns:
stream containing the absolute pathnames of newly created files in directory

public static TStream<java.lang.String> directoryWatcher(TopologyElement te, Supplier<java.lang.String> directory, java.util.Comparator<java.io.File> comparator)
Declare a stream containing the absolute pathname of newly created file names from watching directory.
Hidden files (java.io.File.isHidden()==true) are ignored.
This is compatible with textFileWriter.
Sample use:
String dir = "/some/directory/path";
Topology t = ...
TStream<String> pathnames = FileStreams.directoryWatcher(t, () -> dir, null);
The order of the files in the stream is dictated by a Comparator.
The default comparator orders files by File.lastModified() values.
There are no guarantees on the processing order of files that have the same lastModified value.
Note that lastModified values are subject to filesystem timestamp quantization, e.g., 1 second.
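As an illustration of supplying a custom comparator, the following sketch orders files by lastModified and breaks ties by file name, so that files sharing a quantized timestamp come out in a deterministic order. The class name FileOrdering and the tie-breaking rule are assumptions made for this example; they are not part of the connector.

```java
import java.io.File;
import java.util.Comparator;

// Hypothetical helper, not part of FileStreams.
final class FileOrdering {
    // Order by last-modified time first, then by file name so that
    // files with the same (quantized) timestamp have a stable order.
    static final Comparator<File> BY_MODIFIED_THEN_NAME =
            Comparator.comparingLong(File::lastModified)
                      .thenComparing(File::getName);
}
```

Such a comparator could be passed as the third argument, e.g. FileStreams.directoryWatcher(t, () -> dir, FileOrdering.BY_MODIFIED_THEN_NAME).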
Note: because watching is asynchronous, files in the directory may be removed before a tuple is handled. The receiver of a tuple with a "new" file pathname should be prepared for the pathname to no longer be valid when it receives the tuple or while it is processing the tuple.
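One way a receiver might tolerate a pathname that has already disappeared is sketched below. The class TolerantReader and its policy of returning an empty list are assumptions for illustration only; a real application may prefer to log or count such cases.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of FileStreams.
final class TolerantReader {
    // Returns the file's lines, or an empty list if the file no longer exists.
    static List<String> readIfPresent(String pathname) {
        try {
            return Files.readAllLines(Paths.get(pathname), StandardCharsets.UTF_8);
        } catch (NoSuchFileException e) {
            // The file was removed between detection and processing.
            return Collections.emptyList();
        } catch (IOException e) {
            // Any other I/O failure is surfaced to the caller.
            throw new UncheckedIOException(e);
        }
    }
}
```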
The behavior on macOS may be poor, even as recently as Java 8, because Java on macOS lacks a native implementation of WatchService.
The result can be a delay in detecting newly created files (e.g., 10 seconds) as well as not detecting rapid deletion and recreation of a file.
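The note above refers to java.nio.file.WatchService. For reference, a minimal standalone use of that API to detect a newly created file might look like the sketch below. The class CreateWatcherSketch, the method names, and the 30 second timeout are assumptions for this example; it does not reproduce the connector's implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, not part of FileStreams.
final class CreateWatcherSketch {
    // Waits up to timeoutSeconds for a creation event on dir and returns the
    // absolute pathname of the first created file, or null if none is seen.
    static String awaitCreated(WatchService watcher, Path dir, long timeoutSeconds)
            throws InterruptedException {
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return null; // timed out; with a polling implementation this can take a while
        }
        for (WatchEvent<?> event : key.pollEvents()) {
            if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                Path name = (Path) event.context(); // path relative to dir
                return dir.resolve(name).toAbsolutePath().toString();
            }
        }
        return null;
    }

    // Creates a temporary directory, watches it, creates one file, and
    // returns the pathname reported by the watch service.
    static String demo() {
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            Path dir = Files.createTempDirectory("watch-demo");
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            Files.createFile(dir.resolve("new.txt"));
            return awaitCreated(watcher, dir, 30);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```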
Parameters:
te - topology element whose topology the watcher will be added to
directory - name of the directory to watch
comparator - Comparator to use to order newly seen file pathnames. May be null.
Returns:
stream containing the absolute pathnames of newly created files in directory

public static TStream<java.lang.String> textFileReader(TStream<java.lang.String> pathnames)
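Declare a stream containing the lines read from the files whose pathnames correspond to each tuple on the pathnames stream.
This is the same as textFileReader(pathnames, null, null).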
Sample use:
String dir = "/some/directory/path";
Topology t = ...
TStream<String> pathnames = FileStreams.directoryWatcher(t, () -> dir);
TStream<String> contents = FileStreams.textFileReader(pathnames);
contents.print();
Parameters:
pathnames - Stream containing pathnames of files to read.

public static TStream<java.lang.String> textFileReader(TStream<java.lang.String> pathnames, Function<java.lang.String,java.lang.String> preFn, BiFunction<java.lang.String,java.lang.Exception,java.lang.String> postFn)
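Declare a stream containing the lines read from the files whose pathnames correspond to each tuple on the pathnames stream.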
All files are assumed to be encoded in UTF-8. The lines are output in the order they appear in each file, with the first line of a file appearing first. A file is not subsequently monitored for additional lines.
If a file cannot be read, e.g., because no file exists at that pathname or the pathname refers to a directory, an error is logged.
Optional preFn and postFn functions may be supplied.
They are called before and after processing each tuple (pathname), respectively, and provide a way to insert markers into the generated stream.
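The described semantics can be approximated outside of a topology with the plain-Java sketch below. It is not the connector's implementation: the class ReadWithMarkers is hypothetical, java.util.function.Function and BiFunction stand in for the connector's own function types, and a List stands in for the stream.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical stand-in for the reader's behavior, not part of FileStreams.
final class ReadWithMarkers {
    static List<String> read(List<String> pathnames,
                             Function<String, String> preFn,
                             BiFunction<String, Exception, String> postFn) {
        List<String> out = new ArrayList<>();
        for (String path : pathnames) {
            // Pre-visit: a non-null result is emitted before the file's lines.
            if (preFn != null) {
                String marker = preFn.apply(path);
                if (marker != null) out.add(marker);
            }
            // Read the file as UTF-8, emitting lines in file order.
            Exception error = null;
            try {
                out.addAll(Files.readAllLines(Paths.get(path), StandardCharsets.UTF_8));
            } catch (IOException e) {
                error = e; // unreadable file: remember the error for postFn
            }
            // Post-visit: receives null when there was no error;
            // a non-null result is emitted after the file's lines.
            if (postFn != null) {
                String marker = postFn.apply(path, error);
                if (marker != null) out.add(marker);
            }
        }
        return out;
    }
}
```

In this sketch, reading a nonexistent path with preFn `p -> "BEGIN " + p` and postFn `(p, e) -> e == null ? "OK" : "FAILED"` produces the two markers and no lines.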
Sample use:
// watch a directory for files, creating a stream with the contents of
// each file. Use a preFn to include a file separator marker in the
// stream. Use a postFn to delete a file once it's been processed.
String dir = "/some/directory/path";
Topology t = ...
TStream<String> pathnames = FileStreams.directoryWatcher(t, () -> dir);
TStream<String> contents = FileStreams.textFileReader(
    pathnames,
    path -> { return "###<PATH-MARKER>### " + path; },
    (path,exception) -> { new File(path).delete(); return null; }
    );
contents.print();
Parameters:
pathnames - Stream containing pathnames of files to read.
preFn - Pre-visit Function<String,String>. The input is the pathname. The result, when non-null, is added to the output stream. The function may be null.
postFn - Post-visit BiFunction<String,Exception,String>. The input is the pathname and an exception. The exception is null if there were no errors. The result, when non-null, is added to the output stream. The function may be null.

public static TSink<java.lang.String> textFileWriter(TStream<java.lang.String> contents, Supplier<java.lang.String> basePathname)
Write the contents of a stream to files.
The default FileWriterPolicy is used.
This is the same as textFileWriter(contents, basePathname, null).
Sample use:
// write a stream of LogEvent to files, using the default
// file writer policy
String basePathname = "/myLogDir/LOG"; // yield LOG_YYYYMMDD_HHMMSS
TStream<MyLogEvent> events = ...
TStream<String> stringEvents = events.map(event -> event.toString());
FileStreams.textFileWriter(stringEvents, () -> basePathname);
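The comment in the sample suggests that file names are formed from the base pathname plus a timestamp suffix. A sketch of how such a name could be derived is shown below. The class TimestampedName and the exact pattern are assumptions inferred from the LOG_YYYYMMDD_HHMMSS comment; the policy in use determines the actual naming.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of FileStreams.
final class TimestampedName {
    // Pattern assumed from the "LOG_YYYYMMDD_HHMMSS" comment in the sample.
    private static final DateTimeFormatter SUFFIX =
            DateTimeFormatter.ofPattern("yyyyMMdd_HHmmss");

    // Appends "_<timestamp>" to the base pathname.
    static String of(String basePathname, LocalDateTime when) {
        return basePathname + "_" + when.format(SUFFIX);
    }
}
```

For example, TimestampedName.of("/myLogDir/LOG", LocalDateTime.of(2016, 12, 1, 16, 41, 0)) returns "/myLogDir/LOG_20161201_164100".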
Parameters:
contents - the lines to write
basePathname - the base pathname of the created files

public static TSink<java.lang.String> textFileWriter(TStream<java.lang.String> contents, Supplier<java.lang.String> basePathname, Supplier<org.apache.edgent.connectors.file.runtime.IFileWriterPolicy<java.lang.String>> policy)
Write the contents of a stream to files subject to the control of a file writer policy.
A separate policy instance must be used for each invocation.
A default FileWriterPolicy is used if a policy is not specified.
Sample use:
// write a stream of LogEvent to files using a policy of:
// no additional flush, 100 events per file, retain 5 files
IFileWriterPolicy<String> policy = new FileWriterPolicy<String>(
FileWriterFlushConfig.newImplicitConfig(),
FileWriterCycleConfig.newCountBasedConfig(100),
FileWriterRetentionConfig.newFileCountBasedConfig(5)
);
String basePathname = "/myLogDir/LOG"; // yield LOG_YYYYMMDD_HHMMSS
TStream<MyLogEvent> events = ...
TStream<String> stringEvents = events.map(event -> event.toString());
FileStreams.textFileWriter(stringEvents, () -> basePathname, () -> policy);
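To make the effect of "100 events per file, retain 5 files" concrete, the following simplified model simulates count-based cycling with file-count retention. It is only a sketch: the class CycleRetainSketch is hypothetical, files are represented by sequence numbers, and it assumes the file currently being written counts toward the retention limit, which the documentation above does not state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of cycling and retention, not the FileWriterPolicy code.
final class CycleRetainSketch {
    // Simulates writing tupleCount tuples and returns the sequence numbers
    // of the files that remain afterwards, oldest first.
    static List<Integer> retainedFiles(int tupleCount, int tuplesPerFile, int maxFiles) {
        Deque<Integer> files = new ArrayDeque<>();
        int inCurrent = 0;  // tuples written to the current file
        int nextIndex = 0;  // sequence number for the next file
        for (int i = 0; i < tupleCount; i++) {
            if (files.isEmpty() || inCurrent == tuplesPerFile) {
                files.addLast(nextIndex++);     // cycle: start a new file
                inCurrent = 0;
                while (files.size() > maxFiles) {
                    files.removeFirst();        // retention: drop the oldest file
                }
            }
            inCurrent++;
        }
        return new ArrayList<>(files);
    }
}
```

Under these assumptions, 650 tuples with 100 per file and a limit of 5 files leaves files 2 through 6.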
Parameters:
contents - the lines to write
basePathname - the base pathname of the created files
policy - the policy to use. May be null.
See Also:
FileWriterPolicy
Copyright © 2016 The Apache Software Foundation. All Rights Reserved - bbe71fa-20161201-1641