Show / Hide Table of Contents

    Interface IDataStreamer<TK, TV>

    Data streamer is responsible for loading external data into cache. It achieves it by properly buffering updates and properly mapping keys to nodes responsible for the data to make sure that there is the least amount of data movement possible and optimal network and memory utilization.

    Note that streamer will load data concurrently by multiple internal threads, so the data may get to remote nodes in different order from which it was added to the streamer.

    Also note that IDataStreamer is not the only way to load data into cache. Alternatively you can use LoadCache(Action<TK, TV>, Object[]) method to load data from underlying data store. You can also use standard cache put and putAll operations as well, but they most likely will not perform as well as this class for loading data. And finally, data can be loaded from underlying data store on demand, whenever it is accessed - for this no explicit data loading step is needed.

    IDataStreamer supports the following configuration properties:

    • PerNodeBufferSizeWhen entries are added to data streamer they are not sent to Ignite right away and are buffered internally for better performance and network utilization. This setting controls the size of internal per-node buffer before buffered data is sent to remote node. Default value is 1024.
    • PerThreadBufferSizeWhen entries are added to data streamer they are not sent to Ignite right away and are buffered internally on per thread basis for better performance and network utilization. This setting controls the size of internal per-thread buffer before buffered data is sent to remote node. Default value is 4096.
    • PerNodeParallelOperationsSometimes data may be added to the data streamer faster than it can be put in cache. In this case, new buffered load messages are sent to remote nodes before responses from previous ones are received. This could cause unlimited heap memory utilization growth on local and remote nodes. To control memory utilization, this setting limits maximum allowed number of parallel buffered load messages that are being processed on remote nodes. If this number is exceeded, then data streamer add/remove methods will block to control memory utilization. Default value is 16.
    • AutoFlushFrequencyAutomatic flush frequency in milliseconds. Essentially, this is the time after which the streamer will make an attempt to submit all data added so far to remote nodes. Note that there is no guarantee that data will be delivered after this concrete attempt (e.g., it can fail when topology is changing), but it won't be lost anyway. Disabled by default (default value is 0).
    • IsolatedDefines if data streamer will assume that there are no other concurrent updates and allow data streamer choose most optimal concurrent implementation. Default value is false.

    All members are thread-safe and may be used concurrently from multiple threads.

    Inherited Members
    System.IDisposable.Dispose()
    Namespace: Apache.Ignite.Core.Datastream
    Assembly: Apache.Ignite.Core.dll
    Syntax
    public interface IDataStreamer<TK, TV> : IDisposable
    Type Parameters
    Name Description
    TK
    TV

    Properties

    AllowOverwrite

    Gets or sets a value indicating whether existing values can be overwritten by the data streamer. Performance is better when this flag is false.

    NOTE: When false, cache updates won't be propagated to cache store (even if SkipStore is false).

    Default is false.

    Declaration
    bool AllowOverwrite { get; set; }
    Property Value
    Type Description
    System.Boolean

    AutoFlushFrequency

    Automatic flush frequency in milliseconds. Essentially, this is the time after which the streamer will make an attempt to submit all data added so far to remote nodes. Note that there is no guarantee that data will be delivered after this concrete attempt (e.g., it can fail when topology is changing), but it won't be lost anyway.

    If set to 0, automatic flush is disabled.

    Default is 0 (disabled).

    Declaration
    long AutoFlushFrequency { get; set; }
    Property Value
    Type Description
    System.Int64

    CacheName

    Name of the cache to load data to.

    Declaration
    string CacheName { get; }
    Property Value
    Type Description
    System.String

    PerNodeBufferSize

    Size of per node key-value pairs buffer.

    Setter must be called before any add/remove operation.

    Default is DefaultPerNodeBufferSize.

    Declaration
    int PerNodeBufferSize { get; set; }
    Property Value
    Type Description
    System.Int32

    PerNodeParallelOperations

    Maximum number of parallel load operations for a single node.

    Setter must be called before any add/remove operation.

    Default is 0, which means Ignite calculates this automatically as DataStreamerThreadPoolSize * DefaultParallelOperationsMultiplier.

    Declaration
    int PerNodeParallelOperations { get; set; }
    Property Value
    Type Description
    System.Int32

    PerThreadBufferSize

    Size of per thread key-value pairs buffer.

    Setter must be called before any add/remove operation.

    Default is DefaultPerThreadBufferSize.

    Declaration
    int PerThreadBufferSize { get; set; }
    Property Value
    Type Description
    System.Int32

    Receiver

    Gets or sets custom stream receiver.

    Declaration
    IStreamReceiver<TK, TV> Receiver { get; set; }
    Property Value
    Type Description
    IStreamReceiver<TK, TV>

    SkipStore

    Flag indicating that write-through behavior should be disabled for data loading.

    AllowOverwrite must be true for write-through to work.

    Default is false.

    Declaration
    bool SkipStore { get; set; }
    Property Value
    Type Description
    System.Boolean

    Task

    Gets the task for this loading process. This task completes whenever method Close(Boolean) completes.

    Declaration
    Task Task { get; }
    Property Value
    Type Description
    System.Threading.Tasks.Task

    Timeout

    Gets or sets the timeout. Negative values mean no timeout. Default is DefaultTimeout.

    Timeout is used in the following cases:

  • Any data addition method can be blocked when all per node parallel operations are exhausted. The timeout defines the max time you will be blocked waiting for a permit to add a chunk of data into the streamer;
  • Total timeout time for Flush() operation;
  • Total timeout time for Close(Boolean) operation.
  • Declaration
    TimeSpan Timeout { get; set; }
    Property Value
    Type Description
    System.TimeSpan

    Methods

    AddData(TK, TV)

    Adds single key-value pair for loading. Passing null as value will be interpreted as removal.

    Declaration
    Task AddData(TK key, TV val)
    Parameters
    Type Name Description
    TK key

    Key.

    TV val

    Value.

    Returns
    Type Description
    System.Threading.Tasks.Task

    Task for this operation.

    AddData(ICollection<KeyValuePair<TK, TV>>)

    Adds collection of key-value pairs for loading.

    Declaration
    Task AddData(ICollection<KeyValuePair<TK, TV>> entries)
    Parameters
    Type Name Description
    System.Collections.Generic.ICollection<System.Collections.Generic.KeyValuePair<TK, TV>> entries

    Entries.

    Returns
    Type Description
    System.Threading.Tasks.Task

    Task for this operation.

    AddData(KeyValuePair<TK, TV>)

    Adds single key-value pair for loading. Passing null as pair's value will be interpreted as removal.

    Declaration
    Task AddData(KeyValuePair<TK, TV> pair)
    Parameters
    Type Name Description
    System.Collections.Generic.KeyValuePair<TK, TV> pair

    Key-value pair.

    Returns
    Type Description
    System.Threading.Tasks.Task

    Task for this operation.

    Close(Boolean)

    Closes this streamer optionally loading any remaining data.

    Declaration
    void Close(bool cancel)
    Parameters
    Type Name Description
    System.Boolean cancel

    Whether to cancel ongoing loading operations. When set to true there is not guarantees what data will be actually loaded to cache.

    Flush()

    Loads any remaining data, but doesn't close the streamer. Data can be still added after flush is finished. This method blocks and doesn't allow to add any data until all data is loaded.

    Declaration
    void Flush()

    RemoveData(TK)

    Adds key for removal.

    Declaration
    Task RemoveData(TK key)
    Parameters
    Type Name Description
    TK key

    Key.

    Returns
    Type Description
    System.Threading.Tasks.Task

    Task for this operation.

    TryFlush()

    Makes an attempt to load remaining data. This method is mostly similar to Flush() with the difference that it won't wait and will exit immediately.

    Declaration
    void TryFlush()

    WithKeepBinary<TK1, TV1>()

    Gets streamer instance with binary mode enabled, changing key and/or value types if necessary. In binary mode stream receiver gets data in binary format. You can only change key/value types when transitioning from non-binary to binary streamer; Changing type of binary streamer is not allowed and will throw an System.InvalidOperationException

    Declaration
    IDataStreamer<TK1, TV1> WithKeepBinary<TK1, TV1>()
    Returns
    Type Description
    IDataStreamer<TK1, TV1>

    Streamer instance with binary mode enabled.

    Type Parameters
    Name Description
    TK1

    Key type in binary mode.

    TV1

    Value type in binary mode.

    Back to top © 2015 - 2019 The Apache Software Foundation