ordering#
Operators related to ordering streams.
Classes#
- class OrderOut#
-
Streams returned from the
orderoperator.- down: KeyedStream[V]#
Items in timestamp order.
- late: KeyedStream[V]#
Upstream items that were late.
Functions#
- order( ) OrderOut[V]#
Order items by timestamp, within limits.
This allows you to take a stream of out-of-timestamp-order items and streaming sort them into timestamp order. Further stateful processing (e.g. a state machine) might need items in timestamp order.
You generally will use this operator with an
EventClock.wait_for_system_durationis a crucial parameter to think about for this clock: The larger this value is, the more out-of-order the timestamps can be in your upstream without dropping items as late. But the tradeoff is that this increases the memory usage (as all items for that duration need to be buffered) and increases the latency of a real-time data source (in that you will need to wait this amount of system time to see if new out-of-order data is still coming).For a streaming system, generally set
wait_for_system_durationto the amount of system time latency you are willing to wait for as correct answers as possible.For bounded batch processing, generally set
wait_for_system_durationto the maximum amount of out-of-orderedness you think you’ll see in the data. You also can setwait_for_system_durationtodatetime.timedelta.maxif you are ok with buffering all of the incoming data and sorting it and don’t want to worry about a maximum amount of out-of-order-ness, but this will have memory usage implications.This operator is not needed before standard
windowingoperators; they sort internally to give you their operator guarantees.- Parameters:
step_id – Unique ID.
up – Stream.
clock – Time definition. This will almost always be an
EventClock.
- Returns:
Order result streams.