Commit 5b5f1e7

PartitionEvaluator et al.
1 parent d139926 commit 5b5f1e7

6 files changed: +117 -0 lines changed

docs/PartitionEvaluator.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
---
tags:
  - DeveloperApi
---

# PartitionEvaluator

`PartitionEvaluator[T, U]` is an [abstraction](#contract) of [partition evaluators](#implementations) that can [compute (_evaluate_) one or more RDD partitions](#eval).

## Contract

### Evaluate Partitions { #eval }

```scala
eval(
  partitionIndex: Int,
  inputs: Iterator[T]*): Iterator[U]
```

Used when:

* `MapPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/MapPartitionsWithEvaluatorRDD.md#compute)
* `ZippedPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/ZippedPartitionsWithEvaluatorRDD.md#compute)

## Implementations

!!! note
    There are no built-in implementations in Spark Core (only in [Spark SQL]({{ book.spark_sql }})).
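
Since Spark Core ships no implementation, the `eval` contract above may be easier to picture with a small sketch. The trait below is a local stand-in that only mirrors the documented signature, so the example runs without Spark on the classpath. `TagWithPartitionIndex` and the demo object are hypothetical names made up for illustration.

```scala
// Local stand-in that mirrors the contract documented above.
// It is NOT Spark's class; it only keeps the sketch runnable without Spark.
trait PartitionEvaluator[T, U] {
  def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[U]
}

// Hypothetical evaluator: tags every record with the index of its partition.
class TagWithPartitionIndex[T] extends PartitionEvaluator[T, (Int, T)] {
  override def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[(Int, T)] =
    // `inputs` is a varargs of iterators; all of them are walked in order.
    inputs.iterator.flatMap(input => input.map(record => (partitionIndex, record)))
}

object TagWithPartitionIndexDemo {
  // Simulates evaluating partition 3 with a single input iterator.
  val tagged: List[(Int, String)] =
    new TagWithPartitionIndex[String]().eval(3, Iterator("a", "b")).toList

  def main(args: Array[String]): Unit = println(tagged)
}
```

The varargs parameter is what lets one evaluator serve both the single-input and the zipped case listed under "Used when".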

docs/PartitionEvaluatorFactory.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
---
tags:
  - DeveloperApi
---

# PartitionEvaluatorFactory

`PartitionEvaluatorFactory[T, U]` is an [abstraction](#contract) of [PartitionEvaluator factories](#implementations).

`PartitionEvaluatorFactory` is a `Serializable` ([Java]({{ java.api }}/java/io/Serializable.html)).

## Contract

### Creating PartitionEvaluator { #createEvaluator }

```scala
createEvaluator(): PartitionEvaluator[T, U]
```

Creates a [PartitionEvaluator](PartitionEvaluator.md).

Used when:

* `MapPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/MapPartitionsWithEvaluatorRDD.md#compute)
* `ZippedPartitionsWithEvaluatorRDD` is requested to [compute a partition](rdd/ZippedPartitionsWithEvaluatorRDD.md#compute)

## Implementations

!!! note
    There are no built-in implementations in Spark Core (only in [Spark SQL]({{ book.spark_sql }})).
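
A minimal sketch of the factory contract above, again using local stand-in traits rather than Spark's own classes. `KeepAboveFactory` and its `threshold` parameter are hypothetical and exist only for this illustration.

```scala
// Local stand-ins mirroring the documented contracts (not Spark's classes).
trait PartitionEvaluator[T, U] {
  def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[U]
}

// The factory is Serializable, as described above.
trait PartitionEvaluatorFactory[T, U] extends Serializable {
  def createEvaluator(): PartitionEvaluator[T, U]
}

// Hypothetical factory: its evaluators keep only the records above a threshold.
class KeepAboveFactory(threshold: Int) extends PartitionEvaluatorFactory[Int, Int] {
  override def createEvaluator(): PartitionEvaluator[Int, Int] =
    new PartitionEvaluator[Int, Int] {
      override def eval(partitionIndex: Int, inputs: Iterator[Int]*): Iterator[Int] =
        inputs.iterator.flatMap(input => input.filter(_ > threshold))
    }
}

object KeepAboveFactoryDemo {
  val factory = new KeepAboveFactory(threshold = 2)

  // A fresh evaluator is created and applied to one simulated partition.
  val kept: List[Int] =
    factory.createEvaluator().eval(0, Iterator(1, 2, 3, 4)).toList

  def main(args: Array[String]): Unit = println(kept)
}
```

Keeping the configuration (here `threshold`) in the factory and creating the evaluator on demand is one plausible reason for the split: only the factory has to be `Serializable`.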

docs/rdd/MapPartitionsWithEvaluatorRDD.md

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
# MapPartitionsWithEvaluatorRDD

`MapPartitionsWithEvaluatorRDD` is an [RDD](RDD.md).

## Creating Instance

`MapPartitionsWithEvaluatorRDD` takes the following to be created:

* <span id="prev"></span> Previous [RDD](RDD.md)
* <span id="evaluatorFactory"></span> [PartitionEvaluatorFactory](../PartitionEvaluatorFactory.md)

`MapPartitionsWithEvaluatorRDD` is created when:

* [RDD.mapPartitionsWithEvaluator](RDD.md#mapPartitionsWithEvaluator) operator is used
* [RDDBarrier.mapPartitionsWithEvaluator](../barrier-execution-mode/RDDBarrier.md#mapPartitionsWithEvaluator) operator is used

## Computing Partition { #compute }

??? note "RDD"

    ```scala
    compute(
      split: Partition,
      context: TaskContext): Iterator[U]
    ```

    `compute` is part of the [RDD](RDD.md#compute) abstraction.

`compute` requests the [PartitionEvaluatorFactory](#evaluatorFactory) to [create a PartitionEvaluator](../PartitionEvaluatorFactory.md#createEvaluator).

`compute` requests the [first parent RDD](RDD.md#firstParent) for an [iterator](RDD.md#iterator) over the records of the given partition.

In the end, `compute` requests the [PartitionEvaluator](../PartitionEvaluator.md) to [evaluate the partition](../PartitionEvaluator.md#eval) using that iterator.
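
The three steps of `compute` can be sketched with a toy model in which parent partitions are plain in-memory sequences. Everything below is an assumption made for illustration: the traits are local stand-ins, and `MapPartitionsSketch` and `SumFactory` are hypothetical names, not Spark classes.

```scala
// Local stand-ins mirroring the documented contracts (not Spark's classes).
trait PartitionEvaluator[T, U] {
  def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[U]
}

trait PartitionEvaluatorFactory[T, U] extends Serializable {
  def createEvaluator(): PartitionEvaluator[T, U]
}

// Toy model of the RDD: the parent's partitions are in-memory sequences.
class MapPartitionsSketch[T, U](
    parentPartitions: Vector[Seq[T]],
    evaluatorFactory: PartitionEvaluatorFactory[T, U]) {

  def compute(splitIndex: Int): Iterator[U] = {
    // Step 1: request the factory to create an evaluator.
    val evaluator = evaluatorFactory.createEvaluator()
    // Step 2: request the parent for an iterator over the partition.
    val input = parentPartitions(splitIndex).iterator
    // Step 3: request the evaluator to evaluate the partition.
    evaluator.eval(splitIndex, input)
  }
}

// Hypothetical factory: its evaluators sum all records of a partition.
class SumFactory extends PartitionEvaluatorFactory[Int, Int] {
  override def createEvaluator(): PartitionEvaluator[Int, Int] =
    new PartitionEvaluator[Int, Int] {
      override def eval(partitionIndex: Int, inputs: Iterator[Int]*): Iterator[Int] =
        Iterator.single(inputs.iterator.flatMap(input => input).sum)
    }
}

object MapPartitionsSketchDemo {
  val rdd = new MapPartitionsSketch(Vector(Seq(1, 2, 3), Seq(10, 20)), new SumFactory)

  def main(args: Array[String]): Unit =
    println(rdd.compute(0).toList ++ rdd.compute(1).toList)
}
```

In this model a new evaluator is created for every `compute` call, which matches the order of the steps described above.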

docs/rdd/RDD.md

Lines changed: 19 additions & 0 deletions
@@ -445,6 +445,25 @@ withScope[U](
!!! note
    `withScope` is used for most (if not all) `RDD` API operators.

## mapPartitionsWithEvaluator { #mapPartitionsWithEvaluator }

```scala
mapPartitionsWithEvaluator[U: ClassTag](
  evaluatorFactory: PartitionEvaluatorFactory[T, U]): RDD[U]
```

`mapPartitionsWithEvaluator` creates a [MapPartitionsWithEvaluatorRDD](MapPartitionsWithEvaluatorRDD.md) for this `RDD` and the given [PartitionEvaluatorFactory](../PartitionEvaluatorFactory.md).

## zipPartitionsWithEvaluator { #zipPartitionsWithEvaluator }

```scala
zipPartitionsWithEvaluator[U: ClassTag](
  rdd2: RDD[T],
  evaluatorFactory: PartitionEvaluatorFactory[T, U]): RDD[U]
```

`zipPartitionsWithEvaluator` creates a [ZippedPartitionsWithEvaluatorRDD](ZippedPartitionsWithEvaluatorRDD.md) for this `RDD`, the given `rdd2` and the [PartitionEvaluatorFactory](../PartitionEvaluatorFactory.md).

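To compare the shape of the two operators, here is a toy in-memory model. `MiniRDD` is a hypothetical stand-in, not Spark's `RDD`, and the traits are local copies of the contracts. The zipped variant assumes that each evaluator call receives the partitions with the same index from both inputs, which this commit does not document.

```scala
// Local stand-ins mirroring the documented contracts (not Spark's classes).
trait PartitionEvaluator[T, U] {
  def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[U]
}

trait PartitionEvaluatorFactory[T, U] extends Serializable {
  def createEvaluator(): PartitionEvaluator[T, U]
}

// Hypothetical in-memory model of an RDD: a vector of partitions.
final case class MiniRDD[T](partitions: Vector[Seq[T]]) {

  // One input iterator per evaluator call.
  def mapPartitionsWithEvaluator[U](
      evaluatorFactory: PartitionEvaluatorFactory[T, U]): MiniRDD[U] =
    MiniRDD[U](partitions.zipWithIndex.map { case (records, index) =>
      evaluatorFactory.createEvaluator().eval(index, records.iterator).toVector
    })

  // ASSUMPTION: two input iterators per call, taken from same-index partitions.
  def zipPartitionsWithEvaluator[U](
      rdd2: MiniRDD[T],
      evaluatorFactory: PartitionEvaluatorFactory[T, U]): MiniRDD[U] = {
    require(partitions.length == rdd2.partitions.length, "partition counts differ")
    MiniRDD[U](partitions.indices.toVector.map { index =>
      evaluatorFactory.createEvaluator()
        .eval(index, partitions(index).iterator, rdd2.partitions(index).iterator)
        .toVector
    })
  }
}

// Hypothetical factory: its evaluators count the records across all inputs.
class CountRecordsFactory[T] extends PartitionEvaluatorFactory[T, Int] {
  override def createEvaluator(): PartitionEvaluator[T, Int] =
    new PartitionEvaluator[T, Int] {
      override def eval(partitionIndex: Int, inputs: Iterator[T]*): Iterator[Int] =
        Iterator.single(inputs.map(_.size).sum)
    }
}

object MiniRDDDemo {
  val left = MiniRDD(Vector(Seq("a", "b"), Seq("c")))
  val right = MiniRDD(Vector(Seq("x"), Seq("y", "z")))

  val mapped = left.mapPartitionsWithEvaluator(new CountRecordsFactory[String])
  val zipped = left.zipPartitionsWithEvaluator(right, new CountRecordsFactory[String])

  def main(args: Array[String]): Unit = println((mapped, zipped))
}
```

The same factory works for both operators because `eval` accepts any number of input iterators.
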
<!---
## Review Me

docs/rdd/ZippedPartitionsWithEvaluatorRDD.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
# ZippedPartitionsWithEvaluatorRDD

`ZippedPartitionsWithEvaluatorRDD` is...FIXME

mkdocs.yml

Lines changed: 4 additions & 0 deletions
@@ -416,6 +416,8 @@ nav:
- ExecutorDeadException: ExecutorDeadException.md
- HeartbeatReceiver: HeartbeatReceiver.md
- InterruptibleIterator: InterruptibleIterator.md
- PartitionEvaluatorFactory: PartitionEvaluatorFactory.md
- PartitionEvaluator: PartitionEvaluator.md
- Utils: Utils.md
- Spark Tips and Tricks:
- Spark Tips and Tricks: spark-tips-and-tricks.md
@@ -548,7 +550,9 @@ nav:
- RDD Checkpointing: rdd/checkpointing.md
- RDDCheckpointData: rdd/RDDCheckpointData.md
- LocalRDDCheckpointData: rdd/LocalRDDCheckpointData.md
- MapPartitionsWithEvaluatorRDD: rdd/MapPartitionsWithEvaluatorRDD.md
- ReliableRDDCheckpointData: rdd/ReliableRDDCheckpointData.md
- ZippedPartitionsWithEvaluatorRDD: rdd/ZippedPartitionsWithEvaluatorRDD.md
- Aggregator: rdd/Aggregator.md
- Demos:
- demo/index.md
