Machine:

- Server: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz, 2 Sockets, 20 Cores per socket
- Laptop: TBD

System: CentOS release 6.3 (Final), Docker 1.12.1.

PaddlePaddle: (TODO: will rerun after 0.11.0)
- paddlepaddle/paddle: latest (for MKLML and MKL-DNN)
  - MKL-DNN tag v0.11
  - MKLML 2018.0.1.20171007
- paddlepaddle/paddle: latest-openblas (for OpenBLAS)
  - OpenBLAS v0.2.20
On each machine, we will test and compare the performance of single-node training using MKL-DNN, MKLML, and OpenBLAS respectively.
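
All throughput figures below are reported in images/second. As a minimal, framework-agnostic sketch of how such a number can be measured, the helper below times a fixed number of training steps (`train_batch` is a hypothetical stand-in for one training step, not a PaddlePaddle API):

```python
import time

def images_per_second(train_batch, batch_size, num_batches=100, warmup=10):
    """Time num_batches training steps and return throughput in images/second.

    train_batch is a hypothetical callable that runs one training step on one
    batch; warm-up iterations are excluded so one-time setup cost is not timed.
    """
    for _ in range(warmup):
        train_batch()
    start = time.perf_counter()
    for _ in range(num_batches):
        train_batch()
    elapsed = time.perf_counter() - start
    return num_batches * batch_size / elapsed
```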

Input image size - 3 * 224 * 224, Time: images/second

| MKLML   | 12.12 | 13.70 | 16.18 |
| MKL-DNN | 28.46 | 29.83 | 30.44 |

<img src="figs/vgg-cpu-train.png" width="500">

- ResNet-50

| MKLML   | 32.52 | 31.89 | 33.12 |
| MKL-DNN | 81.69 | 82.35 | 84.08 |

<img src="figs/resnet-cpu-train.png" width="500">

- GoogLeNet

| MKLML   | 128.46 | 137.89 | 158.63 |
| MKL-DNN | 250.46 | 264.83 | 269.50 |

<img src="figs/googlenet-cpu-train.png" width="500">

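As a quick sanity check on the tables above, the per-column speedup of MKL-DNN over MKLML can be computed directly from the reported images/second (the first table corresponds to the VGG chart; model names here are labels for illustration):

```python
# images/second from the tables above (three batch-size columns per model)
mklml = {
    "VGG": [12.12, 13.70, 16.18],
    "ResNet-50": [32.52, 31.89, 33.12],
    "GoogLeNet": [128.46, 137.89, 158.63],
}
mkldnn = {
    "VGG": [28.46, 29.83, 30.44],
    "ResNet-50": [81.69, 82.35, 84.08],
    "GoogLeNet": [250.46, 264.83, 269.50],
}

# speedup of MKL-DNN over MKLML, per model and batch-size column
speedup = {
    model: [d / m for d, m in zip(mkldnn[model], mklml[model])]
    for model in mklml
}
for model, ratios in speedup.items():
    print(model, ["%.2fx" % r for r in ratios])
```

Across all three models and batch sizes, MKL-DNN delivers roughly 1.7x-2.6x the MKLML throughput on this server.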
### Laptop
TBD