Commit Graph

153 Commits (27e4ce728741e6acfa0308b8dfa2ce129bd24e22)

Author SHA1 Message Date
Krzysztof Binias 0aa01929c1 Add backward
7 years ago
Tao Luo 85b6bb5886 Merge pull request #10747 from jczaja/prv-mkldnn-pooling-reuse
7 years ago
dzhwinter 0e4467eee4 "fix compile" (#10657)
7 years ago
Xin Pan 40a2ee9ae8 Merge pull request #10621 from panyx0718/fix_profile
7 years ago
Jacek Czaja 5f1333058c - Draft of reuse of pooling mkldnn operator
7 years ago
yuyang18 dfbe06ccab Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/fix_ninja_build
7 years ago
Xin Pan 94c0a64d62 Fix a profiler race condition
7 years ago
yuyang18 dc6ce071d4 Polish cmake
7 years ago
yuyang18 7c777dd549 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/exec_strategy
7 years ago
yuyang18 08295f9877 Add build strategy
7 years ago
typhoonzero 7b0c0273f4 update by comments
7 years ago
typhoonzero f5840d8925 follow comments
7 years ago
typhoonzero 04bde96e4c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
fengjiayi 2bff03bc1e fix a compile error (#10488)
7 years ago
chengduoZH 345737d0fe add sync
7 years ago
typhoonzero a135fec1fc Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
typhoonzero 17009d0627 workable version
7 years ago
Xin Pan dce0732d5e Merge pull request #10380 from panyx0718/dist_timeline
7 years ago
typhoonzero a529d790b6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
typhoonzero 3667578ec2 testing
7 years ago
chengduoZH d36af62c1e wrap_shfl_x_sync
7 years ago
typhoonzero d9320dcd94 complete code
7 years ago
Xin Pan 5a9f17f02b clean up
7 years ago
Xin Pan 76d8b14bce Add timeline support for distributed training
7 years ago
chengduo 54797abd53 Merge pull request #10347 from chengduoZH/replace___shfl_with__shfl_sync
7 years ago
chengduoZH e97c1a8ca0 fix __shfl
7 years ago
chengduoZH 0cc635497c merge develop
7 years ago
Yiqun Liu 6084af47ef Fix the bug when a input variable of op is dispensable. (#10268)
7 years ago
chengduo 4fbde42cdf Fix __shfl_down_sync_ of cross_entropy (#10345)
7 years ago
chengduoZH b8f7fa97b6 replace __shfl with __shfl_sync
7 years ago
chengduoZH 90d73c79c3 fix shfl_sync for CUDA8.0
7 years ago
dzhwinter eb6f9dd5de Feature/cuda9 cudnn7 (#10140)
7 years ago
Yu Yang c02ba51de0 Merge pull request #10191 from reyoung/feature/strict_dynload
7 years ago
Yu Yang 3d53631bad Make dyload strictly use the same ABI in header
7 years ago
gongweibao 6171705a2c Potential bug in paddle/fluid/platform/CMakeLists.txt (#9723)
7 years ago
Tao Luo 44fa823841 Merge pull request #9949 from mozga-intel/mozga-intel/Mul_mkldnn
7 years ago
fengjiayi 9f11da5931 Add synchronous TensorCopy and use it in double buffer
7 years ago
mozga-intel 171471eada Merge branch 'develop' into mozga-intel/Mul_mkldnn
7 years ago
Yu Yang c3c7b7bd1b Merge pull request #9928 from reyoung/feature/stablize_code
7 years ago
mozga-intel 6e7b883bdd Initial implementation of multiplication operator for MKLDNN
7 years ago
Tao Luo 038dbb386e Merge pull request #9958 from luotao1/find_tensorrt
7 years ago
Kexin Zhao 64bf3df0f9 add print support to float16 (#9960)
7 years ago
Luo Tao d4682247e1 auto find tensorrt library
7 years ago
Yan Chunwei 186659798f add tensorrt build support(#9891)
7 years ago
Yu Yang 093d227a77 Use mutex to stablize ncclCtxMap
7 years ago
Yi Wang 630943c7a7 Update documentation (#9918)
7 years ago
Yi Wang b48cf1712b Fix cpplint errors in transform_test.cu (#9915)
7 years ago
Yi Wang 47609ab2b8 Document transform.h and fix cpplint errors (#9913)
7 years ago
Yu Yang 6b20b35589 Fix Transformer Hang Problem
7 years ago
Yu Yang c64190ecbb Polish NCCLHelper
7 years ago