Commit Graph

118 Commits (4ecc9b7bae159d0ebd03da7faa88f29341bed781)

Author SHA1 Message Date
Tao Luo 44fa823841 Merge pull request #9949 from mozga-intel/mozga-intel/Mul_mkldnn
7 years ago
fengjiayi 9f11da5931 Add synchronous TensorCopy and use it in double buffer
7 years ago
mozga-intel 171471eada Merge branch 'develop' into mozga-intel/Mul_mkldnn
7 years ago
Yu Yang c3c7b7bd1b Merge pull request #9928 from reyoung/feature/stablize_code
7 years ago
mozga-intel 6e7b883bdd Initial implementation of multiplication operator for MKLDNN
7 years ago
Tao Luo 038dbb386e Merge pull request #9958 from luotao1/find_tensorrt
7 years ago
Kexin Zhao 64bf3df0f9 add print support to float16 (#9960)
7 years ago
Luo Tao d4682247e1 auto find tensorrt library
7 years ago
Yan Chunwei 186659798f add tensorrt build support(#9891)
7 years ago
Yu Yang 093d227a77 Use mutex to stablize ncclCtxMap
7 years ago
Yi Wang 630943c7a7 Update documentation (#9918)
7 years ago
Yi Wang b48cf1712b Fix cpplint errors in transform_test.cu (#9915)
7 years ago
Yi Wang 47609ab2b8 Document transform.h and fix cpplint errors (#9913)
7 years ago
Yu Yang 6b20b35589 Fix Transformer Hang Problem
7 years ago
Yu Yang c64190ecbb Polish NCCLHelper
7 years ago
Yu Yang 7483555a81 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/change_int64
7 years ago
qingqing01 129859e732 Support data type int64 in NCCL. (#9818)
7 years ago
Kexin Zhao 7ed457e77a Fix cuda 7.5 error with cublas GEMM (#9811)
7 years ago
Yu Yang 40e3fe173c Make cuda_helper.h Pass cpplint
7 years ago
chengduo b1224da8d9 Move reduceSum to elementwise_op_function.h (#9773)
7 years ago
Kexin Zhao 0f38bb4593 add fp16 support to activation op (#9769)
7 years ago
Yi Wang 8dbd9c394e Fix part of the cpplint errors in fluid/platform (#9802)
7 years ago
qingqing01 add367c3f4 Code cleanup in the profiler code. (#9782)
7 years ago
Yi Wang 47a4ec0672 Remove call_once.h (#9764)
7 years ago
Yi Wang b1a5a3cab8 Fix cpplint errors with float16* (#9751)
7 years ago
Yi Wang 25ad6884bb Merge branch 'develop' of http://github.com/paddlepaddle/paddle into cpplint-memory-detail
7 years ago
Yi Wang 67ba884d2a Update CMakeLists
7 years ago
Yi Wang 478055bd9f Update CMakeLists.txt
7 years ago
Yi Wang 535646cf25 Update (#9717)
7 years ago
Yi Wang e185502ebe Fix cpplint errors with paddle/fluid/platform/dynload (#9715)
7 years ago
Yi Wang 0c43a376e2 Fix cpplint errors with paddle/fluid/platform/gpu_info.* (#9710)
7 years ago
Yi Wang 55ffceaadb Fix cpplint errors paddle/fluid/platform/place.* (#9711)
7 years ago
Yi Wang 809962625f Fix cpplint errors of enforce.* (#9706)
7 years ago
Yi Wang ef4ee22668 Fix cpplint errors with paddle/fluid/platform/cpu_info* (#9708)
7 years ago
Kexin Zhao b2a1c9e8b7 Add float16 support to non-cudnn softmax op on GPU (#9686)
7 years ago
Yi Wang 797a7184ac Unify Fluid code to Google C++ style (#9685)
7 years ago
Kexin Zhao d00bd9eb72 Update the cuda API and enable tensor core for GEMM (#9622)
7 years ago
Lei Wang 09b4a1a361 Build: generate all the build related files into one directory. (#9512)
7 years ago
Kexin Zhao d904b3dd1d Merge pull request #9623 from kexinzhao/enable_cudnn_tensor_core
7 years ago
Kexin Zhao 9ba36604d8 fix cpplint error
7 years ago
Kexin Zhao 187ba08789 enable tensor core for conv cudnn
7 years ago
chengduoZH e099b18045 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/add_CUDAPinnedPlace
7 years ago
chengduoZH 2514d70ea7 follow comments
7 years ago
Luo Tao 5baa529e0e fix compiler error of profiler_test in ONLY_CPU mode
7 years ago
chengduoZH 58a9f9f781 set the max size of cudapinned memory
7 years ago
Yu Yang 7dcb217e31 Refine allreduce op
7 years ago
Yu Yang c0c2e15920 NCCL AllReduce
7 years ago
chengduoZH ab601c19c3 Add CUDAPinnedPlace
7 years ago
chengduoZH 158d6c4d19 add unit test
7 years ago
chengduoZH 18eb77303d add CUDAPinnedPlace
7 years ago