603 Commits (1943119fc5f98f6b552ebb6d180346b9c27adb8e)

Author SHA1 Message Date
Dun a83e470405 Profiler refine and add CUDA runtime api tracer (#15301)
6 years ago
mozga-intel 13ec2d331b Enable momentum operator for a ngraph engine (#15673)
6 years ago
Tao Luo c797a1f050 remove legacy any.cmake
6 years ago
Tao Luo bd2fa73620 Merge pull request #15794 from sneaxiy/fix-warnings
6 years ago
tensor-tang e1c707fe9c fix warnings (#15790)
6 years ago
sneaxiy 9b8e0e2f17 fix enforce_test
6 years ago
sneaxiy 209b355762 fix many warning
6 years ago
Zeng Jinle fc87ef741b Merge pull request #15687 from sneaxiy/fix_enforce
6 years ago
sneaxiy f0590947c3 fix enforce
6 years ago
tensor-tang 31fd8ce1e1 Merge pull request #15375 from mozga-intel/mozga-intel/batch_norm_ngraph_operator
6 years ago
dzhwinter 04e9776aef add details. test=develop
6 years ago
mozga-intel 1198ccae6b Enable batch_norm operator for a ngraph engine
6 years ago
peizhilin 883d22093a fix the lib_any dependency
6 years ago
wopeizl 3614dadf23 Merge pull request #15631 from wopeizl/windows/fixci
6 years ago
peizhilin 061299be87 fix dependency
6 years ago
baojun ac4cde009d Enable accuracy op for ngraph engine (#15592)
6 years ago
dzhwinter ce0394bcd0 merge develop branch. test=develop
6 years ago
guoshengCS b6c3b69af8 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-beam-search-size
6 years ago
liuwei1031 6e84eb131f expose peak gpu memory API to python test=develop (#15529)
6 years ago
guoshengCS 5dfce93101 To make CUDA_LAUNCH_KERNEL_HELPER support large size.
6 years ago
tensor-tang 8117725852 add jit kernel hsum, hmax and softmax refer code
6 years ago
sneaxiy ba4f43fd62 fix compile error in distributed mode
6 years ago
Yiqun Liu 3008fa1261 Add the CUDA kernel for beam_search op (#15020)
6 years ago
Zeng Jinle 2480a3df7d Merge pull request #15496 from sneaxiy/lazy_allocator2
6 years ago
sneaxiy 9c360cc798 test=develop
6 years ago
Xin Pan 58cb18d9d9 Merge pull request #15322 from velconia/imperative_resnet
6 years ago
sneaxiy 51227bd447 lazy_allocator
6 years ago
tangwei12 8b50ad80ff checkpoint at distributed training (#14854)
6 years ago
minqiyang 8ce198b2e1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
6 years ago
minqiyang 315b133e67 Add single GPU support to imperative
6 years ago
tensor-tang 3759c1db8c Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
6 years ago
peizhilin eea75a1d93 fix issue when type is invalid
6 years ago
peizhilin 9adb158e5b Merge remote-tracking branch 'upstream/develop' into debug/support
6 years ago
chengduo 46d01d798e Revert "Revert "Remove workspace_handle in conv_cudnn (#15186)"" (#15290)
6 years ago
Wojciech Uss cb2ba58458 Fix performance drop when with MKL-DNN
6 years ago
chengduozh c4eced9881 fix thread safe bug
6 years ago
chengduozh 358e657f68 Revert "Remove workspace_handle in conv_cudnn (#15186)"
6 years ago
wopeizl 5d9edb4124 Merge pull request #15156 from wopeizl/windows/fixgpuissue
6 years ago
chengduo 064512aa47 Remove workspace_handle in conv_cudnn (#15186)
6 years ago
xiaolil1 8f17c714de Conv int8 residual (#15145)
6 years ago
peizhilin 439691f5bd adjust the shlwapi on windows
6 years ago
peizhilin 92da467c99 Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue
6 years ago
peizhilin c1235c935f add the enable_debug flag
6 years ago
Zeng Jinle e29f10d315 Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var
6 years ago
mozga-intel a42f8f4f6f Enable element_wise_add operator for a ngraph
6 years ago
Zeng Jinle c562be20d9 Merge pull request #15193 from sneaxiy/fix_cudnn_compatible_check
6 years ago
peizhilin 1cd95d8a0b use thread local instance test=develop
6 years ago
sneaxiy ed409ac9f4 Revert "Revert "Remove op handle lock""
6 years ago
peizhilin d54133ea85 not include the numeric under linux test=develop
6 years ago
peizhilin a6f5ceee74 add the python callstack for debug support test=develop
6 years ago