947 Commits (6e5670b8bdb9a4c62a98b69ea6fe33b6ed38065b)

Author SHA1 Message Date
hutuxian 969e6378b9
Pipeline Concurrency (#17402)
6 years ago
Zeng Jinle 3ece61f71e
Remove attribute in Allocator::Allocate (#17878)
6 years ago
Zeng Jinle 3925bd81e8
Fix cuda/cudnn version detection error (#17853)
6 years ago
chengduo d1169afaa3
remove InstallFailureSignalHandler (#17828)
6 years ago
Leo Zhao 50326563d5 enable mkldnn primitive reuse for platform reorder (#17826)
6 years ago
wangchaochaohu c10157a5df
revise the cudnn conv choose algorithm to improve the performance(mask rcnn benchmark) (#17753)
6 years ago
chengduo 863c75168c
polish error doc (#17772)
6 years ago
gongweibao 0d561ef442
fix 2dconn test=develop (#17681)
6 years ago
gongweibao 65bbf950ee
Add multi-ncclcomm and 2D ncclallreduce support. (#17263)
6 years ago
wopeizl 6724a652f3
add __str__ method for tensor and lodtensor to support print test=dev… (#17588)
6 years ago
mozga-intel f2694e122d [NGraph] Enable assign operator for a ngraph, test=develop (#17437)
6 years ago
Zeng Jinle c6189637cd
Fix allocator bug (#16712)
6 years ago
mozga-intel 109b5aed5a [NGraph] Enable reshape operator test=develop (#17512)
6 years ago
guomingz 2281ebf0f3 Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130)
6 years ago
qingqing01 97f0ec2357 Fix compiling error with cuDNN 5.1 (#17458)
6 years ago
Zeng Jinle eab34b2df6
fix_dygraph_mem_leak, test=develop (#17396)
6 years ago
qingqing01 e32c9888f5
Double backward of conv2d. (#17211)
6 years ago
zhaoyuchen2018 792443ef23
Refine elementwise kernel. (#16952)
6 years ago
chengduo db5e74ab95
update assert (#17282)
6 years ago
baojun 7bd1d03ee5 Adding lrn op for ngraph engine (#17189)
6 years ago
Tao Luo ff1661f12a
remove unused FLAGS_warpctc_dir (#17162)
6 years ago
Huihuang Zheng e4a5332416
Fix a typo in gpu_info.cc (#17175)
6 years ago
Huihuang Zheng b9494058b3
Use CudnnWorkspaceHandle in exhaustive search (#17082)
6 years ago
Zeng Jinle 0c335dcd2c
Make conv cudnn workspace size configurable (#17036)
6 years ago
Zeng Jinle 1202d3fc74
Refine model gpu memory (#16993)
6 years ago
gongweibao cbdb8a17b1
Polish DGC code (#16818)
6 years ago
xuezhong 742d758747 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_infershape_bug2
6 years ago
xuezhong 5663fbfb0a fix infershape bug
6 years ago
Jacek Czaja 87a44b1149 [MKL-DNN] Added reusing of primitive descriptors (fp32) (#16667)
6 years ago
dongdaxiang a659b37ace make lodtensor_printer usable in gpu setting
6 years ago
Chen Weihang 0b2aec14b6 Revert "Model data cryption link all lib (#16555)"
6 years ago
Chen Weihang c38c7c5619
Model data cryption link all lib (#16555)
6 years ago
guru4elephant 76b49f02ee
Merge pull request #16539 from guru4elephant/train_with_pipe_reader_merge_develop
6 years ago
gongweibao fea91164b7 Fix windows compilation error! (#16546)
6 years ago
dongdaxiang 3a79be6eb3 refine API spec
6 years ago
dongdaxiang 98dda08a85 fix pull sparse slow problem
6 years ago
dongdaxiang 93c3c7f9b3 fix dataset testcase problem
6 years ago
dongdaxiang d739bab844 fix async_executor problem and remove some unnecessary testcase, fix trainer_desc import problem
6 years ago
dongdaxiang e3107a6ae0 fix windows compile problem
6 years ago
dongdaxiang 398004ece0 disable sys/wait.h to fix windows compile problem, include scope in lodtensor_printer
6 years ago
dongdaxiang 39362a8415 move root_scope->DropKids() into Finalize() so that we do not have to drop all the kids
6 years ago
dongdaxiang a0b59773af fix code style
6 years ago
dongdaxiang 365be5d559 support win32 flag in io.cc shell.cc, fix code style problem in fleet_wrapper, fix lodtensor_printer_test problem
6 years ago
dongdaxiang dc8cf36e4b add more example on datagenerator
6 years ago
dongdaxiang 6bf796df14 refine print fetch list
6 years ago
dongdaxiang cf1360643f add printer for fetch variable
6 years ago
Jacek Czaja 2632327429 [MKL-DNN] Tensor modifications revert (#16462)
6 years ago
Zeng Jinle 69cb9792ea
Merge pull request #16506 from sneaxiy/revert-16424-fix_allocator_bug
6 years ago
sneaxiy 5656fa9f7c fix travis ci
6 years ago
Zeng Jinle 174d0d0b90 Revert "Fix allocator bug"
6 years ago
gongweibao eb83abeac3
Add DGC(Deep Gradient Compression) interface. (#15841)
6 years ago
Zeng Jinle 644e8af4cf
Merge pull request #16424 from sneaxiy/fix_allocator_bug
6 years ago
nhzlx 953bdde058 Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
sneaxiy 2d92b6be98 merge develop
6 years ago
Zeng Jinle c64d959343
Merge pull request #16295 from zhhsplendid/zhenghuihuang-dev-2
6 years ago
nhzlx a1d11bb175 fix ci bug: cudnn handler in multi card
6 years ago
nhzlx 3df7b98a0f Merge branch 'develop' of https://github.com/paddlepaddle/paddle into HEAD
6 years ago
sneaxiy 953214ad97 add more unittest
6 years ago
Wu Yi b7baeed7bb fix win gpu build test=develop (#16334)
6 years ago
zhhsplendid 124f1df481 Add flags for init and re-alloc gpu
6 years ago
nhzlx 07dcf2856c git cherry-pick from feature/anakin-engine: update anakin subgraph #16278
6 years ago
Wu Yi 6382b62f6b
Collective ops (#15572)
6 years ago
zhhsplendid 22715487dc add allocator flags
6 years ago
sneaxiy fd23262e0c merge develop, fix conflict
6 years ago
qingqing01 86e912c544 Fix windows compiling (#16230)
6 years ago
qingqing01 8ad672a287
Support sync batch norm. (#16121)
6 years ago
sneaxiy 682f2dbf29 merge develop
6 years ago
sneaxiy 2c4fcaa683 merge develop
6 years ago
chengduo 0979956619
Add memory profiler (#16137)
6 years ago
chengduo ad80bde824
Revert "Revert "Add Event for TensorCopy"" (#16035)
6 years ago
sneaxiy 2a639d5c2a add allocator chain to fix bug
6 years ago
chengduo e2da3a5b22
Revert "Add Event for TensorCopy" (#16022)
6 years ago
chengduo 7235fd662b
Add Event for TensorCopy (#15953)
6 years ago
Tao Luo 4efdebc6f6
Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt
6 years ago
dzhwinter 225c11a91f polish cudnn related code and fix bug. (#15164)
6 years ago
xiaolil1 6724be2b0d INT8 Pool kernel Key Creation Optimization. (#15883)
6 years ago
Yihua Xu 7396788694 Optimize gelu operation with mkl erf.
6 years ago
peizhilin c6472579c0 test=develop
6 years ago
peizhilin b5d6e38b05 fix build issue for cudaEvent_t
6 years ago
wopeizl 3ccd8964a4
Merge pull request #15905 from wopeizl/win/fix_eigen
6 years ago
chengduo 8e904d322f
Remove unnecessary dependence for profiler (#15899)
6 years ago
Xin Pan 44e7fcddc5
Merge pull request #15844 from panyx0718/infer
6 years ago
Jacek Czaja dec9cf53c8 [MKL-DNN] MKL-DNN specific Tensor modification (#15429)
6 years ago
peizhilin 6ccdb1b947 fix build issue on windows for sample prop op
6 years ago
Dun c6bd434ffe
add memset CUPTI && test=develop (#15868)
6 years ago
Sylwester Fraczek 74672d1aff Change *(smart_ptr.get()) -> *smart_ptr
6 years ago
tensor-tang ee2321debd
Revert 15770 develop a6910f900 gelu mkl opt (#15872)
6 years ago
chengduo 3b08c9abf4
enhance profiler (#15842)
6 years ago
Yihua Xu 676995c86c Optimze Gelu with MKL Erf function (#15770)
6 years ago
Tao Luo e3dd6970fc disable dam temporarily (#15860)
6 years ago
Dun Liang 35a90e06bf test=develop
6 years ago
Dun Liang c9080f516b test=develop
6 years ago
Dun Liang 1c7bb0e40c test=develop
6 years ago
Xin Pan 5eb87506bc add per kernel config and remove const_cast.
6 years ago
Dun a83e470405
Profiler refine and add CUDA runtime api tracer (#15301)
6 years ago
mozga-intel 13ec2d331b Enable momentum operator for a ngraph engine (#15673)
6 years ago
Tao Luo c797a1f050 remove legacy any.cmake
6 years ago
Tao Luo bd2fa73620
Merge pull request #15794 from sneaxiy/fix-warnings
6 years ago
tensor-tang e1c707fe9c
fix warnings (#15790)
6 years ago
sneaxiy 9b8e0e2f17 fix enforce_test
6 years ago
sneaxiy 209b355762 fix many warning
6 years ago
Zeng Jinle fc87ef741b
Merge pull request #15687 from sneaxiy/fix_enforce
6 years ago
sneaxiy f0590947c3 fix enforce
6 years ago
tensor-tang 31fd8ce1e1
Merge pull request #15375 from mozga-intel/mozga-intel/batch_norm_ngraph_operator
6 years ago
dzhwinter 04e9776aef add details. test=develop
6 years ago
mozga-intel 1198ccae6b Enable batch_norm operator for a ngraph engine
6 years ago
peizhilin 883d22093a fix the lib_any dependency
6 years ago
wopeizl 3614dadf23
Merge pull request #15631 from wopeizl/windows/fixci
6 years ago
peizhilin 061299be87 fix dependency
6 years ago
baojun ac4cde009d Enable accuracy op for ngraph engine (#15592)
6 years ago
dzhwinter ce0394bcd0 merge develop branch. test=develop
6 years ago
guoshengCS b6c3b69af8 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-beam-search-size
6 years ago
liuwei1031 6e84eb131f expose peak gpu memory API to python test=develop (#15529)
6 years ago
guoshengCS 5dfce93101 To make CUDA_LAUNCH_KERNEL_HELPER support large size.
6 years ago
tensor-tang 8117725852 add jit kernel hsum, hmax and softmax refer code
6 years ago
sneaxiy ba4f43fd62 fix compile error in distributed mode
6 years ago
Yiqun Liu 3008fa1261
Add the CUDA kernel for beam_search op (#15020)
6 years ago
Zeng Jinle 2480a3df7d
Merge pull request #15496 from sneaxiy/lazy_allocator2
6 years ago
sneaxiy 9c360cc798 test=develop
6 years ago
Xin Pan 58cb18d9d9
Merge pull request #15322 from velconia/imperative_resnet
6 years ago
sneaxiy 51227bd447 lazy_allocator
6 years ago
tangwei12 8b50ad80ff
checkpoint at distributed training (#14854)
6 years ago
minqiyang 8ce198b2e1 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
6 years ago
minqiyang 315b133e67 Add single GPU support to imperative
6 years ago
tensor-tang 3759c1db8c
Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
6 years ago
peizhilin eea75a1d93 fix issue when type is invalid
6 years ago
peizhilin 9adb158e5b Merge remote-tracking branch 'upstream/develop' into debug/support
6 years ago
chengduo 46d01d798e
Revert "Revert "Remove workspace_handle in conv_cudnn (#15186)"" (#15290)
6 years ago
Wojciech Uss cb2ba58458 Fix performance drop when with MKL-DNN
6 years ago
chengduozh c4eced9881 fix thread safe bug
6 years ago
chengduozh 358e657f68 Revert "Remove workspace_handle in conv_cudnn (#15186)"
6 years ago
wopeizl 5d9edb4124
Merge pull request #15156 from wopeizl/windows/fixgpuissue
6 years ago
chengduo 064512aa47
Remove workspace_handle in conv_cudnn (#15186)
6 years ago
xiaolil1 8f17c714de Conv int8 residual (#15145)
6 years ago
peizhilin 439691f5bd adjust the shlwapi on windows
6 years ago
peizhilin 92da467c99 Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue
6 years ago
peizhilin c1235c935f add the enable_debug flag
6 years ago
Zeng Jinle e29f10d315
Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var
6 years ago
mozga-intel a42f8f4f6f Enable element_wise_add operator for a ngraph
6 years ago
Zeng Jinle c562be20d9
Merge pull request #15193 from sneaxiy/fix_cudnn_compatible_check
6 years ago
peizhilin 1cd95d8a0b use thread local instance test=develop
6 years ago
sneaxiy ed409ac9f4 Revert "Revert "Remove op handle lock""
6 years ago
peizhilin d54133ea85 not include the numeric under linux test=develop
6 years ago
peizhilin a6f5ceee74 add the python callstack for debug support test=develop
6 years ago
Zeng Jinle dacfaaa966 Revert "Remove op handle lock"
6 years ago
xiaolil1 c8f101e5da Conv int8 relu (#15130)
6 years ago
sneaxiy 9793a0b6a6 fix_cudnn_compatible_check
6 years ago
Zeng Jinle ccb322d6a5 merge develop
6 years ago
Zeng Jinle f3a13512fc
Merge pull request #15139 from sneaxiy/remove_op_handle_lock
6 years ago
xiaolil1 bbc9336878 Enable basic MKL-DNN INT8 Conv OP (#15124)
6 years ago
peizhilin c919b2f31d Merge remote-tracking branch 'upstream/develop' into windows/fixgpuissue
6 years ago
peizhilin fd4f4d0e5f fix build issue test=develop
6 years ago
Yan Xu a1e60ab19b
Merge pull request #14791 from Yancey1989/parallel_graph_mode
6 years ago
peizhilin 9ae50dd07d fix gpu buils issue on windows test=develop
6 years ago
sneaxiy d0a8a1e950 remove_op_handle_lock
6 years ago
Yancey1989 e65436103f Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
sneaxiy 6f06e6cdac Merge remote origin
6 years ago
Xin Pan 9186451f60 hide GetTensor
6 years ago
sneaxiy d25395fc98 remove tensor core lock
6 years ago
Yancey1989 82b42e31f0 polish unittest test=develop
6 years ago
Yancey1989 0a885ac12a Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
peizhilin 813c2ce539 fix timer test=develop
6 years ago
wopeizl 7ab501264d
Merge pull request #15069 from wopeizl/windows/dsosupport
6 years ago
guru4elephant ff739449ab
Merge pull request #15018 from guru4elephant/add_timer
6 years ago
Yancey1989 4743c9cd5d Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
wopeizl 719ebe3786
Merge pull request #15070 from wopeizl/windows/testcasefix
6 years ago
Qiyang Min 0238a3bb4f
Merge pull request #14972 from velconia/accelerate_lstm
6 years ago
Yancey1989 86bb583881 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
peizhilin 01c00b07dd fix test issues on windows
6 years ago
peizhilin 1e7f83e60a add cuda dso support for windows
6 years ago
Yancey1989 41a64f6a2a Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
Wu Yi 856f0da0fe
Fp16 training (#14992)
6 years ago
chengduo b9fb03cf54
Move GetTensor to tensor_util (#15011)
6 years ago
dongdaxiang ab2abfc5b2 Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
dongdaxiang 4cb833d2de Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
tensor-tang f0e02a65ed
Merge pull request #14974 from xiaolil1/quantize
6 years ago
dongdaxiang 68a2d1f3d7 Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
dongdaxiang 2e5ebc4594 Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
dongdaxiang 5dfd9c9aa9 Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
dongdaxiang d0a5159946 Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
dongdaxiang f9b8168508 Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
minqiyang 52b4821a6e Fix Sprintf problem
6 years ago
minqiyang 010f657b33 Polish code
6 years ago
minqiyang 45acfbd011 1. Add specific condition for one or no arg in PADDLE_ENFORCE
6 years ago
dongdaxiang 2dee8f6cd5 add TrainFilesWithTimer in async_executor
6 years ago
xiaoli.liu@intel.com d83d0f33fd extract templated function
6 years ago
wopeizl b117a5f208
Merge pull request #14931 from wopeizl/windows/mkl
6 years ago
dongdaxiang cf6188a823 add a linux timer
6 years ago
chengduo 79bd6dfa18
[Feature] Add Temporary Allocator (#14875)
6 years ago
minqiyang e4719eb462 Fix bug in Windows VC 2010
6 years ago
minqiyang 5a5c577529 Polish code
6 years ago
minqiyang 099186cd41 Support one argument PADDLE_ENFORCE
6 years ago
minqiyang 4af97c6946 Polish code
6 years ago
minqiyang 41b81293ab Polish code
6 years ago
peizhilin 9e60c58666 Merge remote-tracking branch 'upstream/develop' into windows/mkl
6 years ago
minqiyang bc66401566 Polish code
6 years ago
minqiyang 53619a79b4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_lstm
6 years ago
peizhilin b06ce129bc some not so useful adjust
6 years ago
minqiyang 679d1a9e0b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_lstm
6 years ago
Jacek Czaja 709d9e3cb7 - Added reusing MKL-DNN primitives for Transpose MKL-DNN op
6 years ago
peizhilin 40a94a138f remove irrelevant fix for mkl
6 years ago
mozga-intel 9035bb81fe Enable mul operator for a ngraph engine (#14801)
6 years ago
peizhilin 07c7eaabb4 Merge remote-tracking branch 'upstream/develop' into windows/mkl
6 years ago
peizhilin ed5bd5e586 test=develop
6 years ago
peizhilin 19ebd8b4cf add ctc support for windows
6 years ago
minqiyang a3fa3f85d7 Polish code
6 years ago
Yu Yang 2803cf5776
Merge pull request #14868 from reyoung/feature/refine_w2v
6 years ago
peizhilin b601f2de8d include the mkl fix only
6 years ago
peizhilin 5a6d7fe2ff add mkl,ctc support for windows
6 years ago
wopeizl 0f085f0a5a
Merge pull request #14892 from wopeizl/windows/port3
6 years ago
Zeng Jinle 36a1d021a4
Merge pull request #14927 from sneaxiy/fix_cuda_stream_callback_in_cuda10
6 years ago
wopeizl fa78fc60be
Merge pull request #14907 from wopeizl/windows/avx
6 years ago
sneaxiy 2373aeb5e8 fix bug
6 years ago
minqiyang aa41ee75a1 Accelerate PADDLE_ENFORCE
6 years ago
peizhilin 41456e1723 Remove the useless definition
6 years ago
Yu Yang 740e1626ce Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/refine_w2v
6 years ago
Yancey1989 a760a550b0 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
peizhilin d519fd6944 test=develop
6 years ago
Yu Yang bacf1d2399 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/tensor_type
6 years ago
Yan Chunwei a985949be9
Fea/fuse conv elementwise add fuse (#14669)
6 years ago
Yancey1989 4a4ccac1d0 update by comment test=develop
6 years ago
peizhilin 23dec78772 fix script issue
6 years ago
Yancey1989 c722b1dcb6 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
Yu Yang 4ecdb6f486 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/tensor_type
6 years ago
Yu Yang 7b10bf0e60 Use mkl
6 years ago
sneaxiy ca84c2ca8f merge develop
6 years ago
Yu Yang 81520a24cf Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/refine_eigen_tensor
6 years ago
Yu Yang 9bd70a1e04 Change tensor uses proto::VarType::type
6 years ago
Yu Yang 8175983ef9
Merge pull request #14814 from reyoung/feature/gprof
6 years ago
Yu Yang 5e60906996 Fix compile error
6 years ago
Yu Yang 7604b1ad51 Fix Eigen macro when using GPU
6 years ago
sneaxiy 7923042365 merge develop
6 years ago
Yu Yang b22d638d8f Speed up SizeOfType
6 years ago
Yancey1989 2dda19f756 Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
sneaxiy 66182abda6 add cuda cudnn version check
6 years ago
Zeng Jinle add98c9e7d
Merge pull request #14745 from sneaxiy/fix_eigen_deallocate
6 years ago
Yancey1989 cb8a24be14 clean code
6 years ago
Tao Luo 54fcafb5f6
Merge pull request #14707 from yihuaxu/develop_4f71a6ee2_conv3d_mkldnn_opt
6 years ago
Yancey1989 c9de6f1b05 init parallel graph mode
6 years ago
sneaxiy 0f96c2e80f fix thread-safety bug
6 years ago
Yihua Xu 65dbc7cca4
Merge branch 'develop' into develop_4f71a6ee2_conv3d_mkldnn_opt
6 years ago
tensor-tang 4a93db9288 remove jit namespace
6 years ago
sneaxiy 900765224c fix deallocate bug
6 years ago
liuhongyu 773dc73fbf Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_5_support
6 years ago
liuhongyu 8daf67f90f fix bugs; test=develop
6 years ago
Xin Pan 052cc5f538
Merge pull request #14725 from ZongwuYang/my-cool-stuff
6 years ago
Wu Yi 29d9fb53fc
[Feature] multi process multi gpu dist training, boost v100 performance by 20% (#14661)
6 years ago
liuhongyu 968dd3c078 add cudnn 5 support; test=develop
6 years ago
ZongwuYang 1560eb4a6d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into my-cool-stuff
6 years ago
ZongwuYang deb04809bd test=develop
6 years ago
sneaxiy 35a2578426 fix bug
6 years ago
sneaxiy 64ad051b9a merge develop
6 years ago
sneaxiy c47c451a00 fix bug
6 years ago
Yihua Xu 669191c9cc Implement conv3d with mkldnn library (test=develop)
6 years ago
Hongyu Liu 4f71a6ee2c
Merge pull request #14622 from PaddlePaddle/add_cudnn_lstm
6 years ago
Yibing Liu c7382df80f
Print assert failure id in lookup_table_op (#14698)
6 years ago
sneaxiy 096673f675 refactor eager deletion
6 years ago
phlrain cf1fe61004 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_cudnn_lstm
6 years ago
Tao Luo 20120d9c97
Merge pull request #14608 from jczaja/prv-conv2d-transpose-mkldnn
6 years ago
Tao Luo ea47685f91
Merge pull request #14646 from jczaja/prv-softmax-mkl-sasum
6 years ago
minqiyang a02ce58f2c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
6 years ago
Tao Luo 4ec9de0122
Merge pull request #14628 from Sand3r-/mgallus/mkldnn-elementwise_mul
6 years ago
Clementine 6c71c1f8f9 Add activation gelu (#14569)
6 years ago
Michal Gallus 9455be0ba5 EltwiseMul: Extract StringToFormat to MKLDNN helper
6 years ago
Jacek Czaja 8bfa1fa9bb - ASUM MKL integration
6 years ago
liuhongyu 05917c3c79 add cudnn lstm; test=develop
6 years ago
peizhilin 38715e6fd0 minor fix
6 years ago
Jacek Czaja fb24690a58 - conv2d transpose MKL-DNN
6 years ago
minqiyang be04d99fe4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into revert_vlog
6 years ago
minqiyang 53433d7f2e Revert the changes of VLOG
6 years ago
peizhilin 36cd18b549 Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
peizhilin b2f8d4183d Given the different fraction_of_gpu_memory_to_use depends on platform
6 years ago
Yu Yang 26af9cf90c
Merge pull request #14565 from chengduoZH/fix_cublas_warp_error
6 years ago
chengduozh f7847ca6a3 fix cublas warp error
6 years ago
luotao1 e21edb26f6 add Set/GetCPUNumThreads api
6 years ago
peizhilin 445fff24dc add the bigobj option to NVCC compile
6 years ago
chengduo 00b9e9a135
Refine cublas to support CUBLAS_TENSOR_OP_MATH (#13929)
6 years ago
peizhilin 7c8c9dc9bf fix unit test cases
6 years ago
wopeizl d9a1f3e58e Windows/online (#14474)
6 years ago
peizhilin 6e66fadb95 clean up the pre-definitions on windows
6 years ago
peizhilin 67562a6fcd Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
peizhilin 703b26e697 add profiler, parallel_executor back
6 years ago
chengduo a8d3aaae2a
print output log warning (#14497)
6 years ago
peizhilin 3a72a634cf Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
peizhilin ee0fd78c81 Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Yu Yang f1a392a5fe
Merge pull request #13804 from sneaxiy/rewrite_allocation
6 years ago
qingqing01 fd7e643153
Convolution fusion operator. (#14449)
6 years ago
Yu Yang 98bbfc17be Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
6 years ago
peizhilin c59d3e83bc test case fix
6 years ago
peizhilin 8580b7a130 Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Wu Yi b32c13dc20
Add cudnn ctc loss (#12366)
6 years ago
peizhilin d1a1fafc4c code style
6 years ago
peizhilin 162f2d4109 disable the openblas multi-thread on windows since no support
6 years ago
Yu Yang c8f6e70ab4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
6 years ago
peizhilin d1429ac4a5 add recordio support
6 years ago
Yu Yang 0d6718fcbd Pass compile
6 years ago
peizhilin be332a13bc Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Yu Yang d93b2d0365 Refine code
6 years ago
peizhilin 1a9008c420 code style fix
6 years ago
tensor-tang 1be85d011d add mkl vsqr and vpow
6 years ago