Commit Graph

383 Commits (1722678258fab032676bbd63aa3f95e6e925d1e4)

Author SHA1 Message Date
dzhwinter dbe90cc0f6 merge develop branch
7 years ago
fengjiayi f79ca23115 fix bugs
7 years ago
fengjiayi c501826f42 use framework::RWLock
7 years ago
fengjiayi 1f36a4c27c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_CudnnHolder
7 years ago
fengjiayi b0aca8824d make CudnnHolder thread safe
7 years ago
luotao1 7169f9378c fix mkldnn include format
7 years ago
fengjiayi 15cc9128be fix compile error
7 years ago
fengjiayi 407ff0bdbc use CudnnHolder in conv_cudnn_op
7 years ago
fengjiayi 04bfd5c10c add CudnnHolder to manage cudnn_handle and workspace
7 years ago
Yan Chunwei 902f19b46a fea/fuse attention lstm simplify.with fusion lstm.with sequnce expand (#13006)
7 years ago
dzhwinter b78394ea57 done
7 years ago
dzhwinter b74af56bbc cpu compile is done
7 years ago
dzhwinter 78aab05b71 fix more op errors
7 years ago
dzhwinter cd8f3e9ed0 operator module is done
7 years ago
dzhwinter d361624c1d platform module (#12932)
7 years ago
dzhwinter 2ec589a24e float.h fixed
7 years ago
dzhwinter 7dceb8a080 check some operators
7 years ago
dzhwinter d7f98f37a7 more platform is done
7 years ago
dzhwinter efd0884fa9 add op registry
7 years ago
dzhwinter eca4563e5d operators module (#12938)
7 years ago
dzhwinter 488a2dd2e8 with ir node
7 years ago
dzhwinter cfbf1ba305 add source
7 years ago
dzhwinter c1ad52f768 pre-commit
7 years ago
dzhwinter 89f95ea25e merge develop branch
7 years ago
dzhwinter 34f8c9b6f5 windows port
7 years ago
tensor-tang 0d46f518ae refine avx condition and warning
7 years ago
tensor-tang 4e538db14d refine jit space
7 years ago
tensor-tang ec59f0d454 add cpu vec
7 years ago
tensor-tang 3dd66390b2 add blas vexp
7 years ago
tensor-tang 0ec1f65cf1 fix blas dot and add cblas scal
7 years ago
tensor-tang a2203d0466 add cblas dot
7 years ago
Michał Gallus cd32ddac12 Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669)
7 years ago
dzhwinter e23ddf6ae4 status (#12764)
7 years ago
Tao Luo d04ef276a5 Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
dzhwinter 00463fdfe3 cudnn windows support (#12757)
7 years ago
dzhwinter 17602eab94 windows port of malloc
7 years ago
dzhwinter 2673798ddb "fix float16 ShuffleDownSync Bug" (#12756)
7 years ago
dzhwinter 5c88cd2af5 remove werror in windows
7 years ago
dzhwinter 64ce1210aa "windows support"
7 years ago
dzhwinter 36878d78cc comment out backtarce
7 years ago
dzhwinter 335398f18b dlfnh
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang ff92b6ba81 Merge pull request #12531 from tensor-tang/refine/op/gru
7 years ago
Chen Weihang 1e961b145c Merge pull request #12591 from chenwhql/enforce_msg_polish
7 years ago
Yan Chunwei 0a641ba326 add ratio to profiler (#12701)
7 years ago
tensor-tang c588c64a76 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
chenweihang da39d84a48 refine by reviewer's advice
7 years ago
tensor-tang 1ab1d03c62 fix missing macro condition
7 years ago
Qiao Longfei e8fcb71bed Merge pull request #12620 from jacquesqiao/timeline-support-pure-cpu
7 years ago
tensor-tang 3bf3e77ac8 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei 5a6c3cd9e0 fix profiler dead lock
7 years ago
tensor-tang a50889f523 introduce xbyak
7 years ago
qiaolongfei 3f2aa91970 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
qiaolongfei e008600b08 optimize code
7 years ago
qiaolongfei 7c649e06c3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
Sylwester Fraczek d74bb6ab9c fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests
7 years ago
chenweihang b1dd4149b9 adjust enforce test cases
7 years ago
chenweihang 61052cdbc6 polish high frequency enforce error message
7 years ago
qiaolongfei 954d680b40 fix test_parallel_do.py
7 years ago
tensor-tang 836068569f Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei 1623f1ba4f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler
7 years ago
qiaolongfei 4c5bcd7859 add guard to profiler
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago
Xin Pan caf10b474f make profiler use thread_id from g_thread_id
7 years ago
dzhwinter 6d3da458a7 Fix/float16 style (#12446)
7 years ago
dzhwinter 39ac9e39c2 float16 type support enhance (#12181)
7 years ago
tensor-tang 4f0383f52e fix unknown flag
7 years ago
tensor-tang 9788e5ab87 add flags to control num_threads
7 years ago
tensor-tang 10a1c2bb86 control omp num_threads
7 years ago
typhoonzero 54e9fd3f61 fix cudnn enforce
7 years ago
qiaolongfei a6d30a8607 profiler support cpu
7 years ago
Xin Pan 7781297c70 variants
7 years ago
Tao Luo e568acbee2 Merge pull request #12092 from velconia/add_deps_to_device_ctx
7 years ago
minqiyang 2cc6ca43a0 Add framework_proto to device context deps
7 years ago
Jacek Czaja fbe25ef510 MKLDNN: Extending Conv MKLDNN op to reuse MKLDNN primitives (#11750)
7 years ago
tensor-tang 2e418a5227 fix conflicts
7 years ago
tensor-tang 3df99e72ab Merge remote-tracking branch 'ups/develop' into refine/set_num_threads
7 years ago
dzhwinter 4ed0b62476 Move fluid::framework::InitDevices into fluid::platform (#11757)
7 years ago
dzhwinter 99a99ec7e3 "remove lapack" (#11966)
7 years ago
fengjiayi ce16b40b04 Merge pull request #11891 from JiayiFeng/dev_eof_exp
7 years ago
Yu Yang 037ce12ee4 Merge pull request #11907 from reyoung/feature/use_dev_ctx_for_op
7 years ago
yuyang18 2d0e5592b5 Use std::map for Place <--> DeviceContext
7 years ago
Xin Pan 94cb59ad09 hide utils to legacy
7 years ago
fengjiayi ed4b2475f5 add an unittest
7 years ago
fengjiayi 8553ac6a95 fix unittests
7 years ago
fengjiayi 3fab4f65a4 Add EOFException to represent EOF in C++ reader
7 years ago
Yan Chunwei 28172bbb8e add debug to replacing enforce with GLOG for debug (#11244)
7 years ago
gongweibao e2b1c5d925 fix code style (#11862)
7 years ago
mozga-intel b8a04c2fa1 Duplicated code was moved to common function
7 years ago
tensor-tang e3a96300bb move SetNumThreads to platform
7 years ago
Tao Luo 2dae8a4631 Merge pull request #11596 from tensor-tang/refine/mklml/dyload
7 years ago
Yi Wang 2625178add No NCCL on macOS (#11652)
7 years ago
Tao Luo 60647c9aa4 Merge pull request #11519 from jczaja/prv-softmax-mkldnn-grad-operator
7 years ago
chengduo da556ed6d4 enhance ParallelExecutor stable (#11637)
7 years ago
Jacek Czaja 98f3ad3ba1 - MKLDNN Softmax Grad Op
7 years ago
tensor-tang d5fb8fa778 Revert "Merge pull request #11628 from PaddlePaddle/revert-11102-mozga-intel/Sum_mkldnn_layout"
7 years ago
Yu Yang 9b3f48d7e6 Merge pull request #11616 from chengduoZH/fix_parallel_exe
7 years ago
tensor-tang 28a0ef9522 remove usr local lib when dynamic load lib
7 years ago
tensor-tang 90780e22ce Revert "MKLDNN layout: Support for sum operator"
7 years ago
chengduoZH c99fca5f90 Add No Mutex
7 years ago
tensor-tang 3e73a7a924 add usr local lib to dynamic search path
7 years ago
tensor-tang f503f12925 enable dynamic load mklml lib on fluid
7 years ago
mozga-intel 6512be59ec MKLDNN layout: the code-review changes
7 years ago
tensor-tang 9a25f2895c update the default cpu memory with MKLDNN
7 years ago
tensor-tang a8c2ff316f refine the initial cpu memory flag for mkldnn
7 years ago
Qiyang Min 046bb5c8cb Fix NCCLBcast hang up bug in Parallel Executor (#11377)
7 years ago
Xin Pan d2afd21021 Remove cuptiFinalize.
7 years ago
qiaolongfei 9ebbfa6bbc fix build on mac
7 years ago
tensor-tang 056dd40475 add initial memory flag in MB for infer
7 years ago
yuyang18 a1254a86ba Add lock to record_event.
7 years ago
mozga-intel 3ff9ba0e6b Mkldnn layout (#11040)
7 years ago
Xin Pan ca2d6d3c66 Merge pull request #11224 from dzhwinter/fix/cudnn
7 years ago
qingqing01 e0a32074bd Fix PADDLE_ASSERT. (#10981)
7 years ago
dzhwinter 44c662b4e1 Merge remote-tracking branch 'origin/develop' into fix/cudnn
7 years ago
Yu Yang c36dd3b338 Merge pull request #11114 from reyoung/feature/yep
7 years ago
dzhwinter 2b9ef7e249 "fix"
7 years ago
dzhwinter 75d8e8ca33 "fix compiled in manylinux"
7 years ago
dzhwinter 4777aec9be "done"
7 years ago
dzhwinter 7971d4a310 Feature/deterministic (#11205)
7 years ago
yuyang18 53dab95b75 Static DSO handle
7 years ago
yuyang18 c5115950a8 Use static for dlsym
7 years ago
yuyang18 7cf8b656a2 Remove lock in device context
7 years ago
Xin Pan 7eca286159 Merge pull request #11078 from panyx0718/improve_profiler
7 years ago
gongweibao 4fb7cc7f5e Move sync_mode device ctx from grpc server (#10881)
7 years ago
Xin Pan 75ea577fd3 allow profiler and timeline to work when dev_ctx is nullptr.
7 years ago
Xin Pan f14e579cc3 clean up
7 years ago
Xin Pan 3cb6395688 better profiler and benchmark
7 years ago
Xin Pan 0d598cf9f6 Merge pull request #10822 from panyx0718/dist_opt
7 years ago
Xin Pan 08e4970e45 follow comments
7 years ago
Xin Pan b4dd4c048d multi-thread handlerequest
7 years ago
Krzysztof Binias 0aa01929c1 Add backward
7 years ago
Tao Luo 85b6bb5886 Merge pull request #10747 from jczaja/prv-mkldnn-pooling-reuse
7 years ago
dzhwinter 0e4467eee4 "fix compile" (#10657)
7 years ago
Xin Pan 40a2ee9ae8 Merge pull request #10621 from panyx0718/fix_profile
7 years ago
Jacek Czaja 5f1333058c - Draft of reuse of pooling mkldnn operator
7 years ago
yuyang18 dfbe06ccab Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/fix_ninja_build
7 years ago
Xin Pan 94c0a64d62 Fix a profiler race condition
7 years ago
yuyang18 dc6ce071d4 Polish cmake
7 years ago
yuyang18 7c777dd549 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/exec_strategy
7 years ago
yuyang18 08295f9877 Add build strategy
7 years ago
typhoonzero 7b0c0273f4 update by comments
7 years ago
typhoonzero f5840d8925 follow comments
7 years ago
typhoonzero 04bde96e4c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
fengjiayi 2bff03bc1e fix a compile error (#10488)
7 years ago
chengduoZH 345737d0fe add sync
7 years ago
typhoonzero a135fec1fc Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
typhoonzero 17009d0627 workable version
7 years ago
Xin Pan dce0732d5e Merge pull request #10380 from panyx0718/dist_timeline
7 years ago
typhoonzero a529d790b6 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into gen_nccl_id_op
7 years ago
typhoonzero 3667578ec2 testing
7 years ago