603 Commits (1943119fc5f98f6b552ebb6d180346b9c27adb8e)

Author SHA1 Message Date
peizhilin d1429ac4a5 add recordio support
6 years ago
Yu Yang 0d6718fcbd Pass compile
6 years ago
peizhilin be332a13bc Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Yu Yang d93b2d0365 Refine code
6 years ago
peizhilin 1a9008c420 code style fix
6 years ago
tensor-tang 1be85d011d add mkl vsqr and vpow
6 years ago
peizhilin 13bfee1f85 Merge branch 'windows/build' into windows/online
6 years ago
peizhilin 7840d181c9 fix style issue
6 years ago
peizhilin dc339b78d7 fix code style
6 years ago
sneaxiy d231e55065 merge develop
6 years ago
peizhilin 9b558a8035 Merge branch 'windows/build' into windows/online
6 years ago
peizhilin 7638f0afb3 simplify the logic
6 years ago
peizhilin 6f9c70acb7 Merge branch 'windows/build' into windows/online
6 years ago
peizhilin ca60e1d34d Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
peizhilin 4bd0c4c5ee test=develop
6 years ago
peizhilin 4b1f1a8787 fix merge issue
6 years ago
Yu Yang 6ae0b91b39 Clean LockGuardPtr
6 years ago
peizhilin 52f7644f53 Merge remote-tracking branch 'upstream/develop' into windows/build
6 years ago
Qiyang Min 698698f2fa Merge branch 'develop' into fix_vlog
6 years ago
qingqing01 abe209234f Exhaustive search for cuDNN conv. (#14286)
6 years ago
Yu Yang b59a9bfb7c Clean buffered_allocator
6 years ago
Yu Yang fdc689142c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
6 years ago
minqiyang 87450b9ad4 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
6 years ago
peizhilin 4ffa92d4f0 Merge branch 'develop' into windows/build
6 years ago
peizhilin 45125ba538 fix share library issue
6 years ago
Zhaolong Xing ba8b5619a3 Revert "cherry picked windows patches."
6 years ago
minqiyang fcc0452c8b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_vlog
6 years ago
minqiyang 0c3227a523 Change the origin VLOG level to 10 times
6 years ago
peizhilin 869487a2b7 Merge remote-tracking branch 'origin/develop' into windows/build
6 years ago
dzhwinter 234a1d9248 Merge remote-tracking branch 'origin/develop' into windows/debug
6 years ago
Yu Yang c774bcbd2d Merge device_context
6 years ago
Yu Yang 057a682ee9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
6 years ago
tensor-tang c9730d33d9 fix run error on mac
6 years ago
Qiao Longfei 2921f8a79c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
6 years ago
dzhwinter 2835e04409 merge develop branch. test=develop
6 years ago
qingqing01 db8c52da5e Revert " Exhaustive search for cuDNN conv. (#14043)"
6 years ago
qingqing01 ce7d9b0799 Exhaustive search for cuDNN conv. (#14043)
6 years ago
Zeng Jinle 8ac2242b6e Merge pull request #14075 from sneaxiy/remove_some_locks_in_pe
6 years ago
sneaxiy 8684553633 stream callback support in cuda 10
6 years ago
sneaxiy faac8a76ce remove unnecessary codes
6 years ago
Yu Yang ff9e531bd9 style(platform): disable warning when cuda cc not matched (#14029)
6 years ago
Qiao Longfei 59fbfbfbf7 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-pserver-profiler-thread-pool
6 years ago
Qiao Longfei 9e4e9e9b6e clean rpc server profiler
6 years ago
tensor-tang c3cbf0b8ef Merge pull request #14185 from tpatejko/tpatejko/mkldnn-conv-residual-data-reorder
6 years ago
peizhilin 71d7980f69 fix build issue 1
6 years ago
tensor-tang 6b49ee42c3 Merge pull request #14239 from tensor-tang/fix/avx
6 years ago
tensor-tang e09a7c793d remove the warning log since do not have avx2, avx512 flags
6 years ago
tensor-tang f524c1b62b throw error when mismatch cpu version
6 years ago
peizhilin 9d67c1fb69 cpu build support
6 years ago
dzhwinter 60f70b174d test=develop
6 years ago
sneaxiy 7ff320f8cc merge develop
6 years ago
Kaipeng Deng daed473d4a Merge pull request #14089 from heavengate/pool_exclude
6 years ago
whs 0c319e0b35 Add affine grid generator op (#12238)
6 years ago
dzhwinter 0a180584e6 clean cmake. test=develop
6 years ago
dzhwinter 1ace55c8ee merge develop branch
6 years ago
Tomasz Patejko 8899d42265 MKLDNN conv residual data: primitive reuse interface used. Reorder done when formats are different
6 years ago
Yu Yang 90d9e5aee8 feat(platform): lazy initialization of devicecontext in pool (#14067)
6 years ago
dzhwinter 316765839d add back jit simd instructions. stage.
6 years ago
dzhwinter bf2e4cb188 cleard. staged
6 years ago
dzhwinter ebfe5a02b3 merge develop branch
6 years ago
Yu Yang c01696f8c2 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
6 years ago
dengkaipeng c93e044ae0 add inclusive/exclusive mode in PoolOp avg pool type
6 years ago
dzhwinter 7141debe38 add cudnn back. staged.
6 years ago
Sylwester Fraczek 2098b42584 review fixes (Teamcity fails)
6 years ago
sneaxiy 5be6f762d0 remove_lock_in_some_ops
6 years ago
Brian Liu a53e8a8da6 Update MKLDNN integration framework to support Paddle multi-instances
6 years ago
Yu Yang 1d4d4e73ab Remove place hash
6 years ago
Yu Yang 461f71a90b Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into rewrite_allocation
6 years ago
gongweibao 58c027cc38 Add rpc profiler flags. (#13989)
6 years ago
Xin Pan d10e54c460 Merge pull request #14003 from chengduoZH/fix_fast_parallel_exe_bug
6 years ago
chengduozh 82d2903b63 Fix fast ParallelExe bug
6 years ago
sneaxiy 2002e71da8 fix pinned allocator
6 years ago
sneaxiy 21fdf8e87d add unittest for allocator_facade.cc
6 years ago
gongweibao 078223b3e3 Add rpc timeline. (#13900)
6 years ago
tensor-tang 6447155dac Merge pull request #13851 from tensor-tang/fea/jitkernel_peephole
6 years ago
Yibing Liu 6b795d424c Merge pull request #13901 from kuke/seq_slice_py
6 years ago
dzhwinter e41a3fcd68 fix update to develop hang problem.
6 years ago
Qiao Longfei 681226e97c Merge pull request #13864 from jacquesqiao/py-reader-add-test-mode
6 years ago
Yibing Liu 16b2c6dc78 Add py api for sequence_slice_op
6 years ago
chengduo 2c9839c847 add cuda version display (#13885)
6 years ago
wanghaoshuang 3ae9645084 compile in linux
6 years ago
Qiao Longfei 8686f7c68e add reader_queue_speed_test_mode flag for speed test
6 years ago
tensor-tang bcb8ea397d Merge remote-tracking branch 'ups/develop' into fea/jitkernel_peephole
6 years ago
Qiao Longfei 5428cb9908 Profiler support merge data of all thread (#13811)
6 years ago
sneaxiy 4c672ab1a2 Merge reyoung:rewrite_allocation
6 years ago
tensor-tang ea7dc9cbf6 Merge remote-tracking branch 'ups/develop' into fea/jitkernel
6 years ago
Xin Pan ab798a2832 clarify the fraction_of_gpu_memory flag
6 years ago
Yu Yang 15076c325e Add comments and polish code style
6 years ago
Yu Yang 29f66c2408 Polish code
6 years ago
Yu Yang 8e3fdc6e65 Fix SetDevice on init
6 years ago
Yu Yang 524f6e9b36 Refine code
6 years ago
Yu Yang 5cf395beaf Fix bug in uts
6 years ago
dzhwinter 2d00e65819 namespace issue (#13543)
6 years ago
Yu Yang 58ed412f68 refactor(memory): rewrite memory allocation and make it extentable
6 years ago
typhoonzero a4f7696a18 Revert "Some trivial optimization (#13530)"
6 years ago
tensor-tang dee5d35c20 refine vmul
6 years ago
chengduo 1d91a49d2f Some trivial optimization (#13530)
6 years ago
dzhwinter 7806c5625f fix enforce (#13544)
6 years ago
dzhwinter 97636a9fcf "fix link error" (#13545)
7 years ago
sneaxiy 612e1a3155 modification
7 years ago
sneaxiy d0b2453ecd merge develop
7 years ago
sneaxiy 24ea39c4c6 feature/eager_delete_tensor
7 years ago
dzhwinter 85f8dd1c77 debug version
7 years ago
dzhwinter e1999538eb debug the device context
7 years ago
dzhwinter 372caf4000 windows staff
7 years ago
dzhwinter c3e1fb5a3e add demo
7 years ago
Krzysztof Binias 2ed7982d09 Merge pull request #13327 from kbinias/kbinias/conv-weights-converted-once
7 years ago
Krzysztof Binias accdecc681 Correcting Lint errors
7 years ago
Krzysztof Binias 1ce9e9dc30 Renaming decision variable
7 years ago
Krzysztof Binias 1658958fe6 Reusing converted weights
7 years ago
Yang Yu 8331e835a8 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix_CudnnHolder_bug
7 years ago
Wu Yi f90c7865f0 Benchmark tool for imgnet (#12305)
7 years ago
JiabinYang e322fc4e0e add error info for nccl not found
7 years ago
fengjiayi 7b577b92e0 fix a memory bug in CudnnHolder
7 years ago
fengjiayi 82a1b35b9b Revert "Revert "Add CudnnHolder and use it in Conv and ConvTranspose op""
7 years ago
guochaorong 151e169eb7 Revert "Add CudnnHolder and use it in Conv and ConvTranspose op"
7 years ago
dzhwinter 6fb28796f5 memory (#13143)
7 years ago
dzhwinter f05520060e fix style (#13142)
7 years ago
fengjiayi 0236966b68 follow commits
7 years ago
fengjiayi 5398e1a3a6 fix bugs
7 years ago
dzhwinter dbe90cc0f6 merge develop branch
7 years ago
fengjiayi f79ca23115 fix bugs
7 years ago
fengjiayi c501826f42 use framework::RWLock
7 years ago
fengjiayi 1f36a4c27c Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_CudnnHolder
7 years ago
fengjiayi b0aca8824d make CudnnHolder thread safe
7 years ago
luotao1 7169f9378c fix mkldnn include format
7 years ago
fengjiayi 15cc9128be fix compile error
7 years ago
fengjiayi 407ff0bdbc use CudnnHolder in conv_cudnn_op
7 years ago
fengjiayi 04bfd5c10c add CudnnHolder to manage cudnn_handle and workspace
7 years ago
Yan Chunwei 902f19b46a fea/fuse attention lstm simplify.with fusion lstm.with sequnce expand (#13006)
7 years ago
dzhwinter b78394ea57 done
7 years ago
dzhwinter b74af56bbc cpu compile is done
7 years ago
dzhwinter 78aab05b71 fix more op errors
7 years ago
dzhwinter cd8f3e9ed0 operator module is done
7 years ago
dzhwinter d361624c1d platform module (#12932)
7 years ago
dzhwinter 2ec589a24e float.h fixed
7 years ago
dzhwinter 7dceb8a080 check some operators
7 years ago
dzhwinter d7f98f37a7 more platform is done
7 years ago
dzhwinter efd0884fa9 add op registry
7 years ago
dzhwinter eca4563e5d operators module (#12938)
7 years ago
dzhwinter 488a2dd2e8 with ir node
7 years ago
dzhwinter cfbf1ba305 add source
7 years ago
dzhwinter c1ad52f768 pre-commit
7 years ago
dzhwinter 89f95ea25e merge develop branch
7 years ago
dzhwinter 34f8c9b6f5 windows port
7 years ago
tensor-tang 0d46f518ae refine avx condition and warning
7 years ago
tensor-tang 4e538db14d refine jit space
7 years ago
tensor-tang ec59f0d454 add cpu vec
7 years ago
tensor-tang 3dd66390b2 add blas vexp
7 years ago
tensor-tang 0ec1f65cf1 fix blas dot and add cblas scal
7 years ago
tensor-tang a2203d0466 add cblas dot
7 years ago
Michał Gallus cd32ddac12 Fuse Convolution and Eltwise Add into MKLDNN's Conv+Bias (#12669)
7 years ago
dzhwinter e23ddf6ae4 status (#12764)
7 years ago
Tao Luo d04ef276a5 Merge pull request #12745 from tensor-tang/refine/op/elewise_mul
7 years ago
dzhwinter 00463fdfe3 cudnn windows support (#12757)
7 years ago
dzhwinter 17602eab94 windows port of malloc
7 years ago
dzhwinter 2673798ddb "fix float16 ShuffleDownSync Bug" (#12756)
7 years ago
dzhwinter 5c88cd2af5 remove werror in windows
7 years ago
dzhwinter 64ce1210aa "windows support"
7 years ago
dzhwinter 36878d78cc comment out backtarce
7 years ago
dzhwinter 335398f18b dlfnh
7 years ago
tensor-tang 6644ce79a5 add mklml vmul
7 years ago
tensor-tang ff92b6ba81 Merge pull request #12531 from tensor-tang/refine/op/gru
7 years ago
Chen Weihang 1e961b145c Merge pull request #12591 from chenwhql/enforce_msg_polish
7 years ago
Yan Chunwei 0a641ba326 add ratio to profiler (#12701)
7 years ago
tensor-tang c588c64a76 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
chenweihang da39d84a48 refine by reviewer's advice
7 years ago
tensor-tang 1ab1d03c62 fix missing macro condition
7 years ago
Qiao Longfei e8fcb71bed Merge pull request #12620 from jacquesqiao/timeline-support-pure-cpu
7 years ago
tensor-tang 3bf3e77ac8 Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei 5a6c3cd9e0 fix profiler dead lock
7 years ago
tensor-tang a50889f523 introduce xbyak
7 years ago
qiaolongfei 3f2aa91970 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
qiaolongfei e008600b08 optimize code
7 years ago
qiaolongfei 7c649e06c3 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into timeline-support-pure-cpu
7 years ago
Sylwester Fraczek d74bb6ab9c fix ut for mkldnn 0.15 - added forcing layout NCHW in mkldnn conv tests
7 years ago
chenweihang b1dd4149b9 adjust enforce test cases
7 years ago
chenweihang 61052cdbc6 polish high frequency enforce error message
7 years ago
qiaolongfei 954d680b40 fix test_parallel_do.py
7 years ago
tensor-tang 836068569f Merge remote-tracking branch 'ups/develop' into refine/op/gru
7 years ago
qiaolongfei 1623f1ba4f Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into optimize-profiler
7 years ago
qiaolongfei 4c5bcd7859 add guard to profiler
7 years ago
tensor-tang 43cee33a23 add mkl packed gemm
7 years ago
Xin Pan caf10b474f make profiler use thread_id from g_thread_id
7 years ago
dzhwinter 6d3da458a7 Fix/float16 style (#12446)
7 years ago
dzhwinter 39ac9e39c2 float16 type support enhance (#12181)
7 years ago
tensor-tang 4f0383f52e fix unknown flag
7 years ago
tensor-tang 9788e5ab87 add flags to control num_threads
7 years ago
tensor-tang 10a1c2bb86 control omp num_threads
7 years ago
typhoonzero 54e9fd3f61 fix cudnn enforce
7 years ago
qiaolongfei a6d30a8607 profiler support cpu
7 years ago
Xin Pan 7781297c70 variants
7 years ago
Tao Luo e568acbee2 Merge pull request #12092 from velconia/add_deps_to_device_ctx
7 years ago
minqiyang 2cc6ca43a0 Add framework_proto to device context deps
7 years ago
Jacek Czaja fbe25ef510 MKLDNN: Extending Conv MKLDNN op to reuse MKLDNN primitives (#11750)
7 years ago
tensor-tang 2e418a5227 fix conflicts
7 years ago
tensor-tang 3df99e72ab Merge remote-tracking branch 'ups/develop' into refine/set_num_threads
7 years ago
dzhwinter 4ed0b62476 Move fluid::framework::InitDevices into fluid::platform (#11757)
7 years ago
dzhwinter 99a99ec7e3 "remove lapack" (#11966)
7 years ago
fengjiayi ce16b40b04 Merge pull request #11891 from JiayiFeng/dev_eof_exp
7 years ago
Yu Yang 037ce12ee4 Merge pull request #11907 from reyoung/feature/use_dev_ctx_for_op
7 years ago
yuyang18 2d0e5592b5 Use std::map for Place <--> DeviceContext
7 years ago
Xin Pan 94cb59ad09 hide utils to legacy
7 years ago
fengjiayi ed4b2475f5 add an unittest
7 years ago
fengjiayi 8553ac6a95 fix unittests
7 years ago
fengjiayi 3fab4f65a4 Add EOFException to represent EOF in C++ reader
7 years ago
Yan Chunwei 28172bbb8e add debug to replacing enforce with GLOG for debug (#11244)
7 years ago
gongweibao e2b1c5d925 fix code style (#11862)
7 years ago
mozga-intel b8a04c2fa1 Duplicated code was moved to common function
7 years ago
tensor-tang e3a96300bb move SetNumThreads to platform
7 years ago
Tao Luo 2dae8a4631 Merge pull request #11596 from tensor-tang/refine/mklml/dyload
7 years ago
Yi Wang 2625178add No NCCL on macOS (#11652)
7 years ago
Tao Luo 60647c9aa4 Merge pull request #11519 from jczaja/prv-softmax-mkldnn-grad-operator
7 years ago
chengduo da556ed6d4 enhance ParallelExecutor stable (#11637)
7 years ago
Jacek Czaja 98f3ad3ba1 - MKLDNN Softmax Grad Op
7 years ago
tensor-tang d5fb8fa778 Revert "Merge pull request #11628 from PaddlePaddle/revert-11102-mozga-intel/Sum_mkldnn_layout"
7 years ago
Yu Yang 9b3f48d7e6 Merge pull request #11616 from chengduoZH/fix_parallel_exe
7 years ago
tensor-tang 28a0ef9522 remove usr local lib when dynamic load lib
7 years ago
tensor-tang 90780e22ce Revert "MKLDNN layout: Support for sum operator"
7 years ago
chengduoZH c99fca5f90 Add No Mutex
7 years ago
tensor-tang 3e73a7a924 add usr local lib to dynamic search path
7 years ago
tensor-tang f503f12925 enable dynamic load mklml lib on fluid
7 years ago
mozga-intel 6512be59ec MKLDNN layout: the code-review changes
7 years ago
tensor-tang 9a25f2895c update the default cpu memory with MKLDNN
7 years ago
tensor-tang a8c2ff316f refine the initial cpu memory flag for mkldnn
7 years ago
Qiyang Min 046bb5c8cb Fix NCCLBcast hang up bug in Parallel Executor (#11377)
7 years ago
Xin Pan d2afd21021 Remove cuptiFinalize.
7 years ago
qiaolongfei 9ebbfa6bbc fix build on mac
7 years ago
tensor-tang 056dd40475 add initial memory flag in MB for infer
7 years ago
yuyang18 a1254a86ba Add lock to record_event.
7 years ago
mozga-intel 3ff9ba0e6b Mkldnn layout (#11040)
7 years ago
Xin Pan ca2d6d3c66 Merge pull request #11224 from dzhwinter/fix/cudnn
7 years ago
qingqing01 e0a32074bd Fix PADDLE_ASSERT. (#10981)
7 years ago
dzhwinter 44c662b4e1 Merge remote-tracking branch 'origin/develop' into fix/cudnn
7 years ago
Yu Yang c36dd3b338 Merge pull request #11114 from reyoung/feature/yep
7 years ago
dzhwinter 2b9ef7e249 "fix"
7 years ago
dzhwinter 75d8e8ca33 "fix compiled in manylinux"
7 years ago
dzhwinter 4777aec9be "done"
7 years ago
dzhwinter 7971d4a310 Feature/deterministic (#11205)
7 years ago
yuyang18 53dab95b75 Static DSO handle
7 years ago
yuyang18 c5115950a8 Use static for dlsym
7 years ago
yuyang18 7cf8b656a2 Remove lock in device context
7 years ago
Xin Pan 7eca286159 Merge pull request #11078 from panyx0718/improve_profiler
7 years ago
gongweibao 4fb7cc7f5e Move sync_mode device ctx from grpc server (#10881)
7 years ago
Xin Pan 75ea577fd3 allow profiler and timeline to work when dev_ctx is nullptr.
7 years ago
Xin Pan f14e579cc3 clean up
7 years ago
Xin Pan 3cb6395688 better profiler and benchmark
7 years ago
Xin Pan 0d598cf9f6 Merge pull request #10822 from panyx0718/dist_opt
7 years ago
Xin Pan 08e4970e45 follow comments
7 years ago
Xin Pan b4dd4c048d multi-thread handlerequest
7 years ago