Commit Graph

7504 Commits (8137f58c06b5b39e914ea9b0b63570ba8c894bba)

Author SHA1 Message Date
Yu Yang 0760aaf440 Shrink batch_norm_grad's inputs
7 years ago
guosheng 454b0a96be Remove the extra call of ValidateShape in ReshapeKernel
7 years ago
guosheng 437f7a3279 Resolve conflict according to the latest code
7 years ago
Jacek Czaja 3b95b55f07 - Softmax MKLDNN primitive integration
7 years ago
guosheng eb12cbe764 Refine reshape_op infershape
7 years ago
Yu Yang 64d7a30271 Extract SSAGraph
7 years ago
Yu Yang 8dec4ad7a1 Use int not Place for vars
7 years ago
Yu Yang 3181501013 Rerange code
7 years ago
Yu Yang f28ae6e4b1 Reorganize Code
7 years ago
Yu Yang 5c333e4143 Add dctor for dev_ctx
7 years ago
Liu Yiqun 0968753454 Enable the test of not creating variables every time.
7 years ago
Yu Yang 15f5f10ed5 AddInput/AddOutput for OpHandle
7 years ago
Yu Yang 5368e50d84 Reorganize code
7 years ago
typhoonzero 1eec926124 updates
7 years ago
Yu Yang fe7ed285d1 Extract NCCLCtxMap
7 years ago
typhoonzero e9d815e32b prepare and create op before run
7 years ago
Kexin Zhao ed2bc194c5 Merge pull request #9176 from kexinzhao/batch_norm_fp16
7 years ago
fengjiayi cd07c0f021 Merge pull request #9259 from JiayiFeng/dev_MultiEpochReader
7 years ago
Yu Yang 6ebc6bf533 ReorganizeCode
7 years ago
Yiqun Liu 7bb4ea9c13 Add an argument in Executor.Run to allow users to choose whether to create and destroy variables every time. (#9242)
7 years ago
Yu Yang a478a11e0b NCCL Guard for bcast
7 years ago
Yu Yang f2685bed81 Clean code
7 years ago
Yu Yang 41ad632341 Add NCCL Group Guard
7 years ago
Yu Yang 99fe83a020 Move nccl helper
7 years ago
Yu Yang 90f980167d Do not wait computation stream
7 years ago
Yu Yang 7ac969b88c Debug
7 years ago
fengjiayi 809530f418 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_MultiEpochReader
7 years ago
fengjiayi 7c041e48f4 Merge pull request #9182 from JiayiFeng/dev_MultipleReader
7 years ago
fengjiayi e4bd63d0e1 Merge pull request #9240 from JiayiFeng/fix_bug_in_recordio
7 years ago
typhoonzero 18461d0935 wip
7 years ago
wanghaoshuang edb4e29ab7 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into average_model
7 years ago
Kexin Zhao b7801b9fcb small fix
7 years ago
Kexin Zhao 70e7122785 initial commit
7 years ago
Kexin Zhao d60180af39 inital commit
7 years ago
Kexin Zhao c1e9b1e37e Merge pull request #9231 from kexinzhao/elementwise_add_fp16
7 years ago
Qiao Longfei 37a272e670 add executor.prepare (#9022)
7 years ago
fengjiayi 4286ea6197 Merge branch 'fix_bug_in_recordio' into dev_MultiEpochReader
7 years ago
fengjiayi 0b2f1b3f45 clear stream during Scanner::Reset()
7 years ago
chengduoZH eaa90d38ad add use_pinned
7 years ago
fengjiayi 91b6d60003 Merge branch 'fix_bug_in_recordio' into dev_MultiEpochReader
7 years ago
Yu Yang 599f7a87ba Refine code
7 years ago
Yu Yang 43e54079a8 Debug code
7 years ago
fengjiayi 2532b922dc Add more unittests and fix bugs
7 years ago
Yu Yang e335f01826 Add more logs
7 years ago
Yu Yang 82693e7227 Wait nccl all reduce
7 years ago
Yu Yang eb0a580e78 Add enforce
7 years ago
wanghaoshuang ad63722ed9 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into average_model
7 years ago
Yu Yang 65bc7d17d5 Add mtx to ncclAllReduce
7 years ago
Yu Yang d42117e742 Set NumThreads
7 years ago
Yu Yang ba227df941 Expose num_threads
7 years ago
Yu Yang 1533bf12df Use event and single thread
7 years ago
Yu Yang 176277b824 Add log
7 years ago
Yu Yang ed7727e8f0 Fix bug in system allocator
7 years ago
Yu Yang 95a0d7c7c1 Illegal memory access
7 years ago
Yu Yang 798e6907b4 Change mem order
7 years ago
fengjiayi f863866471 Add an unitest
7 years ago
Yu Yang 1c2b6100b0 Add
7 years ago
武毅 5008020d19 Merge pull request #9154 from typhoonzero/pserver_parallel
7 years ago
Yu Yang a0494f8e55 Mutex lock wait
7 years ago
Yu Yang 4e43b71377 Add wait log
7 years ago
Yu Yang dbed123382 Debug
7 years ago
Yu Yang e53b6aba63 Use no thread
7 years ago
Yu Yang a8bd7b9809 Add log
7 years ago
fengjiayi 02b7d8bea5 Merge branch 'fix_bug_in_recordio' into dev_MultipleReader
7 years ago
Yu Yang 3c9cea597e Add more log
7 years ago
Yu Yang f8f1a963d9 Add debug code
7 years ago
Yu Yang fbbcedda01 Fix bug
7 years ago
Yu Yang 7643c2cbab Add flag for use event
7 years ago
Yu Yang ca4b3d2532 Use 12 threads
7 years ago
fengjiayi c346a345e0 fix a bug
7 years ago
Yu Yang f251a58e85 Use base class manage events
7 years ago
typhoonzero 3666d7c02f fix num_blocks==2
7 years ago
chengduoZH 236b7dd2bd add pinned memory
7 years ago
Yu Yang 1dd216dc3b Wait bcast param
7 years ago
Yu Yang 4185dd48e4 Disable multi-thread
7 years ago
Yu Yang 631aa3d10a Wait all inputs ready
7 years ago
Yu Yang 9b1f4d5d62 After nccl add event
7 years ago
sabreshao e50205e744 CMake refine for HIP support.
7 years ago
fengjiayi a2981f5c50 fix a bug
7 years ago
Yu Yang feb569f8ea Add log
7 years ago
Yang yaming 381c6a026d Merge pull request #9100 from pkuyym/fix-9049
7 years ago
Kexin Zhao d307b5e4a6 Merge remote-tracking branch 'upstream/develop' into elementwise_add_fp16
7 years ago
typhoonzero 139ae08fdf workable
7 years ago
Kexin Zhao 5271c32d24 Merge pull request #9223 from kexinzhao/dropout_fp16
7 years ago
Yu Yang 260cfe3b86 Stop Wait NCCL Stream
7 years ago
Yu Yang e025e284c6 Exchange wait op
7 years ago
Yu Yang 3238ce0672 Add wait
7 years ago
Yu Yang 8a9de67e17 Remove wait
7 years ago
Yu Yang d2cb3790e9 Wait all evernts
7 years ago
Kexin Zhao 3da094fd7b rearrange test
7 years ago
Yu Yang 4137bb4eda Add wait
7 years ago
fengjiayi 832deee448 Merge pull request #9178 from JiayiFeng/fix_bugs_in_reader
7 years ago
Yu Yang 3da4159f88 Add run iter
7 years ago
Yu Yang d3c82c356e Wait multiple stream
7 years ago
Yu Yang c18c2f6ab0 Sync all computation streams at the end of run
7 years ago
wanghaoshuang e01c770c05 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into average_model
7 years ago
wanghaoshuang d22f4de794 Refine sum_accumulates_op.
7 years ago
yangyaming 2c22552542 Fix some comments and adapt test_machine_translation.py.
7 years ago
fengjiayi 6f7e812bb3 fix bugs
7 years ago
yangyaming 2f2c5f5e60 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-9049
7 years ago
Kexin Zhao 4bf168b274 add fp16 kernel for elementwise add
7 years ago
Kexin Zhao 182da95317 small fix
7 years ago
Kexin Zhao f2bbbb2b66 fix arithmetic operator
7 years ago
Kexin Zhao 18d616ed70 add float16 arithmetic operators on new GPU
7 years ago
Kexin Zhao d03dbb97f9 remove AttrType
7 years ago
Kexin Zhao 05ad15832a initial commit
7 years ago
Xi Chen 9eae086e39 add math_function to softmax's dep list
7 years ago
emailweixu b3f076a6e4 Merge pull request #9168 from emailweixu/fix_compile
7 years ago
chengduo 597ba3f3f2 add more times close test (#9215)
7 years ago
yangyaming 869a6f9cea Add python wrapper.
7 years ago
yangyaming ea788fc5df Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-9171
7 years ago
Yu Yang c372ce2885 Add event for computational op
7 years ago
Yu Yang b94ffacbd7 SetDev
7 years ago
Yu Yang 99f85a9fbc Set dev
7 years ago
Yu Yang d26f093f9d Log
7 years ago
Yu Yang d55a03d916 Scale loss on place
7 years ago
Yu Yang 932364a275 Sync dev
7 years ago
Yu Yang dad7bdabd4 Add setDev
7 years ago
fengjiayi 3bcffd495e Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dev_MultiEpochReader
7 years ago
Yu Yang 7fd0d24e0c Add lgo
7 years ago
Yu Yang bade579826 Wait code
7 years ago
Yu Yang 4a330094f9 Add log
7 years ago
Yu Yang 9824e8f311 Scale loss op use event
7 years ago
Yu Yang 071043c388 Add paddle enforce
7 years ago
Yu Yang 8af57706e2 Only wait same device
7 years ago
Yu Yang 29cc9f308d SetDev for nccl
7 years ago
Liu Yiqun 961151f17a Disable the link flags on Mac.
7 years ago
Yu Yang d7badb3ed2 Use event to sync stream
7 years ago
Yu Yang 3aa7051b98 Remove DevCtx lock
7 years ago
Yu Yang 1f53193a63 Use atomic code
7 years ago
Yu Yang c7beac1426 Add dummy var
7 years ago
Yu Yang 5fa535b717 Wait all thread done
7 years ago
fengjiayi d9868b0839 Add multi_pass_reader
7 years ago
Xin Pan 898e0ffa21 Merge pull request #9190 from panyx0718/p2p
7 years ago
Yu Yang 7bff02b2ca Change to pending op
7 years ago
Yu Yang 866f6f1be0 Debug
7 years ago
Yu Yang d3e55fde03 Guard devctx
7 years ago
Yu Yang a5ba704de0 Counter
7 years ago
Yu Yang a87ce91c4b Use mtx
7 years ago
Yu Yang ea11a0a853 Use volitie
7 years ago
Tomasz Patejko 2d95527527 Removing WITHIN_CHANNEL algorithm for lrn. CPU lrn operator works only with ACROSS_CHANNELS
7 years ago
Xin Pan ce55975bb5 fix
7 years ago
Xin Pan 18ac6947d0 Enable P2P memory copy
7 years ago
Tomasz Patejko c51c446221 Content of GetExpectedKernelType moved to standalone function
7 years ago
Tomasz Patejko 192cc5dd32 Implementation of MKLDNN LRN
7 years ago
yangyaming 332b665fc7 Enhanced cpp implementation and unit test.
7 years ago
Yu Yang 515e516e77 Add more log
7 years ago
Yu Yang 1f063d0900 Memorder
7 years ago
Yu Yang b1cb8bbd40 Debug
7 years ago
Yu Yang b57b880b05 Debug
7 years ago
Yu Yang f3e983e499 Memory order
7 years ago
Yu Yang 36e0415220 Single Thread
7 years ago
Yu Yang 5957f28b86 Debug
7 years ago
Yu Yang f52714d391 Debug
7 years ago
Yu Yang 0023c3bcf5 Use atomic bool
7 years ago
caoying03 a6e64242d8 follow comments.
7 years ago
Tao Luo c068d9c19e Merge pull request #9065 from Xreki/core_inference_shared_library
7 years ago
Yu Yang 09935ab936 Debug
7 years ago
Yu Yang f8141d90c8 Debug
7 years ago
Yu Yang e18a269705 Add debug code
7 years ago
Yu Yang 9cb8f50302 Complete fetch op
7 years ago
fengjiayi 07d38a9b9a refine patch
7 years ago
fengjiayi a571ef382e fix bugs
7 years ago
Kexin Zhao 446d54f5c3 update
7 years ago
Kexin Zhao ffa22a5f90 fix scaling param type
7 years ago
Tao Luo c0421379b7 Merge pull request #9043 from Xreki/core_inference_remove_clone
7 years ago
caoying03 c87d11a716 Merge branch 'develop' into enhance_reshape
7 years ago
Kexin Zhao e870947cfd fix batch norm fp16 param type
7 years ago
wanghaoshuang 92a01d4994 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into average_model
7 years ago
wanghaoshuang e0b136c0f9 Refine average accumulates op
7 years ago
typhoonzero 093e07d39e need to split var scopes
7 years ago
fengjiayi 3d677b1eca fix compile errors and make OpenFilesOpMaker derived from FileReaderMakerBase
7 years ago
fengjiayi 550622529c Add MultipleReader and open_files_op
7 years ago
Kexin Zhao 0a95a44b9a add python batch norm inference test
7 years ago
Kexin Zhao df99b16a16 Merge pull request #9167 from kexinzhao/pool2d_fp16
7 years ago
Kexin Zhao 39c676e208 initial commit
7 years ago
xuwei06 ab3543e35e Fix compilation for gcc5.4
7 years ago
Kexin Zhao 8ebfc153dd update
7 years ago
Kexin Zhao 3f5705c346 Merge pull request #9148 from kexinzhao/cast_op_fp16
7 years ago
Kexin Zhao bfbc25bdb8 add fp16 pool2d support
7 years ago
Darcy 02b3cfb156 Merge pull request #9137 from putcn/fix-capi-dep
7 years ago
yangyaming 3b03e3748d Refine some ENFORCE.
7 years ago
yangyaming 58730ba131 Enhance unit test.
7 years ago
yangyaming bf3f56e899 Finish adaption for backward.
7 years ago
sabreshao 45c988d86a Demostration of cmake refine for HIP support.
7 years ago
Yu Yang 254d7ff4f5 Refactor local_scopes
7 years ago
Yu Yang b2c7a9b828 Wait by stream
7 years ago
Yu Yang e8a7e5d1e6 Update
7 years ago
Yu Yang 8f0590e7c5 Add ncclAllReduce
7 years ago
Yu Yang c15d2c9edc Update
7 years ago
Yu Yang d470763f6c Stash
7 years ago
Yu Yang 9fc0b596a9 Test more
7 years ago
Liu Yiqun 371c53f88c Add profiling event in feed, fetch and load op.
7 years ago
Yu Yang 0ef9edf566 Stash
7 years ago
Liu Yiqun 253ba6672f Merge branch 'develop' into core_inference_remove_clone
7 years ago
Yu Yang 5e87cd7574 Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into cpp_parallel_executor
7 years ago
Yu Yang 8b397d1602 Make recordio file reader thread-safe by default
7 years ago
Yu Yang 8c9cd369dc Polish code style
7 years ago
Yu Yang 6f0dfd89a4 Single GPU ParallelExecutor complete
7 years ago
qiaolongfei a39c861530 rm unused private field in profiler
7 years ago
typhoonzero b8f4c8599e pserver runs in parallel
7 years ago
Kexin Zhao 8e7310146f Merge pull request #9143 from kexinzhao/numpy_conv2d_pool2d_fp16
7 years ago
Kexin Zhao f3c5e81556 add fp16 for cast op
7 years ago
Xin Pan 21e2c42a46 Merge pull request #9141 from panyx0718/develop
7 years ago
Tao Luo a448fbe9e1 Merge pull request #9134 from putcn/fix-selected-row-dep
7 years ago
Tao Luo 20be8e7e33 Merge pull request #9104 from ranqiu92/doc_dir
7 years ago
Xin Pan 1ca1e1c384 Fix a program copy regression.
7 years ago
qingqing01 7c1a0b77a0 Delete the detection_output_op, which had been split into several operators. (#9121)
7 years ago
Kexin Zhao e967d19b0a add more tests
7 years ago
Kexin Zhao a13ec3432a fix test error
7 years ago
Xi Chen 845592708f add gserver to capi dep
7 years ago
Kexin Zhao e4de5dc347 add conv2d fp16 support
7 years ago
Xi Chen d20c6eb6de add math_function to selected_rows_functor dependency list
7 years ago
qingqing01 1cd700d8e8 Fix bug in LRN operator. (#9124)
7 years ago
ranqiu 64775126f3 change the dir of docs
7 years ago
qingqing01 b5a16dca20 Fix a critical bug in softmax_with_cross_entropy_op backward. (#9120)
7 years ago
Yu Yang d84ddcf123 Stash
7 years ago
Yu Yang 193c0a7e43 Handle var hazard
7 years ago
Thuan Nguyen 1e4c504e60 Implement Select OP (#9088)
7 years ago
qingqing01 45073b7c39 Always synchronize when copy data on GPU from C++ to Numpy array. (#9110)
7 years ago
Yu Yang 35744e7b36 Polish code
7 years ago
Xin Pan d284cf88e5 Merge pull request #9037 from panyx0718/develop
7 years ago
Yu Yang ae88fdefb7 Use thread pool
7 years ago
dzhwinter 128adf53cb [Speed]implement cudnn sequence softmax cudnn (#8978)
7 years ago
yangyaming 352fa41a16 Finish adapting forward.
7 years ago
wanghaoshuang d7e5e1f13d Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into average_model
7 years ago
Kexin Zhao e26f1123da Add fp16 mul op support and bind paddle fp16 to numpy fp16 (#9017)
7 years ago
dzhwinter 7140071152 "exported scatter to python" (#9038)
7 years ago
Tao Luo cf2addd21f Merge pull request #9067 from luotao1/with_fluid
7 years ago
chengduo 11c43e5da3 Merge pull request #9072 from chengduoZH/feature/refine_parallel_do
7 years ago
Abhinav Arora 41894da145 Add changes to channel that are needed for select op (#9084)
7 years ago
wanghaoshuang 8a645685ce Add sum accumulator with window for model average
7 years ago
Yu Yang 692a0f7425 Better name
7 years ago
Yu Yang baef1124fb ParallelExecutor And dependency engine
7 years ago
Yibing Liu 90afbd2856 Move back operator's event to RunImpl()
7 years ago
Xin Pan 4840c49b27 Better timeline
7 years ago
chengduoZH ef28e7deba refine parallel_do_grad
7 years ago
Luo Tao 76e1c6af9f enable WITH_FLUID option
7 years ago
Liu Yiqun 6c614814da Limit the symbol table of fluid shared library.
7 years ago
Yu Yang 48f213e5a1 Merge pull request #8991 from reyoung/feature/shuffle_reader
7 years ago
Cao Ying 881c5227ab Merge pull request #8843 from zhouhanqing/Paddle-ReduceProd
7 years ago
Liu Yiqun 9ed8e2a082 Merge branch 'develop' into core_inference_remove_clone
7 years ago
Liu Yiqun 8ecad98578 Add the bool variable to decide whether to have a copy of the program in ExecutorPrepareContext.
7 years ago
武毅 d13ce35875 Feature/send recv can now retry (#9027)
7 years ago
dzhwinter 14fe40aaa6 Refine/nccl (#9009)
7 years ago
chengduo 788c600e9d Merge pull request #8932 from chengduoZH/feature/add_concat_rows
7 years ago
Yang Yang 8f061e43b7 delete param name
7 years ago
Yang Yang 0621c327f1 init commit
7 years ago
Liu Yiqun c0a9aebe1c Remove the clone of program in C++ Executor.Run().
7 years ago
chengduoZH 92e2207e18 refine doc
7 years ago