Commit Graph

782 Commits (12e4be038262fcb33b089710edaed2fe62320e3d)

Author SHA1 Message Date
633WHU 12e4be0382 Dlpack support (#20039)
5 years ago
Wilber 751812a674 enable cpu machine to run paddle in gpu lib
5 years ago
Zeng Jinle 1d1d221f26 refine allocator_flag, test=develop, test=document_fix (#20400)
5 years ago
danleifeng 425279a57b Improve elementwise operators performance in same dimensions. (#19763)
5 years ago
qingqing01 1a3eef026c Enable users to create custom cpp op outside framework. (#19256)
5 years ago
liym27 24010472d4 fix pool2d pool3d,support asymmetric padding and channel_last (#19739)
5 years ago
Chen Weihang b916335025 Paddle error message stack shaping and optimization (#19895)
5 years ago
joanna.wozna.intel 1d32897c5c Fix test pool2d int8 mkldnn (#19976)
5 years ago
Zeng Jinle 37f76407b0 fix cuda dev_ctx allocator cmake deps, test=develop (#19953)
6 years ago
Jacek Czaja 5b07ca9cdd - ReImplemented pooling fwd mkldnn (#19911)
6 years ago
chengduo d7251a8e1e Delete local execution scopes (#19749)
6 years ago
Zeng Jinle c7f36e7c00 Add lock to cudnn handle calls (#19845)
6 years ago
Zeng Jinle b25d1e758d remove enforce.h file written, test=develop (#19897)
6 years ago
Jacek Czaja 619c797a7f [MKL-DNN] LRN refactoring (#19798)
6 years ago
lidanqing 2c32c2d649 Refactor conv computeINT8 (#19574)
6 years ago
Adam c7e688921b Add template functions for Acquire primitive/primitive_desc (#19867)
6 years ago
Zeng Jinle 13ca364ceb remove some flags and add comments to some flags, test=develop (#19813)
6 years ago
Zeng Jinle 5eb381a3e2 refine reallocate of workspace size, test=develop (#19843)
6 years ago
Adam dfdd73cbc0 Add MKLDNNhandlerT templatized class (#19801)
6 years ago
Zeng Jinle 32b1151f5e reduce default value of cudnn workspace size, test=develop (#19780)
6 years ago
Adam d4413a54bc Add common CreateKey for mkldnn handlers (#19767)
6 years ago
Yihua Xu 0d6ea52958 Fix the definition issue when used mkl_scsrmm and mkl_dcsrmm functions. (#19774)
6 years ago
Jacek Czaja 9e4c958552 Refactoring activation mkldnn op (#19748)
6 years ago
Huihuang Zheng 12542320c5 Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989)
6 years ago
Adam 428b2b9e17 MKLDNN handler cleanup (#19713)
6 years ago
XiaoguangHu 27235cf222 Add document annotations for FLAGS that need to be open to external developers test=develop (#19692)
6 years ago
Tao Luo f05d2c519d paddle::framework::vectorize() templatization [PART3] (#19643)
6 years ago
Yiqun Liu 42b5bec6f9 Integrate NVRTC to support compiling CUDA kernel at runtime (#19422)
6 years ago
Tao Luo 3ae939e48a unify PADDLE_ASSERT_MSG into PADDLE_ENFORCE(error_message) (#19631)
6 years ago
Tao Luo 75d1571995 refine PADDLE_ENFORCE codes for unify PADDLE_ASSERT_MSG (#19603)
6 years ago
Adam e94b26daf5 using MKLDNNMemoryFormat = mkldnn::memory::format changes (#19568)
6 years ago
Tao Luo 49523ea189 replace PADDLE_ASSERT with PADDLE_ASSERT_MSG (#19586)
6 years ago
zhouwei25 84c728013c fix the compilation issue on windows caused by mkl_CSRMM (#19533)
6 years ago
Jacek Czaja cef95ee30d [MKL-DNN] Refactoring Softmax (#19312)
6 years ago
Zeng Jinle 0a73f7202a Add retry_allocator for gpu (#19409)
6 years ago
Jacek Czaja ecd9f330c9 [MKL-DNN] Fix to face model on AVX512 platforms (#19282)
6 years ago
liuwei1031 d6cb1a4122 add dynamic C runtime support on windows, test=develop (#19502)
6 years ago
Zeng Jinle c2c5b1b941 remove signal raise msg, test=develop (#19527)
6 years ago
Zeng Jinle caf59d0f3f Add signal message to stderr (#19421)
6 years ago
Yi Liu efb05ba258 supports multiple NCCL communicators preserved in NCCLCommContext (#19407)
6 years ago
wopeizl b8aa37d529 save the callstack information to file when exception throws test=dev… (#19324)
6 years ago
Tao Luo 6527a7df67 replace part of PADDLE_ASSERT to PADDLE_ENFORCE (#19285)
6 years ago
Yihua Xu b920395842 Use sparse matrix to implement fused emb_seq_pool operator (#19064)
6 years ago
Zeng Jinle 91a0911ca3 Make PADDLE_ENFORCE_EQ support types that cannot be converted to std::string (#19243)
6 years ago
Zeng Jinle 708bd9798d move_flags_to_unified_files_for_management, test=develop (#19224)
6 years ago
Zeng Jinle 002f325dcd add PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#19211)
6 years ago
Adam b837689e97 Add generalized Conv+Activation MKLDNN fuse pass creation (#19072)
6 years ago
gongweibao 29d8781240 Polish fleet API to support cuda collective mode and nccl2 mode. (#18966)
6 years ago
wopeizl 80b7ef6fc8 add tensorrt support for windows (#19084)
6 years ago
Zhang Ting c2063217e7 optimize error message for "embedding" and "cross_entropy" OP (#18765)
6 years ago