Commit Graph

15959 Commits (3d006fa630c722c9803c8c79397df8e6a268e4c3)

Author SHA1 Message Date
chengduo 2450d15b78
disable fuse_all_optimizer_ops (#19966)
6 years ago
Aurelius84 f58c8db668
Require x.dims=label.dims in huber_loss (#20017)
6 years ago
Yang Zhang cde73a7bbf
Expose `mutable_data` as python binding (#19932)
6 years ago
Aurelius84 137e6336ef
Remove constraint that last dimension is forced to be 1 in rank_loss (#19997)
6 years ago
chengduo 101a2b610a
Add dtype for coalesce_tensor_op (#20016)
6 years ago
Zhaolong Xing f04f2b232a
fix if else error info (#19974)
6 years ago
gongweibao a7512db2bc
Polish elementwise max min pow document to add more examples. (#19946)
6 years ago
Aurelius84 2b5b4b3c5e
fix dataType in C++ comment in embedding op (#20004)
6 years ago
Tao Luo bcb2903e60
enhance shape error message of mul_op (#19998)
6 years ago
mapingshuo d62360fe5f
fix doc of apply_optimize (#19965)
6 years ago
Chen Weihang 1409586eaa
Add LoD empty check for all related sequence ops (#19980)
6 years ago
Huihuang Zheng 88af4ab650
Add new data layer (#19916)
6 years ago
xujiaqi01 f50e701b3b
fix memory leak in HogwildWorker (#19956)
6 years ago
Zeng Jinle b8aff5e5e9
fix buddy_allocator_test, test=develop (#19967)
6 years ago
Zeng Jinle 4a5ce4feb1
Add AdadeltaOptimizer doc (#19875)
6 years ago
Zeng Jinle 7912e6caa1
Expose set_gradient_clip API (#19869)
6 years ago
chengjuntao 0099e54924
refine deformable roi pooling doc (#19944)
6 years ago
zhongpu b1bb23841e
add kernel for fill_op, test=develop (#19719)
6 years ago
wangchaochaohu 382d099dcb
add support tensor and tensorlist for strided_slice OP (#19929)
6 years ago
lvmengsi fe218df326
Fix ssdloss num and batch norm format and conv2d (#19754)
6 years ago
lvmengsi 619a241bd0
Fix OpTest of bn (#19062)
6 years ago
Bob Zhu c670058a8d
add support of matmul with multiple head even different width and height (#19708)
6 years ago
Liufang Sang 6884dc800a
refine ctc align op with padding (#19926)
6 years ago
Zhaolong Xing e89b12884a
FIx C++ inference BUG: When open memory optim and enable trt subgraph at the same time, there is a bug (#19969)
6 years ago
Wojciech Uss 4286a6270d
Add support for new QAT models (#18970)
6 years ago
Aurelius84 99a9615a4b
Removing length dims constraints of seq_pad and seq_unpad (#19497)
6 years ago
chengduo cca26f5c42
polish multi process warning info (#19961)
6 years ago
Yi Liu 2efdf0ef35
update en document of shard_index_op (#19963)
6 years ago
jhjiangcs 766bd529d1
add optimizer:dpsgd,test=develop (#19915)
6 years ago
Zeng Jinle 37f76407b0
fix cuda dev_ctx allocator cmake deps, test=develop (#19953)
6 years ago
Yang Zhang ebff68fa74
Add float16 support to `sync_batch_norm_op` (#19681)
6 years ago
Aurelius84 039b9710d5
Remove constraint that last dimension is forced to be 1 by adding lookup_table_v2 (#19735)
6 years ago
Zeng Jinle 80e0f547bb
fix allocator ut,test=develop (#19945)
6 years ago
whs bdb3e376d0
[PaddleSlim] Enhence compressor api in PaddleSlim (#19894)
6 years ago
xujiaqi01 cedc04775c
support change shuffle and train thread num (#19841)
6 years ago
Kaipeng Deng 14625ffe9e
add elementwise mod support float/double. test=develop (#19570)
6 years ago
Jacek Czaja 5b07ca9cdd
- ReImplemented pooling fwd mkldnn (#19911)
6 years ago
Zeng Jinle b1e83b33b0
fix huber loss op attr type, test=develop (#19937)
6 years ago
Zeng Jinle cc157d5990
add inplace to assign op, test=develop (#19927)
6 years ago
chengduo 55ce696986
clean tensor array (#19930)
6 years ago
Leo Chen 57606205f5
Make OpTest check grad inplace even if forward has no inplace (#19847)
6 years ago
Zhang Ting cb8f3c03a7
resize Ops support data_layout:channel_last, test=develop, test=document_preview (#19914)
6 years ago
mapingshuo 9901f69677
Forward recompute3 (#19913)
6 years ago
chengduo d7251a8e1e
Delete local execution scopes (#19749)
6 years ago
wopeizl 5452b6a152
remove the useless warning for user to avoid confuse test=develop (#19871)
6 years ago
ruri d31c92a2cd
add mse_loss (#19759)
6 years ago
hong 85b398f171
Add op compatible information (#19910)
6 years ago
Kaipeng Deng 3f021781a1
fix softmax CE time limit check failed (#19846)
6 years ago
Tao Luo a4919d3688
move tree_conv to fluid.contrib.layers (#19918)
6 years ago
石晓伟 30adea0a23
tensor_array_to_tensor_op.cc, test=develop (#19289)
6 years ago
Zeng Jinle 0436efd6a3
Unify DataLoader APIs (#19305)
6 years ago
lvmengsi 4155e62559
add instance norm (#19500)
6 years ago
Zeng Jinle c7f36e7c00
Add lock to cudnn handle calls (#19845)
6 years ago
pawelpiotrowicz 2c5c636514
Add two extra flags for test_analyzer_int8_image_classification to disable fp32/int8 (#19840)
6 years ago
Adam cb65439da8
Add support for other axes in MKLDNN softmax op (#19907)
6 years ago
Jiabin Yang 454254115e
Feature/auto prune in dygraph (#19757)
6 years ago
Aurelius84 418a0967f3
move match_matrix var_conv2d et.al api into fluid.contrib test=develop (#19859)
6 years ago
Pei Yang baccd7e2ca
Add TRT input shape check between model and runtime (#19864)
6 years ago
Pei Yang 74812d1c90
Fix BUGS: paddle-TRT repeatedly sets weight_map and overdeletes repetitive_params (#19825)
6 years ago
Zeng Jinle 747d44980a
Refine err msg of out of gpu memory (#19779)
6 years ago
Aurelius84 fcf53e55ff
support 2-level lod of input in sequence_pool (#19839)
6 years ago
Zeng Jinle b25d1e758d
remove enforce.h file written, test=develop (#19897)
6 years ago
Zhang Ting 93364b45c1
group_norm support data_layout:NHWC, test=develop, test=document_preview (#19614)
6 years ago
Huihuang Zheng e117114289
Set states of recurrent op as dependent vars in prune (#19865)
6 years ago
石晓伟 d004a0f50e
fix multi-thread exec of trt, test=develop (#19338)
6 years ago
Zeng Jinle b754700fb5
fix reduce and broadcast to avoid multi-stream, test=develop (#19889)
6 years ago
Zeng Jinle 8359b415e4
add free chunks to auto growth allocator, test=develop (#19890)
6 years ago
Jacek Czaja 619c797a7f
[MKL-DNN] LRN refactoring (#19798)
6 years ago
Zhang Ting 439d95e157
modified interpolate op to support tensor attribute, test=develop, test=document_preview (#19287)
6 years ago
Zhang Ting b38889413d
add crop_tensor_op, test=develop, test=document_preview (#19314)
6 years ago
lidanqing 2c32c2d649
Refactor conv computeINT8 (#19574)
6 years ago
joanna.wozna.intel 3f1d0234ae
Fix conv2d+dequantize squash for residual fusion (#19545)
6 years ago
Huihuang Zheng a35557d8f4
Fix deps of prune (#19876)
6 years ago
Adam c7e688921b
Add template functions for Acquire primitive/primitive_desc (#19867)
6 years ago
flame fe18cfdb4f
hide with inference optim API (#17355)
6 years ago
Leo Chen 578a2f5da3
fix SplitLodTensor when batch_size = 0, test=develop (#19866)
6 years ago
Aurelius84 b125e327aa
Remove constraint that last dimension is forced to be 1 in cross_entropy (#19606)
6 years ago
wopeizl a7c440d303
add precise roi pooling op test=develop (#18960)
6 years ago
Yiqun Liu 3cd985a669
Add a pass to fuse fc+elementwise_add+layernorm (#19776)
6 years ago
Jie Fang d9db94d752
Optimize amp for multi-gpu to enable FP16 gradients transfer across gpus. (#19714)
6 years ago
wangchaochaohu 47af618f70
Strided slice (#19642)
6 years ago
Zeng Jinle 13ca364ceb
remove some flags and add comments to some flags, test=develop (#19813)
6 years ago
123malin 1bc285a53a
add retry function to try to solve grpc error code 14 (#19661)
6 years ago
Zeng Jinle 5eb381a3e2
refine reallocate of workspace size, test=develop (#19843)
6 years ago
石晓伟 71b2ed61bc
support MLU nums, test=develop (#19372)
6 years ago
Zeng Jinle 3f87464e9c
refine executor_gc_helper codes, test=develop (#19814)
6 years ago
LielinJiang 6d72a86b14
fix_roi_transform_bug (#19785)
6 years ago
Zeng Jinle 3fd3b663a8
fix gc bug in controlflow ops, test=develop (#19827)
6 years ago
Leo Chen 982e61f5ff
Update elementwise double grad to save gpu memory (#19509)
6 years ago
Zeng Jinle db26de8389
[Bug fix] Disable memory reuse on feeded variables (#19835)
6 years ago
Adam dfdd73cbc0
Add MKLDNNhandlerT templatized class (#19801)
6 years ago
Zeng Jinle cabb9501bd
fix leaky_relu op when alpha is zero, test=develop (#19833)
6 years ago
Pei Yang 9cbc1eff2d
zerocopytensor support uint8, analysis config support profile, analysis predictor support GetInputTensorShape, test=develop (#19822)
6 years ago
chengjuntao 00efd1d8a9
add deformable conv v1 op and cpu version of deformable conv v2 (#18500)
6 years ago
Thunderbrook 40c66f8df9
rm return in vfork (#19734)
6 years ago
Zhaolong Xing 110be57c1b
fix memory optimization type (#19781)
6 years ago
liym27 677e714425
fix pow op, support tensor for agument factor. (#19313)
6 years ago
liym27 bd89a27308
add tensor support for argument shape in reshape op; (#19268)
6 years ago
liym27 88628016b2
add tensor(tensor and tensor in list) support for argument starts and ends in slice op; (#19208)
6 years ago
liym27 e9e3c08777
fix expand op: (#19302)
6 years ago
xujiaqi01 6bf298bf09
support preload thread, optimize hdfs log, fix master+patch bug (#19695)
6 years ago
Huihuang Zheng a0d80754c5
Add comments for CUDA Device Context Allocator related stuff (#19809)
6 years ago
Jiabin Yang cc311bdf95
Feature/add transform data dygraph (#19707)
6 years ago
lvmengsi b76343c3b7
cpu Conv double grad (#19672)
6 years ago
Zeng Jinle 754fd57ed7
disable memory optimization passes when FLAGS_use_ngraph=True, test=develop (#19778)
6 years ago
翟飞跃 93c85c930a
Implement FusedEmbeddingSeqPoolGradKernel with cblas_saxpy (#19770)
6 years ago
chengduo 8281497030
Fix warning info of build_strategy (#19805)
6 years ago
Zeng Jinle b34933d9ee
fix retry allocator bug, test=develop (#19794)
6 years ago
Yiqun Liu c67c8758cb
Enhance fc_fuse_pass to enable fusing relu to fc_op (#19733)
6 years ago
Zeng Jinle 32b1151f5e
reduce default value of cudnn workspace size, test=develop (#19780)
6 years ago
zhongpu 52673956de
add kernel for squeeze_op, test=develop (#19656)
6 years ago
zhongpu 2a81c3679a
add kernel for unstack_op, test=develop (#19538)
6 years ago
Chen Weihang 00d5375e0c
Add prune_backward function to cover complicated test_program.clone situation (#19772)
6 years ago
Kaipeng Deng 99c78b772a
fix softmax axis!=-1. test=develop (#19800)
6 years ago
tianshuo78520a 38f1c2fe28
change approve site (#19791)
6 years ago
Adam d4413a54bc
Add common CreateKey for mkldnn handlers (#19767)
6 years ago
Yihua Xu 0d6ea52958
Fix the definition issue when used mkl_scsrmm and mkl_dcsrmm functions. (#19774)
6 years ago
chengduo 056fdedde3
Open fuse all reduce option (#19765)
6 years ago
Aurelius84 8c7e411908
Remove constraint that last dimension is forced to be 1 by adding one_hot_v2 (#19716)
6 years ago
JesseyXujin e352467c1c
modify activation op API, delete use_cudnn args, test=develop, (#19758)
6 years ago
Jacek Czaja 9e4c958552
Refactoring activation mkldnn op (#19748)
6 years ago
Huihuang Zheng 12542320c5
Replace TemporaryAllocator by CUDADeviceContextAllocator (#18989)
6 years ago
Zeng Jinle 0daa5c9772
Make leaky relu inplacable (#19676)
6 years ago
Zeng Jinle 078a678219
refine math_op_patch, test=develop (#19727)
6 years ago
chengduo e506c99c20
Open fuse broadcast option (#18833)
6 years ago
Jacek Czaja 47f670d58c
- Softmax mkl-dnn refactoring (#19615)
6 years ago
Yiqun Liu a65c728e5d
Implement the GPU kernel of fc operator (#19687)
6 years ago
Aurelius84 22301115d0
Remove constraint that last dimension is forced to be 1 in huber_loss op (#19562)
6 years ago
chengduo 5866a7a5fe
Enable fused_all_reduce_op_handle support GPU and CPU Gradients (#19418)
6 years ago
Youwei Song 3e5fb6361b
fix api-doc error for dygraph and backward (#19721)
6 years ago
Tao Luo ec9bc1bd9f
paddle::framework::vectorize() templatization (#19730)
6 years ago
Zeng Jinle bb4f8dee83
add logs to left var memory size, test=develop (#19722)
6 years ago
Adam 428b2b9e17
MKLDNN handler cleanup (#19713)
6 years ago
XiaoguangHu 27235cf222
Add document annotations for FLAGS that need to be open to external developers test=develop (#19692)
6 years ago
Zeng Jinle 1c25c88aba
refine memory usage of some operators, test=develop (#19700)
6 years ago
wangguanzhong 25dcd74d34
merge empty lod tensor, test=develop (#19228)
6 years ago
yaoxuefeng c6756ed225
fix instag op (#19591)
6 years ago
gongweibao 6c2bc29cc0
Fix float16 optimizer. (#19682)
6 years ago
Zeng Jinle 713c05dd60
refine tensor.mutable_data, test=develop (#19680)
6 years ago
Chen Weihang c78a4781bf
Fix train error when test_program.clone is executed after optimizer.minimize (#19397)
6 years ago
zhongpu 5f627488db
add kernel for unsqueeze_op and Add unsqueezed op test, test=develop (#19436)
6 years ago
Zeng Jinle a7691603a5
add gpu_allocator_try_time config, test=develop (#19675)
6 years ago
JesseyXujin 0b06db9413
delete transmission args in linear_chain_crf op (#19619)
6 years ago
Tao Luo f05d2c519d
paddle::framework::vectorize() templatization [PART3] (#19643)
6 years ago
hutuxian 1ca6ea0318
fix cmakelist deps (#19668)
6 years ago
Tao Luo bcddbc78d4
remove -Wmaybe-uninitialized warning (#19653)
6 years ago
Zeng Jinle 2db40d9f60
reduce thread num of retry_allocator_test,test=develop (#19638)
6 years ago
wangchaochaohu 4440d7ced0
test=develop cuda realization of label smooth op (#19175)
6 years ago
chengduo 31c5a5ee26
Remove linear_chain_crf_op.cu (#19645)
6 years ago
123malin a25a716e87
Optimize fleet API: add input check for some interfaces (#18971)
6 years ago