Commit Graph

4548 Commits (b34933d9ee3b61dbbd642fd02f244c36d0d14550)

Author SHA1 Message Date
guru4elephant ab57d3893e
make auc op compatible with 1 dim (#18551)
6 years ago
Hongyu Liu a20b2b43fc
fix cudnn lstm shape bug; test=develop (#18492)
6 years ago
Zeng Jinle d3003a1620
Feature/buffer_shared_inplace (#17911)
6 years ago
Zeng Jinle be24e5b391
Clean unused code of dim and place (#18565)
6 years ago
Jacek Czaja 8869d7f735 Activations MKLDNN ops refactoring (#18191)
6 years ago
Yibing Liu b86234fc0b
Register fp16 for concat_op (#18563)
6 years ago
Physher 5e1220ef37 fix compile error which caused by gcc4.8 related commit;test=develop (#18567)
6 years ago
Jiabin Yang 667f88f9a6
Fix/gcc 4.8 ubt link error (#18558)
6 years ago
Physher 0caa08ea40 Add mkldnn int8 mul-op kernel (#17834)
6 years ago
LielinJiang 24d1c44a0c Fix roi_perspective_transform_op bug (#18522)
6 years ago
Zhaolong Xing 88b52a27fe
Inference: fix mask rcnn model diff, optim memory usage, memory leak. (#18532)
6 years ago
zhaoyuchen2018 832d8191ff
Fix topk cannot handle 1D vector bug (#18466)
6 years ago
qingqing01 7ac4818a98
Refine Infershape in activation_op for double_grad. (#18485)
6 years ago
chengduo 7453857324 Make fuse_all_reduce_op_pass support mix_precision (#17652)
6 years ago
zhoukunsheng 7c6f2350b9 support Tensor input for edit_distance op (#18162)
6 years ago
zhoukunsheng 26318544d2 support Tensor input for chunk_eval op (#18226)
6 years ago
zhoukunsheng 206c44e2a8 add unique kernel and op (#17557)
6 years ago
zhoukunsheng 71af72b1c2 upgrade hash op to support Tensor and LoDTensor input (#17998)
6 years ago
zhoukunsheng d3b3443d10 add ones_like op (#17388)
6 years ago
zhoukunsheng 67b48d7fe7 add size op (#17412)
6 years ago
Leo Zhao 8f5fffca0a rename mkldnn set/get_cur_thread_id() to set/get_cur_mkldnn_session_id() (#18453)
6 years ago
Yi Liu a873fa84ce
supports collective training with programs (#18392)
6 years ago
chengduo e0d8c6ac68
Add find_no_grad_vars in backward.py (#17942)
6 years ago
LielinJiang 449c7a9f98 Make roi_perspective_transform op return mask and transform matrix (#18371)
6 years ago
Brian Liu 4bc2987d2f Fix bug in quantize kernel which cause crash in vgg16/19 model (#17964)
6 years ago
Leo Zhao 681d3553f1 Fix potential mkldnn concat/pool/conv kernel issues (#18393)
6 years ago
Zeng Jinle f5641000bb
Add a unittest to inplace elementwise_add (#18385)
6 years ago
tangwei12 999d9a59a5
fix communicator with pyreader (#18350)
6 years ago
HaoRen b7128bac5f supports collective communicated training (#18175)
6 years ago
Sylwester Fraczek 9252e8fa08 add int8 mkldnn prior_box (#17242)
6 years ago
Jacek Czaja c2efdfd5bc [MKL-DNN] Extending reusing to Elementwise_add_mkldnn op (#18146)
6 years ago
qingqing01 9047ac687e
Simplify multi_box_head API in detection.py and remove assign op. (#18310)
6 years ago
Yibing Liu 23941e43ec
Update lamb optimizer (#18333)
6 years ago
tensor-tang 81ec538279
fix softrelu doc (#18324)
6 years ago
Hongyu Liu df2eee71d8
Sequence mask support tensor (#18249)
6 years ago
Qiao Longfei 0e08e91c18
optimize communicator merge sparse gradient test=develop (#18159)
6 years ago
Yibing Liu f57ee3693b
Fix the bug of sequence_unpad op (#18290)
6 years ago
chengduo 5489216eba
Clean build strategy (#18148)
6 years ago
songhao 6b3d96254d fix some bug when merge sparse embedding parameters, test=develop (#18223)
6 years ago
xiaoting b58bb80248 set src_idx > 0 for bilinear_interp_op (#18238)
6 years ago
Hongyu Liu cefd0fb598
Fix slice op shape=-1 bug (#18107)
6 years ago
翟飞跃 802ea50956 fix spelling errors (#17941)
6 years ago
FlyingQianMM 944c3165ec
fix type error of std::pow in sigmoid_focal_loss_op.cu and sigmoid_focal_loss_op.h (#18152)
6 years ago
Zeng Jinle 6eec66a1b1
Fix py_reader iterable bug (#18108)
6 years ago
qingqing01 80d2e66f9e
Update backward appending stragety to support double backward and fix some bug. (#18104)
6 years ago
FlyingQianMM ff83655f7e
add detection output operator for supporting retinanet (#17896)
6 years ago
FlyingQianMM 0aee1f0074
add sigmoid focal loss operator for supporting retinanet (#17895)
6 years ago
FDInSky 9e4b9d9798 Update generate_proposal_labels_op to support CascadeRCNN. (#17200)
6 years ago
FlyingQianMM 9ed2f936f1
add target assign operator for supporting retinanet (#17893)
6 years ago
chengduo 24e988a471
Fix bug of scope_buffered_ssa_graph_executor (#18100)
6 years ago
whs 354643d8d9
Add warning for cudnn warpctc kernel in CUDA9\CUDA10. (#18046)
6 years ago
Yiqun Liu 660c1a65f3
Optimize fused_elewise_activation_grad op. (#18041)
6 years ago
lidanqing f8ecc3de89 refactor the function ConvFwdPrimitiveDesc (#17897)
6 years ago
Wojciech Uss 78e932862c Added unit test for QAT FP32 & INT8 comparison (#17814)
6 years ago
tensor-tang 566bf2ec56
concat op support negative axis (#18045)
6 years ago
Yiqun Liu 7e463c84a6
Optimize the concat and split cuda implementation for cases when the number of inputs/outputs is less than 5. (#17979)
6 years ago
tangwei12 101f74cb19
fix save/load in fleet (#17675)
6 years ago
Guo Sheng a06b316b94
Fix GetExpectedKernelType of add_position_encoding_op (#17935)
6 years ago
wawltor 8eb134c3c1
Fix scatter and gather op when has duplicate index (#17952)
6 years ago
lujun 75fcd29220
update load_error_info, test=develop (#18000)
6 years ago
wawltor 2ae8decc90
test=develop (#17984)
6 years ago
cjt222 871af28d6c
add deformable psroi pooling (#17827)
6 years ago
SunGaofeng 40885c225b
add unfold op (new op),test=develop (#17944)
6 years ago
Jacek Czaja 84bb45c054 [MKL-DNN] Thread-Safety for MKL-DNN reusing Part 1 (#17965)
6 years ago
石晓伟 bce259e5bf
Update the Anakin interfaces for content-dnn and MLU (#17890)
6 years ago
Tao Luo 53fd507bae
fix merge conflict of 'Remove attribute in Allocator::Allocate' and elementwise_add_mkldnn_op (#17949)
6 years ago
jerrywgz aab4d12c0e
refine GetExpectedKernelType in conat op, test=develop (#17934)
6 years ago
Zeng Jinle 3ece61f71e
Remove attribute in Allocator::Allocate (#17878)
6 years ago
Yibing Liu 33d1e56506
Enable seq_pool op to accept len 0 input (#17284)
6 years ago
Yihua Xu 9b5017366a Fix the format issue when 'X' is not nchw. (#17833)
6 years ago
Hongyu Liu 8062bd510c
Reshape support tensor attribute (#17781)
6 years ago
Zeng Jinle 0a96ec699c
fix conv v7 workspace size limit error, test=develop (#17902)
6 years ago
Yihua Xu 14a32bf0c4 Fix the accuracy issue while using float precision to get the scale. (#17884)
6 years ago
gongweibao fbbdc9ccad
Add backward and optimizer operator dependency pass. (#17746)
6 years ago
baojun e2c1b7c354 [NGraph] cache compiled function instead test=develop (#17845)
6 years ago
Zhaolong Xing ae576f3c68
fix: when use the load model from memory mode, the RAM occupy is high (#17788)
6 years ago
Zhaolong Xing 5efe8c7287
fix bug: the lod_tensor_to_array op will aplly a new var but not release when dong inference (#17856)
6 years ago
pawelpiotrowicz 39bc8a55a4 [NGraph] Enable ngraph layer_norm operator (#17599)
6 years ago
baojun a4c528a31c [NGraph] some ngraph updates to enable bert (#17739)
6 years ago
baojun 7611208ab7 [NGraph] added gather_grad to ngraph test=develop (#17646)
6 years ago
jerrywgz 92d9bdfce2
fix api doc in slice op, test=develop (#17804)
6 years ago
Hongyu Liu dfec676270
expand op supprt tensor attribute (#17773)
6 years ago
Hongyu Liu 82358bfdc1
ont hot support tensor depth (#16972)
6 years ago
Brian Liu 7cfddf22c8 Optimize bilinear interpolate op with OpenMP (#17800)
6 years ago
wangchaochaohu c10157a5df
revise the cudnn conv choose algorithm to improve the performance(mask rcnn benchmark) (#17753)
6 years ago
mozga-intel 6a6bf597f7 [NGraph] Enable elementwise_div operator test=develop (#17515)
6 years ago
lidanqing d7c5c2bd64 Add input format in Transpose GetHash (#17737)
6 years ago
baojun 2c58f1a83c [NGraph] Added lookup table to ngraph engine test=develop (#17647)
6 years ago
pawelpiotrowicz bacc822492 [NGraph] Enable transpose ngraph operator (#17636)
6 years ago
baojun 90eae0b39a [NGraph] Addded slice op to ngraph test=develop (#17648)
6 years ago
baojun 2fbaa5c075 [NGraph] added matmul op to ngraph engine test=develop (#17645)
6 years ago
Bai Yifan bba57cdd82
Add deformable conv v2 op,test=develop (#17145)
6 years ago
YishengCheng bd15912d65 fix bug for ctr_reader for svm data (#17575)
6 years ago
Yiqun Liu 8fd39f3e99
Enhance fused_elementwise_activation op and add python api in contrib.layers (#17236)
6 years ago
Yiqun Liu 2704479bb2
Optimize recurrent_op using Prepare and RunPreparedContext, avoiding create operators in every iter. (#17689)
6 years ago
pawelpiotrowicz 9b99876442 Enable less_than ngraph operator (#17642)
6 years ago
hutuxian 4ff87c049d
remove useless input 'Softmax@GRAD' from softmax_with_cross_entropy op (#17612)
6 years ago
pawelpiotrowicz 70a887af63 [NGraph] Add reduce_sum operator for Ngraph (#17450)
6 years ago
baojun 29baca0dd8 add depthwise_conv2d op to ngraph engine (#17454)
6 years ago
gongweibao 0d561ef442
fix 2dconn test=develop (#17681)
6 years ago
mozga-intel ccf9e2327b [Lite] Enable cast operator test=develop (#17294)
6 years ago
Yiqun Liu 5782dddad0
Optimize the concat and split kernel for specical cases when the number of inputs/outputs is 2 (#17415)
6 years ago
lidanqing 04b6c29ee0 Improve mobilenetv2 INT8 performance by using INT8 relu as post-op (#17570)
6 years ago
Tao Luo 962eed6f82
Revert "Enable SQRT operator for the nGraph Bridge (#17549)" (#17680)
6 years ago
Krzysztof Binias f34830e2aa Enable SQRT operator for the nGraph Bridge (#17549)
6 years ago
Sylwester Fraczek 96845d2168 add Concat quantization (#17448)
6 years ago
gongweibao 65bbf950ee
Add multi-ncclcomm and 2D ncclallreduce support. (#17263)
6 years ago
Krzysztof Binias b1bd483a7d [NGraph] Enable gelu operator for the nGraph Bridge. (#17547)
6 years ago
chengduo 343017324e
Polish Print Op (#17651)
6 years ago
Zeng Jinle 4aa931dd85
Code clean of Allocator (#17602)
6 years ago
Guo Sheng 430e25654b
Fix the usage of out_grad lod in sequence_slice_op. (#17625)
6 years ago
hutuxian 1670db5e86
Gather Op Index Support int64_t datatype (#17610)
6 years ago
mozga-intel 2b83d75bfa Enable elementwise pow operator for ngraph (#17526)
6 years ago
Zhaolong Xing 61221ebc28
TRT: Support set dynamic range in int8 mode. (#17524)
6 years ago
Michał Gallus 0c39b97b4e [MKL-DNN] Add Fully Connected Op for inference only(#15226)
6 years ago
Krzysztof Binias e9216d0602 Enable logical operators for the nGraph Bridge. (#17543)
6 years ago
Yibing Liu e8990e64f6
Fix trust ratio in lamb (#17614)
6 years ago
chengduo b5f4d5ed0e
Add broadcast operators (#17503)
6 years ago
Sylwester Fraczek bccb0ba49a fix quantize_squash_pass segfault when no tensor linked to Bias (#17292)
6 years ago
mozga-intel 0d4cbdad91 [NGraph] Enable elementwise mul operator (#17552)
6 years ago
mozga-intel f2694e122d [NGraph] Enable assign operator for a ngraph, test=develop (#17437)
6 years ago
mozga-intel cf02cb5e98 Enable elementwise sub operator for ngraph (#17527)
6 years ago
tensor-tang 7ae461eb13
[CPU] refine cpu softmax bwd (#17534)
6 years ago
tensor-tang 0600b370ea
[CPU] refine softmax op fwd on CPU (#17522)
6 years ago
mozga-intel 035771512d Enable elementwise min operator for ngraph (#17521)
6 years ago
jerrywgz c1aae8b8d2
Fix GetExpectedKernelType in Concat op (#17459)
6 years ago
Qiao Longfei 58f7695ab2
Async exe support communicator (#17386)
6 years ago
mozga-intel 109b5aed5a [NGraph] Enable reshape operator test=develop (#17512)
6 years ago
Krzysztof Binias 43d15b9d96 Enable square operator for the nGraph Bridge. (#17551)
6 years ago
Sevin F. Varoglu f86f49e779 [NGraph] add increment op to ngraph engine (#16929)
6 years ago
baojun 8923612b10 NGraph enable parse serialized graph test=develop (#17453)
6 years ago
guomingz 2281ebf0f3 Enable the convolution/relu6(bounded_relu) fusion for FP32 on Intel platform. (#17130)
6 years ago
Yibing Liu f9796b1249
Add LAMB Optimizer support (#17489)
6 years ago
mozga-intel 99ab57123c Enabled ngraph elementwise max operator (#17517)
6 years ago
Tao Luo 3d19f44a89
remove unused SERIAL compiler option (#17500)
6 years ago
mozga-intel 1eb151752e Enable abs operator for a ngraph test=develop (#17436)
6 years ago
liuwei1031 ba70cc499e
fix security bugs : (#17464)
6 years ago
Zhaolong Xing ff7f911b4d
add quant_dequant_moving_avg_max_abs op (#17480)
6 years ago
Qiao Longfei 287de41c04
Optimize communicator flags (#17494)
6 years ago
lvmengsi 10b23a72c1 Double backward elementwise div (#17416)
6 years ago
Zeng Jinle 3d4e8268c6 fix recurrent fwd bug when no backward and scope clear (#17460)
6 years ago
lvmengsi 977e9fcb27
support elementwise_sub double backward (#17476)
6 years ago
chengduo 5a6ab38013 Add record event And remove CSP (#17447)
6 years ago
Yan Xu 0217555530 polish parallel dygraph code (#17164)
6 years ago
Bai Yifan 3a9ae28d32
fix assert,test=develop (#17445)
6 years ago
zhaoyuchen2018 b02f2aff04
Add conditional compile for gru opt (#17368)
6 years ago
Zeng Jinle 712bfb17cb
fix recurrent_op,test=develop (#17433)
6 years ago
mozga-intel 6ee6700fac Eanble stack operator for a Ngraph, test=develop (#17406)
6 years ago
Krzysztof Binias 0823a7bc8b Optimize the sequence padding op (#17403)
6 years ago
baojun 1ce7b45b9e NGraph Added fill_zeros_like op test=develop (#17295)
6 years ago
baojun 910196524d NGraph Added dropout and dropout_grad to ngraph test=develop (#17320)
6 years ago
mozga-intel b189480734 Ngraph Enable gather operator test=develop (#17296)
6 years ago
lvmengsi 4ef631013c Double backward sqrt (#17387)
6 years ago
lvmengsi 5d1ac41b00 Double backward reduce mean (#17372)
6 years ago
jerrywgz 0cae5a36b6
enhance generate mask labels, test=develop (#17380)
6 years ago
Kaipeng Deng bd9bef5a4e
add elementwise_add_grad_grad op (#17366)
6 years ago
jerrywgz 1c6d064627
add collect fpn proposals op,test=develop (#16074)
6 years ago
Kaipeng Deng 60be66e2c0
support fc_op double grad (#17317)
6 years ago
liuwei1031 0863599323
Fix the uninitialized gru_value.output_value. (#17197)
6 years ago
Yihua Xu 218d8d8f73 Optimize the computing kernel of sequence_reverse operator (#17349)
6 years ago
Yiqun Liu dcda20233c
Optimize the elementwise op using eigen (#15494)
6 years ago
Kaipeng Deng 8bae8590ac
add double grad for elementwise_mul op (#17255)
6 years ago
Kaipeng Deng 11d3a38f25
add double grad for square op (#17173)
6 years ago
zhoukunsheng d4b67e1692 Add Where Op(#16793)
6 years ago
zhoukunsheng 1bfff02047 Add Diag Op(#17027)
6 years ago
zhaoyuchen2018 8a2caacdbc
improve gru unit performance. (#16338)
6 years ago
qingqing01 e32c9888f5
Double backward of conv2d. (#17211)
6 years ago
Zeng Jinle fff270eacd
follow comments,test=develop (#17273)
6 years ago
zhoukunsheng 4292bd8687 Mod floordiv (#17251)
6 years ago
xiaoting 9ed4aaada4 modified formula for Lrn (#17281)
6 years ago
zhaoyuchen2018 792443ef23
Refine elementwise kernel. (#16952)
6 years ago
Yiqun Liu 6b84688ba2
Optimize the cuda implementation of sum_op (#17283)
6 years ago
chengduo db5e74ab95
update assert (#17282)
6 years ago
Hongyu Liu c3195de522
Fix concat shape check (#17247)
6 years ago
whs 7d7e29957f Fix bp of roi perspective transform op. (#17216)
6 years ago
baojun 7bd1d03ee5 Adding lrn op for ngraph engine (#17189)
6 years ago
gongweibao 91784f8ec3
Fix code in document. (#17237)
6 years ago
Zeng Jinle 4f8594088d
Enhance inplace/mem-opt pass and enhance softmax_with_cross_entropy op inplace (#17225)
6 years ago
baojun e782b54b9c update sofmax with axis arg test=develop (#17190)
6 years ago
Kaipeng Deng a71d8fdb87
Softmax_cross_entropy op add axis (#16806)
6 years ago
Zhen Wang a914d9b116
Quant output scale (#17215)
6 years ago
zhaoyuchen2018 32b62c25af
optimize sum op (#16820)
6 years ago
石晓伟 a72dbe9abf
Cherry-pick benchmark related changes from release/1.4 (#17156)
6 years ago
jerrywgz ef66baedc0
Refine api doc (#17230)
6 years ago
jerrywgz cc95a7516c
fix distribute fpn proposals, test=develop (#16152)
6 years ago
Zeng Jinle ee2028a110
Add use_cuda to inplace pass (#17205)
6 years ago
jerrywgz a72907bbf4
Enhance concat op to support empty input. (#17015)
6 years ago
Zeng Jinle 4e1bc6e805
Rewrite inplace pass and fix gc bug (#17126)
6 years ago
Zeng Jinle 08773b6069
fix reader default stream,test=develop (#17106)
6 years ago
xiaoting bc48453b73 polish the label_smooth (#17138)
6 years ago
Leo Zhao bf4b21fa3d fix assertion failure issue when test_analyzer_bert uses ngraph (#17148)
6 years ago
tangwei12 deb510d451
cvm op feature (#17081)
6 years ago
Zeng Jinle 28d69d710a
Refine dropout gpu memory (#17095)
6 years ago
Huihuang Zheng b9494058b3
Use CudnnWorkspaceHandle in exhaustive search (#17082)
6 years ago
xiaoting 7da7881c0e Detailed coordinate description for yolov3 loss (#17007)
6 years ago
ceci3 258e000be6
test=develop, double backward leaky_relu (#17067)
6 years ago
Kaipeng Deng 10c487eb21
fix interpolate cu. test=develop (#17101)
6 years ago
whs 55ce36e981
Speedup roi_perspective_transform op by caching the information of linear interpolation in forward (#17090)
6 years ago
Yan Xu 0b07eef118
ParallelDyGraph with GPU collective mode (#16827)
6 years ago
Zeng Jinle 0c335dcd2c
Make conv cudnn workspace size configurable (#17036)
6 years ago