16827 Commits (289edf3962f039394452bfccafcd70ce3c3dde0f)

Author SHA1 Message Date
yaoxuefeng 660ff18488
fix datsset test=develop (#23043)
5 years ago
Zhang Ting 714b0076b6
Override GetKernelTypeForVar to avoid device transform, test=develop (#23032)
5 years ago
wangchaochaohu 112e3edbf6
fix the conv group problem test=develop (#23025)
5 years ago
Wilber db40ee86db
fix unittets. test=develop (#23018)
5 years ago
wangchaochaohu 99db0cf762
remove debug log test=develop (#22994)
5 years ago
wangchaochaohu 3757e0687c
Add Unittest for backward of fusion group (#22932)
5 years ago
chengjuntao 63f3ada7b9
fix bug which input shape (#22965)
5 years ago
Zhang Ting 137d6563fc
add check for assigned data, test=develop (#22960)
5 years ago
wangchaochaohu f0d193a23c
Cast fusion for fusion group (#22876)
5 years ago
yaoxuefeng 29a7a52d38
Fix instag (#22632)
5 years ago
wangchaochaohu c979c9f2b0
refine the profiler print test=develop (#22968)
5 years ago
Wilber ff3ddbb502
add skip_layernorm pass. test=develop (#22895)
5 years ago
wawltor f154d5860f
Speed up the matmul op, use the gemm replace the batch gemm (#22926)
5 years ago
Adam 056edf3929
Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695)
5 years ago
Zhaolong Xing 8d6dc102fe
[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494)
5 years ago
guofei 3d8571e884
modify assign op and add unittest of assign op (#22769)
5 years ago
Zeng Jinle d33c4343e1
Imperative tracer refactoring (#22457)
5 years ago
liu zhengxi 61fef9754b
Fix fc padding bug during inference fusion (#22860)
5 years ago
tangwei12 ad9c8f6d2d
fix communicator when break under pyreder mode (#22911)
5 years ago
mapingshuo 5ba9dfc16a
add lookup_table_dequant_op (#22900)
5 years ago
zhaoyuchen2018 a020a25797
Fix model int8 quant fail, test=develop (#22891)
5 years ago
Zhaolong Xing dd67d44a50
[Paddle-TRT] : (Part1) Dynamic shape support (#22868)
5 years ago
tangwei12 07e13b84cd
remove vlog, test=develop (#22898)
5 years ago
Zhang Ting ca9c8b417d
fix compute ratio of profile, test=develop (#22872)
5 years ago
wangchaochaohu dbb0b9b3b6
refine the profiler print (#22823)
5 years ago
Michał Gallus 0038bfbd1d
Prevent loading of warmup data in analyzer_int8 if enable_int8 is set to false (#22857)
5 years ago
Chen Weihang 1644926a6c
Polish detail implement of dygraph data loader (#22878)
5 years ago
Wilber f686310d81
fix concat_mkldnn op. test=develop (#22692)
5 years ago
hong 5191e54494
reduce default attrs for dynamic graph (#22850)
5 years ago
Zhaolong Xing 1a533ed2de
[BUG]: Multihead matmul op's ouput size should be BxSx(N*H) (#22848)
5 years ago
hong c736fef93b
dygraph backward engine accelerate (#22808)
5 years ago
Zeng Jinle d41d802ba3
Add flags to limit gpu memory (#22793)
5 years ago
石晓伟 1861ca88f1
serialize the PaddleTensor, test=develop (#22810)
5 years ago
Zhang Ting 72ff5a09c3
fix print bug of profile, test=develop (#22804)
5 years ago
Zhang Ting 4e8bc02461
add fluid.device_guard to specify the device type for Op (#22254)
5 years ago
石晓伟 ddb9b46fec
change the function in op_teller, test=develop (#22794)
5 years ago
Zhen Wang 89cfa49156
Unmerged fetch list (#22635)
5 years ago
wangchaochaohu 8456c3f4dd
polish the profiler_help code (#22811)
5 years ago
zhongpu 2fd1ec1e3e
fix docker build for paddle openblas, test=develop (#22795)
5 years ago
Chen Weihang 7d8d573453
Speed up dygraph DataLoader based on shared memory and LoDTensor serialization (#22541)
5 years ago
liu zhengxi 324f2b3922
Fix inference c api PD_GetZeroCopyOutput lod (#22768)
5 years ago
wangchaochaohu 7578fcbac4
Profile code refine (#22800)
5 years ago
hutuxian 53a2b68f4e
support customized download command in dataset (#22782)
5 years ago
wangchaochaohu ca9e77a8d4
add sum op support for fusion group (#22771)
5 years ago
tianshuo78520a 433cef03e5
fix typo word (#22784)
5 years ago
Kaipeng Deng ebc7ffc300
fix detection_map. test=develop (#22705)
5 years ago
zhaoyuchen2018 72dde4abde
Refine adam op to improve performance, test=develop (#22346)
5 years ago
wangguanzhong f2d1cd119a
fix lod level, test=develop (#22755)
5 years ago
FlyingQianMM 79d712346f
Correct CPU gradients of the argsort op (#22739)
5 years ago
Adam 2b80e9a719
Add cpu_info without XBYAK (#22716)
5 years ago
guofei ae8b5f11a3
Change ShareDataWith() to TensorCopy() in ref_by_trainer_id (#22717)
5 years ago
liu zhengxi 71ab0458e1
Fix pointer and c-api encapsulation (#22663)
5 years ago
Leo Chen b2c1be851a
support cond in clone, test=develop (#22657)
5 years ago
Zhang Ting f97f3f9301
add framework overhead ratio in profile report (#22590)
5 years ago
zhouwei25 160d0f1308
fix the CI risk that network cannot be connected (#22736)
5 years ago
chengjuntao 15c2667143
register fp16 for assign op (#22744)
5 years ago
zhangchunle 882e7f7c3b
Directly getting API.spec for tools/sampcd_processor.py (#22728)
5 years ago
dyning 1c0653462d
fix generate_mask_labels lod level (#22743)
5 years ago
GaoWei8 ba140222d6
fix compile&runtime lod_equality of lod_reset (#22737)
5 years ago
hutuxian 175954d894
PaddleBox Framework Part2 (#22466)
5 years ago
ShenLiang 3132681e8a
add partial_sum op in contrib (#22292)
5 years ago
wangchaochaohu 611411b90e
Fusion group profile support (#22718)
5 years ago
ShenLiang e136661304
add partial_concat op in contrib (#22528)
5 years ago
GaoWei8 cdf5f6fb8c
Add an inference interface to disable FC padding (#22097)
5 years ago
tianshuo78520a d2ba91aad1
fix typo words (#22653)
5 years ago
Yibing Liu 6e7bfe30a6
register fp16 kernel for some ops (#22650) (#22696)
5 years ago
tangwei12 66a3150135
SYNC with communicaotor (#22344)
5 years ago
Yiqun Liu 22bbd54719
Add the support of fp16 in fusion_group (#22239)
5 years ago
flame d97475d53b
fix CPU C inference API compile bug (#22702)
5 years ago
Huihuang Zheng adfa5b8354
Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp (#22673)
5 years ago
flame 74eb82de19
fix go api bug (#22669)
5 years ago
wangchaochaohu a089072c8b
fix the profile print error (#22665)
5 years ago
lidanqing d926214535
[UT coverage] improve the mul_mkldnn_op line coverage (#22408)
5 years ago
wangchaochaohu c65c6ae534
add flag to control profile level in python API (#22319)
5 years ago
123malin 00594c1c88
support dumping params/grads in transpiler mode (#22490)
5 years ago
Zhaolong Xing a06d75a280
[Paddle-TRT] Refine the error log about runtime batch and max_batch_size. (#22535)
5 years ago
Adam 608447bfd5
Update MKLDNN to v1.2 (#22521)
5 years ago
Adam ab610a34ff
transpose_mkldnn code change to meet Paddle standards (#22591)
5 years ago
Jiawei Wang 8f035fb637
Add TopK Op Grad CPU&GPU Kernel test=develop (#22628)
5 years ago
Steffy-zxf 90ee366653
update ops's unittest data type from float32 to float64 and shape over 100 (#22544)
5 years ago
flame f7eafca828
remove python inference warning (#22602)
5 years ago
Chen Weihang fe685cc185
fix enforce test error, test=develop (#22610)
5 years ago
Wilber 9a8203aa25
fix fc_lstm_fuse when multi sub-graph use same fc_bias. test=develop (#22551)
5 years ago
Chen Weihang 266106da75
Fix mismatch with plus sign in the line (#22588)
5 years ago
flame 1d503e6a9e
Golang inference API (#22503)
5 years ago
Zhaolong Xing 8acd745c25
[Ernie GPU Optim]: Fuse three fc to multihtead matmul (#22486)
5 years ago
Yiqun Liu 96770f519e
Disable fusion_group for windows and mac in build_strategy. (#22549)
5 years ago
Zeng Jinle 08033c8634
fix traced layer with non persistable vars, test=develop (#22552)
5 years ago
Guo Sheng 31b5464632
Add support for dynamic_decode(while) training. (#22231)
5 years ago
tangwei12 b0675c8193
fix bug with compiledProgram (#22495)
5 years ago
Wojciech Uss 4cddb43c5c
Add support for Ernie NLP model to the Slim QAT (#22506)
5 years ago
Double_V 58d99247f4
support slice double grad, test=develop (#22166)
5 years ago
hutuxian 1a7962be97
Paddlebox about box_wrapper (#22497)
5 years ago
huzhiqiang 9e29d3ebed
【OpPorting Example】DEMO OF FIX COMPILE&RUNTIME LOD_EQUALITY (#22460)
5 years ago
yaoxuefeng 2235ee1a5e
multi-loss optimization by adding a DownpourOpt worker (#22025)
5 years ago
zhaoyuchen2018 54970444ce
Improve transpose performance with tile sm copy, test=develop (#22311)
5 years ago
Wilber a90fa54092
Compile without nccl deps. [1/2] (#22509)
5 years ago
guofei 3a59a7a11f
Make assign op support LoDTensorArray and modify while_loop API (#22309)
5 years ago
Zhaolong Xing 54a325a52f
[Refine Paddle-TRT INT8]: Support PaddleSlim's Resnet50, Mobilenetv1, Yolov3 models for Inference. (#22483)
5 years ago
zhongpu 5739eeb9fa
add cp27-cp27m-gcc82 and cp27-cp27mu-gcc82 branch to support gcc8.2 compile for paddle, test=develop (#22504)
5 years ago
Wilber de009152a7
Compile without nccl deps. [2/2] (#22484)
5 years ago
Yiqun Liu 4b2227e958
Fix dismatch of std::max's arguments type on windows. (#22507)
5 years ago
Wilber 870f465887
fix test_fusion_seqpool_concat lod level between compile and runtime (#22488)
5 years ago
Zhong Hui a61d09527b
Fix the integer overflow problem of sequence2batch (#22479)
5 years ago
cc 197913ebe1
Add weight quantization in post_training_quanzitaion (#22445)
5 years ago
Yiqun Liu dcfb603897
Enable the detection of subgraph composed of grad ops (#21223)
5 years ago
Tao Luo 7c9ce097f1
refine reshape_op shape error message (#22480)
5 years ago
LielinJiang 2b1386b2b2
optimize performance of interpolate op (#22436)
5 years ago
wangchaochaohu 77dd0d97bb
use enum class to replace the usage of enum in some condition test=develop (#22464)
5 years ago
Yiqun Liu 44b45b9f07
Correct the use of DeviceContext in unittest sequence_pooling_test and sequence_padding_test (#22456)
5 years ago
joanna.wozna.intel 17f2c0899f
Add dequant-scale squash (#22409)
5 years ago
mapingshuo 9c4deedbc2
update readme of imdb training demo (#22455)
5 years ago
Zhaolong Xing ceda0b9b1a
[Fix BUG]: Core when multi thread + clone + paddle-trt (#22442)
5 years ago
Wilber 7bc4b09500
add WITH_NCCL option for cmake. (#22384)
5 years ago
Tao Luo 943cb8c664
fix sigmoid cudnn bug (#22439)
5 years ago
xujiaqi01 d51ffe860a
fix copy table bug (#22432)
5 years ago
Leo Chen 822e5b36ec
Support int16 for Tensor (#22423)
5 years ago
石晓伟 e1b0d7cbb1
remove anakin from code, test=develop (#22420)
5 years ago
liu zhengxi 0404e7a985
Update the precision of pad, pad2d, pad_constant_like's unit tests from fp32 to fp64 (#22394)
5 years ago
xujiaqi01 371f377bea
add GeneralRoleMaker (#22295)
5 years ago
Michał Gallus 269db0d1d1
[DNNL] Fix accuracy in INT8 FC (#22404)
5 years ago
joanna.wozna.intel fb3086fd57
[UT coverage]Remove unnecessary transpose op registration (#22402)
5 years ago
lidanqing ade5022681
[UT Coverage]Improve sum_mkldnn_op line coverage (#22275)
5 years ago
joanna.wozna.intel 3099d9d47c
Restore requantize squash (#22399)
5 years ago
Wojciech Uss 92462e948d
improve elementwise_add_mkldnn_op test code coverage (#22359)
5 years ago
ceci3 20f30dd604
add benchmark flag for conv_transpose (#22389)
5 years ago
Leo Chen b96c7c9a7a
polish code, test=develop (#22380)
5 years ago
Chengmo 8f36c39537
Fix GEO-SGD init & send Bug (#22375)
5 years ago
zhupengyang c6f888e5a5
update unittest accuracy to float64 for relu, prelu, maxout (#22273)
5 years ago
wangchaochaohu 0d8b222b79
Optimize the depthwise op test=develop (#22265)
5 years ago
Leo Chen aaa4fe491a
use function instead of lambda, test=develop (#22348)
5 years ago
Adam e7a9f6bbb7
[Bugfix] Preserve shape in inpalce operators (#22360)
5 years ago
qingqing01 2d20869c94
Fix infer_shape in compling for elementwise_op (#22291)
5 years ago
Yiqun Liu b7cac50b64
Implement a common python unittest to test the ir passes. (#22209)
5 years ago
tangwei12 82bc814a57
integrated HALF_ASYNC to communicator (#21869)
5 years ago
wangchaochaohu 1e932eccfa
remove unused code test=develop (#22327)
5 years ago
Leo Chen 3e5744aa65
Remove unused inputs for some operators (#22284)
5 years ago
zhangchunle 805328e13b
fix typo in error message (#22312)
5 years ago
lidanqing 895f8da7d6
change std::cout to log(INFO), vlog (#22316)
5 years ago
石晓伟 8cb04664b9
revert paddle_fluid.map, test=develop (#22236)
5 years ago
Chen Weihang 35efbe6d95
Speeding up dygraph DataLoader with multiprocessing (#21762)
5 years ago
Zeng Jinle 9435533adf
remove op_use_default_grad_op_maker.spec, test=develop, test=document_fix (#22300)
5 years ago
wangchaochaohu 7b76a76495
fix the conda build confilict test=develop (#22279)
5 years ago
Zeng Jinle 5e601a92ad
polish grad op check (#22290)
5 years ago
Bai Yifan faba4b116a
Remove disable flag in test_fsp_op.py (#22171)
5 years ago
Zhen Wang e40cfb1010
fix the bug of assert_is_op_output. test=develop (#22262)
5 years ago
Wojciech Uss d3a6647372
improve placement pass tests code coverage (#22197)
5 years ago
liu zhengxi 07afc29e90
Make api.cc malloc consistent with paddle_api.h for PaddleBuf (#22255)
5 years ago
silingtong123 4f1da4adcb
remove the useless third_party library from C++ inference library (#22021)
5 years ago
zhouwei25 549e6de7ac
faster build by reduce by-product, reduce linking library and fix compile warning of std=c++11 (#22164)
5 years ago
xujiaqi01 e3a457d34b
add collective communication library in fleet (#22211)
5 years ago
Zhen Wang f2522e91c4
fix the type error caused by setting bool attr in OpDesc. test=develop (#22257)
5 years ago
songyouwei 0ba1d140d4
Add CI check for sequence ops' unittests (#21615)
5 years ago
Zeng Jinle 1b76e789cf
remove cuda allocator ctor, test=develop (#22212)
5 years ago
Adam 9942d9ed5c
Add caching mechanizm to requantize_mkldnn_op (#22223)
5 years ago
Wilber 1230c110cb
[fluid-lite] adjust to relative error (#22232)
5 years ago
123malin 985bceac53
Bug fix for sparse recorder (#21969)
5 years ago
Chen Weihang fc0b21e17b
Polish fetch error message of parallel executor (#22206)
5 years ago
Wojciech Uss 2e90c4eb0a
improve mkldnn_quantizer_config test code coverage (#22216)
5 years ago
Wilber 5750152e80
support fluid-lite subgraph run resnet test=develop (#22191)
5 years ago
wangchaochaohu 621d3e0b66
fix the bug of profile update (#22207)
5 years ago
FlyingQianMM 443a713c9e
add backward gradient computation for op argsort (#22203)
5 years ago
Zhen Wang 46189b166d
Add bn and relu fuse pass (#22048)
5 years ago
zhouwei25 2f3e2a84af
fix ci rule to show Shell variables (#22177)
5 years ago
baojun 298ee7d28a
Improve ngraph file line coverage (#22155)
5 years ago
zhongpu d0f0a2520c
test Optimizer in dygraph (#21949)
5 years ago
石晓伟 ad0dfb17c1
[Feature] Lite subgraph (#22114)
5 years ago
joanna.wozna.intel 5b2e98aa17
Add multiple quantize operators fuse (#22062)
5 years ago
Yiqun Liu 96980c2244
Polish the PADDLE_ENFORCE in fusion_group pass related codes. (#22144)
5 years ago
wangchaochaohu c3876cf82d
add support for nested profiling event and printing in different level (#22061)
5 years ago
Zeng Jinle c3bcd3c1e2
fix dygraph non zero gpu bug, test=develop (#22165)
5 years ago
zhaoyuchen2018 3d4f2aa689
Refine stack op to improve xlnet performance, test=develop (#22142)
5 years ago
zhongpu cf475f95df
Remove FC in dygraph, modify FC to Linear in sample code (#22082)
5 years ago
liu zhengxi 64a4044292
add double register op_data_type of pad2d and fix compile error, test=develop (#22075)
5 years ago
Liu Xudong 7ba7acd197
Add coverage tools (#21975)
5 years ago
Double_V 6ea3809143
Support prroi_pool_op with Tensor and LoDTensor rois (#20649)
5 years ago
Pei Yang d8a9b134e3
fix trt instance_norm serialize bug. test=develop (#22152)
5 years ago
zhongpu cc1a9f4238
fix sample code in paddle/fluid/imperative/README.md (#22141)
5 years ago
Zeng Jinle 4c2df8e4d4
fix allocator strategy comment, test=develop, test=document_fix (#22121)
5 years ago
bingyanghuang 7872d06ff4
Add explanation on conv grad for dims<3 (#22125)
5 years ago
liu zhengxi 724b13e459
fix xception precision problem, test=develop (#22124)
5 years ago
Yiqun Liu b1401fb74d
Remove subgraph_detector from inference/analysis to the common framework/ir directory. (#22094)
5 years ago
Pei Yang 50bee83f71
add TRT support for instance_norm op (#21928)
5 years ago
zhaoyuchen2018 3dbd4087fe
Fix windows build not kernel issue, test=develop (#22105)
5 years ago
Chengmo 418abc92f4
Update pyramid related OP (#21372)
5 years ago
bingyanghuang 4b4a9cc88f
fix format in operator.cc (#22101)
5 years ago
Feiyu Chan 14aebc7a95
add erf op (#21785)
5 years ago
Chen Weihang ba8414d3a5
replace CUDNN_ENFORCE with PADDLE_ENFORCE_CUDA_SUCCESS, test=develop (#22109)
5 years ago
silingtong123 6c20e7c4e6
test=develop, remove unused parameter from class RuntimeInferShapeContext constructors (#22046)
5 years ago
Double_V fab4b0765a
support elu_op double grad (#21822)
5 years ago
Pei Yang 0a51098a71
Add TRT support for BERT (#21135)
5 years ago
Jacek Czaja b0b27ff699
[MKL-DNN] Conv grad and Batch Norm grad NHWC support (#22088)
5 years ago
Huihuang Zheng dd4361568e
Add ParallelExecutor Test for Cond API and Fix PE Checks Shape Bug (#22029)
5 years ago
Zeng Jinle 9587249442
polish allocator strategy doc, test=develop, test=document_fix (#22095)
5 years ago
Zeng Jinle d9f5d1eb29
ag allocator by default, test=develop (#21837)
5 years ago
123malin 7fb817d447
add distributed_strategy (#21710)
5 years ago
Jacek Czaja ad8a9cb82c
[MKL-DNN] Pool & LRN Grad Ops NHWC support (#21747)
5 years ago
Kaipeng Deng 34c57120eb
polish cross_entropy ENFORCE (#22056)
5 years ago
SunAhong1993 7f4abaf2f5
register int/int64_t/float16 in pow/square kernel,test=develop (#22023)
5 years ago
Leo Chen 3f653c8323
register NoNeedBufferVarsInference for max_pool_grad_op, test=develop (#22055)
5 years ago
Yiqun Liu d48320777e
Add the first implememtation of fusion_group op (#19621)
5 years ago
Michał Gallus 6192108408
[DNNL] 3D Fully-Connected (#21746)
5 years ago
FDInSky aa2ed0dcc6
fix generate_proposal_labesl op (#21793)
5 years ago
ceci3 95d79b6d00
update error log for batch_norm_grad (#22017)
5 years ago
Aurelius84 c53b62eb8e
fix integer overflow in match_matrix (#22036)
5 years ago
Chen Weihang 2e9082250d
polish default error msg & cublas error hint, test=develop (#22032)
5 years ago
wangchaochaohu 64baee4144
polish code test=develop (#22014)
5 years ago
Chen Weihang 35ff1568e9
Add error message for cublas inItizalize failed (#21995)
5 years ago
Chen Weihang fbb42173a9
fix no hint problem when use ENFORCE for cuda, test=develop (#21994)
5 years ago
zhouwei25 e66f92d1ae
Modify demo_ci to support Windows, prepare for PR_Windows_Inference (#21873)
5 years ago
danleifeng b7697f6218
fix broadcast bug;test=develop (#21898)
5 years ago
liu zhengxi 196e20dfbb
Fix multi-threads memory out of bounds error for passes (#21920)
5 years ago
zhaoyuchen2018 8859ddd6cf
Refine multihead kernel, align block to 32 (#21961)
5 years ago
silingtong123 fd9b00df4b
test=develop, remove unused variable (#21974)
5 years ago
zhoushiyu cee2ccb078
add shuffle batch op (#21674)
5 years ago
mapingshuo c3e1954918
make reverse op support negative axis (#21925)
5 years ago
石晓伟 03479469a7
fix multi-thread error of fc_gru_fuse_pass.cc, test=develop (#21841)
5 years ago
wangchaochaohu de9ba01f11
add conda build python script test=develop (#21943)
5 years ago
Aurelius84 10d6846900
Remove double registered dataType in Pad2d (#21942)
5 years ago
zhouwei25 2df4be5d35
Fix openblas bug to support compile on windows when WITH_MKL=OFF (#21902)
5 years ago
hutuxian 27decacb8a
fix aucop stat shape (#21846)
5 years ago
Pei Yang 3e5008ad01
fix trt calib not working bug, test=develop (#21934)
5 years ago
Aurelius84 5cb2c74127
add register op_data_type of pad/expand_as et.al (#21718)
5 years ago
qingqing01 2066745847
Pack imperative/layer into paddle_framework.so (#21921)
5 years ago
hong 30d000f8c2
fix matmul error message; test=develop (#21885)
5 years ago
zhouwei25 a01663ca1f
remove patch command and file of cares to Improved quality of Paddle Repo (#21776)
5 years ago
flame 2bbc0d7d60
python zero copy inference, delete pass (#21897)
5 years ago
Aurelius84 51a86d2b6b
Optimize adam speed (#21777)
5 years ago
Leo Chen 310edc0d0c
Update layers used in ptb model to use auto-generated op functions in dygraph mode (#21724)
5 years ago
lidanqing 9dff56e8e2
change qat_performance with mobilenet, change batch_size of qat2_resnet50 (#21895)
5 years ago
FDInSky 6b9fbcf3ad
Update iou_similarity op to support non-normalized bbox (#21671)
5 years ago
guofei 46f9184aff
Modify the while_loop API (#21844)
5 years ago
Guo Sheng 7689b6aaa4
Fix default label dim of label_smooth_op. test=develop (#21862)
5 years ago
zhouwei25 13e4756f18
change ci check rule of deleting unit-test (#21876)
5 years ago
GaoWei8 d4dda8628e
optimize fc jit (#21878)
5 years ago
zhouwei25 013225bb68
fix Execution order of ci_check_unittest, and add it to Linux_py35 (#21640)
5 years ago
Chen Weihang 2b941736f3
fix softmax_with_cross_entropy_fix bug, test=develop (#21810)
5 years ago
Thunderbrook c3cf42d0f7
add table id in cache shuffle (#21585)
5 years ago
Michał Gallus 253e664275
Disable memory opt pass when DNNL is on (#21826)
5 years ago
Chengmo a86f11b5f5
Speed GEO dense calc & communication (#21579)
5 years ago
Wojciech Uss 666c3bb9b0
handle multi-inputs with empty inputs for mkldnn_concat_op (#21827)
5 years ago
Zeng Jinle aa4d6a5d6c
Add some debug flags to auto growth allocator (#21766)
5 years ago
guofei 8b7c50f49a
Make While Op could run on GPU place and add while_loop unittest (#21672)
5 years ago
WangXi 17299b8d21
fix batch_norm_grad infer shape=0 & add allreduce enforce shape, test=develop (#21801)
5 years ago
Huihuang Zheng 557bce77da
Fix Backward Bugs in Conditional Block (#21809)
5 years ago
xujiaqi01 0eb4d990c4
fix compiled error when with_pslib=on (#21769)
5 years ago
Huihuang Zheng 0677a1c1c1
Fix That conditional_block_op Doesn't Have InferShape (#21733)
5 years ago
zhaoyuchen2018 a5a8d14414
Fix softmax cuda bug (#21720)
5 years ago
Kaipeng Deng 943a44492b
yolo_box OP add Attr(clip_bbox). (#21620)
5 years ago
Michał Gallus a5159d8480
Re-anble vgg and resnet101 models download (#21713)
5 years ago