Commit Graph

16697 Commits (03deb41d736bea9c8d593b11d9aa541a056d250a)

Author SHA1 Message Date
Pei Yang 46b8d282dc
Add some inference API comments for AnalysisConfig (#23117)
5 years ago
Adam 4f5e4540f8
Improve SGD jit code to work with large data (#23120)
5 years ago
Liufang Sang 4db031902d
add dequantize_log_op and make pyramid hash support int8 weight (#22548)
5 years ago
Zeng Jinle e5fef8f38a
[Dygraph double grad]Code polish (#23121)
5 years ago
Zeng Jinle 9258e96094
fix read op comments, test=develop, test=document_fix (#23122)
5 years ago
Zeng Jinle acfc9b8a70
Reader sequential and inference partial feed (#22699)
5 years ago
Wilber 95b356a069
update embedding_eltwise_layernorm fuse and kernel. test=develop (#23114)
5 years ago
Zeng Jinle a31d7328b7
Add dygraph double grad implementation (#22939)
5 years ago
Yiqun Liu 3af4771122
Add the detection and code-generation of sqrt and square in fusion_group (#23095)
5 years ago
hutuxian 0c30098f8b
Add need_save_delta parameter to solve OOM (#23097)
5 years ago
songyouwei 2e2da7124b
high-performance dygraph slice (#22879)
5 years ago
Sylwester Fraczek abee05a8c8
added mkldnn swish activation (#23041)
5 years ago
Zhaolong Xing 8c6fde9e69
fix align error (#23090)
5 years ago
Liufang Sang 915b892a15
Fix div zero in fake quantize op (#22966)
5 years ago
Yi Liu 121b2aed4d
initialize global nccl context in dygraph (#23037)
5 years ago
Zhang Ting 880eb04d93
skip PrepareData when it is unnecessary (#22839)
5 years ago
Feiyu Chan 01ab8a0619
add approximation for gelu, test=develop (#22961)
5 years ago
Adam 5842ae6785
Revert "Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695)" (#22985)
5 years ago
Pei Yang 24db750386
fix trt int8 calib precision bug. test=develop (#23036)
5 years ago
GaoWei8 1dc1f9270e
Fix lod error of concat op for axis = 0 (#22538)
5 years ago
yaoxuefeng 660ff18488
fix datsset test=develop (#23043)
5 years ago
Zhang Ting 714b0076b6
Override GetKernelTypeForVar to avoid device transform, test=develop (#23032)
5 years ago
wangchaochaohu 112e3edbf6
fix the conv group problem test=develop (#23025)
5 years ago
Wilber db40ee86db
fix unittets. test=develop (#23018)
5 years ago
wangchaochaohu 99db0cf762
remove debug log test=develop (#22994)
5 years ago
wangchaochaohu 3757e0687c
Add Unittest for backward of fusion group (#22932)
5 years ago
chengjuntao 63f3ada7b9
fix bug which input shape (#22965)
5 years ago
Zhang Ting 137d6563fc
add check for assigned data, test=develop (#22960)
5 years ago
wangchaochaohu f0d193a23c
Cast fusion for fusion group (#22876)
5 years ago
yaoxuefeng 29a7a52d38
Fix instag (#22632)
5 years ago
wangchaochaohu c979c9f2b0
refine the profiler print test=develop (#22968)
5 years ago
Wilber ff3ddbb502
add skip_layernorm pass. test=develop (#22895)
5 years ago
wawltor f154d5860f
Speed up the matmul op, use the gemm replace the batch gemm (#22926)
5 years ago
Adam 056edf3929
Change ShareDataWith() to TensorCopy() in conv_mkldnn (#22695)
5 years ago
Zhaolong Xing 8d6dc102fe
[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse (#22494)
5 years ago
guofei 3d8571e884
modify assign op and add unittest of assign op (#22769)
5 years ago
Zeng Jinle d33c4343e1
Imperative tracer refactoring (#22457)
5 years ago
liu zhengxi 61fef9754b
Fix fc padding bug during inference fusion (#22860)
5 years ago
tangwei12 ad9c8f6d2d
fix communicator when break under pyreder mode (#22911)
5 years ago
mapingshuo 5ba9dfc16a
add lookup_table_dequant_op (#22900)
5 years ago
zhaoyuchen2018 a020a25797
Fix model int8 quant fail, test=develop (#22891)
5 years ago
Zhaolong Xing dd67d44a50
[Paddle-TRT] : (Part1) Dynamic shape support (#22868)
5 years ago
tangwei12 07e13b84cd
remove vlog, test=develop (#22898)
5 years ago
Zhang Ting ca9c8b417d
fix compute ratio of profile, test=develop (#22872)
5 years ago
wangchaochaohu dbb0b9b3b6
refine the profiler print (#22823)
5 years ago
Michał Gallus 0038bfbd1d
Prevent loading of warmup data in analyzer_int8 if enable_int8 is set to false (#22857)
5 years ago
Chen Weihang 1644926a6c
Polish detail implement of dygraph data loader (#22878)
5 years ago
Wilber f686310d81
fix concat_mkldnn op. test=develop (#22692)
5 years ago
hong 5191e54494
reduce default attrs for dynamic graph (#22850)
5 years ago
Zhaolong Xing 1a533ed2de
[BUG]: Multihead matmul op's ouput size should be BxSx(N*H) (#22848)
5 years ago
hong c736fef93b
dygraph backward engine accelerate (#22808)
5 years ago
Zeng Jinle d41d802ba3
Add flags to limit gpu memory (#22793)
5 years ago
石晓伟 1861ca88f1
serialize the PaddleTensor, test=develop (#22810)
5 years ago
Zhang Ting 72ff5a09c3
fix print bug of profile, test=develop (#22804)
5 years ago
Zhang Ting 4e8bc02461
add fluid.device_guard to specify the device type for Op (#22254)
5 years ago
石晓伟 ddb9b46fec
change the function in op_teller, test=develop (#22794)
5 years ago
Zhen Wang 89cfa49156
Unmerged fetch list (#22635)
5 years ago
wangchaochaohu 8456c3f4dd
polish the profiler_help code (#22811)
5 years ago
zhongpu 2fd1ec1e3e
fix docker build for paddle openblas, test=develop (#22795)
5 years ago
Chen Weihang 7d8d573453
Speed up dygraph DataLoader based on shared memory and LoDTensor serialization (#22541)
5 years ago
liu zhengxi 324f2b3922
Fix inference c api PD_GetZeroCopyOutput lod (#22768)
5 years ago
wangchaochaohu 7578fcbac4
Profile code refine (#22800)
5 years ago
hutuxian 53a2b68f4e
support customized download command in dataset (#22782)
5 years ago
wangchaochaohu ca9e77a8d4
add sum op support for fusion group (#22771)
5 years ago
tianshuo78520a 433cef03e5
fix typo word (#22784)
5 years ago
Kaipeng Deng ebc7ffc300
fix detection_map. test=develop (#22705)
5 years ago
zhaoyuchen2018 72dde4abde
Refine adam op to improve performance, test=develop (#22346)
5 years ago
wangguanzhong f2d1cd119a
fix lod level, test=develop (#22755)
5 years ago
FlyingQianMM 79d712346f
Correct CPU gradients of the argsort op (#22739)
5 years ago
Adam 2b80e9a719
Add cpu_info without XBYAK (#22716)
5 years ago
guofei ae8b5f11a3
Change ShareDataWith() to TensorCopy() in ref_by_trainer_id (#22717)
5 years ago
liu zhengxi 71ab0458e1
Fix pointer and c-api encapsulation (#22663)
5 years ago
Leo Chen b2c1be851a
support cond in clone, test=develop (#22657)
5 years ago
Zhang Ting f97f3f9301
add framework overhead ratio in profile report (#22590)
5 years ago
zhouwei25 160d0f1308
fix the CI risk that network cannot be connected (#22736)
5 years ago
chengjuntao 15c2667143
register fp16 for assign op (#22744)
5 years ago
zhangchunle 882e7f7c3b
Directly getting API.spec for tools/sampcd_processor.py (#22728)
5 years ago
dyning 1c0653462d
fix generate_mask_labels lod level (#22743)
5 years ago
GaoWei8 ba140222d6
fix compile&runtime lod_equality of lod_reset (#22737)
5 years ago
hutuxian 175954d894
PaddleBox Framework Part2 (#22466)
5 years ago
ShenLiang 3132681e8a
add partial_sum op in contrib (#22292)
5 years ago
wangchaochaohu 611411b90e
Fusion group profile support (#22718)
5 years ago
ShenLiang e136661304
add partial_concat op in contrib (#22528)
5 years ago
GaoWei8 cdf5f6fb8c
Add an inference interface to disable FC padding (#22097)
5 years ago
tianshuo78520a d2ba91aad1
fix typo words (#22653)
5 years ago
Yibing Liu 6e7bfe30a6
register fp16 kernel for some ops (#22650) (#22696)
5 years ago
tangwei12 66a3150135
SYNC with communicaotor (#22344)
5 years ago
Yiqun Liu 22bbd54719
Add the support of fp16 in fusion_group (#22239)
5 years ago
flame d97475d53b
fix CPU C inference API compile bug (#22702)
5 years ago
Huihuang Zheng adfa5b8354
Add PADDLE_ENFORCE to Check Sequence Length of RecurrentOp (#22673)
5 years ago
flame 74eb82de19
fix go api bug (#22669)
5 years ago
wangchaochaohu a089072c8b
fix the profile print error (#22665)
5 years ago
lidanqing d926214535
[UT coverage] improve the mul_mkldnn_op line coverage (#22408)
5 years ago
wangchaochaohu c65c6ae534
add flag to control profile level in python API (#22319)
5 years ago
123malin 00594c1c88
support dumping params/grads in transpiler mode (#22490)
5 years ago
Zhaolong Xing a06d75a280
[Paddle-TRT] Refine the error log about runtime batch and max_batch_size. (#22535)
5 years ago
Adam 608447bfd5
Update MKLDNN to v1.2 (#22521)
5 years ago
Adam ab610a34ff
transpose_mkldnn code change to meet Paddle standards (#22591)
5 years ago
Jiawei Wang 8f035fb637
Add TopK Op Grad CPU&GPU Kernel test=develop (#22628)
5 years ago
Steffy-zxf 90ee366653
update ops's unittest data type from float32 to float64 and shape over 100 (#22544)
5 years ago