Paddle

Commit Graph

Author	SHA1	Message	Date
jerrywgz	41471d28ac	add box_coder_and_assign, test=develop	6 years ago
lidanqing	02c106c717	MKLDNN: Add UT for conv_transpose_mkldnn op. (#16030) * MKLDNN: Add UT for conv_transpose_mkldnn op. test=develop * MKLDNN: Add fuse_bias check UT for conv_transpose_mkldnn op. test=develop	6 years ago
dengkaipeng	3eab9e4b95	fix statement. test=develop	6 years ago
dengkaipeng	e37f5ab5b1	fix API.spec. test=develop	6 years ago
dengkaipeng	54bbbfa71f	fix doc statement. test=develop	6 years ago
dengkaipeng	c1a69e3ea0	refine doc. test=develop	6 years ago
dengkaipeng	65d375a09f	fix format. test=develop	6 years ago
dengkaipeng	82d514345c	fix spectral_norm doc. test=develop	6 years ago
dengkaipeng	2ea5843cbf	add doc and test_layers. test=develop	6 years ago
dengkaipeng	037855f42d	fix attr dim calc. test=develop	6 years ago
dengkaipeng	70dbd59839	add grad kernel for spectral_norm. test=develop	6 years ago
dengkaipeng	72509ec3bd	add unittest for spectral_norm. test=develop	6 years ago
dengkaipeng	3bf1ae9b59	add spectral_norm forwarn kenel	6 years ago
Zhen Wang	545247d7b4	add channel wise quantize op.	6 years ago
tensor-tang	b16dabd7e0	refine vbroadcast jitcode test=develop	6 years ago
tensor-tang	c2e56e6bbc	Merge remote-tracking branch 'ups/develop' into op/embgrad	6 years ago
chengduo	e2da3a5b22	Revert "Add Event for TensorCopy" (#16022) * Revert "Add Event for TensorCopy (#15953)" This reverts commit `7235fd662b`. test=develop * fix CI test=develop	6 years ago
baojun	9aaea38c0a	fix cpplint test=develop (#16028)	6 years ago
chengduo	7235fd662b	Add Event for TensorCopy (#15953) Add Event for TensorCopy	6 years ago
Tink_Y	31d830de9f	refine image_resize annotation (#15976) * fix image_resize annotation test=develop * fix some typo * Update nn.py * Update interpolate_op.cc test=develop	6 years ago
tensor-tang	641b3cccce	add vbroadcast mkl code and jitcode test=develop	6 years ago
tensor-tang	41a1270856	add vbroadcast jitkernel refer code and use it test=develop	6 years ago
tensor-tang	867e93b21a	add jitkernel vcopy and speedup unit test time test=develop	6 years ago
jerrywgz	c31da7899a	refine code, test=develop	6 years ago
Yiqun Liu	798925453e	Revert "Optimize while_op when is_test is true. (#15811)" (#15968) test=develop	6 years ago
Yiqun Liu	87248281f7	Fix error in CUDA kernel of beam_search. (#15957) test=develop	6 years ago
jerrywgz	e8a8fe07e7	fix code for windows CI, test=develop	6 years ago
jerrywgz	149411762a	add gpu kernel, test=develop	6 years ago
Tao Luo	4efdebc6f6	Merge pull request #15931 from yihuaxu/develop_2c5c7b2a7_gelu_mkl_opt Optimize gelu operation with mkl erf	6 years ago
tensor-tang	e5f9d3a47c	Merge pull request #15892 from tensor-tang/jit/sgd refine sgd op	6 years ago
Tao Luo	e6bab55f1b	Merge pull request #15959 from luotao1/infershape_refine refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool	6 years ago
Yiqun Liu	613d9d0756	Optimize while_op when is_test is true. (#15811) test=develop	6 years ago
xiaolil1	1abddd8d97	Optimize Quantize Op with primitive reuse. (#15929) test=develop	6 years ago
Tao Luo	7ec97a0a7e	Merge pull request #15930 from xiaolil1/dequantize-reuse Optimize INT8 DeQuantize Op with primitive reuse.	6 years ago
nhzlx	2eff3e26b6	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt	6 years ago
nhzlx	06a088a199	fix comments and fix cpplint test=develop	6 years ago
luotao1	34404f9c31	refine infershape of sequence_enumerate, hash and fuse_emb_seq_pool test=develop	6 years ago
baojun	f285191fb3	Added adam op test=develop (#15710)	6 years ago
jerrywgz	b92ef45fe9	Merge pull request #15678 from jerrywgz/refine_softmax_with_cross_entropy change default option related to softmax, test=develop	6 years ago
mozga-intel	558f94cd77	Register sum operator (#15889) test=develop	6 years ago
tensor-tang	58b8231338	added concat op test=develop (#15946)	6 years ago
Tao Luo	47d36b2008	Merge pull request #15924 from baojun-nervana/ngraph_v14 Update ngraph version to v0.14	6 years ago
jerrywgz	0f652f304c	add distribute fpn proposals op, test=develop	6 years ago
dzhwinter	225c11a91f	polish cudnn related code and fix bug. (#15164) * staged. * polish code * polish code. test=develop * polish code. test=develop * api change. test=develop * fix default value. test=develop * fix default value. test=develop	6 years ago
Yiqun Liu	454f4f2140	Rewrite is_empty op to avoid unnecessary data transform. (#15509) * Rewrite is_empty op to avoid unnecessary data transform. test=develop * Add the implementation of InferShape and InferVarType for is_empty op. test=develop * Rewrite is_empty op to avoid directly inherit OperatorBase. test=develop	6 years ago
xiaolil1	6724be2b0d	INT8 Pool kernel Key Creation Optimization. (#15883) * Optimize key creation of INT8 pool kernel to improve the peformance of ResNet-50 and MobileNet, especially for latency. test=develop * Optimize key creation of pool fp32 grad. test=develop	6 years ago
xiaoli.liu@intel.com	c4187dbd7c	Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into dequantize-reuse	6 years ago
Tao Luo	ba90e05281	Merge pull request #15917 from jczaja/prv-tensor-mkldnn-ops [MKL-DNN] Adjusting ops to Tensor modifications	6 years ago
baojun-nervana	e4ab40a7b9	added concat op test=develop	6 years ago
colourful-tree	7d8f639883	Merge pull request #15902 from colourful-tree/new_develop remove mkldnn & fix commit	6 years ago
Tao Luo	effec86600	Merge pull request #15913 from liangan1/func_coverage Enable function coverage for U8/S8 ConvMKLDNNOpKernel	6 years ago
tensor-tang	8bc6381546	fix jitcodekey and refine test test=develop	6 years ago
tensor-tang	7044cfa7c7	add sgd jitcode and op test test=develop	6 years ago
tensor-tang	8e04133719	add benchmark and mkl sgd implement test=develop	6 years ago
tensor-tang	07efdb5139	Merge remote-tracking branch 'ups/develop' into jit/sgd	6 years ago
Jacek Czaja	c63f6b2039	- MKL-DNN pooling updated to set_prim_desc - MKLDNN ops revisited - disabled softmax modifications - disabled elementwise_add - reverted LRN modifications - reverted SUM primitive - Partial reviing of softmax - Enable softmax - Softmax changes - LRN is back - LRN partially disabled - LRN is back - LRN fix - compilation fixes - Sum fixed(hopefully) - Enabling (partially) elementwise_add - Fixes to elemenwise_add - Lint fixes quantize fix - compilation fix test=develop Disabling pooling - Disabled quantize op test=develop	6 years ago
qingqing01	8e439ccfff	Fix bug in fake_quantize_op and add more unit testing (#15912)	6 years ago
qingqing01	f4846bf3dc	loosly check in the InferShape of cross_entropy_op. (#15863) * loosly check in cross_entropy_op when soft_label is True * Add Runtime assertion in backward infer_shape check. * Skip InferShape check when un-know the input dimensions	6 years ago
Yihua Xu	7396788694	Optimize gelu operation with mkl erf. test=develop	6 years ago
nhzlx	0ed63b2108	6. delete useless predictor id test=develop	6 years ago
xiaoli.liu@intel.com	70759d181b	Optimize INT8 DeQuantize Op with primitive reuse. test=develop	6 years ago
Yiqun Liu	f4634d76d7	Optimize the CUDA implementation of sequence_expand op by reduce the times of copying lod data from CPU to GPU. (#15493) * Optimize the CUDA implementation of sequence_expand op by reduce the times of copying lod data from CPU to GPU. test=develop * Refine the op benchmark to support setting lod in config. test=develop	6 years ago
guomingz	630c1e8317	This PR improve performance of prior_box op about 1.25x faster on CPU. (#15909) * This PR improve performance of prior_box op about 1.25x faster on CPU. * Test Env:SKX 8180 with fake data on 28 threads(bs=1). * The below table shows the ~25% improvement which generated by [eval_tp_fake_data.py](https://github.com/PaddlePaddle/Paddle/issues/15618#issuecomment-464613976). \| Type \|Event \| Calls \| Total \| Min. \| Max. \| Ave. \| Ratio.\| \| ---------------- \| ------------------ \| ---- \| ------- \| -------- \| -------- \| ------------ \| -------- \| \| w/ optimization \| thread0::prior_box \| 6000 \| 921.201 \| 0.110572 \| 0.383402 \| 0.153533 \| 0.084585 \| \| w/o optimization \| thread0::prior_box \| 6000 \| 1151.85 \| 0.102276 \| 0.426702 \| 0.191976 \| 0.103337 \| test=develop * Fix the style issue. test=develop	6 years ago
Tao Luo	9c05421c97	Merge pull request #15914 from Sand3r-/mgallus/mkldnn-sum-code-reuse Refactor MKL-DNN Sum to use reference version on fallback	6 years ago
chengduo	7ca8553d4e	Add alloc_continuous_space_op (#15900) * add alloc_continuous_space_op test=develop * Polish code test=develop * follow comment test=develop	6 years ago
baojun-nervana	2ffacdebc2	Update ngraph version to v0.14 test=develop	6 years ago
Michal Gallus	6ebe9877bb	Improve code reuse at MKL-DNN sum test=develop	6 years ago
liangan1	4acc522087	Enable function coverage for U8/S8 ConvMKLDNNOpKernel test=develop	6 years ago
Xin Pan	44e7fcddc5	Merge pull request #15844 from panyx0718/infer add per kernel config and remove const_cast.	6 years ago
Jacek Czaja	dec9cf53c8	[MKL-DNN] MKL-DNN specific Tensor modification (#15429) * - Implemented draft of primitive desc keeping in Tensor test=develop - TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented - Added nchw and nc formats setting for sake of compatiblity Fixed unit tests - Worakaround to problem with 5D data in conv - Added 3D and 1D MKL-DNN formats for name handles for tensor test=develop - Fix to UTs test=develop - Conv fp32 op was updated Cosmetic fixes test=develop - tensor mkldnn cosmetics test=develop - Moved most of mkl-dnn specific code from Tensor to mkl-dnn utils * - Lint fixes test=develop * - setting prim dec in Tensor , sets also layout to kMKLDNN test=develop * - Moved creation of prim desc totally out of Tensor test=develop * - Cosmetic fixes adter review test=develop	6 years ago
heqiaozhi	08c96d1b48	remove mkldnn & fix commit test=develop	6 years ago
Xin Pan	5dd281f738	polish test=develop	6 years ago
heqiaozhi	fab09ac0b8	Merge branch 'new_develop' of https://github.com/colourful-tree/Paddle into new_develop	6 years ago
heqiaozhi	da4f5a2f18	remove mkl & fix commit test=develop	6 years ago
colourful-tree	f2d6473ef8	Merge branch 'develop' into new_develop	6 years ago
heqiaozhi	04f876f5bc	remove mkl & fix commit	6 years ago
dengkaipeng	373cfb0ccf	use kernel size in global_pooling. test=develop	6 years ago
dengkaipeng	60305196b8	fix spell mistakes. test=develop	6 years ago
Tao Luo	8a7efc78f1	Merge pull request #15882 from sfraczek/unique_ptr_dereference Change (smart_ptr.get()) -> smart_ptr	6 years ago
tensor-tang	a0c37662b9	enable sgd jitkernel refer code and test test=develop	6 years ago
xuezhong	1dad36f6aa	Merge pull request #15609 from xuezhong/add_sample_logits_op add sample_logits and sampled_softmax_with_cross_entropy op	6 years ago
Kaipeng Deng	9e524a7b51	Merge pull request #15870 from heavengate/fix_adaptive_pool_doc fix adaptive pool doc.test=develop	6 years ago
dengkaipeng	14df92fe8f	fix spell error. test=develop	6 years ago
dengkaipeng	144016fcfc	fix adaptive_pool and yolov3_loss. test=develop	6 years ago
Sylwester Fraczek	74672d1aff	Change (smart_ptr.get()) -> smart_ptr reason: dereferencing smart pointer is the same as the underlying pointer test=develop	6 years ago
tensor-tang	ee2321debd	Revert 15770 develop `a6910f900` gelu mkl opt (#15872) * Revert "Optimze Gelu with MKL Erf function (#15770)" This reverts commit `676995c86c`. * test=develop	6 years ago
xuezhong	81870723c6	Merge pull request #15605 from xuezhong/fix_bug_for_lstmp Fix bug for lstmp	6 years ago
dengkaipeng	eb65b4e47d	\frac -> \frac. test=develop	6 years ago
nhzlx	1d5ef7c9ee	5. add static trt load model 1). add static trt load model 2). fix bug: when device_id is not 0, the trt will have a bug test=develop	6 years ago
dengkaipeng	8167588f14	add blank after math::. test=develop	6 years ago
dengkaipeng	d9ec605873	use math:: instead of 29. test=develop	6 years ago
dengkaipeng	19292ac6a1	fix adaptive pool doc.test=develop	6 years ago
Yiqun Liu	7d96c74ab2	Initialize the benchmark tester for operator. (#15772) * Initialize the benchmark tester for operator. test=develop * Rearrange the codes. test=develop	6 years ago
Yihua Xu	676995c86c	Optimze Gelu with MKL Erf function (#15770) * Optimize for gelu operator * Set up the low accuracy mode of MKL ERF function. test=develop * Only enable MKLML ERF when OS is linux * Use the speical mklml version included vmsErf function to verify gelu mkl kernel. test=develop * Add the CUDA macro to avoid NVCC's compile issue. test=develop * Add the TODO comments for mklml library modification. test=develop * Clean Code test=develop * Add the comment of marco for NVCC compiler. test=develop	6 years ago
mozga-intel	5d132ecf83	Auto-cmake generator, auto-fill map (#15402) test=develop	6 years ago
Krzysztof Binias	1578c60bdd	Add new ut and remove unnecessary code test=develop	6 years ago
Xin Pan	5eb87506bc	add per kernel config and remove const_cast. test=develop	6 years ago
Dun	a83e470405	Profiler refine and add CUDA runtime api tracer (#15301) * refine profiler && add runtime tracer * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * test=develop * fix bug && test=develop * add thread id map && test=develop * test=develop * testing * bug fix * remove cuda event && refine code && test=develop * test=develop * test=develop * test=develop * fix windows temp file && test=develop * test=develop * fix windows bug && test=develop * fix start up issue && test=develop * code polish && test=develop * remove unused code && test=develop * add some cupti cbid && test=develop * add FLAGS_multiple_of_cupti_buffer_size && test=develop * fix compile error && test=develop * add keyword && test=develop * fix && test=develop * code polish && test=develop	6 years ago
mozga-intel	13ec2d331b	Enable momentum operator for a ngraph engine (#15673) * Enable momentum operator for a ngraph engine test=develop * Update tests test=develop * Unnecessary line of the code as intended was removed test=develop	6 years ago
xuezhong	eb7bc3e7ea	remove non-ascii charactor test=develop	6 years ago

3729 Commits (a7a4f053dacd028469c0fd9a2a9e6e54eb3fa55d)