ShenLiang
c706ff20a3
fix conflict, test=develop ( #23298 )
5 years ago
ShenLiang
5223e2bbc4
Add a new DataFeed named PaddleBoxDataFeed ( #23321 )
...
* add paddleboxdatafeed
* add ifdef linux and boxps
* add unittest for datafeed
* fix unittest of test_paddlebox_datafeed
* fix unittest
* rename function
5 years ago
Chen Weihang
75bd350710
Implement StaticModelRunner to support dygraph fine-tune static graph pre-training model ( #23171 )
...
* static model runner basic implement, test=develop
* add run program op to execute loaded program, test=develop
* refactor static model runner & run program op, test=develop
* reset engine.cc to resolve conflict
* adapt the change of dygraph double grad, test=develop
* refactor impl to solve control flow error, test=develop
* clear debug code, test=develop
* fix ci str compatible error & checkout dygraph grad maker & add example, test=develop
* hide api & add op test, test=develop
* fix run program op test places error, test=develop
* fix program by review comment, test=develop
* delete change var desc name, test=develop
* fix other program by review comment, test=develop
* remove _static_graph_guard, test=develop
* add selectedrows test, test=develop
* remove desc parser, test=develop
* fix detail program, test=develop
* change scope creation & add test, test=develop
5 years ago
cc
9297f49e4b
[OP] Add randperm op ( #23292 )
5 years ago
Kaipeng Deng
d223a24904
Fix inplace_abn compile error on Windows ( #23464 )
...
* fix inplace_abn windows compile error. test=develop
5 years ago
Tao Luo
0b583235f5
Revert "Solve the conflict of ops with the same name. ( #23199 )" ( #23494 )
...
This reverts commit abe3e6906d.
test=develop
5 years ago
wawltor
6577f91b74
Add the sum op to API 2.0, add some parameters for new api
...
* Add the sum op to API 2.0, test=develop
* Fix the import message in common_ops_import
5 years ago
石晓伟
36b82eae0e
refine the doc of paddle_api.h, test=develop ( #23402 )
...
* refine the doc of paddle_api.h, test=develop
* fix documents, test=develop
5 years ago
WuHaobo
c4d0305239
add tril op and triu op ( #23469 )
...
add tril op and triu op
5 years ago
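As a rough illustration of what the two ops in the entry above compute, here is a plain-Python sketch on nested lists. It is not Paddle's implementation; the function names and the `diagonal` parameter follow the common NumPy/PyTorch convention and are assumptions here.

```python
def tril(matrix, diagonal=0):
    """Keep elements on or below the chosen diagonal; zero everything else."""
    return [[value if col - row <= diagonal else 0
             for col, value in enumerate(values)]
            for row, values in enumerate(matrix)]


def triu(matrix, diagonal=0):
    """Keep elements on or above the chosen diagonal; zero everything else."""
    return [[value if col - row >= diagonal else 0
             for col, value in enumerate(values)]
            for row, values in enumerate(matrix)]


m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
lower = tril(m)          # [[1, 0, 0], [4, 5, 0], [7, 8, 9]]
upper = triu(m)          # [[1, 2, 3], [0, 5, 6], [0, 0, 9]]
strict_upper = triu(m, diagonal=1)  # main diagonal excluded
```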
yongqiangma
eb035f24d1
add unbind op ( #23359 )
...
* add unbind op
unbind(tensor, dim=0):
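Description: removes the specified dimension and returns a list of tensors, one for each slice of the input along that dimension.
tensor (Tensor) -- the input Tensor
dim (int) -- the dimension to remove
Example: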
Input = [[1,2],
[3,4],
[5,6]]
axis = 0
Output[0] = [1,2]
Output[1] = [3,4]
Output[2] = [5,6]
5 years ago
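The unbind behaviour described in the entry above can be sketched in plain Python for the 2-D case. This is only an illustration on nested lists, not the op's actual kernel; the helper below is hypothetical and handles just `dim=0` and `dim=1`.

```python
def unbind(tensor, dim=0):
    """Split a 2-D nested list along `dim`, dropping that dimension."""
    if dim == 0:
        # Each row becomes one output slice.
        return [list(row) for row in tensor]
    if dim == 1:
        # Each column becomes one output slice.
        return [list(col) for col in zip(*tensor)]
    raise ValueError("this sketch only supports dim 0 or 1")


data = [[1, 2],
        [3, 4],
        [5, 6]]
rows = unbind(data, dim=0)   # [[1, 2], [3, 4], [5, 6]]
cols = unbind(data, dim=1)   # [[1, 3, 5], [2, 4, 6]]
```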
zhangchunle
fd9b7bdb3d
Op (FusedEmbeddingSeqPool) error message enhancement. ( #23454 )
5 years ago
Chen Weihang
16315d3d9e
Delete Ref & VectorRef and add GetDataSafely ( #22997 )
...
* delete invalid check interface Ref & VectorRef, test=develop
* fix vector ref delete error, test=develop
* try the new check interface, test=develop
* change all related code with new check macro, test=develop
* remove static assert, test=develop
* polish detail, test=develop
* skip coverage problem, test=develop
* add new check macro, test=develop
5 years ago
Zhen Wang
abe3e6906d
Solve the conflict of ops with the same name. ( #23199 )
...
* solve the conflict of ops with the same name. test=develop
5 years ago
wawltor
0b092d05f1
Add the argmax op to API 2.0, and update some parameters
...
* Add the argmax op to API 2.0, test=develop
* Fix the compiler problem in arg_max op, test=develop
* Fix the import message in common_ops_import, test=develop
* Fix the default dtype of arg_min_max, test=develop
5 years ago
Leo Chen
f297a33285
Dev/fix init flags ( #23465 )
...
* fix init_gflags with 'python -c', test=develop
* add test, test=develop
* use sys.executable instead of python, test=develop
* keep dummy, test=develop
5 years ago
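The "use sys.executable instead of python" bullet in the entry above refers to a general pattern: spawning a child interpreter through `sys.executable` guarantees that the child is the same interpreter as the parent, whereas a bare `python` depends on `PATH`. A minimal sketch of that pattern follows; the `run_snippet` helper is hypothetical and is not taken from the commit.

```python
import subprocess
import sys


def run_snippet(code):
    """Run `code` with `<current interpreter> -c` and return its stdout."""
    result = subprocess.run(
        [sys.executable, "-c", code],  # same interpreter as the parent
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


output = run_snippet("print('ok')")
```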
Zhaolong Xing
6a23850a3f
add init values to variables in analysis config. ( #23442 )
5 years ago
wawltor
915341e3de
Add the zeros, ones, ones_like, zeros_like for api 2.0, test=develop ( #23471 )
...
Update the new api ops of creation ops to the api 2.0
5 years ago
Zhen Wang
56b50c97f8
Add allclose_op ( #23335 )
...
* Add allclose Op, and its function is analogous to numpy.allclose. It returns True if two tensors are elementwise equal within a tolerance.
5 years ago
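Since the entry above describes the op as analogous to `numpy.allclose`, the tolerance rule can be sketched in plain Python. The check mirrors NumPy's documented formula, `|a - b| <= atol + rtol * |b|`, with NumPy's default tolerances; whether the Paddle op uses the same defaults is an assumption.

```python
def allclose(a, b, rtol=1e-05, atol=1e-08):
    """Return True if every pair of elements is equal within tolerance."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length in this sketch")
    # Note the rule is asymmetric: the relative term scales with |b| only.
    return all(abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b))


close = allclose([1.0, 2.0], [1.0, 2.0 + 1e-9])
not_close = allclose([1.0], [1.1])
```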
kinghuin
948c57d84b
move sin, sqrt, tanh, atan to paddle.tensor.math and add a new parameter "out" ( #23387 )
...
* sin sqrt tanh atan add out, test=develop
* optimize doc, test=develop
* add dygraph test, test=develop
5 years ago
Chengmo
a2e9af5663
Add Tdm child OP in contrib ( #23241 )
...
* add tdm child op
5 years ago
Wilber
9676ac1c5c
Add flip op. ( #23255 )
...
* add flip op
5 years ago
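For readers unfamiliar with flip, the sketch below reverses a 2-D nested list along the requested dimensions. It illustrates the usual meaning of the operation only; the `dims` parameter name is an assumption and the code is not Paddle's kernel.

```python
def flip(matrix, dims):
    """Reverse a 2-D nested list along each dimension listed in `dims`."""
    out = [list(row) for row in matrix]
    if 0 in dims:
        out = out[::-1]                    # reverse the order of rows
    if 1 in dims:
        out = [row[::-1] for row in out]   # reverse each row
    return out


m = [[1, 2],
     [3, 4]]
flipped_rows = flip(m, [0])      # [[3, 4], [1, 2]]
flipped_cols = flip(m, [1])      # [[2, 1], [4, 3]]
flipped_both = flip(m, [0, 1])   # [[4, 3], [2, 1]]
```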
tianshuo78520a
d8a21ef6f3
test=develop;fix error ( #23467 )
5 years ago
Feiyu Chan
81f1402f6c
Add functional convolutions in paddle.nn.functional ( #23408 )
...
* add functional conv
* add test and doc for function convs, test=develop
* update ConvTransposeOp's InferShape and error message, test=develop
5 years ago
Zhaolong Xing
70782e6379
[Inference doc]: refine paddle_api.h doc ( #23354 )
...
* refine paddle api doc
test=develop
* fix comments
test=develop
5 years ago
Feiyu Chan
bcafe3179a
add MKL computation back to gelu's non-approximate part ( #23420 )
5 years ago
zhongpu
dbfbd7eac4
support Exhaustive search in dygraph ( #23415 )
...
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unittest; test=develop
* fix mac unittest; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
* fix compile error, test=develop
Co-authored-by: phlrain <phliuhongyu@126.com>
5 years ago
zhaoyuchen2018
01d7ccd4b6
Fix elementwise compile error, test=develop ( #23381 )
...
The elementwise function was used before its definition, which failed to compile with CUDA 8; move the definition ahead.
5 years ago
gongweibao
24a063f6ac
Add fleet checkpoint on local fs and remote fs(such as hdfs) for EDL ( #22586 )
5 years ago
Zeng Jinle
0c23e3ff4d
fix Tracer::NoGrad, test=develop ( #23443 )
5 years ago
channings
a2e10930cf
update linspace, equal operators to API 2.0 ( #23274 )
...
* update linspace, equal operators to API 2.0, test=develop
* equal support higher performance CUDA kernel, test=develop
* update comment of equal&linspace operator, test=develop
* update comment of equal&linspace operator, test=develop
5 years ago
zhaoyuchen2018
4fe9ca6959
improve elementwise performance. ( #23405 )
...
* improve elementwise performance.
* Add contiguous check, test=develop
5 years ago
wangchaochaohu
5c60778731
polish the code of fusion group test=develop ( #23370 )
5 years ago
Leo Chen
a62599a888
[feature] prune program by feed and fetch_list automatically ( #22474 )
...
* prune train program by fetch_list, test=develop
* add unittest for prune, test=develop
* fix pruned feed, test=develop
* support ParallelExecutor and feed prune, test=develop
* add comments, test=develop
* update unittest, test=develop
* update unittests, test=develop
* remove debug code, test=develop
* support cond in clone, test=develop
* support cond in prune, test=develop
* support multiple minimize, test=develop
* support cache, test=develop
* fix _copy_param_info_from, test=develop
* support python2 str, test=develop
* remove debug code, test=develop
* fix bug of caching CompiledProgram, test=develop
* fix multi_device issue, test=develop
* tmp
* support tuple in fetch_list and overriding use_prune, test=develop
* dont use nonlocal in python2, test=develop
* remove nonlocal, test=develop
* code clean, test=develop
* code clean, test=develop
* feed list, test=develop
* test adam, test=develop
* follow comments, test=develop
* reduce duplicate code, test=develop
* update comments, test=develop
5 years ago
Chen Weihang
7f1ad510bd
Add op inout check macro to simplify error message writing ( #23430 )
...
* add op inout check macro, test=develop
* fix enforce_test, test=develop
5 years ago
Yiqun Liu
bc2981e998
Disable test_code_generator and test_post_training_quantization_mobilenetv1 ( #23440 )
5 years ago
Zeng Jinle
29337f4e17
fix conflict of inference partial feed with gpu parallel ssa graph executor, test=develop ( #23400 )
5 years ago
Pei Yang
7e439780d9
add full paddle_analysis_config.h APIs. ( #23215 )
5 years ago
zhongpu
bfb07aafe8
Revert "Exhaustive search ( #22821 )", test=develop ( #23401 )
...
This reverts commit 48144e4099.
5 years ago
liym27
b7b0b3595b
Add unittest for transformer prediction in dygraph_to_static ( #23207 )
...
* Add unittest for transformer prediction in dygraph_to_static.
* fix bug in fill_constant api.
* Make transpose support size 0. test=develop
5 years ago
xujiaqi01
93ea9dd27a
fix stat var in hogwild worker ( #23367 )
...
* fix stat var in hogwild worker
* test=develop
5 years ago
joanna.wozna.intel
8c463700e1
Add default pass attributes ( #23042 )
5 years ago
zhongpu
48144e4099
Exhaustive search ( #22821 )
...
* use global conv cache; test=develop
* use singleton cache; test=develop
* fix format error; test=develop
* add cudnn helper header; test=develop
* fix header error; test=develop
* fix mac unittest; test=develop
* fix mac unittest; test=develop
* fix file format; test=develop
* fix include file error, test=develop
* remove kernel_configs_ in class ExecutionContext and kernel_configs_map_ in class OperatorWithKernel, test=develop
* fix test_elementwise_mul_op_dim, test=develop
Co-authored-by: phlrain <phliuhongyu@126.com>
5 years ago
Adam
da7c73f847
Delete is_test attribute from activation operators ( #23318 )
...
* Delete is_test from activation operators
test=develop
* Revert unneeded changes
test=develop
5 years ago
Kaipeng Deng
21d95be0db
Add inplace abn op ( #22806 )
...
* add inplace_abn_op. test=develop
5 years ago
Yi Liu
821534efd3
add parallel_executor dependency to collective_helper ( #23380 )
...
test=develop
5 years ago
Zeng Jinle
3a21980b78
add reader dependency pass, test=develop ( #23301 )
5 years ago
wangchaochaohu
69e3f99362
refine the error message ( #23212 )
...
* refine the error message of tensor_array_read_write Op
5 years ago
石晓伟
5c59d2139e
reverts the commit 23177, test=develop ( #23363 )
5 years ago
wangchaochaohu
d280106007
Add support for attr type Op and add fill_constant Op and scale Op ( #23163 )
...
* add attr support for fusion group and add support for fill_constant and scale Op
5 years ago
xujiaqi01
3a45767d49
add fleet pslib pull and push sparse op and push dense op ( #23139 )
...
* add fleet pslib pull and push sparse op and push dense op
* test=develop
5 years ago
songyouwei
99d30bfc36
speedup slice impl ( #23340 )
...
test=develop
5 years ago
Zhaolong Xing
1a6ce8b910
add swish split gelu plugin dynamic support ( #23305 )
...
test=develop
5 years ago
Jacek Czaja
2bb1b0e89e
[DNNL] Added MKL-DNN inplace pass for C-API inference ( #23315 )
5 years ago
Yi Liu
0471476a18
fix nccl comm double free bug ( #23344 )
...
As nccl comm is not created by CUDADeviceContext, it should be destroyed by the creator as the best practice of RAII.
5 years ago
wangchaochaohu
1ee2a9a424
Profiler refine ( #23294 )
...
* refine output of profiler for child event
5 years ago
Leo Chen
488b2387e2
Feature/expand params in auto-generated pybind functions for dygraph operators ( #23181 )
...
* expand parameters, test=develop
* support resnet, test=develop
* fix resnet, test=develop
* support duplicable out, test=develop
* support ptb
* fix bugs, test=develop
* support null input, test=develop
* fix bugs, test=develop
* fix batchNorm is_test, test=develop
* refine code, test=develop
* follow comments, test=develop
* follow comments, test=develop
* follow comments, test=develop
* follow comments, test=develop
5 years ago
GaoWei8
20eed5401a
Change fluid.layers.where's C++ operator name ( #23250 )
5 years ago
Yi Liu
2169e6fb58
Initialize global nccl_comm in PE ( #23275 )
5 years ago
Jacek Czaja
012886df79
[DNNL] Softmax mkldnn op inplace support ( #23197 )
5 years ago
石晓伟
75ebb48a91
supports thread-binding stream, test=develop ( #23177 )
5 years ago
石晓伟
708ded584e
pause the io_utils_test of int64 and resume after repair, test=develop ( #23234 )
5 years ago
Zeng Jinle
babda94c8a
Distinguish public/private global vars ( #23269 )
...
* distinguish public/private vars, test=develop
* fix windows issues, test=develop
5 years ago
zhaoyuchen2018
58615a6272
Improve elementwise performance. ( #23001 )
...
* Improve elementwise performance.
Elementwise performance is poor when execution falls into CommonGradBroadcastCUDA; add some new kernels for different data patterns.
* Add some cuda kernel to speedup common broadcast cases. test=develop
* Add more test cases and fix cuda kernel bug. test=develop
* Remove tests as CPU precision fails. test=develop
* Refine SplitDims, test=develop
* Change file mode, test=develop
5 years ago
Wojciech Uss
f836c8aa8f
add check for scales and a message ( #23119 )
5 years ago
Zeng Jinle
8bfd62ffb7
Expose dygraph.grad api ( #23124 )
...
* expose dygraph.grad api, test=develop, test=document_fix
* add more parameter in dygraph.grad API, test=develop
* add only_inputs=True parameter, test=develop
* follow comments, test=develop, test=document_fix
* fix typo, test=develop, test=document_fix
5 years ago
Wilber
0129f4b568
Add some inference API comments for AnalysisPredictor ( #23242 )
...
* add inference api doc. test=develop
5 years ago
Tao Luo
c00d427d52
simplify the cmake log of ir/CMakeLists.txt ( #23262 )
...
test=develop
5 years ago
Zeng Jinle
77b4dc80c9
code polish for adding const qualifier, test=develop, test=document_fix ( #23248 )
5 years ago
Zhaolong Xing
430b0099c9
[Paddle-TRT]: Ernie Dynamic shape support. ( #23138 )
...
* add dynamic plugin support.
test=develop
* change emb eltwise layernorm to math function
test=develop
* add emb eltwise layernorm
test=develop
* can run dynamic shape ernie
test=develop
* fix ci
test=develop
* add ut for trt ernie dynamic
test=develop
* refine dynamic shape c++ interface.
test=develop
* fix comments
test=develop
* fix comments
test=develop
5 years ago
xujiaqi01
68ea1ad55b
add clear one table ( #23089 )
...
* add clear_one_table
* test=develop
5 years ago
danleifeng
ae3bb16d06
add MaskAucCalculator in paddlebox ( #23157 )
...
* add maskauc in paddlebox; test=develop
5 years ago
liym27
6af480ca33
Support int64 for op assign_value. test=develop ( #23179 )
5 years ago
Zeng Jinle
53e6f8e1da
rename macro, test=develop ( #23161 )
5 years ago
Zeng Jinle
bba740710d
add cuda resource pool for BufferedReader, test=develop ( #23152 )
5 years ago
Zeng Jinle
7d8d50b6cc
rename no_need_buffer_vars macro, test=develop ( #23160 )
5 years ago
Liufang Sang
a486a739e1
fix compile error in win gpu ( #23196 )
...
* fix compile error in win gpu test=develop
* fix compile error in win gpu test=develop
* fix compile error in win gpu test=develop
5 years ago
Zeng Jinle
7ca77a90ac
add Tensor::IsSharedBufferWith method, test=develop ( #23175 )
5 years ago
Zeng Jinle
b8886bf122
rename no_need_buffer_vars_macro, test=develop ( #23159 )
5 years ago
Zeng Jinle
bae5930ba1
fix graph attr copy issues, test=develop ( #23191 )
5 years ago
wangchaochaohu
b721e23b25
transpose cudnn using cudnn v7 api ( #19738 )
...
* refine the transpose conv to choose the algorithm with the cuDNN v7 API
5 years ago
Pei Yang
46b8d282dc
Add some inference API comments for AnalysisConfig ( #23117 )
...
* add some API comments in paddle_analysis_config.h, test=develop
* add some API comments in paddle_analysis_config.h, test=develop
5 years ago
Adam
4f5e4540f8
Improve SGD jit code to work with large data ( #23120 )
5 years ago
Liufang Sang
4db031902d
add dequantize_log_op and make pyramid hash support int8 weight ( #22548 )
...
* add dequantize_log_op and make pyramid hash support int8 weight test=develop
* add unittest and update pyramid hash op test=develop
* remove paddle_enforce test=develop
* fix error message test=develop
* remove incorrect commit test=develop
* fix error message in log_dequantize test=develop
* change 2019 to 2020 test=develop
* remove useless check_grad test=develop
5 years ago
Zeng Jinle
e5fef8f38a
[Dygraph double grad]Code polish ( #23121 )
...
* fix dygraph double grad, test=develop
* fix unpack constructor, test=develop
5 years ago
Zeng Jinle
9258e96094
fix read op comments, test=develop, test=document_fix ( #23122 )
5 years ago
Zeng Jinle
acfc9b8a70
Reader sequential and inference partial feed ( #22699 )
...
* sequential reader stage 1, test=develop
* fix ut, test=develop
* fix iterable=False reset bug, add some logs and polish code, test=develop
* inference feed partial data, test=develop
* Turn on keep_order=True for test, test=develop
* enhance ut to test more cases, test=develop
* test commit for reverting
* Revert "test commit for reverting", test=develop
This reverts commit 80aef42ef52ba1ee79627d6f663a624ec4f12f58.
* add ut of merged and unmerged results, test=develop
* add more uts for coverages and add en doc of api, test=develop
* follow comments, test=develop
* change note style, test=develop
5 years ago
Wilber
95b356a069
update embedding_eltwise_layernorm fuse and kernel. test=develop ( #23114 )
...
update embedding_eltwise_layernorm fuse pass and fused kernel, to support multi input
5 years ago
Zeng Jinle
a31d7328b7
Add dygraph double grad implementation ( #22939 )
...
* add double grad implementation for dygraph, test=develop
* polish code, add uts, test=develop
* fix place bug, test=develop
* polish codes, add more uts for coverages, test=develop
* add no_grad_set, test=develop
* add star gan ut, test=develop
* follow comments, test=develop
5 years ago
Yiqun Liu
3af4771122
Add the detection and code-generation of sqrt and square in fusion_group ( #23095 )
5 years ago
hutuxian
0c30098f8b
Add need_save_delta parameter to solve OOM ( #23097 )
5 years ago
songyouwei
2e2da7124b
high-performance dygraph slice ( #22879 )
...
* move __getitem__ to cpp
* bug fix
* add type check and gil release
* support negative step with omitted ends
test=develop
* code refine
test=develop
* bug fix
test=develop
* slice always return different pyobj
test=develop
5 years ago
Sylwester Fraczek
abee05a8c8
added mkldnn swish activation ( #23041 )
5 years ago
Zhaolong Xing
8c6fde9e69
fix align error ( #23090 )
...
test=develop
5 years ago
Liufang Sang
915b892a15
Fix div zero in fake quantize op ( #22966 )
...
* fix div zero test=develop
* fix div zero test=develop
* add hostdevice function test=develop
* add eps when is zero test=develop
5 years ago
Yi Liu
121b2aed4d
initialize global nccl context in dygraph ( #23037 )
...
initialize global nccl context in dygraph
test=develop
5 years ago
Zhang Ting
880eb04d93
skip PrepareData when it is unnecessary ( #22839 )
...
* remove unnecessary prepare data, test=develop
* Op in while block will not skip PrepareData, test=develop
5 years ago
Feiyu Chan
01ab8a0619
add approximation for gelu, test=develop ( #22961 )
...
add approximation for gelu, default value is False (only kernel with eigen is added, remove code for computing gelu with MKLDNN temporarily)
5 years ago
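To make the distinction in the entry above concrete, the sketch below compares the exact, erf-based GELU with the widely used tanh-based approximation. The commit message does not say which approximation the kernel implements, so the tanh formula here is an assumption; the function names are illustrative.

```python
import math


def gelu_exact(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Common tanh-based approximation of GELU (assumed, see lead-in)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute gap between the two forms on a small grid in [-3, 3].
grid = [i / 10.0 for i in range(-30, 31)]
max_gap = max(abs(gelu_exact(x) - gelu_tanh(x)) for x in grid)
```

The gap stays small over this range, which is why the approximate form is often acceptable when erf is expensive or unavailable.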
Adam
5842ae6785
Revert "Change ShareDataWith() to TensorCopy() in conv_mkldnn ( #22695 )" ( #22985 )
5 years ago
Pei Yang
24db750386
fix trt int8 calib precision bug. test=develop ( #23036 )
5 years ago
GaoWei8
1dc1f9270e
Fix lod error of concat op for axis = 0 ( #22538 )
5 years ago
yaoxuefeng
660ff18488
fix dataset test=develop ( #23043 )
5 years ago
Zhang Ting
714b0076b6
Override GetKernelTypeForVar to avoid device transform, test=develop ( #23032 )
5 years ago
wangchaochaohu
112e3edbf6
fix the conv group problem test=develop ( #23025 )
5 years ago
Wilber
db40ee86db
fix unittests. test=develop ( #23018 )
5 years ago
wangchaochaohu
99db0cf762
remove debug log test=develop ( #22994 )
5 years ago
wangchaochaohu
3757e0687c
Add Unittest for backward of fusion group ( #22932 )
...
* add fusion group test for backward and refine code
5 years ago
chengjuntao
63f3ada7b9
fix bug which input shape ( #22965 )
...
* fix bug which input shape, test=develop
* add error type,test=develop
5 years ago
Zhang Ting
137d6563fc
add check for assigned data, test=develop ( #22960 )
5 years ago
wangchaochaohu
f0d193a23c
Cast fusion for fusion group ( #22876 )
...
* add support for expression type convert and add cast Op support in fusion group
5 years ago
yaoxuefeng
29a7a52d38
Fix instag ( #22632 )
...
* update
* update test=develop
* update compile set test=develop
* update compile set test=develop
* update test=develop
* update test=develop
* update test=develop
* update compile setting test=develop
* update compile setting test=develop
* update run demo test=develop
* update test=develop
* update test=develop
* fix test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update format test=develop
* update format test=develop
* update style test=develop
* update style test=develop
* change style test=develop
* change style test=develop
* change style test=develop
* add dataset unittest test=develop
* update test=develop
* update for record test=develop
* update style for record test=develop
* update for record test=develop
* update for record test=develop
* update for record test=develop
* fix format test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* update test=develop
* fix compile warning test=develop
* add attr default test=develop
* add unittest test=develop
* fix style test=develop
* fix style test=develop
* change out_val_ifempty to out_val_if_empty test=develop
5 years ago
wangchaochaohu
c979c9f2b0
refine the profiler print test=develop ( #22968 )
5 years ago
Wilber
ff3ddbb502
add skip_layernorm pass. test=develop ( #22895 )
...
* add skip_layernorm pass. test=develop
5 years ago
wawltor
f154d5860f
Speed up the matmul op, use the gemm replace the batch gemm ( #22926 )
...
Use gemm in place of batched gemm to speed up the matmul op
5 years ago
Adam
056edf3929
Change ShareDataWith() to TensorCopy() in conv_mkldnn ( #22695 )
5 years ago
Zhaolong Xing
8d6dc102fe
[Ernie GPU Optimize]: Embedding_eltwise_layernorm Fuse ( #22494 )
...
* 1. add embedding eltwise layernorm fuse
2. add embedding eltwise layernorm op
3. refine inplace_add_relu
4. refine fc_eltwise_layernorm
test=develop
* 1. refine fc
test=develop
* fix comments
test=develop
* fix comments
test=develop
5 years ago
guofei
3d8571e884
modify assign op and add unittest of assign op ( #22769 )
...
As the title.
5 years ago
Zeng Jinle
d33c4343e1
Imperative tracer refactoring ( #22457 )
...
* refine grad maker, test=develop
* refactor tracer stage 1, test=develop
* merge develop to solve conflict third times, test=develop
5 years ago
liu zhengxi
61fef9754b
Fix fc padding bug during inference fusion ( #22860 )
...
* fix fc padding during fusion, test=develop
* fix optim model inference after SaveOptimModel, test=develop
5 years ago
tangwei12
ad9c8f6d2d
fix communicator when breaking under PyReader mode ( #22911 )
...
* fix communicator when breaking under PyReader mode, test=develop
* revert some vlog level to 0, test=develop
5 years ago
mapingshuo
5ba9dfc16a
add lookup_table_dequant_op ( #22900 )
...
add lookup_table_dequant_op
5 years ago
zhaoyuchen2018
a020a25797
Fix model int8 quant fail, test=develop ( #22891 )
...
The model fails when int8 quantization is enabled, so disable allocating memory on the CPU for small variables.
5 years ago
Zhaolong Xing
dd67d44a50
[Paddle-TRT] : (Part1) Dynamic shape support ( #22868 )
...
* change the ci trt from version 5. to 6.0
* paddle-trt dynamic shape support init
* conv+bias or conv+bn dynamic shape support
test=develop
* modify trt engine opconvert
test=develop
* fix ci error
test=develop
5 years ago
tangwei12
07e13b84cd
remove vlog, test=develop ( #22898 )
5 years ago
Zhang Ting
ca9c8b417d
fix compute ratio of profile, test=develop ( #22872 )
5 years ago
wangchaochaohu
dbb0b9b3b6
refine the profiler print ( #22823 )
...
* refine the profiler print test=develop
5 years ago
Michał Gallus
0038bfbd1d
Prevent loading of warmup data in analyzer_int8 if enable_int8 is set to false ( #22857 )
5 years ago
Chen Weihang
1644926a6c
Polish detail implement of dygraph data loader ( #22878 )
...
* polish detail implement of data loader, test=develop
* solve coverage ci problem, test=develop
5 years ago
Wilber
f686310d81
fix concat_mkldnn op. test=develop ( #22692 )
...
fix concat_mkldnn op when it encounters extreme conditions.
5 years ago
hong
5191e54494
reduce default attrs for dynamic graph ( #22850 )
...
* reduce default attrs for dynamic graph, test=develop
* add some explanations for explicit attr, test=develop
* tweak explicit attr comments, test=develop
5 years ago
Zhaolong Xing
1a533ed2de
[BUG]: Multihead matmul op's output size should be BxSx(N*H) ( #22848 )
...
test=develop
5 years ago
hong
c736fef93b
dygraph backward engine accelerate ( #22808 )
...
* fix loaded program load bug; test=develop
* first version
* speed backward engine; test=develop
* remove useless code; test=develop
* recover io.py; test=develop
* remove useless code; test=develop
* remove useless code; test=develop
5 years ago
Zeng Jinle
d41d802ba3
Add flags to limit gpu memory ( #22793 )
...
* add recorded cuda memory apis, fix typo, test=develop
* add more ut, test=develop
* follow comments, test=develop
* fix py35 incompatible issues, test=develop
5 years ago
石晓伟
1861ca88f1
serialize the PaddleTensor, test=develop ( #22810 )
...
* encapsulate the PaddleTensorToLoDTensor, test=develop
* serialize the pd_tensor, test=develop
* serialize tensors to file, test=develop
5 years ago
Zhang Ting
72ff5a09c3
fix print bug of profile, test=develop ( #22804 )
5 years ago
Zhang Ting
4e8bc02461
add fluid.device_guard to specify the device type for Op ( #22254 )
...
* add fluid.device_guard to specify the device type for Op
5 years ago
石晓伟
ddb9b46fec
change the function in op_teller, test=develop ( #22794 )
...
* change the function in op_teller, test=develop
* correct the commit-id, test=develop
5 years ago
Zhen Wang
89cfa49156
Unmerged fetch list ( #22635 )
...
* update ScopeBufferedSSAGraphExecutor&AsyncSSAGraphExecutor&ThreadedSSAGraphExecutor&FastThreadedSSAGraphExecutor&ParallelSSAGraphExecutor&ParallelExecutor for fetching unmerged results.
* add the unit test for fetch_unmerged.
* update ut for multi-card and multi-cpu.
* add the error message and the user suggestion in FetchOpHandle. test=develop
5 years ago
wangchaochaohu
8456c3f4dd
polish the profiler_help code ( #22811 )
5 years ago
zhongpu
2fd1ec1e3e
fix docker build for paddle openblas, test=develop ( #22795 )
5 years ago
Chen Weihang
7d8d573453
Speed up dygraph DataLoader based on shared memory and LoDTensor serialization ( #22541 )
...
* add lodtensor share memory & serialization, test=develop
* fix windows compile error, test=develop
* deal vartype pickle & fix unittest matching error message, test=develop
* update timeout variable name, test=develop
* refactor memory map implement, test=develop
* clear mmap file descriptors on unexpected exit, test=develop
* remove the child process fd in advance, test=develop
* remove mmap fds after Queue.put in child process, test=develop
* add hard unittests for register exit func, test=develop
* fix python2 compatibility problem in unittest, test=develop
* fix exception unittest error, test=develop
* polish code based on review comments, test=develop
5 years ago
liu zhengxi
324f2b3922
Fix inference c api PD_GetZeroCopyOutput lod ( #22768 )
...
* fix inference c api lod, test=develop
* fix capi lod problem and enrich tests, test=develop
* delete useless header files and alter const_cast, test=develop
5 years ago
wangchaochaohu
7578fcbac4
Profile code refine ( #22800 )
...
* add profiler_help.h to refine the code test=develop
5 years ago
hutuxian
53a2b68f4e
support customized download command in dataset ( #22782 )
...
* user can call dataset.set_download_cmd to set its customized download cmd
* add UT to cover this scenario
5 years ago
wangchaochaohu
ca9e77a8d4
add sum op support for fusion group ( #22771 )
...
* Add the codegen and auto fusion for sum Op in fusion group
5 years ago
tianshuo78520a
433cef03e5
fix typo word ( #22784 )
5 years ago
Kaipeng Deng
ebc7ffc300
fix detection_map. test=develop ( #22705 )
5 years ago
zhaoyuchen2018
72dde4abde
Refine adam op to improve performance, test=develop ( #22346 )
...
* Refine adam op, test=develop
* Fuse kernels together to reduce cpu time.
* Refine paddle enforce, test=develop
* Remove some comments, test=develop
* Refine code,test=develop
* Refine cuda kernel, test=develop
* Refine code according to comments, test=develop
5 years ago
wangguanzhong
f2d1cd119a
fix lod level, test=develop ( #22755 )
5 years ago
FlyingQianMM
79d712346f
Correct CPU gradients of the argsort op ( #22739 )
...
* Correct CPU gradients of the argsort op, form a network to test its forward and backward process, test=develop
* fix dynamic threshold error in test_argsort_op, test=develop
5 years ago
Adam
2b80e9a719
Add cpu_info without XBYAK ( #22716 )
5 years ago