Qiao Longfei
4e218dabc5
code format test=develop
6 years ago
Tao Luo
6375fe45d7
Merge pull request #16039 from luotao1/execution_context
...
remove legacy function in ExecutionContext
6 years ago
sneaxiy
814a759061
merge develop
...
test=develop
6 years ago
sneaxiy
597dc65e76
enhance gc
...
test=develop
6 years ago
liuwei1031
caadd0581d
add IfElse test case for ir memory optimize ( #15998 )
...
* add ir memory optimize test case for IfElse op, test=develop
* fix some unittest failures by force using the python memory_optimize, test=develop
* tweak comments, test=develop
* fix unittest, test=develop
* fix unittest, test=develop
6 years ago
Qiao Longfei
f28c258453
code clean test=develop
6 years ago
Qiao Longfei
8c38aca954
tmp commit
6 years ago
Tao Luo
f4587789d8
remove legacy function in ExecutionContext
...
test=develop
6 years ago
Liu Yiqun
1041e18c47
Refine codes.
...
test=develop
6 years ago
luotao1
c0b240aa43
try to fix distributed unit-test
...
test=develop
6 years ago
luotao1
784826a4f5
enhance cache runtime_context for different scope
...
test=develop
6 years ago
Qiao Longfei
fab1b54d99
Merge branch 'add-communicator' of ssh://github.com/jacquesqiao/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
Qiao Longfei
8744f9a083
fix parallel executor async mode
6 years ago
Qiao Longfei
e70b1727ef
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
Liu Yiqun
d8a939d8a8
Merge branch 'develop' into core_opt_choose_kernel
6 years ago
chengduo
e2da3a5b22
Revert "Add Event for TensorCopy" ( #16022 )
...
* Revert "Add Event for TensorCopy (#15953 )"
This reverts commit 7235fd662b.
test=develop
* fix CI
test=develop
6 years ago
luotao1
2fb38c108c
Merge branch 'develop' into runtime_context
6 years ago
sneaxiy
a9ea99d700
merge develop
6 years ago
Qiao Longfei
3691a46fa3
improve communicator
6 years ago
chengduo
ae37f82964
Unified ParallelExecutor and Compiler ( #15970 )
...
* Unified ParallelExecutor and Compiler
6 years ago
chengduo
7235fd662b
Add Event for TensorCopy ( #15953 )
...
Add Event for TensorCopy
6 years ago
luotao1
82b0bb9d72
fix cpplint error
...
test=develop
6 years ago
Liu Yiqun
d4674dab13
Cache the chosen kernel of operators.
...
test=develop
6 years ago
luotao1
9773f38f99
cache runtime_context
...
test=develop
6 years ago
tangwei12
6d5a04c1e7
add op type in check nan/inf ( #15986 )
...
* add op name in check nan/inf, test=develop
6 years ago
Qiao Longfei
847e4f4e85
pure async mode train
6 years ago
Qiyang Min
187cffd019
Merge pull request #15928 from velconia/imperative_backward_hooks
...
Imperative backward hooks
6 years ago
Yiqun Liu
798925453e
Revert "Optimize while_op when is_test is true. ( #15811 )" ( #15968 )
...
test=develop
6 years ago
minqiyang
e5f3435dd5
Add missing headers
...
test=develop
6 years ago
minqiyang
50639fafdb
Polish code
...
test=develop
6 years ago
Yiqun Liu
613d9d0756
Optimize while_op when is_test is true. ( #15811 )
...
test=develop
6 years ago
nhzlx
2eff3e26b6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx
06a088a199
fix comments and fix cpplint
...
test=develop
6 years ago
dzhwinter
225c11a91f
polish cudnn related code and fix bug. ( #15164 )
...
* staged.
* polish code
* polish code. test=develop
* polish code. test=develop
* api change. test=develop
* fix default value. test=develop
* fix default value. test=develop
6 years ago
Tao Luo
d5a888e15c
Merge pull request #15943 from kbinias/kbinias/add-placement-pass-tester
...
MKL-DNN: Add placement pass tester
6 years ago
Krzysztof Binias
72253391b6
Add MKL-DNN placement pass tester
...
test=develop
6 years ago
minqiyang
cb85ee987b
Remove var op deps in imperative mode
...
test=develop
6 years ago
Tao Luo
effec86600
Merge pull request #15913 from liangan1/func_coverage
...
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
6 years ago
Qiao Longfei
49f2f4f91d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
Qiao Longfei
f768fbf715
support multi graph
...
test=develop
6 years ago
Tao Luo
60546b78cc
Merge pull request #15923 from Sand3r-/mgallus/conv-residual-ut
...
Add Conv Residual Connection UT for Projection
6 years ago
Qiao Longfei
ff01d70583
fix style
...
test=develop
6 years ago
Qiao Longfei
dab7f36909
optimize code test=develop
6 years ago
Qiao Longfei
cf0511f21e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Michal Gallus
6a2bc9a275
Add Conv Residual Connection UT for Projection
...
test=develop
6 years ago
Qiao Longfei
43c82376cb
use one graph
6 years ago
dzhwinter
660e410655
Merge pull request #15855 from dzhwinter/fix/nightly_test
...
accelerate memory optimize process
6 years ago
minqiyang
b420ec3a92
invoke backward_hooks after reduce op's depcounts map
...
test=develop
6 years ago
Qiyang Min
4bd28b304b
Merge pull request #15831 from velconia/imperative_engine
...
Imperative training network to the end
6 years ago
Xin Pan
a6e3cd5eb7
Merge pull request #15425 from panyx0718/api
...
Pass graph to parallel executor instead of program
6 years ago
Qiao Longfei
b8491bfd4e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
liangan1
4acc522087
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
...
test=develop
6 years ago
Xin Pan
44e7fcddc5
Merge pull request #15844 from panyx0718/infer
...
add per kernel config and remove const_cast.
6 years ago
dzhwinter
a71f2fbe4f
fix default value. test=develop
6 years ago
Jacek Czaja
dec9cf53c8
[MKL-DNN] MKL-DNN specific Tensor modification ( #15429 )
...
* - Implemented draft of primitive desc keeping in Tensor
test=develop
- TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented
- Added nchw and nc formats setting for sake of compatibility
Fixed unit tests
- Workaround to problem with 5D data in conv
- Added 3D and 1D MKL-DNN formats for name handles for tensor
test=develop
- Fix to UTs
test=develop
- Conv fp32 op was updated
Cosmetic fixes
test=develop
- tensor mkldnn cosmetics
test=develop
- Moved most of mkl-dnn specific code from Tensor to mkl-dnn utils
* - Lint fixes
test=develop
* - setting prim desc in Tensor also sets layout to kMKLDNN
test=develop
* - Moved creation of prim desc totally out of Tensor
test=develop
* - Cosmetic fixes after review
test=develop
6 years ago
minqiyang
84bf4d7b06
Move ClearBlock into OpBase and VarBase's destructor
...
test=develop
6 years ago
Xin Pan
5dd281f738
polish
...
test=develop
6 years ago
Qiao Longfei
10393dd0d1
add some check test=develop
6 years ago
乔龙飞 Qiao Longfei
ec8e878200
Merge pull request #15840 from jacquesqiao/revert-15684-revert-15661-fix-cpu-broadcast
...
fix cpu broadcast
6 years ago
minqiyang
a15a3fc314
Polish code
...
test=develop
6 years ago
Qiao Longfei
2b7931d5c9
refine code test=develop
6 years ago
Qiao Longfei
b5b8e6cc9c
revert the change of scope test=develop
6 years ago
Xin Pan
8d83e38a6b
remove mutex
...
test=develop
6 years ago
Xin Pan
0362ef75f4
fix
...
test=develop
6 years ago
minqiyang
9dc64edfd9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_engine
...
test=develop
6 years ago
Xin Pan
12a0e2ed9d
polish codes
...
test=develop
6 years ago
Xin Pan
19d78f6797
polish
...
test=develop
6 years ago
Qiao Longfei
ecedd531c1
fix code bug test=develop
6 years ago
Qiao Longfei
f4f4816b0c
fix gpu error test=develop
6 years ago
Xin Pan
32d5a16036
resolve conflicts
...
test=develop
6 years ago
Qiao Longfei
3f9263f67e
optimize style test=develop
6 years ago
Qiao Longfei
4233d0a820
add more comment test=develop
6 years ago
Michał Gallus
c4faf36e7a
MKL-DNN: Add test for conv bias fuse pass ( #15824 )
...
* MKL-DNN: Add test for conv bias fuse pass
test=develop
* Remove const cast from Conv Bias Pass Test
* Add conv with bias test case for conv+bias fuse ut
test=develop
6 years ago
Qiao Longfei
3bccc1e6e2
optimize broadcast logic test=develop
6 years ago
Tao Luo
3831a4695d
Merge pull request #15862 from sfraczek/add-override-to-apply_impl
...
add override to ApplyImpl and precommit fixes
6 years ago
Xin Pan
26e32e095a
allow compiler to use graph
...
test=develop
6 years ago
Sylwester Fraczek
0b926114c0
add override to ApplyImpl
...
and #pragma once in edited headers
add #include<string> in edited headers
test=develop
6 years ago
Sylwester Fraczek
543e53db05
fix typo releated->related
6 years ago
Qiao Longfei
12f6b8c3d6
change the include of ThreadPool.h test=develop
6 years ago
Qiao Longfei
7f3be09045
fix multi graph test=develop
6 years ago
Qiao Longfei
9465c3d0c3
fix compile problem
6 years ago
Xin Pan
5eb87506bc
add per kernel config and remove const_cast.
...
test=develop
6 years ago
Qiao Longfei
31a05d3efd
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Qiao Longfei
62f1248ff5
fix use gpu test=develop
6 years ago
Xin Pan
6019054cdd
Merge pull request #15716 from Yancey1989/refine_pg
...
Refine ParallelGraph Execution
6 years ago
Dun
a83e470405
Profiler refine and add CUDA runtime api tracer ( #15301 )
...
* refine profiler && add runtime tracer
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* fix bug && test=develop
* add thread id map && test=develop
* test=develop
* testing
* bug fix
* remove cuda event && refine code && test=develop
* test=develop
* test=develop
* test=develop
* fix windows temp file && test=develop
* test=develop
* fix windows bug && test=develop
* fix start up issue && test=develop
* code polish && test=develop
* remove unused code && test=develop
* add some cupti cbid && test=develop
* add FLAGS_multiple_of_cupti_buffer_size && test=develop
* fix compile error && test=develop
* add keyword && test=develop
* fix && test=develop
* code polish && test=develop
6 years ago
Qiao Longfei
cc71e89499
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
minqiyang
f53e1d5c4b
implement ClearBlock
6 years ago
dzhwinter
f2e8409f5a
Merge pull request #15795 from dzhwinter/fix/block_desc
...
fix bug when op node has no block pointer
6 years ago
tensor-tang
e1c707fe9c
fix warnings ( #15790 )
...
* fix warnings
test=develop
* fix enforce test
test=develop
6 years ago
Yancey1989
4b193db14c
polish code test=develop
6 years ago
dzhwinter
6deb17ed8c
fix default value. test=develop
6 years ago
dzhwinter
089d262c41
fix default value. test=develop
6 years ago
dzhwinter
9c92d0304f
fix default value. test=develop
6 years ago
Yancey1989
d5090c892d
polish code test=develop
6 years ago
dzhwinter
28609b3435
Merge pull request #15696 from dzhwinter/cherry-pick/memory
...
cherry picked modifies.
6 years ago
Yancey1989
0f8bd73cc9
cleanup code test=develop
6 years ago
Yancey1989
5677c9d4ee
update comment test=develop
6 years ago
Yancey1989
642fd68ce0
update by comment test=develop
6 years ago
dzhwinter
d94a314db5
add reference. test=develop
6 years ago
dzhwinter
591ad33e32
polish code for reading. test=develop
6 years ago
dzhwinter
18afb77e78
polish code for reading. test=develop
6 years ago
Yan Chunwei
077d12b939
fix scale cleaner ( #15742 )
6 years ago
dzhwinter
684b572307
polish code for reading. test=develop
6 years ago
dzhwinter
3787e61fca
polish code for reading. test=develop
6 years ago
dzhwinter
c1455e606d
Merge remote-tracking branch 'origin/develop' into cherry-pick/memory
...
test=develop
6 years ago
dzhwinter
d376cf71b7
polish code for reading. test=develop
6 years ago
nhzlx
ecc12fb430
3. when running in trt mode, do not allocate memory for parameters in fluid.
...
test=develop
6 years ago
Yancey1989
7cd6de37f5
fix cpu test=develop
6 years ago
Yancey1989
bd0d44af24
fix build failed test=develop
6 years ago
Yancey1989
ecdd1166b8
cleanup code test=develop
6 years ago
Yancey1989
73005ee00d
cleanup code test=develop
6 years ago
Yancey1989
88d3dc949e
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into refine_pg
...
test=develop
6 years ago
Yancey1989
f3463ecb6e
refine pg execution
6 years ago
dzhwinter
283573c6aa
add details. test=develop
6 years ago
chengduo
5a03b515ae
fix potential bug in async_executor ( #15707 )
...
test=develop
6 years ago
乔龙飞 Qiao Longfei
45b19cbc9a
Revert "Revert "cpu reduce mode did not need to broadcast params test=develop""
6 years ago
chengduo
ad61e1b22c
fix potential bug ( #15688 )
...
test=develop
6 years ago
dzhwinter
6d6ddcfe15
add details. test=develop
6 years ago
dzhwinter
f9ac88e1a0
Merge pull request #15694 from liuwei1031/fix_security_issue
...
Fix security issue
6 years ago
dzhwinter
11afbe0f53
add details. test=develop
6 years ago
tensor-tang
e49706c80e
Merge pull request #15659 from GBuella/add_to_string
...
Tests - add some missing to_string calls
6 years ago
liuwei1031
b1f97a6fa9
fix security issue 27, 38 test=develop
6 years ago
Gabor Buella
da9c94da33
Clang build fixes ( #15628 )
...
* Remove some superfluous std::move calls
The std::move triggered a build error (with -Werror):
```
[ 9%] Building CXX object paddle/fluid/memory/allocation/CMakeFiles/allocator_facade.dir/allocator_facade.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: note: remove std::move call here
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^~~~~~~~~~ ~
1 error generated.
```
See: https://reviews.llvm.org/D7633
* Remove a superfluous lambda capture from framework/operator.h
```
[ 10%] Building CXX object paddle/fluid/platform/CMakeFiles/device_context.dir/init.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/platform/init.cc:19:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.h:229:21: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
[this](Variable* var) { return var; });
^~~~
1 error generated.
```
Changing it to `return it->second;`, as is in the function below.
* Rethrow an exception (instead of copying it)
```
[ 11%] Building CXX object paddle/fluid/framework/CMakeFiles/operator.dir/operator.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: error: local variable 'exception' will be copied despite being thrown by name [-Werror,-Wreturn-std-move]
throw exception;
^~~~~~~~~
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: note: call 'std::move' explicitly to avoid copying
throw exception;
^~~~~~~~~
std::move(exception)
```
See https://reviews.llvm.org/D43322 for an explanation of this diagnostic message.
* Remove an unused variable
```
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:884:16: error: private field 'scope_' is not used [-Werror,-Wunused-private-field]
const Scope& scope_;
^
```
* struct ComputationOpHandle -> class ComputationOpHandle
```
[ 13%] Building CXX object paddle/fluid/framework/details/CMakeFiles/memory_early_delete_pass.dir/memory_early_delete_pass.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/memory_early_delete_pass.cc:21:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: error: class 'ComputationOpHandle' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class ComputationOpHandle;
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/computation_op_handle.h:29:8: note: previous use is here
struct ComputationOpHandle : public OpHandleBase {
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: note: did you mean struct here?
class ComputationOpHandle;
^~~~~
struct
1 error generated.
```
* Fix name() methods under fluid/operators
```
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.cc:15:
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.h:19:
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/jitcode.h:71:23: error: 'name' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
virtual const char* name() const = 0;
^
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen_base.h:31:23: note: overridden virtual function is here
virtual const char* name() const = 0;
^
```
test=develop
6 years ago
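Editor's note on the first item of the commit above (`-Wpessimizing-move`): a minimal sketch, not Paddle's code. `Chunk`, `CreateChunk`, and `MakeChunk` are hypothetical stand-ins for the allocator factory named in the log. Returning the temporary directly lets the compiler elide the copy, whereas wrapping it in `std::move` forces a move and triggers the diagnostic.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for an allocator chunk; not a Paddle type.
struct Chunk {
  int capacity;
};

// Hypothetical factory that returns a temporary (a prvalue).
std::unique_ptr<Chunk> CreateChunk(int capacity) {
  assert(capacity > 0);
  return std::unique_ptr<Chunk>(new Chunk{capacity});
}

// The form Clang flags would read:
//   return std::move(CreateChunk(capacity));
// Moving a temporary prevents copy elision, so the plain return is preferred.
std::unique_ptr<Chunk> MakeChunk(int capacity) {
  return CreateChunk(capacity);
}
```

Calling `MakeChunk(8)` yields a chunk whose `capacity` is 8, with no `std::move` involved.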
kolinwei
acfe28d5eb
Merge pull request #15684 from PaddlePaddle/revert-15661-fix-cpu-broadcast
...
Revert "cpu reduce mode did not need to broadcast params test=develop"
6 years ago
Xin Pan
d670d8ef1d
Merge pull request #15671 from cjld/fix_graph
...
fix bug CreateControlDepVar duplicate name
6 years ago
乔龙飞 Qiao Longfei
6e0e706198
Revert "cpu reduce mode did not need to broadcast params test=develop"
6 years ago
Qiao Longfei
97b143fb49
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-cpu-broadcast
...
test=develop
6 years ago
dzhwinter
381f2015a5
Merge pull request #15665 from dzhwinter/experiment/refactor_memory
...
refactor optimize pass.
6 years ago
Qiao Longfei
ffd0d1d216
clean need_broadcast_var_ test=develop
6 years ago
Qiao Longfei
fbadd4b60c
follow comment test=develop
6 years ago
dzhwinter
04e9776aef
add details. test=develop
6 years ago
baojun
f4a0e68481
Fix ngraph compile WITH_DISTRIBUTE=ON ( #15636 )
...
* fix compile issue with_distribute test=develop
* simplified logic test=develop
* use ngraph dependency test=develop
* set cpu only test=develop
* update test and eliminate fp16 test test=develop
6 years ago
Dun Liang
1905f1a108
bug fix && test=develop
6 years ago
Qiao Longfei
2171aa77f1
async ssa exe only support local mode
6 years ago
Qiao Longfei
c4ded17e8c
async mode support dist train
6 years ago
Qiao Longfei
84367cf8bc
support async mode in dist mode parallel executor
6 years ago
Qiao Longfei
e72637ddd2
ThreadedSSAGraphExecutor support num_iteration_per_run test=develop
6 years ago
Qiao Longfei
a7152613f7
Merge branch 'fix-cpu-broadcast' of ssh://github.com/jacquesqiao/Paddle into add-communicator
6 years ago
Qiao Longfei
76072261f8
fix compiler
...
test=develop
6 years ago
Qiao Longfei
b99db0e2c2
cpu reduce mode did not need to broadcast test=develop
6 years ago
Qiao Longfei
5cf0092825
add more log and fix test_dist_base in multi_batch_merge_pass
6 years ago
Gabor Buella
4975a9050a
Tests - add some missing to_string calls
...
```
/home/tej/code/gbuella_paddle/paddle/fluid/framework/ir/seqpool_concat_fuse_pass_tester.cc:167:40: error: adding 'int' to a string does not append to the string [-Werror,-Wstring-plus-int]
std::string prefix = "seqpool_op_" + i;
~~~~~~~~~~~~~~^~~
/home/tej/code/gbuella_paddle/paddle/fluid/framework/ir/seqpool_concat_fuse_pass_tester.cc:167:40: note: use array indexing to silence this warning
std::string prefix = "seqpool_op_" + i;
^
& [ ]
1 error generated.
```
test=develop
6 years ago
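Editor's note on the commit above (`-Wstring-plus-int`): a minimal sketch, assuming a helper named `MakePrefix` that is not part of the original test file. In `"seqpool_op_" + i` the literal decays to `const char*`, so the `+` performs pointer arithmetic instead of appending the number. Converting the integer with `std::to_string` first gives the intended concatenation.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper illustrating the fix; the name is an assumption.
std::string MakePrefix(int i) {
  assert(i >= 0);
  // Buggy form flagged by Clang:
  //   std::string prefix = "seqpool_op_" + i;  // offsets the char pointer by i
  // Fixed form: convert the integer to text, then concatenate.
  return "seqpool_op_" + std::to_string(i);
}
```

With this helper, `MakePrefix(3)` produces the string `seqpool_op_3`.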
Qiao Longfei
b1fe8d4570
add a check for async_ssa_graph_exe test=develop
6 years ago
Qiao Longfei
16af1dbc7b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Qiao Longfei
381f383989
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
wopeizl
c1e18b13aa
Merge pull request #15635 from wopeizl/fixbuildissue
...
fix the build issue on gpu mode for win
6 years ago
dzhwinter
5d30b55de1
rerun ci. test=develop
6 years ago
dzhwinter
4ef34916a4
enhanced print message. test=develop
6 years ago
peizhilin
238ef94702
fix the build issue on gpu mode for win
...
test=develop
6 years ago
dzhwinter
ce0394bcd0
merge develop branch. test=develop
6 years ago
Xin Pan
74bc55c2a6
Merge pull request #14975 from dzhwinter/ir_inplace_pass
...
Ir inplace pass
6 years ago
dzhwinter
cca71532eb
add skip send.recv test=develop
6 years ago
dzhwinter
9f001c6525
skip dist. test=develop
6 years ago
Yan Chunwei
dc5e25fc7f
remove dot marked node ( #15606 )
6 years ago
dzhwinter
2561a6fc59
follow comment. test=develop
6 years ago
dzhwinter
2a5ecb68b0
follow comment. test=develop
6 years ago
dzhwinter
9f693fcac4
rerun ci. test=develop
6 years ago
dzhwinter
e537634d16
delete graph print pass. test=develop
6 years ago
dzhwinter
4f01de6378
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
liuwei1031
6e84eb131f
expose peak gpu memory API to python test=develop ( #15529 )
...
* expose peak gpu memory API to python test=develop
* add unittest for peak gpu memory monitoring test=develop
* add pybind change test=develop
* add mutex to gpu mem usage monitor test=develop
* update benchmark flag definition file test=develop
* tweak unittest for memory monitoring test=develop
6 years ago
dzhwinter
5cab99a686
fuck windows. rerun windows ci. test=develop
6 years ago
dzhwinter
9c9ad7d40b
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
...
test=develop
6 years ago
dzhwinter
0a63234c85
follow comments. test=develop
6 years ago
Yan Chunwei
897789b16e
fix save_inferece_model bug ( #15365 )
6 years ago
dzhwinter
9e87fbebb7
rerun windows ci. test=develop
6 years ago
dzhwinter
6f9904e99a
rerun windows ci. test=develop
6 years ago
dzhwinter
a52be7c081
refine build strategy. test=develop
6 years ago
dzhwinter
32a2014939
refine build strategy. test=develop
6 years ago
Krzysztof Binias
b1bdcd4de8
Make separate folders for mkldnn codes
...
test=develop
6 years ago
dzhwinter
06f2448848
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
dzhwinter
8156fedf56
merge develop branch. test=develop
6 years ago
Qiao Longfei
d6c0dcaa16
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Jiabin Yang
fd286f3596
Merge pull request #15534 from JiabinYang/fix/multi_output_support_imperative
...
test=develop, fix/multi_output_support_imperative
6 years ago
Qiao Longfei
c7e3868007
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
Qiao Longfei
02dab46ab8
add some debug info
6 years ago
dzhwinter
ee3aae56cd
merge develop branch. test=develop
6 years ago
dzhwinter
d6d3e6afe2
add more skip strategy
6 years ago
Yan Chunwei
b62b756b28
add version support ( #15469 )
6 years ago
tensor-tang
3c224e7e79
Merge pull request #15537 from baojun-nervana/rm_ngraph_operator
...
rm ngraph_operator.cc test=develop
6 years ago
Zhaolong Xing
97b76c94c4
Merge pull request #15242 from NHZlX/trt_int8_ultimate_version
...
add trt int8 support
6 years ago
Jiabin Yang
10bc9ffc2d
Merge pull request #15518 from JiabinYang/fix/refine_error_message
...
test=develop, refine_error_message for data type
6 years ago
Qiao Longfei
be738a646e
add some debug info
6 years ago
Qiao Longfei
62549e0714
add GenParentScopeTreeDebugInfo
6 years ago
dzhwinter
2739096eec
compatible with python side mem_opt
6 years ago
Qiao Longfei
a66115bed5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Qiao Longfei
fab8457e6b
code optimize
6 years ago
gongweibao
d303270a0e
revert test=develop ( #15535 )
6 years ago
baojun-nervana
8e9308a51a
mv ngraph_bridge to ngraph directory test=develop
6 years ago
baojun-nervana
da3f9cc512
rm ngraph_operator.cc test=develop
6 years ago
Qiao Longfei
ada43e89c3
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
JiabinYang
5639f49b16
test=develop, fix/multi_output_support_imperative
6 years ago
gongweibao
d54494ba87
cleanup test=develop ( #15347 )
6 years ago
JiabinYang
c52f57de5b
test=develop, refine_error_message for data type
6 years ago
baojun
efce25673c
Adding ngraph_engine_op ( #14948 )
...
* enable ngraph_engine_op
test=develop
* merge develop test=develop
* avoid const_cast test=develop
* rm ngraph_operator test=develop
* Added TODO to move EnableNgraph test=develop
* Add TODO to remove const_cast test=develop
6 years ago
Yiqun Liu
3008fa1261
Add the CUDA kernel for beam_search op ( #15020 )
...
* Refine the beam_search op and test.
* A basic CUDA implementation of beam_search for small batch_size.
* Implement CUDA kernel for beam_search_op.
* Use multiple CUDA threads in the same block to select the top beam.
* Update the python api of beam_search op.
* Enable extend function in CPU kernel of beam_search op.
* Unify the CUDA codes.
test=develop
* Unify the CPU kernel of beam_search op.
* Ensure the selected items of beam_search_op's CPU kernel sorted by scores.
* Update the description of beam_search in API.spec.
* Enable the use of CUDA kernel in beam_search op.
* Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debugging statements.
test=develop
* Follow comments.
test=develop
* Call the CPU kernel for beam_search op when batch_size > 4.
test=develop
* Remove the except of is_empty op in PrepareData.
test=develop
6 years ago
nhzlx
0779e35544
fix two bugs:
...
1. graph and program_desc alignment
2. trt stream
test=develop
6 years ago
Qiao Longfei
ca5d96bb3d
complete send lod tensor
6 years ago
Zeng Jinle
dec89bd7ed
Merge pull request #15460 from sneaxiy/try_to_turn_on_remove_unnecessary_lock
...
Turn on remove_unnecessary_lock by default
6 years ago
Xin Pan
58cb18d9d9
Merge pull request #15322 from velconia/imperative_resnet
...
Imperative Resnet
6 years ago
Qiao Longfei
be72940b76
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
sneaxiy
ef788603d4
merge develop
...
test=develop
6 years ago
sneaxiy
d8568acd19
turn on remove_unnecessary_lock
...
test=develop
6 years ago
sneaxiy
eac5a0aa0c
Merge develop
...
test=develop
6 years ago
WangZhen
3ce6172052
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
6 years ago
dzhwinter
8f3b252392
squash commits. test=develop
6 years ago
Yan Chunwei
885c4e57ab
fea/infer memory optim2 ( #14953 )
6 years ago
minqiyang
8ce198b2e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
...
test=develop
6 years ago
Dun
9f8f0fc2d3
Memory optimization of depthwise conv op and group norm op ( #15313 )
...
* mem opt
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine with cub test=develop
* fix mkldnn test && remove comments && test=develop
* polish code && test=develop
* add only_forward test && test=develop
6 years ago
WangZhen
e2ff300b02
add UT for quantization.
6 years ago
WangZhen
451896fce4
init quantization.
6 years ago
Qiao Longfei
f3210b60ba
fix copy_memory and share_memory
6 years ago
Qiao Longfei
9958775b31
add NewTmpScope to scope
6 years ago
Qiao Longfei
7021979bc2
init communicator
6 years ago
Qiao Longfei
69484f71e0
remote communicator
6 years ago
Qiao Longfei
88d71fa2f9
support num_iteration_per_run
6 years ago
gongweibao
7cd4dd7ce4
Hide varhandle members. ( #15382 )
6 years ago
Qiao Longfei
ea66979684
can run
6 years ago
Qiao Longfei
afda840126
init communicator
6 years ago
Qiao Longfei
92a6c7a049
init async ssa executor
6 years ago
tensor-tang
3759c1db8c
Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
...
Enable element_wise_add operator for a ngraph engine
6 years ago
mozga-intel
cba729404d
Enable softmax operator for a ngraph engine
...
test=develop
6 years ago
tensor-tang
a7fc3d42a0
Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub
...
Fuse/second order mul sub and fuse repeated fc relu
6 years ago
乔龙飞 Qiao Longfei
b14d4cdd75
Merge pull request #14890 from jacquesqiao/multithread-sparse-adam
...
adam support multithread
6 years ago
Qiao Longfei
9b4fe283e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
peizhilin
5e450833bd
test=develop
6 years ago
peizhilin
eea75a1d93
fix issue when type is invalid
...
test=develop
6 years ago
peizhilin
9adb158e5b
Merge remote-tracking branch 'upstream/develop' into debug/support
6 years ago
tensor-tang
84b0ecdcce
Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub
...
test=develop
6 years ago
chengduo
46d01d798e
Revert "Revert "Remove workspace_handle in conv_cudnn ( #15186 )"" ( #15290 )
...
test=develop
This reverts commit 358e657f68.
6 years ago
tensor-tang
d618e48309
fix fuse square mat order and refine test
...
test=develop
6 years ago
tensor-tang
a5d2a6d1ad
add fuse pass of squared mat sub fusion
6 years ago
tensor-tang
ca6fdc6e33
refine and fix test
...
test=develop
6 years ago
tensor-tang
a89296ac1f
add repeated fc relu pass
6 years ago
Xin Pan
50b4ac08b0
fix
...
test=develop
6 years ago
Xin Pan
a1bfb35dd6
try fix py2
...
test=develop
6 years ago
Xin Pan
6a18c0f9ff
Merge pull request #15278 from chengduoZH/revert_remove_workspace_handle_in_conv2d_cudnn
...
Revert "Remove workspace_handle in conv_cudnn (#15186 )"
6 years ago
Zhaolong Xing
98e85f3735
add_transpose_flatten_concat_fuse ( #15121 )
6 years ago
chengduozh
358e657f68
Revert "Remove workspace_handle in conv_cudnn ( #15186 )"
...
test=develop
This reverts commit 064512aa47.
6 years ago
tensor-tang
fc9fbab6a0
Merge pull request #15271 from tensor-tang/fix/typo
...
fix typo and refine
6 years ago
chengduo
064512aa47
Remove workspace_handle in conv_cudnn ( #15186 )
...
* remove workspace_handle in conv2d_cudnn
test=develop
* remove workspace_handle
test=develop
* fix bug
test=develop
* make test_conv2d_op SERIAL
test=develop
* save memory in conv_cudnn
test=develop
* enhance thread safety
test=develop
* enhance temporary allocator
test=develop
* Add excess fraction
test=develop
* follow comments
test=develop
* fix bug and code refine
test=develop
* fix memory size check
test=develop
* rename reuse_tmp_allocation_excess_fraction
test=develop
6 years ago
tensor-tang
c3a9f3c4b2
fix typo and refine
...
test=develop
6 years ago
tensor-tang
ab9c4b2a9f
refine seqpool concat pass and remove unused nodes
...
test=develop
6 years ago
tensor-tang
ce909664d8
Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
6 years ago
flame
fb63cd89d4
Add python ir graph API ( #14917 )
6 years ago
tensor-tang
a0a27bd240
add seqpool concat fuse pass tester
...
test=develop
6 years ago
sneaxiy
594dc4d8f0
partial gc 1st version
...
test=develop
6 years ago
tensor-tang
8e086a8521
follow comment and fix typo
...
test=develop
6 years ago
tensor-tang
48410b9bfe
Merge pull request #15237 from tensor-tang/fuse/seqpool_concat_2
...
Fuse/seqpool concat 2
6 years ago
peizhilin
c1235c935f
add the enable_debug flag
...
test=develop
6 years ago
Xin Pan
7b73fc9e1a
Merge pull request #15089 from panyx0718/api
...
try unify Executor and ParallelExecutor
6 years ago
tensor-tang
f8c305b243
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat_2
...
test=develop
6 years ago
Zeng Jinle
e29f10d315
Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var
...
Remove op handle lock and fix var
6 years ago
Zeng Jinle
7b638f2781
Merge pull request #15218 from sneaxiy/fix_same_name_func
...
Fix same name func framework::ToTypeIndex
6 years ago
mozga-intel
a42f8f4f6f
Enable element_wise_add operator for a ngraph
...
test=develop
6 years ago
tensor-tang
72d2a1801e
add seqpool concat fuse pass
...
test=develop
6 years ago
sneaxiy
bc205ef374
fix same name func
...
test=develop
6 years ago
xuezhong
c0bc818688
Merge pull request #15188 from velconia/add_pyramid_dnn_support
...
Add no lock optimization pass
6 years ago
sneaxiy
4a443ffc98
merge develop
...
test=develop
6 years ago
sneaxiy
7c7342bf12
fix scope.var()
...
test=develop
6 years ago
Tao Luo
4d9aa1745a
Merge pull request #14806 from mozga-intel/mozga-intel/scale_operator_ngraph
...
Enable scale operator for a ngraph engine
6 years ago
peizhilin
a6f5ceee74
add the python callstack for debug support test=develop
6 years ago
minqiyang
b76695418a
Polish log
...
test=develop
6 years ago
minqiyang
1bfbc0d963
Polish code
...
test=develop
6 years ago
minqiyang
7f45b9511a
Polish code
6 years ago
minqiyang
68a07328fa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_pyramid_dnn_support
...
test=develop
6 years ago
Qiao Longfei
44b300556d
change min_row_size_to_use_multithread to parameter of adam
...
test=develop
6 years ago
Qiao Longfei
87b4eb1da4
change min_param_size_to_use_multithread to min_row_size_to_use_multithread
6 years ago
minqiyang
4bfa110fd8
Add no lock optimize pass
...
test=develop
6 years ago
chengduo
eabb2105fa
Refactor MultiDevSSAGraphBuilder ( #15090 )
...
* Refactor ParallelExecutor
test=develop
* extract Reduce and AllReduce mode from MultiDevSSAGraphBuilder
test=develop
* Refactor MultiDevSSAGraphBuilder
test=develop
* Remove enable_data_balance
test=develop
* code refine
test=develop
* remove data balance
test=develop
* refine ScaleLossGradOp
test=develop
* remove unnecessary file
test=develop
* code refine
test=develop
* modify function name
test=develop
* follow comments
test=develop
* add is_distribution field
test=develop
* set is_distribution
test=develop
* fix DistSSAGraphBuilder
test=develop
6 years ago
Yan Chunwei
875a07c32d
refactor inference analysis api ( #14634 )
6 years ago
mozga-intel
e77956c920
Enable mean operator for a ngraph
...
test=develop
6 years ago
mozga-intel
dd768714ab
Enable scale operator for a ngraph
...
test=develop
6 years ago
Qiao Longfei
17b1b660fc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
baojun-nervana
f0cde74564
Update ngraph with elt-wise relu test=develop
6 years ago
Xin Pan
8ae9094e07
polish and resolve conflicts
...
test=develop
6 years ago
Xin Pan
5e928e579a
try unify Executor and ParallelExecutor
...
test=develop
6 years ago
Yan Xu
a1e60ab19b
Merge pull request #14791 from Yancey1989/parallel_graph_mode
...
[Feature] Add ParallelGraph executor mode in parallelexecutor to improve performance
6 years ago
Yancey1989
4ad9de74dd
disable sync nccl by default test=develop
6 years ago
Yancey1989
db603398b7
disable parallel graph executor by default
6 years ago
Xin Pan
087af6a686
Merge pull request #15131 from panyx0718/clean
...
hide temp tensor allocation
6 years ago
Yancey1989
e65436103f
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
...
test=develop
6 years ago
Yancey1989
94c80347b6
update by comment
6 years ago
Qiyang Min
23761beaef
Merge pull request #14971 from velconia/imperative_mnist
...
Imperative Optimizer
6 years ago
Wu Yi
227e0c4518
fix nccl2 mode startup test=develop ( #15132 )
6 years ago
Xin Pan
9186451f60
hide GetTensor
...
test=develop
6 years ago
Yancey1989
35cda13e9f
fix unittest test=develop
6 years ago
minqiyang
2547f9d1b8
Polish code
...
test=develop
6 years ago
minqiyang
09e2e66236
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_mnist
6 years ago
Yancey1989
0a885ac12a
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
...
test=develop
6 years ago
Yancey1989
ca8c77d966
select execution according to strategy test=develop
6 years ago
minqiyang
858e903231
Add unittest for operator
...
test=develop
6 years ago
wopeizl
7ab501264d
Merge pull request #15069 from wopeizl/windows/dsosupport
...
add cuda dso support for windows
6 years ago
minqiyang
6a5f604607
Support stop_gradients var in imperative backward
...
test=develop
6 years ago
guru4elephant
ff739449ab
Merge pull request #15018 from guru4elephant/add_timer
...
Add debug thread function for async executor
6 years ago
Qiyang Min
e29cbfe4f7
Merge pull request #14829 from velconia/accelerate_ddpg
...
Accelerate little models
6 years ago
Tao Luo
9c2cbfb89e
Merge pull request #15093 from baojun-nervana/intel/cmake
...
Upgrade ngraph & clean up cmake
6 years ago
Zeng Jinle
25b49a0896
Merge pull request #14933 from sneaxiy/rewrite_ddim
...
Rewrite ddim
6 years ago
Wu Yi
a8bc05b5ff
Refactor distributed RPC ( #15075 )
...
* wip
* wip
* refactor no.1 dir structure test=develop
* fix linking test=develop
* fix includes test=develop
* fix build test=develop
* fix build test=develop
6 years ago
baojun-nervana
555fbc10d8
upgrade ngraph to v0.10.1 test=develop
6 years ago
baojun-nervana
c714c36482
simplify logic test=develop
6 years ago
Xin Pan
3e8408429d
Merge pull request #15053 from panyx0718/imperative_hold
...
refactor to avoid scope.
6 years ago
sneaxiy
73896eeb94
merge develop
...
test=develop
6 years ago
Wu Yi
e26cced7cc
refine batch merge pass ( #14777 )
...
* refine batch merge pass
* refine batch merge pass test=develop
6 years ago
Yancey1989
4743c9cd5d
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
sneaxiy
9a3a246cb5
fix py35 compile error
...
test=develop
6 years ago
Zhaolong Xing
4048cfa9da
Merge pull request #15048 from NHZlX/add_affine_channel_fuse
...
Add conv+ affine channel fuse pass
6 years ago
minqiyang
ef7d563db9
Add changes back
...
test=develop
6 years ago
minqiyang
a318a490ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_ddpg
...
test=develop
6 years ago
Zeng Jinle
c0bcff00dc
Merge pull request #14962 from sneaxiy/rewrite_variable_type
...
Rewrite variable type
6 years ago
chengduo
fe8495a758
[WIP] Refine MultiDevSSAGraph ( #15040 )
...
* refine parallel_exe
test=develop
* rename shared_var_device
* code refine
* add test_weight_decay
* remove Sort
test=develop
* Add SortForReduce
test=develop
* code refine
test=develop
* follow comment
test=develop
6 years ago
dongdaxiang
82335cd88c
Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
...
test=develop
6 years ago
wopeizl
719ebe3786
Merge pull request #15070 from wopeizl/windows/testcasefix
...
fix test issues on windows
6 years ago
Xin Pan
b91a7a9d30
clear operator changes
...
test=develop
6 years ago
Xin Pan
f52b514dcd
call kernel
6 years ago
Xin Pan
4e80e04f23
fix
...
test=develop
6 years ago
Xin Pan
61491ce250
clean
...
test=develop
6 years ago
Xin Pan
ce7e503cbe
refactor to avoid scope.
...
test=develop
6 years ago
Qiyang Min
0238a3bb4f
Merge pull request #14972 from velconia/accelerate_lstm
...
Accelerate PADDLE_ENFORCE
6 years ago
Houjiang Chen
242d3c71a6
Merge pull request #15031 from hjchen2/develop
...
Fix conv_elementwise_add2_act pass
6 years ago
Xin Pan
71a4a8e981
Merge pull request #15071 from wopeizl/revert/15035
...
Revert "cherry-pick the #12759"
6 years ago
Qiao Longfei
3b294e2e2e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
6 years ago
sneaxiy
c4ce2e7b21
merge develop, solve conflict
...
test=develop
6 years ago
minqiyang
8ed0233924
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_ddpg
...
test=develop
6 years ago
Zeng Jinle
9c6a0203e2
Merge pull request #15073 from sneaxiy/add_scope_pool
...
Add scope_pool
6 years ago
sneaxiy
b56aca82e9
merge develop
...
test=develop
6 years ago
sneaxiy
ee83ce75bf
try to fix py35 compile error
...
test=develop
6 years ago
sneaxiy
3e917a934a
add scope_pool
...
add module cleanup
test=develop
6 years ago
Yancey1989
1a4f79a7de
fix unittest test=develop
6 years ago
Yancey1989
86bb583881
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
Yancey1989
495e73d766
enable gc
6 years ago
Yancey1989
28cdfbc2b0
delete comment code
6 years ago
Yancey1989
845bfd5807
cleanup code
6 years ago
peizhilin
2388d0e7d6
Revert "cherry-pick the #12759"
...
test=develop
This reverts commit 7f6d8acecb.
6 years ago
peizhilin
01c00b07dd
fix test issues on windows
...
test=develop
6 years ago
peizhilin
1e7f83e60a
add cuda dso support for windows
...
test=develop
6 years ago
tangwei12
dc8eca826e
code style fix, test=develop ( #15045 )
...
* code style fix, test=develop
6 years ago
Yancey1989
41a64f6a2a
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
nhzlx
a6aa8ea771
faster rcnn input is persistable. (fix it in paddle-trt)
...
test=develop
6 years ago
hjchen2
956cf92145
Fix conv_elementwise_add2_act pass
...
test=develop
6 years ago
Tao Luo
69659f4ae2
Merge pull request #15037 from jianhang-liu/fix/abnormal_stack_op_time
...
Fix/abnormal stack op time
6 years ago
sneaxiy
179acc60b3
fix conflict with develop
...
test=develop
6 years ago
wopeizl
09bd8fa67a
Merge pull request #15035 from wopeizl/debug/improvement1
...
cherry-pick the #12759
6 years ago
sneaxiy
dde3afe7b7
Merge develop
...
test=develop
6 years ago
dongdaxiang
2df1d80767
Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
...
test=develop
6 years ago
Wu Yi
856f0da0fe
Fp16 training ( #14992 )
...
* wip
* wip
* wip
* wip for test
* add fp16 tests test=develop
* fix cpu build test=develop
* fix test=develop
* fix py3 tests test=develop
* fix lr_scheduler dtype test=develop
* fix test=develop
* test fix ci compile test=develop
* fix build and merge test=develop
* fallback momentumop change to general test=develop
* make fp16 lr schedule simple test=develop
* fix ut test=develop
* fix tests test=develop
* remove fp16 learning rate cast test=develop
6 years ago
Brian Liu
e821b12f57
Fix issue which causes abnormal CPU usage in stack op
...
The stack op has a much higher CPU cost than expected in release mode.
This is caused by DebugStringEx() in the base class OperatorWithKernel.
The issue actually occurs for every op that doesn't implement its own
GetExpectedKernelType().
test=develop
6 years ago
chengduo
b9fb03cf54
Move GetTensor to tensor_util ( #15011 )
...
* refine tensor
test=develop
* refine tensor
test=develop
* fix device_context log
test=develop
6 years ago
nhzlx
73b47df1f4
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_affine_channel_fuse
...
test=develop
6 years ago
nhzlx
ce3782c193
add affine_channel fuse.
...
fix conv+elementwise fuse bug.
6 years ago