Wojciech Uss
2579ade45f
Add cpu_quantize_pass for C-API quantization ( #16127 )
...
* Add cpu_quantize_pass for C-API quantization
test=develop
* add cpu_quantize_pass test
* fix lint: add include memory unordered_map and unordered_set
test=develop
* fuse_relu 1
test=develop
* tuned 2 without squash
* fixes
test=develop
* remove unused vars
test=develop
* refactored
test=develop
* fix lint c-style cast -> C++ style cast
test=develop
* remove QuantMax and c style casts
test=develop
* last usage of QuantMax removed
test=develop
* Fix Analysis Predictor UT
Check if memory_optimize_pass has already been added
to the analysis config before adding a new one, so
that it is not added multiple times.
test=develop
* change map to unordered_map
fix the forgotten part of cpu_quantize_pass_tester.cc
test=develop
* removed quantized attribute
* fixed cpu_quantize_pass_tester and op attr comments
test=develop
* removed redundant line
test=debug
* removed gmock
test=develop
* fix after merge
6 years ago
qingqing01
86e912c544
Fix windows compiling ( #16230 )
...
test=develop
6 years ago
minqiyang
362253732c
Polish code
...
test=develop
6 years ago
minqiyang
c0ddb93ccc
Polish code
...
test=develop
6 years ago
minqiyang
b5078c211a
Make infer var type virtual
...
test=develop
6 years ago
minqiyang
438bca9c3d
Implement Runtime Var Type Inference
...
test=develop
6 years ago
luotao1
46ee6bb1aa
fix distributed unit-tests
...
test=develop
6 years ago
luotao1
1b59bed989
Merge branch 'develop' into runtime_context
6 years ago
luotao1
6ce25c99a0
Merge branch 'develop' into runtime_context
6 years ago
qingqing01
8ad672a287
Support sync batch norm. ( #16121 )
...
* Support Sync Batch Norm.
* Note, do not enable it in one device.
Usage:
    build_strategy = fluid.BuildStrategy()
    build_strategy.sync_batch_norm = True
    binary = fluid.compiler.CompiledProgram(tp).with_data_parallel(
        loss_name=loss_mean.name,
        build_strategy=build_strategy)
6 years ago
minqiyang
ca392c7e97
Implement infer var type context
6 years ago
liuwei1031
1c6caf8466
1. disable reuse SELECTED_ROWS type variable ( #16150 )
...
2. remove lod check in reshape op
test=develop
6 years ago
Wojciech Uss
b9252f3df8
Add cpu_quantize_squash_pass for C-API quantization ( #16128 )
...
* Add cpu_quantize_squash_pass for C-API quantization
test=develop
* add cpu_quantize_squash_pass test
* fix lint: add include memory unordered_map and unordered_set
test=develop
* lint fix 2
* fixes
test=develop
* refactored
test=develop
* fix windows ci
test=develop
6 years ago
luotao1
b2898c0f57
Merge branch 'develop' into runtime_context
...
test=develop
6 years ago
sneaxiy
a7a4f053da
Merge develop
...
test=develop
6 years ago
Tao Luo
4ef6f738c3
Merge pull request #16154 from luotao1/infershape_example
...
add all_kernels_must_compute_runtime_shape example for speedup infershape
6 years ago
minqiyang
42e96a029f
Accelerate CPU part
6 years ago
sneaxiy
682f2dbf29
merge develop
...
test=develop
6 years ago
sneaxiy
2c4fcaa683
merge develop
6 years ago
luotao1
d94fd97230
add runtime_context_cache_pass
...
test=develop
6 years ago
Yan Xu
30568473ec
fix broadcast on mp mode ( #15951 )
...
* fix broadcast with mp mode
* polish code test=develop
* fix bcast strategy test=develop
* fix cpplint test=develop
* fix py3 failed test=develop
* fix comment test=develop
* update comment test=develop
6 years ago
baojun
e3c37bd564
remove const_cast and refactor ngraph engine code ( #15925 )
...
* remove const_cast and refactor code test=develop
* reduce flag use test=develop
6 years ago
luotao1
b561ad1e55
Merge branch 'develop' into runtime_context
6 years ago
Zhen Wang
41b8cf0bae
Merge pull request #16162 from wzzju/fix_nan_static_quant
...
Fix NaN bugs for static quantization strategy (multi-card training).
6 years ago
luotao1
fe78a92e6e
refine with comments
...
test=develop
6 years ago
Zhen Wang
94b7c1ea7b
Merge pull request #16107 from wzzju/add_graph_clone
...
Add clone function for IrGraph.
6 years ago
wopeizl
85709f4378
restore the exception caught since it is necessary for python call stack ( #16160 )
...
test=develop
6 years ago
Zhen Wang
5685a48c23
Add some fixme. test=develop
6 years ago
luotao1
8f6597aa0e
Merge branch 'develop' into infershape_example
6 years ago
Zhen Wang
ac6ef06ffa
Add the Clone method in Graph. test=develop
6 years ago
Zhen Wang
01eddf125c
Not add graph copy construction method. test=develop
6 years ago
Zhen Wang
1b9c8d5f06
add clone function for IrGraph. test=develop
6 years ago
Zeng Jinle
472f16b5aa
Merge pull request #16063 from sneaxiy/enhance_gc
...
Enhance gc
6 years ago
luotao1
31ccaf0916
add all_kernels_must_compute_runtime_shape example for speedup infershape
...
test=develop
6 years ago
chengduo
ad80bde824
Revert "Revert "Add Event for TensorCopy"" ( #16035 )
...
* Revert "Revert "Add Event for TensorCopy" (#16022 )"
This reverts commit e2da3a5b22.
* use default stream
test=develop
6 years ago
sneaxiy
732fa00eaf
disable gc in recurrent_op currently
...
test=develop
6 years ago
Qiao Longfei
ff8054c5a7
can run
6 years ago
Yihua Xu
40f1dd818b
Fix the node's order issue when the content of graph is changed ( #16088 )
...
* Fix the node's sort issue when the graph is changed.
test=develop
* Clean code
test=develop
6 years ago
Zhaolong Xing
3d63aa0a11
Merge pull request #15729 from NHZlX/add_static_model_load_for_trt
...
Four points for enhancing Paddle-TRT
6 years ago
Qiao Longfei
3225e19591
fix remove recv op
6 years ago
Qiao Longfei
fe6a840924
fix delete recv ops
6 years ago
Wu Yi
d206582337
add parallel graph dist test ( #16076 )
...
* add parallel graph dist test=develop
* update test=develop
* update style test=develop
6 years ago
Qiao Longfei
446fdf9563
fix compile problem
6 years ago
Qiao Longfei
a23f1ee85a
optimize code
6 years ago
Qiao Longfei
a0bb18beec
Merge branch 'add-async-ssa-graph-executor' of ssh://github.com/jacquesqiao/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
sneaxiy
2a639d5c2a
add allocator chain to fix bug
...
test=develop
6 years ago
liuwei1031
045e5911bf
fix a code bug which causes a crash when an empty variable is used, test=develop ( #16080 )
6 years ago
sneaxiy
7b608396fe
fix travis-ci format check
...
test=develop
6 years ago
Qiao Longfei
255b36dad2
can run
6 years ago
Qiao Longfei
5e8de51409
code format test=develop
6 years ago
Qiao Longfei
4e218dabc5
code format test=develop
6 years ago
Tao Luo
6375fe45d7
Merge pull request #16039 from luotao1/execution_context
...
remove legacy function in ExecutionContext
6 years ago
sneaxiy
814a759061
merge develop
...
test=develop
6 years ago
sneaxiy
597dc65e76
enhance gc
...
test=develop
6 years ago
liuwei1031
caadd0581d
add IfElse test case for ir memory optimize ( #15998 )
...
* add ir memory optimize test case for IfElse op, test=develop
* fix some unittest failures by forcing use of the python memory_optimize, test=develop
* tweak comments, test=develop
* fix unittest, test=develop
* fix unittest, test=develop
6 years ago
Qiao Longfei
f28c258453
code clean test=develop
6 years ago
Qiao Longfei
8c38aca954
tmp commit
6 years ago
Tao Luo
f4587789d8
remove legacy function in ExecutionContext
...
test=develop
6 years ago
Liu Yiqun
1041e18c47
Refine codes.
...
test=develop
6 years ago
luotao1
c0b240aa43
try to fix distributed unit-test
...
test=develop
6 years ago
luotao1
784826a4f5
enhance cache runtime_context for different scope
...
test=develop
6 years ago
Qiao Longfei
fab1b54d99
Merge branch 'add-communicator' of ssh://github.com/jacquesqiao/Paddle into add-async-ssa-graph-executor-communicator
6 years ago
Qiao Longfei
8744f9a083
fix parallel executor async mode
6 years ago
Qiao Longfei
e70b1727ef
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
6 years ago
Liu Yiqun
d8a939d8a8
Merge branch 'develop' into core_opt_choose_kernel
6 years ago
chengduo
e2da3a5b22
Revert "Add Event for TensorCopy" ( #16022 )
...
* Revert "Add Event for TensorCopy (#15953 )"
This reverts commit 7235fd662b.
test=develop
* fix CI
test=develop
6 years ago
luotao1
2fb38c108c
Merge branch 'develop' into runtime_context
6 years ago
sneaxiy
a9ea99d700
merge develop
6 years ago
Qiao Longfei
3691a46fa3
improve communicator
6 years ago
chengduo
ae37f82964
Unified ParallelExecutor and Compiler ( #15970 )
...
* Unified ParallelExecutor and Compiler
6 years ago
chengduo
7235fd662b
Add Event for TensorCopy ( #15953 )
...
Add Event for TensorCopy
6 years ago
luotao1
82b0bb9d72
fix cpplint error
...
test=develop
6 years ago
Liu Yiqun
d4674dab13
Cache the chosen kernel of operators.
...
test=develop
6 years ago
luotao1
9773f38f99
cache runtime_context
...
test=develop
6 years ago
tangwei12
6d5a04c1e7
add op type in check nan/inf ( #15986 )
...
* add op name in check nan/inf, test=develop
6 years ago
Qiao Longfei
847e4f4e85
pure async mode train
6 years ago
Qiyang Min
187cffd019
Merge pull request #15928 from velconia/imperative_backward_hooks
...
Imperative backward hooks
6 years ago
Yiqun Liu
798925453e
Revert "Optimize while_op when is_test is true. ( #15811 )" ( #15968 )
...
test=develop
6 years ago
minqiyang
e5f3435dd5
Add missing headers
...
test=develop
6 years ago
minqiyang
50639fafdb
Polish code
...
test=develop
6 years ago
Yiqun Liu
613d9d0756
Optimize while_op when is_test is true. ( #15811 )
...
test=develop
6 years ago
nhzlx
2eff3e26b6
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_static_model_load_for_trt
6 years ago
nhzlx
06a088a199
fix comments and fix cpplint
...
test=develop
6 years ago
dzhwinter
225c11a91f
polish cudnn related code and fix bug. ( #15164 )
...
* staged.
* polish code
* polish code. test=develop
* polish code. test=develop
* api change. test=develop
* fix default value. test=develop
* fix default value. test=develop
6 years ago
Tao Luo
d5a888e15c
Merge pull request #15943 from kbinias/kbinias/add-placement-pass-tester
...
MKL-DNN: Add placement pass tester
6 years ago
Krzysztof Binias
72253391b6
Add MKL-DNN placement pass tester
...
test=develop
6 years ago
minqiyang
cb85ee987b
Remove var op deps in imperative mode
...
test=develop
6 years ago
Tao Luo
effec86600
Merge pull request #15913 from liangan1/func_coverage
...
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
6 years ago
Qiao Longfei
49f2f4f91d
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
Qiao Longfei
f768fbf715
support multi graph
...
test=develop
6 years ago
Tao Luo
60546b78cc
Merge pull request #15923 from Sand3r-/mgallus/conv-residual-ut
...
Add Conv Residual Connection UT for Projection
6 years ago
Qiao Longfei
ff01d70583
fix style
...
test=develop
6 years ago
Qiao Longfei
dab7f36909
optimize code test=develop
6 years ago
Qiao Longfei
cf0511f21e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Michal Gallus
6a2bc9a275
Add Conv Residual Connection UT for Projection
...
test=develop
6 years ago
Qiao Longfei
43c82376cb
use one graph
6 years ago
dzhwinter
660e410655
Merge pull request #15855 from dzhwinter/fix/nightly_test
...
accelerate memory optimize process
6 years ago
minqiyang
b420ec3a92
invoke backward_hooks after reduce op's depcounts map
...
test=develop
6 years ago
Qiyang Min
4bd28b304b
Merge pull request #15831 from velconia/imperative_engine
...
Imperative training network to the end
6 years ago
Xin Pan
a6e3cd5eb7
Merge pull request #15425 from panyx0718/api
...
Pass graph to parallel executor instead of program
6 years ago
Qiao Longfei
b8491bfd4e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
liangan1
4acc522087
Enable function coverage for U8/S8 ConvMKLDNNOpKernel
...
test=develop
6 years ago
Xin Pan
44e7fcddc5
Merge pull request #15844 from panyx0718/infer
...
add per kernel config and remove const_cast.
6 years ago
dzhwinter
a71f2fbe4f
fix default value. test=develop
6 years ago
Jacek Czaja
dec9cf53c8
[MKL-DNN] MKL-DNN specific Tensor modification ( #15429 )
...
* - Implemented draft of primitive desc keeping in Tensor
test=develop
- TransposeMKLDNNHandler::AcquireSrcMemory was reimplemented
- Added nchw and nc formats setting for sake of compatibility
Fixed unit tests
- Workaround for problem with 5D data in conv
- Added 3D and 1D MKL-DNN formats for name handles for tensor
test=develop
- Fix to UTs
test=develop
- Conv fp32 op was updated
Cosmetic fixes
test=develop
- tensor mkldnn cosmetics
test=develop
- Moved most of mkl-dnn specific code from Tensor to mkl-dnn utils
* - Lint fixes
test=develop
* - setting prim desc in Tensor also sets layout to kMKLDNN
test=develop
* - Moved creation of prim desc totally out of Tensor
test=develop
* - Cosmetic fixes after review
test=develop
6 years ago
minqiyang
84bf4d7b06
Move ClearBlock into OpBase and VarBase's destructor
...
test=develop
6 years ago
Xin Pan
5dd281f738
polish
...
test=develop
6 years ago
Qiao Longfei
10393dd0d1
add some check test=develop
6 years ago
乔龙飞 Qiao Longfei
ec8e878200
Merge pull request #15840 from jacquesqiao/revert-15684-revert-15661-fix-cpu-broadcast
...
fix cpu broadcast
6 years ago
minqiyang
a15a3fc314
Polish code
...
test=develop
6 years ago
Qiao Longfei
2b7931d5c9
refine code test=develop
6 years ago
Qiao Longfei
b5b8e6cc9c
revert the change of scope test=develop
6 years ago
Xin Pan
8d83e38a6b
remove mutex
...
test=develop
6 years ago
Xin Pan
0362ef75f4
fix
...
test=develop
6 years ago
minqiyang
9dc64edfd9
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_engine
...
test=develop
6 years ago
Xin Pan
12a0e2ed9d
polish codes
...
test=develop
6 years ago
Xin Pan
19d78f6797
polish
...
test=develop
6 years ago
Qiao Longfei
ecedd531c1
fix code bug test=develop
6 years ago
Qiao Longfei
f4f4816b0c
fix gpu error test=develop
6 years ago
Xin Pan
32d5a16036
resolve conflicts
...
test=develop
6 years ago
Qiao Longfei
3f9263f67e
optimize style test=develop
6 years ago
Qiao Longfei
4233d0a820
add more comment test=develop
6 years ago
Michał Gallus
c4faf36e7a
MKL-DNN: Add test for conv bias fuse pass ( #15824 )
...
* MKL-DNN: Add test for conv bias fuse pass
test=develop
* Remove const cast from Conv Bias Pass Test
* Add conv with bias test case for conv+bias fuse ut
test=develop
6 years ago
Qiao Longfei
3bccc1e6e2
optimize broadcast logic test=develop
6 years ago
Tao Luo
3831a4695d
Merge pull request #15862 from sfraczek/add-override-to-apply_impl
...
add override to ApplyImpl and precommit fixes
6 years ago
Xin Pan
26e32e095a
allow compiler to use graph
...
test=develop
6 years ago
Sylwester Fraczek
0b926114c0
add override to ApplyImpl
...
and #pragma once in edited headers
add #include<string> in edited headers
test=develop
6 years ago
Sylwester Fraczek
543e53db05
fix typo releated->related
6 years ago
Qiao Longfei
12f6b8c3d6
change the include of ThreadPool.h test=develop
6 years ago
Qiao Longfei
7f3be09045
fix multi graph test=develop
6 years ago
Qiao Longfei
9465c3d0c3
fix compile problem
6 years ago
Xin Pan
5eb87506bc
add per kernel config and remove const_cast.
...
test=develop
6 years ago
Qiao Longfei
31a05d3efd
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Qiao Longfei
62f1248ff5
fix use gpu test=develop
6 years ago
Xin Pan
6019054cdd
Merge pull request #15716 from Yancey1989/refine_pg
...
Refine ParallelGraph Execution
6 years ago
Dun
a83e470405
Profiler refine and add CUDA runtime api tracer ( #15301 )
...
* refine profiler && add runtime tracer
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* fix bug && test=develop
* add thread id map && test=develop
* test=develop
* testing
* bug fix
* remove cuda event && refine code && test=develop
* test=develop
* test=develop
* test=develop
* fix windows temp file && test=develop
* test=develop
* fix windows bug && test=develop
* fix start up issue && test=develop
* code polish && test=develop
* remove unused code && test=develop
* add some cupti cbid && test=develop
* add FLAGS_multiple_of_cupti_buffer_size && test=develop
* fix compile error && test=develop
* add keyword && test=develop
* fix && test=develop
* code polish && test=develop
6 years ago
Qiao Longfei
cc71e89499
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
minqiyang
f53e1d5c4b
implement ClearBlock
6 years ago
dzhwinter
f2e8409f5a
Merge pull request #15795 from dzhwinter/fix/block_desc
...
fix bug when op node has no block pointer
6 years ago
tensor-tang
e1c707fe9c
fix warnings ( #15790 )
...
* fix warnings
test=develop
* fix enforce test
test=develop
6 years ago
Yancey1989
4b193db14c
polish code test=develop
6 years ago
dzhwinter
6deb17ed8c
fix default value. test=develop
6 years ago
dzhwinter
089d262c41
fix default value. test=develop
6 years ago
dzhwinter
9c92d0304f
fix default value. test=develop
6 years ago
Yancey1989
d5090c892d
polish code test=develop
6 years ago
dzhwinter
28609b3435
Merge pull request #15696 from dzhwinter/cherry-pick/memory
...
cherry picked modifies.
6 years ago
Yancey1989
0f8bd73cc9
cleanup code test=develop
6 years ago
Yancey1989
5677c9d4ee
update comment test=develop
6 years ago
Yancey1989
642fd68ce0
update by comment test=develop
6 years ago
dzhwinter
d94a314db5
add reference. test=develop
6 years ago
dzhwinter
591ad33e32
polish code for reading. test=develop
6 years ago
dzhwinter
18afb77e78
polish code for reading. test=develop
6 years ago
Yan Chunwei
077d12b939
fix scale cleaner ( #15742 )
6 years ago
dzhwinter
684b572307
polish code for reading. test=develop
6 years ago
dzhwinter
3787e61fca
polish code for reading. test=develop
6 years ago
dzhwinter
c1455e606d
Merge remote-tracking branch 'origin/develop' into cherry-pick/memory
...
test=develop
6 years ago
dzhwinter
d376cf71b7
polish code for reading. test=develop
6 years ago
nhzlx
ecc12fb430
3. when running in trt mode, do not allocate memory for parameters in fluid.
...
test=develop
6 years ago
Yancey1989
7cd6de37f5
fix cpu test=develop
6 years ago
Yancey1989
bd0d44af24
fix build failed test=develop
6 years ago
Yancey1989
ecdd1166b8
cleanup code test=develop
6 years ago
Yancey1989
73005ee00d
cleanup code test=develop
6 years ago
Yancey1989
88d3dc949e
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into refine_pg
...
test=develop
6 years ago
Yancey1989
f3463ecb6e
refine pg execution
6 years ago
dzhwinter
283573c6aa
add details. test=develop
6 years ago
chengduo
5a03b515ae
fix potential bug in async_executor ( #15707 )
...
test=develop
6 years ago
乔龙飞 Qiao Longfei
45b19cbc9a
Revert "Revert "cpu reduce mode did not need to broadcast params test=develop""
6 years ago
chengduo
ad61e1b22c
fix potential bug ( #15688 )
...
test=develop
6 years ago
dzhwinter
6d6ddcfe15
add details. test=develop
6 years ago
dzhwinter
f9ac88e1a0
Merge pull request #15694 from liuwei1031/fix_security_issue
...
Fix security issue
6 years ago
dzhwinter
11afbe0f53
add details. test=develop
6 years ago
tensor-tang
e49706c80e
Merge pull request #15659 from GBuella/add_to_string
...
Tests - add some missing to_string calls
6 years ago
liuwei1031
b1f97a6fa9
fix security issue 27, 38 test=develop
6 years ago
Gabor Buella
da9c94da33
Clang build fixes ( #15628 )
...
* Remove some superfluous std::move calls
The std::move triggered a build error (with -Werror):
```
[ 9%] Building CXX object paddle/fluid/memory/allocation/CMakeFiles/allocator_facade.dir/allocator_facade.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^
/home/tej/code/gbuella_paddle/paddle/fluid/memory/allocation/allocator_facade.cc:86:29: note: remove std::move call here
[this] { return std::move(CreateAllocatorWithChunk()); }, capacity);
^~~~~~~~~~ ~
1 error generated.
```
See: https://reviews.llvm.org/D7633
* Remove a superfluous lambda capture from framework/operator.h
```
[ 10%] Building CXX object paddle/fluid/platform/CMakeFiles/device_context.dir/init.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/platform/init.cc:19:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.h:229:21: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]
[this](Variable* var) { return var; });
^~~~
1 error generated.
```
Changing it to `return it->second;`, as is in the function below.
* Rethrow an exception (instead of copying it)
```
[ 11%] Building CXX object paddle/fluid/framework/CMakeFiles/operator.dir/operator.cc.o
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: error: local variable 'exception' will be copied despite being thrown by name [-Werror,-Wreturn-std-move]
throw exception;
^~~~~~~~~
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:191:13: note: call 'std::move' explicitly to avoid copying
throw exception;
^~~~~~~~~
std::move(exception)
```
See https://reviews.llvm.org/D43322 for an explanation of this diagnostic message.
* Remove an unused variable
```
/home/tej/code/gbuella_paddle/paddle/fluid/framework/operator.cc:884:16: error: private field 'scope_' is not used [-Werror,-Wunused-private-field]
const Scope& scope_;
^
```
* struct ComputationOpHandle -> class ComputationOpHandle
```
[ 13%] Building CXX object paddle/fluid/framework/details/CMakeFiles/memory_early_delete_pass.dir/memory_early_delete_pass.cc.o
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/framework/details/memory_early_delete_pass.cc:21:
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: error: class 'ComputationOpHandle' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Werror,-Wmismatched-tags]
class ComputationOpHandle;
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/computation_op_handle.h:29:8: note: previous use is here
struct ComputationOpHandle : public OpHandleBase {
^
/home/tej/code/gbuella_paddle/paddle/fluid/framework/details/reference_count_pass_helper.h:30:1: note: did you mean struct here?
class ComputationOpHandle;
^~~~~
struct
1 error generated.
```
* Fix name() methods under fluid/operators
```
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.cc:15:
In file included from /home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/act.h:19:
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen/jitcode.h:71:23: error: 'name' overrides a member function but is not marked 'override' [-Werror,-Winconsistent-missing-override]
virtual const char* name() const = 0;
^
/home/tej/code/gbuella_paddle/paddle/fluid/operators/jit/gen_base.h:31:23: note: overridden virtual function is here
virtual const char* name() const = 0;
^
```
test=develop
6 years ago
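The first and third fixes in the commit above can be illustrated with a minimal sketch. The names `CreateChunk`, `MakeChunk` and `RethrowAndCatch` are hypothetical and are not taken from the Paddle sources; the snippet only shows the general C++ pattern the diagnostics point at, assuming a C++11 compiler.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an allocator factory; not Paddle's actual API.
std::unique_ptr<int> CreateChunk() { return std::unique_ptr<int>(new int(42)); }

// Returning the temporary directly lets the compiler elide the move.
// Writing `return std::move(CreateChunk());` instead is what
// -Wpessimizing-move complains about.
std::unique_ptr<int> MakeChunk() { return CreateChunk(); }

// Rethrows the in-flight exception with a bare `throw;`, so no copy of the
// exception object is made, unlike `throw exception;` on a named variable.
std::string RethrowAndCatch() {
  try {
    try {
      throw std::runtime_error("inner");
    } catch (const std::exception&) {
      throw;  // rethrow the original object; its dynamic type is preserved
    }
  } catch (const std::runtime_error& e) {
    return std::string(e.what());
  }
  return std::string();
}
```

Because the bare `throw;` preserves the dynamic type, the outer handler can still catch `std::runtime_error` even though the inner handler only named `std::exception`.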
kolinwei
acfe28d5eb
Merge pull request #15684 from PaddlePaddle/revert-15661-fix-cpu-broadcast
...
Revert "cpu reduce mode did not need to broadcast params test=develop"
6 years ago
Xin Pan
d670d8ef1d
Merge pull request #15671 from cjld/fix_graph
...
fix bug CreateControlDepVar duplicate name
6 years ago
乔龙飞 Qiao Longfei
6e0e706198
Revert "cpu reduce mode did not need to broadcast params test=develop"
6 years ago
Qiao Longfei
97b143fb49
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into fix-cpu-broadcast
...
test=develop
6 years ago
dzhwinter
381f2015a5
Merge pull request #15665 from dzhwinter/experiment/refactor_memory
...
refactor optimize pass.
6 years ago
Qiao Longfei
ffd0d1d216
clean need_broadcast_var_ test=develop
6 years ago
Qiao Longfei
fbadd4b60c
follow comment test=develop
6 years ago
dzhwinter
04e9776aef
add details. test=develop
6 years ago
baojun
f4a0e68481
Fix ngraph compile WITH_DISTRIBUTE=ON ( #15636 )
...
* fix compile issue with_distribute test=develop
* simplified logic test=develop
* use ngraph dependency test=develop
* set cpu only test=develop
* update test and eliminate fp16 test test=develop
6 years ago
Dun Liang
1905f1a108
bug fix && test=develop
6 years ago
Qiao Longfei
2171aa77f1
async ssa exe only support local mode
6 years ago
Qiao Longfei
c4ded17e8c
async mode support dist train
6 years ago
Qiao Longfei
84367cf8bc
support async mode in dist mode parallel executor
6 years ago
Qiao Longfei
e72637ddd2
ThreadedSSAGraphExecutor support num_iteration_per_run test=develop
6 years ago
Qiao Longfei
a7152613f7
Merge branch 'fix-cpu-broadcast' of ssh://github.com/jacquesqiao/Paddle into add-communicator
6 years ago
Qiao Longfei
76072261f8
fix compiler
...
test=develop
6 years ago
Qiao Longfei
b99db0e2c2
cpu reduce mode did not need to broadcast test=develop
6 years ago
Qiao Longfei
5cf0092825
add more log and fix test_dist_base in multi_batch_merge_pass
6 years ago
Gabor Buella
4975a9050a
Tests - add some missing to_string calls
...
```
/home/tej/code/gbuella_paddle/paddle/fluid/framework/ir/seqpool_concat_fuse_pass_tester.cc:167:40: error: adding 'int' to a string does not append to the string [-Werror,-Wstring-plus-int]
std::string prefix = "seqpool_op_" + i;
~~~~~~~~~~~~~~^~~
/home/tej/code/gbuella_paddle/paddle/fluid/framework/ir/seqpool_concat_fuse_pass_tester.cc:167:40: note: use array indexing to silence this warning
std::string prefix = "seqpool_op_" + i;
^
& [ ]
1 error generated.
```
test=develop
6 years ago
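The diagnostic quoted in the commit above arises because adding an `int` to a string literal performs pointer arithmetic rather than concatenation. A minimal sketch of the usual remedy follows; `MakePrefix` is a hypothetical helper for illustration, not a function from the tester file.

```cpp
#include <string>

// Hypothetical helper: builds names such as "seqpool_op_3".
// `"seqpool_op_" + i` would offset the pointer to the literal by i
// characters, which is what -Wstring-plus-int warns about.
// Converting the index with std::to_string gives real concatenation.
std::string MakePrefix(int i) {
  return "seqpool_op_" + std::to_string(i);
}
```

With this form, the left operand is still a string literal, but the right operand is a `std::string`, so the `operator+` overload for strings is selected.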
Qiao Longfei
b1fe8d4570
add a check for async_ssa_graph_exe test=develop
6 years ago
Qiao Longfei
16af1dbc7b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Qiao Longfei
381f383989
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
wopeizl
c1e18b13aa
Merge pull request #15635 from wopeizl/fixbuildissue
...
fix the build issue on gpu mode for win
6 years ago
dzhwinter
5d30b55de1
rerun ci. test=develop
6 years ago
dzhwinter
4ef34916a4
enhanced print message. test=develop
6 years ago
peizhilin
238ef94702
fix the build issue on gpu mode for win
...
test=develop
6 years ago
dzhwinter
ce0394bcd0
merge develop branch. test=develop
6 years ago
Xin Pan
74bc55c2a6
Merge pull request #14975 from dzhwinter/ir_inplace_pass
...
Ir inplace pass
6 years ago
dzhwinter
cca71532eb
add skip send.recv test=develop
6 years ago
dzhwinter
9f001c6525
skip dist. test=develop
6 years ago
Yan Chunwei
dc5e25fc7f
remove dot marked node ( #15606 )
6 years ago
dzhwinter
2561a6fc59
follow comment. test=develop
6 years ago
dzhwinter
2a5ecb68b0
follow comment. test=develop
6 years ago
dzhwinter
9f693fcac4
rerun ci. test=develop
6 years ago
dzhwinter
e537634d16
delete graph print pass. test=develop
6 years ago
dzhwinter
4f01de6378
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
liuwei1031
6e84eb131f
expose peak gpu memory API to python test=develop ( #15529 )
...
* expose peak gpu memory API to python test=develop
* add unittest for peak gpu memory monitoring test=develop
* add pybind change test=develop
* add mutex to gpu mem usage monitor test=develop
* update benchmark flag definition file test=develop
* tweak unittest for memory monitoring test=develop
6 years ago
dzhwinter
5cab99a686
fuck windows. rerun windows ci. test=develop
6 years ago
dzhwinter
9c9ad7d40b
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
...
test=develop
6 years ago
dzhwinter
0a63234c85
follow comments. test=develop
6 years ago
Yan Chunwei
897789b16e
fix save_inference_model bug ( #15365 )
6 years ago
dzhwinter
9e87fbebb7
rerun windows ci. test=develop
6 years ago
dzhwinter
6f9904e99a
rerun windows ci. test=develop
6 years ago
dzhwinter
a52be7c081
refine build strategy. test=develop
6 years ago
dzhwinter
32a2014939
refine build strategy. test=develop
6 years ago
Krzysztof Binias
b1bdcd4de8
Make separate folders for mkldnn codes
...
test=develop
6 years ago
dzhwinter
06f2448848
Merge remote-tracking branch 'origin/develop' into feature/ir_inplace_pass
6 years ago
dzhwinter
8156fedf56
merge develop branch. test=develop
6 years ago
Qiao Longfei
d6c0dcaa16
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Jiabin Yang
fd286f3596
Merge pull request #15534 from JiabinYang/fix/multi_output_support_imperative
...
test=develop, fix/multi_output_support_imperative
6 years ago
Qiao Longfei
c7e3868007
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
Qiao Longfei
02dab46ab8
add some debug info
6 years ago
dzhwinter
ee3aae56cd
merge develop branch. test=develop
6 years ago
dzhwinter
d6d3e6afe2
add more skip strategy
6 years ago
Yan Chunwei
b62b756b28
add version support ( #15469 )
6 years ago
tensor-tang
3c224e7e79
Merge pull request #15537 from baojun-nervana/rm_ngraph_operator
...
rm ngraph_operator.cc test=develop
6 years ago
Zhaolong Xing
97b76c94c4
Merge pull request #15242 from NHZlX/trt_int8_ultimate_version
...
add trt int8 support
6 years ago
Jiabin Yang
10bc9ffc2d
Merge pull request #15518 from JiabinYang/fix/refine_error_message
...
test=develop, refine_error_message for data type
6 years ago
Qiao Longfei
be738a646e
add some debug info
6 years ago
Qiao Longfei
62549e0714
add GenParentScopeTreeDebugInfo
6 years ago
dzhwinter
2739096eec
compatible with python side mem_opt
6 years ago
Qiao Longfei
a66115bed5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
Qiao Longfei
fab8457e6b
code optimize
6 years ago
gongweibao
d303270a0e
revert test=develop ( #15535 )
6 years ago
baojun-nervana
8e9308a51a
mv ngraph_bridge to ngraph directory test=develop
6 years ago
baojun-nervana
da3f9cc512
rm ngraph_operator.cc test=develop
6 years ago
Qiao Longfei
ada43e89c3
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-async-ssa-graph-executor
...
test=develop
6 years ago
JiabinYang
5639f49b16
test=develop, fix/multi_output_support_imperative
6 years ago
gongweibao
d54494ba87
cleanup test=develop ( #15347 )
6 years ago
JiabinYang
c52f57de5b
test=develop, refine_error_message for data type
6 years ago
baojun
efce25673c
Adding ngraph_engine_op ( #14948 )
...
* enable ngraph_engine_op
test=develop
* merge develop test=develop
* avoid const_cast test=develop
* rm ngraph_operator test=develop
* Added TODO to move EnableNgraph test=develop
* Add TODO to remove const_cast test=develop
6 years ago
Yiqun Liu
3008fa1261
Add the CUDA kernel for beam_search op ( #15020 )
...
* Refine the beam_search op and test.
* A basic CUDA implementation of beam_search for small batch_size.
* Implement CUDA kernel for beam_search_op.
* Use multiple CUDA threads in the same block to select the top beam.
* Update the python api of beam_search op.
* Enable extend function in CPU kernel of beam_search op.
* Unify the CUDA codes.
test=develop
* Unify the CPU kernel of beam_search op.
* Ensure the selected items of beam_search_op's CPU kernel are sorted by scores.
* Update the description of beam_search in API.spec.
* Enable the use of CUDA kernel in beam_search op.
* Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debugging statements.
test=develop
* Follow comments.
test=develop
* Call the CPU kernel for beam_search op when batch_size > 4.
test=develop
* Remove the except of is_empty op in PrepareData.
test=develop
6 years ago
nhzlx
0779e35544
fix two bugs:
...
1. graph and program_desc alignment
2. trt stream
test=develop
6 years ago
Qiao Longfei
ca5d96bb3d
complete send lod tensor
6 years ago
Zeng Jinle
dec89bd7ed
Merge pull request #15460 from sneaxiy/try_to_turn_on_remove_unnecessary_lock
...
Turn on remove_unnecessary_lock by default
6 years ago
Xin Pan
58cb18d9d9
Merge pull request #15322 from velconia/imperative_resnet
...
Imperative Resnet
6 years ago
Qiao Longfei
be72940b76
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add-communicator
6 years ago
sneaxiy
ef788603d4
merge develop
...
test=develop
6 years ago
sneaxiy
d8568acd19
turn on remove_unnecessary_lock
...
test=develop
6 years ago
sneaxiy
eac5a0aa0c
Merge develop
...
test=develop
6 years ago
WangZhen
3ce6172052
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
6 years ago
dzhwinter
8f3b252392
squash commits. test=develop
6 years ago
Yan Chunwei
885c4e57ab
fea/infer memory optim2 ( #14953 )
6 years ago
minqiyang
8ce198b2e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
...
test=develop
6 years ago
Dun
9f8f0fc2d3
Memory optimization of depthwise conv op and group norm op ( #15313 )
...
* mem opt
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine with cub test=develop
* fix mkldnn test && remove comments && test=develop
* polish code && test=develop
* add only_forward test && test=develop
6 years ago
WangZhen
e2ff300b02
add UT for quantization.
6 years ago
WangZhen
451896fce4
init quantization.
6 years ago
Qiao Longfei
f3210b60ba
fix copy_memory and share_memory
6 years ago
Qiao Longfei
9958775b31
add NewTmpScope to scope
6 years ago
Qiao Longfei
7021979bc2
init communicator
6 years ago
Qiao Longfei
69484f71e0
remote communicator
6 years ago
Qiao Longfei
88d71fa2f9
support num_iteration_per_run
6 years ago
gongweibao
7cd4dd7ce4
Hide varhandle members. ( #15382 )
6 years ago
Qiao Longfei
ea66979684
can run
6 years ago
Qiao Longfei
afda840126
init communicator
6 years ago
Qiao Longfei
92a6c7a049
init async ssa executor
6 years ago
tensor-tang
3759c1db8c
Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
...
Enable element_wise_add operator for a ngraph engine
6 years ago
mozga-intel
cba729404d
Enable softmax operator for a ngraph engine
...
test=develop
6 years ago
tensor-tang
a7fc3d42a0
Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub
...
Fuse/second order mul sub and fuse repeated fc relu
6 years ago
乔龙飞 Qiao Longfei
b14d4cdd75
Merge pull request #14890 from jacquesqiao/multithread-sparse-adam
...
adam support multithread
6 years ago
Qiao Longfei
9b4fe283e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
peizhilin
5e450833bd
test=develop
6 years ago
peizhilin
eea75a1d93
fix issue when type is invalid
...
test=develop
6 years ago
peizhilin
9adb158e5b
Merge remote-tracking branch 'upstream/develop' into debug/support
6 years ago
tensor-tang
84b0ecdcce
Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub
...
test=develop
6 years ago
chengduo
46d01d798e
Revert "Revert "Remove workspace_handle in conv_cudnn ( #15186 )"" ( #15290 )
...
test=develop
This reverts commit 358e657f68.
6 years ago
tensor-tang
d618e48309
fix fuse square mat order and refine test
...
test=develop
6 years ago
tensor-tang
a5d2a6d1ad
add fuse pass of squared mat sub fusion
6 years ago
tensor-tang
ca6fdc6e33
refine and fix test
...
test=develop
6 years ago
tensor-tang
a89296ac1f
add repeated fc relu pass
6 years ago
Xin Pan
50b4ac08b0
fix
...
test=develop
6 years ago
Xin Pan
a1bfb35dd6
try fix py2
...
test=develop
6 years ago
Xin Pan
6a18c0f9ff
Merge pull request #15278 from chengduoZH/revert_remove_workspace_handle_in_conv2d_cudnn
...
Revert "Remove workspace_handle in conv_cudnn (#15186 )"
6 years ago
Zhaolong Xing
98e85f3735
add_transpose_flatten_concat_fuse ( #15121 )
6 years ago
chengduozh
358e657f68
Revert "Remove workspace_handle in conv_cudnn ( #15186 )"
...
test=develop
This reverts commit 064512aa47.
6 years ago
tensor-tang
fc9fbab6a0
Merge pull request #15271 from tensor-tang/fix/typo
...
fix typo and refine
6 years ago
chengduo
064512aa47
Remove workspace_handle in conv_cudnn ( #15186 )
...
* remove workspace_handle in conv2d_cudnn
test=develop
* remove workspace_handle
test=develop
* fix bug
test=develop
* make test_conv2d_op SERIAL
test=develop
* save memory in conv_cudnn
test=develop
* enhance thread safety
test=develop
* enhance temporary allocator
test=develop
* Add excess fraction
test=develop
* follow comments
test=develop
* fix bug and code refine
test=develop
* fix memory size check
test=develop
* rename reuse_tmp_allocation_excess_fraction
test=develop
6 years ago
tensor-tang
c3a9f3c4b2
fix typo and refine
...
test=develop
6 years ago
tensor-tang
ab9c4b2a9f
refine seqpool concat pass and remove unused nodes
...
test=develop
6 years ago
tensor-tang
ce909664d8
Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
6 years ago
flame
fb63cd89d4
Add python ir graph API ( #14917 )
6 years ago
tensor-tang
a0a27bd240
add seqpool concat fuse pass tester
...
test=develop
6 years ago
sneaxiy
594dc4d8f0
partial gc 1st version
...
test=develop
6 years ago
tensor-tang
8e086a8521
follow comment and fix typo
...
test=develop
6 years ago
tensor-tang
48410b9bfe
Merge pull request #15237 from tensor-tang/fuse/seqpool_concat_2
...
Fuse/seqpool concat 2
6 years ago
peizhilin
c1235c935f
add the enable_debug flag
...
test=develop
6 years ago