dzhwinter
2739096eec
compatibable with python side mem_opt
6 years ago
gongweibao
d303270a0e
revert test=develop ( #15535 )
6 years ago
baojun-nervana
8e9308a51a
mv ngraph_bridge to ngraph directory test=develop
6 years ago
baojun-nervana
da3f9cc512
rm ngraph_operator.cc test=develop
6 years ago
JiabinYang
5639f49b16
test=develop, fix/multi_output_support_imperative
6 years ago
gongweibao
d54494ba87
cleanup test=develop ( #15347 )
6 years ago
JiabinYang
c52f57de5b
test=develop, refine_error_message for data type
6 years ago
baojun
efce25673c
Adding ngraph_engine_op ( #14948 )
...
* enable ngraph_engine_op
test=develop
* merge develop test=develop
* avoid const_cast test=develop
* rm ngraph_operator test=develop
* Added TODO to move EnableNgraph test=develop
* Add TODO to remove const_cast test=develop
6 years ago
Yiqun Liu
3008fa1261
Add the CUDA kernel for beam_search op ( #15020 )
...
* Refine the beam_search op and test.
* A basic CUDA implementation of beam_search for small batch_size.
* Implement CUDA kernel for beam_search_op.
* Use multiple CUDA threads in the same block to select the top beam.
* Update the python api of beam_search op.
* Enable extend function in CPU kernel of beam_search op.
* Unify the CUDA codes.
test=develop
* Unify the CPU kernel of beam_search op.
* Ensure the seletced items of beam_search_op's CPU kernel sorted by scores.
* Update the description of beam_search in API.spec.
* Enable the use of CUDA kernel in beam_search op.
* Exclude the beam_search's CUDA unittest when there is no CUDA gpu, and delete some debuging statements.
test=develop
* Follow comments.
test=develop
* Call the CPU kernel for beam_search op when batch_size > 4.
test=develop
* Remove the except of is_empty op in PrepareData.
test=develop
6 years ago
nhzlx
0779e35544
fix two bug:
...
1. graph and program_desc alignment
2. trt stream
test=develop
6 years ago
Zeng Jinle
dec89bd7ed
Merge pull request #15460 from sneaxiy/try_to_turn_on_remove_unnecessary_lock
...
Turn on remove_unnecessary_lock by default
6 years ago
Xin Pan
58cb18d9d9
Merge pull request #15322 from velconia/imperative_resnet
...
Imperative Resnet
6 years ago
sneaxiy
ef788603d4
merge develop
...
test=develop
6 years ago
sneaxiy
d8568acd19
turn on remove_unnecessary_lock
...
test=develop
6 years ago
sneaxiy
eac5a0aa0c
Merge develop
...
test=develop
6 years ago
WangZhen
3ce6172052
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into graph_quantization
6 years ago
dzhwinter
8f3b252392
squash commits. test=develop
6 years ago
Yan Chunwei
885c4e57ab
fea/infer memory optim2 ( #14953 )
6 years ago
minqiyang
8ce198b2e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_resnet
...
test=develop
6 years ago
Dun
9f8f0fc2d3
Memory optimization of depthwise conv op and group norm op ( #15313 )
...
* mem opt
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine code test=develop
* refine with cub test=develop
* fix mkldnn test && remove comments && test=develop
* polish code && test=develop
* add only_forward test && test=develop
6 years ago
WangZhen
e2ff300b02
add UT for quantization.
6 years ago
WangZhen
451896fce4
init quantization.
6 years ago
gongweibao
7cd4dd7ce4
Hide varhandle members. ( #15382 )
6 years ago
tensor-tang
3759c1db8c
Merge pull request #14805 from mozga-intel/mozga-intel/element_wise_operator_ngraph
...
Enable element_wise_add operator for a ngraph engine
6 years ago
mozga-intel
cba729404d
Enable softmax operator for a ngraph engine
...
test=develop
6 years ago
tensor-tang
a7fc3d42a0
Merge pull request #15304 from tensor-tang/fuse/second_order_mul_sub
...
Fuse/second order mul sub and fuse repeated fc relu
6 years ago
乔龙飞 Qiao Longfei
b14d4cdd75
Merge pull request #14890 from jacquesqiao/multithread-sparse-adam
...
adam support multithread
6 years ago
Qiao Longfei
9b4fe283e1
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
peizhilin
5e450833bd
test=develop
6 years ago
peizhilin
eea75a1d93
fix issue when type is invalid
...
test=develop
6 years ago
peizhilin
9adb158e5b
Merge remote-tracking branch 'upstream/develop' into debug/support
6 years ago
tensor-tang
84b0ecdcce
Merge remote-tracking branch 'ups/develop' into fuse/second_order_mul_sub
...
test=develop
6 years ago
chengduo
46d01d798e
Revert "Revert "Remove workspace_handle in conv_cudnn ( #15186 )"" ( #15290 )
...
test=develop
This reverts commit 358e657f68.
6 years ago
tensor-tang
d618e48309
fix fuse square mat order and refine test
...
test=develop
6 years ago
tensor-tang
a5d2a6d1ad
add fuse pass of sequared mat sub fusion
6 years ago
tensor-tang
ca6fdc6e33
refine and fix test
...
test=develop
6 years ago
tensor-tang
a89296ac1f
add repeated fc relu pass
6 years ago
Xin Pan
50b4ac08b0
fix
...
test=develop
6 years ago
Xin Pan
a1bfb35dd6
try fix py2
...
test=develop
6 years ago
Xin Pan
6a18c0f9ff
Merge pull request #15278 from chengduoZH/revert_remove_workspace_handle_in_conv2d_cudnn
...
Revert "Remove workspace_handle in conv_cudnn (#15186 )"
6 years ago
Zhaolong Xing
98e85f3735
add_transpose_flatten_concat_fuse ( #15121 )
6 years ago
chengduozh
358e657f68
Revert "Remove workspace_handle in conv_cudnn ( #15186 )"
...
test=develop
This reverts commit 064512aa47.
6 years ago
tensor-tang
fc9fbab6a0
Merge pull request #15271 from tensor-tang/fix/typo
...
fix typo and refine
6 years ago
chengduo
064512aa47
Remove workspace_handle in conv_cudnn ( #15186 )
...
* remove workspace_handle in conv2d_cudnn
test=develop
* remove workspace_handle
test=develop
* fix bug
test=develop
* make test_conv2d_op SERIAL
test=develop
* save memory in conv_cudnn
test=develop
* enhance thread safety
test=develop
* enhance temporary allocator
test=develop
* Add excess fraction
test=develop
* follow comments
test=develop
* fix bug and code refine
test=develop
* fix memory size check
test=develop
* rename reuse_tmp_allocation_excess_fraction
test=develop
6 years ago
tensor-tang
c3a9f3c4b2
fix typo and refine
...
test=develop
6 years ago
tensor-tang
ab9c4b2a9f
refine seqpool concat pass and remove unused nodes
...
test=develop
6 years ago
tensor-tang
ce909664d8
Merge remote-tracking branch 'ups/develop' into refine/seqpool/feed
6 years ago
flame
fb63cd89d4
Add python ir graph API ( #14917 )
6 years ago
tensor-tang
a0a27bd240
add seqpool concat fuse pass tester
...
test=develop
6 years ago
sneaxiy
594dc4d8f0
partial gc 1st version
...
test=develop
6 years ago
tensor-tang
8e086a8521
follow comment and fix typo
...
test=develop
6 years ago
tensor-tang
48410b9bfe
Merge pull request #15237 from tensor-tang/fuse/seqpool_concat_2
...
Fuse/seqpool concat 2
6 years ago
peizhilin
c1235c935f
add the enable_debug flag
...
test=develop
6 years ago
Xin Pan
7b73fc9e1a
Merge pull request #15089 from panyx0718/api
...
try unify Executor and ParallelExecutor
6 years ago
tensor-tang
f8c305b243
Merge remote-tracking branch 'ups/develop' into fuse/seqpool_concat_2
...
test=develop
6 years ago
Zeng Jinle
e29f10d315
Merge pull request #15207 from sneaxiy/remove_op_handle_lock_and_fix_var
...
Remove op handle lock and fix var
6 years ago
Zeng Jinle
7b638f2781
Merge pull request #15218 from sneaxiy/fix_same_name_func
...
Fix same name func framework::ToTypeIndex
6 years ago
mozga-intel
a42f8f4f6f
Enable element_wise_add operator for a ngraph
...
test=develop
6 years ago
tensor-tang
72d2a1801e
add seqpool concat fuse pass
...
test=develop
6 years ago
sneaxiy
bc205ef374
fix same name func
...
test=develop
6 years ago
xuezhong
c0bc818688
Merge pull request #15188 from velconia/add_pyramid_dnn_support
...
Add no lock optimization pass
6 years ago
sneaxiy
4a443ffc98
merge develop
...
test=develop
6 years ago
sneaxiy
7c7342bf12
fix scope.var()
...
test=develop
6 years ago
Tao Luo
4d9aa1745a
Merge pull request #14806 from mozga-intel/mozga-intel/scale_operator_ngraph
...
Enable scale operator for a ngraph engine
6 years ago
peizhilin
a6f5ceee74
add the python callstack for debug support test=develop
6 years ago
minqiyang
b76695418a
Polish log
...
test=develop
6 years ago
minqiyang
1bfbc0d963
Polish code
...
test=develop
6 years ago
minqiyang
7f45b9511a
Polish code
6 years ago
minqiyang
68a07328fa
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_pyramid_dnn_support
...
test=develop
6 years ago
Qiao Longfei
44b300556d
change min_row_size_to_use_multithread to parameter of adam
...
test=develop
6 years ago
Qiao Longfei
87b4eb1da4
change min_param_size_to_use_multithread to min_row_size_to_use_multithread
6 years ago
minqiyang
4bfa110fd8
Add no lock optimize pass
...
test=develop
6 years ago
chengduo
eabb2105fa
Refactor MultiDevSSAGraphBuilder ( #15090 )
...
* Refactor ParallelExecutor
test=develop
* extract Reduce and AllReduce mode from MultiDevSSAGraphBuilder
test=develop
* Refactor MultiDevSSAGraphBuilder
test=developt
* Remove enable_data_balance
test=develop
* code refine
test=develop
* remove data balance
test=develop
* refine ScaleLossGradOp
test=develop
* remove uncessary file
test=develop
* code refine
test=develop
* modify function name
test=develop
* follow comments
test=develop
* add is_distribution field
test=develop
* set is_distribution
test=develop
* fix DistSSAGraphBuilder
test=develop
6 years ago
Yan Chunwei
875a07c32d
refactor inference analysis api ( #14634 )
6 years ago
mozga-intel
e77956c920
Enable mean operator for a ngraph
...
test=develop
6 years ago
mozga-intel
dd768714ab
Enable scale operator for a ngraph
...
test=develop
6 years ago
Qiao Longfei
17b1b660fc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
baojun-nervana
f0cde74564
Update ngraph with elt-wise relu test=develop
6 years ago
Xin Pan
8ae9094e07
polish and resolve conflicts
...
test=develop
6 years ago
Xin Pan
5e928e579a
try unify Executor and ParallelExecutor
...
test=develop
6 years ago
Yan Xu
a1e60ab19b
Merge pull request #14791 from Yancey1989/parallel_graph_mode
...
[Feature] Add ParallelGraph executor mode in parallelexecutor to improve performance
6 years ago
Yancey1989
4ad9de74dd
disable sync nccl by default test=develop
6 years ago
Yancey1989
db603398b7
disable parallel graph executor by default
6 years ago
Xin Pan
087af6a686
Merge pull request #15131 from panyx0718/clean
...
hide temp tensor allocation
6 years ago
Yancey1989
e65436103f
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
...
test=develop
6 years ago
Yancey1989
94c80347b6
update by comment
6 years ago
Qiyang Min
23761beaef
Merge pull request #14971 from velconia/imperative_mnist
...
Imperative Optimizer
6 years ago
Wu Yi
227e0c4518
fix nccl2 mode startup test=develop ( #15132 )
6 years ago
Xin Pan
9186451f60
hide GetTensor
...
test=develop
6 years ago
Yancey1989
35cda13e9f
fix unittest test=develop
6 years ago
minqiyang
2547f9d1b8
Polish code
...
test=develop
6 years ago
minqiyang
09e2e66236
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into imperative_mnist
6 years ago
Yancey1989
0a885ac12a
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
...
test=develop
6 years ago
Yancey1989
ca8c77d966
selecte execution according to strategy test=develop
6 years ago
minqiyang
858e903231
Add unittest for operator
...
test=develop
6 years ago
wopeizl
7ab501264d
Merge pull request #15069 from wopeizl/windows/dsosupport
...
add cuda dso support for windows
6 years ago
minqiyang
6a5f604607
Support stop_gradients var in imperative backward
...
test=develop
6 years ago
guru4elephant
ff739449ab
Merge pull request #15018 from guru4elephant/add_timer
...
Add debug thread function for async executor
6 years ago
Qiyang Min
e29cbfe4f7
Merge pull request #14829 from velconia/accelerate_ddpg
...
Accelerate little models
6 years ago
Tao Luo
9c2cbfb89e
Merge pull request #15093 from baojun-nervana/intel/cmake
...
Upgrade ngraph & clean up cmake
6 years ago
Zeng Jinle
25b49a0896
Merge pull request #14933 from sneaxiy/rewrite_ddim
...
Rewrite ddim
6 years ago
Wu Yi
a8bc05b5ff
Refactor distributed RPC ( #15075 )
...
* wip
* wip
* refactor no.1 dir structure test=develop
* fix linking test=develop
* fix includes test=develop
* fix build test=develop
* fix build test=develop
6 years ago
baojun-nervana
555fbc10d8
upgrade ngraph to v0.10.1 test=develop
6 years ago
baojun-nervana
c714c36482
simplify logic test=develop
6 years ago
Xin Pan
3e8408429d
Merge pull request #15053 from panyx0718/imperative_hold
...
refactor to avoid scope.
6 years ago
sneaxiy
73896eeb94
merge develop
...
test=develop
6 years ago
Wu Yi
e26cced7cc
refine batch merge pass ( #14777 )
...
* refine batch merge pass
* refine batch merge pass test=develop
6 years ago
Yancey1989
4743c9cd5d
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
sneaxiy
9a3a246cb5
fix py35 compile error
...
test=develop
6 years ago
Zhaolong Xing
4048cfa9da
Merge pull request #15048 from NHZlX/add_affine_channel_fuse
...
Add conv+ affine channel fuse pass
6 years ago
minqiyang
ef7d563db9
Add changes back
...
test=develop
6 years ago
minqiyang
a318a490ab
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_ddpg
...
test=develop
6 years ago
Zeng Jinle
c0bcff00dc
Merge pull request #14962 from sneaxiy/rewrite_variable_type
...
Rewrite variable type
6 years ago
chengduo
fe8495a758
[WIP] Refine MultiDevSSAGraph ( #15040 )
...
* refine parallel_exe
test=develop
* rename shared_var_device
* code refine
* add test_weight_decay
* remove Sort
test=develop
* Add SortForReduce
test=develop
* code refine
test=develop
* follow comment
test=develop
6 years ago
dongdaxiang
82335cd88c
Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
...
test=develop
6 years ago
wopeizl
719ebe3786
Merge pull request #15070 from wopeizl/windows/testcasefix
...
fix test issues on windows
6 years ago
Xin Pan
b91a7a9d30
clear operator changes
...
test=develop
6 years ago
Xin Pan
f52b514dcd
call kernel
6 years ago
Xin Pan
4e80e04f23
fix
...
test=develop
6 years ago
Xin Pan
61491ce250
clean
...
test=develop
6 years ago
Xin Pan
ce7e503cbe
refactor to avoid scope.
...
test=develop
6 years ago
Qiyang Min
0238a3bb4f
Merge pull request #14972 from velconia/accelerate_lstm
...
Accelerate PADDLE_ENFORCE
6 years ago
Houjiang Chen
242d3c71a6
Merge pull request #15031 from hjchen2/develop
...
Fix conv_elementwise_add2_act pass
6 years ago
Xin Pan
71a4a8e981
Merge pull request #15071 from wopeizl/revert/15035
...
Revert "cherry-pick the #12759"
6 years ago
Qiao Longfei
3b294e2e2e
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
6 years ago
sneaxiy
c4ce2e7b21
merge develop, solve conflict
...
test=develop
6 years ago
minqiyang
8ed0233924
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_ddpg
...
test=develop
6 years ago
Zeng Jinle
9c6a0203e2
Merge pull request #15073 from sneaxiy/add_scope_pool
...
Add scope_pool
6 years ago
sneaxiy
b56aca82e9
merge develop
...
test=develop
6 years ago
sneaxiy
ee83ce75bf
try to fix py35 compile error
...
test=develop
6 years ago
sneaxiy
3e917a934a
add scope_pool
...
add module cleanup
test=develop
6 years ago
Yancey1989
1a4f79a7de
fix unittest test=develop
6 years ago
Yancey1989
86bb583881
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
Yancey1989
495e73d766
enable gc
6 years ago
Yancey1989
28cdfbc2b0
delete comment code
6 years ago
Yancey1989
845bfd5807
cleanup code
6 years ago
peizhilin
2388d0e7d6
Revert "cherry-pick the #12759"
...
test=develop
This reverts commit 7f6d8acecb.
6 years ago
peizhilin
01c00b07dd
fix test issues on windows
...
test=develop
6 years ago
peizhilin
1e7f83e60a
add cuda dso support for windows
...
test=develop
6 years ago
tangwei12
dc8eca826e
code style fix, test=develop ( #15045 )
...
* code style fix, test=develop
6 years ago
Yancey1989
41a64f6a2a
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
6 years ago
nhzlx
a6aa8ea771
faster rcnn input is presistable. (fix it in paddle-trt)
...
test=develop
6 years ago
hjchen2
956cf92145
Fix conv_elementwise_add2_act pass
...
test=develop
6 years ago
Tao Luo
69659f4ae2
Merge pull request #15037 from jianhang-liu/fix/abnormal_stack_op_time
...
Fix/abnormal stack op time
6 years ago
sneaxiy
179acc60b3
fix conflict with develop
...
test=develop
6 years ago
wopeizl
09bd8fa67a
Merge pull request #15035 from wopeizl/debug/improvement1
...
cherry-pick the #12759
6 years ago
sneaxiy
dde3afe7b7
Merge develop
...
test=develop
6 years ago
dongdaxiang
2df1d80767
Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
...
test=develop
6 years ago
Wu Yi
856f0da0fe
Fp16 training ( #14992 )
...
* wip
* wip
* wip
* wip for test
* add fp16 tests test=develop
* fix cpu build test=develop
* fix test=develop
* fix py3 tests test=develop
* fix lr_scheduler dtype test=develop
* fix test=dvelop
* test fix ci compile test=develop
* fix build and merge test=develop
* fallback momentumop change to general test=develop
* make fp16 lr schedule simple test=develop
* fix ut test=develop
* fix tests test=develop
* remove fp16 learning rate cast test=develop
6 years ago
Brian Liu
e821b12f57
Fix issue which cause abnormal CPU usage in stack op
...
Stack OP has much higher CPU cost than expected in release mode.
Caused by DebugStringEx() in base class OperatorWithKernel. Actually
this issue occur for each OP which hasn't implement it's own
GetExpectedKernelType().
test=develop
6 years ago
chengduo
b9fb03cf54
Move GetTensor to tensor_util ( #15011 )
...
* refine tensor
test=develop
* refine tensor
test=develop
* fix device_context log
test=develop
6 years ago
nhzlx
73b47df1f4
Merge branch 'develop' of https://github.com/paddlepaddle/paddle into add_affine_channel_fuse
...
test=develop
6 years ago
nhzlx
ce3782c193
add affine_channel fuse.
...
fix conv+elemenwise fuse bug.
6 years ago
peizhilin
7f6d8acecb
cherry-pick the #12759
...
test=develop
6 years ago
sneaxiy
3a2afbf02e
polish code
...
test=develop
6 years ago
tensor-tang
05d1121b22
Merge pull request #14802 from mozga-intel/mozga-intel/fill_constant_operator_ngraph
...
Enable fill_constant operator for a ngraph engine
6 years ago
tensor-tang
9d4f1d468a
Merge pull request #14804 from mozga-intel/mozga-intel/top_k_operator_ngraph
...
Enable top_k operator for a ngraph engine
6 years ago
sneaxiy
68d91cd594
add copy ctor
...
test=develop
6 years ago
dongdaxiang
3b3cb4ea55
Merge branch 'add_timer' of https://github.com/guru4elephant/Paddle into add_timer
6 years ago
sneaxiy
e02f67eff7
rewrite unsafe_cast
...
test=develop
6 years ago
minqiyang
68b86d6665
Change default value to align with the original react
...
test=develop
6 years ago
dongdaxiang
2dee8f6cd5
add TrainFilesWithTimer in async_executor
6 years ago
dongdaxiang
d434fcbaa6
add TrainFilesWithTimer in async_executor
6 years ago
minqiyang
250e893745
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_ddpg
...
test=develop
6 years ago
Xin Pan
103f08f50e
Merge pull request #14910 from panyx0718/clean3
...
further updates
6 years ago
Zeng Jinle
0021b05b19
Merge pull request #14993 from sneaxiy/fix_check_lod
...
Fix CheckLoD bug
6 years ago
chengduo
79bd6dfa18
[Feature] Add Temporary Allocator ( #14875 )
...
* Add Temporal Allocator
* add Temporay Allocator to DeviceContext
test=develop
* code refine
test=develop
* fix mean_iou
test=develop
* Add DeviceTemporaryAllocator
test=develop
* fix conv_op bug
test=develop
* small fix
test=develop
* code refine
test=develop
* log refine
test=develop
* fix unit test
test=develop
* move double check
* refine concat_and_split
test=develop
* add limit_of_temporary_allocation
test=develop
* fix name
test=develop
6 years ago
sneaxiy
a30c5373eb
use std::is_sorted
...
fix comment
test=develop
6 years ago
minqiyang
8149a07a41
Fix wait stream two times bug
...
test=develop
6 years ago
sneaxiy
b8051e7927
merge develop
...
test=develop
6 years ago
Tao Luo
df1e4e2f10
fix check_lod
...
test=develop
6 years ago
minqiyang
0a4b6fc056
Remove unnessesary code
...
test=develop
6 years ago
minqiyang
53619a79b4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_lstm
6 years ago
minqiyang
6fabbd8fb8
Polish code and remove spin lock
...
test=develop
6 years ago
Zeng Jinle
95cbe07c40
Merge pull request #14836 from sneaxiy/feature/py_func
...
Featue/py_func op
6 years ago
mozga-intel
7048caf9a0
Enable top_k operator for a ngraph
...
test=develop
6 years ago
mozga-intel
ecfa68ecaa
Enable fill_constant operator for a ngraph
...
test=develop
6 years ago
sneaxiy
600f6d8272
polish code
...
test=develop
6 years ago
sneaxiy
7f6e513b1f
fix mac ci bug
...
make forward declaration
test=develop
6 years ago
sneaxiy
c1f7e54f62
merge develop
...
test=develop
6 years ago
typhoonzero
da87f7a698
Revert "[Feature] Fp16 training for resnet50 ( #14850 )"
...
This reverts commit 3d750f9c5a.
6 years ago
sneaxiy
89b9d86d9d
fix windows compile bug
...
test=develop
6 years ago
Qiao Longfei
d76bda50c4
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
6 years ago
sneaxiy
490eb9061f
polish infer shape of py_func op
...
test=develop
6 years ago
Xin Pan
969ad966ba
all converted
...
test=develop
6 years ago
Xin Pan
a872eb90c2
Merge pull request #14959 from panyx0718/clean2
...
Further op RunImpl refactor
6 years ago
sneaxiy
13429c3e9f
clean code, remove void registration
...
test why MAC CI fail again
test=develop
6 years ago
chengduo
550e7e410b
Code Clean parallel_executor.py ( #14849 )
...
* refine parallel_executor
* remove uncessary code
test=develop
6 years ago
Wu Yi
3d750f9c5a
[Feature] Fp16 training for resnet50 ( #14850 )
...
* wip
* wip
* wip
* wip for test
* add fp16 tests test=develop
* fix cpu build test=develop
* fix test=develop
* fix py3 tests test=develop
* fix lr_scheduler dtype test=develop
* fix test=dvelop
* test fix ci compile test=develop
* fix build and merge test=develop
* fallback momentumop change to general test=develop
6 years ago
minqiyang
679d1a9e0b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_lstm
6 years ago
sneaxiy
83ac85158a
polish code
...
test=develop
6 years ago
sneaxiy
045dc12766
merge develop
...
test=develop
6 years ago
sneaxiy
ce4a26ddad
clean code
...
try to fix mac compile bug?
test=develop
7 years ago
Tomasz Patejko
e9eee0de6a
MKLDNN residual connection fuse: fixing accuracy problem ( #14874 )
...
* MKLDNN residual connection fuse: conv op reused
test=develop
* MKLDNN residual connection fuse: added prints for checking fuse
test=develop
* MKLDNN residual connection fuse: add more prints
test=develop
* MKLDNN residual connection fuse: add hash function. test=develop
* MKLDNN residual connection fuse: add hash to elementwise_add
test=develop
* MKLDNN residual connection fuse: add more hashes. test=develop
* MKLDNN residual connection fuse: added hashes to relu
test=develop
* MKLDNN residual connection fuse: do not fuse when fuse_relu is on
* MKLDNN residual connection fuse: check if fuse_relu attribute is set
test=develop
* MKLDNN residual connection fuse: comment out some printouts
* MKLDNN residual connection fuse: remove unused functions in the pass code
* MKLDNN residual connection fuse: delete commented hashes and printouts
* MKLDNN residual connection fuse: remove unnecessary includes. test=develop
7 years ago
sneaxiy
53f6c6991a
polish code
...
test=develop
7 years ago
sneaxiy
74a8e6b032
merge develop
...
fix conflict
test=develop
7 years ago
Xin Pan
1fe3ac352a
move more and fix while
...
test=develop
7 years ago
sneaxiy
ae6f46a1a9
rewrite variable type
...
test=develop
7 years ago
Xin Pan
9ef8a76873
convert more
...
test=develop
7 years ago
Xin Pan
876993887b
convert more interface to avoid scope
...
test=develop
7 years ago
Xin Pan
8c19f0bfe3
fix
...
test=develop
7 years ago
mozga-intel
9035bb81fe
Enable mul operator for a ngraph engine ( #14801 )
...
* Enable mul operator for a ngraph
test=develop
* Enable activation ops test
test=develop
* Remove unused line
test=develop
7 years ago
Xin Pan
4dd61e7260
convert GetInputVarPtrs and GetOutputVarPtrs
...
test=develop
7 years ago
Xin Pan
52d3903a12
fix
...
test=develop
7 years ago
Xin Pan
0e0983cc1d
convert more infer shape
7 years ago
Xin Pan
62eb43ba98
convert more
...
test=develop
7 years ago
Xin Pan
dfcf746ea1
Merge pull request #14904 from panyx0718/clean2
...
refactor RunImpl
7 years ago
Qiao Longfei
3f3a84b6dc
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
...
test=develop
7 years ago
sneaxiy
a500dfa579
rewrite ddim
...
test=develop
7 years ago
Zeng Jinle
16c244bc3f
Merge pull request #14928 from sneaxiy/fix_gc
...
Fix gc bug
7 years ago
Xin Pan
f897bd16c0
clean
...
test=develop
7 years ago
Xin Pan
70981f5d79
clean
...
test=develop
7 years ago
colourful-tree
44ad2f4479
Merge pull request #14873 from colourful-tree/develop
...
add pslib(pserver) to paddle, an industrial scale high performance parameter server library
7 years ago
minqiyang
69642000dc
Hide KeyHasher
...
test=develop
7 years ago
Zhaolong Xing
a9fb34fad8
Merge pull request #14903 from NHZlX/add_conv_elementwise_pass
...
Add conv + elementwiseAdd pass
7 years ago
dzhwinter
7cd24b1318
add ir memory optimize. ( #14530 )
...
* follow comments. test=develop
* Fix typo
* fix compile error. test=develop
* merge develop branch. test=develop
* Remove set_equal
* Polish code
* Delete unused functions
test=develop
* polish code. test=develop
* follow comment
* polish code.
* fix windows compile error. test=develop
* fix op handle.
* rerun ci. test=develop
* rerun ci. test=develop
* rerun macci. test=develop
* polish code. test=develop
* rewrite sort code. test=develop
* remove unused code. test=develop
* fix tests. test=develop
* fix conflict. test=develop
* follow comment. test=develop
* merge develop branch. test=develop
* fix tests. test=develop
* remove ToTypeIndex. test=develop
* rerun ci. test=develop
7 years ago
Xin Pan
fb8ae30331
fix
...
test=develop
7 years ago
guru4elephant
a79a3ea2f0
Merge branch 'develop' into develop
7 years ago
wopeizl
0f085f0a5a
Merge pull request #14892 from wopeizl/windows/port3
...
fix script issue
7 years ago
Yancey1989
06936a2ff5
fix 1gpu test=develop
7 years ago
sneaxiy
c631412eab
fix gc bug
...
test=develop
7 years ago
Xin Pan
eaf8ba35b5
change input
...
test=develop
7 years ago
Xin Pan
840e6729e2
inject context
...
test=develop
7 years ago
Xin Pan
bbff0df320
try cache variables
...
test=develop
7 years ago
Xin Pan
52bc4ee75a
delay infer scope
...
test=develop
7 years ago
Yancey1989
d3a4da5cf6
fix comment test=develop
7 years ago
Yancey1989
49870f507d
delete unused code test=develop
7 years ago
Qiao Longfei
3bd54ed769
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into multithread-sparse-adam
7 years ago
minqiyang
27a0d6c2dc
Polish code
...
test=develop
7 years ago
minqiyang
aa41ee75a1
Accelerate PADDLE_ENFORCE
7 years ago
nhzlx
fcc93d96d5
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into add_conv_elementwise_pass
...
fix conflicts
test=develop
7 years ago
Yancey1989
a7d6b1f921
code cleanup test=develop
7 years ago
minqiyang
728e7e88fb
Use xxHash as scope's hash algorithm
...
test=develop
7 years ago
Yancey1989
a760a550b0
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into parallel_graph_mode
7 years ago
Yancey1989
fd144954ed
redefine api test=develop
7 years ago
minqiyang
81651fca45
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into accelerate_ddpg
...
test=develop
7 years ago
Yu Yang
bacf1d2399
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/tensor_type
7 years ago
Yu Yang
e439257ef7
Fix include style
...
test=develop
7 years ago
nhzlx
c0c9fcd9c7
add source file
...
test=develop
7 years ago
dongdaxiang
4c0a769d1d
avoid clock time in WIN32 mode
...
test=develop
7 years ago
dongdaxiang
66522046ad
remove clock time in WIN32 mode
...
test=develop
7 years ago
dongdaxiang
f2b92d77b5
remove clock time in WIN32 mode
7 years ago
nhzlx
4e4a777243
add conv+elementwiseadd pass
...
test=develop
7 years ago
gongweibao
0b1c7d838c
Add brpc serialization support. ( #11430 )
7 years ago
Yan Chunwei
a985949be9
Fea/fuse conv elementwise add fuse ( #14669 )
7 years ago
Yancey1989
4a4ccac1d0
update by comment test=develop
7 years ago
Yu Yang
04a570b463
Fix ut
...
test=develop
7 years ago
heqiaozhi
09d669ba40
fix static_cast to const_cast
7 years ago
peizhilin
23dec78772
fix script issue
...
test=develop
7 years ago
heqiaozhi
bd1c1724aa
add ps_instance doc
7 years ago