Paddle


Author	SHA1	Message	Date
Leo Chen	0f1fde5102	fix the modification of set_expected_place (#31177 ) * revert the modification of set_expected_place * set device before op run * add ut	4 years ago
Qi Li	1d996637e6	[ROCM] update fluid imperative for rocm (part1), test=develop (#31017 ) * [ROCM] update fluid imperative for rocm (part1), test=develop * [ROCM] update reducer.cc after merge, test=develop * update reducer cmake after merge, test=develop	4 years ago
ShenLiang	9401173e3a	Remove scale loss before reduce in dygraph (#30807 )	4 years ago
ShenLiang	dae3e1f337	Solve inconsistent order in each card in dynamic graph (#30931 )	4 years ago
liuyuhui	87197f8c2e	[kunlun]fix sync in multi kunlun xpu dygraph training. (#30943 )	4 years ago
WangXi	6e3856d3fb	fix xpu dygraph place (#30868 )	4 years ago
wanghuancoder	35c5b23f68	use iwyu clean include second time, test=develop (#30829 ) * use iwyu clean include second time, test=develop	4 years ago
WangXi	b1026f64af	【kunlun】dygraph supports multi xpu card training (#30671 )	4 years ago
ShenLiang	3858f458ea	rm Singleton of reducer (#30775 )	4 years ago
wanghuancoder	d1b25ed9d7	add some RecordEvent, for dygraph timeline (#30299 ) * add some RecordEvent, for dygraph timeline, test=develop * change GpuMemcpySync to memory::Copy, test=develop * fix compile problem, test=develop * fix compile problem, test=develop * fix, test=develop * fix, test=develop	4 years ago
WangXi	572c466d19	[Prepare for MultiProcess xpu] unified gen nccl id, refine imperative reducer (#30455 )	4 years ago
Zhou Wei	fb20ec9a4e	fix bug of multicard grad ncclAllReduce (#30553 )	4 years ago
pangyoki	00554b3f6b	fix error message of Inplace strategy (#30520 )	4 years ago
Leo Chen	7043b8cfc6	support layer_norm fp16 in dygraph amp (#30430 ) * support layer_norm fp16 in dygraph amp * add ut * refine code	4 years ago
pangyoki	13d757362c	Add Inplace strategy (Output reuse Input Varbase) in dygraph (#30103 ) * add view strategy on squeeze,unsqueeze,reshape,flatten * add squeeze unittest * add unittests * use View strategy as name rather than Reuse Allacation * fix view api doc * fix format * use core.ops when input of reshape2 is Tensor * fix test_cross_entropy_loss error because of reshape2 * fix test_cross_entropy_loss error because of reshape2 * add inplace strategy * add elementwise_add sub * let backward op not use inplace * grad op do not use inplace * fix memory increase error and add leaf error message * delete selected_rows * change op_function * little change * solve HandleViewBetweenInputAndOutput * add unittest and leaf error message * merge view error * optimize op_function_generator format and support sum inplace op * fix format of basic_engine * fix format for framework * little change of variable wrapper * add reshape, squeeze, unsqueeze, scatter api * add relu elu tanh softmax inplace api * fix test_squeeze_op unittest * fix test_relu_op unittest * fix comment problems * delete sample code of inplace api * add reference of grad_pending_nodes in basic_engine * fix unittest name * add inplace apis into wlist * fix error message * add PADDLE_ENFORCE for set grad op twice * fix head file error	4 years ago
ShenLiang	a60f17b89d	Support unused parameters in dynamic graph distributed (#30224 )	4 years ago
石晓伟	8ce2482b80	fix header file paths of gflags, commit 1, test=develop (#30271 )	4 years ago
Leo Chen	8696335f86	Fix dtype of ungenerated grad var (#28511 ) * fix dtype of ungenerated grad var * update ut * refine code * set default dtype * fix could_use_cudnn bug * remove debug code * re-implement * fix bug	4 years ago
Leo Chen	1f97d61c68	Add callback after TensorCopy (#30123 ) * change to tensor copy sync * change to tensor copy sync * make copy_to safe when use TensorCopy * refine code * add ut * add cudapinned garbagecollector * add testcase: cpu place -> cuda pinned place	4 years ago
Chen Weihang	d0fb06b27f	[Complex] Simplify prepared op impl to improve performance (#30153 ) * simplify prepared op impl to improve performance * fix kunlun compile error * continue fix kunlun compile error * only transform diff place when dtype diff * fix failed unittests * remove useless file * polish impl by review comment	4 years ago
hong	297fff1a79	support dygraph in xpu place (#30051 ) * support dygraph in xpu place; test=develop * fix cpu/gpu compile error; test=develop * fix compile error; test=develop * fix xpu compile error; testd=develop	4 years ago
Chen Weihang	a1d9a14e89	support grad accumulated across batch (#29942 )	5 years ago
Chen Weihang	a6072055be	[Complex] Handle complex to real after type promotion (#29855 ) * try to add fwd op input dtypes * refactor base impl * return tmp_ins after dygraph prepare data * fix typo found in debug * polish comment & add complex net test * revert detail change * fix unittest failed * add complex kernel condition control * fix xpu test failed & polish comment * polish details by review comments	5 years ago
Chen Weihang	1a304e6c06	[Complex] Add support for complex grad accumulated (#29889 ) * add support for complex grad accumulated * add unittest for coverage * update test dtype * remove useless blank line	5 years ago
ShenLiang	f65f1caad3	opt sparse allreduce using ncclgather (#29819 )	5 years ago
ShenLiang	01e2874a0e	Support multi-stream communication for dynamic graph distributed (#29525 ) * fix fleet for multi-stream * fix memcpy for ncclid * use sync to solve move operation	5 years ago
Zhou Wei	e74e1a226c	support deepcopy for Layer/Tensor/Paramerbase (#29387 ) * support deepcopy for Layer/Tensor/Paramerbase * fix some code	5 years ago
ShenLiang	2ef9e0e23c	Rebuild group automatically in dynamic graph distributed (#29255 ) * add tensor_indices in AssignGroupBySize * add rebuild group in reducer	5 years ago
Zhou Wei	24ba9ed436	fix that parameters'grad has grad var (#29408 )	5 years ago
Leo Chen	b58cfff89d	use has_grad instead of train_mode (#29309 ) * use has_grad instead of train_mode * add vlog for debug * fix ut * fix ut	5 years ago
ShenLiang	696dc4bb13	fix the warning of reducer (#29323 )	5 years ago
Zhou Wei	c0a991c874	accumulate gradient for leaf tensor with previous graph and expose leaf tensor concept (#28429 ) * The leaf tensor concept is exposed and the gradient accumulation of leaf tensor * The leaf tensor concept is exposed and the gradient accumulation of leaf tensor * fix coverage * fix api doc * fix CI unittest * fix CI unittest * fix unitest * empty tensor does’t need inner_var_ * fix some error message	5 years ago
liym27	865a45984f	Check whether there is any inplace operation affecting gradient calculation. (#27901 ) * Add a class TensorInplaceVersion to count the inplace version and put it in framework::Tensor instead of Allocation or Variable. * Add a new attribute `_inplace_version` for VarBase. * Raise exception if an inplace operation can result in incorrect gradient computation. * Add a new interface _bump_inplace_version() for VarBase to bump the version whenever the Tensor is modified through an inplace operation. * For api assign, call _bump_inplace_version() when it's an inplace operation inn dynamic mode. * Use original var_wrapper if the inplace_version is not changed. * Replace SnapshotVarWrapperList with SnapshotVarWrapper to optimize performane.	5 years ago
ShenLiang	e2d01eb650	Support dynamic graph distributed (#28997 ) * add reducer * refine envent for memorycopy * add concat&split for allreduce * apply concat & split for fuse tensor * fix nccl dep * fix the untest, compile problem and ddp initialize problem * fix untest for mac & add some comments & solve the repeated param in sublayers * fix untest for windows & fix document	5 years ago
Leo Chen	770395cb93	Split train_mode and has_grad for tracer (#29064 ) * split train_mode and has_grad * fix format * fix ci problems * fix sample code	5 years ago
Chen Weihang	7eeb99fe02	Add basic hook classes for dygraph & implement reduce hook (#28584 ) * add base hook classes and reduce hook impl * fix constructor typo * polish comment format * refactor baisc hook class design * polish design details	5 years ago
danleifeng	a24d186814	fix nccl init failed in parallel dygraph mode (#28497 )	5 years ago
Chen Weihang	155b4f9b6c	Remove selected rows all reduce over height check (#28460 ) * remove slelected rows all reduce over height check * polish unittest	5 years ago
Chen Weihang	c42e656179	Add retry for dygraph parallel socket bind (#28404 ) * add retry for dygraph parallel socket bind * change to loop always * fix writing error	5 years ago
Leo Chen	44a476c2ab	support cuda pinned place (#28416 )	5 years ago
lidanqing	4ea2330759	use FLAGS_use_mkldnn to prevent unnecessary attrs copy (#28146 )	5 years ago
danleifeng	f29fb396df	dygraph nccl init support host domain name (#28107 ) * nccl init support hostname and ip; test=develop	5 years ago
Leo Chen	049696bf67	Refine the format of printing tensor (#27673 ) * add sumary feature * refine printting tensor * add sci_mode * add sample code * fix indent error * fix _format_item * polish code * support item indent * add ut * set place for ut * fix py2 issue * fix ut	5 years ago
arlesniak	0ecf441af1	Add support for mkldnn ops types selection with FLAGS in dygraph (#27482 ) * Add support for mkldnn ops types selection with FLAGS in dygraph * use regex to match DNNL verbose * python3 encoding fix	5 years ago
Leo Chen	a5b3263782	Refine error msg in paddle/fluid/imperative (#27521 ) * refine err msg * follow comments	5 years ago
wanghuancoder	df43905f12	use iwyu clean include (#27267 ) * use iwyu clean include, test=develop, test=win * compilation error, test=develop * fix compilation error2, test=develop * fix compilation error3, test=develop * fix compilation error4, test=develop * fix compilation error5, test=develop * fix compilation error6, test=develop * fix compilation error7, test=develop * fix compilation error8, test=develop * fix compilation error8, test=develop * fix compilation error10, test=develop * fix compilation error11, test=develop	5 years ago
arlesniak	885c61f086	Add use of global flag 'use_mkldnn' to layer_helper (#26497 ) * get use of global 'use_mkldnn' in layer_helper * update for CI * update for CI, relu test * update for CI, relu test added, make FLAGS_use_mkldnn a public flag * added more strict tests, fixes after review * fixes after review * fixes after review, CI stuff	5 years ago
Zhen Wang	f32ae272ec	Remove `sorted_sum_gradient_` form BasicEngine and PartialGradTask. (#26766 ) Use `Tensor` instead of `Variable` in the doc of paddle.grad.	5 years ago
Zhen Wang	f9066e6a6f	Update the demo code and the doc of varbase.backward. (#26506 ) * update the demo code and the doc of varbase.backward. * update the doc of the fake interface `paddle.fluid.Variable`. * remove BackwardStrategy.	5 years ago
QingshuChen	138ecf24aa	support Baidu Kunlun AI Accelerator (#25959 ) * support Baidu AI Accelerator * test=kunlun * minor * test=kunlun * support xpu op in separate file * test=kunlun * update XPU error message and remove duplicated code * test=kunlun * minor * test=kunlun * minor * test=kunlun	5 years ago


269 Commits (e8cdb49aa9c29390d036d0a9984b4b458a506908)