yuyang18
d1203e3822
Add types
7 years ago
yuyang18
4649f662e7
Follow comments & polish doc
7 years ago
yuyang18
ab210925b8
Add more docs
7 years ago
yuyang18
f97c5d4c47
Trainer documentation
7 years ago
yuyang18
958ab99ef8
Polish Non-Layer API
7 years ago
tangwei12
bf2c53ae0a
Merge branch 'develop' of github.com:PaddlePaddle/Paddle into new_api_about_cpkt
7 years ago
Jeff Wang
637827a5bc
Use for_test=True in the Fluid Trainer to clone the test program ( #11323 )
* Use for_test=True in the Fluid Trainer to clone the test program
* fix typo
* Should do the same thing to the inferencer
7 years ago
tangwei12
9e026a93cf
remove chief
7 years ago
tangwei12
7fbddaa64a
bug fix
7 years ago
tangwei12
cb7c1245b3
code optimized
7 years ago
tangwei12
2f44585e83
code optimized
7 years ago
tangwei12
53409a29d8
code optimized
7 years ago
tangwei12
6db240d78b
update trainer's epoch_id and step_id
7 years ago
tangwei12
9735f25011
optimized
7 years ago
Siddharth Goyal
a4237171a5
Modify optimizer in new API to support more use cases ( #11168 )
* Modify optimizer in new API to support more use cases
* Modify CMake to include only modified examples
7 years ago
tangwei12
08e5f0ae48
rename need_load_checkpoint to get_latest_checkpoint_serial
7 years ago
tangwei12
7973d9b4b5
bug fix
7 years ago
tangwei12
46f2688f30
bug fix
7 years ago
tangwei12
bca4da4225
cancel the chief-only deletion of files
7 years ago
tangwei12
e44c278e60
bug fix about clean
7 years ago
tangwei12
b44ede8033
bug fix
7 years ago
tangwei12
d712af25dc
add distribute config
7 years ago
tangwei12
0deb6f90ba
optimized annotations and code style
7 years ago
tangwei12
0211c5df0a
bug fix
7 years ago
tangwei12
9086043090
bug fix and optimize
7 years ago
tangwei12
486e1e337d
bug fix and optimize
7 years ago
tangwei12
ad9dfeb018
bug fix and optimize
7 years ago
tangwei12
5eea5db95f
optimized checkpoint and save_model
7 years ago
tangwei12
514b2427ed
add save/load persist_vars_without_grad
7 years ago
tangwei12
dca0b6d9cc
restore param_path
7 years ago
tangwei12
b044724db7
update fluid Train API param_path to checkpoint_config
7 years ago
Qiao Longfei
eb7d87545e
add trainer.stop and fix a bug for train_by_parallel_executor ( #10762 )
7 years ago
qiaolongfei
e8d24aa144
Inferencer support parallel_executor
7 years ago
yuyang18
f04754886b
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/trainer_by_pe
7 years ago
typhoonzero
a41a94f2ee
support nccl2 dist train in trainer
7 years ago
yuyang18
791af3a088
Merge branch 'develop' of https://github.com/PaddlePaddle/Paddle into feature/trainer_by_pe
7 years ago
Qiao Longfei
1c4bb5c83d
user need to set feed order for Trainer.train and Trainer.test ( #10679 )
7 years ago
daminglu
74ca73b80d
Update trainer api ( #10674 )
7 years ago
yuyang18
c4ad0dd084
Add fetch metrics
7 years ago
yuyang18
2a0205a5d9
Draft for train by parallel executor
7 years ago
Qiao Longfei
2a971f3084
Add inferencer infer ( #10445 )
* add Inference.infer
* optimize code
* update no_test_word2vec_new_api.py
* update trainer
* split check_and_get_place
* use inference_program to save inference model in Trainer
* update demo
* update save_inference_model
* clean code
7 years ago
fengjiayi
ba57348f8f
trainer.test() ( #10453 )
* a draft of trainer.test()
* polish trainer.test()
* polish trainer.test()
* update code format
* update
* polish code
* polish code
* polish code
* Make trainer.test follow the rule of returning [loss, metric, metric, ...]
7 years ago
Jeff Wang
889c919048
Use _prog_and_scope_guard to switch the scope ( #10421 )
7 years ago
Yancey
5b06944857
fix trainer import error on ce ( #10448 )
* fix trainer import error on ce
* fix setup.py.in
7 years ago
Yancey
bb3247e339
fix trainer.py import error ( #10442 )
7 years ago
Jeff Wang
bd66eed50a
Trainer save load params ( #10386 )
* Load/save the params from the params_path
* Switch to use load_persistables and save_persistables
* Instead of setting up the executor to run the program and scope, pass the program to load_persistables
7 years ago
Helin Wang
8ee23da846
Fluid new API: dist train without modifying code
Works with 1 trainer and 1 pserver; 2 trainers and 1 pserver will get stuck at the
end of the first step, still investigating.
The user only needs to set environment variables to enable distributed
training.
run pserver:
PADDLE_TRAINING_ROLE=PSERVER PADDLE_PSERVER_IPS=127.0.0.1 PADDLE_TRAINERS=2 PADDLE_CURRENT_IP=127.0.0.1 python no_test_word2vec_new_api.py
run trainer:
PADDLE_TRAINING_ROLE=TRAINER PADDLE_PSERVER_IPS=127.0.0.1 PADDLE_TRAINERS=2 PADDLE_TRAINER_ID=0 python no_test_word2vec_new_api.py
7 years ago
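The commit above drives the training role entirely through environment variables (PADDLE_TRAINING_ROLE, PADDLE_PSERVER_IPS, PADDLE_TRAINERS, PADDLE_CURRENT_IP, PADDLE_TRAINER_ID). A minimal sketch of how such a configuration could be parsed is below; parse_dist_env is a hypothetical helper, not part of the Paddle API, and the variable names and defaults are taken from the commit message.

```python
def parse_dist_env(env):
    """Parse the distributed-training environment variables from the
    commit message into a role configuration dict.

    Hypothetical helper for illustration only; not part of Paddle.
    """
    role = env.get("PADDLE_TRAINING_ROLE", "TRAINER")
    config = {
        "role": role,
        # Comma-separated list of parameter-server IPs.
        "pserver_ips": env.get("PADDLE_PSERVER_IPS", "").split(","),
        # Total number of trainers in the job.
        "trainers": int(env.get("PADDLE_TRAINERS", "1")),
    }
    if role == "TRAINER":
        # Each trainer is identified by its index.
        config["trainer_id"] = int(env.get("PADDLE_TRAINER_ID", "0"))
    else:
        # A pserver advertises its own IP.
        config["current_ip"] = env.get("PADDLE_CURRENT_IP", "")
    return config

# Usage, mirroring the pserver command from the commit message:
pserver_cfg = parse_dist_env({
    "PADDLE_TRAINING_ROLE": "PSERVER",
    "PADDLE_PSERVER_IPS": "127.0.0.1",
    "PADDLE_TRAINERS": "2",
    "PADDLE_CURRENT_IP": "127.0.0.1",
})
```

In a real process the `env` argument would be `os.environ`; taking it as a plain dict keeps the sketch testable.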
Helin Wang
a66052c6ff
improve trainer API
- The trainer and inferencer will load params from disk if the param_path
argument is not None in their constructors.
- Remove params.py, we will expose core.Scope to the user if needed
(e.g., for GAN). Currently we will not expose it, unless we clearly
know doing so can support GAN.
- Add `save_params` to Trainer (a TODO item).
- Rename "network" to "program"
7 years ago
Yu Yang
1bb579a3f5
A naive trainer implementation
7 years ago
Helin Wang
b5dd215d46
improve comments
7 years ago