|
|
|
@ -5,17 +5,17 @@
|
|
|
|
|
The *session* object encapsulates the environment in which the
|
|
|
|
|
computation graph is executed.
|
|
|
|
|
|
|
|
|
|
We will have *local* session and *remote* session, they offer the
|
|
|
|
|
We will have the *local* session and *remote* session, they offer the
|
|
|
|
|
same [interface](#interface). The local session encapsulates the local
|
|
|
|
|
runtime environment and the remote session encapsulates the cluster
|
|
|
|
|
runtime envrionment.
|
|
|
|
|
runtime environment.
|
|
|
|
|
|
|
|
|
|
The local runtime envrionment contains:
|
|
|
|
|
The local runtime environment contains:
|
|
|
|
|
|
|
|
|
|
1. computation devices (i.e., CPU, GPU) handles, and
|
|
|
|
|
1. the [scope](../scope.md) which holds all variables.
|
|
|
|
|
|
|
|
|
|
The remote runtime envrionment contains:
|
|
|
|
|
The remote runtime environment contains:
|
|
|
|
|
|
|
|
|
|
1. computation devices (i.e., CPU and GPU on node 0, 1) in a cluster,
|
|
|
|
|
and
|
|
|
|
@ -29,12 +29,12 @@ remote computation resource in a cluster from his local computer.
|
|
|
|
|
|
|
|
|
|
## Background
|
|
|
|
|
|
|
|
|
|
The current design has an implicit global session on which
|
|
|
|
|
The current design has an implicit global session in which
|
|
|
|
|
`paddle.eval()` is executed. The pain point is:
|
|
|
|
|
|
|
|
|
|
Since the user is not able to explicitly switch between runtime
|
|
|
|
|
environments such as the scope and the device contexts, the user
|
|
|
|
|
cannot run a topology in two independent environments.
|
|
|
|
|
environments, the user cannot run a topology in two independent
|
|
|
|
|
environments.
|
|
|
|
|
|
|
|
|
|
For example, in reinforcement learning, the user may want to have a
|
|
|
|
|
stale model for inference and a fresh model for training, and only
|
|
|
|
@ -49,12 +49,12 @@ We need the session object to address above issues.
|
|
|
|
|
## Session
|
|
|
|
|
|
|
|
|
|
A session is an object that owns the runtime environment. All
|
|
|
|
|
computations are executed through `session.eval`.
|
|
|
|
|
computations are executed through `session.eval()`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Interface
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
```python
|
|
|
|
|
eval(
|
|
|
|
|
targets,
|
|
|
|
|
feed_dict=None,
|
|
|
|
@ -64,37 +64,57 @@ eval(
|
|
|
|
|
Evaluates the target Operations or Variables in `targets`.
|
|
|
|
|
|
|
|
|
|
- *targets*: the evaluation targets. Can be a single Operation or
|
|
|
|
|
Variable, or a list with the Operations or Variables as elements.
|
|
|
|
|
Variable, or a list with the Operations or Variables as
|
|
|
|
|
elements. The value returned by `eval()` has the same shape as the
|
|
|
|
|
`target` argument.
|
|
|
|
|
|
|
|
|
|
The PaddlePaddle program is represented by
|
|
|
|
|
the [ProgramDesc](../design/program.md), `eval()` will infer the
|
|
|
|
|
ProgramDesc from the given targets and run the PaddlePaddle
|
|
|
|
|
program. Please
|
|
|
|
|
see
|
|
|
|
|
[this graph](./distributed_architecture.md#local-training-architecture) for
|
|
|
|
|
the detailed illustration for the local session
|
|
|
|
|
and
|
|
|
|
|
[this graph](./distributed_architecture.md#distributed-training-architecture) for
|
|
|
|
|
the detailed illustration for the remote session.
|
|
|
|
|
|
|
|
|
|
- *feed_dict*: a dictionary that contains the tensors which override
|
|
|
|
|
the edges of the computation graph.
|
|
|
|
|
|
|
|
|
|
The value returned by `eval()` has the same shape as the `target`
|
|
|
|
|
argument.
|
|
|
|
|
feed_dict not only can provide the input data, it can override any
|
|
|
|
|
OP's input as well:
|
|
|
|
|
|
|
|
|
|
The computation graph is implicitly inferred from the targets.
|
|
|
|
|
```python
|
|
|
|
|
a = pd.constant(1.0, name="a")
|
|
|
|
|
b = pd.constant(2.0)
|
|
|
|
|
c = pd.mul(a,b)
|
|
|
|
|
sess.eval(targets=c, feed_dict={"a":3.0}) # returns 6.0
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
- *feed_dict*: a dictionary that contains the tensors which overrides
|
|
|
|
|
the edges of the computation graph.
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
```python
|
|
|
|
|
close()
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Closes the session. Calling this method releases the scope.
|
|
|
|
|
Closes the session and releases the scope that the session owns.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
### Create a Local Session
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
```python
|
|
|
|
|
session(
|
|
|
|
|
gpu_ids=None
|
|
|
|
|
devices=None
|
|
|
|
|
)
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
Creates a new session. One session owns one scope, so creating
|
|
|
|
|
multiple sessions will create different scopes.
|
|
|
|
|
|
|
|
|
|
- *gpu_ids*: a single `int` or a list of `int` of the GPU IDs to be
|
|
|
|
|
used as the computation devices. If not specified, all avaiable GPUs
|
|
|
|
|
will be used.
|
|
|
|
|
- *devices*: a single `string` or a list of `string` of device names,
|
|
|
|
|
the corresponding devices will be the computation devices for
|
|
|
|
|
`eval()`. If not specified, all available devices (e.g., all GPUs)
|
|
|
|
|
will be used. The user doesn't need to specify the CPU device since
|
|
|
|
|
it will be always used.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
#### Example
|
|
|
|
@ -103,14 +123,14 @@ multiple sessions will create different scopes.
|
|
|
|
|
a = paddle.constant(1.0)
|
|
|
|
|
b = paddle.constant(2.0)
|
|
|
|
|
c = a + b
|
|
|
|
|
sess = paddle.session(gpu_ids=[0,1])
|
|
|
|
|
sess = paddle.session(devices=["gpu:0", "gpu:1", "fpga:0"])
|
|
|
|
|
sess.eval(c)
|
|
|
|
|
sess.close()
|
|
|
|
|
```
|
|
|
|
|
|
|
|
|
|
### Create a Remote Session
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
```python
|
|
|
|
|
create_cloud_job(
|
|
|
|
|
name,
|
|
|
|
|
num_trainer,
|
|
|
|
@ -125,7 +145,7 @@ create_cloud_job(
|
|
|
|
|
|
|
|
|
|
Creates a Paddle Cloud job. Fails if the job name exists.
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
```python
|
|
|
|
|
get_cloud_job(
|
|
|
|
|
name
|
|
|
|
|
)
|
|
|
|
@ -133,7 +153,7 @@ get_cloud_job(
|
|
|
|
|
|
|
|
|
|
Gets a Paddle Cloud job.
|
|
|
|
|
|
|
|
|
|
```
|
|
|
|
|
```python
|
|
|
|
|
remote_session(
|
|
|
|
|
job
|
|
|
|
|
)
|
|
|
|
|