
Commit f0e797e

Doc fix and enhancement for lstm_unit python wrapper.
1 parent 39502e6 commit f0e797e

python/paddle/v2/fluid/layers/nn.py

Lines changed: 66 additions & 60 deletions
@@ -151,7 +151,7 @@ def embedding(input, size, is_sparse=False, param_attr=None, dtype='float32'):
 
     Args:
        input(Variable): Input to the function
-       size(tuple|list|None): Shape of the look up table parameter
+       size(tuple|list|None): Shape of the look up table parameter
        is_sparse(bool): Boolean flag that specifying whether the input is sparse
        param_attr(ParamAttr): Parameters for this layer
        dtype(np.dtype|core.DataType|str): The type of data : float32, float_16, int etc
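
For readers new to this layer, a minimal usage sketch of the arguments documented above; the data layer, vocabulary size, and embedding width are illustrative assumptions, not part of this commit:

import paddle.v2.fluid as fluid

# Hypothetical word-id input (illustrative name and shape).
words = fluid.layers.data(name='words', shape=[1], dtype='int64', lod_level=1)

# size = [vocabulary_size, embedding_dim]; is_sparse requests sparse gradient updates.
emb = fluid.layers.embedding(input=words, size=[10000, 32],
                             dtype='float32', is_sparse=True)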
@@ -366,9 +366,9 @@ def cross_entropy(input, label, **kwargs):
 
     1) One-hot cross-entropy:
        `soft_label = False`, `Label[i, 0]` indicates the class index for sample i:
-
+
     .. math::
-
+
         Y[i] = -\log(X[i, Label[i]])
 
     2) Soft-label cross-entropy:
@@ -386,15 +386,15 @@ def cross_entropy(input, label, **kwargs):
     As a special case of 2), when each row of 'label' has only one
     non-zero element which is equal to 1, soft-label cross-entropy degenerates
     to a one-hot cross-entropy with one-hot label representation.
-
+
     Args:
-        input (Variable|list): a 2-D tensor with shape [N x D], where N is the
-            batch size and D is the number of classes. This input is a probability
+        input (Variable|list): a 2-D tensor with shape [N x D], where N is the
+            batch size and D is the number of classes. This input is a probability
            computed by the previous operator, which is almost always the result
            of a softmax operator.
-        label (Variable|list): the ground truth which is a 2-D tensor. When
-            `soft_label` is set to `False`, `label` is a tensor<int64> with shape
-            [N x 1]. When `soft_label` is set to `True`, `label` is a
+        label (Variable|list): the ground truth which is a 2-D tensor. When
+            `soft_label` is set to `False`, `label` is a tensor<int64> with shape
+            [N x 1]. When `soft_label` is set to `True`, `label` is a
            tensor<float/double> with shape [N x D].
        soft_label (bool, via `**kwargs`): a flag indicating whether to interpretate
            the given labels as soft labels, default `False`.
@@ -403,7 +403,7 @@ def cross_entropy(input, label, **kwargs):
         A 2-D tensor with shape [N x 1], the cross entropy loss.
 
     Raises:
-        `ValueError`: 1) the 1st dimension of `input` and `label` are not equal; 2) when \
+        `ValueError`: 1) the 1st dimension of `input` and `label` are not equal; 2) when \
            `soft_label == True`, and the 2nd dimension of `input` and `label` are not \
            equal; 3) when `soft_label == False`, and the 2nd dimension of `label` is not 1.
 
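
A short sketch of how the two label forms documented above are passed in; the layer names, shapes, and the ten-class setup are assumptions for illustration only:

import paddle.v2.fluid as fluid

# 'prob' stands in for the softmax output described above, shape [N x D] with D = 10.
prob = fluid.layers.data(name='prob', shape=[10], dtype='float32')

# One-hot case: label is an int64 tensor of shape [N x 1].
hard_label = fluid.layers.data(name='label', shape=[1], dtype='int64')
loss_hard = fluid.layers.cross_entropy(input=prob, label=hard_label)

# Soft-label case: label is a float tensor of shape [N x D].
soft_label = fluid.layers.data(name='soft_label', shape=[10], dtype='float32')
loss_soft = fluid.layers.cross_entropy(input=prob, label=soft_label, soft_label=True)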
@@ -727,9 +727,9 @@ def _get_default_param_initializer():
 
 def sequence_pool(input, pool_type, **kwargs):
     """
-    This function add the operator for sequence pooling.
-    It pools features of all time-steps of each instance, and is applied
-    on top of the input using pool_type mentioned in the parameters.
+    This function add the operator for sequence pooling.
+    It pools features of all time-steps of each instance, and is applied
+    on top of the input using pool_type mentioned in the parameters.
 
     It supports four pool_type:
 
@@ -758,7 +758,7 @@ def sequence_pool(input, pool_type, **kwargs):
 
     Args:
        input(variable): The input variable which is a LoDTensor.
-       pool_type (string): The pooling type of sequence_pool.
+       pool_type (string): The pooling type of sequence_pool.
            It supports average, sum, sqrt and max.
 
     Returns:
@@ -768,7 +768,7 @@ def sequence_pool(input, pool_type, **kwargs):
 
     .. code-block:: python
 
-        x = fluid.layers.data(name='x', shape=[7, 1],
+        x = fluid.layers.data(name='x', shape=[7, 1],
                      dtype='float32', lod_level=1)
        avg_x = fluid.layers.sequence_pool(input=x, pool_type='average')
        sum_x = fluid.layers.sequence_pool(input=x, pool_type='sum')
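
The hunk ends after the first two pool types; following the same pattern, the remaining two documented values would be used like this (a sketch continuing the example above, not lines from this diff):

sqrt_x = fluid.layers.sequence_pool(input=x, pool_type='sqrt')
max_x = fluid.layers.sequence_pool(input=x, pool_type='max')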
@@ -816,7 +816,7 @@ def sequence_first_step(input, **kwargs):
 
     .. code-block:: python
 
-        x = fluid.layers.data(name='x', shape=[7, 1],
+        x = fluid.layers.data(name='x', shape=[7, 1],
                      dtype='float32', lod_level=1)
        x_first_step = fluid.layers.sequence_first_step(input=x)
    """
@@ -849,7 +849,7 @@ def sequence_last_step(input, **kwargs):
 
     .. code-block:: python
 
-        x = fluid.layers.data(name='x', shape=[7, 1],
+        x = fluid.layers.data(name='x', shape=[7, 1],
                      dtype='float32', lod_level=1)
        x_last_step = fluid.layers.sequence_last_step(input=x)
    """
@@ -1168,25 +1168,26 @@ def lstm_unit(x_t,
 
     .. math::
 
-        i_t & = \sigma(W_{x_i}x_{t} + W_{h_i}h_{t-1} + W_{c_i}c_{t-1} + b_i)
+        i_t & = \sigma(W_{x_i}x_{t} + W_{h_i}h_{t-1} + b_i)
 
-        f_t & = \sigma(W_{x_f}x_{t} + W_{h_f}h_{t-1} + W_{c_f}c_{t-1} + b_f)
+        f_t & = \sigma(W_{x_f}x_{t} + W_{h_f}h_{t-1} + b_f)
 
-        c_t & = f_tc_{t-1} + i_t tanh (W_{x_c}x_t+W_{h_c}h_{t-1} + b_c)
+        c_t & = f_tc_{t-1} + i_t tanh (W_{x_c}x_t + W_{h_c}h_{t-1} + b_c)
 
-        o_t & = \sigma(W_{x_o}x_{t} + W_{h_o}h_{t-1} + W_{c_o}c_t + b_o)
+        o_t & = \sigma(W_{x_o}x_{t} + W_{h_o}h_{t-1} + b_o)
 
        h_t & = o_t tanh(c_t)
 
-    The inputs of lstm unit includes :math:`x_t`, :math:`h_{t-1}` and
-    :math:`c_{t-1}`. The implementation separates the linear transformation
-    and non-linear transformation apart. Here, we take :math:`i_t` as an
-    example. The linear transformation is applied by calling a `fc` layer and
-    the equation is:
+    The inputs of lstm unit include :math:`x_t`, :math:`h_{t-1}` and
+    :math:`c_{t-1}`. The 2nd dimensions of :math:`h_{t-1}` and :math:`c_{t-1}`
+    should be same. The implementation separates the linear transformation and
+    non-linear transformation apart. Here, we take :math:`i_t` as an example.
+    The linear transformation is applied by calling a `fc` layer and the
+    equation is:
 
     .. math::
 
-        L_{i_t} = W_{x_i}x_{t} + W_{h_i}h_{t-1} + W_{c_i}c_{t-1} + b_i
+        L_{i_t} = W_{x_i}x_{t} + W_{h_i}h_{t-1} + b_i
 
     The non-linear transformation is applied by calling `lstm_unit_op` and the
     equation is:
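
To make the two-stage split concrete, here is a NumPy sketch of the non-linear stage defined by the corrected equations above, given the four linear parts. The function and the names L_i, L_f, L_c, L_o are hypothetical and only illustrate the math; this is not the lstm_unit_op implementation:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_unit_nonlinear(L_i, L_f, L_c, L_o, c_prev):
    # Each L_* is an output of the linear (fc) stage, e.g. L_i = W_xi*x_t + W_hi*h_{t-1} + b_i.
    i_t = sigmoid(L_i)                        # input gate
    f_t = sigmoid(L_f)                        # forget gate
    c_t = f_t * c_prev + i_t * np.tanh(L_c)   # new cell state
    o_t = sigmoid(L_o)                        # output gate
    h_t = o_t * np.tanh(c_t)                  # new hidden state
    return h_t, c_t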
@@ -1213,14 +1214,15 @@ def lstm_unit(x_t,
     Raises:
        ValueError: The ranks of **x_t**, **hidden_t_prev** and **cell_t_prev**\
                    not be 2 or the 1st dimensions of **x_t**, **hidden_t_prev** \
-                    and **cell_t_prev** not be the same.
+                    and **cell_t_prev** not be the same or the 2nd dimensions of \
+                    **hidden_t_prev** and **cell_t_prev** not be the same.
 
     Examples:
 
        .. code-block:: python
 
            x_t = fluid.layers.fc(input=x_t_data, size=10)
-            prev_hidden = fluid.layers.fc(input=prev_hidden_data, size=20)
+            prev_hidden = fluid.layers.fc(input=prev_hidden_data, size=30)
            prev_cell = fluid.layers.fc(input=prev_cell_data, size=30)
            hidden_value, cell_value = fluid.layers.lstm_unit(x_t=x_t,
                                                              hidden_t_prev=prev_hidden,
@@ -1239,7 +1241,11 @@ def lstm_unit(x_t,
 
     if x_t.shape[0] != hidden_t_prev.shape[0] or x_t.shape[
            0] != cell_t_prev.shape[0]:
-        raise ValueError("The 1s dimension of x_t, hidden_t_prev and "
+        raise ValueError("The 1s dimensions of x_t, hidden_t_prev and "
+                         "cell_t_prev must be the same.")
+
+    if hidden_t_prev.shape[1] != cell_t_prev.shape[1]:
+        raise ValueError("The 2nd dimensions of hidden_t_prev and "
                         "cell_t_prev must be the same.")
 
     if bias_attr is None:
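
The two checks above are what this commit adds to the wrapper; a small standalone mirror of that validation, with illustrative shapes, shows which inputs are now rejected (error messages paraphrased):

def check_lstm_unit_shapes(x_t_shape, hidden_shape, cell_shape):
    # Mirrors the wrapper's validation: matching batch (1st) dims everywhere,
    # plus the new requirement that hidden and cell share the 2nd dim.
    if x_t_shape[0] != hidden_shape[0] or x_t_shape[0] != cell_shape[0]:
        raise ValueError("The 1st dimensions of x_t, hidden_t_prev and "
                         "cell_t_prev must be the same.")
    if hidden_shape[1] != cell_shape[1]:
        raise ValueError("The 2nd dimensions of hidden_t_prev and "
                         "cell_t_prev must be the same.")

check_lstm_unit_shapes([16, 10], [16, 30], [16, 30])    # passes
# check_lstm_unit_shapes([16, 10], [16, 30], [16, 20])  # raises the ValueError added here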
@@ -1268,17 +1274,17 @@ def lstm_unit(x_t,
 
 def reduce_sum(input, dim=None, keep_dim=False):
     """
-    Computes the sum of tensor elements over the given dimension.
+    Computes the sum of tensor elements over the given dimension.
 
     Args:
        input (Variable): The input variable which is a Tensor or LoDTensor.
-        dim (int|None): The dimension along which the sum is performed. If
-            :attr:`None`, sum all elements of :attr:`input` and return a
-            Tensor variable with a single element, otherwise must be in the
-            range :math:`[-rank(input), rank(input))`. If :math:`dim < 0`,
+        dim (int|None): The dimension along which the sum is performed. If
+            :attr:`None`, sum all elements of :attr:`input` and return a
+            Tensor variable with a single element, otherwise must be in the
+            range :math:`[-rank(input), rank(input))`. If :math:`dim < 0`,
            the dimension to reduce is :math:`rank + dim`.
-        keep_dim (bool): Whether to reserve the reduced dimension in the
-            output Tensor. The result tensor will have one fewer dimension
+        keep_dim (bool): Whether to reserve the reduced dimension in the
+            output Tensor. The result tensor will have one fewer dimension
            than the :attr:`input` unless :attr:`keep_dim` is true.
 
     Returns:
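
A hedged sketch of the dim / keep_dim semantics documented above, assuming fluid is imported as in the earlier sketches and x is a Tensor variable already in the program; the commented results follow directly from the definition:

# x is assumed to hold:
#    [[0.2, 0.3, 0.5, 0.9],
#     [0.1, 0.2, 0.6, 0.7]]
fluid.layers.reduce_sum(x)                         # [3.5]
fluid.layers.reduce_sum(x, dim=0)                  # [0.3, 0.5, 1.1, 1.6]
fluid.layers.reduce_sum(x, dim=-1)                 # [1.9, 1.6]
fluid.layers.reduce_sum(x, dim=1, keep_dim=True)   # [[1.9], [1.6]]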
@@ -1312,17 +1318,17 @@ def reduce_sum(input, dim=None, keep_dim=False):
 
 def reduce_mean(input, dim=None, keep_dim=False):
     """
-    Computes the mean of tensor elements over the given dimension.
+    Computes the mean of tensor elements over the given dimension.
 
     Args:
        input (Variable): The input variable which is a Tensor or LoDTensor.
-        dim (int|None): The dimension along which the mean is computed. If
-            :attr:`None`, compute the mean over all elements of :attr:`input`
-            and return a Tensor variable with a single element, otherwise
-            must be in the range :math:`[-rank(input), rank(input))`. If
+        dim (int|None): The dimension along which the mean is computed. If
+            :attr:`None`, compute the mean over all elements of :attr:`input`
+            and return a Tensor variable with a single element, otherwise
+            must be in the range :math:`[-rank(input), rank(input))`. If
            :math:`dim < 0`, the dimension to reduce is :math:`rank + dim`.
-        keep_dim (bool): Whether to reserve the reduced dimension in the
-            output Tensor. The result tensor will have one fewer dimension
+        keep_dim (bool): Whether to reserve the reduced dimension in the
+            output Tensor. The result tensor will have one fewer dimension
            than the :attr:`input` unless :attr:`keep_dim` is true.
 
     Returns:
@@ -1356,22 +1362,22 @@ def reduce_mean(input, dim=None, keep_dim=False):
 
 def reduce_max(input, dim=None, keep_dim=False):
     """
-    Computes the maximum of tensor elements over the given dimension.
+    Computes the maximum of tensor elements over the given dimension.
 
     Args:
        input (Variable): The input variable which is a Tensor or LoDTensor.
-        dim (int|None): The dimension along which the maximum is computed.
-            If :attr:`None`, compute the maximum over all elements of
-            :attr:`input` and return a Tensor variable with a single element,
-            otherwise must be in the range :math:`[-rank(input), rank(input))`.
+        dim (int|None): The dimension along which the maximum is computed.
+            If :attr:`None`, compute the maximum over all elements of
+            :attr:`input` and return a Tensor variable with a single element,
+            otherwise must be in the range :math:`[-rank(input), rank(input))`.
            If :math:`dim < 0`, the dimension to reduce is :math:`rank + dim`.
-        keep_dim (bool): Whether to reserve the reduced dimension in the
-            output Tensor. The result tensor will have one fewer dimension
+        keep_dim (bool): Whether to reserve the reduced dimension in the
+            output Tensor. The result tensor will have one fewer dimension
            than the :attr:`input` unless :attr:`keep_dim` is true.
 
     Returns:
        Variable: The reduced Tensor variable.
-
+
 
     Examples:
        .. code-block:: python
@@ -1400,22 +1406,22 @@ def reduce_max(input, dim=None, keep_dim=False):
 
 def reduce_min(input, dim=None, keep_dim=False):
     """
-    Computes the minimum of tensor elements over the given dimension.
+    Computes the minimum of tensor elements over the given dimension.
 
     Args:
        input (Variable): The input variable which is a Tensor or LoDTensor.
-        dim (int|None): The dimension along which the minimum is computed.
-            If :attr:`None`, compute the minimum over all elements of
-            :attr:`input` and return a Tensor variable with a single element,
-            otherwise must be in the range :math:`[-rank(input), rank(input))`.
+        dim (int|None): The dimension along which the minimum is computed.
+            If :attr:`None`, compute the minimum over all elements of
+            :attr:`input` and return a Tensor variable with a single element,
+            otherwise must be in the range :math:`[-rank(input), rank(input))`.
            If :math:`dim < 0`, the dimension to reduce is :math:`rank + dim`.
-        keep_dim (bool): Whether to reserve the reduced dimension in the
-            output Tensor. The result tensor will have one fewer dimension
+        keep_dim (bool): Whether to reserve the reduced dimension in the
+            output Tensor. The result tensor will have one fewer dimension
            than the :attr:`input` unless :attr:`keep_dim` is true.
 
     Returns:
        Variable: The reduced Tensor variable.
-
+
 
     Examples:
        .. code-block:: python
