
07/Label semantic roles #5798


Merged
15 commits merged into PaddlePaddle:develop on Nov 22, 2017

Conversation


@jacquesqiao jacquesqiao commented Nov 21, 2017

project: #5813
fix: #5691
Tasks

  • load lod data
  • load pretrained embedding table
  • training


@lcy-seso lcy-seso left a comment


Looking at the network structure, apart from a few hyperparameter settings that I find hard to understand, I also can't see why the CRF computation would overflow. Could we first reduce the number of LSTM layers, fix the initialization, and check the numerical computation step by step?
Could the CRF input and the transition matrix overflow once the exponential is taken?
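As a quick sanity check on that concern, a minimal numpy-only sketch (independent of the Paddle code): in float32, exp() overflows to inf once its argument exceeds roughly 88, so unbounded relu outputs feeding the CRF's exponentials can easily produce inf.

```python
import numpy as np

# log of the largest finite float32 value: exp() of anything above this overflows.
print(np.log(np.finfo(np.float32).max))   # ~88.72

print(np.exp(np.float32(80.0)))           # finite (~5.5e34)
print(np.exp(np.float32(90.0)))           # inf, with a RuntimeWarning about overflow
```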

size=[mark_dict_len, mark_dim],
data_type='float32',
is_sparse=IS_SPARSE,
param_attr=std_0)
Contributor


This embedding parameter needs to be learned, right? Will std_0 here initialize it to all zeros? The book seems to use the same setting though, which is odd.

Contributor

@qingqing01 qingqing01 Nov 21, 2017


The embedding layer in the book is given a pretrained model for initialization, so it does not need to be learned.

Member Author


Yes, I'm adding that now: loading the embedding trained beforehand.

Member Author


done
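For reference, a minimal sketch of loading a pretrained embedding table into the parameter after the startup program has run, assuming the fluid-style scope API (global_scope().find_var(...).get_tensor().set(...)); the file name and the parameter name 'word_emb' are hypothetical, and the exact module path varied across early releases.

```python
import numpy as np
import paddle.fluid as fluid   # early releases used paddle.v2.fluid instead

place = fluid.CPUPlace()

# Hypothetical pretrained table with shape (word_dict_len, word_dim), float32.
pretrained = np.load('pretrained_word_emb.npy').astype('float32')

# 'word_emb' is the (hypothetical) name given to the embedding parameter via param_attr.
# Overwrite the initialized parameter in the global scope before training starts.
emb_tensor = fluid.global_scope().find_var('word_emb').get_tensor()
emb_tensor.set(pretrained, place)
```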

size=hidden_dim,
candidate_activation='relu',
gate_activation='sigmoid',
cell_activation='sigmoid',
Contributor


This combination of activations looks odd, apart from the gates, which are always sigmoid.

[screenshot of the LSTM equations: cell_activation circled in red, candidate_activation circled in blue]

In the figure above, the part circled in red is cell_activation and the part circled in blue is candidate_activation; is my understanding correct?

  • sigmoid has the smallest output range, while relu is unbounded above; after many repeated multiplications the activations can easily grow too large (and the CRF afterwards takes the exponential of the relu outputs). It is common practice to start training with tanh as the default activation; a config sketch follows after this comment.
  • That said, the original configuration in the book uses exactly this setting, so let's aim for numerically stable training first.
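A hedged sketch of that suggestion: keep sigmoid for the gates but switch the cell and candidate activations to tanh, so the values later exponentiated by the CRF stay bounded in (-1, 1). Only the activation arguments differ from the snippet under review; the input name is illustrative.

```python
# Same layer as above, with tanh (the common default) for cell/candidate activations.
lstm = layers.dynamic_lstm(
    input=mix_hidden,              # illustrative name for the layer's input
    size=hidden_dim,
    gate_activation='sigmoid',
    cell_activation='tanh',
    candidate_activation='tanh')
```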

mark_dim = 5
hidden_dim = 512
depth = 8
default_std = 1 / math.sqrt(hidden_dim) / 3.0
Contributor


This initialization scheme is also quite puzzling; it looks like it was picked arbitrarily?

Member Author


Removed these odd initialization schemes for now; the default XavierInitializer is used instead.
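For comparison, a small numpy-only sketch of the scale the Xavier default gives versus the hand-picked 1/sqrt(hidden_dim)/3 constant; the fan_in/fan_out shapes are illustrative.

```python
import math
import numpy as np

hidden_dim = 512
hand_picked_std = 1.0 / math.sqrt(hidden_dim) / 3.0   # the constant questioned above

# Xavier/Glorot uniform: limit = sqrt(6 / (fan_in + fan_out)); square weight for illustration.
fan_in, fan_out = hidden_dim, hidden_dim
limit = math.sqrt(6.0 / (fan_in + fan_out))
w = np.random.uniform(-limit, limit, size=(fan_in, fan_out)).astype('float32')

print(hand_picked_std)   # ~0.0147
print(w.std())           # ~limit / sqrt(3), i.e. ~0.044
```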

layers.fc(input=input_tmp[1],
size=label_dict_len,
bias_attr=std_default,
param_attr=lstm_para_attr)
Contributor


Will this weight be initialized to all zeros, leaving only the bias? That looks strange and risky.

Member Author


Changed to the default initialization.

name='label', shape=[1], data_type='int32', main_program=program)
hidden = layers.fc(input=images,
size=128,
act='relu',
Contributor


No nonlinear activation should be used here.

Member Author


done
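A minimal sketch of the fix, assuming the layer's act argument simply defaults to no activation:

```python
# Purely linear projection; no nonlinearity on this fc layer.
hidden = layers.fc(input=images, size=128, act=None)
```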

candidate_activation='relu',
gate_activation='sigmoid',
cell_activation='sigmoid',
is_reverse=((i % 2) == 1),
Contributor


Would it be better to rename this argument to is_reversed or reversed? @qingqing01
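Whatever the argument ends up being called, here is a hedged sketch of the pattern it enables in this model: stacking depth LSTM layers whose reading direction alternates, so consecutive layers see the sequence in opposite orders. Names such as word_emb, mix_hidden and input_tmp are illustrative, not the PR's exact code.

```python
# Illustrative deep stacked LSTM with alternating direction (depth, hidden_dim as above).
input_tmp = [word_emb, word_emb]                   # hypothetical first-layer inputs
for i in range(1, depth):
    mix_hidden = layers.sums(input=[
        layers.fc(input=input_tmp[0], size=hidden_dim),
        layers.fc(input=input_tmp[1], size=hidden_dim),
    ])
    lstm = layers.dynamic_lstm(
        input=mix_hidden,
        size=hidden_dim,
        candidate_activation='relu',
        gate_activation='sigmoid',
        cell_activation='sigmoid',
        is_reverse=((i % 2) == 1))                 # odd layers read right-to-left
    input_tmp = [mix_hidden, lstm]
```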

@reyoung reyoung mentioned this pull request Nov 22, 2017

@lcy-seso lcy-seso left a comment


LGTM.

@jacquesqiao jacquesqiao merged commit 53bd51e into PaddlePaddle:develop Nov 22, 2017
@reyoung reyoung added this to the Release 0.11.0 milestone Nov 23, 2017
