07/Label semantic roles #5798
Conversation
Looking at the network structure, apart from some hyperparameter settings that I find hard to understand, I can't see why the CRF computation would overflow either. Could we first simplify the number of LSTM layers, fix the initialization, and check the numerical computation step by step?
Could the CRF input and the transition-matrix values overflow after the exponential is taken?
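As context for the overflow question (not part of the PR): the CRF forward/normalization step sums exponentials of emission and transition scores, and a direct exp overflows float32 once a score passes roughly 88. A minimal NumPy sketch of the standard log-sum-exp trick that avoids exactly this:

import numpy as np

def logsumexp(scores):
    # Numerically stable log(sum(exp(scores))): shift by the max before exp,
    # so the largest exponent is exp(0) = 1 and nothing overflows.
    m = np.max(scores)
    return m + np.log(np.sum(np.exp(scores - m)))

scores = np.array([100.0, 101.0, 102.0], dtype=np.float32)
print(np.log(np.sum(np.exp(scores))))  # inf: exp() overflows float32 above ~88
print(logsumexp(scores))               # ~102.41, finite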
size=[mark_dict_len, mark_dim],
data_type='float32',
is_sparse=IS_SPARSE,
param_attr=std_0)
This embedding parameter needs to be learned, right? Does std_0 here initialize the parameter to all zeros? The book seems to use the same setting, though, which is quite strange.
The embedding layer in the book is given a pretrained initialization, so it does not need to be learned.
Yes, this is being added: loading the previously trained embedding.
done
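For readers unfamiliar with what "loading a pretrained embedding" means here, a minimal NumPy sketch (toy sizes and names are hypothetical; this is not the Paddle API): the embedding layer is just a lookup table, so loading pretrained vectors means overwriting that table instead of learning it from a zero initialization.

import numpy as np

word_dict_len, word_dim = 10, 4                               # toy sizes for illustration
table = np.zeros((word_dict_len, word_dim), dtype='float32')  # what a std=0 init would give

pretrained = np.random.rand(word_dict_len, word_dim).astype('float32')  # stands in for the trained file
table[:] = pretrained                                         # overwrite the parameter with pretrained weights

word_ids = np.array([3, 1, 7])
print(table[word_ids].shape)                                  # (3, 4): one vector per word id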
size=hidden_dim,
candidate_activation='relu',
gate_activation='sigmoid',
cell_activation='sigmoid',
mark_dim = 5
hidden_dim = 512
depth = 8
default_std = 1 / math.sqrt(hidden_dim) / 3.0
This initialization scheme is also quite puzzling; it feels like it was picked arbitrarily?
Removed these odd initialization settings for now; the default XavierInitializer is used instead.
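For comparison (my own arithmetic, not from the PR): with hidden_dim = 512 the hand-picked std is noticeably smaller than a Glorot/Xavier scale, assuming fan_in = fan_out = hidden_dim.

import math

hidden_dim = 512
manual_std = 1.0 / math.sqrt(hidden_dim) / 3.0             # ~0.0147, the value hard-coded above
xavier_std = math.sqrt(2.0 / (hidden_dim + hidden_dim))    # ~0.0442, Glorot normal std
xavier_limit = math.sqrt(6.0 / (hidden_dim + hidden_dim))  # ~0.0765, Glorot uniform bound
print(manual_std, xavier_std, xavier_limit)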
layers.fc(input=input_tmp[1],
          size=label_dict_len,
          bias_attr=std_default,
          param_attr=lstm_para_attr)
Will this parameter be initialized to all zeros, leaving only the bias? That looks strange/dangerous.
Changed to the default initialization.
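To illustrate the concern above (a standalone NumPy sketch, not PR code): with an all-zero weight matrix the layer's initial output is the bias alone, identical for every input.

import numpy as np

batch, in_dim, label_dict_len = 4, 8, 5
x = np.random.rand(batch, in_dim).astype('float32')
W = np.zeros((in_dim, label_dict_len), dtype='float32')   # all-zero weight init
b = np.random.rand(label_dict_len).astype('float32')

y = x @ W + b
print(np.allclose(y, b))  # True: every row of the output equals the bias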
name='label', shape=[1], data_type='int32', main_program=program)
hidden = layers.fc(input=images,
                   size=128,
                   act='relu',
No nonlinear activation should be used here.
done
candidate_activation='relu',
gate_activation='sigmoid',
cell_activation='sigmoid',
is_reverse=((i % 2) == 1),
Would it be better to change this argument to is_reversed or reversed?
@qingqing01
LGTM.
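A side note on the is_reverse=((i % 2) == 1) pattern discussed above (a sketch of the stacking idea only; lstm_layer is a hypothetical placeholder, not a real Paddle call): layers whose direction alternates let the upper layers see context from both ends of the sequence.

def stacked_alternating_lstm(input_feature, depth, lstm_layer):
    # Each layer consumes the previous layer's output; odd-numbered layers
    # run right-to-left, so information flows in both directions as depth grows.
    layer_input = input_feature
    for i in range(depth):
        layer_input = lstm_layer(input=layer_input,
                                 is_reverse=((i % 2) == 1))
    return layer_input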
project: #5813
fix: #5691