The Distinctive Design of an E-Commerce Guidance System for Graduation Thesis Publication


Part VII: A Tutorial on the Standardization and Normalization of Undergraduate Thesis (Design) Writing (source: degree-thesis writing references).

Article 4 of the Ministry of Education's 2012 Administrative Measures for Undergraduate Graduation Thesis (Design) Work at Regular Institutions of Higher Education provides that the undergraduate graduation thesis (design) is an essential teaching component for achieving training objectives and for assessing academic level and talent quality.


<class 'RuntimeError'> at /

cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous

Python /usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py in forward, line 372
Web POST http://192.168.1.100:8080/
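
This is PyTorch's standard complaint when view must infer a dimension for a tensor that holds no elements: with zero elements, any value of the inferred -1 dimension would be consistent, so the reshape is rejected. A minimal sketch that reproduces the failure outside the service, assuming only torch:

    import torch

    # An empty prompt reaches the model as an input_ids tensor of shape (1, 0).
    input_ids = torch.zeros((1, 0), dtype=torch.int64)
    input_shape = input_ids.size()                 # torch.Size([1, 0])

    # GPT2Model.forward flattens batch dimensions with view(-1, seq_len);
    # seq_len == 0 means zero elements, the inferred -1 is ambiguous, and
    # PyTorch raises exactly the RuntimeError quoted above.
    input_ids.view(-1, input_shape[-1])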

Traceback (innermost first)

  • /usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py in forward

            heads_to_prune: dict of {layer_num: list of heads to prune in this layer}
            """
            for layer, heads in heads_to_prune.items():
                self.h[layer].attn.prune_heads(heads)

        def forward(self, input_ids, past=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None):
            input_shape = input_ids.size()
            input_ids = input_ids.view(-1, input_shape[-1])   # <-- line 372, error raised here
            if token_type_ids is not None:
                token_type_ids = token_type_ids.view(-1, input_shape[-1])
            if position_ids is not None:
                position_ids = position_ids.view(-1, input_shape[-1])
            if past is None:

    Local variables:
        attention_mask   None
        head_mask        None
        input_ids        tensor([], device='cuda:0', size=(1, 0), dtype=torch.int64)
        input_shape      torch.Size([1, 0])
        past             None
        position_ids     None
        self             GPT2Model(
                           (wte): Embedding(13317, 768)
                           (wpe): Embedding(1024, 768)
                           (drop): Dropout(p=0.1, inplace=False)
                           (h): ModuleList of 24 identical Blocks (0)-(23), each:
                             (ln_1): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                             (attn): Attention( (c_attn): Conv1D() (c_proj): Conv1D()
                                                (attn_dropout): Dropout(p=0.1, inplace=False)
                                                (resid_dropout): Dropout(p=0.1, inplace=False) )
                             (ln_2): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                             (mlp): MLP( (c_fc): Conv1D() (c_proj): Conv1D()
                                         (dropout): Dropout(p=0.1, inplace=False) )
                           (ln_f): LayerNorm((768,), eps=1e-05, elementwise_affine=True)
                         )
        token_type_ids   None
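
The locals pin down the root cause: input_ids reached GPT2Model.forward as an empty CUDA tensor of shape (1, 0), meaning the request's prompt tokenized to zero tokens. A hypothetical illustration of how that happens (the stock "gpt2" vocabulary stands in for the service's 13,317-token vocabulary seen in the repr above; exact behavior can vary by tokenizer version):

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

    # An empty request body encodes to zero tokens, producing exactly the
    # (1, 0) input_ids that appears in the stack frame above.
    ids = tokenizer.encode("", return_tensors="pt")
    print(ids.shape)   # torch.Size([1, 0])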
  • /usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__

                if result is not None:
                    if not isinstance(result, tuple):
                        result = (result,)
                    input = result
            if torch._C._get_tracing_state():
                result = self._slow_forward(*input, **kwargs)
            else:
                result = self.forward(*input, **kwargs)   # <-- current frame
            for hook in self._forward_hooks.values():
                hook_result = hook(self, input, result)
                if hook_result is not None:
                    result = hook_result
            if len(self._backward_hooks) > 0:
                var = result

    Local variables:
        input     (tensor([], device='cuda:0', size=(1, 0), dtype=torch.int64),)
        kwargs    {'attention_mask': None, 'head_mask': None, 'past': None, 'position_ids': None, 'token_type_ids': None}
        self      the same GPT2Model as in the frame above
  • /usr/local/lib/python3.6/dist-packages/transformers/modeling_gpt2.py in forward

        def forward(self, input_ids, past=None, attention_mask=None, token_type_ids=None, position_ids=None, head_mask=None,
                    labels=None):
            transformer_outputs = self.transformer(input_ids,
                                                   past=past,
                                                   attention_mask=attention_mask,
                                                   token_type_ids=token_type_ids,
                                                   position_ids=position_ids,
                                                   head_mask=head_mask)   # <-- current frame
            hidden_states = transformer_outputs[0]
            lm_logits = self.lm_head(hidden_states)
            outputs = (lm_logits,) + transformer_outputs[1:]
            if labels is not None:

    Local variables:
        attention_mask   None
        head_mask        None
        input_ids        tensor([], device='cuda:0', size=(1, 0), dtype=torch.int64)
        labels           None
        past             None
        position_ids     None
        self             GPT2LMHeadModel(
                           (transformer): the same GPT2Model as above
                           (lm_head): Linear(in_features=768, out_features=13317, bias=False)
                         )
        token_type_ids   None
  • /usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in __call__

                if result is not None:
                    if not isinstance(result, tuple):
                        result = (result,)
                    input = result
            if torch._C._get_tracing_state():
                result = self._slow_forward(*input, **kwargs)
            else:
                result = self.forward(*input, **kwargs)   # <-- current frame
            for hook in self._forward_hooks.values():
                hook_result = hook(self, input, result)
                if hook_result is not None:
                    result = hook_result
            if len(self._backward_hooks) > 0:
                var = result

    Local variables:
        input     ()
        kwargs    {'input_ids': tensor([], device='cuda:0', size=(1, 0), dtype=torch.int64)}
        self      the same GPT2LMHeadModel as above
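
The robust fix is to reject an empty encoding before it reaches the model instead of letting the reshape fail four frames deep. A hedged sketch of such a guard; the safe_forward name and surrounding glue are hypothetical, and only the final model(input_ids) call mirrors the stack above:

    import torch

    def safe_forward(model, tokenizer, prompt, device="cuda:0"):
        """Encode `prompt` and run the model, refusing empty inputs up front."""
        input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
        if input_ids.numel() == 0:
            # Covers empty prompts and prompts that tokenize to nothing,
            # i.e. the (1, 0) tensor that produced the RuntimeError above.
            raise ValueError("prompt produced no tokens; refusing to run the model")
        with torch.no_grad():
            return model(input_ids)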
