# Custom Layers and Utilities [[custom-layers-and-utilities]]

This page lists the custom layers used in the library and the utility functions it provides for modeling.

Most of these are only useful if you are studying the code of the models in the library.

## PyTorch custom modules [[transformers.Conv1D]][[transformers.Conv1D]]

#### transformers.Conv1D[[transformers.Conv1D]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/pytorch_utils.py#L98)

1D-convolutional layer as defined by Radford et al. for OpenAI GPT (and also used in GPT-2).

Basically works like a linear layer but the weights are transposed.

**Parameters:**

nf (`int`) : The number of output features.

nx (`int`) : The number of input features.
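To make the "linear layer with transposed weights" remark concrete, here is a minimal NumPy sketch rather than the library's implementation. It assumes the Conv1D-style weight is stored with shape `(nx, nf)` and applied as `x @ W + b`, whereas a standard linear layer stores `(nf, nx)` and applies `x @ W.T + b`. The function names `conv1d_forward` and `linear_forward` are hypothetical.

```python
import numpy as np

def conv1d_forward(x, weight, bias):
    """Conv1D-style projection: weight has shape (nx, nf)."""
    return x @ weight + bias

def linear_forward(x, weight, bias):
    """Linear-style projection: weight has shape (nf, nx)."""
    return x @ weight.T + bias

rng = np.random.default_rng(0)
nx, nf = 4, 3                      # input and output feature counts
x = rng.normal(size=(2, 5, nx))    # (batch, sequence, nx)
w = rng.normal(size=(nx, nf))      # Conv1D-style layout
b = rng.normal(size=nf)

out_conv1d = conv1d_forward(x, w, b)
out_linear = linear_forward(x, w.T, b)  # same weights, stored transposed
```

Both calls produce the same `(2, 5, 3)` output; only the storage layout of the weight differs.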

## PyTorch helper functions [[transformers.apply_chunking_to_forward]][[transformers.apply_chunking_to_forward]]

#### transformers.apply_chunking_to_forward[[transformers.apply_chunking_to_forward]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/pytorch_utils.py#L182)

This function chunks the `input_tensors` into smaller input tensor parts of size `chunk_size` over the dimension
`chunk_dim`. It then applies a layer `forward_fn` to each chunk independently to save memory.

If the `forward_fn` is independent across the `chunk_dim` this function will yield the same result as directly
applying `forward_fn` to `input_tensors`.

Examples:

```python
# rename the usual forward() fn to forward_chunk()
def forward_chunk(self, hidden_states):
    hidden_states = self.decoder(hidden_states)
    return hidden_states

# implement a chunked forward function
def forward(self, hidden_states):
    return apply_chunking_to_forward(self.forward_chunk, self.chunk_size_lm_head, self.seq_len_dim, hidden_states)
```

**Parameters:**

forward_fn (`Callable[..., torch.Tensor]`) : The forward function of the model.

chunk_size (`int`) : The size of each chunk along `chunk_dim`, so that `num_chunks = input_tensors[0].shape[chunk_dim] // chunk_size`.

chunk_dim (`int`) : The dimension over which the `input_tensors` should be chunked.

input_tensors (`tuple[torch.Tensor]`) : The input tensors of `forward_fn`, which will be chunked.

**Returns:**

``torch.Tensor``

A tensor with the same shape as the one `forward_fn` would have returned if applied directly to `input_tensors`.
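The chunking idea can be sketched in NumPy as follows. This is an illustrative re-implementation under stated assumptions, not the library code: `apply_chunking_sketch` and `feed_forward` are hypothetical names, a `chunk_size` of 0 is assumed to mean "no chunking", and the chunked dimension is assumed to be an exact multiple of `chunk_size`.

```python
import numpy as np

def apply_chunking_sketch(forward_fn, chunk_size, chunk_dim, *input_tensors):
    """Apply forward_fn chunk by chunk along chunk_dim, then concatenate."""
    if chunk_size == 0:
        # Assumption: a chunk size of 0 disables chunking.
        return forward_fn(*input_tensors)
    length = input_tensors[0].shape[chunk_dim]
    if length % chunk_size != 0:
        raise ValueError("The chunked dimension must be a multiple of chunk_size.")
    num_chunks = length // chunk_size
    # Split every input into the same number of equally sized chunks.
    chunked_inputs = [np.split(t, num_chunks, axis=chunk_dim) for t in input_tensors]
    # Call forward_fn once per chunk, pairing up the chunks of each input.
    outputs = [forward_fn(*chunks) for chunks in zip(*chunked_inputs)]
    return np.concatenate(outputs, axis=chunk_dim)

def feed_forward(hidden_states):
    # Element-wise, so the result is independent across the sequence dimension.
    return np.tanh(hidden_states) * 2.0

hidden_states = np.arange(24, dtype=np.float64).reshape(2, 6, 2)  # (batch, seq, hidden)
chunked = apply_chunking_sketch(feed_forward, 2, 1, hidden_states)
direct = feed_forward(hidden_states)
```

Because `feed_forward` does not mix values across the sequence dimension, `chunked` and `direct` are identical, while each call only sees a slice of the input.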

#### transformers.pytorch_utils.find_pruneable_heads_and_indices[[transformers.pytorch_utils.find_pruneable_heads_and_indices]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/pytorch_utils.py#L260)

Finds the heads and their indices taking `already_pruned_heads` into account.

**Parameters:**

heads (`list[int]`) : List of the indices of heads to prune.

n_heads (`int`) : The number of heads in the model.

head_size (`int`) : The size of each head.

already_pruned_heads (`Set[int]`) : A set of already pruned heads.

**Returns:**

``tuple[Set[int], torch.LongTensor]``

A tuple with the indices of heads to prune taking `already_pruned_heads`
into account and the indices of rows/columns to keep in the layer weight.
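A plain-Python sketch of this bookkeeping is shown below. It is an approximation for illustration, not the library function: it returns a list instead of a `torch.LongTensor`, the name `find_pruneable_heads_sketch` is hypothetical, and it assumes `n_heads` counts the heads that currently remain in the layer.

```python
def find_pruneable_heads_sketch(heads, n_heads, head_size, already_pruned_heads):
    """Return (heads still to prune, flat indices of the entries to keep)."""
    # Heads that were pruned earlier no longer need pruning.
    heads = set(heads) - set(already_pruned_heads)
    keep_head = [True] * n_heads
    for head in heads:
        # Earlier pruning shifted the position of every later head to the left.
        shifted = head - sum(1 for h in already_pruned_heads if h < head)
        keep_head[shifted] = False
    # Each head owns head_size consecutive rows/columns of the weight.
    index = [
        h * head_size + offset
        for h in range(n_heads)
        if keep_head[h]
        for offset in range(head_size)
    ]
    return heads, index

# Four heads of size 2, nothing pruned yet: drop heads 0 and 2.
heads_to_prune, index = find_pruneable_heads_sketch([0, 2], 4, 2, set())
```

Here `index` contains the flat positions belonging to heads 1 and 3, which is what a subsequent pruning call would keep.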

#### transformers.prune_layer[[transformers.prune_layer]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/pytorch_utils.py#L160)

Prune a Conv1D or linear layer to keep only entries in index.

Used to remove heads.

**Parameters:**

layer (`Union[torch.nn.Linear, Conv1D]`) : The layer to prune.

index (`torch.LongTensor`) : The indices to keep in the layer.

dim (`int`, *optional*) : The dimension on which to keep the indices. If not set, defaults to 0 for `torch.nn.Linear` and 1 for `Conv1D`.

**Returns:**

`torch.nn.Linear` or [Conv1D](/docs/transformers/v4.57.1/ko/internal/modeling_utils#transformers.Conv1D)

The pruned layer as a new layer with `requires_grad=True`.

#### transformers.pytorch_utils.prune_conv1d_layer[[transformers.pytorch_utils.prune_conv1d_layer]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/pytorch_utils.py#L127)

Prune a Conv1D layer to keep only entries in index. A Conv1D works as a Linear layer (see e.g. BERT) but the weights
are transposed.

Used to remove heads.

**Parameters:**

layer ([Conv1D](/docs/transformers/v4.57.1/ko/internal/modeling_utils#transformers.Conv1D)) : The layer to prune.

index (`torch.LongTensor`) : The indices to keep in the layer.

dim (`int`, *optional*, defaults to 1) : The dimension on which to keep the indices.

**Returns:**

[Conv1D](/docs/transformers/v4.57.1/ko/internal/modeling_utils#transformers.Conv1D)

The pruned layer as a new layer with `requires_grad=True`.

#### transformers.pytorch_utils.prune_linear_layer[[transformers.pytorch_utils.prune_linear_layer]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/pytorch_utils.py#L64)

Prune a linear layer to keep only entries in index.

Used to remove heads.

**Parameters:**

layer (`torch.nn.Linear`) : The layer to prune.

index (`torch.LongTensor`) : The indices to keep in the layer.

dim (`int`, *optional*, defaults to 0) : The dimension on which to keep the indices.

**Returns:**

``torch.nn.Linear``

The pruned layer as a new layer with `requires_grad=True`.
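The effect of `index` and `dim` on a linear layer's parameters can be sketched in NumPy. This is a simplified illustration under stated assumptions, not the library function: it works on raw arrays instead of `torch.nn.Linear`, assumes the weight has shape `(out_features, in_features)`, and `prune_linear_weights` is a hypothetical name.

```python
import numpy as np

def prune_linear_weights(weight, bias, index, dim=0):
    """Keep only `index` along `dim` of a Linear-style weight of shape (out, in)."""
    new_weight = np.take(weight, index, axis=dim)
    # The bias has one entry per output feature, so pruning input features
    # (dim=1) leaves it untouched.
    new_bias = bias.copy() if dim == 1 else bias[index]
    return new_weight, new_bias

# Two heads of size 2: keep only the rows/columns that belong to head 1.
keep = np.array([2, 3])
weight = np.arange(16, dtype=np.float64).reshape(4, 4)
bias = np.arange(4, dtype=np.float64)

# Query-style projection: prune output features (rows).
q_weight, q_bias = prune_linear_weights(weight, bias, keep, dim=0)
# Output-style projection: prune input features (columns), bias unchanged.
o_weight, o_bias = prune_linear_weights(weight, bias, keep, dim=1)
```

Pruning along `dim=0` shrinks both the weight and the bias, while pruning along `dim=1` only shrinks the weight.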

## TensorFlow custom layers [[transformers.modeling_tf_utils.TFConv1D]][[transformers.modeling_tf_utils.TFConv1D]]

#### transformers.modeling_tf_utils.TFConv1D[[transformers.modeling_tf_utils.TFConv1D]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L3247)

1D-convolutional layer as defined by Radford et al. for OpenAI GPT (and also used in GPT-2).

Basically works like a linear layer but the weights are transposed.

**Parameters:**

nf (`int`) : The number of output features.

nx (`int`) : The number of input features.

initializer_range (`float`, *optional*, defaults to 0.02) : The standard deviation to use to initialize the weights.

kwargs (`dict[str, Any]`, *optional*) : Additional keyword arguments passed along to the `__init__` of `keras.layers.Layer`.

#### transformers.TFSequenceSummary[[transformers.TFSequenceSummary]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L3394)

Compute a single vector summary of a sequence's hidden states.

**Parameters:**

config ([PretrainedConfig](/docs/transformers/v4.57.1/ko/main_classes/configuration#transformers.PretrainedConfig)) : The config used by the model. Relevant arguments in the config class of the model are (refer to the actual config class of your model for the default values it uses): 

summary_type (`str`) : The method to use to make this summary. Accepted values are:

- `"last"` -- Take the last token hidden state (like XLNet)
- `"first"` -- Take the first token hidden state (like BERT)
- `"mean"` -- Take the mean of all tokens hidden states
- `"cls_index"` -- Supply a Tensor of classification token position (GPT/GPT-2)
- `"attn"` -- Not implemented now, use multi-head attention

summary_use_proj (`bool`) : Add a projection after the vector extraction.

summary_proj_to_labels (`bool`) : If `True`, the projection outputs to `config.num_labels` classes (otherwise to `config.hidden_size`).

summary_activation (`Optional[str]`) : Set to `"tanh"` to add a tanh activation to the output, another string or `None` will add no activation.

summary_first_dropout (`float`) : Optional dropout probability before the projection and activation.

summary_last_dropout (`float`) : Optional dropout probability after the projection and activation.

initializer_range (`float`, *optional*, defaults to 0.02) : The standard deviation to use to initialize the weights.

kwargs (`dict[str, Any]`, *optional*) : Additional keyword arguments passed along to the `__init__` of `keras.layers.Layer`.
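The extraction step selected by `summary_type` can be sketched in NumPy. This is a simplified illustration, not the layer itself: it omits the optional projection, activation, and dropout, assumes hidden states of shape `(batch, seq_len, hidden)`, and the function name `summarize` is hypothetical.

```python
import numpy as np

def summarize(hidden_states, summary_type="last", cls_index=None):
    """Reduce (batch, seq_len, hidden) hidden states to (batch, hidden)."""
    if summary_type == "last":
        return hidden_states[:, -1]
    if summary_type == "first":
        return hidden_states[:, 0]
    if summary_type == "mean":
        return hidden_states.mean(axis=1)
    if summary_type == "cls_index":
        # cls_index holds one token position per sequence, shape (batch,).
        batch = np.arange(hidden_states.shape[0])
        return hidden_states[batch, cls_index]
    raise NotImplementedError(summary_type)

hidden_states = np.arange(24, dtype=np.float64).reshape(2, 4, 3)
last = summarize(hidden_states, "last")
first = summarize(hidden_states, "first")
mean = summarize(hidden_states, "mean")
at_cls = summarize(hidden_states, "cls_index", cls_index=np.array([1, 2]))
```

Each variant returns one vector per sequence; they differ only in which token positions contribute to it.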

## TensorFlow loss functions [[transformers.modeling_tf_utils.TFCausalLanguageModelingLoss]][[transformers.modeling_tf_utils.TFCausalLanguageModelingLoss]]

#### transformers.modeling_tf_utils.TFCausalLanguageModelingLoss[[transformers.modeling_tf_utils.TFCausalLanguageModelingLoss]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L213)

Loss function suitable for causal language modeling (CLM), that is, the task of guessing the next token.

Any label of -100 will be ignored (along with the corresponding logits) in the loss computation.
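How a label of -100 drops out of the loss can be illustrated with a small NumPy sketch. It is not the TensorFlow implementation: it assumes flattened logits of shape `(num_tokens, vocab_size)` and a mean over the non-ignored positions, and `masked_cross_entropy` is a hypothetical name.

```python
import numpy as np

IGNORE_INDEX = -100

def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over positions whose label is not -100."""
    mask = labels != IGNORE_INDEX
    # Numerically stable log-softmax over the vocabulary dimension.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Replace ignored labels with a valid index so the lookup does not fail;
    # their contribution is zeroed out by the mask below.
    safe_labels = np.where(mask, labels, 0)
    token_loss = -log_probs[np.arange(len(labels)), safe_labels]
    return (token_loss * mask).sum() / mask.sum()

logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.5, 0.3],
                   [0.0, 0.0, 3.0]])
labels = np.array([0, IGNORE_INDEX, 2])  # the second position is ignored
loss = masked_cross_entropy(logits, labels)
```

Changing the logits at the ignored position leaves `loss` unchanged, which is the behavior the sentence above describes.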

#### transformers.modeling_tf_utils.TFMaskedLanguageModelingLoss[[transformers.modeling_tf_utils.TFMaskedLanguageModelingLoss]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L324)

Loss function suitable for masked language modeling (MLM), that is, the task of guessing the masked tokens.

Any label of -100 will be ignored (along with the corresponding logits) in the loss computation.

#### transformers.modeling_tf_utils.TFMultipleChoiceLoss[[transformers.modeling_tf_utils.TFMultipleChoiceLoss]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L316)

Loss function suitable for multiple choice tasks.

#### transformers.modeling_tf_utils.TFQuestionAnsweringLoss[[transformers.modeling_tf_utils.TFQuestionAnsweringLoss]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L242)

Loss function suitable for question answering.

#### transformers.modeling_tf_utils.TFSequenceClassificationLoss[[transformers.modeling_tf_utils.TFSequenceClassificationLoss]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L297)

Loss function suitable for sequence classification.

#### transformers.modeling_tf_utils.TFTokenClassificationLoss[[transformers.modeling_tf_utils.TFTokenClassificationLoss]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L255)

Loss function suitable for token classification.

Any label of -100 will be ignored (along with the corresponding logits) in the loss computation.

## TensorFlow helper functions [[transformers.modeling_tf_utils.get_initializer]][[transformers.modeling_tf_utils.get_initializer]]

#### transformers.modeling_tf_utils.get_initializer[[transformers.modeling_tf_utils.get_initializer]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L3519)

Creates a `keras.initializers.TruncatedNormal` with the given range.

**Parameters:**

initializer_range (`float`, *optional*, defaults to 0.02) : Standard deviation of the initializer range.

**Returns:**

``keras.initializers.TruncatedNormal``

The truncated normal initializer.
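For intuition about what a truncated normal initializer produces, here is a NumPy sketch. It does not call Keras; it assumes the usual convention of redrawing any sample that falls more than two standard deviations from the mean, and `truncated_normal` is a hypothetical helper.

```python
import numpy as np

def truncated_normal(shape, stddev=0.02, seed=0):
    """Sample N(0, stddev) values, redrawing any that exceed two standard deviations."""
    rng = np.random.default_rng(seed)
    values = rng.normal(0.0, stddev, size=shape)
    out_of_range = np.abs(values) > 2 * stddev
    while out_of_range.any():
        # Redraw only the offending entries and check them again.
        values[out_of_range] = rng.normal(0.0, stddev, size=out_of_range.sum())
        out_of_range = np.abs(values) > 2 * stddev
    return values

weights = truncated_normal((64, 64), stddev=0.02)
```

With `stddev=0.02`, every initial weight therefore lies within `[-0.04, 0.04]`.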

#### transformers.modeling_tf_utils.keras_serializable[[transformers.modeling_tf_utils.keras_serializable]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/modeling_tf_utils.py#L148)

Decorate a Keras Layer class to support Keras serialization.

This is done by:

1. Adding a `transformers_config` dict to the Keras config dictionary in `get_config` (called by Keras at
   serialization time).
2. Wrapping `__init__` to accept that `transformers_config` dict (passed by Keras at deserialization time) and
   convert it to a config object for the actual layer initializer.
3. Registering the class as a custom object in Keras (if the TensorFlow version supports this), so that it does not
   need to be supplied in `custom_objects` in the call to `keras.models.load_model`.

**Parameters:**

cls (a `keras.layers.Layer` subclass) : Typically a `TF.MainLayer` class in this project; in general, it must accept a `config` argument to its initializer.

**Returns:**

The same class object, with modifications for Keras deserialization.
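Steps 1 and 2 above can be mimicked with a plain-Python class decorator. The sketch below does not use Keras and is not the library's decorator: `ToyConfig`, `ToyMainLayer`, and `serializable_sketch` are hypothetical, and step 3 (registering the class with Keras) is left out.

```python
import functools

class ToyConfig:
    """Hypothetical stand-in for a model configuration class."""
    def __init__(self, hidden_size=8):
        self.hidden_size = hidden_size

    def to_dict(self):
        return {"hidden_size": self.hidden_size}

    @classmethod
    def from_dict(cls, data):
        return cls(**data)

def serializable_sketch(cls):
    """Class decorator mirroring steps 1 and 2, without Keras."""
    config_class = cls.config_class
    original_init = cls.__init__

    @functools.wraps(original_init)
    def wrapped_init(self, config=None, *, transformers_config=None, **kwargs):
        # Step 2: rebuild the config object from its serialized dict form.
        if transformers_config is not None:
            config = config_class.from_dict(transformers_config)
        original_init(self, config, **kwargs)
        self._config = config

    def get_config(self):
        # Step 1: expose the config as a plain dict for serialization.
        return {"transformers_config": self._config.to_dict()}

    cls.__init__ = wrapped_init
    cls.get_config = get_config
    return cls

@serializable_sketch
class ToyMainLayer:
    config_class = ToyConfig

    def __init__(self, config):
        self.hidden_size = config.hidden_size

layer = ToyMainLayer(ToyConfig(hidden_size=16))
restored = ToyMainLayer(**layer.get_config())  # rebuild from the serialized dict
```

The round trip through `get_config` yields a layer with the same configuration, which is the property that serialization relies on.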

#### transformers.shape_list[[transformers.shape_list]]

[Source](https://github.com/huggingface/transformers/blob/v4.57.1/src/transformers/tf_utils.py#L28)

Deal with dynamic shapes in TensorFlow cleanly.

**Parameters:**

tensor (`tf.Tensor` or `np.ndarray`) : The tensor we want the shape of.

**Returns:**

``list[int]``

The shape of the tensor as a list.
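One way to picture "dealing with dynamic shapes" is to prefer the sizes known statically and fall back to runtime values elsewhere. The plain-Python sketch below uses lists to stand in for a tensor's static and dynamic shapes; it does not call TensorFlow, and `merge_shapes` is a hypothetical name.

```python
def merge_shapes(static_shape, dynamic_shape):
    """Prefer statically known sizes; use runtime sizes where the static one is None."""
    return [dyn if stat is None else stat for stat, dyn in zip(static_shape, dynamic_shape)]

static_shape = [None, 128, 768]   # batch size unknown at graph-construction time
dynamic_shape = [32, 128, 768]    # sizes observed at run time
merged = merge_shapes(static_shape, dynamic_shape)
```

In this sketch only the unknown batch dimension is taken from the runtime shape; the other entries keep their static values.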

