LTP Documentation¶
Submodules¶
ltp.interface module¶
- ltp.interface.LTP(pretrained_model_name_or_path='LTP/small', force_download=False, proxies=None, use_auth_token=None, cache_dir=None, local_files_only=False, **model_kwargs)[source]¶
- Instantiate a pretrained LTP model from a pretrained model
configuration hosted on huggingface-hub. The model is set to evaluation mode by default using model.eval() (dropout modules are deactivated). To train the model, first set it back to training mode with model.train().
- Parameters:
- pretrained_model_name_or_path (str or os.PathLike):
- Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids are [LTP/tiny, LTP/small, LTP/base, LTP/base1, LTP/base2, LTP/legacy]; the legacy model only supports cws, pos, and ner, but is faster.
You can pin a revision by appending @ to the model_id, e.g., dbmdz/bert-base-german-cased@main. The revision is the specific model version to use: a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
A path to a directory containing model weights saved using [~transformers.PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
None, if you are providing both the configuration and the state dictionary (with the keyword arguments config and state_dict, respectively).
- force_download (bool, optional, defaults to False):
Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional):
A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- use_auth_token (str or bool, optional):
The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- cache_dir (Union[str, os.PathLike], optional):
Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- local_files_only (bool, optional, defaults to False):
Whether to only look at local files (i.e., do not try to download the model).
- model_kwargs (Dict, optional):
Keyword arguments that will be passed to the model during initialization.
<Tip>
Passing use_auth_token=True is required when you want to use a private model.
</Tip>
ltp.legacy module¶
ltp.nerual module¶
- class ltp.nerual.LTP(config=None, tokenizer=None)[source]¶
Bases: BaseModule, ModelHubMixin
- model¶
- cws_vocab¶
- pos_vocab¶
- ner_vocab¶
- srl_vocab¶
- dep_vocab¶
- sdp_vocab¶
- load_state_dict(state_dict, strict=True, assign=False)[source]¶
Copy parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module's state_dict() function.
Warning
If assign is True, the optimizer must be created after the call to load_state_dict unless get_swap_module_params_on_conversion() is True.
- Parameters
state_dict (dict) – a dict containing parameters and persistent buffers.
strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function. Default: True
assign (bool, optional) – When set to False, the properties of the tensors in the current module are preserved, whereas setting it to True preserves the properties of the Tensors in the state dict. The only exception is the requires_grad field, for which the value from the module is preserved. Default: False
- Returns
- missing_keys is a list of str containing any keys that are expected
by this module but missing from the provided
state_dict.
- unexpected_keys is a list of str containing the keys that are not
expected by this module but present in the provided
state_dict.
- Return type
NamedTuple with missing_keys and unexpected_keys fields
Note
If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.
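A short sketch of the return value described above, using a plain torch.nn.Module; the LTP class inherits this behavior from PyTorch, so the same NamedTuple is returned there.

```python
import torch.nn as nn

model = nn.Linear(4, 2)
state = model.state_dict()       # keys: 'weight', 'bias'
state.pop("bias")                # simulate an incomplete checkpoint

# With strict=False, missing keys are reported instead of raising.
result = model.load_state_dict(state, strict=False)
print(result.missing_keys)       # ['bias']
print(result.unexpected_keys)    # []
```

With strict=True (the default), the same call would raise a RuntimeError because the keys no longer match exactly.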
- property version¶
Module contents¶
- ltp.LTP(pretrained_model_name_or_path='LTP/small', force_download=False, proxies=None, use_auth_token=None, cache_dir=None, local_files_only=False, **model_kwargs)[source]¶
- Instantiate a pretrained LTP model from a pretrained model
configuration hosted on huggingface-hub. The model is set to evaluation mode by default using model.eval() (dropout modules are deactivated). To train the model, first set it back to training mode with model.train().
- Parameters:
- pretrained_model_name_or_path (str or os.PathLike):
- Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids are [LTP/tiny, LTP/small, LTP/base, LTP/base1, LTP/base2, LTP/legacy]; the legacy model only supports cws, pos, and ner, but is faster.
You can pin a revision by appending @ to the model_id, e.g., dbmdz/bert-base-german-cased@main. The revision is the specific model version to use: a branch name, a tag name, or a commit id. Since models and other artifacts on huggingface.co are stored in a git-based system, revision can be any identifier allowed by git.
A path to a directory containing model weights saved using [~transformers.PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
None, if you are providing both the configuration and the state dictionary (with the keyword arguments config and state_dict, respectively).
- force_download (bool, optional, defaults to False):
Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional):
A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- use_auth_token (str or bool, optional):
The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- cache_dir (Union[str, os.PathLike], optional):
Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- local_files_only (bool, optional, defaults to False):
Whether to only look at local files (i.e., do not try to download the model).
- model_kwargs (Dict, optional):
Keyword arguments that will be passed to the model during initialization.
<Tip>
Passing use_auth_token=True is required when you want to use a private model.
</Tip>
- class ltp.StnSplit(self)¶
Bases: object
- batch_split(batch_text, threads=8)¶
Split a batch of texts into sentences.
- bracket_as_entity¶
Get the value of the bracket_as_entity option.
- en_quote_as_entity¶
Get the value of the en_quote_as_entity option.
- split(text)¶
Split the text into sentences.
- use_en¶
Get the value of the use_en option.
- use_zh¶
Get the value of the use_zh option.
- zh_quote_as_entity¶
Get the value of the zh_quote_as_entity option.