LTP Documentation¶
Submodules¶
ltp.interface module¶
- ltp.interface.LTP(pretrained_model_name_or_path='LTP/small', force_download=False, proxies=None, use_auth_token=None, cache_dir=None, local_files_only=False, **model_kwargs)[source]¶
- Instantiate a pretrained LTP model from a pre-trained model
configuration from huggingface-hub. The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters:
- pretrained_model_name_or_path (str or os.PathLike):
- Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids are [LTP/tiny, LTP/small, LTP/base, LTP/base1, LTP/base2, LTP/legacy]; the legacy model only supports cws, pos, and ner, but is faster.
You can add a revision by appending @ to the end of model_id, e.g. dbmdz/bert-base-german-cased@main. The revision is the specific model version to use: it can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
A path to a directory containing model weights saved using [~transformers.PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
None if you are both providing the configuration and state dictionary (resp. with keyword arguments config and state_dict).
- force_download (bool, optional, defaults to False):
Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional):
A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- use_auth_token (str or bool, optional):
The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- cache_dir (Union[str, os.PathLike], optional):
Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- local_files_only (bool, optional, defaults to False):
Whether to only look at local files (i.e., do not try to download the model).
- model_kwargs (Dict, optional):
Keyword arguments passed to the model during initialization.
Tip
Passing use_auth_token=True is required when you want to use a private model.
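The model_id@revision convention described above can be illustrated with a small helper. Note that split_model_id is a hypothetical function written for this sketch; it is not part of the LTP or huggingface-hub APIs:

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a 'repo_id@revision' string into (repo_id, revision).

    Hypothetical helper, not part of the LTP API. The revision defaults
    to 'main' when no '@' suffix is present, mirroring the convention
    described above (branch name, tag name, or commit id).
    """
    repo_id, sep, revision = model_id.partition("@")
    return repo_id, revision if sep else "main"

print(split_model_id("dbmdz/bert-base-german-cased@main"))  # → ('dbmdz/bert-base-german-cased', 'main')
print(split_model_id("LTP/small"))                          # → ('LTP/small', 'main')
```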
ltp.legacy module¶
ltp.nerual module¶
- class ltp.nerual.LTP(config=None, tokenizer=None)[source]¶
Bases: BaseModule, ModelHubMixin
- model¶
- cws_vocab¶
- pos_vocab¶
- ner_vocab¶
- srl_vocab¶
- dep_vocab¶
- sdp_vocab¶
- load_state_dict(state_dict, strict=True)[source]¶
Copy parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module's state_dict() function.
Warning
If assign is True, the optimizer must be created after the call to load_state_dict unless get_swap_module_params_on_conversion() is True.
- Parameters
state_dict (dict) – a dict containing parameters and persistent buffers.
strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module's state_dict() function. Default: True
assign (bool, optional) – When False, the properties of the tensors in the current module are preserved, while when True, the properties of the Tensors in the state dict are preserved. The only exception is the requires_grad field of Parameters, for which the value from the module is preserved. Default: False
- Returns
- missing_keys is a list of str containing any keys that are expected by this module but missing from the provided state_dict.
- unexpected_keys is a list of str containing the keys that are not expected by this module but present in the provided state_dict.
- Return type
NamedTuple with missing_keys and unexpected_keys fields
Note
If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.
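The missing_keys/unexpected_keys behaviour described above is inherited from torch.nn.Module, so it can be demonstrated with a plain torch.nn.Linear rather than a full LTP model (a minimal sketch, assuming torch is installed):

```python
import torch

# A small module whose state dict has two keys: 'weight' and 'bias'.
module = torch.nn.Linear(2, 2)

state_dict = module.state_dict()
del state_dict["bias"]                # now missing from the dict
state_dict["extra"] = torch.zeros(1)  # not expected by the module

# With strict=False, mismatches are reported instead of raising.
result = module.load_state_dict(state_dict, strict=False)
print(result.missing_keys)     # ['bias']
print(result.unexpected_keys)  # ['extra']
```

With the default strict=True, the same call would raise a RuntimeError instead.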
- property version¶
Module contents¶
- ltp.LTP(pretrained_model_name_or_path='LTP/small', force_download=False, proxies=None, use_auth_token=None, cache_dir=None, local_files_only=False, **model_kwargs)[source]¶
- Instantiate a pretrained LTP model from a pre-trained model
configuration from huggingface-hub. The model is set in evaluation mode by default using model.eval() (Dropout modules are deactivated). To train the model, you should first set it back in training mode with model.train().
- Parameters:
- pretrained_model_name_or_path (str or os.PathLike):
- Can be either:
A string, the model id of a pretrained model hosted inside a model repo on huggingface.co. Valid model ids are [LTP/tiny, LTP/small, LTP/base, LTP/base1, LTP/base2, LTP/legacy]; the legacy model only supports cws, pos, and ner, but is faster.
You can add a revision by appending @ to the end of model_id, e.g. dbmdz/bert-base-german-cased@main. The revision is the specific model version to use: it can be a branch name, a tag name, or a commit id. Since we use a git-based system for storing models and other artifacts on huggingface.co, revision can be any identifier allowed by git.
A path to a directory containing model weights saved using [~transformers.PreTrainedModel.save_pretrained], e.g., ./my_model_directory/.
None if you are both providing the configuration and state dictionary (resp. with keyword arguments config and state_dict).
- force_download (bool, optional, defaults to False):
Whether to force the (re-)download of the model weights and configuration files, overriding the cached versions if they exist.
- proxies (Dict[str, str], optional):
A dictionary of proxy servers to use by protocol or endpoint, e.g., {'http': 'foo.bar:3128', 'http://hostname': 'foo.bar:4012'}. The proxies are used on each request.
- use_auth_token (str or bool, optional):
The token to use as HTTP bearer authorization for remote files. If True, will use the token generated when running transformers-cli login (stored in ~/.huggingface).
- cache_dir (Union[str, os.PathLike], optional):
Path to a directory in which a downloaded pretrained model configuration should be cached if the standard cache should not be used.
- local_files_only (bool, optional, defaults to False):
Whether to only look at local files (i.e., do not try to download the model).
- model_kwargs (Dict, optional):
Keyword arguments passed to the model during initialization.
Tip
Passing use_auth_token=True is required when you want to use a private model.
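The evaluation/training-mode behaviour mentioned above (Dropout deactivated by model.eval(), reactivated by model.train()) is standard torch.nn.Module semantics; a minimal sketch with a bare Dropout layer (assuming torch is installed) shows the difference:

```python
import torch

dropout = torch.nn.Dropout(p=0.5)
x = torch.ones(8)

# In evaluation mode, Dropout is the identity function.
dropout.eval()
print(torch.equal(dropout(x), x))  # True

# In training mode, elements are randomly zeroed and the survivors
# are rescaled by 1 / (1 - p); the training flag is set on the module.
dropout.train()
print(dropout.training)  # True
```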
- class ltp.StnSplit()¶
Bases:
object
- batch_split(batch_text, threads=8)¶
Split a batch of texts into sentences.
- bracket_as_entity¶
Get the value of the bracket_as_entity option.
- en_quote_as_entity¶
Get the value of the en_quote_as_entity option.
- split(text)¶
Split the text into sentences.
- use_en¶
Get the value of the use_en option.
- use_zh¶
Get the value of the use_zh option.
- zh_quote_as_entity¶
Get the value of the zh_quote_as_entity option.
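StnSplit itself is implemented natively; a rough pure-Python sketch of the kind of rule-based splitting it performs on sentence-final punctuation might look like the following. This is a deliberate simplification, not the actual implementation, and naive_split is a hypothetical name:

```python
import re

def naive_split(text: str) -> list[str]:
    """Naively split text into sentences at Chinese/English terminators.

    A simplified illustration only: the real StnSplit additionally
    handles quotes and brackets (the *_as_entity options) and the
    use_en/use_zh switches documented above.
    """
    # Split *after* each terminator so it stays attached to its sentence.
    parts = re.split(r"(?<=[。！？!?])", text)
    return [p for p in parts if p.strip()]

print(naive_split("汤姆去上学。他很开心！"))  # → ['汤姆去上学。', '他很开心！']
```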