The Document Model
Sub-module for handling document-level objects: documents, sentences, and tokens.
- class corenlp_xml.document.Document(xml_string)
This class abstracts a Stanford CoreNLP document, constructed from its XML output string.
- coreferences
Returns the document's coreferences as a list of Coreference objects.
Getter: Returns a list of coreferences
Type: list of corenlp_xml.coreference.Coreference
- get_sentence_by_id(id)
Gets a sentence by its ID.
Parameters: id (int) – the ID of the sentence, as defined in the XML
Returns: the sentence with that ID
Return type: corenlp_xml.document.Sentence
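To make the ID lookup concrete, here is a minimal sketch of the idea behind get_sentence_by_id, written against the standard library rather than corenlp_xml itself. The XML snippet is a hand-written, abbreviated imitation of CoreNLP XML output, and the helper find_sentence_by_id is hypothetical; the library's actual implementation may differ.

```python
import xml.etree.ElementTree as ET

# Hand-written, abbreviated stand-in for CoreNLP XML output
# (an assumption for illustration, not real annotator output).
XML_STRING = """
<root>
  <document>
    <sentences>
      <sentence id="1">
        <tokens>
          <token id="1"><word>Dogs</word></token>
          <token id="2"><word>bark</word></token>
        </tokens>
      </sentence>
      <sentence id="2">
        <tokens>
          <token id="1"><word>Cats</word></token>
          <token id="2"><word>sleep</word></token>
        </tokens>
      </sentence>
    </sentences>
  </document>
</root>
"""


def find_sentence_by_id(root, sentence_id):
    """Hypothetical helper: return the <sentence> element whose id
    attribute equals sentence_id, or None if no sentence matches."""
    for sentence in root.iter("sentence"):
        # XML attribute values are strings, so convert before comparing.
        if int(sentence.get("id")) == sentence_id:
            return sentence
    return None


root = ET.fromstring(XML_STRING.strip())
second = find_sentence_by_id(root, 2)
words = [w.text for w in second.iter("word")]
print(words)  # prints ['Cats', 'sleep']
```

The IDs are the ones written in the XML, so in this snippet they start at 1 rather than 0.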
- class corenlp_xml.document.Sentence(element)
This class abstracts a sentence, wrapping its XML element.
- basic_dependencies
Accesses the basic dependencies from the XML output.
Getter: Returns the dependency graph for basic dependencies
Type: corenlp_xml.dependencies.DependencyGraph
- collapsed_ccprocessed_dependencies
Accesses the collapsed, CC-processed dependencies for this sentence.
Getter: Returns the dependency graph for collapsed and CC-processed dependencies
Type: corenlp_xml.dependencies.DependencyGraph
- collapsed_dependencies
Accesses the collapsed dependencies for this sentence.
Getter: Returns the dependency graph for collapsed dependencies
Type: corenlp_xml.dependencies.DependencyGraph
- get_token_by_id(id)
Accesses a token by its XML ID.
Parameters: id (int) – the XML ID of the token
Returns: the token
Return type: corenlp_xml.document.Token
- parse
Accesses the parse tree, built from the S-expression parse string in the XML.
Getter: Returns the NLTK parse tree
Type: nltk.Tree
- parse_string
Accesses the S-expression parse string stored in the XML document.
Getter: Returns the parse string
Type: str
- phrase_strings(phrase_type)
Returns the strings corresponding to all phrases matching a given phrase type.
Parameters: phrase_type (str) – the label to match, such as “NP”, “VP”, “det”, etc.
Returns: a list of strings representing those phrases
- semantic_head
Returns the semantic head of the sentence, that is, the dependent of the root node of the dependency parse.
Returns: the mention related to the semantic head
Return type: corenlp_xml.coreference.Mention
- sentiment
The sentiment of this sentence.
Getter: Returns the sentiment value of this sentence
Type: int
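The relationship between parse_string and phrase_strings above can be sketched without any dependencies. The library documents parse as an nltk.Tree; the example below instead uses a small standard-library S-expression reader so it runs on its own. The parse string is hand-written, and read_tree, leaves, and collect_phrase_strings are hypothetical names, not part of corenlp_xml.

```python
import re

# Hand-written S-expression in Penn Treebank bracket style (illustrative only).
PARSE_STRING = (
    "(ROOT (S (NP (DT The) (JJ quick) (NN fox)) "
    "(VP (VBD saw) (NP (DT a) (NN dog)))))"
)


def read_tree(parse_string):
    """Parse a bracketed string into nested (label, children) tuples.
    Leaf words are kept as plain strings."""
    tokens = re.findall(r"\(|\)|[^\s()]+", parse_string)
    pos = 0

    def read_node():
        nonlocal pos
        pos += 1                # skip the opening "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())
            else:
                children.append(tokens[pos])
                pos += 1
        pos += 1                # skip the closing ")"
        return (label, children)

    return read_node()


def leaves(node):
    """Return the words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    words = []
    for child in node[1]:
        words.extend(leaves(child))
    return words


def collect_phrase_strings(node, phrase_type):
    """Collect the text of every subtree labelled phrase_type, in pre-order."""
    if isinstance(node, str):
        return []
    found = []
    if node[0] == phrase_type:
        found.append(" ".join(leaves(node)))
    for child in node[1]:
        found.extend(collect_phrase_strings(child, phrase_type))
    return found


tree = read_tree(PARSE_STRING)
print(collect_phrase_strings(tree, "NP"))  # prints ['The quick fox', 'a dog']
```

Nested phrases are reported separately, so the inner noun phrase "a dog" appears in the result even though it also sits inside the verb phrase.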
- class corenlp_xml.document.Token(element)
Wraps the token XML element.
- character_offset_begin
Lazy-loads the character offset begin node.
Getter: Returns the integer value of the beginning offset
Type: int
- character_offset_end
Lazy-loads the character offset end node.
Getter: Returns the integer value of the ending offset
Type: int
- lemma
Lazy-loads the lemma for this word.
Getter: Returns the plain string value of the word lemma
Type: str
- ner
Lazy-loads the NER tag for this word.
Getter: Returns the plain string value of the NER tag for the word
Type: str
- pos
Lazy-loads the part-of-speech tag for this word.
Getter: Returns the plain string value of the POS tag for the word
Type: str
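The "lazy-loads" wording above means a value is read from the XML only when the property is first accessed. A minimal sketch of that pattern follows, using only the standard library. The class LazyToken is hypothetical and is not the library's Token; the child element names (lemma, POS, NER, CharacterOffsetBegin, CharacterOffsetEnd) are assumed to follow CoreNLP's XML output, and the token snippet is hand-written.

```python
import xml.etree.ElementTree as ET

# Hand-written token element imitating CoreNLP XML output (an assumption).
TOKEN_XML = """
<token id="1">
  <word>Paris</word>
  <lemma>Paris</lemma>
  <CharacterOffsetBegin>0</CharacterOffsetBegin>
  <CharacterOffsetEnd>5</CharacterOffsetEnd>
  <POS>NNP</POS>
  <NER>LOCATION</NER>
</token>
"""


class LazyToken:
    """Hypothetical wrapper illustrating lazy loading of token fields."""

    def __init__(self, element):
        self._element = element
        self._cache = {}  # filled in on first access to each field

    def _text(self, tag):
        # Look up the child node only once, then reuse the stored value.
        if tag not in self._cache:
            self._cache[tag] = self._element.findtext(tag)
        return self._cache[tag]

    @property
    def lemma(self):
        return self._text("lemma")

    @property
    def pos(self):
        return self._text("POS")

    @property
    def ner(self):
        return self._text("NER")

    @property
    def character_offset_begin(self):
        return int(self._text("CharacterOffsetBegin"))

    @property
    def character_offset_end(self):
        return int(self._text("CharacterOffsetEnd"))


token = LazyToken(ET.fromstring(TOKEN_XML.strip()))
print(token.lemma, token.pos, token.ner)  # prints Paris NNP LOCATION
print(token.character_offset_begin, token.character_offset_end)  # prints 0 5
```

Nothing is read from the element at construction time, so creating many tokens stays cheap when only a few fields are ever inspected.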