IterModel
-
class matchup.models.model.IterModel
Bases: matchup.models.model.Model
Describes one variation of the Model classes. IterModel classes provide features that support their operation; pointers and occurrences are implemented here.
Methods Summary
doc_repr(doc) – Process a document and generate its representation.
initialize(query, vocabulary) – Initialize the query-based structures.
initialize_occurrences(query, vocabulary) – Create another data structure, _term_occurrences, that represents the vocabulary restricted to the query keywords.
initialize_pointers() – Initialize the pointers used by the model algorithm.
iter() – Define one iteration of the iterative model algorithm.
next_doc() – Return the lowest document among the current pointers.
process_vocabulary_query_based(query, vocabulary) – Generate document scores based on the query.
query_repr(query, idf, tf) – Construct the query representation.
run(query, vocabulary) – Define the principal method of IR models.
stop() – Given a dictionary with all pointers by keyword and another dictionary with all occurrences by keyword, determine whether the algorithm is over.
Methods Documentation
-
doc_repr(doc: str) → DefaultDict[str, float]
Process a document and generate its representation.
Parameters: doc – string representing the lowest document
Returns: the document representation, a dictionary with the score of every term in the document (the document vector).
-
initialize(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary)
Initialize the query-based structures.
-
initialize_occurrences(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → None
Create another data structure, _term_occurrences, that represents the vocabulary restricted to the query keywords.
Parameters: - query – original query
- vocabulary – original vocabulary
Returns: None
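The idea of restricting a vocabulary to the query keywords can be sketched as follows. This is only an illustration under stated assumptions: it uses plain dictionaries and (doc_id, frequency) tuples as hypothetical stand-ins for the Vocabulary and Occurrence classes, and the name restrict_to_query is not part of the library.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# Hypothetical posting format: (doc_id, frequency). The real library stores
# Occurrence objects inside a Vocabulary instead of tuples inside a dict.
Posting = Tuple[str, int]


def restrict_to_query(query: List[str],
                      vocabulary: Dict[str, List[Posting]]
                      ) -> DefaultDict[str, List[Posting]]:
    """Keep only the posting lists of keywords that appear in the query."""
    term_occurrences: DefaultDict[str, List[Posting]] = defaultdict(list)
    for keyword in query:
        if keyword in vocabulary:
            # Sort by document id so pointers can later advance in order.
            term_occurrences[keyword] = sorted(vocabulary[keyword])
    return term_occurrences


vocabulary = {
    "cat": [("d2", 1), ("d1", 3)],
    "dog": [("d1", 1)],
    "fish": [("d3", 2)],
}
term_occurrences = restrict_to_query(["cat", "fish", "bird"], vocabulary)
```

In this sketch, query terms missing from the vocabulary ("bird") are skipped, and vocabulary terms outside the query ("dog") are dropped.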
-
initialize_pointers() → None
Initialize the pointers used by the model algorithm.
Returns: None
-
iter() → Tuple[str, DefaultDict[str, float]]
Define one iteration of the iterative model algorithm.
Returns: doc, doc_repr (keyword -> score)
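One plausible shape for a single iteration is a document-at-a-time step: pick the lowest pending document, collect the scores of the keywords whose pointer sits on it, and advance those pointers. The sketch below assumes plain dictionaries and (doc_id, score) tuples; the function name one_iteration and the data layout are hypothetical and may differ from the library's implementation.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# Hypothetical posting format: (doc_id, score). Not the library's Occurrence type.
Posting = Tuple[str, float]


def one_iteration(pointers: Dict[str, int],
                  occurrences: Dict[str, List[Posting]]
                  ) -> Tuple[str, DefaultDict[str, float]]:
    """Score the lowest pending document and advance the pointers on it.

    Assumes at least one pointer is not exhausted.
    """
    # Posting currently under each pointer that has not run off its list.
    current = {
        keyword: occurrences[keyword][position]
        for keyword, position in pointers.items()
        if position < len(occurrences[keyword])
    }
    # Document ids are compared as strings (lexicographic order).
    doc = min(doc_id for doc_id, _ in current.values())
    doc_repr: DefaultDict[str, float] = defaultdict(float)
    for keyword, (doc_id, score) in current.items():
        if doc_id == doc:
            doc_repr[keyword] = score
            pointers[keyword] += 1  # move past the document just consumed
    return doc, doc_repr


occurrences = {
    "cat": [("d1", 0.5), ("d3", 0.2)],
    "fish": [("d1", 0.1), ("d2", 0.9)],
}
pointers = {"cat": 0, "fish": 0}
first_doc, first_repr = one_iteration(pointers, occurrences)
second_doc, second_repr = one_iteration(pointers, occurrences)
```

Each call returns a pair matching the documented shape: the document and a keyword -> score mapping for it.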
-
next_doc() → str
Return the lowest document among the current pointers.
Returns: lowest document
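Selecting the lowest document from a set of pointers can be illustrated with a small sketch. It assumes each keyword has a sorted list of document ids and an integer pointer into that list; lowest_doc and both dictionaries are hypothetical names, not the library's internal attributes.

```python
from typing import Dict, List, Optional


def lowest_doc(pointers: Dict[str, int],
               occurrences: Dict[str, List[str]]) -> Optional[str]:
    """Return the smallest document id currently under any pointer."""
    candidates = [
        occurrences[keyword][position]
        for keyword, position in pointers.items()
        if position < len(occurrences[keyword])  # skip exhausted lists
    ]
    return min(candidates) if candidates else None


occurrences = {"cat": ["d1", "d4"], "fish": ["d2", "d4"]}
lowest = lowest_doc({"cat": 1, "fish": 0}, occurrences)
exhausted = lowest_doc({"cat": 2, "fish": 2}, occurrences)
```

Returning None when every list is exhausted is a choice made for this sketch; the documented method returns a str.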
-
process_vocabulary_query_based(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → DefaultDict[str, List[matchup.structure.occurrence.Occurrence]]
Generate document scores based on the query.
Parameters: - query – query representation
- vocabulary – vocabulary structure
Returns: dictionary mapping each key to its list of occurrences
-
classmethod query_repr(query: List[matchup.presentation.text.Term], idf, tf) → DefaultDict[str, float]
Construct the query representation.
Parameters: - query – list of all terms
- idf – structure IDF
- tf – structure TF
Returns: query representation
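The documentation does not state how idf and tf are combined. A common choice in vector-space models is the product of the two weights per term, which the sketch below assumes. The dictionary-based idf and tf arguments and the name query_vector are illustrative assumptions, not the library's structures.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List


def query_vector(query: List[str],
                 idf: Dict[str, float],
                 tf: Dict[str, float]) -> DefaultDict[str, float]:
    """Build a term -> weight mapping using weight = tf * idf (assumed scheme)."""
    vector: DefaultDict[str, float] = defaultdict(float)
    for term in query:
        # Terms missing from either structure get a weight of 0.0.
        vector[term] = tf.get(term, 0.0) * idf.get(term, 0.0)
    return vector


vector = query_vector(
    ["cat", "fish"],
    idf={"cat": 2.0, "fish": 0.5},
    tf={"cat": 1.0, "fish": 3.0},
)
```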
-
run(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → List[matchup.structure.solution.Result]
Define the principal method of IR models.
Parameters: - query – List of all entry terms
- vocabulary – Vocabulary pre-processed
Returns: list of results
-
stop() → bool
Given a dictionary with all pointers by keyword and another dictionary with all occurrences by keyword, determine whether the algorithm is over.
Returns: boolean flag indicating whether the algorithm stops
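A stop condition of this kind can be sketched as a check that every pointer has moved past the end of its occurrence list. The layout below (integer pointers and lists keyed by keyword) and the name is_finished are assumptions made for illustration only.

```python
from typing import Dict, List


def is_finished(pointers: Dict[str, int],
                occurrences: Dict[str, List[str]]) -> bool:
    """True when no keyword has an unread occurrence left."""
    return all(
        position >= len(occurrences[keyword])
        for keyword, position in pointers.items()
    )


occurrences = {"cat": ["d1", "d4"], "fish": ["d2"]}
still_running = is_finished({"cat": 1, "fish": 1}, occurrences)
finished = is_finished({"cat": 2, "fish": 1}, occurrences)
```

With this condition, a driver loop would keep running single iterations until the check returns True.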
-