IterModel

class matchup.models.model.IterModel

Bases: matchup.models.model.Model

Describes one variation of the Model classes. IterModel classes provide features that support their work: pointers and occurrences are implemented here.

Methods Summary

doc_repr(doc) Process a document and generate its representation.
initialize(query, vocabulary) Initialize the model's query-based state.
initialize_occurrences(query, vocabulary) Create another data structure, _term_occurrences, that represents the vocabulary restricted to the query keywords.
initialize_pointers() Initialize the pointers used by the model algorithm.
iter() Define one iteration of the model algorithm.
next_doc() Return the lowest document currently referenced by the pointers.
process_vocabulary_query_based(query, vocabulary) Generate document scores based on the query.
query_repr(query, idf, tf) Construct the query representation.
run(query, vocabulary) Define the principal method of IR models.
stop() Given a dictionary with all pointers by keyword and another dictionary with all occurrences by keyword, determine whether the algorithm is over.

Methods Documentation

doc_repr(doc: str) → DefaultDict[str, float]
Process a document and generate its representation.
Parameters: doc – string representing the lowest document
Returns: document representation, a dictionary with all term scores for the document. That is the document vector.
initialize(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary)

Initialize the model's query-based state.

initialize_occurrences(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → None
Create another data structure, _term_occurrences, that represents the vocabulary restricted to the query keywords.
Parameters:
  • query – original query
  • vocabulary – original vocabulary
Returns:

None
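
The restriction described above can be pictured with a minimal sketch. It assumes the vocabulary behaves like a plain inverted index (term -> list of postings); the names `Posting` and `restrict_to_query` are hypothetical stand-ins for illustration and are not part of matchup.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# Hypothetical stand-in for an occurrence: (document name, term frequency).
Posting = Tuple[str, int]


def restrict_to_query(
    query: List[str], index: Dict[str, List[Posting]]
) -> DefaultDict[str, List[Posting]]:
    """Keep only the posting lists of terms that appear in the query."""
    term_occurrences: DefaultDict[str, List[Posting]] = defaultdict(list)
    for term in query:
        if term in index:
            # Sort by document name so pointers can later advance in order.
            term_occurrences[term] = sorted(index[term])
    return term_occurrences


index = {
    "cat": [("d2", 1), ("d1", 3)],
    "dog": [("d1", 1)],
    "fish": [("d3", 2)],
}
occurrences = restrict_to_query(["cat", "fish", "bird"], index)
```

Here "dog" is dropped because it is not in the query, and "bird" is skipped because the index has no postings for it.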

initialize_pointers() → None
Initialize the pointers used by the model algorithm.
Returns: None
iter() → Tuple[str, DefaultDict[str, float]]
Define one iteration of the model algorithm.
Returns: doc, doc_repr (keyword -> score)
next_doc() → str
Return the lowest document currently referenced by the pointers.
Returns: lowest document
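
As an illustration of the idea, here is a minimal sketch of picking the lowest document among per-term pointers. It assumes one integer pointer per query term into a posting list sorted by document name; `lowest_doc` and the data layout are hypothetical and may differ from the structures matchup actually uses.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for an occurrence: (document name, term frequency).
Posting = Tuple[str, int]


def lowest_doc(
    pointers: Dict[str, int], occurrences: Dict[str, List[Posting]]
) -> Optional[str]:
    """Return the smallest document name currently pointed at.

    Returns None when every pointer has run past the end of its list.
    """
    candidates = [
        occurrences[term][pos][0]
        for term, pos in pointers.items()
        if pos < len(occurrences[term])  # skip exhausted posting lists
    ]
    return min(candidates) if candidates else None


occurrences = {"cat": [("d1", 3), ("d2", 1)], "fish": [("d3", 2)]}
pointers = {term: 0 for term in occurrences}  # every pointer starts at 0
first = lowest_doc(pointers, occurrences)
```
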
process_vocabulary_query_based(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → DefaultDict[str, List[matchup.structure.occurrence.Occurrence]]
Generate document scores based on the query.
Parameters:
  • query – query representation
  • vocabulary – vocabulary structure
Returns:

Dictionary mapping each keyword to its list of occurrences

classmethod query_repr(query: List[matchup.presentation.text.Term], idf, tf) → DefaultDict[str, float]
Construct the query representation.
Parameters:
  • query – list of all terms
  • idf – IDF structure
  • tf – TF structure
Returns:

query representation
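
For orientation, a query representation of this kind is often a TF-IDF vector. The sketch below assumes log-scaled term frequency and a logarithmic inverse document frequency; the function `query_vector`, its parameters, and the weighting formula are assumptions made for illustration and do not reproduce the signature or the IDF and TF structures of matchup.

```python
import math
from collections import Counter, defaultdict
from typing import DefaultDict, Dict, List


def query_vector(
    query: List[str], doc_freq: Dict[str, int], n_docs: int
) -> DefaultDict[str, float]:
    """Build a term -> weight mapping for the query (assumed TF-IDF scheme)."""
    vector: DefaultDict[str, float] = defaultdict(float)
    for term, count in Counter(query).items():
        df = doc_freq.get(term, 0)
        if df == 0:
            continue  # term never occurs in the collection: no weight
        tf = 1.0 + math.log(count)    # log-scaled term frequency
        idf = math.log(n_docs / df)   # rarer terms weigh more
        vector[term] = tf * idf
    return vector


vec = query_vector(["cat", "cat", "fish", "bird"], {"cat": 2, "fish": 1}, n_docs=4)
```

Terms that repeat in the query or are rare in the collection receive larger weights, while "bird" gets no entry because no document contains it.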

run(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → List[matchup.structure.solution.Result]
Define the principal method of IR models.
Parameters:
  • query – list of all input terms
  • vocabulary – pre-processed vocabulary
Returns:

List of results

stop() → bool
Given a dictionary with all pointers by keyword and another dictionary with all occurrences by keyword, this function determines whether the algorithm is over.
Returns: boolean flag indicating whether the algorithm stops
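
To show how a stop condition of this kind can drive the iteration, here is a minimal sketch of a document-at-a-time loop. It assumes the algorithm is over once every pointer has passed the end of its occurrence list, and that each list is sorted by document name; `finished`, `walk`, and `Posting` are hypothetical names, not matchup's implementation.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Iterator, List, Tuple

# Hypothetical stand-in for an occurrence: (document name, term score).
Posting = Tuple[str, float]


def finished(pointers: Dict[str, int], occurrences: Dict[str, List[Posting]]) -> bool:
    """True when every pointer has moved past the end of its occurrence list."""
    return all(pos >= len(occurrences[term]) for term, pos in pointers.items())


def walk(
    occurrences: Dict[str, List[Posting]]
) -> Iterator[Tuple[str, DefaultDict[str, float]]]:
    """Yield (doc, {term: score}) for each document, in ascending document order."""
    pointers = {term: 0 for term in occurrences}
    while not finished(pointers, occurrences):
        # Lowest document among the postings currently pointed at.
        doc = min(
            occurrences[term][pos][0]
            for term, pos in pointers.items()
            if pos < len(occurrences[term])
        )
        vector: DefaultDict[str, float] = defaultdict(float)
        for term, pos in list(pointers.items()):
            postings = occurrences[term]
            if pos < len(postings) and postings[pos][0] == doc:
                vector[term] = postings[pos][1]
                pointers[term] = pos + 1  # advance only the pointers that matched
        yield doc, vector


occurrences = {
    "cat": [("d1", 3.0), ("d2", 1.0)],
    "fish": [("d2", 2.0), ("d3", 2.0)],
}
visited = [(doc, dict(vec)) for doc, vec in walk(occurrences)]
```

Each pass consumes exactly one document, so the loop ends after as many iterations as there are distinct documents in the restricted occurrence lists.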