IterModel

class matchup.models.model.IterModel

Bases: matchup.models.model.Model

A variation of the Model classes. IterModel classes provide features that support iterative processing of a query: pointers and occurrences are implemented here.

Methods Summary

`doc_repr`(doc)	Process a document and generate its representation.
`initialize`(query, vocabulary)	Initialize the query-based state.
`initialize_occurrences`(query, vocabulary)	Create another data structure, _term_occurrences, that represents the vocabulary restricted to the query keywords.
`initialize_pointers`()	Initialize the pointers used by the model algorithm.
`iter`()	Perform one iteration of the model algorithm.
`next_doc`()	Return the lowest document currently referenced by the pointers.
`process_vocabulary_query_based`(query, vocabulary)	Generate document scores based on the query.
`query_repr`(query, idf, tf)	Construct the query representation.
`run`(query, vocabulary)	The principal method of IR models.
`stop`()	Determine whether the algorithm is over, given all pointers and all occurrences by keyword.

Methods Documentation

doc_repr(doc: str) → DefaultDict[str, float]

Process a document and generate its representation.

Parameters:	doc – string identifying the lowest document
Returns:	the document representation: a dictionary with all term scores for the document, that is, the document vector.

initialize(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary)

Initialize the query-based state.

initialize_occurrences(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → None

Create another data structure, _term_occurrences, that represents the vocabulary restricted to the query keywords.

Parameters:	query – original query
	vocabulary – original vocabulary
Returns:	None
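
The idea of restricting the vocabulary to the query keywords can be sketched with plain dictionaries. This is an illustration only, not the library's implementation: the `restrict_to_query` name, the dict-based vocabulary, and the `(doc, score)` tuples standing in for Occurrence objects are all assumptions.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# Hypothetical stand-in for an occurrence: (document id, score).
Posting = Tuple[str, float]


def restrict_to_query(
    query: List[str], vocabulary: Dict[str, List[Posting]]
) -> DefaultDict[str, List[Posting]]:
    """Keep only the posting lists of keywords that appear in the query."""
    term_occurrences: DefaultDict[str, List[Posting]] = defaultdict(list)
    for keyword in query:
        if keyword in vocabulary:
            # Sort by document id so a pointer can walk the list in order.
            term_occurrences[keyword] = sorted(vocabulary[keyword])
    return term_occurrences


vocabulary = {
    "apple": [("d2", 0.5), ("d1", 1.0)],
    "pear": [("d3", 0.7)],
    "plum": [("d1", 0.2)],
}
occurrences = restrict_to_query(["apple", "plum", "kiwi"], vocabulary)
```

Keywords that are absent from the vocabulary (here `kiwi`) simply produce no entry, and vocabulary terms outside the query (here `pear`) are dropped.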

initialize_pointers() → None

Initialize the pointers used by the model algorithm.

Returns:	None
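
One plausible reading of "pointers" is a cursor per keyword, each starting at the head of that keyword's occurrence list. The sketch below assumes exactly that; `make_pointers` and the tuple-based postings are hypothetical names, and the real internal layout may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an occurrence: (document id, score).
Posting = Tuple[str, float]


def make_pointers(term_occurrences: Dict[str, List[Posting]]) -> Dict[str, int]:
    """Start one cursor per keyword at index 0 of its occurrence list."""
    return {keyword: 0 for keyword in term_occurrences}


term_occurrences = {
    "apple": [("d1", 1.0), ("d2", 0.5)],
    "plum": [("d1", 0.2)],
}
pointers = make_pointers(term_occurrences)
```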

iter() → Tuple[str, DefaultDict[str, float]]

Perform one iteration of the model algorithm.

Returns:	doc, doc_repr (keyword -> score)
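
A single iteration of a document-at-a-time traversal might look like the following. This is a minimal sketch under stated assumptions, not the library's code: `iterate_once`, the integer cursors, and the `(doc, score)` postings are hypothetical, and document ids are assumed to compare in the order the lists are sorted.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

# Hypothetical stand-in for an occurrence: (document id, score).
Posting = Tuple[str, float]


def iterate_once(
    term_occurrences: Dict[str, List[Posting]], pointers: Dict[str, int]
) -> Tuple[str, DefaultDict[str, float]]:
    """Consume the lowest pending document and return its keyword scores.

    Assumes at least one pointer is not exhausted.
    """
    # Posting currently under each non-exhausted pointer.
    pending = {
        kw: term_occurrences[kw][idx]
        for kw, idx in pointers.items()
        if idx < len(term_occurrences[kw])
    }
    doc = min(d for d, _ in pending.values())
    doc_repr: DefaultDict[str, float] = defaultdict(float)
    for kw, (d, score) in pending.items():
        if d == doc:
            doc_repr[kw] = score
            pointers[kw] += 1  # advance only the pointers that matched
    return doc, doc_repr


term_occurrences = {
    "apple": [("d1", 1.0), ("d2", 0.5)],
    "plum": [("d1", 0.2)],
}
pointers = {"apple": 0, "plum": 0}
first_doc, first_repr = iterate_once(term_occurrences, pointers)
```

Advancing only the matching cursors means every document is visited once, with all of its query-keyword scores gathered in the same step.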

next_doc() → str

Return the lowest document currently referenced by the pointers.

Returns:	lowest document
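
Selecting the lowest document amounts to taking the minimum over the documents under each live pointer. The sketch below is illustrative: `lowest_doc` is a hypothetical helper, postings are `(doc, score)` tuples, and returning `None` when every list is exhausted is an assumption rather than documented behaviour.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for an occurrence: (document id, score).
Posting = Tuple[str, float]


def lowest_doc(
    term_occurrences: Dict[str, List[Posting]], pointers: Dict[str, int]
) -> Optional[str]:
    """Return the smallest document id under any non-exhausted pointer."""
    candidates = [
        term_occurrences[kw][idx][0]
        for kw, idx in pointers.items()
        if idx < len(term_occurrences[kw])
    ]
    return min(candidates) if candidates else None


term_occurrences = {
    "apple": [("d2", 0.5), ("d4", 0.1)],
    "plum": [("d1", 0.2), ("d2", 0.9)],
}
lowest = lowest_doc(term_occurrences, {"apple": 1, "plum": 1})
exhausted = lowest_doc(term_occurrences, {"apple": 2, "plum": 2})
```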

process_vocabulary_query_based(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → DefaultDict[str, List[matchup.structure.occurrence.Occurrence]]

Generate document scores based on the query.

Parameters:	query – query representation
	vocabulary – vocabulary structure
Returns:	dictionary mapping each keyword to its list of occurrences

classmethod query_repr(query: List[matchup.presentation.text.Term], idf, tf) → DefaultDict[str, float]

Construct the query representation.

Parameters:	query – list of all query terms
	idf – IDF structure
	tf – TF structure
Returns:	query representation

run(query: List[matchup.presentation.text.Term], vocabulary: matchup.structure.vocabulary.Vocabulary) → List[matchup.structure.solution.Result]

The principal method of IR models.

Parameters:	query – list of all entry terms
	vocabulary – pre-processed vocabulary
Returns:	list of results

stop() → bool

Given a dictionary with all pointers by keyword and another dictionary with all occurrences by keyword, determine whether the algorithm is over.

Returns:	boolean flag indicating whether the algorithm stops
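
If pointers are integer cursors into per-keyword occurrence lists, the stop condition reduces to checking that every cursor has moved past the end of its list. The following sketch assumes that layout; `is_finished` and the tuple postings are hypothetical and only illustrate the check.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for an occurrence: (document id, score).
Posting = Tuple[str, float]


def is_finished(
    term_occurrences: Dict[str, List[Posting]], pointers: Dict[str, int]
) -> bool:
    """True when every keyword's pointer is past the end of its list."""
    return all(
        pointers[kw] >= len(postings)
        for kw, postings in term_occurrences.items()
    )


term_occurrences = {
    "apple": [("d1", 1.0), ("d2", 0.5)],
    "plum": [("d1", 0.2)],
}
still_running = is_finished(term_occurrences, {"apple": 1, "plum": 1})
done = is_finished(term_occurrences, {"apple": 2, "plum": 1})
```

With no query keywords at all, `all` over an empty sequence is true, so the sketch reports the algorithm as finished immediately.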

MatchUp Information Retrieval Library
