Python SQLite - Select Only Spans of Rows with Multiple Words
I have a book in an SQLite table that has one sentence per row. There are 30k rows/sentences, and the format of the table cannot be changed (it would break many other things).
I have several different sets of spans of ids that more or less divide the book into paragraphs. They are tuples in a list, e.g. [(0,2), (3,6), (7,10) ...] or [(0,3), (4,9), (10,13) ...], etc.
I need to be able to return the spans that contain two or more given words. That is, find "water" and "earth" within the same span/paragraph.
I looked at making views for each group of spans, using group_concat to combine the sentences, but I can find no way to do it since views cannot be appended to.
Making 1000s of select calls like 'select * from book where id between ? and ? and...' does not seem efficient.
Is there a way to return the spans that have hits with a single statement, or maybe a way to use a temporary table to combine them?
If they're sequential (or you can force the ordering of id to correlate with the spans), you can do the grouping in Python by applying an arbitrary key to each group and then using that key as part of a groupby. E.g.:
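    from itertools import repeat, chain, groupby
    from operator import itemgetter

    # Stand-in for the sentences, one per id (written for Python 3:
    # zip and print() replace the original izip and print statement).
    testdata = [str(i) for i in range(10)]
    spans = [(0,2), (3,6), (7,10)]

    # Repeat each span's index once per id it covers: 0,0,0,1,1,1,1,2,...
    groups = chain.from_iterable(repeat(idx, e - s + 1) for idx, (s, e) in enumerate(spans))
    # Pair every sentence with its span index, then group on that index.
    for k, g in groupby(zip(testdata, groups), itemgetter(1)):
        words = set(chain.from_iterable(el[0].split() for el in g))
        if words.issuperset(['3', '6']):
            print(words)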
You will need to modify how it splits words and chooses matches, but it remains one possible option.
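As a rough sketch of how the same grouping idea might be applied to the real table, the following reads the rows once, ordered by id, and groups them by span. The table name `book` and the column `id` come from the question; the column name `sentence`, the helper `spans_containing`, and the sample sentences are assumptions made up for illustration. It also assumes the spans are sorted and do not overlap.

```python
import sqlite3
from bisect import bisect_right
from itertools import groupby


def spans_containing(conn, spans, required):
    """Yield each (start, end) span whose sentences together contain every required word.

    Assumes `spans` is sorted by start id and the spans do not overlap.
    The column name `sentence` is hypothetical.
    """
    starts = [start for start, _ in spans]
    required = set(required)

    def span_index(row):
        # Index of the last span whose start is <= this row's id (-1 if none).
        return bisect_right(starts, row[0]) - 1

    rows = conn.execute("SELECT id, sentence FROM book ORDER BY id")
    for idx, group in groupby(rows, key=span_index):
        if idx < 0:
            continue  # rows that come before the first span
        start, end = spans[idx]
        words = set()
        for row_id, sentence in group:
            if row_id <= end:  # ignore ids that fall in a gap after this span
                words.update(sentence.lower().split())
        if required <= words:
            yield spans[idx]


# Tiny in-memory stand-in for the real book table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, sentence TEXT)")
sentences = [
    "the water is cold", "earth is round", "nothing here",
    "just water", "fire and air", "more air", "dry sand",
    "deep water", "sky", "earth below", "moon",
]
conn.executemany("INSERT INTO book VALUES (?, ?)", enumerate(sentences))

spans = [(0, 2), (3, 6), (7, 10)]
hits = list(spans_containing(conn, spans, ["water", "earth"]))
print(hits)
```

This makes a single SELECT over the whole table instead of one query per span. The word splitting here is a plain lowercase `split()`, so punctuation handling would still need adjusting for real text.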
If you're doing this often, you may wish to consider creating a table containing a single column representing a paragraph (instead of sentences), and applying a full text index on that column to make future queries a lot easier. You could utilise the above code to assist in building that table.
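One possible way to build such a paragraph table, sketched here under assumptions, is to load the spans into a temporary table and join it against the sentences with group_concat, which also touches on the temporary-table idea from the question. The names `spans`, `paragraph`, `start_id`, `end_id`, `body`, and `sentence`, plus both helper functions, are hypothetical. The lookup uses plain LIKE rather than a full text index so that it runs on any SQLite build.

```python
import sqlite3


def build_paragraphs(conn, spans):
    """Create a `paragraph` table with one row per span (hypothetical schema)."""
    conn.execute("CREATE TEMP TABLE spans (start_id INTEGER, end_id INTEGER)")
    conn.executemany("INSERT INTO spans VALUES (?, ?)", spans)
    # group_concat joins the sentences of each span into one string.
    # The concatenation order is not guaranteed, which is fine for word lookups.
    conn.execute(
        """
        CREATE TABLE paragraph AS
        SELECT s.start_id, s.end_id, group_concat(b.sentence, ' ') AS body
        FROM spans AS s
        JOIN book AS b ON b.id BETWEEN s.start_id AND s.end_id
        GROUP BY s.start_id, s.end_id
        """
    )


def paragraphs_with(conn, words):
    """Return (start_id, end_id) for paragraphs whose text contains every word.

    Expects at least one word. LIKE matches substrings, so "earth" would
    also match "earthquake".
    """
    where = " AND ".join("body LIKE ?" for _ in words)
    params = ["%" + word + "%" for word in words]
    sql = "SELECT start_id, end_id FROM paragraph WHERE " + where + " ORDER BY start_id"
    return conn.execute(sql, params).fetchall()


# Tiny in-memory stand-in for the real book table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, sentence TEXT)")
sentences = [
    "the water is cold", "earth is round", "nothing here",
    "just water", "fire and air", "more air", "dry sand",
    "deep water", "sky", "earth below", "moon",
]
conn.executemany("INSERT INTO book VALUES (?, ?)", enumerate(sentences))

build_paragraphs(conn, [(0, 2), (3, 6), (7, 10)])
result = paragraphs_with(conn, ["water", "earth"])
print(result)
```

If the SQLite build in use includes a full text extension such as FTS5, a virtual table filled from `paragraph` could replace the LIKE lookup and give whole-word matching.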