[Glass] How to find a string in a large number of strings ...

Martin McClure via Glass glass at lists.gemtalksystems.com
Fri Mar 18 15:31:39 PDT 2016


On 03/18/2016 02:34 PM, itlists at schrievkrom.de via Glass wrote:
> I want to find a string in a very large number of strings (3 million
> and increasing).
> 
> Should one use a simple Set ?
> 
>  -> means, that lots of memory is used and perhaps lots of RAM is needed
>     in the GEM ... (total memory at least 40 Mbyte of data). Swapping ?
> 
> Should I use an UnorderedCollection of Strings with Index (Equality) ?
> 
>  -> how do I set an index on a set with Strings ?
> 
>  -> perhaps like:
>        aSetOfStrings createEqualityIndexOn: '' withLastElementClass: String ?
> 
> Another problem is that this set of strings changes twice a day ...
> mostly by adding strings.
> 
> 
> Any hint ?

What is the key by which you look up the string?

If it is the entire string (the string is 'foobar' and I know I want
'foobar'), use a Set; this will be very efficient. (But if you know the
entire string, why do you need to look it up at all?)
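
As a rough, language-neutral illustration of the exact-match case (this is
Python, not GemStone Smalltalk; the sample data and the name `contains` are
made up for the example), a hash-based set answers "is this exact string
present?" without scanning the collection:

```python
# Hypothetical sample data; the real collection would hold millions of strings.
strings = {"foobar", "foobaz", "quux"}

def contains(s: str) -> bool:
    # Hash-based membership test: average constant time, independent of set size.
    return s in strings
```

Only a full-string match succeeds here; `contains("foo")` is false even
though 'foobar' is in the set.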

If it is a prefix of the string (the string is 'foobar' but I only know
I want the string that starts with 'foo'), use an index.
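
To show why an ordered index helps with prefixes, here is a minimal Python
sketch (not the GemStone index API; `with_prefix` is a hypothetical helper,
and a sorted list stands in for the index). All strings sharing a prefix sit
next to each other in sorted order, so one binary search finds where they
begin:

```python
from bisect import bisect_left

def with_prefix(sorted_strings, prefix):
    """Return all strings in an already-sorted list that start with prefix."""
    # Binary search for the first string >= prefix; matches, if any, start here.
    i = bisect_left(sorted_strings, prefix)
    matches = []
    # Walk forward while the prefix still matches; matches are contiguous.
    while i < len(sorted_strings) and sorted_strings[i].startswith(prefix):
        matches.append(sorted_strings[i])
        i += 1
    return matches
```

For example, `with_prefix(sorted(["quux", "foobaz", "fo", "foobar"]), "foo")`
yields the two strings beginning with 'foo' and skips 'fo' and 'quux'.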

If you want to look up by some substring (the string is 'foobar' and I
know only that I want a string that contains 'oba'), you might want to
build a more complex structure.
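
One possible "more complex structure" for the substring case is a trigram
index. This is only an illustrative assumption, not something the thread
prescribes, and it is written in Python rather than GemStone Smalltalk; the
function names are hypothetical. Each string is filed under every
3-character slice it contains, so a query only has to check strings that
share all of the query's trigrams:

```python
from collections import defaultdict

def trigrams(s):
    # Every 3-character slice of s; empty set if s is shorter than 3.
    return {s[i:i + 3] for i in range(len(s) - 2)}

def build_index(strings):
    # Map each trigram to the set of strings that contain it.
    index = defaultdict(set)
    for s in strings:
        for g in trigrams(s):
            index[g].add(s)
    return index

def find_containing(index, strings, needle):
    grams = trigrams(needle)
    if not grams:
        # Needle too short to have a trigram: fall back to a full scan.
        return {s for s in strings if needle in s}
    # Candidates must contain every trigram of the needle.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Confirm with a real substring test, since trigrams may match out of order.
    return {s for s in candidates if needle in s}
```

The index costs extra memory and must be updated whenever strings are added,
which matters for a collection that grows twice a day.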

Regards,

-Martin

