[Glass] Instance variable handling and performance questions
FrankB via Glass
glass at lists.gemtalksystems.com
Sun Feb 8 02:29:29 PST 2015
Hello,

I am still in the process of reading up on GemStone and trying to understand
it, with the final goal of being able to decide whether we should use
GemStone for a new product family currently under development. The products
target a mass market and require server installations with a potentially
very high number of users and very high traffic. This means setting the
course for our technical environment for the next 10+ years.

I currently have two questions. I apologize upfront that explaining the
details takes a rather long text, and that "I did not have the time to keep
it short" (which in German is a famous quote from the great writer and poet
Johann Wolfgang von Goethe):

1) My first major concern arises from our special way of using
application data instance variables for searching and selecting objects in
GemStone:

In my comprehensive Smalltalk application classes (VW5), ALL
user-data-related instance variables:

- are kept by data model instances in attribute dictionaries
- are typically accessed by just sending the instVar name via
  doesNotUnderstand (in 98% of cases there are no getter/setter methods, and
  there is never any direct use of instVars inside methods)
- are created from and defined by version-dependent and partially
  user-defined definition data, which makes them totally dynamic at
  run-time, because all application data instance variables are created
  from this definition data and held in dictionaries.

Converting this form of dynamically defined instance variables to
'classical' hard-coded instance variables would, in my view, not only be a
major step back causing many severe and unacceptable disadvantages; it
would also be practically impossible, because it would require an almost
total rewrite of my entire software and its underlying complex framework. No
chance in this life!

Therefore, my question is whether or not such use of instance variables in
dictionaries is supported by GemStone. In other words, would or could
GemStone access, search, filter etc. these instVars via doesNotUnderstand,
just as all of our code does, or does GemStone require hard-coded instVars?

In case the answer is NO, you can also forget about the
next question, because this would mean the end for all of my considerations
on using GemStone.

2) Performance of loading complex objects through GemStone versus from
MariaDB

My concerns regarding this subject relate to two very different projects and
products, which have different and in some respects even opposing
requirements. I therefore divide this into two sections:

a) Loading very large numbers of moderately complex objects under extreme
traffic on web servers, for generating data for client-side UI and html
generation

I am currently preparing an already existing but previously desktop-only
application in VW5 to act as a web server that must support these
conditions, which are best described using a well-known example:

- serve a potentially extremely high number of simultaneous users, very
  similar in type and numbers to those accessing LinkedIn
- deliver to them data objects somewhat similar to the person profile
  structures of LinkedIn but much more complex, because for every master
  data object they support multiple content languages and have
  mandator-related parts, user-defined data, and object history
- in the beginning, support for a few tens of millions of such data objects
  is required (we have 5 million already), with the potential need for a
  couple of hundred million (in the end, and hopefully)
- support for a large number of searching and filtering criteria, which
  offer substantially more choices than the relatively simple so-called
  advanced search features of LinkedIn
- of course, users expect answers at Google-like performance

I should
mention that besides this project I have another two different application
scenarios to follow with similar requirements, and all three are truly new
and fill huge market niches. Of course, I am trying to develop the still
missing software parts so that as much as possible can be re-used later.
All three projects support the Freemium concept and will be
implemented and marketed without any involvement of venture capital sharks,
the legally organized criminal Mafia generally called "banks", or any other
capital parasites who all together have disastrous control over our
economies, societies, and politics - but I will never allow them to gain any
over me and my ideas!!! /(I am NOT a socialist, BTW)/

As for the storage technology, I strongly favour the combination of
*MariaDB* for mere content storage and *Sphinx* for queries: searching,
filtering, sorting, grouping etc. I do not see any viable alternative to
Sphinx, which seems to be the best if not the only player in this top league
of highest-performance web sites. For the near future the application logic
will remain in Smalltalk, but I keep in mind the option of later moving
parts or even all of it to node.js and server-side JavaScript if Smalltalk
cannot cope well enough with the performance requirements. Experience will
show!

Now the
decisive question:

There are two ways of loading the master data objects that result from a
Sphinx query and are needed to generate the data for answering the requests
from the browser clients:

- have Smalltalk collect the data object fragments from relational MariaDB
  tables, with their ids already known; one master object and its dependent
  sub-classes must be collected from about 5 to 8 tables, with mostly one
  access per table, and in 2 to 3 cases with typically 2 and up to 10
  dependent instances per sub-class / table
- or use GemStone to load these moderately complex objects by their id, with
  no further querying needed.

Would you expect substantial performance advantages deriving from GemStone
in such cases?

Regarding this, I should mention that experience shows that most of the CPU
time is consumed not by the database accesses but rather by converting the
fields from these ugly rectangular records into nice round objects and their
dictionary-based instance variables (you certainly noticed my analogy to
this famous old Byte [?] article on "How to squeeze nice round objects into
an ugly square database"; I have been in OOD since 1986).

b) Loading rather few extremely complex objects

The other potential
use-case for GemStone is my old but still not yet marketed
database publishing software. It was developed more than ten years ago, also
in VW5, with an effort of around 7 man-years, and so far it has been used
only by one very large, world-renowned Dutch electrical company, where it
generates a couple of hundred thousand product catalogue pages with very
complex layout for print, PDF and html. The data came from my product
management software, which stores over 1 million items. This software
suffered from severe performance problems when loading the 500 to more than
2,000 little components that one page was made up of. This resulted in
loading times of between 1 and 5 minutes per page from MySQL, even in
single-user mode. This was the major reason why this software was never
marketed beyond this one large customer. That is a pity, not only because of
the investment but primarily because, to the best of my knowledge, there
still is no similar solution available on the market. The software also
generates html, which makes it suitable for many more purposes today.

Of course, both DBMS software and the available hardware (SSD or RAM disks)
are many times faster today than back in 2002 to 2007. Despite this, I
would still expect that GemStone could improve loading times of these very
complex objects substantially compared to MariaDB. Any general comments?

Now comes
the great BUT:

Instead of targeting the professional publishing market, I would prefer to
first cover a different and newly developing mass market via a browser-based
solution (I have a couple of unique ideas and new features in mind),
primarily because such an application fits perfectly into, and substantially
adds value to, two of my above-mentioned server projects.

Therefore, I will have to expect and prepare for up to a couple of thousand
simultaneous users accessing their own private or group-wide private data
(no shared data beyond group borders, but a few simultaneous users per
group) stored on web servers. Here are the two main questions:

I know that this is a very difficult guess, but what would you expect one
GemStone server to be able to support in terms of simultaneous users and
requests (mostly pages loaded)?

And my last question:

Would you consider a GemStone-only server installation suitable? Or would
you recommend offloading the non-DB-related work to clients like Pharo
('work' here is essentially extracting data chunks to be sent to the fat
clients, which generate UI and html for themselves)?

I would very much appreciate a comment.

Thank you very much for taking the time to read my long novel.

Frank
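
For illustration only, the dictionary-backed attribute access described in
question 1 might be sketched as follows. This is a Python analogue, not
Smalltalk or GemStone code: `__getattr__` merely stands in for
doesNotUnderstand, and the class name `DataModel`, the `definition` argument
and the attribute names are hypothetical, not taken from the actual
framework.

```python
class DataModel:
    """Hypothetical data model whose user-data attributes live in a dictionary."""

    def __init__(self, definition):
        # 'definition' is an assumed stand-in for the run-time definition
        # data: here simply a list of the attribute names this instance has.
        object.__setattr__(self, "_attributes", {name: None for name in definition})

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails, which is roughly
        # the role doesNotUnderstand plays in the scheme described above.
        attributes = object.__getattribute__(self, "_attributes")
        if name in attributes:
            return attributes[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Only attributes named in the definition data may be stored.
        attributes = object.__getattribute__(self, "_attributes")
        if name not in attributes:
            raise AttributeError(name)
        attributes[name] = value


person = DataModel(["firstName", "city"])
person.firstName = "Ada"
person.city = "Hamburg"

# Filtering goes through the same dynamic lookup as any other access.
people = [person]
matches = [p for p in people if p.city == "Hamburg"]
```

The point of the sketch is that selection code such as the list
comprehension never touches a hard-coded instance variable; whether a
database can search or index on values reached this way is exactly what
question 1 asks.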
--
View this message in context: http://forum.world.st/Instance-variable-handling-and-performance-questions-tp4804450.html
Sent from the GLASS mailing list archive at Nabble.com.