[Glass] upgrade speed sanity check

Paul DeBruicker pdebruic at gmail.com
Mon Sep 23 18:21:36 PDT 2019


Thanks Dale

After the load script finished, it showed:


...finished baseline
	Migrated 0 instances for 125 classes.




So it doesn't look like there was a massive number of migrations.  The SPC
cache was only 1GB, so that likely had something to do with the time
taken.  I'll bump it to 2GB and the number of threads to 2, which I think
is the limit of the license.
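
For reference, the SPC size is a stone config setting; a minimal sketch of
the change I have in mind (the path below is the usual GsDevKit_home
location and may differ on your install; the value is in KB):

     # $GS_HOME/server/stones/<stoneName>/<stoneName>.conf  (check your install)
     # bump the shared page cache from 1GB to 2GB
     SHR_PAGE_CACHE_SIZE_KB = 2097152;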


Paul



GLASS mailing list wrote
> On 9/23/19 3:04 PM, Paul DeBruicker via Glass wrote:
>> Thanks Dale,
>>
>> I did use the upgradeStone script.  That script finished relatively
>> quickly (~10-15 mins IIRC).  Reloading my project code is what was taking
>> a really long time.
> Ahhh, okay .. then I suspect that you are getting hit by a bunch of 
> migrations while loading your application code ...
>> Sorry I neglected to mention that in my prior email.  It finished about
>> an hour ago and seems to have worked OK.
> 
> I saw the copy of the script that you used to upgrade your application
> code and you _are_ using GsDeployer>>deploy:, so that is a good thing
> ... presumably the migration time would have been worse if you hadn't
> used GsDeployer>>deploy: ...
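
(For anyone following along, the load script wraps the project load roughly
like this; the project name and repository URL below are placeholders, not
my actual code:)

     GsDeployer deploy: [
          "load the application code; GsDeployer defers the bulk class
           migration until this block completes"
          Metacello new
               baseline: 'MyApp';                                       "placeholder"
               repository: 'github://example/MyApp:master/repository';  "placeholder"
               load ].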
> 
> When the deploy block completes it should have written to the log the 
> number of classes and instances that were migrated .... and presumably 
> this was fairly high, given how long it took ... if you look at 
> MCPlatformSupport class>>listInstances:, you'll see that a single thread 
> is being used to do the migration ... if you have a multi-core machine 
> you could bump up this number to match the number of cores you have and 
> do the list instances step in parallel ....
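
(For context, a rough sketch of the kind of scan that is happening under
the covers; this is the plain single-threaded call, since the thread-count
knob Dale mentions lives inside MCPlatformSupport class>>listInstances: and
the exact multi-threaded selector varies by GemStone version, so treat this
as an illustration only:)

     | classes results |
     classes := Array with: OrderedCollection with: Dictionary.  "example classes only"
     "scan the whole repository; answers one collection of instances per class
      (check the Repository>>listInstances: method comment for the exact shape)"
     results := SystemRepository listInstances: classes.
     1 to: classes size do: [:i |
          GsFile gciLogServer: (classes at: i) name asString , ' -> ' ,
               (results at: i) size printString , ' instances' ].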
> 
>>
>> I did use a fresh install of GsDevKit_home for the upgrade, but the 8GB
>> stone has come all the way from 2.4.4.1, so it could be wonky in
>> interesting ways.
> 
> Ahhh, this will be a pretty old version of GLASS and friends, so I can
> imagine that you probably ended up with a lot of classes with shape
> changes, and they are scanned for instances whether or not there are any
> ... listInstances is a read-only operation, so the large volume of bytes
> written to disk indicates that you ended up with a number of instance
> migrations ... I'm curious whether you'd done an MFC before doing the
> upgrade ... if not, you might have migrated a fair number of dead
> objects ...
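
(For the record, the pre-upgrade MFC Dale is referring to is just a
markForCollection run against the old stone before starting the upgrade,
e.g. from a topaz session, and then letting the reclaim gems finish before
shutting the stone down:)

     run
     "get rid of dead objects so they are not scanned or migrated later"
     System abortTransaction.
     SystemRepository markForCollection.
     %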
> 
>>
>> In the upgradeLog directory, topaz.out was the last edited file and its
>> last line is "GsDevKit Upgrade completed (GLASS/GLASS1/GsDevKit/tODE). No
>> errors detected".
>>
>>
>> Looking in the logs for the stone, the only thing that stands out is that
>> the reclaim gc gem log for the session where I upgraded the project code
>> is 241,400 lines long (~21MB).  The last 10 lines are:
>>
>> --- 09/23/2019 13:17:55.601 PDT reclaim abort: 239093 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:19:56.128 PDT reclaim abort: 239094 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:21:56.650 PDT reclaim abort: 239095 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:23:57.170 PDT reclaim abort: 239096 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:25:57.688 PDT reclaim abort: 239097 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:27:58.232 PDT reclaim abort: 239098 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:29:58.788 PDT reclaim abort: 239099 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:31:59.339 PDT reclaim abort: 239100 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:33:59.888 PDT reclaim abort: 239101 viewCr 754 numDp 36 deadNotRecl 0
>> --- 09/23/2019 13:36:00.435 PDT reclaim abort: 239102 viewCr 754 numDp 36 deadNotRecl 0
>>
>>
>> The reclaimgcgem.log for the session the other day where I ran the
>> upgradeStone command is 112,397 lines long (~10MB).  Its last lines are:
>>
>> --- 09/20/2019 16:58:03.711 PDT reclaim abort: 112234 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:02:47.350 PDT reclaim abort: 112235 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:04:47.412 PDT reclaim abort: 112236 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:06:47.743 PDT reclaim abort: 112237 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:08:48.056 PDT reclaim abort: 112238 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:10:48.351 PDT reclaim abort: 112239 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:12:48.808 PDT reclaim abort: 112240 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:14:49.282 PDT reclaim abort: 112241 viewCr 12443 numDp 31 deadNotRecl 0
>> --- 09/20/2019 17:15:33.616 PDT Logging out
>> --- 09/20/2019 17:15:37.437 PDT Exiting
>>
>>
>> I'm happy to send you whatever else you want that might help.
> 
> Since it did finish and appears to be in good shape, I'm not sure there
> is too much left to be done ...
> 
> In general if you have a tight upgrade window for a production stone, it 
> is prudent to run the upgrade on a backup of production on another 
> machine to work out these kinds of issues ahead of time ... running a 
> statmon during these test runs would also supply information that would
> tell you whether or not you are disk i/o bound ... bumping up the
> number of threads used for list instances is also a good idea if that's
> where you're spending time ...
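
(Noted; a dry run on a copy looks roughly like the following. The stone
names are placeholders and the upgradeStone argument order is from memory,
so check `$GS_HOME/bin/upgradeStone -h` and `statmonitor -h` on your
version before relying on it:)

     # 1. restore a copy of the production extent into a throwaway
     #    GsDevKit_home stone, then upgrade that copy
     $GS_HOME/bin/upgradeStone prodCopy_331 testStone_350 3.5.0

     # 2. while reloading the application code into the upgraded copy, sample
     #    the stone so the statmonitor data shows whether the run is i/o bound
     statmonitor testStone_350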
> 
> Dale
> 
>>
>> Paul
>>
>>
>>
>> GLASS mailing list wrote
>>> On 9/23/19 12:36 PM, PAUL DEBRUICKER via Glass wrote:
>>>> Hi -
>>>>
>>>> I'm trying to upgrade an 8GB and a 30GB stone from 3.3.1 to 3.5 on a Mac
>>>> first (while troubleshooting my upgrade process), then possibly on Ubuntu
>>>> 18.04 if need be.  Both machines have NVMe SSDs with plenty of room and
>>>> plenty of RAM and CPU.
>>>>
>>>> Right now the extent and tranlogs are both on the same device, in the
>>>> default GsDevKit_home structure.  On the Mac I could move the tranlogs
>>>> to an SDHC card, and on the Ubuntu machine I could move them to a SATA
>>>> drive.
>>>>
>>>> While upgrading the 8GB stone I'm seeing I/O speeds in line with the
>>>> drive's published specs of around 250-400MB/s on the Mac.  And it has
>>>> read 1.35TB of data and written 105GB of data in about 40 minutes.
>>>>
>>>> Right now I assume things are fine and going smoothly but because it
>>>> might be taking longer than it should I wanted to check:
>>>>
>>>> 1. Should it take more than 40 minutes to upgrade an 8GB stone?
>>>> 2. Should the 30GB stone (that has different objects/code/etc.) take ~4x
>>>> as long?
>>>> 3. Is there any way to tell ahead of time how long an upgrade will take?
>>> No ... which should answer questions 1 and 2 ... however, it does seem
>>> that it is taking a long time ...
>>>
>>> What scripts are you using to do the upgrade? I'll assume that you are
>>> using $GS_HOME/bin/upgradeStone to do the upgrade ... if you are running
>>> different scripts that would be useful to know ...
>>>
>>> Assuming upgradeStone, there should be an upgradeLog directory in the
>>> directory of the stone that you are upgrading to ... in that directory
>>> there should be a bunch of log files, and an `ls -l` will give us a clue
>>> of where you are in the upgrade process.
>>>
>>> Here's an example directory from an upgrade of a 500M 3.4.3 extent to
>>> 3.5.0:
>>>
>>>      rogue:upgradeLog>ls -altr
>>>      total 11456
>>>      drwxr-xr-x 11 dhenrich smalltalk     4096 Sep 23 13:23 ..
>>>      -rw-r--r--  1 dhenrich smalltalk        0 Sep 23 13:23 topazerrors.log
>>>      -rw-r--r--  1 dhenrich smalltalk 11286737 Sep 23 13:24 upgradeImage.out
>>>      -rw-r--r--  1 dhenrich smalltalk       33 Sep 23 13:24 conversion.conf
>>>      -rw-r--r--  1 dhenrich smalltalk     1254 Sep 23 13:24 installGsDevKit_upgrade.topaz
>>>      -rw-r--r--  1 dhenrich smalltalk      253 Sep 23 13:24 createGsDevKit_upgrade.topaz
>>>      -rw-r--r--  1 dhenrich smalltalk     3696 Sep 23 13:24 createGsDevKit_upgrade.out
>>>      -rw-r--r--  1 dhenrich smalltalk     5527 Sep 23 13:24 installGsDevKit_upgrade_topaz.log
>>>      -rw-r--r--  1 dhenrich smalltalk   112384 Sep 23 13:24 installGsDevKit_upgrade.out
>>>      -rw-r--r--  1 dhenrich smalltalk      533 Sep 23 13:24 prepareGsDevKitImage_pragma_user.topaz
>>>      -rw-r--r--  1 dhenrich smalltalk     6942 Sep 23 13:24 prepareGsDevKitImage_pragma_user.log
>>>      -rw-r--r--  1 dhenrich smalltalk      462 Sep 23 13:24 prepareImage_pragma_systemuser.topaz
>>>      -rw-r--r--  1 dhenrich smalltalk     6874 Sep 23 13:24 prepareImage_pragma_systemuser_topaz.log
>>>      -rw-r--r--  1 dhenrich smalltalk      551 Sep 23 13:24 prepareGsDevKitImage.topaz
>>>      -rw-r--r--  1 dhenrich smalltalk   118768 Sep 23 13:25 prepareGsDevKitImage.log
>>>      -rw-r--r--  1 dhenrich smalltalk      525 Sep 23 13:25 cleanupGsDevKitImage.topaz
>>>      drwxr-xr-x  2 dhenrich smalltalk     4096 Sep 23 13:25 .
>>>      -rw-r--r--  1 dhenrich smalltalk   118379 Sep 23 13:25 upgradeSeasideImage.out
>>>      -rw-r--r--  1 dhenrich smalltalk     6436 Sep 23 13:25 cleanupGsDevKitImage.log
>>>      -rw-r--r--  1 dhenrich smalltalk    13452 Sep 23 13:25 topaz.out
>>>
>>> Sending me the newest log file and the `ls -l` list should be enough
>>> information for me to figure out where you are in the upgrade process
>>> and we can go from there ...
>>>
>>>> 4. Are there any optimizations or a pre-upgrade checklist I should look
>>>> at or go through that isn't covered in the install guide?
>>> no .... I'm guessing that you are getting hit by "excessive migrations",
>>> and that you _might_ be using an older version of GsDevKit_home (the SHA
>>> of the $GS_HOME commit should answer that question ... I can also tell by
>>> the list of files in the upgradeLog directory) ... the "modern" upgrade
>>> from 3.3.1 to 3.5.0 should be avoiding most if not all of the migrations
>>> that would cause long upgrade times, but that will actually depend upon
>>> whether or not you are running with a "modern" version of GLASS/TODE in
>>> your 3.3.1 stone or not (the SHAs of the various projects you have
>>> loaded in your image will help figure that out) ... let's start by
>>> looking at the log files and go from there ...
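
(Checking which GsDevKit_home commit a clone is on is quick, for example:)

     cd $GS_HOME
     git log -1 --oneline     # or: git rev-parse HEAD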
>>>> 5. Would moving the tranlogs to a separate disk prior to upgrading help
>>>> substantially?
>>> Maybe, but first we want to determine what's happening ... if the
>>> migrations cannot be avoided, it might make sense to arrange for the
>>> repository scans to use multi-threaded list instances before resorting
>>> to moving tranlogs around, but I first want to see if you are getting
>>> unnecessary migrations.
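
(If it does come to moving the tranlogs, that is a stone config change
along these lines; the directories are examples, the sizes are in MB, and
the parameter names should be double-checked against the System
Administration Guide for your GemStone version:)

     # in the stone's .conf file: put tranlogs on a different device than the extent
     STN_TRAN_LOG_DIRECTORIES = /othervolume/tranlogs/, /othervolume/tranlogs/;
     STN_TRAN_LOG_SIZES = 1000, 1000;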
>>>
>>>
>>> Dale
>>>
>>>





--
Sent from: http://forum.world.st/GLASS-f1460844.html

