[Glass] extent0.dbf grows

Fri Aug 7 05:24:00 PDT 2015

On Thu, Aug 6, 2015 at 8:29 PM, Dale Henrichs <
dale.henrichs at gemtalksystems.com> wrote:

>
>
> On 08/05/2015 02:33 PM, Mariano Martinez Peck wrote:
>
>>
>>
>> Dale, it is not clear to me what the checkpoint interval is. I understood
>> it was at abort/commit/logout... so how is this interval related? Is there a
>> way I can check the stone configuration parameter of this (the 5 minutes)?
>>
> A checkpoint is a gemstone system operation where all data pages in the
> SPC that were written before a given commit are flushed to disk. After a
> checkpoint, gemstone guarantees that all data for a particular commit has
> been written to the extents ... Until a commit is checkpointed, the system
> relies on the records in the tranlog for recovery ... the checkpoint is not
> necessarily the primary point at which pages are freed up in an extent but
> it does represent a bookkeeping boundary that triggers other processing
> that does lead to freeing up extent pages ....
>
> When you recover from a crash, the system looks at the commit that was
> last checkpointed and then scans the tranlogs looking for that commit and
> then reads the tranlogs from that point forwards to complete recovery ....
>

Got it! Now I see why once I restarted from a crash it took 5 minutes to
start :)
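
A minimal sketch of the recovery idea described above, in Python rather than
anything GemStone-specific. The record layout and the names TranlogRecord and
recover are hypothetical and only illustrate "find the last checkpointed
commit, then replay the tranlog from that point forwards":

```python
from dataclasses import dataclass


@dataclass
class TranlogRecord:
    commit_id: int   # hypothetical monotonically increasing commit number
    changes: dict    # object id -> value written by this commit


def recover(extent_state, last_checkpointed_commit, tranlog):
    """Replay every tranlog record committed after the last checkpoint."""
    # The extents are assumed to hold everything up to the checkpoint.
    state = dict(extent_state)
    replayed = 0
    for record in tranlog:
        if record.commit_id <= last_checkpointed_commit:
            continue  # already flushed to the extents by the checkpoint
        state.update(record.changes)
        replayed += 1
    return state, replayed


tranlog = [
    TranlogRecord(1, {"a": 1}),
    TranlogRecord(2, {"b": 2}),
    TranlogRecord(3, {"a": 10}),
    TranlogRecord(4, {"c": 3}),
]
# Commits 1 and 2 were checkpointed before the simulated crash.
recovered, replayed = recover({"a": 1, "b": 2}, 2, tranlog)
print(recovered, replayed)
```

The longer the stretch of tranlog after the last checkpoint, the more records
this loop has to replay, which is why startup after a crash can take a while.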

>
> STN_CHECKPOINT_INTERVAL in the system.conf file is used to customize the
> checkpoint interval ... Given the above the checkpoint interval controls
> how much data would have to be recovered from tranlogs in the event of a
> system crash ... with the default checkpoint interval of 5 minutes, that
> means that 5 minutes' worth of tranlog data will have to be recovered ... on
> the other end of the spectrum, with a checkpoint interval of 5 minutes, that
> means that every 5 minutes the SPC will be scanned for pages that have not
> been written to disk ... tuning the checkpoint interval involves finding a
> balance between scan cpu consumption, disk i/o, and recovery time, SPC size
> and commit rates.
>
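
One way to check the configured value is to read it out of the config file.
The sketch below is Python and rests on assumptions that are not stated in
this thread: that settings are written as "NAME = value;", that "#" starts a
comment, that a later setting overrides an earlier one, and that the value is
in seconds. The sample text is a made-up excerpt, not a real system.conf:

```python
import re


def read_conf_value(conf_text, name):
    """Return the last value assigned to `name`, or None if it is not set."""
    value = None
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # assumed comment syntax
        match = re.match(r"^(\w+)\s*=\s*(.*?)\s*;?$", line)
        if match and match.group(1) == name:
            value = match.group(2)  # assumed: later settings win
    return value


sample = """
# hypothetical excerpt, not a real configuration
STN_CHECKPOINT_INTERVAL = 300;
"""
interval = read_conf_value(sample, "STN_CHECKPOINT_INTERVAL")
print(interval)
```

If the parameter is absent from the file, the function returns None and the
stone would presumably be running with the default of 5 minutes mentioned
above.
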
> If the checkpoint interval is too short you may consume a lot of cpu time
> doing checkpoints without actually writing any dirty pages. If the
> checkpoint interval is too long, it may take a long time to replay tranlogs
> during crash recovery ... There's a third inflection point where the
> checkpoint interval is shorter than the time it takes to write all the
> dirty pages accumulated and you end up in perpetual checkpoint mode - in
> this case you just have to live with the exposure to longer recovery times
> (or take steps to improve disk i/o or ....)
>
> At the end of the day, tuning the checkpoint interval only comes into play
> at higher commit rates ...

Great. Thanks for the explanation. The more I read and understand GemStone,
the more I am impressed by such complex yet reliable software!
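
To make the trade-off concrete for myself, here is a toy model in Python. It
is only a sketch: the function name, the fixed scan cost, and the page rates
are all invented for illustration and are not GemStone measurements. It
flags the "perpetual checkpoint" case, where writing the dirty pages takes
at least as long as the interval itself:

```python
def checkpoint_profile(interval_s, dirty_pages_per_s, write_pages_per_s,
                       scan_cost_s):
    """Toy estimate of what one checkpoint costs for a given interval."""
    dirty_pages = interval_s * dirty_pages_per_s
    # Assumed cost model: a fixed SPC scan plus the time to write dirty pages.
    flush_time_s = scan_cost_s + dirty_pages / write_pages_per_s
    return {
        "dirty_pages": dirty_pages,
        "flush_time_s": flush_time_s,
        # The next checkpoint is due before this one has finished.
        "perpetual_checkpoint": flush_time_s >= interval_s,
        # Per the explanation above: about one interval of tranlog to replay.
        "tranlog_replay_window_s": interval_s,
    }


relaxed = checkpoint_profile(300, 50, 500, scan_cost_s=2)
too_short = checkpoint_profile(2, 50, 500, scan_cost_s=2)
print(relaxed["perpetual_checkpoint"], too_short["perpetual_checkpoint"])
```

In this model the 300 second interval finishes its writes comfortably, while
the 2 second interval never catches up, at the price of a much larger replay
window for the longer interval.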

>
>
>> So for the "system data" to turn into free space, the other gems need to
>> abort/commit/logout AND only after 5 minutes will it actually turn into
>> free space? I ask because I am scheduling some batch jobs running at
>> night, and as soon as they all finish, I run GC...and sometimes it seems I
>> do not really GC what I should have...
>>
> The actual process for freeing up a page is pretty complicated.
> Dirty pages cannot be reclaimed, so I mentioned the checkpoint interval in
> relation to free pages because the checkpoint kicks off processing (in a
> manner of speaking) that leads to the possible creation of free pages ...
> different types of data are stored on pages and the rules for reclaiming a
> page depend upon what type of data is on the page ...
>
> Data pages cannot be reclaimed until there are no active views that could
> possibly reference an object on that page (the object table maps oops to
> pages) so a single live object on a page can keep an entire data page from
> being reclaimed ... So in your case when you run the gc, you are not
> guaranteed that all of the pages housing the objects will be reclaimed. In
> the worst case, each dead object may be on a separate page and you end up
> with no additional free pages ... the reclaim gems do look around for
> "scavengable pages" and will do a certain amount of automatic data
> consolidation and you can get information about "data fragmentation" by
> using Repository>>pagesWithPercentFree: or
> Repository>>fastPagesWithPercentFree:, but as I think I've mentioned before
> the system is very dynamic and over time a system that is running at
> constant rates will achieve an equilibrium in terms of free pages but the
> actual number of free pages will fluctuate within a range that is dictated
> by quite a few different factors.
>
>
Thanks for the explanation Dale.
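
A small Python sketch of why GC does not always give pages back, under the
simplifying assumption that a data page is reclaimable only when every
object on it is dead. The page layout and the function name are made up,
and the model ignores active views and the consolidation done by the
reclaim gems:

```python
def reclaimable_pages(pages, dead):
    """pages: page id -> set of object ids; dead: set of dead object ids."""
    # A page qualifies only if it is non-empty and holds no live object.
    return [pid for pid, objs in pages.items() if objs and objs <= dead]


pages = {
    1: {"a", "b", "c"},
    2: {"d", "e", "f"},
    3: {"g", "h", "i"},
}

# Three dead objects clustered on one page free that page.
clustered = reclaimable_pages(pages, {"a", "b", "c"})
# Three dead objects spread over three pages free nothing.
scattered = reclaimable_pages(pages, {"a", "d", "g"})
print(clustered, scattered)
```

The same number of dead objects yields one free page in the first case and
none in the second, which matches the worst case described above.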

> It's good to keep an eye on things to recognize when unreasonable growth
> is occurring, but I don't think that it is reasonable to expect that the
> system always stay within some strict size limits ...
>
> Of course the challenge is to differentiate between unreasonable growth
> and growth due to real data accumulation, so it's worth diving deep on
> these different subjects so that you can learn about the normal rhythms of
> your own system ...
>
>
Indeed.

> Dale
>
>
>

-- 
Mariano
http://marianopeck.wordpress.com