Transient leaks

ConvertFromOldNGs · Postby **ConvertFromOldNGs** » Fri Aug 07, 2009 10:55 am

by Robert >> Thu, 23 Jun 2005 22:13:25 GMT

The subject of transient leaks has prompted numerous discussions since Jade's launch (e.g. "Transient cleanup and DictIterator" in JADE Tech General).

Memory leaks are an impediment to developing high availability systems in Jade, as eventually any misbehaving code requires that the node is restarted.

Avoiding leaks currently relies on good practice, and developers' diligence - so this leaves plenty of room for oversights!

Are there ways to apply Jade's philosophy - to make programming easier - to this problem?

by allistar >> Fri, 24 Jun 2005 5:57:01 GMT

I have used a number of strategies to combat transient leaks, and they all come down to good coding practices. These include:

1) Always delete a transient object in the epilog of the method that created it.

2) Always delete child controls in the "delete" method of the parent control (for composite controls created at runtime).

Number 1) is important, and if adhered to will all but remove transient leaks. There are occasions when this is not possible, but in a well-designed system these exceptions should be rare.

There are some cases where JADE itself breaks rule number 1), such as the Collection.createIterator method. My rule is to delete all iterators straight after the "endWhile;" *as well as* in the epilog. If an exception happens during iteration and you don't delete the iterator in the epilog, then it won't get deleted (until the process shuts down).
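For readers more familiar with other languages, the "delete it in the epilog" rule resembles a `finally` block. The following is only a rough sketch in Python, not JADE code; `Transient`, `live_transients` and `process` are hypothetical stand-ins invented for the illustration.

```python
# Illustration only: Python stand-ins for a JADE transient and a method epilog.
live_transients = set()  # hypothetical registry of transients not yet deleted


class Transient:
    """Stand-in for a transient object that must be deleted explicitly."""

    def __init__(self, name):
        self.name = name
        live_transients.add(self)

    def delete(self):
        # discard() is idempotent, so a second delete is harmless in this sketch.
        live_transients.discard(self)


def process(items, fail_on=None):
    """Create an 'iterator' transient and delete it on every exit path."""
    it = Transient("iterator")
    try:
        for item in items:
            if item == fail_on:
                raise ValueError(f"cannot process {item!r}")
    finally:
        # Plays the role of the epilog: runs on normal and exceptional exits.
        it.delete()


process([1, 2, 3])
try:
    process([1, 2, 3], fail_on=2)
except ValueError:
    pass
print(len(live_transients))
```

If the `finally` clause is removed, the second call leaves one entry behind in `live_transients`, which is the leak described above.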

I hope this helps.
Allistar.
--
------------------------------------------------------------------
Allistar Melville
Software Developer, Analyst allistar@silvermoon.co.nz
Auckland, NEW ZEALAND

Silvermoon Software
Specialising in JADE development and consulting
Visit us at: http://www.silvermoon.co.nz
*NEW* Simple web access to Jade at: www.silvermoon.co.nz/jhp.html
------------------------------------------------------------------

by Brendan >> Fri, 24 Jun 2005 8:33:53 GMT

In addition to the good advice from Allistar, there is the problem of finding where a leaky transient was created or, more precisely, should have been deleted.

There are two things which can help here. The first is the CardSchema method app.cnCheckForTransients, which searches for undeleted transient objects on shutdown and logs how many there were for each class. This tells you the scale of any transient leak problem you may have (if your users haven't already). The option is enabled via the CardSchema ini file option CheckTransientsOnShutDown=true. Note, however, that this was written long before PeerSchemas and Packages were introduced, so there may be some other places to look as well.

If you have identified that there is a problem, it can be quite tricky in a large app to find where the transients should have been deleted. An approach I have used successfully on a number of projects in the past is two-pronged. One is to put some diagnostic code in the constructor and destructor methods of all transients in, say, a BaseTransient abstract superclass and one or two other places. The other is to reimplement createIterator on Collection (on Btree and List classes) so that you can get diagnostics when an iterator is created. (I couldn't figure out a way to automatically get diagnostics when an iterator was deleted.)

The diagnostic information I write out to a file is in the format
<oid> C <call stack>
for the create and
<oid> D
for the delete.

For iterators, the <oid> C <call stack> is written to a separate file.

Then the diagnostic file is sorted on the first two fields <oid> and [CD] (SortActor helps here) and a JadeScript goes through the file throwing away matching, consecutive <oid> C and <oid> D pairs. What you are left with is a list of all undeleted transients and the call stack indicating where they were created. This is usually enough to fix the problem.
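The sort-and-discard step can be sketched as follows. This is a Python illustration rather than the JadeScript described above, and it makes some assumptions: oids contain no whitespace, the call stack is everything after the second field, and the sample oids and call stacks are invented. The function name `undeleted_transients` is hypothetical.

```python
def undeleted_transients(lines):
    """Return (oid, call stack) for each create record with no matching delete.

    Each line is '<oid> C <call stack>' for a create or '<oid> D' for a delete.
    """
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = line.split(maxsplit=2)
        oid, kind = parts[0], parts[1]
        stack = parts[2] if len(parts) > 2 else ""
        records.append((oid, kind, stack))

    # Sort on the first two fields; 'C' sorts before 'D' for the same oid.
    records.sort(key=lambda r: (r[0], r[1]))

    kept = []
    for oid, kind, stack in records:
        if kind == "D" and kept and kept[-1][0] == oid and kept[-1][1] == "C":
            kept.pop()  # matching create/delete pair: throw both away
        else:
            kept.append((oid, kind, stack))

    return [(oid, stack) for oid, kind, stack in kept if kind == "C"]


# Invented sample log: one transient is deleted, two are not.
log = [
    "2051.1 C Order::process <- App::run",
    "2051.2 C Order::validate <- Order::process",
    "2051.1 D",
    "2060.7 C Report::build <- App::run",
]
for oid, stack in undeleted_transients(log):
    print(oid, stack)
```

What remains is the list of creates without a delete, each with the call stack recorded at creation time.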

You can filter which transients get diagnostics by using switches and reimplementing the methods that log the diagnostics, so that performance is acceptable when running in test mode.

For iterators, the oid of all undeleted iterators is logged and then you can easily find where it was created and what the call stack is.

Cheers, Brendan

by Robert >> Mon, 27 Jun 2005 0:29:55 GMT

I hope the majority of Jade users follow Allistar's coding practices already ... but they (we) still make mistakes. As leakage invariably occurs in exception situations, problems end up in production systems ... so kiss goodbye to high availability until the problem is identified - this in itself can take a while, especially if diagnostic code such as Brendan suggests must be deployed first.

Thanks for the "tips & techniques" material guys, but I posted to the New Feature discussion to explore whether there are longer term opportunities for Jade to "make programming easier", i.e. does the responsibility have to lie with the programmer, or can this be taken off their hands - especially for those who lack Allistar's or Brendan's experience.

e.g.
Should objects created as local variables with no other references be automatically deleted?
Auto-garbage collect is mentioned on various submissions (and as someone suggested, consequently there may be no need for a delete statement). Is this proven technique appropriate to Jade? What are pros, cons and scope? Would this rely on a periodic cleanup process, or would objects be deleted immediately they become unreferenced?
Presuming any AGC facility must be optional to preserve existing functionality (e.g. if anyone accesses transients by oid), what granularity is desirable or necessary, e.g. allow cleanup to be enabled or disabled by process or class?

by allistar >> Mon, 27 Jun 2005 2:14:36 GMT

To correctly implement garbage collection Jade would have to hold a reference count on objects, i.e. how many objects are referencing me. I'd imagine that this in itself would be a large undertaking. Then there's the situation where, for whatever reason, objects are not referenced by anything but should still not be garbage collected. There are ways to get to an object without it being referenced from anything, such as Class.firstInstance and String.asOid(). It would also have to make sure an object is not garbage collected if there are any notifications registered or locks held on that object.
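To make the concern concrete, here is a small Python sketch (not JADE, and not a claim about how Jade would implement collection). It relies on CPython's reference counting, and `Transient` and the `by_oid` table are hypothetical stand-ins for an object that can be reached by oid without anything holding a reference to it.

```python
import weakref


class Transient:
    """Stand-in for a transient object identified by an oid string."""

    def __init__(self, oid):
        self.oid = oid


# Hypothetical lookup table playing the role of "get to an object by its oid".
# Entries are weak: they do not count as references to the object.
by_oid = weakref.WeakValueDictionary()

obj = Transient("T1")
by_oid[obj.oid] = obj

found_while_referenced = by_oid.get("T1") is obj

# Drop the only strong reference. CPython's reference counting reclaims the
# object straight away, so a later lookup by oid no longer finds it.
del obj
found_after_release = by_oid.get("T1") is not None

print(found_while_referenced, found_after_release)
```

Code that expects to find the object again by oid would break under such a scheme, which is why any collector would need to treat these access paths (and notifications and locks) specially or be optional.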

I've always been of the mind that managing memory is the developer's responsibility - if memory is allocated by the developer, then they should deallocate it. Knowing where objects are created and deleted gives a better understanding of how any application works without resorting to "I'll trust the system to do it for me".

Just my 2c.

Allistar.

by Robert >> Mon, 27 Jun 2005 3:09:09 GMT

> Class.firstInstance, and String.asOid

Surely no one should be using these to access transients in mainstream production code ... though as I said, auto cleanup would have to be optional so as not to break any such existing code.

> Jade would have to hold a reference count on objects

Would this be stored or derived, and what is the overhead? Perhaps an object's reference count could be maintained at the same time as its Global Collection (Jade 6.2 roadmap) membership?

> managing memory is the developers responsibility

Did I mention I spent far too much time over the last few months fixing transient leaks for which other developers *did not* take responsibility? My client raises their collective eyebrows at this techno-jargon-incidental-complexity-cr@p, and mumbles something about oracle and microsoft solutions (as business people, they're not going to ask if the alternative has AGC).

by allistar >> Mon, 27 Jun 2005 3:23:08 GMT

The woes of not having sufficiently trained developers, or having a system with no peer-review process. (Not that having these things is a guarantee, but it minimises downstream problems.) There are a lot of things that "irresponsible" developers can do wrong over and above transient leaking. Improper or no locking strategy. Excessive notifications. Improper or no use of collection keys. No inverses. All of these things can breeze past standard QA and not be an issue until 6 months into a production environment.

I understand what you are saying about garbage collection, and if done well it could surely help. The root cause of the problem is not the tool though, it's the user of the tool. I'd say fix the root cause, which comes down to a mixture of education and/or peer-review.

Regards,
Allistar.

by Robert >> Tue, 28 Jun 2005 22:09:30 GMT

I've been doing this long enough to follow best practice, and Allistar's rules are very simple to follow.

However my concerns are based on observation of *real* deployed systems; my last 2 projects (mods to existing systems) had performance and availability problems due to transient leakage (server-based web and fat clients) that took a fair bit of effort to resolve - time I could better spend addressing business related issues.

My comment re exception conditions was in the context of abnormal processing paths, not Jade exceptions (UAT often struggles just to address main path processes). It doesn't take many leaks per day to have cache overflowing within a month. So the offending processes can be bounced - if the process is servicing users, then availability is affected. Users also suffer the performance degradation leading up to the outage.

I hoped to promote some discussion between those I have heard talking passionately about AGC around the "water-cooler" - so I am somewhat disappointed by the lack of contributions ... also no real input again from the plant. So contrary to input on prior postings, it seems fair to conclude that AGC is no longer wanted or needed.

by allistar >> Tue, 28 Jun 2005 22:50:24 GMT

I appreciate the hassle and expense of tracking down transient leaks, especially in code developed over years by developers that have come and gone. Although AGC would most likely remove the issue, I'd much rather other language features be added, such as operator overloading. (I'd expect the effort involved in coding operator overloading would be much less than that of an AGC system.)

Jade (the company) have been releasing some great features of late, a lot of which are aimed at maintaining uptime, concurrent systems and other enterprise features. These are all good ideas, and well received. The language itself hasn't evolved much at all over the last few years, with the exception of interfaces (to be seen in Jade 6.1). It would be good to see other language features which make development easier/faster/more flexible. I see operator overloading on the top of the list.

Sorry to hijack your thread, maybe we should have a hands up of new features we would like?

Allistar.

by John Munro >> Wed, 29 Jun 2005 9:09:02 GMT

I agree with the best practices, though they are difficult to apply to old systems created/modified by many different developers of varying skill and discipline over a long period of time. I also agree that it can be very hard to track down transient leaks.

I can see the arguments for and against AGC (there was a time when I couldn't see any arguments against)... At some point Jade must have decided not to implement it. I'd be interested to hear their rationale.

As far as a feature wishlist, mine would go something like this:

Threading (already on roadmap)
Interfaces (already on roadmap)
Static methods (already on roadmap)
Web Services without IIS (already on roadmap)
Load Balancing (already on roadmap)
Enumerations
Overloading
Method pre/postconditions

The IPC, versioning, reorg and cache improvements on the roadmap look good.

John Munro

FileVision UK Ltd.
Atlantic House
Imperial Way
Reading
RG2 0TD

Telephone: +44 (0) 118 903 6066
Fax: +44 (0) 118 903 6100
Email: john.munro@filevision.com
Web: http://www.filevision.com
