Philosophy on Map Files

Discussions about design and architecture principles, including native JADE systems and JADE interoperating with other technologies
ConvertFromOldNGs
Posts: 5321
Joined: Wed Aug 05, 2009 5:19 pm

Postby ConvertFromOldNGs » Fri Aug 07, 2009 11:25 am

by Craig Shearer >> Wed, 20 Jun 2001 21:27:01 GMT

Hi All

I'm interested in getting consensus on how people use map files, and in particular, how people make decisions about what classes go in what map files, how many are needed, performance implications etc.

Map files are used to store class instances, and are maintained separately from class definitions (in the Class Maps Browser). Each class is assigned a map file, and instances of that class are physically stored in that map file (with the exception of exclusive collection classes - instances of which are stored in the same map file as their parent class).

In previous versions, JADE had a limit of 2GB on map file size, which made it realistically possible to exceed this limit with classes that had a large number of potential instances, or whose objects were large, or both. However, in the current version of JADE, the limit is 2^64 - 1 bytes, effectively making map file size a non-issue (at least in my lifetime), provided you have a physical disk large enough to contain the map file.

So, how do you determine what map files are required? My approach is to clump classes together into related areas and define a map file for each area. Database reorganisations take place on a "per map file" basis so it's a good idea to put large-map-file-producing classes in their own map file to minimize reorg time. But apart from this, is there any performance penalty to having only a handful of map files, or conversely, one map file per class?

I've also been told that it's a good idea to have your Global and Application subclasses in a separate map file from all other classes. The reason is that you can then delete all your persistent objects by using the dbutil utility to delete the map files - something which is infinitely quicker than writing a script to get JADE to delete instances on a per object, or per class basis. If you don't have Global and Application subclasses in their own mapfile, then JADE has to handle recreating them for you when you start up the development environment (for each schema).

Comments?

Craig Shearer
--
Chief Architect
Helix Software Limited
http://www.helix.co.nz
craig.shearer@helix.co.nz
Phone: 025 936 334

by Stephen Persson >> Wed, 20 Jun 2001 23:11:44 GMT

Hi,
I personally get a bit anxious having one large map file, or even any one map file getting too big. I don't know if it's justified, but the thought of having all my eggs in one huge basket doesn't sit well with me - mainly for reasons like you said, with re-orgs, corrupt data/instances etc. Even though I know JADE is pretty good at roll forwards, recoveries and the like, I still like to do all I can to minimise the chances and spread things out a bit.

With all of our apps I would typically use a number of map files for each of the large classes (explained below), and a number of common map files for classes that I know will only ever have a small number of instances - still grouping them into common areas (e.g. financial classes, dictionaries etc.). This grouping would typically match my class structure.

With the classes I know are going to get huge, I create an abstract parent class with all the methods, attributes etc. on it, e.g. Widget. Beneath this class I create a persistent class suffixed by a number, Widget1. I then create a corresponding map file with the same suffix, e.g. MWid1. Whenever I want to create an instance of my Widget class I call a global method mCreateClass(classNamePrefix, mapFileNamePrefix), which checks which map file is the smallest and creates the instance in the corresponding persistent class. E.g. if I called mCreateClass("Widget", "MWid") it would check all the MWid"x" map files for the smallest one, say MWid3, and create the instance in the Widget3 class.
Doing it this way means I don't have to worry about classes/map files getting too big. If the map files start getting too big I can just create some more persistent classes, Widget4, Widget5 etc., and the corresponding map files MWid4, MWid5, and there are now two more map files to help spread out the load: no re-orgs or code rework etc., nice and easy.
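
For illustration, the selection step of the scheme above could look roughly like this. It is a Python sketch rather than JADE code, and it is not the actual mCreateClass: the Partition record, the pick_partition name and the sizes are all made up, and in a real system the sizes would have to come from the database itself.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    class_name: str   # numbered persistent subclass, e.g. "Widget3"
    map_file: str     # map file assigned to that subclass, e.g. "MWid3"
    size_bytes: int   # current map file size (hypothetical figures here)


def pick_partition(partitions, class_prefix, map_file_prefix):
    """Return the numbered partition whose map file is currently smallest."""
    candidates = [
        p for p in partitions
        if p.class_name.startswith(class_prefix)
        and p.class_name[len(class_prefix):].isdigit()
        and p.map_file.startswith(map_file_prefix)
    ]
    if not candidates:
        raise ValueError("no partitions match the given prefixes")
    return min(candidates, key=lambda p: p.size_bytes)


partitions = [
    Partition("Widget1", "MWid1", 900),
    Partition("Widget2", "MWid2", 750),
    Partition("Widget3", "MWid3", 400),
]
chosen = pick_partition(partitions, "Widget", "MWid")
print(chosen.class_name, chosen.map_file)
```

With these made-up sizes the smallest file is MWid3, so the new instance would be created as a Widget3. Adding Widget4/MWid4 later only means adding another entry to the list.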

I know that performance-wise it must slow down the creation of instances a bit, since it has to check the sizes of the map files, but I'm well prepared to live with a few extra hundredths of a second to keep my data nicely spread, and calm my anxious mind...

--
Stephen Persson
Kinetix Group Ltd
Email: stephenp@kinetix.co.nz
WebSite: www.kinetix.co.nz

by Craig Shearer >> Thu, 21 Jun 2001 0:18:47 GMT

Hi Stephen

Thanks for your views... I'm not so sure about the integrity perspective - I think that if one of your map files becomes corrupt you're pretty much stuffed completely.

As for your algorithm for allocating instances to map files, I have seen something similar before, but using a random number. As the number is random (though not truly :-) the instances will tend to spread evenly across map files over time anyway, and I'd say it would be faster to do this than to check which map file is smallest (presumably, you'd need to look at all the map files for the superclass's subclasses - if that makes sense!)

Having said that, I would have thought that such a process would now be obsolete under the new map file size restrictions - though if you really want to keep your map files small, then you'd still have to do this. I'm just wondering whether this is a laudable aim anyway.

Craig.

by Carl Ranson >> Thu, 21 Jun 2001 2:54:31 GMT
> Hi Stephen
>
> Thanks for your views... I'm not so sure about the integrity perspective - I think that if one of your map files becomes corrupt you're pretty much stuffed completely.

I note that the JADE dbutil has a method for recovering data from damaged files via some sort of brute-force technique (I think it's called a "rebuild", from memory). So I expect one could get *some* of the data back in all but the worst cases.

> As for your algorithm for allocating instances to map files, I have seen something similar before, but using a random number. As the number is random (though not truly :-) the instances will tend to spread evenly across map files over time anyway, and I'd say it would be faster to do this than to check which map file is smallest (presumably, you'd need to look at all the map files for the superclass's subclasses - if that makes sense!)

Why would it need to be random? A sequential number mod <num map files> would do it too.
Although neither of these techniques would even out the storage when the number of map files is increased, as the method Stephen describes does. (I know... I'm nitpicking :)
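
To make the nitpick concrete, here is a small Python simulation (a sketch only, nothing to do with JADE internals). It assumes three existing map files holding 300 objects each, two newly added empty files, and 600 further objects of equal size; all of those numbers and the function names are invented for the example.

```python
import random


def allocate(start_counts, n_new, choose):
    """Create n_new objects, letting choose() pick a file index for each one."""
    counts = list(start_counts)
    for i in range(n_new):
        counts[choose(counts, i)] += 1
    return counts


def smallest_first(counts, i):
    # Stephen's approach: always use the file that is currently smallest.
    return counts.index(min(counts))


def round_robin(counts, i):
    # Sequential number mod <num map files>.
    return i % len(counts)


rng = random.Random(0)  # fixed seed only so that reruns are repeatable


def random_pick(counts, i):
    return rng.randrange(len(counts))


start = [300, 300, 300, 0, 0]  # three full-ish files plus two new empty ones
by_smallest = allocate(start, 600, smallest_first)
by_sequence = allocate(start, 600, round_robin)
by_random = allocate(start, 600, random_pick)
print(by_smallest, by_sequence, by_random)
```

Under these assumptions smallest-first ends with 300 objects in every file, while the sequential scheme ends at 420, 420, 420, 120, 120: the new files get their fair share of new objects only, so the old imbalance stays. The random scheme lands near the sequential result, give or take some noise.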

> Having said that, I would have thought that such a process would now be obsolete under the new map file size restrictions - though if you really want to keep your map files small, then you'd still have to do this. I'm just wondering whether this is a laudable aim anyway.
>
> Craig.

I don't see a lot of reason to do it. I suppose you could store each file on a different physical disk, thereby reducing the mean seek time.

Peace of mind is as good a reason as any, but I would think the solution is RAID or mirroring rather than having smaller files.

*** Anyone from the plant care to comment if JADE is more efficient with references between map files?

Cheers,
CR

by Craig Shearer >> Thu, 21 Jun 2001 4:12:54 GMT

Well, a random number is convenient in that you can generate it immediately, whereas if you are using a sequential number, you'd need to have it stored somewhere persistently, and serialise access to it.
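
A rough Python sketch of the difference, with assumptions stated up front: the in-memory counter and lock stand in for what would really be a persistent counter object with serialised access, and the names SequentialAllocator and random_index are invented for this example.

```python
import random
import threading


class SequentialAllocator:
    """Round-robin allocation needs shared state plus serialised access."""

    def __init__(self, n_files):
        self._n_files = n_files
        self._next = 0                  # stand-in for a persistently stored counter
        self._lock = threading.Lock()   # stand-in for locking that counter

    def next_index(self):
        with self._lock:                # every creator queues up here
            index = self._next % self._n_files
            self._next += 1
            return index


def random_index(n_files):
    """Random allocation needs no shared state at all."""
    return random.randrange(n_files)


allocator = SequentialAllocator(3)
sequence = [allocator.next_index() for _ in range(7)]
print(sequence, random_index(3))
```

The sequential version hands out 0, 1, 2, 0, 1, 2, 0, but every caller has to go through the same counter; the random version can be called from anywhere without coordination.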

Craig.

by Stephen Persson >> Thu, 21 Jun 2001 22:28:56 GMT
> Thanks for your views... I'm not so sure about the integrity perspective - I think that if one of your map files becomes corrupt you're pretty much stuffed completely.

I have never had a map file get permanently corrupted, but I would have thought that with separate map files I could use dbutil to remove the one corrupt file, leaving the rest intact. So if one of my, say, correspondence map files (containing all email sent by the system) got corrupted, I wouldn't be mortally wounded if I had to delete that one file and lose my email log.
If it doesn't work this way, why does dbutil give the option to select individual files for deletion, if doing so leaves the rest of the system corrupt?

by Craig Shearer >> Fri, 22 Jun 2001 1:43:16 GMT

Hi Stephen

I agree with your point that it is better to lose just some data than all. Agreed that you can delete just one file, but what happens if you have references to objects in that file - for example, your email objects file - do you have other objects that reference email objects? What would happen if you deleted this file?

Interesting that you have never had a map file corrupted - neither have I. Has anybody? It'd be a shame for you to waste time designing/coding for stuff that is extremely unlikely to happen (and if it did happen, you could restore from backup anyway).

Craig.

by Carl Ranson >> Sat, 23 Jun 2001 2:50:11 GMT

One other possibility did strike me after thinking about this stuff for a while. If you have a business that has seasonal data, it may be beneficial to store each year's data in a different map file by appending the year to the base class name... Invoice1999, Invoice2000 etc.

The map files could then be managed separately. You could have historical data on a compressed drive and this year's data on an uncompressed drive, for instance, or only do regular backups of this year's files.
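
As a sketch of the naming idea (Python, purely illustrative; the helper names are invented and nothing here is a JADE facility), the class for a new object can be derived from its date, and the set of classes split into "this year" and "historical" for backup or placement purposes:

```python
from datetime import date


def yearly_class_name(base_class, when):
    """Invoice + 2000 -> 'Invoice2000'."""
    return f"{base_class}{when.year}"


def split_by_age(class_names, base_class, current_year):
    """Separate this year's class from the historical ones."""
    current, historical = [], []
    for name in class_names:
        year = int(name[len(base_class):])  # assumes names follow the pattern
        (current if year == current_year else historical).append(name)
    return current, historical


name = yearly_class_name("Invoice", date(2000, 6, 23))
current, historical = split_by_age(["Invoice1999", "Invoice2000"], "Invoice", 2000)
print(name, current, historical)
```

Here Invoice2000 would be the only class in the regular backup set, and Invoice1999 a candidate for the compressed drive.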

I'm still not 100% convinced it's worthwhile, but it does present some possibilities.

CR

by John Porter >> Mon, 25 Jun 2001 6:59:47 GMT

A couple of things that haven't been mentioned yet:

1. Keep collections with the data they reference. If you keep them in separate map files and those two map files end up at opposite ends of the physical disk, performance would suffer due to excessive head movement.

2. Logistically there can be problems if an individual map file gets too large. One client had a 1.6 GB map file and even though there was 9 GB of free disk, DiskKeeper Server 5.0 (the retail version - not the freebie) wouldn't defrag it. This can be handled by other means, but it sure is a lot easier if you can leave the auto-defrag taking care of it all. Also, WinZip won't zip up any file over 4 GB, nor will it create a zip file over 4 GB. For one project we had to zip the db (36 GB) up into 4 zip files to avoid the latter limit. Another system I have seen has the former problem - they can't zip up the database at all, which makes it a bitch to transfer across a network.
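
The zip limits in point 2 can be planned for with a few lines of scripting. The Python sketch below is only an illustration under stated assumptions: the file names and sizes are invented, and it budgets on uncompressed size (ignoring compression), so it is conservative about how much fits in one archive.

```python
GB = 1024 ** 3
LIMIT = 4 * GB  # the 4 GB per-file and per-archive limit described above


def plan_archives(file_sizes, limit=LIMIT):
    """Group files into archives of at most `limit` bytes (first-fit decreasing).

    Returns (too_big, batches): files that exceed the limit on their own,
    and lists of file names that can share an archive.
    """
    too_big = sorted(name for name, size in file_sizes.items() if size > limit)
    batches = []
    ordered = sorted(file_sizes.items(), key=lambda item: item[1], reverse=True)
    for name, size in ordered:
        if size > limit:
            continue
        for batch in batches:
            if batch["total"] + size <= limit:
                batch["files"].append(name)
                batch["total"] += size
                break
        else:  # no existing archive has room, so start a new one
            batches.append({"total": size, "files": [name]})
    return too_big, [batch["files"] for batch in batches]


sizes = {"a.dat": 5 * GB, "b.dat": 3 * GB, "c.dat": 2 * GB, "d.dat": 1 * GB}
too_big, batches = plan_archives(sizes)
print(too_big, batches)
```

With these invented sizes, a.dat cannot be zipped at all, b.dat and d.dat share one archive, and c.dat gets its own. Keeping individual map files well under the limit is what makes a plan like this possible in the first place.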

Cheers,
John P

by Craig Shearer >> Tue, 26 Jun 2001 2:53:34 GMT

Hi John

Good stuff, but in relation to point #1, are you aware that an exclusive collection (the most common type) is stored in the map file of the parent object, not in the map file defined for the collection class?

For example, if I have a ProductDict of Product objects stored in the product map file, and have a Customer class stored in the customer map file, then have an allMyProducts exclusive collection on Customer, then the ProductDict instance for each Customer will reside in the customer map file.
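
The rule can be written down as a tiny model. This is a Python sketch of the behaviour described above, not of how JADE implements it; the Instance record and stored_in function are invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instance:
    class_name: str
    class_map_file: str                             # map file assigned to the class
    exclusive_parent: Optional["Instance"] = None   # set for exclusive collections


def stored_in(obj):
    """Exclusive collections live in their parent's map file."""
    if obj.exclusive_parent is not None:
        return stored_in(obj.exclusive_parent)
    return obj.class_map_file


customer = Instance("Customer", "customer")
all_my_products = Instance("ProductDict", "product", exclusive_parent=customer)
shared_dict = Instance("ProductDict", "product")    # a non-exclusive ProductDict

print(stored_in(all_my_products), stored_in(shared_dict))
```

So the allMyProducts dictionary ends up in the customer map file even though ProductDict is mapped to the product map file, while a ProductDict that is not an exclusive collection stays in the product map file.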

Craig.

