Improving Search Time

M45HY
Posts: 63
Joined: Wed Jul 11, 2012 7:32 am
Location: Mansfield, Nottinghamshire, UK

Improving Search Time

Postby M45HY » Fri Apr 26, 2013 3:05 am

Hi guys,

I had a quick question and would be grateful if someone could provide feedback.

I currently have a setup with roughly 2000+ data records in it. When I access the records in the dataset via the Web, I have to use an iterator, which tends to take a couple of minutes. Does anyone know how I can improve the search time?

Example:
Imagine that you have a dataset called 'Customer', which has 200,000+ records in it (some of which are duplicated for a specific reason). When you try to find the most recent record of a customer who has multiple records (the record with the highest customer reference number), you have to loop through the records and evaluate the data to work out which one is the most recent.

Is there a way to make the time it takes to loop through (not evaluate/change) the data as short as possible? I’ve also tried looking at the link: https://forums.jadeworld.com/viewtopic. ... +key#p6956 but I feel as though this would take a long time to run if applied.

Thanks
Omash
"If you can't explain it simply, you don't understand it well enough." - Albert Einstein

murray
Posts: 144
Joined: Fri Aug 14, 2009 6:58 pm
Location: New Plymouth, New Zealand

Re: Improving Search Time

Postby murray » Fri Apr 26, 2013 10:30 am

Have you considered adding a collection with appropriate keys for the query?
If you already have a collection for the data set, then it is just a matter of adding another inverse.
Using your example, a dictionary with keys [customerNumber, referenceNumber] may be suitable.

Otherwise, look for ways that you can partition and segment the data set when searching, to limit the number of objects to be fetched.
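
To make the difference concrete, here is a minimal sketch in Python rather than JADE. It is only an illustration of the idea, not JADE's dictionary API: the record layout, the CustomerIndex class and its method names are all hypothetical, and it is simplified to keep just the highest reference number per customer instead of a fully ordered dictionary.

```python
# Hypothetical record layout: (customer_number, reference_number).
records = [
    (1, 10),
    (2, 11),
    (1, 12),
    (3, 13),
    (1, 14),
]


def latest_by_scan(records, customer_number):
    """Baseline: visit every record and keep the highest reference number."""
    best = None
    for cust, ref in records:
        if cust == customer_number and (best is None or ref > best):
            best = ref
    return best


class CustomerIndex:
    """Keyed lookup maintained on insert (an assumed stand-in for a keyed collection)."""

    def __init__(self):
        # customer_number -> highest reference_number seen so far
        self._latest = {}

    def add(self, customer_number, reference_number):
        current = self._latest.get(customer_number)
        if current is None or reference_number > current:
            self._latest[customer_number] = reference_number

    def latest(self, customer_number):
        # One keyed lookup, no matter how many records exist.
        return self._latest.get(customer_number)


index = CustomerIndex()
for cust, ref in records:
    index.add(cust, ref)

print(latest_by_scan(records, 1))  # touches all five records
print(index.latest(1))             # touches one index entry
```

The scan cost grows with the total number of records, while the keyed lookup does not, which is the point of adding a collection with suitable keys.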
Murray (N.Z.)

JohnP
Posts: 73
Joined: Mon Sep 28, 2009 8:41 am
Location: Christchurch

Re: Improving Search Time

Postby JohnP » Fri Apr 26, 2013 11:13 am

If you add a key of the timestamp, descending, the query should be practically instantaneous. For example, keys of [customerNumber, lastUpdateTimeStamp descending].
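
As a rough illustration of why this is fast, the sketch below (Python, not JADE) keeps entries sorted by customer number ascending and timestamp descending, so the newest entry for a customer is always the first one at that customer's position. The class name, method names and integer timestamps are assumptions made for the example only.

```python
import bisect


class DescendingTimestampIndex:
    """Sorted keys of (customer_number, -timestamp, record_id)."""

    def __init__(self):
        self._keys = []

    def add(self, customer_number, timestamp, record_id):
        # Negating the timestamp makes newer entries sort first within a customer.
        bisect.insort(self._keys, (customer_number, -timestamp, record_id))

    def most_recent(self, customer_number):
        # Binary search for the first key belonging to this customer.
        probe = (customer_number, float("-inf"))
        pos = bisect.bisect_left(self._keys, probe)
        if pos < len(self._keys) and self._keys[pos][0] == customer_number:
            return self._keys[pos][2]
        return None


idx = DescendingTimestampIndex()
idx.add(7, 1000, "a")
idx.add(7, 3000, "c")
idx.add(7, 2000, "b")
idx.add(8, 500, "d")

print(idx.most_recent(7))  # record with the largest timestamp for customer 7
```

The lookup is a binary search followed by reading one entry, so it does not need to visit the customer's older records at all.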

M45HY
Posts: 63
Joined: Wed Jul 11, 2012 7:32 am
Location: Mansfield, Nottinghamshire, UK

Re: Improving Search Time

Postby M45HY » Fri Apr 26, 2013 9:10 pm

Hi guys,

I firstly want to thank you both for replying so quickly - it's most appreciated :)

@Murray: Being new to JADE, I just wanted to check: does the term 'collection' refer to something like a transient array, in which keys can be set in order to store/search, or have I got the wrong end of the stick? If so, we do have one that's been created. I also wasn't entirely sure about the term 'inverse'.

@JohnP: Thanks mate, that's a great idea, but to implement it I would have to reorg a key dataset that's used throughout our system, which would take quite a long time. I'm also not sure whether creating a new dataset would be considered either (as the situation is many-to-many).

Just to provide a bit more information on the situation, imagine a customer that has five records, each with a unique number (i.e. Customer ID). The fifth record is the most recent for the customer. There is also another number (Reference ID), which is the same for all of the records and is taken from the oldest record (for the very first record of a customer, the Customer ID and Reference ID are the same).

An example can be seen attached.

Now for this small example, the search time would be minimal, but if you imagine 2000+ records where this pattern can occur at a large scale, that's where the search time starts to increase.
Attachment: Example.jpg (30.35 KiB)
"If you can't explain it simply, you don't understand it well enough." - Albert Einstein

murray
Posts: 144
Joined: Fri Aug 14, 2009 6:58 pm
Location: New Plymouth, New Zealand

Re: Improving Search Time

Postby murray » Fri Apr 26, 2013 10:56 pm

Hi Omash,

I was referring to adding a persistent collection, with appropriate dictionary keys.
This would require defining a new dictionary class and performing a reorg.
In my experience, 200,000 is not a large data set for re-orging, but it may be worse for you.

I think you need to find out a bit more about Jade's collections and dictionaries.
There's a lot you can do, with automatic updating of multiple indexes possible.
The logical organisation of your data should match the task you need to perform.

In this case you really need to "divide and conquer" to read as few objects as possible.
You definitely do not want to have to read all objects every time.
JohnP's suggestion is a good solution.

The white paper "Performance Design Tips" from the Jade website may be of some help.

Murray.
Murray (N.Z.)

JohnP
Posts: 73
Joined: Mon Sep 28, 2009 8:41 am
Location: Christchurch

Re: Improving Search Time

Postby JohnP » Mon Apr 29, 2013 11:48 am

Given that data structure, the keys for my suggestion would be [ReferenceId, CustomerId descending], and the first one encountered would be the most recent. This collection would be a subclass of MemberKeyDictionary. My experience is similar to Murray's, in that adding a collection like this for a class of 200,000 should be a quick reorg.
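
A small sketch of that ordering, written in Python rather than JADE and using made-up IDs (the values in the attached example are not reproduced here). It sorts rows by Reference ID ascending and Customer ID descending, then takes the first row of each Reference ID group:

```python
from itertools import groupby

# Hypothetical rows: (customer_id, reference_id).
# Five records share reference 101; two share reference 200.
rows = [
    (101, 101), (102, 101), (103, 101), (104, 101), (105, 101),
    (200, 200), (201, 200),
]

# Order as [ReferenceId ascending, CustomerId descending].
ordered = sorted(rows, key=lambda r: (r[1], -r[0]))

# The first row encountered in each ReferenceId group is the most recent record.
latest = {
    ref: next(group)[0]
    for ref, group in groupby(ordered, key=lambda r: r[1])
}

print(latest)
```

In a keyed dictionary the ordering is maintained as records are added, so the sort shown here would not be repeated on every query; it is only included to make the example self-contained.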

M45HY
Posts: 63
Joined: Wed Jul 11, 2012 7:32 am
Location: Mansfield, Nottinghamshire, UK

Re: Improving Search Time

Postby M45HY » Wed May 08, 2013 6:29 am

Hi Guys,

Thank you for your time and your suggestions. Now that I believe I have understood the core of what you were suggesting, I think I may go with JohnP's approach. I never thought about ordering the keys descending and looping through like that.

Thank you very much guys!
"If you can't explain it simply, you don't understand it well enough." - Albert Einstein

