OnDemand Users Group

Support Forums => CMOD for Multiplatforms => Topic started by: wwwalton on April 25, 2013, 12:43:03 PM

Title: Retrieval i/o performance
Post by: wwwalton on April 25, 2013, 12:43:03 PM
So, we run CMOD MP 8.5.6 under AIX 7 with local cache-only storage and DB2 9.5.  We generically index certain PDFs.  Currently a single retrieval of a 100 KB PDF (compressed when stored) takes 0.15 seconds according to the 66 record.  For the new PDFs being designed, the black-and-white version is about 1 MB in size and takes about 1.0 seconds to retrieve.  The color version is much larger, around 4 MB, and retrieval time jumps to 25 seconds.  Since these are served to web users (I used the GUI client to test), retrieval times are critical.  Any ideas as to how this time could be reduced for the color version?
Thanks,
-Walt
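For reference, the three timings above work out to very different effective throughputs: the color PDF is about 4 times larger than the black-and-white one but takes 25 times longer. A quick sketch of that arithmetic in Python, assuming 1 MB = 1024 KB and treating the approximate sizes quoted in the post as exact (the function name is only illustrative):

```python
def throughput_kb_per_s(size_kb, seconds):
    """Effective retrieval throughput in KB per second."""
    return size_kb / seconds

# (label, size in KB, retrieval time in seconds) as quoted in the post
samples = [
    ("100 KB PDF", 100, 0.15),
    ("1 MB black-and-white PDF", 1 * 1024, 1.0),
    ("4 MB color PDF", 4 * 1024, 25.0),
]

for label, size_kb, seconds in samples:
    rate = throughput_kb_per_s(size_kb, seconds)
    print(f"{label}: {rate:.0f} KB/s")
```

If retrieval time scaled with size alone, the three rates would be roughly equal, so the drop for the color file points at something other than raw document size.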
Title: Re: Retrieval i/o performance
Post by: Paul on May 03, 2013, 04:28:31 PM
I have found some issues with retrieval and the Java console version, particularly Java 1.6.0_14 when using ODWEK 8.4.1 and 8.5.0.6.
Title: Re: Retrieval i/o performance
Post by: Trambak on May 20, 2013, 04:02:50 AM
How do you load these PDF documents: with the generic indexer or the PDF indexer? If there is a resource portion that gets pulled along with these documents when using ODWEK, think about caching those resource files externally.
Title: Re: Retrieval i/o performance
Post by: wwwalton on May 20, 2013, 11:54:53 AM
These are all loaded with the generic indexer, so there is no opportunity that I know of to deal with resources separately.  I used the Windows installed client, which I do not believe uses Java but is a natively compiled application.
Thanks,
-ww
Title: Re: Retrieval i/o performance
Post by: Justin Derrick on May 20, 2013, 12:34:37 PM
Do you have compression set to 'disable'?  Also, try increasing your object size.  At 4MB each, you're only getting 2.5 PDFs per CMOD object.  Up to 100MB should be fine.
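
The 2.5 figure is consistent with a 10 MB storage object size. That 10 MB value is an assumption read back from the numbers in this post, not a setting confirmed for this system. A small Python sketch of the arithmetic (the function name is made up for illustration):

```python
def docs_per_object(object_size_mb, doc_size_mb):
    """Average number of documents that fit in one storage object."""
    return object_size_mb / doc_size_mb

DOC_SIZE_MB = 4  # approximate size of the color PDF from the thread

# 10 MB is the assumed current object size; 100 MB is the suggested upper bound.
for object_size_mb in (10, 100):
    count = docs_per_object(object_size_mb, DOC_SIZE_MB)
    print(f"{object_size_mb} MB object: about {count:g} PDFs")
```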

Also, try running a retrieval with 'arsdoc get' on the server.  That will eliminate the possibility of any client PC / network issues.
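
One way to get a comparable wall-clock number for a server-side run is to wrap the command in a small timer. The Python sketch below is a generic helper, not part of CMOD; it deliberately does not spell out any 'arsdoc get' arguments, so the command list shown is a harmless placeholder to be replaced with your own invocation.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Run a command and return (elapsed wall-clock seconds, exit code)."""
    start = time.perf_counter()
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return time.perf_counter() - start, result.returncode

# Placeholder command that does nothing; substitute the full
# 'arsdoc get' command line you would normally run on the server.
placeholder = [sys.executable, "-c", "pass"]
elapsed, exit_code = time_command(placeholder)
print(f"exit code {exit_code}, {elapsed:.2f} s")
```

Running it a few times for each document size helps separate a consistently slow retrieval from a one-off delay.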

-JD.
Title: Re: Retrieval i/o performance
Post by: wwwalton on May 21, 2013, 12:38:09 PM
Thanks for your response Derek, just a couple of questions.  Since I was using the times from the '66' record, the client shouldn't matter, right?

Also, the doc on compression says:
Disable
OnDemand does not compress the input data. Choose this option when the input data is already compressed, such as a compressed TIFF. The documents are uncompressed by the appropriate viewer on the client, for example, Acrobat Reader.
None
OnDemand does not compress the input data when loading it into the system. When the user selects a document for viewing, OnDemand compresses the document before transmitting it over the network and will uncompress the document on the client.
So, if I read this correctly, I still pay a penalty storing uncompressed as time will be taken to compress/decompress anyway?
Thanks again for your input.
-ww
Title: Re: Retrieval i/o performance
Post by: wwwalton on May 21, 2013, 12:39:12 PM
Ooops, meant Justin.   :-[
Title: Re: Retrieval i/o performance
Post by: Paul on May 21, 2013, 03:54:13 PM
Shouldn't you use OD77 since these are documents and not TIFFs?
Title: Re: Retrieval i/o performance
Post by: Justin Derrick on May 21, 2013, 04:08:26 PM
Yup, WW is definitely right -- set compression to 'none', and not 'disable'.  (My bad, sorry!) 

Paul:  The overwhelming majority of PDFs use compression by default, and trying to compress them is a waste of time -- the savings would be negligible, and some files actually grow in size due to the overhead of the compression method.
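
The effect is easy to reproduce with any general-purpose compressor. The Python sketch below uses zlib purely as a stand-in (it is not the OD77 method CMOD uses), and random bytes as a stand-in for already-compressed PDF content; both are assumptions for illustration.

```python
import os
import zlib

def compression_ratio(data):
    """Compressed size divided by original size (below 1.0 means it shrank)."""
    return len(zlib.compress(data)) / len(data)

# Random bytes stand in for already-compressed content (illustrative assumption).
already_compressed = os.urandom(1_000_000)
# Repetitive text stands in for uncompressed report data.
plain_text = b"ACCOUNT 000123  BALANCE 0000.00\n" * 30_000

print(f"already compressed: {compression_ratio(already_compressed):.4f}")
print(f"plain text:         {compression_ratio(plain_text):.4f}")
```

The first ratio comes out slightly above 1.0, meaning the "compressed" output is larger than the input, while the repetitive text shrinks to a small fraction of its original size.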
Title: Re: Retrieval i/o performance
Post by: Paul on May 21, 2013, 04:24:07 PM
Thanks, Justin!