Issue indexing PDF file AFTER PCL2PDF


jeffs42885

I believe this may have been touched on before, but I am wondering if anyone has seen this problem.

We are currently implementing PCL2PDF, and I am trying to index a stacked PDF file. I converted the file to PDF and opened it with the OnDemand indexer. I tried triggering off of Page 1 to break the document, and it is not working. It throws an 88 record saying that it cannot find the trigger by page 2 (INDEXSTARTBY=2 because of a cover sheet). I also tried using Re: Dear as a unique trigger, with no luck.

Here is my index information:

COORDINATES=IN
TRIGGER1=UL(0.79,3.22),LR(1.31,3.96),*,'Re: Dear'
FIELD1=UL(4.40,10.27),LR(6.04,10.96),0,(TRIGGER=1,BASE=0)
FIELD2=UL(6.77,10.27),LR(7.27,11.00),0,(TRIGGER=1,BASE=0)
FIELD3=UL(7.39,10.25),LR(8.22,11.00),0,(TRIGGER=1,BASE=0)
INDEX1='rdate',FIELD2,(TYPE=GROUP)
INDEX2='ssn',FIELD1,(TYPE=GROUP)
INDEX3='contract',FIELD3,(TYPE=GROUP)
INDEXSTARTBY=2

Here is what I am seeing in the system log:

ARS4902 Number of input pages = 68
ARS4914 Trigger(s) not found by page 2
ARS4922 ARSPDOCI completed code 1
arsload: 09/26/13 08:19:44 Indexing failed
arsload: Processing failed for file

There should be 10 or so rows loaded into the database. Everything looks good when I look at this file with the PDF indexer.

Alessandro Perucchi

Hello jeffs42885,

Well, one possible cause is that the box around your trigger is too tight. You need to define it larger than strictly needed.

For example, if your trigger is the letter A, then draw the box around the letter A with a margin of something like 5mm on every side. If the box touches the letter, the indexer might have trouble finding the trigger! Silly, but that is the sad truth.

Another way would be to use the "arspdump" command to dump your PDF and get the coordinates that you need to use with the PDF indexer.
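As a rough illustration of the margin idea, here is a small Python sketch that grows the TRIGGER1 box from the first post by 5mm on every side and prints the resulting parameter line. The helper names pad_trigger_box and format_trigger are made up for this example and are not part of OnDemand. It assumes COORDINATES=IN, so the 5mm margin is converted to inches.

```python
MM_PER_INCH = 25.4


def pad_trigger_box(ul, lr, margin_mm=5.0):
    """Grow a box given as UL/LR corner tuples (in inches) by margin_mm per side.

    The upper-left corner is clamped at 0 so it never leaves the page.
    """
    m = margin_mm / MM_PER_INCH
    new_ul = (max(0.0, round(ul[0] - m, 2)), max(0.0, round(ul[1] - m, 2)))
    new_lr = (round(lr[0] + m, 2), round(lr[1] + m, 2))
    return new_ul, new_lr


def format_trigger(n, ul, lr, text):
    """Format a TRIGGERn line in the same layout as the one in the first post."""
    return (
        f"TRIGGER{n}=UL({ul[0]:.2f},{ul[1]:.2f}),"
        f"LR({lr[0]:.2f},{lr[1]:.2f}),*,'{text}'"
    )


# Original box from the post: UL(0.79,3.22),LR(1.31,3.96)
padded_ul, padded_lr = pad_trigger_box((0.79, 3.22), (1.31, 3.96))
print(format_trigger(1, padded_ul, padded_lr, "Re: Dear"))
```

The padded box is only a starting point. Whether 5mm is enough depends on where the indexer actually sees the text on the page.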

Hope that helps a little bit.

Sincerely yours,
Alessandro

jeffs42885

Thanks Alessandro.

Unfortunately, I tried that and it did not work. It is still saying that it is unable to find the trigger by page 2.


pankaj.puranik

Jeff

Run arspdump as Alessandro suggested.
Then, in the arspdump output, search for the trigger string 'Re: Dear'.
Compare the UL and LR values reported there with the ones in your trigger definition:
TRIGGER1=UL(0.79,3.22),LR(1.31,3.96),*,'Re: Dear'
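
To make the comparison concrete, here is a minimal Python sketch that checks whether the box reported for the text lies completely inside the trigger box. The dumped coordinates below are hypothetical placeholders, not real arspdump output, and the sketch assumes both boxes use the same units and the same page origin.

```python
def box_contains(outer_ul, outer_lr, inner_ul, inner_lr):
    """Return True if the inner box lies entirely within the outer box."""
    return (
        outer_ul[0] <= inner_ul[0]
        and outer_ul[1] <= inner_ul[1]
        and outer_lr[0] >= inner_lr[0]
        and outer_lr[1] >= inner_lr[1]
    )


# Trigger box from the index parameters in the first post.
trigger_ul, trigger_lr = (0.79, 3.22), (1.31, 3.96)

# Hypothetical values standing in for what the dump reports for 'Re: Dear'.
dumped_ul, dumped_lr = (0.75, 3.25), (1.40, 3.90)

if box_contains(trigger_ul, trigger_lr, dumped_ul, dumped_lr):
    print("Trigger box covers the text")
else:
    print("Trigger box does not fully cover the text; widen it")
```

If the check fails, widen the trigger box until it encloses the dumped coordinates with some margin to spare.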

jeffs42885

I tried doing the arspdump method and that did not work.

Is it possible to index a PDF off of white space?