Statement Indexing Troubleshooting

tjspencer2

We are outsourcing the creation of statements, and we're using PDF files and PDF Indexer.

My vendor provided me a file of PDF statements that they said had 3039 statements.

When I load the statements into CMOD via PDF Indexer, I only get 3038??

We are using "Page 1 of" as our trigger to uniquely identify the first page of a statement.

When I open their file in Adobe X and do an advanced search I find 3039 instances of the phrase "Page 1 of"

But PDF Indexer only loads 3038 statements??

I've looked at this phrase across all 3039 statements but can only get 3038 to load.

Has anybody ever encountered this?  How could I troubleshoot?  I'm baffled!! :(

jeffs42885

I've seen this before and it was a headache to troubleshoot.

You mentioned that you've looked at this phrase across all 3039 statements, but just out of curiosity: tucked away in this document, could there be a statement with two pages? In the case I saw, the "To:" address had extra lines.

Examples

This would load one page/statement working as expected

Jeff S
1234 Main St
Anytown CA, 90210

Something like either of these would cause the statement to spill over onto the next page:

Jeff S
CMOD Person
1234 Main St
Anytown CA, 92010

Jeff S
1234 Main St
Suite 234
Anytown CA, 92010

tjspencer2

I think there's something to the first statement rolling over onto the first page of the duplicate statement, with the indexer not interpreting the first page of the second statement as a new statement.

Just so I'm straight on uniqueness: there's nothing ensuring uniqueness of statements, right?

The statement that isn't loading is completely identical to the one that precedes it.

Justin Derrick

Depending on your version of CMOD, uniqueness may be enforced automatically.

tjspencer2

So uniqueness isn't our issue, as there are some accounts for which we create two statements in CMOD.

What's happening is that there are a couple of accounts for which we generate the same statement multiple times and create each copy in CMOD.

For some instances of these, the "Page 1 of" trigger isn't being interpreted as the beginning of a new statement but instead as the continuation of an existing statement.

In our 167,000 statements this happens 5 times, and the result is that each of these 5 statements gets combined with the statement ahead of it :(
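Since the merged statements are all exact repeats of the one before them, one way to narrow 167,000 statements down to a handful of candidates is to look for adjacent identical statements. This is only a rough sketch: it assumes the text of each page has already been extracted into a list of strings with some PDF text tool (the `pages` list and both function names here are made up for illustration), and it says nothing about how PDF Indexer itself evaluates the trigger.

```python
def split_statements(pages, trigger="Page 1 of"):
    """Group page texts into statements; a page containing the trigger starts a new one."""
    statements = []
    for index, text in enumerate(pages):
        # Start a new group on a trigger page (or on the very first page).
        if trigger in text or not statements:
            statements.append([])
        statements[-1].append((index, text))
    return statements


def adjacent_duplicates(statements):
    """Return the first-page index of each statement whose text equals the previous one's."""
    hits = []
    for prev, cur in zip(statements, statements[1:]):
        if [t for _, t in prev] == [t for _, t in cur]:
            hits.append(cur[0][0])
    return hits


# Hypothetical sample data standing in for extracted page text.
pages = [
    "Page 1 of 2 Acct 111", "Page 2 of 2 Acct 111",
    "Page 1 of 1 Acct 222",
    "Page 1 of 1 Acct 222",   # exact repeat of the previous statement
    "Page 1 of 1 Acct 333",
]
statements = split_statements(pages)
print(len(statements))                  # trigger-based count to compare with the load count
print(adjacent_duplicates(statements))  # 0-based page indexes worth inspecting by hand
```

The pages it reports are only places to look first. Comparing them with repeated statements that did load correctly might show what differs on the ones that got merged.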

Is there a way to analyze the PDF file to see control characters that may not be visible when viewing it?  Is it even possible for a control character to be in a PDF file and not be visible?
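On the second question: text extracted from a PDF can contain characters that render as nothing or look like an ordinary space, such as a no-break space or a zero-width space. A minimal sketch for spotting them, assuming the text around the trigger has already been pulled out of the PDF as a string (the `hidden_characters` helper and the sample string are hypothetical, and this may or may not be what is tripping up the indexer):

```python
import unicodedata


def hidden_characters(text):
    """Return (offset, codepoint, name) for characters that are easy to miss on screen."""
    found = []
    for offset, ch in enumerate(text):
        if ch in "\n\r\t":
            continue  # ordinary line breaks and tabs are expected
        category = unicodedata.category(ch)
        # Cc = control, Cf = format (e.g. zero-width space), Zs = space separators
        if category in ("Cc", "Cf") or (category == "Zs" and ch != " "):
            name = unicodedata.name(ch, "<unnamed>")
            found.append((offset, f"U+{ord(ch):04X}", name))
    return found


# Hypothetical sample: looks like "Page 1 of 3" on screen.
sample = "Page\u00a01 of 3\u200b"
for hit in hidden_characters(sample):
    print(hit)
```

Running this over the trigger text of a statement that loaded and one that got merged, then comparing the two results, would show whether the strings really are identical.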

pankaj.puranik

Is it possible for you to share the indexing script/parameter information that you used?