Monday, December 3, 2012

Error Checking MaxL From a Windows Batch Script


Hey Everyone,
I thought I would share a simple way to error check MaxL output using a windows batch.  I know there are many different methods out there, but this is the one that I like to use.

Please note:  I have removed all dynamic references from the code samples below and hard coded sample paths, server, and credentials to make them as generic and as easy to adapt as possible.  Also, I am making the assumption that Essmsh is in the server path so it can be called directly.  In versions 11.1.2.1 and up you will need to call the startmaxl.cmd instead of essmsh as Oracle has phased out essmsh.

Step 1: 

Ensure the MaxL Script is set to spool out to a log file by adding the following code to the beginning of you MaxL.  It is also a good idea to set the time-stamp functionality to "on".
/***************** Record session to a log file *********************/
spool on to C:\LogFiles\log_file_name.log;

/***************** Enable MaxL Timestamp ****************************/
set timestamp on;
Also make sure to include the "spool off" command at the end of your MaxL script to ensure that the log file gets closed by the operating system.  (see code below)
spool off;
exit;

Step 2:

Setup the batch script to call the MaxL and perform the error checking on the MaxL log file.
REM  ----------------------------------------------------------------------------
REM  ---------------- Execute MaxL script to extract level 0 data ---------------
REM  ----------------------------------------------------------------------------
call essmsh C:\MaxL\BackUp_level0.mxl [Here you would have dynamic references including server name, credentials, log file name, log file path, etc.]

findstr /m "ERROR" C:\LogFiles\log_file_name.log
if %errorlevel%==0 ( blat -t me@email.com -priority 1 -s "AM Backup: Level0 backup failed" -body "Level0 backup failed. Aborting Process." -attacht C:\LogFiles\log_file_name.log
exit )

Example of MaxL Log File Error:

Below is an example of a portion of a MaxL log file where we were encountering mysterious network errors.
MAXL> login Username Password on ServerName;

 OK/INFO - 1051034 - Logging in user [Username].
 OK/INFO - 1051035 - Last login on Thursday, June 21, 2012 10:08:07 AM.
 OK/INFO - 1241001 - Logged in to Essbase.

      essmsh timestamp: Thr Jun 21 10:08:07 2012

MAXL> import database APP.DB data from server data_file "DataFile.txt" using server rules_file 'SSD_RULE' on error write to "C:\ERR\dataload.err";

   ERROR - 1042013 - Network error [10054]: Cannot Receive Data.
   ERROR - 1042012 - Network error [10054]: Cannot Send Data.
   ERROR - 1241101 - Unexpected Essbase error 1042013.

So what is this batch script actually doing?

The first step in the batch is to call the MaxL that we setup in Step 1.  This will generate the log file specified in the MaxL script every time it runs (C:\LogFiles\log_file_name.log).  This log file will reside on the server that actually kicks off this script.  If you run it multiple times, it will overwrite the one that existed before.  Note that after this essmsh command, you would normally pass server names, credentials, paths, etc. to the MaxL script for further processing (which we are not going to discuss right now).

After the MaxL script has finished up the next thing the batch will do is look for the word "ERROR" anywhere within the log file.  The findstr function is case sensitive, so beware.

This next step is now testing to see if the line before found the word in the log file.  Now before you look at this next line and say "Hey Scott, your code is incorrect!  You are saying that if there is no error then exit!"  This appears so, but it is not the case.  The line "If %errorlevel%==0" is actually testing the "findstr" line.  If the findstr function does not find the word "ERROR" in our log file, it will throw an error itself.  This is a "good error"!  So if the errorlevel for the findstr function is 0, then we have a problem...

The nested steps within the if statement are pretty simple as well.  The first step is to send an email to me notifying me of the problem, and then the final step closes the batch script so that it does not continue.  More information on the Blat command can be found here.

You can also insert other lines of code within the IF statement.  I usually put in a call to a batch that will backup and archive the log files with an error time-stamp on them or create a flag file notifying other disconnected processes of the error.

That's about it.  As always if you have any questions or find any errors in my posts, please comment so I can respond and correct.

7 comments:

  1. Hey Everyone,
    I am getting a bunch of questions regarding the error message that I displayed in the example.

    ERROR - 1042013 - Network error [10054]: Cannot Receive Data.
    ERROR - 1042012 - Network error [10054]: Cannot Send Data.
    ERROR - 1241101 - Unexpected Essbase error 1042013.

    This was copied from an issue I experienced a while back with one of my ASO outlines corrupting. What was happening was that I was compacting the outline daily using the ESSCMDQ utility. Running this daily was actually causing the outlines to corrupt upon next data load.

    I will post about the compaction utility soon and the issues that I encountered running it daily.

    ReplyDelete
  2. Hey Jake,
    I still have an Oracle SR open and they are trying to replicate the issue. Are you experiencing this same issue? Thanks.

    ReplyDelete
  3. Hi Scott,

    I am having the exact same 3 errors while performing a load into an ASO cube. The load worked perfectly in the past.

    Did you get to the bottom of your issue?

    Kind regards,

    Russell

    ReplyDelete
  4. Hey Russell,
    I have a workaround in place right now in which I run the ASO Compaction much less frequently, but this is still an issue. Oracle cannot seem to replicate it either.

    One of my Blog readers suggested a fix in which I have not tried yet.

    Here is the link to the post, check out the comments section.
    http://scottsoracleblog.blogspot.com/2013/02/aso-outline-compaction-using-esscmdq.html

    Thanks,
    --Scott

    ReplyDelete
  5. Hi Scott,

    Thanks for the reply. The interesting thing is I don't believe the client is compacting the outline...

    I will look at that though. to be honest, I sense we are going to have to re-build the ASO cube. Oracle don't seem to be getting anywhere with the SR...

    Thanks,

    Russell

    ReplyDelete
  6. Russell,
    You should try and defragment your database to see if that fixes the issue (suggested by another reader). I am wondering if there is some sort of corruption stored in the OTL file that is causing your issue. Thanks,
    --Scott

    ReplyDelete