Sunday, March 13, 2011

eInvoice Simple Testing

A simple testing on a quad-cores CPU with 8GB memory running Oracle Enterprise Linux. Storage is a SATA disk.

1. The size of 15M XML eInvoice
[oracle@gis data]$ du -h /home/oracle/data
71G     /home/oracle/data

2. Content of sqlloader control file: filelist.ctl
load data
infile 'filelist.dat'
APPEND
into table einvoice
xmltype(XMLDATA)
(
filename filler char(120),
XMLDATA lobfile(filename) terminated by eof
)

3. Content of filelist.dat (part of, actually is file listing) 
test-000000000-0986.xml
test-000000986-0814.xml
test-000001800-0811.xml
test-000002611-1464.xml
....
Total: 14959 files (15M of invoice number)

4. sqlloader Command
[oracle@gis data]$ sqlldr userid=fsms/password@orcl control=filelist.ctl direct=y
SQL*Loader: Release 11.2.0.1.0 - Production on Fri Mar 11 20:54:13 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.
Load completed - logical record count 14959.

4.1 To find how long does sqlloader take, check the log file
this is a single process sqlloader
Content of log file: filelist.log
SQL*Loader: Release 11.2.0.1.0 - Production on Fri Mar 11 20:54:13 2011


Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.


Control File:   filelist.ctl
Data File:      filelist.dat
  Bad File:     filelist.bad
  Discard File:  none specified


 (Allow all discards)


Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation:    none specified
Path used:      Direct


Table EINVOICE, loaded from every logical record.
Insert option in effect for this table: APPEND


   Column Name                  Position   Len  Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
FILENAME                            FIRST   120           CHARACTER
  (FILLER FIELD)
XMLDATA                           DERIVED     *  EOF      CHARACTER
    Dynamic LOBFILE.  Filename in field FILENAME


The following index(es) on table EINVOICE were processed:
index FSMS.SYS_C0014048 loaded successfully with 14959 keys

Table EINVOICE:
  14959 Rows successfully loaded.
  0 Rows not loaded due to data errors.
  0 Rows not loaded because all WHEN clauses were failed.
  0 Rows not loaded because all fields were null.

Bind array size not used in direct path.
Column array  rows :       1
Stream buffer bytes:  256000
Read   buffer bytes: 1048576

Total logical records skipped:          0
Total logical records read:         14959
Total logical records rejected:         0
Total logical records discarded:        0
Total stream buffers loaded by SQL*Loader main thread:    88408
Total stream buffers loaded by SQL*Loader load thread:        0

Run began on Fri Mar 11 20:54:13 2011
Run ended on Fri Mar 11 23:04:57 2011

Elapsed time was:     02:10:42.70
CPU time was:         02:03:32.10

5. Count The Total Imported Invoice Number
SQL> l
  1  select count(*)
  2  from einvoice,
  3  xmltable('/A0401/Invoice' passing object_value
  4  COLUMNS
  5* invoice_no varchar2(10) path 'Main/InvoiceNumber')
SQL> /

  COUNT(*)
----------
  15000000

6. A Testing Invoice Query Without XMLType Index Created
23:50:00 SQL> l
  1  SELECT
  2  XMLQuery('/A0401/Invoice/Main' passing object_value returning content)
  3  from einvoice p
  4  where
  5* XMLExists('/A0401/Invoice/Main[InvoiceNumber="AA66279380"]' PASSING OBJECT_VALUE)
23:50:07 SQL> /

XMLQUERY('/A0401/INVOICE/MAIN'PASSINGOBJECT_VALUERETURNINGCONTENT)
--------------------------------------------------------------------------------
<Main>
  <InvoiceNumber>AA71600519</InvoiceNumber>
  <InvoiceDate>
    <Year>100


00:15:02 SQL>

Takes 25mins

6. Create a XML Index to improve performance



09:49:49 SQL> CREATE INDEX invoice_ix
10:06:57   2  ON einvoice(object_value)
10:06:57   3  INDEXTYPE IS XDB.XMLINDEX
10:06:57   4  PARAMETERS('PATH TABLE einvoice_path_table
10:06:57   5  PATHS (INCLUDE (/A0401/Invoice/Main/InvoiceNumber /A0401/Invoice/Main/Seller/Identifier /A0401/Invoice/Main/Buyer/Identifier))
10:06:57   6  PIKEY INDEX einvoice_pikey_ix
10:06:57   7  VALUE INDEX einvoice_value_ix')
10:06:58   8  /

Index created.

11:31:30 SQL>

It takes around 01:30:00 

7. Run Testing Invoice Query With XML Index Created

11:39:19 SQL> l
  1  SELECT
  2  XMLQuery('/A0401/Invoice/Main' passing object_value returning content)
  3  from einvoice p
  4  where
  5* XMLExists('/A0401/Invoice/Main[InvoiceNumber="AA66279380"]' PASSING OBJECT_VALUE)
11:39:21 SQL> /

XMLQUERY('/A0401/INVOICE/MAIN'PASSINGOBJECT_VALUERETURNINGCONTENT)
--------------------------------------------------------------------------------
<Main>
  <InvoiceNumber>AA71600519</InvoiceNumber>
  <InvoiceDate>
    <Year>100

11:40:22 SQL>

Takes 1 (00:01:09) minutes. (Compares with previous SQL query without index created, it taked 25 minutes)


7. I am interesting in load data with index already created.

12:08:13 SQL> truncate table einvoice;


Table truncated.


12:08:26 SQL>
Note: Cannot use direct path load cause index is already created.
[oracle@gis data]$ sqlldr userid=fsms/fsms@orcl control=filelist.ctl direct=n

SQL*Loader: Release 11.2.0.1.0 - Production on Mon Mar 14 12:09:25 2011

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

p.s. The load performance is bad compared to previous on which has no index created. It might be improved by pre-allocated table and index space by setting "miniextend"

No comments: