1. The size of 15M XML eInvoice
[oracle@gis data]$ du -h /home/oracle/data
71G /home/oracle/data
2. Content of sqlloader control file: filelist.ctl
load data
infile 'filelist.dat'
APPEND
into table einvoice
xmltype(XMLDATA)
(
filename filler char(120),
XMLDATA lobfile(filename) terminated by eof
)
3. Content of filelist.dat (part of, actually is file listing)
test-000000000-0986.xml
test-000000986-0814.xml
test-000001800-0811.xml
test-000002611-1464.xml
....
Total: 14959 files (15M of invoice number)
4. sqlloader Command
[oracle@gis data]$ sqlldr userid=fsms/password@orcl control=filelist.ctl direct=y
SQL*Loader: Release 11.2.0.1.0 - Production on Fri Mar 11 20:54:13 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 14959.
4.1 To find how long does sqlloader take, check the log file
this is a single process sqlloader
Content of log file: filelist.log
SQL*Loader: Release 11.2.0.1.0 - Production on Fri Mar 11 20:54:13 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Control File: filelist.ctl
Data File: filelist.dat
Bad File: filelist.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation: none specified
Path used: Direct
Table EINVOICE, loaded from every logical record.
Insert option in effect for this table: APPEND
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
FILENAME FIRST 120 CHARACTER
(FILLER FIELD)
XMLDATA DERIVED * EOF CHARACTER
Dynamic LOBFILE. Filename in field FILENAME
The following index(es) on table EINVOICE were processed:
SQL*Loader: Release 11.2.0.1.0 - Production on Fri Mar 11 20:54:13 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Load completed - logical record count 14959.
4.1 To find how long does sqlloader take, check the log file
this is a single process sqlloader
Content of log file: filelist.log
SQL*Loader: Release 11.2.0.1.0 - Production on Fri Mar 11 20:54:13 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
Control File: filelist.ctl
Data File: filelist.dat
Bad File: filelist.bad
Discard File: none specified
(Allow all discards)
Number to load: ALL
Number to skip: 0
Errors allowed: 50
Continuation: none specified
Path used: Direct
Table EINVOICE, loaded from every logical record.
Insert option in effect for this table: APPEND
Column Name Position Len Term Encl Datatype
------------------------------ ---------- ----- ---- ---- ---------------------
FILENAME FIRST 120 CHARACTER
(FILLER FIELD)
XMLDATA DERIVED * EOF CHARACTER
Dynamic LOBFILE. Filename in field FILENAME
The following index(es) on table EINVOICE were processed:
index FSMS.SYS_C0014048 loaded successfully with 14959 keys
Table EINVOICE:
14959 Rows successfully loaded.
0 Rows not loaded due to data errors.
0 Rows not loaded because all WHEN clauses were failed.
0 Rows not loaded because all fields were null.
Bind array size not used in direct path.
Column array rows : 1
Stream buffer bytes: 256000
Read buffer bytes: 1048576
Total logical records skipped: 0
Total logical records read: 14959
Total logical records rejected: 0
Total logical records discarded: 0
Total stream buffers loaded by SQL*Loader main thread: 88408
Total stream buffers loaded by SQL*Loader load thread: 0
Run began on Fri Mar 11 20:54:13 2011
Run ended on Fri Mar 11 23:04:57 2011
Elapsed time was: 02:10:42.70
CPU time was: 02:03:32.10
5. Count The Total Imported Invoice Number
SQL> l
1 select count(*)
2 from einvoice,
3 xmltable('/A0401/Invoice' passing object_value
4 COLUMNS
5* invoice_no varchar2(10) path 'Main/InvoiceNumber')
SQL> /
COUNT(*)
----------
15000000
6. A Testing Invoice Query Without XMLType Index Created
23:50:00 SQL> l
1 SELECT
2 XMLQuery('/A0401/Invoice/Main' passing object_value returning content)
3 from einvoice p
4 where
5* XMLExists('/A0401/Invoice/Main[InvoiceNumber="AA66279380"]' PASSING OBJECT_VALUE)
23:50:07 SQL> /
XMLQUERY('/A0401/INVOICE/MAIN'PASSINGOBJECT_VALUERETURNINGCONTENT)
--------------------------------------------------------------------------------
<Main>
<InvoiceNumber>AA71600519</InvoiceNumber>
<InvoiceDate>
<Year>100
00:15:02 SQL>
XMLQUERY('/A0401/INVOICE/MAIN'PASSINGOBJECT_VALUERETURNINGCONTENT)
--------------------------------------------------------------------------------
<Main>
<InvoiceNumber>AA71600519</InvoiceNumber>
<InvoiceDate>
<Year>100
00:15:02 SQL>
Takes 25mins
6. Create a XML Index to improve performance
09:49:49 SQL> CREATE INDEX invoice_ix
10:06:57 2 ON einvoice(object_value)
10:06:57 3 INDEXTYPE IS XDB.XMLINDEX
10:06:57 4 PARAMETERS('PATH TABLE einvoice_path_table
10:06:57 5 PATHS (INCLUDE (/A0401/Invoice/Main/InvoiceNumber /A0401/Invoice/Main/Seller/Identifier /A0401/Invoice/Main/Buyer/Identifier))
10:06:57 6 PIKEY INDEX einvoice_pikey_ix
10:06:57 7 VALUE INDEX einvoice_value_ix')
10:06:58 8 /
Index created.
11:31:30 SQL>
It takes around 01:30:00
7. Run Testing Invoice Query With XML Index Created
11:39:19 SQL> l
1 SELECT
2 XMLQuery('/A0401/Invoice/Main' passing object_value returning content)
3 from einvoice p
4 where
5* XMLExists('/A0401/Invoice/Main[InvoiceNumber="AA66279380"]' PASSING OBJECT_VALUE)
11:39:21 SQL> /
XMLQUERY('/A0401/INVOICE/MAIN'PASSINGOBJECT_VALUERETURNINGCONTENT)
--------------------------------------------------------------------------------
<Main>
<InvoiceNumber>AA71600519</InvoiceNumber>
<InvoiceDate>
<Year>100
11:40:22 SQL>
Takes 1 (00:01:09) minutes. (Compares with previous SQL query without index created, it taked 25 minutes)
7. I am interesting in load data with index already created.
12:08:13 SQL> truncate table einvoice;
Table truncated.
12:08:26 SQL>
7. I am interesting in load data with index already created.
12:08:13 SQL> truncate table einvoice;
Table truncated.
12:08:26 SQL>
Note: Cannot use direct path load cause index is already created.
[oracle@gis data]$ sqlldr userid=fsms/fsms@orcl control=filelist.ctl direct=n
SQL*Loader: Release 11.2.0.1.0 - Production on Mon Mar 14 12:09:25 2011
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
p.s. The load performance is bad compared to previous on which has no index created. It might be improved by pre-allocated table and index space by setting "miniextend"
No comments:
Post a Comment