This chapter describes the CTX_QUERY PL/SQL package you can use for generating query feedback, counting hits, and creating stored query expressions.
Note:
You can use this package only when your index type isCONTEXT. This package does not support the CTXCAT index type.The CTX_QUERY package includes the following procedures and functions:
| Name | Description |
|---|---|
| BROWSE_WORDS | Returns the words around a seed word in the index. |
| COUNT_HITS | Returns the number hits to a query. |
| EXPLAIN | Generates query expression parse and expansion information. |
| HFEEDBACK | Generates hierarchical query feedback information (broader term, narrower term, and related term). |
| REMOVE_SQE | Removes a specified stored query expression from the SQL tables. |
| RESULT_SET | Executes a query and generates a result set. |
| STORE_SQE | Executes a query and stores the results in stored query expression tables. |
This procedure enables you to browse words in an Oracle Text index. Specify a seed word and BROWSE_WORDS returns the words around it in the index, and an approximate count of the number of documents that contain each word.
This feature is useful for refining queries. You can identify the following words:
Unselective words (words that have low document count)
Misspelled words in the document set
Syntax 1: To Store Results in Table
ctx_query.browse_words(
index_name IN VARCHAR2, seed IN VARCHAR2, restab IN VARCHAR2, browse_id IN NUMBER DEFAULT 0, numwords IN NUMBER DEFAULT 10, direction IN VARCHAR2 DEFAULT BROWSE_AROUND, part_name IN VARCHAR2 DEFAULT NULL
);
Syntax 2: To Store Results in Memory
ctx_query.browse_words(
index_name IN VARCHAR2, seed IN VARCHAR2, resarr IN OUT BROWSE_TAB, numwords IN NUMBER DEFAULT 10, direction IN VARCHAR2 DEFAULT BROWSE_AROUND, part_name IN VARCHAR2 DEFAULT NULL
);
Specify the name of the index. You can specify schema.name. Must be a local index.
Specify the seed word. This word is lexed before browse expansion. The word need not exist in the token table. seed must be a single word. Using multiple words as the seed will result in an error.
Specify the name of the result table. You can enter restab as schema.name. The table must exist before you call this procedure, and you must have INSERT permissions on the table. This table must have the following schema.
| Column | Datatype |
|---|---|
| browse_id | number |
| word | varchar2(64) |
| doc_count | number |
Existing rows in restab are not deleted before BROWSE_WORDS is called.
Specify the name of the result array. resarr is of type ctx_query.browse_tab.
type browse_rec is record ( word varchar2(64), doc_count number ); type browse_tab is table of browse_rec index by binary_integer;
Specify a numeric identifier between 0 and 232. The rows produced for this browse have a value of in the browse_id column in restab. When you do not specify browse_id, the default is 0.
Specify the number of words returned.
Specify the direction for the browse. You can specify one of:
| value | behavior |
|---|---|
BEFORE |
Browse seed word and words alphabetically before the seed. |
AROUND |
Browse seed word and words alphabetically before and after the seed. |
AFTER |
Browse seed word and words alphabetically after the seed. |
Symbols CTX_QUERY.BROWSE_BEFORE, CTX_QUERY.BROWSE_AROUND, and CTX_QUERY.BROWSE_AFTER are defined for these literal values as well.
Specify the name of the index partition to browse.
Browsing Words with Result Table
begin
ctx_query.browse_words('myindex','dog','myres',numwords=>5,direction=>'AROUND');
end;
select word, doc_count from myres order by word;
WORD DOC_COUNT
-------- ----------
CZAR 15
DARLING 5
DOC 73
DUNK 100
EAR 3
Browsing Words with Result Array
set serveroutput on;
declare
resarr ctx_query.browse_tab;
begin
ctx_query.browse_words('myindex','dog',resarr,5,CTX_QUERY.BROWSE_AROUND);
for i in 1..resarr.count loop
dbms_output.put_line(resarr(i).word || ':' || resarr(i).doc_count);
end loop;
end;
Returns the number of hits for the specified query. You can call COUNT_HITS in exact or estimate mode. Exact mode returns the exact number of hits for the query. Estimate mode returns an upper-bound estimate but runs faster than exact mode.
exec CTX_QUERY.COUNT_HITS(
index_name IN VARCHAR2,
text_query IN VARCHAR2,
exact IN BOOLEAN DEFAULT TRUE,
part_name IN VARCHAR2 DEFAULT NULL
) RETURN NUMBER;
exec CTX_QUERY.COUNT_HITS_CLOB_QUERY(
index_name IN VARCHAR2,
text_query IN CLOB,
exact IN BOOLEAN DEFAULT TRUE,
part_name IN VARCHAR2 DEFAULT NULL
) RETURN NUMBER;
Specify the index name.
Specify the query.
Specify TRUE for an exact count. Specify FALSE for an upper-bound estimate.
Specifying FALSE returns a less accurate number but runs faster. Specifying FALSE might return a number which is too high if rows have been updated or deleted since the last FULL index optimize. Optimizing in full mode removes these false hits, and then EXACT set to FALSE will return the same number as EXACT set to TRUE.
Specify the name of the index partition to query.
Use CTX_QUERY.EXPLAIN to generate explain plan information for a query expression. The EXPLAIN plan provides a graphical representation of the parse tree for a Text query expression. This information is stored in a result table.
This procedure does not execute the query. Instead, this procedure can tell you how a query is expanded and parsed before you enter the query. This is especially useful for stem, wildcard, thesaurus, fuzzy, soundex, or about queries. Parse trees also show the following information:
ABOUT query normalization
Query expression optimization
Stop-word transformations
Breakdown of composite-word tokens
Knowing how Oracle Text evaluates a query is useful for refining and debugging queries. You can also design your application so that it uses the explain plan information to help users write better queries.
exec CTX_QUERY.EXPLAIN(
index_name IN VARCHAR2,
text_query IN VARCHAR2,
explain_table IN VARCHAR2,
sharelevel IN NUMBER DEFAULT 0,
explain_id IN VARCHAR2 DEFAULT NULL,
part_name IN VARCHAR2 DEFAULT NULL
);
exec CTX_QUERY.EXPLAIN_CLOB_QUERY(
index_name IN VARCHAR2,
text_query IN CLOB,
explain_table IN VARCHAR2,
sharelevel IN NUMBER DEFAULT 0,
explain_id IN VARCHAR2 DEFAULT NULL,
part_name IN VARCHAR2 DEFAULT NULL
);
Specify the name of the index to be queried.
Specify the query expression to be used as criteria for selecting rows.
When you include a wildcard, fuzzy, or soundex operator in text_query, this procedure looks at the index tables to determine the expansion.
Wildcard, fuzzy (?), and soundex (!) expression feedback does not account for lazy deletes as in regular queries.
Specify the name of the table used to store representation of the parse tree for text_query. You must have at least INSERT and DELETE privileges on the table used to store the results from EXPLAIN.
See Also:
"EXPLAIN Table" in Appendix A, "Oracle Text Result Tables" for more information about the structure of the explain table.Specify whether explain_table is shared by multiple EXPLAIN calls. Specify 0 for exclusive use and 1 for shared use. Default is 0 (single-use).
When you specify 0, the system automatically truncates the result table before the next call to EXPLAIN.
When you specify 1 for shared use, this procedure does not truncate the result table. Only results with the same explain_id are updated. When no results with the same explain_id exist, new results are added to the EXPLAIN table.
Specify a name that identifies the explain results returned by an EXPLAIN procedure when more than one EXPLAIN call uses the same shared EXPLAIN table. Default is NULL.
Specify the name of the index partition to query.
To create an explain table called test_explain for example, use the following SQL statement:
create table test_explain(
explain_id varchar2(30),
id number,
parent_id number,
operation varchar2(30),
options varchar2(30),
object_name varchar2(64),
position number,
cardinality number);
To obtain the expansion of a query expression such as comp% OR ?smith, use CTX_QUERY.EXPLAIN as follows:
ctx_query.explain(
index_name => 'newindex',
text_query => 'comp% OR ?smith',
explain_table => 'test_explain',
sharelevel => 0,
explain_id => 'Test');
Retrieving Data from Explain Table
To read the explain table, you can select the columns as follows:
select explain_id, id, parent_id, operation, options, object_name, position from test_explain order by id;
The output is ordered by ID to simulate a hierarchical query:
EXPLAIN_ID ID PARENT_ID OPERATION OPTIONS OBJECT_NAME POSITION ----------- ---- --------- ------------ ------- ----------- -------- Test 1 0 OR NULL NULL 1 Test 2 1 EQUIVALENCE NULL COMP% 1 Test 3 2 WORD NULL COMPTROLLER 1 Test 4 2 WORD NULL COMPUTER 2 Test 5 1 EQUIVALENCE (?) SMITH 2 Test 6 5 WORD NULL SMITH 1 Test 7 5 WORD NULL SMYTHE 2
CTX_QUERY.EXPLAIN does not support the use of query templates.
You cannot use CTX_QUERY.EXPLAIN with remote queries.
If the query utilizes themes (for example, with an ABOUT query), then a knowledge base must be installed. Such a knowledge base may or may not have been installed with Oracle Text. For more information on knowledge bases, see Oracle Text Application Developer's Guide.
In English or French, this procedure generates hierarchical query feedback information (broader term, narrower term, and related term) for the specified query.
Broader term, narrower term, and related term information is obtained from the knowledge base. However, only knowledge base terms that are also in the index are returned as query feedback information. This increases the chances that terms returned from HFEEDBACK produce hits over the currently indexed document set.
Hierarchical query feedback information is useful for suggesting other query terms to the user.
Note:
CTX_QUERY.HFEEDBACK requires an installed knowledge base. A knowledge base may or may not have been installed with Oracle Text. For more information on knowledge bases, see Oracle Text Application Developer's Guide.
CTX_QUERY.HFEEDBACK is only supported in English and French.
exec CTX_QUERY.HFEEDBACK(
index_name IN VARCHAR2,
text_query IN VARCHAR2,
feedback_table IN VARCHAR2,
sharelevel IN NUMBER DEFAULT 0,
feedback_id IN VARCHAR2 DEFAULT NULL,
part_name IN VARCHAR2 DEFAULT NULL
);
exec CTX_QUERY.HFEEDBACK_CLOB_QUERY(
index_name IN VARCHAR2,
text_query IN CLOB,
feedback_table IN VARCHAR2,
sharelevel IN NUMBER DEFAULT 0,
feedback_id IN VARCHAR2 DEFAULT NULL,
part_name IN VARCHAR2 DEFAULT NULL
);
Specify the name of the index for the text column to be queried.
Specify the query expression to be used as criteria for selecting rows.
Specify the name of the table used to store the feedback terms.
See Also:
"HFEEDBACK Table" in Appendix A, "Oracle Text Result Tables" for more information about the structure of the explain table.Specify whether feedback_table is shared by multiple HFEEDBACK calls. Specify 0 for exclusive use and 1 for shared use. Default is 0 (single-use).
When you specify 0, the system automatically truncates the feedback table before the next call to HFEEDBACK.
When you specify 1 for shared use, this procedure does not truncate the feedback table. Only results with the same feedback_id are updated. When no results with the same feedback_id exist, new results are added to the feedback table.
Specify a value that identifies the feedback results returned by a call to HFEEDBACK when more than one HFEEDBACK call uses the same shared feedback table. Default is NULL.
Specify the name of the index partition to query.
Create a result table to use with CTX_QUERY.HFEEDBACK as follows:
CREATE TABLE restab (
feedback_id VARCHAR2(30),
id NUMBER,
parent_id NUMBER,
operation VARCHAR2(30),
options VARCHAR2(30),
object_name VARCHAR2(80),
position NUMBER,
bt_feedback ctxsys.ctx_feedback_type,
rt_feedback ctxsys.ctx_feedback_type,
nt_feedback ctxsys.ctx_feedback_type
) NESTED TABLE bt_feedback STORE AS res_bt
NESTED TABLE rt_feedback STORE AS res_rt
NESTED TABLE nt_feedback STORE AS res_nt;
CTX_FEEDBACK_TYPE is a system-defined type in the CTXSYS schema.
See Also:
"HFEEDBACK Table" in Appendix A, "Oracle Text Result Tables" for more information about the structure of theHFEEDBACK table.The following code calls the HFEEDBACK procedure with the query computer industry.
BEGIN
ctx_query.hfeedback (index_name => 'my_index',
text_query => 'computer industry',
feedback_table => 'restab',
sharelevel => 0,
feedback_id => 'query10'
);
END;
The following code extracts the feedback data from the result table. It extracts broader term, narrower term, and related term feedback separately from the nested tables.
DECLARE
i NUMBER;
BEGIN
FOR frec IN (
SELECT object_name, bt_feedback, rt_feedback, nt_feedback
FROM restab
WHERE feedback_id = 'query10' AND object_name IS NOT NULL
) LOOP
dbms_output.put_line('Broader term feedback for ' || frec.object_name ||
':');
i := frec.bt_feedback.FIRST;
WHILE i IS NOT NULL LOOP
dbms_output.put_line(frec.bt_feedback(i).text);
i := frec.bt_feedback.NEXT(i);
END LOOP;
dbms_output.put_line('Related term feedback for ' || frec.object_name ||
':');
i := frec.rt_feedback.FIRST;
WHILE i IS NOT NULL LOOP
dbms_output.put_line(frec.rt_feedback(i).text);
i := frec.rt_feedback.NEXT(i);
END LOOP;
dbms_output.put_line('Narrower term feedback for ' || frec.object_name ||
':');
i := frec.nt_feedback.FIRST;
WHILE i IS NOT NULL LOOP
dbms_output.put_line(frec.nt_feedback(i).text);
i := frec.nt_feedback.NEXT(i);
END LOOP;
END LOOP;
END;
The following output is for the preceding example, which queries on computer industry:
Broader term feedback for computer industry: hard sciences Related term feedback for computer industry: computer networking electronics knowledge library science mathematics optical technology robotics satellite technology semiconductors and superconductors symbolic logic telecommunications industry Narrower term feedback for computer industry: ABEND - abnormal end of task AT&T Starlans ATI Technologies, Incorporated ActivCard Actrade International Ltd. Alta Technology Amiga Format Amiga Library Services Amiga Shopper Amstrat Action Apple Computer, Incorporated ..
Note:
TheHFEEDBACK information you obtain depends on the contents of your index and knowledge base and as such might differ from the sample shown.The CTX_QUERY.REMOVE_SQE procedure removes the specified stored query expression.
CTX_QUERY.REMOVE_SQE(query_name IN VARCHAR2);
Specify the name of the stored query expression to be removed.
This procedure executes an XML query and generates a result set in XML. The result set interface can return data views that are difficult to express in SQL, such as top N by category queries.
See Also:
Oracle Text Application Developer's Guide for details on how to use the result set interfaceCTX_QUERY.RESULT_SET ( index_name IN VARCHAR2, query IN VARCHAR2, result_set_descriptor IN CLOB, result_set IN OUT NOCOPY CLOB, part_name IN VARCHAR2 DEFAULT NULL);
Specify the index against which to execute the query.
Specify the query string.
Specify the result set descriptor in XML. It describes what the result set should contain. See "The Input Result Set Descriptor" for more details.
Specify the output result set. If this variable is NULL on input, a session-duration temporary lob will be allocated and returned to the user. The user is responsible for deallocating this temporary lob. See "The Output Result Set XML" for more details.
Specify the index partition name. If the index is global, part_name must be NULL. If the index is partitioned and part_name is not NULL, then the query will only be evaluated for the given partition. If the index is partitioned and part_name is NULL, then the query will be evaluated for all partitions.
The Input Result Set Descriptor
The result set descriptor is an XML message which describes what to calculate for the result set. The elements present in the result set descriptor and the order in which they occur serve as a simple template, specifying what to include in the output result set. That is, there should be the list of hit rowids, then a count, then a token count, and so on. The attributes of the elements specify the parameters and options to the specific operations, such as number of hits in the list of rowids, estimate versus exact count, and so on.
The result set descriptor itself is XML conforming to the following DTD:
<!ELEMENT ctx_result_set_descriptor (hitlist?, group*, count?)> <!ELEMENT hitlist (rowid?, score?, sdata*)> <!ELEMENT group(count?)> <!ELEMENT count EMPTY> <!ELEMENT rowid EMPTY> <!ELEMENT score EMPTY> <!ELEMENT sdata EMPTY> <!ATTLIST group sdata CDATA #REQUIRED> <!ATTLIST hitlist start_num_hit integer #REQUIRED> <!ATTLIST hitlist end_num_hit integer #REQUIRED> <!ATTLIST hitlist order PCDATA #IMPLIED> <!ATTLIST count exact (TRUE|FALSE) "FALSE"> <!ATTLIST sdata name CDATA #REQUIRED>
The following is a description of the possible XML elements for the result set descriptor:
ctx_result_set_descriptor
This is the root element for the result set descriptor. The parent element is none, as are the available attributes.
The possible child elements are:
Zero or more hitlist elements.
Zero or more group elements.
At most one count element.
group
The group element causes the generated result set to include a group breakdown. In other words, a breakdown of the results by SDATA section values. The parent element is ctx_result_set_descriptor, and the available attributes are:
sdata
Specifies the name of the SDATA section to use for grouping. It is required.
Possible child elements of group are:
At most one count element.
hitlist
The hitlist element controls inclusion of a list of hit documents. The parent element is ctx_result_set_descriptor, and the available attributes are:
start_hit_num
This specifies the starting document hit to be included in the generated result set. This can be set to any positive integer less than or equal to 2048. For example, if start_hit_num is 21, then the result set will include document hits starting from the 21st document hit. This element is required.
end_hit_num
This specifies the last document hit to be included in the generated result set. This can be set to any positive integer less than or equal to 2048. For example, if end_hit_num is 40, then the result set will include document hits up to the 40th document hit. This element is required.
order
This is an optional attribute that specifies the order for the documents in the generated result set. The value is a list similar to a SQL ORDER BY statement, except that, instead of column names, they can either be SCORE or SDATA section names. In the following example, MYDATE and MYPRICE are the SDATA section names:
(order = "SCORE DESC, MYDATE, MYPRICE DESC")
The possible child elements are:
At most one rowid element.
At most one score element.
At most one sdata element.
count
This element causes the generated result set to include a count of the number of hit documents. The parent elements are:
ctx_result_set_descriptor
group
The available attributes are:
exact
This is to estimate mode. Set to true or false. It is required, and the default is false.
The possible child elements are none.
rowid
This child element causes the generated result set to include rowid information for each hit. The parent element is hitlist. There are no attributes and no possible child elements.
score
This child element causes the generated result set to include score information for each hit.
The parent element is hitlist.
There are no available attributes, and no possible child elements.
sdata
This child element causes the generated result set to include sdata values for each hit.
The parent element is hitlist.
The available attribute is name. This specifies the name of the sdata section. It is required.
There are no child elements.
The output result set XML is XML conforming to the following DTD:
<!ELEMENT ctx_result_set (hitlist?, groups*, count?)> <!ELEMENT hitlist (hit*)> <!ELEMENT hit(rowid?, score?, sdata*)> <!ELEMENT groups (group*)> <!ELEMENT group (count?)> <!ELEMENT count CDATA> <!ELEMENT rowid CDATA> <!ELEMENT score CDATA> <!ELEMENT sdata CDATA> <!ATTLIST groups sdata CDATA #REQUIRED> <!ATTLIST group value CDATA #REQUIRED> <!ATTLIST sdata name CDATA #REQUIRED>
The following is a description of the list of possible XML elements for the output result set:
ctx_result_set
This is the root element for the generated result set. There are no attributes. The parent is none. The possible child elements are:
At most one hitlist element.
Zero or more groups elements.
groups
This delimits the start of a group breakdown section. The parent element is ctx_result_set. The available attributes are:
sdata
This is the name of the sdata section used for grouping.
The possible child elements are:
Zero or more group elements.
group
This delimits the start of a GROUP BY value. The parent element is the groups element. The available attributes are:
value
This is the value of the sdata section.
The possible child elements are at most one count element.
hitlist
This delimits the start of hitlist information. The parent element is ctx_result_set, while the children are zero or more hit elements. There are no attributes.
hit
This delimits the start of the information for a particular document within a hitlist. The parent element is hitlist, and there are no available attributes. The possible child elements are:
Zero or one rowid elements.
Zero or one score element.
Zero or one sdata element.
rowid
This is the rowid of the document, so the content is the rowid of the document. The parent element is the hit element. There are no child elements, and no available attributes.
score
This is the score of the document. The parent element is the hit element. The content is the numeric score. There are no available attributes, and no possible child elements.
sdata
This is the SDATA value or values for the document. The parent element is the hit element, and the available attribute is name, which is the name of the sdata section. There are no possible child elements available. The content is the SDATA section value, which, for DATE values, is in the format "YYYY-MM-DD HH24:MI:SS", depending upon the actual values being stored.
count
This is the document hit count. The parent element is the ctx_result_set element or the group element. It contains the numeric hit count, and has no attributes, and no possible child elements.
This call to ctx_query.result_set() with the specified XML result_set_descriptor will generate the following information in the form of XML:
top 20 hits displaying, score, rowid, author SDATA section value, and pubDate SDATA section value, order by pubDate SDATA section value DESC and score DESC
total doc hit count for the text query
counts group by pubDate SDATA section values
counts group by author SDATA section values
declare
rs clob;
begin
dbms_lob.createtemporary(rs, true, dbms_lob.session);
ctx_query.result_set('docidx', 'oracle', '
<ctx_result_set_descriptor>
<count/>
<hitlist start_hit_num="1" end_hit_num="5" order="pubDate desc, score desc">
<score/>
<rowid/>
<sdata name="author"/>
<sdata name="pubDate"/>
</hitlist>
<group sdata="pubDate">
<count/>
</group>
<group sdata="author">
<count/>
</group>
</ctx_result_set_descriptor>
', rs);
dbms_lob.freetemporary(rs);
exception
when others then
dbms_lob.freetemporary(rs);
raise;
end;
/
The XML output store in the result set output clob will resemble the following:
<ctx_result_set>
<hitlist>
<hit>
<score>3</score><rowid>AAAPoEAABAAAMWsAAC</rowid>
<sdata name="AUTHOR">John</sdata>
<sdata name="PUBDATE">2001-01-03 00:00:00</sdata>
</hit>
<hit>
<score>3</score><rowid>AAAPoEAABAAAMWsAAG</rowid>
<sdata name="AUTHOR">John</sdata>
<sdata name="PUBDATE">2001-01-03 00:00:00</sdata>
</hit>
<hit>
<score>3</score><rowid>AAAPoEAABAAAMWsAAK</rowid>
<sdata name="AUTHOR">John</sdata>
<sdata name="PUBDATE">2001-01-03 00:00:00</sdata>
</hit>
<hit>
<score>3</score><rowid>AAAPoEAABAAAMWsAAO</rowid>
<sdata name="AUTHOR">John</sdata>
<sdata name="PUBDATE">2001-01-03 00:00:00</sdata>
</hit>
<hit>
<score>3</score><rowid>AAAPoEAABAAAMWsAAS</rowid>
<sdata name="AUTHOR">John</sdata>
<sdata name="PUBDATE">2001-01-03 00:00:00</sdata>
</hit>
</hitlist>
<count>100</count>
<groups sdata="PUBDATE">
<group value="2001-01-01 00:00:00"><count>25</count></group>
<group value="2001-01-02 00:00:00"><count>50</count></group>
<group value="2001-01-03 00:00:00"><count>25</count></group>
</groups>
<groups sdata="AUTHOR">
<group value="John"><count>50</count></group>
<group value="Mike"><count>25</count></group>
<group value="Steve"><count>25</count></group>
</groups>
</ctx_result_set>
This procedure creates a stored query expression. Only the query definition is stored.
Stored query expressions support all of the CONTAINS query operators. Stored query expressions also support all of the special characters and other components that can be used in a query expression, including other stored query expressions.
Users are allowed to create and remove stored query expressions owned by them. Users are allowed to use stored query expressions owned by anyone. The CTXSYS user can create or remove stored query expressions for any user.
CTX_QUERY.STORE_SQE(query_name IN VARCHAR2,
text_query IN VARCHAR2);
CTX_QUERY.STORE_SQE_CLOB_QUERY(query_name IN VARCHAR2,
text_query IN CLOB);
Specify the name of the stored query expression to be created.
Specify the query expression to be associated with query_name.