DAI
From SRB
| Revision as of 23:46, 17 February 2006 132.249.102.242 (Talk | contribs) ← Previous diff |
Revision as of 00:05, 22 February 2006 132.249.102.242 (Talk | contribs) Next diff → |
||
| Line 1: | Line 1: | ||
| - | This document provides a brief note on usage of the | + | This document provides a brief note on usage of the database access interface (DAI). The DAI provides access to tabular data stored in databases using the SRB. This document details how to use the SRB API to ingest into and retrieve from a database tables. The DAI includes mechanisms for tailoring the input and output streams. This is performed by associating a 'template' file with the input which can be used to interpret the data in the input file and convert them into SQL statements for inserting into the database. On the way out, the 'template' file can be used to construct forms and marked-up documents using the tabular data produced by SQL statements. These conversion operations occur on the fly and inside the SRB and is conveniently transparent to the user. We have implemented a template language called T-language which is used for building these templates. We discuss briefly how to use the T-language templates. A separate primer on the T-language is also available in the TLANG.primer file. |
| - | database access interface (DAI). The DAI provides | + | |
| - | access to tabular data stored in databases using the | + | |
| - | SRB. This document details how to use the SRB API to | + | |
| - | ingest into and retrieve from a database tables. | + | |
| - | The DAI includes mechanisms for tailoring the input and output streams. | + | |
| - | This is performed by associating a 'template' file with the input | + | |
| - | which can be used to interpret the data in the input file and convert them | + | |
| - | into SQL statements for inserting into the database. On the way out, | + | |
| - | the 'template' file can be used to construct forms and marked-up documents | + | |
| - | using the tabular data produced by SQL statements. These conversion operations | + | |
| - | occur on the fly and inside the SRB and is conveniently transparent to the | + | |
| - | user. We have implemented a template language called T-language which is | + | |
| - | used for building these templates. We discuss briefly how to use the | + | |
| - | T-language templates. A separate primer on the T-language is also | + | |
| - | available in the TLANG.primer file. | + | |
| We provide brief notes on various stages in using the DAI interface: | We provide brief notes on various stages in using the DAI interface: | ||
| - | 1. Registering a database object with MCAT: | + | ==1. Registering a database object with MCAT:== |
| ------------------------------------------ | ------------------------------------------ | ||
| - | (a) Register the location of the database using ingestLocation command | + | (a) Register the location of the database using ingestLocation command using the string |
| - | using the string | + | "<hostname>:<databasename>:<instancename>" as the netprefix parameter. |
| - | "<hostname>:<databasename>:<instancename>" | + | |
| - | as the netprefix parameter. | + | |
| - | <hostname> is the name of the host where the SRBserver brokering the | + | <hostname> is the name of the host where the SRBserver brokering the database is located. |
| - | database is located. | + | <databasename> is the name of the database. |
| - | <databasename> is the name of the database. | + | <instancename> is the name of the database instance or it can also be the name of the user under which the database is going to be accessed. |
| - | <instancename> is the name of the database instance or it can also be | + | |
| - | the name of the user under which the database is going to | + | |
| - | be accessed. | + | |
| - | Example: ingestLocation 'catharsis dlib1 DB' 'catharsis.sdsc.edu:dlib1:srb' 'catharsis sdsc' srb sdsc | + | Example: ingestLocation 'catharsis dlib1 DB' 'catharsis.sdsc.edu:dlib1:srb' 'catharsis sdsc' srb sdsc |
| - | Currently we provide only one user per resource (i.e., per location). | + | Currently we provide only one user per resource (i.e., per location). |
| - | We plan to provide a more flexible combination in a later release. | + | We plan to provide a more flexible combination in a later release. |
| - | (b) Register the resource using ingestResource command with | + | (b) Register the resource using ingestResource command with |
| - | '' as the default path name, | + | '' as the default path name, |
| - | '??? table database' as the resource type, where | + | '??? table database' as the resource type, where |
| - | ??? can be oracle, db2 or illustra. | + | ??? can be oracle, db2 or illustra. |
| - | + | ||
| - | Example: ingestResource 'ora-dlib1-sdsc' 'oracle table database' 'catharsis dlib1 DB' '' permanent 0 | + | Example: ingestResource 'ora-dlib1-sdsc' 'oracle table database' 'catharsis dlib1 DB' '' permanent 0 |
| (c) Register the database itself as a SRB dataset using the | (c) Register the database itself as a SRB dataset using the | ||
| - | Sregister command and use options | + | Sregister command and use options |
| - | -D 'database shadow object' /* for datatype */ | + | -D 'database shadow object' /* for datatype */ |
| - | and '' (empty string) for RegisteringObjectPath value. | + | and '' (empty string) for RegisteringObjectPath value. |
| + | |||
| + | Example: Sregister -D 'database shadow object' -R ora-dlib1-sdsc '' dlib1 | ||
| - | Example: Sregister -D 'database shadow object' -R ora-dlib1-sdsc '' dlib1 | ||
| - | |||
| - | The shadow object is a unique concept in SRB. Normally, when we | + | The shadow object is a unique concept in SRB. Normally, when we ingest or register an object in SRB, the object is considered to be an atomic object and is retrieved as such when accessed. When we register (shadow objects cannot be ingested) a shadow object in SRB, it implies that the object is not a 'real' object (with its own bits and bytes), but stands for (or is a pointer to) a set or structure of objects. Using the data type 'directory shadow object' one can register a directory in SRB without registering (or ingesting) its component files and sub-directories). In this sense the path name registered for that object is partial and gives a prefix. During actual access, the user provides the suffix portion of the path that is appended to the path. The shadow object in this sense is not a real object but stands for a structured set of objects whose contents/access paths are defined at access time. |
| - | ingest or register an object in SRB, the object is considered to | + | |
| - | be an atomic object and is retrieved as such when accessed. When | + | |
| - | we register (shadow objects cannot be ingested) a shadow object | + | |
| - | in SRB, it implies that the object is not a 'real' object (with | + | |
| - | its own bits and bytes), but stands for (or is a pointer to) a set | + | |
| - | or structure of objects. Using the data type 'directory shadow | + | |
| - | object' one can register a directory in SRB without registering | + | |
| - | (or ingesting) its component files and sub-directories). In this | + | |
| - | sense the path name registered for that object is partial and | + | |
| - | gives a prefix. During actual access, the user provides the suffix | + | |
| - | portion of the path that is appended to the path. The shadow object | + | |
| - | in this sense is not a real object but stands for a structured set of | + | |
| - | objects whose contents/access paths are defined at access time. | + | |
| - | The 'database shadow object' provides a data type for a database. With | + | The 'database shadow object' provides a data type for a database. With the path name being empty, it will allow any query to be sent to the database. One can provide for a more controlled querying by giving a partial path such as 'select ' instead of ''. Note that the default path registered as RegisteringObjectPath can contain variables in them and hence a very flexible but uniquely controllable querying is possible. |
| - | the path name being empty, it will allow any query to be sent to | + | |
| - | the database. One can provide for a more controlled querying | + | |
| - | by giving a partial path such as 'select ' instead of ''. | + | |
| - | Note that the default path registered as RegisteringObjectPath | + | |
| - | can contain variables in them and hence a very flexible but uniquely | + | |
| - | controllable querying is possible. | + | |
| - | Examples: Sregister -D 'database shadow object' -R ora-dlib1-sdsc 'select * from AA ' dlib1 | + | Examples: Sregister -D 'database shadow object' -R ora-dlib1-sdsc 'select * from AA ' dlib1 with the where condition optionally specified by the user. |
| - | with the where condition optionally specified by the user. | + | Sregister -D 'database shadow object' -R ora-dlib1-sdsc '<TEMPLATETYPE>HTMLREL</TEMPLATETYPE> select ' dlib1 which predefines a template to be used for accessing results from this database through this object. |
| - | Sregister -D 'database shadow object' -R ora-dlib1-sdsc '<TEMPLATETYPE>HTMLREL</TEMPLATETYPE> select ' dlib1 | + | |
| - | which predefines a template to be used for accessing results from this database through this object. | + | |
| (d) Enter the three lines (for Oracle) in data/TableConfig file: | (d) Enter the three lines (for Oracle) in data/TableConfig file: | ||
| - | OracleUser '<databasename>:<instancename>' '<username>' | + | OracleUser '<databasename>:<instancename>' '<username>' |
| - | OraclePasswd '<databasename>:<instancename>' '<password>' | + | OraclePasswd '<databasename>:<instancename>' '<password>' |
| - | OracleHome '<databasename>:<instancename>' '<home-dir-of-oracle>' | + | OracleHome '<databasename>:<instancename>' '<home-dir-of-oracle>' |
| - | <username> is the name of the user under which the database is going to | + | <username> is the name of the user under which the database is going to |
| - | be accessed. | + | be accessed. |
| - | <password> is the password to be used when connecting to the database. | + | <password> is the password to be used when connecting to the database. |
| - | <home-dir-of-oracle> is the path name of Oracle Homeas given in ORACLE_HOME | + | <home-dir-of-oracle> is the path name of Oracle Homeas given in ORACLE_HOME |
| - | Example: OracleUser 'dlib1:srb' 'sekar' | + | Example: OracleUser 'dlib1:srb' 'sekar' |
| - | OraclePasswd 'dlib1:srb' 'sekarpassword' | + | OraclePasswd 'dlib1:srb' 'sekarpassword' |
| - | OracleHome 'dlib1:srb' '/usr/local/apps/oracle/product/9.2.0' | + | OracleHome 'dlib1:srb' '/usr/local/apps/oracle/product/9.2.0' |
| - | In case of DB2, use the phrases Db2User and Db2Passwd and | + | In case of DB2, use the phrases Db2User and Db2Passwd and for Illustra use the phrases IllusUser and IllusPasswd. |
| - | for Illustra use the phrases IllusUser and IllusPasswd. | + | Sample TableConfig file in data directory gives the info for other database systems. |
| - | Sample TableConfig file in data directory gives the info for other | + | |
| - | database systems. | + | |
| - | + | ||
| - | 2. Retrieving data from databases: | + | |
| + | ==2. Retrieving data from databases:== | ||
| --------------------------------- | --------------------------------- | ||
| There are three ways of accessing data from a database through the DAI, | There are three ways of accessing data from a database through the DAI, | ||
| - | each of increasing complexity. | + | each of increasing complexity. |
| (a) Packed Output | (a) Packed Output | ||
| - | In the first and simplest method, one provides a SQL select query as value for | + | In the first and simplest method, one provides a SQL select query as value for the SHADOW variable. The output is in the form of a file with every value string in the row-order output is terminated with a ASCII NULL symbol. We call this mode of presentation as the PACKED mode, since there is no mark up involved except for the delimiting of values using the NULL symbol ('\0' of C). |
| - | the SHADOW variable. The output is in the form of a file with every | + | |
| - | value string in the row-order output is terminated with a ASCII NULL | + | |
| - | symbol. We call this mode of presentation as the PACKED mode, since | + | |
| - | there is no mark up involved except for the delimiting of values using | + | |
| - | the NULL symbol ('\0' of C). | + | |
| - | Examples: Scat 'dlib1&SHADOW= select * from AA' | + | Examples: Scat 'dlib1&SHADOW= select * from AA' |
| - | Sget '/home/sekar.sdsc/dlib1&SHADOW= select A,B from AA where B >= 2' outFile | + | Sget '/home/sekar.sdsc/dlib1&SHADOW= select A,B from AA where B >= 2' outFile |
| - | Scat 'dlib1&SHADOW= select distinct T.A, S.C from AA T, BB S where T.B = S.B order by 1 asc, 2 desc' | + | Scat 'dlib1&SHADOW= select distinct T.A, S.C from AA T, BB S where T.B = S.B order by 1 asc, 2 desc' |
| - | Note how, the value of the SHADOW variable is given using the ampersand | + | Note how, the value of the SHADOW variable is given using the ampersand as part of the object name in the Scommand. Also there should be no blanks between the object name and the '&' symbol unless the object name has been registered with blanks. In these samples we are assuming that the tables are pre-created and have a few values inserted appropriately before hand. |
| - | as part of the object name in the Scommand. Also there should be no | + | |
| - | blanks between the object name and the '&' symbol unless the | + | |
| - | object name has been registered with blanks. | + | |
| - | In these samples we are assuming that the tables are pre-created | + | |
| - | and have a few values inserted appropriately before hand. | + | |
| (b) Built-In Output Templates | (b) Built-In Output Templates | ||
| - | In the second method, the user can apply any of the few built-in templates | + | In the second method, the user can apply any of the few built-in templates that are provided in the DAI package. These templates give a a set of output independent of the attributes involved. |
| - | that are provided in the DAI package. These templates give a | + | The templates currently provided by DAI are: |
| - | a set of output independent of the attributes involved. | + | HTMLREL - result table encoded as an HTML tabular data. The attribute names are used for heading the column names. |
| - | The templates currently provided by DAI are: | + | HTMLTREE - result table is given in a directory-like tree form with one value per line of the table. |
| - | HTMLREL - result table encoded as an HTML tabular data. The | + | HTMLNEST - result table is given as a nested relational table. This requires that the columns be ordered. The difference between this and HTMLTREE is that in here the values in a subsequent column are nested into the same box of the HTML table if there column values are the same. |
| - | attribute names are used for heading the column names. | + | XMLREL - result table is marked up in XML tags. The tag structures used are very simple and intuitive. |
| - | HTMLTREE - result table is given in a directory-like tree form | + | XMLTREE - the result table is nested and tagged in XML. This captures the semi-structures found in the result table. This also requires that the columns be ordered. |
| - | with one value per line of the table. | + | |
| - | HTMLNEST - result table is given as a nested relational table. | + | |
| - | This requires that the columns be ordered. The difference | + | |
| - | between this and HTMLTREE is that in here the values in | + | |
| - | a subsequent column are nested into the same box of the | + | |
| - | HTML table if there column values are the same. | + | |
| - | XMLREL - result table is marked up in XML tags. The tag structures | + | |
| - | used are very simple and intuitive. | + | |
| - | XMLTREE - the result table is nested and tagged in XML. This | + | |
| - | captures the semi-structures found in the result table. | + | |
| - | This also requires that the columns be ordered. | + | |
| - | Examples: Scat 'dlib1&SHADOW= <TEMPLATETYPE>HTMLREL</TEMPLATETYPE> select * from AA' | + | Examples: Scat 'dlib1&SHADOW= <TEMPLATETYPE>HTMLREL</TEMPLATETYPE> select * from AA' |
| - | Sget '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>HTMLNEST</TEMPLATETYPE> select A,B from AA where B >= 2 order by 1,2' outFile | + | Sget '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>HTMLNEST</TEMPLATETYPE> select A,B from AA where B >= 2 order by 1,2' outFile |
| - | Scat 'dlib1&SHADOW= <TEMPLATETYPE>XMLTREE</TEMPLATETYPE> select distinct T.A, S.C from AA T, BB S where T.B = S.B order by 1 asc, 2 desc' | + | Scat 'dlib1&SHADOW= <TEMPLATETYPE>XMLTREE</TEMPLATETYPE> select distinct T.A, S.C from AA T, BB S where T.B = S.B order by 1 asc, 2 desc' |
| (b) User-defined Templates | (b) User-defined Templates | ||
| - | In this method, the user creates a template in a special language called | + | In this method, the user creates a template in a special language called the T-language (cf. TLANG.primer) and registers that template as a SRB object. The user can give read permissions for the template to other users and groups if desired. The template can then be used to format the output of the result table. The template can take parameters which can also be given by the user when performing the access. These templates can be viewed as style sheets for displaying the tabular data.Note that templates, more often than not, are tailored to specific queries or class of queries. |
| - | the T-language (cf. TLANG.primer) and registers that template | + | |
| - | as a SRB object. The user can give read permissions for the | + | |
| - | template to other users and groups if desired. The template | + | |
| - | can then be used to format the output of the result table. | + | |
| - | The template can take parameters which can also be given by | + | |
| - | the user when performing the access. These templates can be viewed | + | |
| - | as style sheets for displaying the tabular data.Note that templates, | + | |
| - | more often than not, are tailored to specific queries or class of | + | |
| - | queries. | + | |
| - | + | ||
| - | Examples: Scat 'dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE> select rsrc_id,rsrc_name,rsrc_typ_name from mdas_cd_rsrc A,mdas_td_rsrc_typ B where A.rsrc_typ_id=B.rsrc_typ_id' | + | Examples: Scat 'dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE> select rsrc_id,rsrc_name,rsrc_typ_name from mdas_cd_rsrc A,mdas_td_rsrc_typ B where A.rsrc_typ_id=B.rsrc_typ_id' |
| - | Sget '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm2</TEMPLATETYPE><TEMPLATEPARAM>alpha=2</TEMPLATEPARAM> select A,B from AA ' outFile | + | Sget '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm2</TEMPLATETYPE><TEMPLATEPARAM>alpha=2</TEMPLATEPARAM> select A,B from AA ' outFile |
| - | In the first example, the template userInfoForm1.html is being used for | + | In the first example, the template userInfoForm1.html is being used for |
| - | configuring the output of the query. The second example uses a | + | configuring the output of the query. The second example uses a |
| - | template which requires a parametric value for alpha at access time. | + | template which requires a parametric value for alpha at access time. |
| - | The user supplied value of 2 is used during this access. The value | + | The user supplied value of 2 is used during this access. The value |
| - | is used in computation when formatting the output. For more | + | is used in computation when formatting the output. For more |
| - | information on the template creation and execution, please see the | + | information on the template creation and execution, please see the |
| - | primer in file TLANG.primer. | + | primer in file TLANG.primer. |
| - | The section on registering a dataset provides an example of how | + | The section on registering a dataset provides an example of how |
| - | one can pre-associate a template to an object. | + | one can pre-associate a template to an object. |
| - | 3.Inserting data into databases: | + | ==3.Inserting data into databases:== |
| ------------------------------- | ------------------------------- | ||
| - | The Database Access Interface provides four ways of inserting data into | + | The Database Access Interface provides four ways of inserting data into |
| a database. One important point to note is that when one inserts data into | a database. One important point to note is that when one inserts data into | ||
| a database using the DAI, the database and the associated tables are expected | a database using the DAI, the database and the associated tables are expected | ||
| Line 191: | Line 117: | ||
| (a) The simple method : using SQL insert statements. | (a) The simple method : using SQL insert statements. | ||
| - | In this method we send insert statements through the write-buffers. | + | In this method we send insert statements through the write-buffers. These insert statements get executed and committed when the file is closed. For this type of ingestion |
| - | These insert statements get executed and committed when the | + | SHADOW value = insert [into <tab-name> ] and the data value sent through buffers is a set of insert statements separated by commas. |
| - | file is closed. For this type of ingestion | + | |
| - | SHADOW value = insert [into <tab-name> ] | + | |
| - | and the data value sent through buffers is a set of | + | |
| - | insert statements separated by commas. | + | |
| - | Examples: | + | Examples: |
| - | Sappend -i 'dlib1&SHADOW=insert into AA' | + | Sappend -i 'dlib1&SHADOW=insert into AA' |
| - | insert into AA values (30,40,30,'aa','bb'); | + | insert into AA values (30,40,30,'aa','bb'); |
| - | insert into AA values (40,40,30,'aa','bb') | + | insert into AA values (40,40,30,'aa','bb') |
| - | ^D | + | ^D |
| - | Sappend -i 'dlib1&SHADOW=insert' | + | Sappend -i 'dlib1&SHADOW=insert' |
| - | insert into AA values (30,40,30,'aa','bb'); | + | insert into AA values (30,40,30,'aa','bb'); |
| - | insert into AA values (30,20,30,'aa','bb'); | + | insert into AA values (30,20,30,'aa','bb'); |
| - | insert into AA values (40,40,30,'aa','bb') | + | insert into AA values (40,40,30,'aa','bb') |
| - | ^D | + | ^D |
| - | Sappend -i 'dlib1&SHADOW=insert' | + | Sappend -i 'dlib1&SHADOW=insert' |
| - | insert into AA values (30,40,30,'aa','bb'); | + | insert into AA values (30,40,30,'aa','bb'); |
| - | insert into BB values (40,'aa','bb') | + | insert into BB values (40,'aa','bb') |
| - | ^D | + | ^D |
| - | Sappend -i "dlib1&SHADOW=insert" | + | Sappend -i "dlib1&SHADOW=insert" |
| - | insert into AA select A,A,A,B,C from BB; | + | insert into AA select A,A,A,B,C from BB; |
| - | ^D | + | ^D |
| - | The -i option of Sappend diverts the terminal input into SRB | + | The -i option of Sappend diverts the terminal input into SRB object. Hence, the lines following that statement are the ones that are filled into a buffer and sent to SRB. As can be seen one can ingest multiple tables and execute complex insert statements using this method. |
| - | object. Hence, the lines following that statement are the ones | + | |
| - | that are filled into a buffer and sent to SRB. | + | |
| - | As can be seen one can ingest multiple tables and execute | + | |
| - | complex insert statements using this method. | + | |
| (b) Insertion using PACKED data and VALUES data | (b) Insertion using PACKED data and VALUES data | ||
| - | This method allows one to insert data into a single table where | + | This method allows one to insert data into a single table where the data is in PACKed form (i.e., each value is ASCII NULL separated). |
| - | the data is in PACKed form (i.e., each value is ASCII NULL separated). | + | For example, the output from a table using the packed format (cf. Section 2(a)) is in this format and can be inserted into another database table if necessary using this format. |
| - | For example, the output from a table using the packed format (cf. | + | |
| - | Section 2(a)) is in this format and can be inserted into another | + | |
| - | database table if necessary using this format. | + | |
| - | + | ||
| - | Examples: | + | |
| - | Sappend packFile 'dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>insert into AA' | + | |
| - | Scat '/home/sekar.sdsc/dlib1&SHADOW= select distinct * from AA' | + | |
| - | | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>insert into AA' | + | |
| - | The second example shows how one can copy tabular data from one table | + | Examples: |
| - | to another (in this case itself) possibly on different databases | + | Sappend packFile 'dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>insert into AA' |
| - | from different vendors. Also, The ingestion takes into account | + | Scat '/home/sekar.sdsc/dlib1&SHADOW= select distinct * from AA' |
| - | whether the value goes into a character column or a numeric column | + | | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>insert into AA' |
| - | and adds surrounding quotes if needed. The table name in the | + | |
| - | insert statement of SHADOW is used for this purpose. | + | |
| - | One can insert the PACKED values into multiple tables also. This is | + | The second example shows how one can copy tabular data from one table to another (in this case itself) possibly on different databases from different vendors. Also, The ingestion takes into account whether the value goes into a character column or a numeric column and adds surrounding quotes if needed. The table name in the insert statement of SHADOW is used for this purpose. |
| - | accomplished using the MULTIINSERT preamble being added to the | + | |
| - | SQL statements in the SHADOW value. | + | |
| - | Examples: | + | One can insert the PACKED values into multiple tables also. This is accomplished using the MULTIINSERT preamble being added to the SQL statements in the SHADOW value. |
| - | Sappend packFile "dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>MULTIINSERT insert into CC1 values (***0:,***1:,***2:);insert into CC2 values (***0:,'***3:','***4:');" | + | |
| - | The DAI executes the set of SQL statements given in the MULTIINSERT | + | Examples: |
| - | part one after the other. The variables (predefined from ***0: onwards | + | Sappend packFile "dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>MULTIINSERT insert into CC1 values (***0:,***1:,***2:);insert into CC2 values (***0:,'***3:','***4:');" |
| - | upto ***255: ) are bound with values from the input buffers. | + | |
| - | Note that if the field is a character field, the user has to | + | |
| - | specifically add the quotes unless the input buffer has quoted the | + | |
| - | appropriate field values. The technique can be used to | + | |
| - | copy data from one database to another and into an entirely | + | |
| - | different schema. | + | |
| - | + | ||
| - | The PACKED format takes in NULL separated data and is useful | + | |
| - | when the data is generated by a machine. In case, data is | + | |
| - | hand generated one can use the VALUES template. With this | + | |
| - | tag, each row of values to be inserted is separated from the next row | + | |
| - | of values using the ';' symbol. DAI takes these | + | |
| - | values and plugs them into insert statements and executes them. | + | |
| + | The DAI executes the set of SQL statements given in the MULTIINSERT part one after the other. The variables (predefined from ***0: onwards upto ***255: ) are bound with values from the input buffers. Note that if the field is a character field, the user has to specifically add the quotes unless the input buffer has quoted the appropriate field values. The technique can be used to copy data from one database to another and into an entirely different schema. | ||
| - | Examples: | + | The PACKED format takes in NULL separated data and is useful when the data is generated by a machine. In case, data is hand generated one can use the VALUES template. With this tag, each row of values to be inserted is separated from the next row of values using the ';' symbol. DAI takes these values and plugs them into insert statements and executes them. |
| - | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>VALUES</TEMPLATETYPE> insert into AA' | + | |
| - | 300,400,300,'eaa','ebb'; | + | |
| - | 300,200,300,'eaa','ebb'; | + | |
| - | 400,400,300,'eaa','ebb'; | + | |
| - | ^D | + | |
| - | Note that each value can be in separate lines as long as each value | + | |
| - | is separated by a ',' and each row separated by a ';'. | + | Examples: |
| - | Multi table inserts are also possible (as in previous example) | + | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>VALUES</TEMPLATETYPE> insert into AA' |
| - | with the VALUES type. | + | 300,400,300,'eaa','ebb'; |
| + | 300,200,300,'eaa','ebb'; | ||
| + | 400,400,300,'eaa','ebb'; | ||
| + | ^D | ||
| + | |||
| + | Note that each value can be in separate lines as long as each value is separated by a ',' and each row separated by a ';'. Multi table inserts are also possible (as in previous example) with the VALUES type. | ||
| (c) Insertion of Data in Built-In Template formats | (c) Insertion of Data in Built-In Template formats | ||
| - | This method allows the user to insert data that exist in any of the | + | This method allows the user to insert data that exist in any of the built-in formats supported by DAI. |
| - | built-in formats supported by DAI. | + | |
| + | Examples: | ||
| + | Sappend xmlrelfile 'dlib1&SHADOW= <TEMPLATETYPE>XMLREL</TEMPLATETYPE> insert into AA' | ||
| + | Scat '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>XMLREL</TEMPLATETYPE>select distinct * from AA' | ||
| + | | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>XMLREL</TEMPLATETYPE>insert into AA' | ||
| - | Examples: | + | The same techniques will work for other built-in templates also, Multi table inserts are also allowed in this method. |
| - | Sappend xmlrelfile 'dlib1&SHADOW= <TEMPLATETYPE>XMLREL</TEMPLATETYPE> insert into AA' | + | |
| - | Scat '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>XMLREL</TEMPLATETYPE>select distinct * from AA' | + | |
| - | | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>XMLREL</TEMPLATETYPE>insert into AA' | + | |
| - | The same techniques will work for other built-in templates also, | + | Example: |
| - | Multi table inserts are also allowed in this method. | + | Sappend htmlnestFile "dlib1&SHADOW=<TEMPLATETYPE>HTMLNEST</TEMPLATETYPE> MULTIINSERT insert into CC1 values (***0:,***1:,***2:);insert into CC2 values (***0:,'***3:','***4:');" |
| - | Example: | + | Note that the DAI automatically recovers all the values even when the data is in the nested relational formats. |
| - | Sappend htmlnestFile "dlib1&SHADOW=<TEMPLATETYPE>HTMLNEST</TEMPLATETYPE> MULTIINSERT insert into CC1 values (***0:,***1:,***2:);insert into CC2 values (***0:,'***3:','***4:');" | + | |
| - | + | ||
| - | Note that the DAI automatically recovers all the values even when the | + | |
| - | data is in the nested relational formats. | + | |
| (d) Insertion of Data in User-Defined Template formats | (d) Insertion of Data in User-Defined Template formats | ||
| - | This method allows the user to insert data that was created/output | + | This method allows the user to insert data that was created/output using a user-defined template. Note that for performing the ingestion, the template should have ingestion rules as part of the template. Additional information on these rules are given in the TLANG.primer file. |
| - | using a user-defined template. Note that for performing the | + | |
| - | ingestion, the template should have ingestion rules as part of the | + | |
| - | template. Additional information on these rules are given in | + | |
| - | the TLANG.primer file. | + | |
| - | Examples: | + | Examples: |
| - | Sappend templfile 'dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE> <TEMPLATEPARAM> alpha = 24 , beta = 34 </TEMPLATEPARAM> insert into BB' | + | Sappend templfile 'dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE> <TEMPLATEPARAM> alpha = 24 , beta = 34 </TEMPLATEPARAM> insert into BB' |
| - | Note that two parameters alpha and beta are being used by the template | + | Note that two parameters alpha and beta are being used by the template routine when calculating the data to be stored. Multiple table inserts are also possible with user-defined templates. |
| - | routine when calculating the data to be stored. Multiple table inserts | + | |
| - | are also possible with user-defined templates. | + | |
| - | Examples: | + | Examples: |
| - | Sappend templfile2 "dlib1&SHADOW=<TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE><TEMPLATEPARAM>alpha = 2, beta = 34 </TEMPLATEPARAM> MULTIINSERT insert into B1 values (***RSRC_ID:,'***RSRC_NAME:');insert into B2 values (***RSRC_ID:,'***RSRC_TYP_NAME:')" | + | Sappend templfile2 "dlib1&SHADOW=<TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE><TEMPLATEPARAM>alpha = 2, beta = 34 </TEMPLATEPARAM> MULTIINSERT insert into B1 values (***RSRC_ID:,'***RSRC_NAME:');insert into B2 values (***RSRC_ID:,'***RSRC_TYP_NAME:')" |
| - | Since in a template, the order in which the values are placed in the | + | Since in a template, the order in which the values are placed in the template may differ from the order of target fields in the tables, user-defined variable-values, such as ***RSRC_ID: are used. More details of how the user-defined variable-values are gathered from the data using the template can be found in the TLANG.primer file. |
| - | template may differ from the order of target fields in the tables, | + | |
| - | user-defined variable-values, such as ***RSRC_ID: are used. | + | |
| - | More details of how the user-defined variable-values are gathered | + | |
| - | from the data using the template can be found in the TLANG.primer file. | + | |
| - | 4. Performing other database functions: | + | ==4. Performing other database functions:== |
| -------------------------------------- | -------------------------------------- | ||
| - | The DAI provides a way for performing other database functions. | + | The DAI provides a way for performing other database functions. |
| But this requires the user to give the exact statements needed by the | But this requires the user to give the exact statements needed by the | ||
| native database being accessed. Note that in most cases if many | native database being accessed. Note that in most cases if many | ||
| Line 337: | Line 217: | ||
| has been tested, this may be a good vehicle to execute them. | has been tested, this may be a good vehicle to execute them. | ||
| - | Examples: | + | Examples: |
| - | Sappend -i "dlib1&SHADOW=" | + | Sappend -i "dlib1&SHADOW=" |
| - | create table D1 (AA char(25), BB int); | + | create table D1 (AA char(25), BB int); |
| - | insert into D1 values('aaa',25); | + | insert into D1 values('aaa',25); |
| - | ^D | + | ^D |
| - | Sappend execFile "dlib1&SHADOW=" | + | Sappend execFile "dlib1&SHADOW=" |
| - | Note that the SHADOW variable is given an empty string. Do not | + | Note that the SHADOW variable is given an empty string. Do not use "dlib1&SHADOW=''" as this will be taken as a null quoted string and will result in an error. Note also that even though we are creating a new table, we still use the Sappend command and not the Sput command. |
| - | use "dlib1&SHADOW=''" as this will be taken as a null quoted | + | |
| - | string and will result in an error. Note also that even though | + | |
| - | we are creating a new table, we still use the Sappend command | + | |
| - | and not the Sput command. | + | |
Revision as of 00:05, 22 February 2006
This document provides a brief note on usage of the database access interface (DAI). The DAI provides access to tabular data stored in databases using the SRB. This document details how to use the SRB API to ingest into and retrieve from a database tables. The DAI includes mechanisms for tailoring the input and output streams. This is performed by associating a 'template' file with the input which can be used to interpret the data in the input file and convert them into SQL statements for inserting into the database. On the way out, the 'template' file can be used to construct forms and marked-up documents using the tabular data produced by SQL statements. These conversion operations occur on the fly and inside the SRB and is conveniently transparent to the user. We have implemented a template language called T-language which is used for building these templates. We discuss briefly how to use the T-language templates. A separate primer on the T-language is also available in the TLANG.primer file.
We provide brief notes on various stages in using the DAI interface:
Contents |
1. Registering a database object with MCAT:
(a) Register the location of the database using ingestLocation command using the string "<hostname>:<databasename>:<instancename>" as the netprefix parameter.
<hostname> is the name of the host where the SRBserver brokering the database is located. <databasename> is the name of the database. <instancename> is the name of the database instance or it can also be the name of the user under which the database is going to be accessed.
Example: ingestLocation 'catharsis dlib1 DB' 'catharsis.sdsc.edu:dlib1:srb' 'catharsis sdsc' srb sdsc
Currently we provide only one user per resource (i.e., per location).
We plan to provide a more flexible combination in a later release.
(b) Register the resource using ingestResource command with as the default path name, '??? table database' as the resource type, where ??? can be oracle, db2 or illustra.
Example: ingestResource 'ora-dlib1-sdsc' 'oracle table database' 'catharsis dlib1 DB' permanent 0
(c) Register the database itself as a SRB dataset using the Sregister command and use options -D 'database shadow object' /* for datatype */ and (empty string) for RegisteringObjectPath value.
Example: Sregister -D 'database shadow object' -R ora-dlib1-sdsc dlib1
The shadow object is a unique concept in SRB. Normally, when we ingest or register an object in SRB, the object is considered to be an atomic object and is retrieved as such when accessed. When we register (shadow objects cannot be ingested) a shadow object in SRB, it implies that the object is not a 'real' object (with its own bits and bytes), but stands for (or is a pointer to) a set or structure of objects. Using the data type 'directory shadow object' one can register a directory in SRB without registering (or ingesting) its component files and sub-directories). In this sense the path name registered for that object is partial and gives a prefix. During actual access, the user provides the suffix portion of the path that is appended to the path. The shadow object in this sense is not a real object but stands for a structured set of objects whose contents/access paths are defined at access time.
The 'database shadow object' provides a data type for a database. With the path name being empty, it will allow any query to be sent to the database. One can provide for a more controlled querying by giving a partial path such as 'select ' instead of . Note that the default path registered as RegisteringObjectPath can contain variables in them and hence a very flexible but uniquely controllable querying is possible.
Examples: Sregister -D 'database shadow object' -R ora-dlib1-sdsc 'select * from AA ' dlib1 with the where condition optionally specified by the user.
Sregister -D 'database shadow object' -R ora-dlib1-sdsc '<TEMPLATETYPE>HTMLREL</TEMPLATETYPE> select ' dlib1 which predefines a template to be used for accessing results from this database through this object.
(d) Enter the three lines (for Oracle) in data/TableConfig file: OracleUser '<databasename>:<instancename>' '<username>' OraclePasswd '<databasename>:<instancename>' '<password>' OracleHome '<databasename>:<instancename>' '<home-dir-of-oracle>' <username> is the name of the user under which the database is going to be accessed. <password> is the password to be used when connecting to the database. <home-dir-of-oracle> is the path name of Oracle Homeas given in ORACLE_HOME Example: OracleUser 'dlib1:srb' 'sekar'
OraclePasswd 'dlib1:srb' 'sekarpassword' OracleHome 'dlib1:srb' '/usr/local/apps/oracle/product/9.2.0'
In case of DB2, use the phrases Db2User and Db2Passwd and for Illustra use the phrases IllusUser and IllusPasswd. Sample TableConfig file in data directory gives the info for other database systems.
2. Retrieving data from databases:
There are three ways of accessing data from a database through the DAI, each of increasing complexity. (a) Packed Output In the first and simplest method, one provides a SQL select query as value for the SHADOW variable. The output is in the form of a file with every value string in the row-order output is terminated with a ASCII NULL symbol. We call this mode of presentation as the PACKED mode, since there is no mark up involved except for the delimiting of values using the NULL symbol ('\0' of C).
Examples: Scat 'dlib1&SHADOW= select * from AA'
Sget '/home/sekar.sdsc/dlib1&SHADOW= select A,B from AA where B >= 2' outFile
Scat 'dlib1&SHADOW= select distinct T.A, S.C from AA T, BB S where T.B = S.B order by 1 asc, 2 desc'
Note how, the value of the SHADOW variable is given using the ampersand as part of the object name in the Scommand. Also there should be no blanks between the object name and the '&' symbol unless the object name has been registered with blanks. In these samples we are assuming that the tables are pre-created and have a few values inserted appropriately before hand.
(b) Built-In Output Templates In the second method, the user can apply any of the few built-in templates that are provided in the DAI package. These templates give a a set of output independent of the attributes involved. The templates currently provided by DAI are:
HTMLREL - result table encoded as an HTML tabular data. The attribute names are used for heading the column names. HTMLTREE - result table is given in a directory-like tree form with one value per line of the table. HTMLNEST - result table is given as a nested relational table. This requires that the columns be ordered. The difference between this and HTMLTREE is that in here the values in a subsequent column are nested into the same box of the HTML table if there column values are the same. XMLREL - result table is marked up in XML tags. The tag structures used are very simple and intuitive. XMLTREE - the result table is nested and tagged in XML. This captures the semi-structures found in the result table. This also requires that the columns be ordered.
Examples: Scat 'dlib1&SHADOW= <TEMPLATETYPE>HTMLREL</TEMPLATETYPE> select * from AA'
Sget '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>HTMLNEST</TEMPLATETYPE> select A,B from AA where B >= 2 order by 1,2' outFile
Scat 'dlib1&SHADOW= <TEMPLATETYPE>XMLTREE</TEMPLATETYPE> select distinct T.A, S.C from AA T, BB S where T.B = S.B order by 1 asc, 2 desc'
(b) User-defined Templates In this method, the user creates a template in a special language called the T-language (cf. TLANG.primer) and registers that template as a SRB object. The user can give read permissions for the template to other users and groups if desired. The template can then be used to format the output of the result table. The template can take parameters which can also be given by the user when performing the access. These templates can be viewed as style sheets for displaying the tabular data.Note that templates, more often than not, are tailored to specific queries or class of queries.
Examples: Scat 'dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE> select rsrc_id,rsrc_name,rsrc_typ_name from mdas_cd_rsrc A,mdas_td_rsrc_typ B where A.rsrc_typ_id=B.rsrc_typ_id'
Sget '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm2</TEMPLATETYPE><TEMPLATEPARAM>alpha=2</TEMPLATEPARAM> select A,B from AA ' outFile
In the first example, the template userInfoForm1.html is being used for configuring the output of the query. The second example uses a template which requires a parametric value for alpha at access time. The user supplied value of 2 is used during this access. The value is used in computation when formatting the output. For more information on the template creation and execution, please see the primer in file TLANG.primer.
The section on registering a dataset provides an example of how one can pre-associate a template to an object.
3.Inserting data into databases:
The Database Access Interface provides four ways of inserting data into a database. One important point to note is that when one inserts data into a database using the DAI, the database and the associated tables are expected to exist before hand and registered with SRB. Because of this one needs to open the srbObject (using srbObjOpen) instead of creating the object (using srbObjCreate). In the case of Scommands, one should use the Sappend utility instead of the Sput utility. In the current release Scp or Smv are not supported for this type of resource. But, using the DAI, one can copy a set of tables (possibly one) from one database into another set of tables (possibly arranged in a different schema) in another database of possibly different vendor. The Scat and Sappend are used inconnected by a pipe to achieve this.
(a) The simple method : using SQL insert statements.
In this method we send insert statements through the write-buffers. These insert statements get executed and committed when the file is closed. For this type of ingestion
SHADOW value = insert [into <tab-name> ] and the data value sent through buffers is a set of insert statements separated by commas.
Examples: Sappend -i 'dlib1&SHADOW=insert into AA' insert into AA values (30,40,30,'aa','bb'); insert into AA values (40,40,30,'aa','bb') ^D
Sappend -i 'dlib1&SHADOW=insert' insert into AA values (30,40,30,'aa','bb'); insert into AA values (30,20,30,'aa','bb'); insert into AA values (40,40,30,'aa','bb') ^D
Sappend -i 'dlib1&SHADOW=insert' insert into AA values (30,40,30,'aa','bb'); insert into BB values (40,'aa','bb') ^D
Sappend -i "dlib1&SHADOW=insert" insert into AA select A,A,A,B,C from BB; ^D
The -i option of Sappend diverts the terminal input into SRB object. Hence, the lines following that statement are the ones that are filled into a buffer and sent to SRB. As can be seen one can ingest multiple tables and execute complex insert statements using this method.
(b) Insertion using PACKED data and VALUES data This method allows one to insert data into a single table where the data is in PACKed form (i.e., each value is ASCII NULL separated). For example, the output from a table using the packed format (cf. Section 2(a)) is in this format and can be inserted into another database table if necessary using this format.
Examples: Sappend packFile 'dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>insert into AA' Scat '/home/sekar.sdsc/dlib1&SHADOW= select distinct * from AA' | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>insert into AA'
The second example shows how one can copy tabular data from one table to another (in this case itself) possibly on different databases from different vendors. Also, The ingestion takes into account whether the value goes into a character column or a numeric column and adds surrounding quotes if needed. The table name in the insert statement of SHADOW is used for this purpose.
One can insert the PACKED values into multiple tables also. This is accomplished using the MULTIINSERT preamble being added to the SQL statements in the SHADOW value.
Examples: Sappend packFile "dlib1&SHADOW=<TEMPLATETYPE>PACK</TEMPLATETYPE>MULTIINSERT insert into CC1 values (***0:,***1:,***2:);insert into CC2 values (***0:,'***3:','***4:');"
The DAI executes the set of SQL statements given in the MULTIINSERT part one after the other. The variables (predefined from ***0: onwards upto ***255: ) are bound with values from the input buffers. Note that if the field is a character field, the user has to specifically add the quotes unless the input buffer has quoted the appropriate field values. The technique can be used to copy data from one database to another and into an entirely different schema.
The PACKED format takes in NULL separated data and is useful when the data is generated by a machine. In case, data is hand generated one can use the VALUES template. With this tag, each row of values to be inserted is separated from the next row of values using the ';' symbol. DAI takes these values and plugs them into insert statements and executes them.
Examples:
Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>VALUES</TEMPLATETYPE> insert into AA'
300,400,300,'eaa','ebb';
300,200,300,'eaa','ebb';
400,400,300,'eaa','ebb';
^D
Note that each value can be in separate lines as long as each value is separated by a ',' and each row separated by a ';'. Multi table inserts are also possible (as in previous example) with the VALUES type.
(c) Insertion of Data in Built-In Template formats
This method allows the user to insert data that exist in any of the built-in formats supported by DAI.
Examples: Sappend xmlrelfile 'dlib1&SHADOW= <TEMPLATETYPE>XMLREL</TEMPLATETYPE> insert into AA' Scat '/home/sekar.sdsc/dlib1&SHADOW= <TEMPLATETYPE>XMLREL</TEMPLATETYPE>select distinct * from AA' | Sappend -i 'dlib1&SHADOW=<TEMPLATETYPE>XMLREL</TEMPLATETYPE>insert into AA'
The same techniques will work for other built-in templates also, Multi table inserts are also allowed in this method.
Example: Sappend htmlnestFile "dlib1&SHADOW=<TEMPLATETYPE>HTMLNEST</TEMPLATETYPE> MULTIINSERT insert into CC1 values (***0:,***1:,***2:);insert into CC2 values (***0:,'***3:','***4:');"
Note that the DAI automatically recovers all the values even when the data is in the nested relational formats.
(d) Insertion of Data in User-Defined Template formats
This method allows the user to insert data that was created/output using a user-defined template. Note that for performing the ingestion, the template should have ingestion rules as part of the template. Additional information on these rules are given in the TLANG.primer file.
Examples: Sappend templfile 'dlib1&SHADOW= <TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE> <TEMPLATEPARAM> alpha = 24 , beta = 34 </TEMPLATEPARAM> insert into BB'
Note that two parameters alpha and beta are being used by the template routine when calculating the data to be stored. Multiple table inserts are also possible with user-defined templates.
Examples: Sappend templfile2 "dlib1&SHADOW=<TEMPLATETYPE>TEMPLATE /styles/userInfoForm1.html</TEMPLATETYPE><TEMPLATEPARAM>alpha = 2, beta = 34 </TEMPLATEPARAM> MULTIINSERT insert into B1 values (***RSRC_ID:,'***RSRC_NAME:');insert into B2 values (***RSRC_ID:,'***RSRC_TYP_NAME:')"
Since in a template, the order in which the values are placed in the template may differ from the order of target fields in the tables, user-defined variable-values, such as ***RSRC_ID: are used. More details of how the user-defined variable-values are gathered from the data using the template can be found in the TLANG.primer file.
4. Performing other database functions:
The DAI provides a way for performing other database functions. But this requires the user to give the exact statements needed by the native database being accessed. Note that in most cases if many such statements are issued during a single access, all of them will be performed in order but in a bunch (i.e., even if typed in, they probably will be executed only when a ^-D is pressed). Hence one looses interactivity as seen from a SQL query client. Because of this we recommend that users do not use this for executing ad hoc data definition functions through this API. But if there is a script that has been tested, this may be a good vehicle to execute them.
Examples: Sappend -i "dlib1&SHADOW=" create table D1 (AA char(25), BB int); insert into D1 values('aaa',25); ^D
Sappend execFile "dlib1&SHADOW="
Note that the SHADOW variable is given an empty string. Do not use "dlib1&SHADOW=" as this will be taken as a null quoted string and will result in an error. Note also that even though we are creating a new table, we still use the Sappend command and not the Sput command.


