MCAT Attributes
This document provides a list of MCAT attributes that are currently exposed to the user through the srbGetDataDirInfo Client API.
The srbGetDataDirInfo call provides a means to query the MCAT catalog. The call has the prototype:
extern int srbGetDataDirInfo(srbConn* conn, int catType, char qval[][MAX_TOKEN], int *selval, mdasC_sql_result_struct *myresult, int rowsWanted);
The MCAT uses the input qval[][] and selval[] arrays to compose and execute SQL queries and returns the query result in myresult. The selval[] array specifies the list of attributes to retrieve, and qval[][] specifies a list of predicate conditions to search on, which are combined conjunctively. Both selval[] and qval[][] must be arrays of size MAX_DCS_NUM and are indexed by the values given in mdasC_db2_externs.h under the heading DCS-ATTRIBUTE-INDEX DEFINES. Please check the list there for the most recent set of attributes available for querying.
Each selval[] entry can take one of the following values, whose semantics follow the usual relational database functions:
1 : select value
2 : count()
3 : max()
4 : min()
5 : avg()
6 : sum()
7 : variance()
8 : stddev()
9 : count(distinct )
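The function codes above can be given symbolic names so that a query reads more clearly. The following is only a sketch: the SEL_* names and the helpers clear_selection and select_attr are hypothetical (they are not part of the SRB client library), and MAX_DCS_NUM, DATA_NAME and SIZE are given placeholder values here so that the fragment compiles by itself. Real code takes those from mdasC_db2_externs.h.

```c
#include <assert.h>
#include <string.h>

/* Placeholder values so the sketch compiles standalone; real code gets
   these from the SRB headers (mdasC_db2_externs.h). */
#define MAX_DCS_NUM 300
#define DATA_NAME   2
#define SIZE        10

/* Hypothetical names for the selval[] function codes listed above. */
enum sel_func {
    SEL_NONE = 0,            /* attribute is not selected */
    SEL_VALUE = 1,           /* select value */
    SEL_COUNT = 2,           /* count() */
    SEL_MAX = 3,             /* max() */
    SEL_MIN = 4,             /* min() */
    SEL_AVG = 5,             /* avg() */
    SEL_SUM = 6,             /* sum() */
    SEL_VARIANCE = 7,        /* variance() */
    SEL_STDDEV = 8,          /* stddev() */
    SEL_COUNT_DISTINCT = 9   /* count(distinct ) */
};

/* Clear every entry so that no attribute is selected. */
void clear_selection(int selval[MAX_DCS_NUM])
{
    memset(selval, 0, sizeof(int) * MAX_DCS_NUM);
}

/* Request attribute attrIndex, optionally wrapped in an aggregate. */
void select_attr(int selval[MAX_DCS_NUM], int attrIndex, enum sel_func func)
{
    assert(attrIndex >= 0 && attrIndex < MAX_DCS_NUM);
    selval[attrIndex] = (int) func;
}
```

With these helpers, select_attr(selval, SIZE, SEL_SUM) would ask for the summed size of the matching datasets, and select_attr(selval, DATA_NAME, SEL_COUNT) for the number of matching data names.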
The qval[][] array provides the conditions for retrieval. The values can be quoted strings or numbers, as SQL requires. The following types of comparison are provided for the qval values:
= : equal
<> : not equal
> : greater than
< : less than
>= : greater than or equal
<= : less than or equal
in : in a list [eg. in ('alpha','beta','gamma') ]
not in : not in a list
between : between two items [eg. between 24 and 32 ]
not between : not between two items
like : like; use % and _ as wildcards for a string and a
single character, respectively. ESCAPE can be used as per SQL.
not like : not like an item
sounds like : available if soundex is built into the database
(Oracle supports it).
sounds not like : sounds not like an item
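Since each qval[][] entry is just the text of a comparison, a pair of small helpers can build it safely. This is a sketch under stated assumptions: set_predicate and quote_literal are hypothetical helpers, MAX_TOKEN is a placeholder value (the real one comes from the SRB headers), and doubling embedded single quotes is the standard SQL convention, which the MCAT is assumed, not confirmed, to pass through unchanged.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_TOKEN 200   /* placeholder; the real value comes from the SRB headers */

/* Write " <op> <operand>" into one qval[] entry.
   Returns 0 on success, -1 if the predicate does not fit. */
int set_predicate(char dest[MAX_TOKEN], const char *op, const char *operand)
{
    assert(dest != NULL && op != NULL && operand != NULL);
    /* two spaces plus the terminating NUL account for the extra 3 bytes */
    if (strlen(op) + strlen(operand) + 3 > MAX_TOKEN)
        return -1;
    sprintf(dest, " %s %s", op, operand);
    return 0;
}

/* Quote value as an SQL string literal, doubling embedded single quotes.
   Returns 0 on success, -1 if the result does not fit in dest. */
int quote_literal(char *dest, size_t destSize, const char *value)
{
    size_t out = 0;

    assert(dest != NULL && value != NULL);
    if (destSize < 3)
        return -1;
    dest[out++] = '\'';
    for (; *value != '\0'; value++) {
        size_t need = (*value == '\'') ? 2 : 1;
        /* keep room for the closing quote and the terminating NUL */
        if (out + need + 2 > destSize)
            return -1;
        if (*value == '\'')
            dest[out++] = '\'';
        dest[out++] = *value;
    }
    dest[out++] = '\'';
    dest[out] = '\0';
    return 0;
}
```

For instance, set_predicate(qval[SIZE], "between", "24 and 32") would produce the text " between 24 and 32", and a name taken from user input can be passed through quote_literal before being used with the = or like operators.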
For the selval[] array, setting an element to 1 means that the value associated with that element is to be retrieved. For example, selval[USER_NAME] = 1; means that the "user_name" attribute is to be retrieved. Each qval[][] entry holds a comparison predicate to search on. For example, sprintf(qval[DATA_NAME]," = '%s'", "unixFileObj1"); means that the search condition includes the term (data_name = 'unixFileObj1').
Once the query is successful, individual columns of the resulting structure can be retrieved using the getAttributeColumn function whose prototype is given below:
char *getAttributeColumn(mdasC_sql_result_struct *result, int attrIndex)
An example showing the usage of srbGetDataDirInfo and getAttributeColumn (the example is also in test/examples):
#include <stdio.h>
#include <stdlib.h>
#include <sys/file.h>
#include <sys/stat.h>
#ifdef PORTNAME_solaris
#include <fcntl.h>
#endif
#include "srbClient.h"

#define SRB_HOST "torah.sdsc.edu"
#define MY_PASSWORD "CCCCC"

int main(int argc, char **argv)
{
    int i, status;
    srbConn *conn;
    mdasC_sql_result_struct myresult;
    char qval[MAX_DCS_NUM][MAX_TOKEN];
    int selval[MAX_DCS_NUM];
    char *pathName, *rsrcName;
    int numOfRows = 20;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s collectionName dataName\n", argv[0]);
        exit(1);
    }

    /* connect to an SRB server */
    conn = clConnect(SRB_HOST, NULL, MY_PASSWORD);
    if (clStatus(conn) != CLI_CONNECTION_OK) {
        fprintf(stderr, "Connection to srbMaster failed.\n");
        fprintf(stderr, "%s", clErrorMessage(conn));
        srb_perror(2, clStatus(conn), "", SRB_RCMD_ACTION|SRB_LONG_MSG);
        clFinish(conn);
        exit(1);
    }

    /* initialize the input structures */
    for (i = 0; i < MAX_DCS_NUM; i++) {
        selval[i] = 0;
        qval[i][0] = '\0';
    }

    /* set user requirements - in this example, the user is asking for
       the resource name and path name for a given dataset.
       snprintf keeps over-long arguments from overflowing qval. */
    snprintf(qval[DATA_NAME], MAX_TOKEN, " = '%s'", argv[2]);
    snprintf(qval[COLLECTION_NAME], MAX_TOKEN, " = '%s'", argv[1]);
    selval[PATH_NAME] = 1;
    selval[RSRC_NAME] = 1;

    /* perform the MCAT query */
    status = srbGetDataDirInfo(conn, MDAS_CATALOG,
                               qval, selval, &myresult, numOfRows);

    while (status == 0) {
        /* retrieve the values */
        pathName = (char *) getAttributeColumn(&myresult, PATH_NAME);
        rsrcName = (char *) getAttributeColumn(&myresult, RSRC_NAME);

        /* print the result; consecutive rows are MAX_DATA_SIZE bytes apart.
           The base pointers are left unchanged so they can be freed below. */
        for (i = 0; i < myresult.row_count; i++) {
            fprintf(stdout, "%20.20s %s\n",
                    rsrcName + i * MAX_DATA_SIZE,
                    pathName + i * MAX_DATA_SIZE);
        }

        /* free SRB allocated query column structures */
        free(pathName);
        free(rsrcName);

        /* if there are more rows retrieve them */
        if (myresult.continuation_index >= 0) {
            status = srbGetMoreRows(conn, MDAS_CATALOG,
                                    myresult.continuation_index,
                                    &myresult, numOfRows);
        }
        else {
            break;
        }
    }

    /* disconnect from SRB Server */
    clFinish(conn);
    exit(0);
}
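In the example, the rows of a column returned by getAttributeColumn lie MAX_DATA_SIZE bytes apart. A small accessor can compute the address of a given row without modifying the base pointer, which matters when that same pointer is later passed to free(). This is a sketch only: column_row is a hypothetical helper, and MAX_DATA_SIZE is given a placeholder value here so the fragment compiles without the SRB headers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DATA_SIZE 2700  /* placeholder; the real value comes from the SRB headers */

/* Return a pointer to row `row` of a result column whose rows are
   MAX_DATA_SIZE bytes apart, or NULL if the row is out of range.
   The column pointer itself is never advanced. */
const char *column_row(const char *column, int row, int rowCount)
{
    assert(column != NULL);
    if (row < 0 || row >= rowCount)
        return NULL;
    return column + (size_t) row * MAX_DATA_SIZE;
}
```

A print loop could then call column_row(pathName, i, myresult.row_count) for each i and still hand the untouched pathName to free() afterwards.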
We give below the list of attributes that can be queried or used as conditions in a query in srbGetDataDirInfo. This list is not exhaustive, since the design is far more comprehensive; further, even in implementations some of the attributes are not exposed at the client level and a few are exposed only through specific routines. Also, a few sets of attributes, such as the IV Core, are not exposed through this interface at this time. Please check mdasC_db2_externs.h for the set of attributes exposed to clients in the current implementation of the SRB client library.
Caveat: A few attributes occur more than once under different names, depending upon the role they play (e.g., DATA_OWNER_EMAIL and USER_EMAIL). These attributes point to the same data internally, so this virtual replication introduces no inconsistency.
Dataset Information (core meta-information about datasets):
DATA_NAME /* data name */
DATA_REPL_ENUM /* replica copy number */
COLLECTION_NAME /* collection name in which the data resides*/
SIZE /* size of data */
DATA_TYP_NAME /* data type (mostly data formats)*/
DATA_CLASS_NAME /* classification name for data */
DATA_CLASS_TYPE /* classification type */
ACCESS_CONSTRAINT /* access restriction on data */
currently supported access constraints are:
'execute','read audit','read',
'annotate audit','annotate',
'write audit','write',
'create audit','create',
'all audit','all', /* all is like ownership and
allows to grant/revoke access
to other users */
'curate audit','curate' /* used at collection level
to have ownership on objects
in/and below the collection. This
access constraint is still under
design and may have its properties/usage
modified in future releases */
DATA_COMMENTS /* comments on data */
DATA_COMMENTS_TIMESTAMP /* time stamp for comments on data */
REPL_TIMESTAMP /* data modification time stamp */
PATH_NAME /* physical path name of data object */
DATA_CREATE_TIMESTAMP /* data creation time stamp */
DATA_IS_DELETED /* data liveness */
DATA_OWNER /* data creator name */
DATA_OWNER_DOMAIN /* domain of data creator */
DATA_OWNER_EMAIL /* email of data creator */
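Because ACCESS_CONSTRAINT takes one of a fixed set of strings, a client can check a value before putting it into a query predicate. The sketch below simply copies the list given above; is_supported_access_constraint is a hypothetical helper, not part of the SRB client library, and the list may differ in other releases.

```c
#include <assert.h>
#include <string.h>

/* Access constraints as listed in this document. */
static const char *const ACCESS_CONSTRAINTS[] = {
    "execute", "read audit", "read",
    "annotate audit", "annotate",
    "write audit", "write",
    "create audit", "create",
    "all audit", "all",
    "curate audit", "curate"
};

/* Return 1 if name is one of the listed access constraints, else 0.
   The comparison is exact and case-sensitive. */
int is_supported_access_constraint(const char *name)
{
    size_t i;
    size_t n = sizeof ACCESS_CONSTRAINTS / sizeof ACCESS_CONSTRAINTS[0];

    assert(name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(name, ACCESS_CONSTRAINTS[i]) == 0)
            return 1;
    }
    return 0;
}
```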
Collection Information (core meta-information about collections):
COLLECTION_NAME /* collection name in which the data resides */
PARENT_COLLECTION_NAME /* name of parent collection (15) */
ACCESS_COLLECTION_NAME /* use this as collection name for checking access to collection */
COLLECTION_ACCESS_CONSTRAINT /* access restriction on collection */
CONTAINER_FOR_COLLECTION /* default container for collection */
User Information (this contains core information about SRB-registered users):
USER_NAME /* user name */
DOMAIN_DESC /* user domain name */
USER_TYP_NAME /* user type */
USER_GROUP_NAME /* name of user group */
USER_ADDRESS /* user address */
USER_PHONE /* user phone number */
USER_EMAIL /* user email */
USER_DISTIN_NAME /* distinguished name of user (used by authentication systems) */
USER_AUTH_SCHEME /* user authentication scheme associated with the user_distin_name */
Physical Resource Information (core meta-information about physical resources including SRB servers):
SERVER_LOCATION /* location of SRB server */
SERVER_NETPREFIX /* net address of SRB server */
PHY_RSRC_NAME /* physical resource name */
PHY_RSRC_TYP_NAME /* physical resource type */
RSRC_CLASS /* classification of physical resource (inherited by logical) */
MAX_OBJ_SIZE /* maximum size of data object allowed in physical resource (not enforced by MCAT) */
PHY_RSRC_DEFAULT_PATH /* default path in physical resource */
LOCATION_NAME /* registered name of location (of resource) in MCAT */
RSRC_ADDR_NETPREFIX /* net address of resource */
RESOURCE_MAX_LATENCY /* physical resource estimated latency (max) */
RESOURCE_MIN_LATENCY /* physical resource estimated latency (min) */
RESOURCE_BANDWIDTH /* physical resource estimated bandwidth */
RESOURCE_MAX_CONCURRENCY /* physical resource maximum concurrent requests */
RESOURCE_NUM_OF_HIERARCHIES /* number of hierarchies in the physical resource */
RESOURCE_NUM_OF_STRIPES /* number of striping of data in the physical resource */
RESOURCE_CAPACITY /* capacity of the physical resource */
RESOURCE_DESCRIPTION /* comments on the resource */
Logical Resource Information (core meta-information about logical resources):
RSRC_NAME /* name of logical resource - logical resources inherit many of the attributes from the associated physical resource */
RSRC_REPL_ENUM /* index of physical rsrc in logical rsrc */
PHY_RSRC_NAME /* associated physical resource name */
RSRC_ACCESS_LIST /* access list for resource */
RSRC_TYP_NAME /* logical resource type */
RSRC_DEFAULT_PATH /* default path in logical resource */
Container Information (core meta-information about containers):
CONTAINER_NAME /* name of container - container has all the properties of a dataset */
CONTAINER_REPL_ENUM /* container copy number */
CONTAINER_MAX_SIZE /* maximum size of container */
IS_DIRTY /* data has been changed in the container compared to other copies */
OFFSET /* position of data in container */
CONTAINER_SIZE /* current size of container */
CONTAINER_RSRC_NAME /* name of physical resource of container */
CONTAINER_LOG_RSRC_NAME /* logical resource associated with a container */
CONTAINER_RSRC_CLASS /* class of physical resource associated with a container */
Ticket-based Access Control Information (information about ticket-based access control for datasets, collections as well as recursively under a collection):
TICKET_D /* identifier for ticket given for data */
TICKET_BEGIN_TIME_D /* data ticket validity start time */
TICKET_END_TIME_D /* data ticket validity end time */
TICKET_ACC_COUNT_D /* valid number of opens allowed on data ticket */
TICKET_ACC_LIST_D /* valid access allowed on data ticket (currently readonly) */
TICKET_OWNER_D /* data ticket creator */
TICKET_OWNER_DOMAIN_D /* data ticket creator domain */
TICKET_USER_D /* allowed ticket user or user group */
TICKET_USER_DOMAIN_D /* data ticket user domain */
TICKET_C /* identifier for ticket given for collection and sub collections */
TICKET_BEGIN_TIME_C /* collection ticket validity start time */
TICKET_END_TIME_C /* collection ticket validity end time */
TICKET_ACC_COUNT_C /* valid number of opens allowed on data in collection */
TICKET_ACC_LIST_C /* valid access allowed on data in collection (currently readonly) */
TICKET_OWNER_C /* collection ticket creator */
TICKET_OWNER_DOMAIN_C /* collection ticket creator domain */
TICKET_USER_C /* allowed collection ticket user */
TICKET_USER_DOMAIN_C /* collection ticket user domain */
Audit Information (audit information on users and on datasets):
AUDIT_USER /* audit user name */
AUDIT_USER_DOMAIN /* audit user domain */
USER_AUDIT_TIME_STAMP /* audit on user time stamp */
USER_AUDIT_COMMENTS /* audit on user comments */
AUDIT_ACTION_DESC /* audited action on data */
AUDIT_TIMESTAMP /* audit time stamp for data */
AUDIT_COMMENTS /* audit comments for data */
Annotations Information (core meta-information on annotating datasets; see also the ACCESS_CONSTRAINT attribute for access control):
DATA_ANNOTATION_USERNAME /* name of annotator */
DATA_ANNOTATION_USERDOMAIN /* domain of annotator */
DATA_ANNOTATION /* actual annotation string */
DATA_ANNOTATION_TIMESTAMP /* time of annotation */
DATA_ANNOTATION_POSITION /* user-defined location for the annotation */
Structured Metadata Information (users can store structured metadata, which is treated as a blob):
STRUCTURED_METADATA_TYPE /* type of user-inserted structured metadata */
STRUCTURED_METADATA_COMMENTS /* comments on the structured metadata */
STRUCTURED_METADATA_DATA_NAME /* data name of structured metadata
stored as another data object inside SRB */
STRUCTURED_METADATA_COLLNAME /* collection name of structured metadata
stored as another data object inside SRB */
INTERNAL_STRUCTURED_METADATA /* structured metadata stored as string in MCAT */
Index Information (users can index a dataset, datasets of a given type, or datasets in a collection. The index is treated as an SRB-registered dataset, so the user can download the index and search on it. The location can be collection information (i.e., the index is stored as several datasets inside a collection) or a URL. Because the index is treated as an SRB-registered dataset, it inherits all meta-information about datasets, including structured metadata, which can be used to store information about the index. See the datacutter proxy for more information):
INDEX_NAME_FOR_DATASET /* data name of index on data */
IX_COLL_NAME_FOR_DATASET /* collection name of index on data */
IX_DATATYPE_FOR_DATASET /* index type */
IX_LOCATION_FOR_DATASET /* path name of index */
INDEX_NAME_FOR_DATATYPE /* data name of index on data type */
IX_COLLNAME_FOR_DATATYPE /* collection name of index on data type */
IX_DATATYPE_FOR_DATATYPE /* index type */
IX_LOCATION_FOR_DATATYPE /* path name of index */
INDEX_NAME_FOR_COLLECTION /* data name of index on collection */
IX_COLLNAME_FOR_COLLECTION /* collection name of index on collection */
IX_DATATYPE_FOR_COLLECTION /* index type */
IX_LOCATION_FOR_COLLECTION /* path name of index */
Method Information (users can associate methods with a dataset, datasets of a given type, or datasets in a collection. The method is treated as an SRB-registered dataset and hence inherits all meta-information about datasets, including structured metadata, which can be used to store information about the arguments and method return values. See the datacutter proxy for more information):
METHOD_NAME_FOR_DATASET /* data name of method on data */
MTH_COLLNAME_FOR_DATASET /* collection name of method on data */
MTH_DATATYPE_FOR_DATASET /* method type */
METHOD_NAME_FOR_DATATYPE /* data name of method on data type */
MTH_COLLNAME_FOR_DATATYPE /* collection name of method on data type */
MTH_DATATYPE_FOR_DATATYPE /* method type */
METHOD_NAME_FOR_COLLECTION /* data name of method on collection */
MTH_COLLNAME_FOR_COLLECTION /* collection name of method on collection */
MTH_DATATYPE_FOR_COLLECTION /* method type */
Pre-Allocated User-defined Metadata Information for Datasets (MCAT has pre-defined some attributes for users to store metadata about their datasets. These metadata can be used in whatever form the user desires, including but not restricted to: user-mapped attributes, (variable,value) pairs to store an arbitrary list of metadata, small-sized structured metadata, sorted lists of values, etc. Note that the size of the strings is 350 characters.):
UDMS0 /* user-defined string metadata 0 for data */
UDMS1 /* user-defined string metadata 1 for data */
UDMS2 /* user-defined string metadata 2 for data */
UDMS3 /* user-defined string metadata 3 for data */
UDMS4 /* user-defined string metadata 4 for data */
UDMS5 /* user-defined string metadata 5 for data */
UDMS6 /* user-defined string metadata 6 for data */
UDMS7 /* user-defined string metadata 7 for data */
UDMS8 /* user-defined string metadata 8 for data */
UDMS9 /* user-defined string metadata 9 for data */
UDMI0 /* user-defined integer metadata 0 for data */
UDMI1 /* user-defined integer metadata 1 for data */
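One of the uses mentioned for the string attributes is storing (variable,value) pairs. The sketch below shows one possible way to pack such a pair into a single string while respecting the 350-character size noted above. The "variable=value" encoding and the helper pack_udms_pair are assumptions made for illustration; MCAT does not prescribe any particular format, and whether the 350 characters include a terminator is not stated here (the sketch assumes they do not).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define UDMS_MAX_CHARS 350   /* string size stated in this document */

/* Pack "name=value" into buf. Returns 0 on success, or -1 if the name
   contains the '=' separator, the pair exceeds UDMS_MAX_CHARS, or buf
   is too small to hold the pair plus the terminating NUL. */
int pack_udms_pair(char *buf, size_t bufSize,
                   const char *name, const char *value)
{
    size_t len;

    assert(buf != NULL && name != NULL && value != NULL);
    if (strchr(name, '=') != NULL)
        return -1;
    len = strlen(name) + 1 + strlen(value);
    if (len > UDMS_MAX_CHARS || len + 1 > bufSize)
        return -1;
    sprintf(buf, "%s=%s", name, value);
    return 0;
}
```

The packed string could then be stored in one of the UDMS attributes, and a query could match it with a like predicate on the variable name prefix.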
Pre-Allocated User-defined Metadata Information for Collections:
UDSMD_COLL0 /* user-defined string metadata 0 for collection */
UDSMD_COLL1 /* user-defined string metadata 1 for collection */
UDSMD_COLL2 /* user-defined string metadata 2 for collection */
UDSMD_COLL3 /* user-defined string metadata 3 for collection */
UDSMD_COLL4 /* user-defined string metadata 4 for collection */
UDSMD_COLL5 /* user-defined string metadata 5 for collection */
UDSMD_COLL6 /* user-defined string metadata 6 for collection */
UDSMD_COLL7 /* user-defined string metadata 7 for collection */
UDSMD_COLL8 /* user-defined string metadata 8 for collection */
UDSMD_COLL9 /* user-defined string metadata 9 for collection */
UDIMD_COLL0 /* user-defined integer metadata 0 for collection */
UDIMD_COLL1 /* user-defined integer metadata 1 for collection */
Dublin Core Metadata for Datasets (for more information please check http://www.dublincore.org/; this set of metadata, even though part of the MCAT core, is normally turned off in order to speed up processing, and patches need to be applied if this option is to be used):
DC_DATA_NAME /* Dublin Core Data Name same as DATA_NAME */
DC_COLLECTION /* DC: Collection Name same as COLLECTION_NAME */
DC_CONTRIBUTOR_TYPE /* DC: Contributor Type: Eg. Author, Illustrator */
DC_SUBJECT_CLASS /* DC: Subject Classification */
DC_DESCRIPTION_TYPE /* DC: Type of Description */
DC_TYPE /* DC: Type of the Object */
DC_SOURCE_TYPE /* DC: Type of the Source */
DC_LANGUAGE /* DC: Language of the Object */
DC_RELATION_TYPE /* DC: Relation with another Object in (170,171) */
DC_COVERAGE_TYPE /* DC: Coverage Type */
DC_RIGHTS_TYPE /* DC: Rights Type */
DC_TITLE /* DC: Title of the Object */
DC_CONTRIBUTOR_NAME /* DC: Contributor Name. NOT same as (7) */
DC_CONTRIBUTOR_ADDR /* DC: Contributor Address */
DC_CONTRIBUTOR_EMAIL /* DC: Contributor Email */
DC_CONTRIBUTOR_PHONE /* DC: Contributor Phone */
DC_CONTRIBUTOR_WEB /* DC: Contributor Web Address */
DC_CONTRIBUTOR_CORPNAME /* DC: Contributor Affiliation */
DC_SUBJECT_NAME /* DC: Subject */
DC_DESCRIPTION /* DC: Description */
DC_PUBLISHER /* DC: Publisher Name */
DC_PUBLISHER_ADDR /* DC: Publisher Address */
DC_SOURCE /* DC: Source Name */
DC_RELATED_DATA_DESCR /* DC: Related Data Description */
DC_RELATED_DATA /* DC: Date Related to (152,153) */
DC_RELATED_COLL /* DC: */
DC_COVERAGE /* DC: Coverage Information */
DC_RIGHTS /* DC: Rights Information */


