query

NAME

query -- Perform an arbitrary query on a cdbs database table

USAGE

query table columns qual

DESCRIPTION

This task returns a text file containing the result of a database query. The user specifies the name of one or more Calibration Database (CDBS) tables, the columns to display from the table, and a boolean expression used to select information from the table. The variables in the boolean expression, or qualifier, are column names in the CDBS table. A row is retrieved fromthe database table whenever the column values in that row make the qualifier true. Operators have two forms, a Fortran and an SQL form. These two forms can be used interchangably. The following is a list of the boolean operators in both forms:

equal	.eq.	=	not equal	.ne.	!=
less than	.lt.	<	greater than	.gt.	>
less than or equal	.le.	<=	greater than or equal	.ge.	>=
and	.and.	and	or	.or.	or
negation	.not.	not	wildcard match		like

The wildcard match operator, like, has no corresponding Fortran form. It allows you to do pattern matching using the wildcard characters * and ?. The * character matches any number of characters and the ? character matches a single character. The like operator is asymmetric, the argument on the left must be the column name and the argument on the right must be a quoted string containing wildcard characters. Please note that you cannot use wildcard characters with the equality operator, .eq., they will be interpreted as ordinary characters.

The columns parameter contains a list of the data base table columns that are to be retrieved. The items in the list may contain the usual pattern matching metacharacters (i.e., the wildcard characters * and ?). If the list is blank (the default value), all columns will be retrieved. If the first character in the list is the negation character (~), then only those columns that are not in the list will be retrieved from the data base.

More than one data base table can be queried simultaneously by passing more than one name in the first parameter. When more than one data base table is being queried, all combinations of rows in the tables are tested and those combinations for which the qualifier is true will be returned. Column names in the qualifier must be preceded by the table alias followed by a period. The table alias is a single character whose value depends on the position of the table name in list. For example, the second database table in the list of table names would have the alias b. If the qualifier contained an expression involving the column version from the second table, it would be referred to as b.version. Column names referred to in the second parameter may optionally be preceded by the table alias followed by a period. If a column name is not preceded by a table alias, then that column will be retrieved from all tables in which it is present, or excluded if the negation character precedes the list of column names. If the second parameter is blank, then every column from every table will be retrieved.

PARAMETERS

+table [string] Name of the CDBS table from which data will be retrieved. If data are retrieved from more than one parameter table, then table names are separated by commas. Table aliases will be assigned to tables in the order in which they are in the list. For example, the first table has the alias 'a', the second has the alias 'b', and so on.
columns[string] A list of column names to be retrieved from the data base. The order of names in the list does not affect their order in the output file--the output file is always organized in the same order as the source data base. Column names in the list are separated by commas. If the list is blank, then all columns will be retrieved from all tables named by 'params'. This parameter can include IRAF pattern matching metacharacters (i.e., the wildcard characters * and ?), in which case all column names that match the pattern will be retrieved. If the column name list is preceded by a @, it is taken to be a file containing the column names. Columns in the file may be separated by commas or whitespace.
qual [string] A boolean expression used to select rows from the data base table. This expression can involve either Fortran relational operators, or their SQL equivalents. If data are retrieved from more than one data base table, the column names in the expression should be preceded by the table alias, followed by a period. If the qualifier is too long to fit (63 characters maximum), the qualifier can be placed in a file. In this case, the value of qual is the file name preceded by a @ character.
(dbase = "") [string] The name of the database containing the table. If left blank, the database name defaults to the value of the CDBS_DB environment variable, or if that is not found, to cdbs_ops.
(unique = no) [bool] If this parameter is set to yes, duplicate rows in the output will be deleted.
(title = yes) [bool] If this parameter is set to yes, the output table will have a title consisting of the table column names.
(ucase = no) [bool] If this parameter is set to yes, the output will be printed in upper case. If this parameter is no, output will be printed in lower case.
(text = yes) [bool] If this parameter is set to yes (the default for command line mode), the output table will be an ascii file. If it is no (the default for cgi script mode), the output table will be an html file.

DATA FILES

Query uses two html template files to produce its html output. The names of the files it uses are qrysearch.html and qryresult.html. These contain the html templates to build the search and result screens, respectively.

The text in the html templates is copied into getref's output except where variables appear in the template. These variables are replaced by text generated by getref. Variables in the template file start with the character string . In between the starting and ending string is the variable name. The variable thus looks like an html comment to an html browser, so the template can be checked for style and content without running the task.

The task supports four variables, output, toplink, txtlink, and taskurl. The variable names and meanings are the same as those used by getref, except that query does not support the sumlink variable. For further information on the meaning of these variables, read the getref documentation.

ALGORITHM

The main procedure for query is task, whcih reads any task arguments into the proper parameters, calls form_input, which reads any cgi input, calls query, which takes the task parameters and uses them to perform a database query, and finally calls either qryout_html or qryout_text to produce the task output.

The design of form_input is discussed in the documentation of getref.

The query function fills in a sql query template, executes the query, and saves the results in a load table. The major portion of the work is in assempling the strings to be inserted in the template. The first portion of the work is getting the list of column names, which is done by get_collist. Getting the list of column names is non-trivial because the user may use wild cards or the negation symbol. The column names returned by get_collist are qualified by the table alias and strip_collist removes them to form unqualified names. The where clause is converted from Fortran syntax to sql by ftn_to_sql. After all these pieces of the sql query are built, they are assembled into the query by build_qry.

The function get_collist loops over all tables that are being searched. The function read_dbcols returns all the database columns from a table and match_dbcols finds thos that match the list of column names the user specified. The function read_dbcols does a query against the system tables sysobjects and syscolumns to retrieve all the columns in the table. Columns that are not convertable to strings (most notably text columns) are not retrieved. The function match_dbcols loops over all column names (which may contain wild cards) in the user's column list and tries to match the pattern against the column names returned by get_collist by using the pattern matching function patmatch. The result is stored in an array of flags corresponding to the table columns. If the negate flag is set, the array is inverted. Finally, the aolumn names are copied from the list of table columns to the output list when the corresponding flag is true.

The function ftn_to_sql converts a query qualifier using Fortran syntax to an equivalent qualifier using sql syntax. The function reads a token at a time from the qualifier by calling qrylex_next. The type of the token is returned as a function value. When no more tokens remain, the function returns. Quoted strings are converted to upper case and if the string follows a like operator, wild card characters are converted to their sql equivalents. Numeric constants are copied as is. Identifiers (either sql keywords or table column names) are prefixed by the table alias a if they are unqualified table column names. Symbols and operators are converted from their Fortran to sql form. The converted (or unconverted) tokens are copied to the output buffer, each token separated from the previous by a blank.

The function qryout_html uses an html template and a load table to create an html file. It calls html_template, which prints the content of the template until it encounters a variable in the template. At that point, it returns the name of the variable. Qryout_html prints the text corresponding to the variable name. It recognizes four variables: output, which prints the load table as an html table, toplink prints out a link to the search form, taskurl prints out the bare url of the search form, and txtlink prints a link that performs the current query, but outputs the results as an ascii file. All these variables are fairly trivial, except for txtlink. It calls the function each_option to retrieve each of the parameters passed to the task. If the parameter is not in a list of omitted parameters, it is encoded by the function cgi_encode and appended to the url.

The function (qryout_text) produces ascii output from a load table. The chief complications are determining the width of each column in the output and where rows should be folded so that the width of the output does not exceed 80 characters. The function computes the length of the longest string in each column to determine the column width. It the sums the column widths and when the sum exceeds the maximum value, marks the previous column as the last in the line by setting its width to a negative number. Then when printing the table, the function checks to see if a column has a negative width and if so, follows it with a backslash and a newline.

INSTALLATION DEPENDENCIES

The names of the three html template files are defined in datafile.h. The filenames use the DATAFILE macro, defined in system.h to specify the directory name containing the template files. The url used to invoke the search screen is defined by the QUERYURL macro defined in datafile.h. The query.cgi shell script in the util directory, used to run query from a cgi script script, contains system dependent environment variables and must be edited and copied to the cgi-bin directory on the system.

This task uses four environment variables to control its access to the database. CDBS_SERVER sets the database server the task accesses. CDBS_DB sets the name of the cdbs database that is used by the task. DADS_DB sets the name of the dads database that is used by the task. And CDBS_USER sets the user name used to access the database. If these environment variables are not set, they default to ZEPPO, cdbs_ops, dads_ops, and a public cdbs account.

In addition to these environment variables that control cdbs access, the stdb library that this task is linked with requires the two environment variables SYBASE and ACAREA be set. These variables define the location of the Sybase and stdb executables and libraries.

EXAMPLES

To retrieve all reference files whose useafter date is later than January 1, 1998, run the following command:

query wfpc2_file file_name 'useafter_date .gt. "Jan 1 1998"'

To perform the same query, but also retrieve the pedigree from the row level table, run the following command:

query wfpc2_file,wfpc2_row .file_name,pedigree \ 'a.file_name .eq. b.file_name .and. a.useafter_date .gt. "Jan 1 1998"'

Back to index of cdbs tasks.