SRB 3.4.0

This document describes changes for SRB 3.4, released 
October 31, 2005.

Please note that 3.4 servers can be used with 3.3 clients, but 3.3
servers are not 100% compatible with 3.4 clients: we have added some
new client APIs such as srbExecCommandC() and some new options such as
bulk registration into a compound resource.  The 3.4 server remains
backward compatible with the 3.3 client protocol.

Major new features:

** Master/Slave MCAT **

An SRB federation (zone) can be configured to run with a single Master
MCAT plus zero or more Slave MCATs. The purpose of the Slave MCATs is
to improve responsiveness across a wide-area network. The Slave MCATs
are used for "read only" type queries. The following Scommands have
been converted to use the Slave MCAT by default:

Scat, SgetColl, Sget, SgetD, SgetR, SgetU, Sls, Slscont and Stoken.

The -X option or the environment variable "masterMcat" can be used to
force the query to be run on the Master MCAT instead. Please read the
readme file in readme.dir/README.MasterSlaveMcat for more information.
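
As a rough illustration of the selection rule described above, the
following Python sketch decides whether a query should go to the
Master MCAT.  The function name use_master_mcat is hypothetical, and
the sketch assumes that the mere presence of the "masterMcat"
environment variable is enough to force the Master; the actual
semantics are described in README.MasterSlaveMcat.

```python
import os


def use_master_mcat(argv, environ=os.environ):
    """Return True if a read-only query should be sent to the Master MCAT.

    Assumption (not confirmed by the release notes): any value of the
    masterMcat environment variable forces the Master MCAT.
    """
    return "-X" in argv or "masterMcat" in environ


if __name__ == "__main__":
    # Default case: neither -X nor masterMcat, so a Slave MCAT may be used.
    print(use_master_mcat(["Sls"], {}))
    # -X on the command line forces the Master MCAT.
    print(use_master_mcat(["Sls", "-X"], {}))
```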


** A prototype integration of HDF5 and SRB **

HDF5 is a general-purpose library and file format for storing
scientific data developed by NCSA. In the past year, NCSA and SDSC
collaborated to provide efficient access to objects in HDF5 files
stored in the SRB.  A prototype was developed, demonstrating the
feasibility of this approach, and showing that significant performance
gains can be achieved for clients that need to access only parts of a
file, such as individual objects, subsets of large arrays, or
metadata.

The integration was done by adding the HDF5 library to the server so
that HDF5 functions can be run directly where the data is stored.  A
set of HDF5-specific thin client APIs was provided for accessing HDF5
data stored in SRB. A user guide is included in the release, in
readme.dir/HDF-SRB-UG.pdf.


** SRB GridStatus, a monitoring and alert system **

GridStatus can monitor one or more SRB grids. All you need to do is
add connection parameters for one SRB account per grid and it will do
the rest. It will discover all of the hosts and resources that are
a part of the grid and begin monitoring them. It will send email alerts
if something goes down or if a resource goes above 90% utilization,
and it uses an optional MySQL database to store all downtime
information.  Package: admin/GridStatus


** More extensive pre-release (QA) testing **

The 3.4 release has gone through much more rigorous Quality
Assurance-type testing than previous releases, due in large part to a
number of extensions to our automatic testing environment and scripts.

We wish to thank Adil Hasan of the UK E-Science Data Management group
for providing a set of Python Scommand test scripts (see our
contributed software page), which are now run as part of this testing.

A large set of tests is run continually on four hosts under our
tinderbox system (available from the SRB home page), which now includes
a Solaris system and a configuration of two cooperating hosts to
create and use network-based resources.  Also see bugzilla item 91
below.



Most of the other new features and bug fixes are bugzilla items
(the item number is listed at the beginning of each entry below).
Please check the SRB bugzilla system for more information.

New features and bug fixes:

44 - SPCommand / Proxy can not be used across firewall.  Client
initiated proxy command and API. To get around firewall issues on the
client side, a "-c" option has been added to Spcommand for
client-initiated connections. The default is a server-initiated
connection.  A new API, srbExecCommandC(), has been added for issuing
client-initiated proxy operations.

63 - to disconnect from database when a spawned srbServer is idle.

65 - handle port scans better.

72 - Rare and intermittent Sput -b data corruption on a Mac

73 - Rare and intermittent Sput -b file loss

75 - SgetR crashes with: glibc detected free(): invalid pointer:

78 - Have password to be prompted on Sinit instead of from MdasAuth.
If no .MdasAuth or .srbAuthFile is available, Sinit will prompt for
the user password and create a temporary scrambled password file (like
Sauth), which Sexit will delete (if temporary).

80 - srbLog should not be purged but stored by date. The srbLog files
will now be stored in data/log/log.mm.dd.yy where: mm = month, dd =
day, yy = year.  The "logfileInt" parameter in bin/runsrb script can
be used to specify the interval in days for switching to a new
logfile. The default interval is 5 days.
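
The naming scheme above can be sketched in Python as follows.  This is
only an illustration: the function names are hypothetical, and the
assumption that each log file is named after the first day of its
interval is ours, not something the release notes state.

```python
from datetime import date, timedelta


def srb_log_name(day: date) -> str:
    """Format a log file name as log.mm.dd.yy (two digits per field)."""
    return day.strftime("log.%m.%d.%y")


def current_log_name(start: date, today: date, interval_days: int = 5) -> str:
    """Pick the log file in use on `today`.

    Assumption: a new file is started every `interval_days` days,
    counted from `start`, and is named after the first day of that
    interval.
    """
    elapsed_intervals = (today - start).days // interval_days
    first_day = start + timedelta(days=elapsed_intervals * interval_days)
    return srb_log_name(first_day)


if __name__ == "__main__":
    print("data/log/" + srb_log_name(date(2005, 10, 31)))
```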

81 - SRB web-page, broken link

85 - Schmod -i seems to require -r

86 - Schmod -i -r doesn't actually recurse

91 - Need a much more extensive set of test (QA) scripts.  Many
extensions were made to our automatic testing system, including the
integration of an additional set of tests provided by Adil Hasan of
the UK E-Science Data Management group (see our contributed software
page).

102 - add a function - Change resourceName

105 - problem with accentuated letter.  Collection names and data
names with single-quote characters are now handled.  It took a lot of
debugging (more than 3 weeks) to get this working for all SRB
operations.  User-defined metadata values containing single quotes are
double quoted before being inserted, and the same is done when
querying.  This work also contributed to solving bug 148.
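
A minimal Python sketch of this kind of quote handling is shown below.
It assumes that the quoting described above means the standard SQL
convention of doubling each single quote inside a string literal; the
function names are hypothetical and this is not the SRB code itself.

```python
def double_single_quotes(value: str) -> str:
    """Double every single quote, as standard SQL string literals require."""
    return value.replace("'", "''")


def sql_literal(value: str) -> str:
    """Wrap a value in single quotes after escaping embedded quotes."""
    return "'" + double_single_quotes(value) + "'"


if __name__ == "__main__":
    # A metadata value containing an apostrophe, safe to embed in a query.
    print(sql_literal("Adil's test data"))
```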

106 - Sufmeta -R -c option returns COLLECTION_NOT_IN_CAT

108 - proxy command arguments cannot contain blanks

111 - After a while, SRBServers fail trying to open mcatHost

112 - jargon file transfer problems

113 - install.pl should detect, explain, and quit on 64-bit hosts

115 - perl rindex problem on some hosts causing install.pl failure

116 - Build assumes GNU make is installed as "gmake".  The build
system will now check for gmake and use it if it is available, and use
'make' otherwise (assuming that 'make' is GNU make).
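
The selection logic can be illustrated with the short Python sketch
below.  It is not the actual build-system code; pick_make is a
hypothetical name, and the lookup function is passed in as a parameter
only so the behaviour can be exercised without touching the real PATH.

```python
import shutil


def pick_make(which=shutil.which) -> str:
    """Return "gmake" if it is found on PATH, otherwise fall back to "make".

    As in the release note, the fallback simply assumes that "make" is
    GNU make; nothing here verifies that.
    """
    return "gmake" if which("gmake") else "make"


if __name__ == "__main__":
    print(pick_make())
```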

121 - Java Admin tool should handle more SmodR functions.  A new set
of miscellaneous resource operations has been added.

125 - When doing Sget of a file, if the local directory is full, a
better error message is needed.

127 - Configuration check for Globus flavors is arbitrary

128 - Build failure, PPC MacOS X, when using GSI

129 - Physical Move fails for non-admin users

130 - Build failure, AIX (libtool problem?)

135 - In link commands, use "-L/dir", not "-L /dir"

136 - Sls -R "DATA_CHESUM='0'" does not work

137 - Array indexing error in clStub.c

138 - srbMaster crashes on startup

139 - jargon intermittent bulkload error on Linux

141 - auth_scheme should be written, esp if env var mdasEnvFile set

143 - GSI-enabled server gets inconsistent user_id from Sput/Sget
      This could result in a NO_ACCESS error.

145 - logEval.c does not build on latest OS X (8.0.0)

146 - MCAT not handling bulk load properly if a file already exists

147 - annotations of previous same-name collections reappear

148 - Apostrophes in metadata

149 - Add ability to modify resource/location netprefix

152 - Add a new input parameter- newPathName to srbObjCopy

153 - SgetD misleading answer

154/155 - Sbload command problems in SRB 3.3.1.  Sbload seg fault.
The mcatHost parameter needed to be initialized before it was used.

156 - Add federation(s) to attributes kept by the ZoneAuthority

158 - SgetR -l no longer works

159 - problem deleting DN strings for a given user

161 - SmodColl -d and then -c failure; perhaps OS X specific

162 - Srsync failed if user only have read permission

163 - mcatAdmin, refreshed windows will revert if closed and reopened

164 - add srbPort to install.pl

165 - Scommands fail to build on some Macs.  There was a fatal error
msg in commExtern.h.  Later, we found this happening in other systems
too (Linux) and this fix should correct that too.

168 - SERVER_DN is not read when mdasEnvFile is used

169 - Rare and intermittent Sphysmove problem on a Linux host

170 - mcatAdmin.jar hang

172 - Sphymove -P option does not seem to work, should it be removed?
      The problem was that the input path was not passed along.

173 - Sput -b with invalid directoryname segfaults

174 - Ssh dies with segmentation fault

176 - Apostrophe in filename breaks download on SRB 3.3.1


180 - Sauth scrambled password failures

184 - Scp -b -r does not work on OS X using CVS version

185 - Compilation failure under gcc 4 (Fedora Core 4)
  
186 - Very slow operations with Oracle.  Srsync, "Sget -b", Sphymove
and "Scp -b" operations could become very slow, taking up to 10
minutes, when using Oracle for the MCAT.  The problem was traced to a
bug in Oracle involving the ESCAPE character that can cause the query
to be extremely slow.  This bug was fixed in Oracle 10.2.0.1, but SRB
now also works around it by not using the ESCAPE character in these
queries.  This problem does not exist for other DBMSes.

187 - Sreplicate broken pipe to server.  Fixed a couple of obscure
problems that could cause server-to-server connections to fail.  One is
specific to certain versions of Linux.  The other occurred when
servers were restarted.  For the Linux solution, the code now uses
both gethostbyname_r (reentrant) and gethostbyname and checks for the
type of failure that each can return (depending on the Linux version).
 
188 - Srsync -a issues.  There were some cases where Srsync -a would
not update all replicas properly.

189 - lowLevelClose error from autotest.sh 9

na - File sizes of containers and files in containers.  The maximum
size of a container has been increased from 2 Gbytes to 200 Gbytes.
The maximum size of a file that can be stored in a container has also
been increased from 2 Gbytes to 200 Gbytes.

na - Bulk Scp into container. "Scp -bc container" now works and can be
used to recursively copy a whole collection into a container.

na - Sbkupsrb - added -v option for verbose mode.

na - Sbkupsrb and Srsync - continue operation even though one or more
errors have occurred.

na - Sregister - added "-C" option for registering files into compound
resource.

na - Sbregister - Allow bulk registration into compound resource with
the "-f" option.

na - The handling of srbBulkUnload() has been redone so that it
downloads inContainer files in addition to normal files; a separate
download for inContainer files is no longer needed.

na - Fix a problem with "Sput -n" where the copy number is ignored when it
is used with the m/M option.

na - Fix a problem where Scp was not working properly when the source
files have replicas.

na - Fix a problem where the "Sget -n" option may download the wrong copy.

na - Fix a problem that Sget may incorrectly think the size downloaded
is wrong with an OBJ_ERR_COPY_LEN error if the source file contains
multiple copies of different sizes.

na - Resource access permission was not checked when the request came
from a foreign zone. The problem has been fixed.
 
na - Include the value of errno in the Sget error message to make it
easier to identify the cause of some problems (bug 125).  One example is
when the local file system is running out of space.

na - Spcommand will now accept an argument containing space characters.

na - For overwriting SRB files, change the size to zero before the
overwrite so that if the write fails in the middle, the registered
file size will be zero instead of some undetermined value.
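
The effect of this ordering can be shown with a small in-memory Python
sketch.  The dictionaries standing in for the catalog and the storage,
and the function name overwrite_file, are purely hypothetical; the
point is only that the registered size is reset before any data is
written and updated again only after the write completes.

```python
def overwrite_file(catalog, storage, name, chunks):
    """Overwrite `name`, keeping the registered size well defined on failure.

    catalog: dict mapping file name -> registered size (stand-in for MCAT)
    storage: dict mapping file name -> bytearray (stand-in for the resource)
    chunks:  iterable of bytes objects; iterating may raise mid-transfer
    """
    catalog[name] = 0              # register size zero before writing
    data = bytearray()
    storage[name] = data
    for chunk in chunks:           # an exception here leaves the size at zero
        data.extend(chunk)
    catalog[name] = len(data)      # record the real size only on success


def failing_chunks():
    """Yield one chunk and then fail, simulating an interrupted write."""
    yield b"partial"
    raise IOError("simulated write failure")


if __name__ == "__main__":
    catalog, storage = {"f": 1234}, {"f": bytearray(b"old")}
    try:
        overwrite_file(catalog, storage, "f", failing_chunks())
    except IOError:
        pass
    print(catalog["f"])  # registered size after the failed overwrite
```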

na - Take out the size verification after file transfer for ADS type
files because ADS does not support the stat() call.

na - Fix a problem where Sput/Srsync print a bogus
COLLECTION_NOT_IN_CAT error when the collection does not exist.

na - Fix a problem with Sphymove of a single file into container.