Linux certification coordination group meeting 31.10.02
=======================================================

Present:
  ALICE:         Fons Rademakers
  ATLAS:         Bruce Barnett, Guenter Duckeck
  CLUG:          Helge Meinhard
  CMS:           Eric Cano, Stephan Wynhoff, Lassi Tuura
  IT-API:        Andreas Pfeiffer
  IT-FIO:        Thorsten Kleinwort
  IT-catchall:   Jan Iven
  OpenLAB:       Sverre Jarp
  LHCb:          Marco Cattaneo, Sebastian Ponce
  non-LHC exp.:  Benigno Gobbo
  PS-CO:         Nicolas de Metz-Noblat

Excused:
  Alastair Bland (SL-CO)
  Jarek Polok (Desktop)

The meeting started with a round-the-table list of the different
compiler requirements:

Nicolas: depends on Oracle libraries: for Oracle 9 (preferred, but no
production version yet) they need 2.95 or 2.96, for Oracle 8 they would
need 2.91. Tight time constraints: need to deploy in February at the
latest.

Andreas: will follow requirements from the experiments, but would like
to move to 3.2. Previous problems with Objectivity have disappeared
together with it. NAG seems to work fine with either 2.95 or 3.2.
OpenInventor libraries (compiled with 3.1) seem to work with 3.2.

Guenter: ATLAS will need 2.95, doesn't care for 2.96, is testing 3.2.
Dependencies are being cleared up at the moment, but are expected to be
close to LHCb.

Fons: ALICE has successfully tested 2.95, 2.96, 3.2 (and Intel
compiler), ROOT can be made available for all these compilers.

Stephan: CMS is investigating dependencies with users. ANAPHE, GEANT4,
ROOT moved to optional as they change more frequently, hence decoupled
from the Linux certification. Want 3.2, no interest in 2.96. Need to
run in production in July 03, hence test environment required early
2003. CMS software tested to some extent under 3.2, no problem so far.
It is not clear yet whether 2.95 will be required for production on
7.x; so far, 2.95 is only used under RH6.

Sverre: Listening, no external constraints from OpenLab

Thorsten: Listening

Marco: Productions planned for December 02 and February 03, based on
2.95.2 with current versions of external libraries.
Efforts going on in parallel on porting to gcc 3.2, Gaudi being worked
on now, reconstruction software a little later. Expect full switch to
3.2 in June 03.

Benigno: DELPHI tested gcc 3.2, running fine, but still checking
consistency of physics results. OPAL has not migrated out of FATMEN and
has the manpower to support FATMEN and remove tms for end 2003, still
needs GPHIGS. COMPASS still depends on Objectivity, i.e. 2.95.2 under
RH6. Migration to ORACLE is expected to be over in March 03.
Dependencies on DATE, ROOT, ANAPHE (CLHEP), CERNLIB (math functions)
are all expected not to be a problem for RH 7.3 once the Objectivity
dependency is gone. Will test 3.2 against ORACLE.

Bruce: ATLAS Online HLT depends on offline packages (e.g. GAUDI),
inheriting its dependencies. Backend software currently compiles on
2.95.2 and 2.96; next version (due in some weeks) will be only 2.95.2.
Future versions will be ported to 3.2, but no date yet. Some
dependencies on external packages, e.g. Together. Working group in
ATLAS Online has been established, meeting next week.

Eric: CMS online (hardware drivers) are using 2.96 (kernel compiler),
control framework will be ported to 3.2.

Helge: Listening

Jan: IT issue is how to provide the different compilers. RPM-based ASIS
cannot provide two compiler versions at a time, requires some work.
First step is to re-compile the alternative compilers on 7.3, and then
to provide a working environment for both at a time. Mostly interested
in 3.2.1 because of platforms other than Linux. ORACLE is certifying
its libraries against RH 7.1 and Red Hat Advanced Server (AES) only; no
formal support for anything else, including client side.

Summary:
========

* 2.95.2 needs to be available on 7.3.1, including physics software
  libraries
* 3.2(.1) needs to be available as soon as possible on 7.3.1; physics
  libraries must be made available by spring (one-to-one agreement with
  library providers, not part of the 7.3.1 certification)

Timeline:
---------

Still trying to certify by November 15th; natively-compiled libraries
should be available by then, and initial tests of physics code look ok.
ASIS compiler installation/binutils fixing problems could cause a
delay.

Next meeting:
-------------

November 14th, 10:00. Room to be announced.
Agenda: (hopefully) to declare the 7.3.1 certification to be over.

Further discussion:
-------------------

* 8.0: CMS would like to see an early beta for 8.0, others either don't
  care for the platform or would like stability. Consistent with the
  CLUG recommendation, there will be no cluster running 8.0 in Spring
  03. An "early adopters" test machine will be provided after 7.3
  certification is over, mostly to see whether to expect porting
  problems between ASIS gcc-3.2.1 and RedHat gcc 3.2. CLUG-agreed
  timeline for 8.x (autumn 2003) is still valid, assuming that 3.2
  works as expected on 7.3.1.
* Intel compiler is interesting, but will not be part of the 7.3.1
  certification. One-to-one agreements need to be reached between
  library providers and users.
* non-system compiler needs to be in a non-standard path, to minimize
  user errors
* multiple compiler version problem: moving from ASIS gcc-3.2 to RedHat
  "gcc3" is not easily possible -- no more updates to gcc3 in 7.x. ASIS
  should be able to solve this problem (old version could do it)
* 2.95.2 may need an older version of binutils (ld) on 7.3, to be
  verified with the natively compiled version. If so, this version
  should be in the same path as 2.95.2. May require fancy bootstrapping
  for gcc (unlikely).
* 3.2 may need a newer version of binutils (debugging), similar
  workaround with extra binaries in path.
* Stephan: where to store knowledge for such workarounds? Web page will
  lead to inconsistencies over time. Proposal: attach info to test case
  -- either workaround is obsolete, or broken test case will contain
  the knowledge how to fix. Such things should be sent to the
  certification mailing list as well.
* JAVA native interface: creates dependency between the (unknown)
  compiler used to compile JAVA and user code. Probably a bad idea to
  use C++ in this code.
  [ Aside: check copyright notices in ASIS products such as Java or
  acroread ]
* Compatibility: expect 7.2 code to run on 7.3, not vice versa.
  Cross-distribution compatibility is used in ATLAS DC, but is not
  guaranteed (ATLAS runs checks, and mostly does not use C++)
* ASIS: using the system RPM DB seems to make queries slower. For
  stability in case of network trouble, shared libraries on AFS should
  disappear, as well as the /etc/ld.so.conf entry to /usr/local/lib
  (Thorsten: PLUS/BATCH on 7.x tend to have everything on local disk,
  except TeX)
* 7.3 interactive behaviour seems to be worse than 6.1 under heavy
  load. Difficult to assess, kernel is still changing and seemingly
  small modifications may have a large influence on responsiveness or
  throughput.
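
Illustration of the compiler-path items under "Further discussion"
(non-system compilers in a non-standard path, with any matching
binutils in the same directory): a minimal Python sketch, not an agreed
implementation. The install prefixes below are hypothetical; the
minutes do not say where ASIS would place the compilers.

```python
import os

# Hypothetical install prefixes (assumption, not from the minutes).
COMPILER_PREFIXES = {
    "2.95.2": "/opt/gcc-2.95.2",
    "3.2.1": "/opt/gcc-3.2.1",
}


def compiler_env(version, base_env=None):
    """Return a copy of the environment with the chosen compiler's bin
    directory first in PATH, so that gcc and any binutils (ld, as)
    installed alongside it are found before the system ones."""
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = COMPILER_PREFIXES[version] + "/bin"
    old_path = env.get("PATH", "")
    # Drop empty entries and any earlier entry for the same directory.
    parts = [p for p in old_path.split(os.pathsep) if p and p != bin_dir]
    env["PATH"] = os.pathsep.join([bin_dir] + parts)
    return env
```

The returned mapping can be passed as the env argument of
subprocess.run. The system compiler stays the default for anyone who
does not ask for an alternative, which is the point of keeping the
other compilers outside the standard path.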