Blog

We plan to take our enterprise directory service, commonly known as "LDAP", down at 10 pm on Saturday, March 9 for a 30-minute maintenance period. This will impact all authentications to Lab systems, including cloud-based services like Google. It will also impact access to the new One-Time Password (OTP) service that is being gradually phased in by IT staff.

One of our goals is to reintroduce password expiration. Lab employees who have not changed their password within the past several months will be put on a staggered expiration schedule, with appropriate notifications, during April and May. If the notifications have not been acted upon, passwords will be set to expire on Tuesdays through Thursdays, workdays when the IT Help Desk can assist if needed.
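
The Tuesday-through-Thursday rule can be sketched in a few lines of Python. This is only an illustration of the scheduling idea, not the actual expiration tooling: the function name `next_expiry_day` is hypothetical, and the sketch ignores holidays and the notification steps.

```python
from datetime import date, timedelta

# date.weekday() counts Monday as 0, so Tuesday-Thursday are 1, 2, 3
HELP_DESK_DAYS = {1, 2, 3}


def next_expiry_day(candidate: date) -> date:
    """Return candidate if it is a Tuesday-Thursday, else the next such day."""
    day = candidate
    while day.weekday() not in HELP_DESK_DAYS:
        day += timedelta(days=1)
    return day


# Friday, April 5, 2013 rolls forward to Tuesday, April 9, 2013
print(next_expiry_day(date(2013, 4, 5)))
```

A password whose nominal expiration lands on a Friday, weekend, or Monday is simply pushed to the following Tuesday under this assumption.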

This will not impact Mobile phone access to Google mail or calendar.

For the past couple of years, Manymoon, an early Google Marketplace tool, has been available through the "More" menu when logged into any of the core Google Applications.

We never found a great business need for this tool. Now that the company has been acquired by Salesforce and the product has been converted to "DO", we feel the time has come to remove it from our portfolio of applications. According to Manymoon support staff, individual users can still use the free version by going directly to the company's website, but as of Friday, March 8, it will no longer be part of our environment.

IT is pleased to host a special two-hour course created just for
Berkeley Lab scientists and program managers, designed to help you get
the most out of your software development projects and avoid common
pitfalls. The course will take place Monday, March 4 from 2-4 pm in
Perseverance Hall.

Whether you're in an early stage, or already responsible for a large
development effort within your area, this course will share concepts,
tips, and horror stories that will hopefully help you to give your
projects the best chance of success.  The focus of the course is on
managing software development projects for science, not actual coding
or development.

The course will be taught by Greg Wilson, founder of the nonprofit
Software Carpentry Foundation, which works to help scientists grow
their computational skills. Greg has decades of experience helping
scientists and coding teams be successful.

The course was designed with input from LBNL IT and NERSC and informed
by Greg's many years of experience.  There is no charge for this
course, which is provided as part of IT's commitment to help LBNL
researchers compute and collaborate more effectively.

Please register for the course here: http://go.lbl.gov/sc-register

Details:
When:  Monday, March 4, 2013
Where: 54-130 (PERS HALL)
Time:  2:00-4:00 PM

Maintenance on UPS May Impact Collaboration and File/Print Services

The IT Infrastructure Dept has scheduled a maintenance window to replace an aging UPS with a new unit. This work may impact some IT services. The UPS supports a network file system (NetApp) that is used by a variety of collaboration and File/Print Services.

Services that rely on this file system may not be available on Wednesday, February 13 from 6:30 PM to 7:30 PM. 

The file system has dual power supplies and only one side will be affected at a time. There is a very small but nonzero chance that the unit will lose power and have to be restarted. If this happens, the following would be down for a short time:

  • Print servers that support Windows XP clients
  • Network File Services for users of our Institutional Active Directory-based service
  • The Plesk web hosting environment
  • The proxy site for accessing scientific journals at UCB
  • Primavera - used by the Lab's Project Management Office
  • The RMS system

This outage will impact Webspace and the Commons wiki (and the web pages hosted on it, including IT, HR, and Facilities), both of which will be taken down as a precautionary step.

Maintenance this Weekend Affects eBuy, TREX, PCard, and Other Financial Services

Starting at 2 p.m. on Feb. 8 and continuing through the weekend, the Lab’s financial systems will be taken offline for maintenance. To prevent an online session from being interrupted, users will need to log out of the system before 2 p.m.

During this outage, users will be unable to access eBuy, TREX, or purchase requisitions, or to approve online financial transactions. These systems are expected to be back up by Monday morning.

For more information, contact the IT help desk (x4357).

Berkeley Lab scientists led the development of an algorithm and a computational pipeline, making extensive use of the Lawrencium Cluster, that analyzes large sets of tumor images. Their work will help scientists learn more about the genetic and molecular mechanisms that control tumor signatures. It will also shed light on whether tumor subtype can predict the effectiveness of therapies. The research was led by Hang Chang, Ju Han, Leandro Loss, and Bahram Parvin of the Life Sciences Division, as well as scientists from several other institutions. The scientists validated their pipeline by applying it to 377 whole-slide images from patients who have an aggressive brain cancer.

A tumor’s organizational complexity is revealed. The center image is a whole-slide image of a Glioblastoma Multiforme tumor. The arrows indicate enlarged, distinct regions. Berkeley Lab scientists have developed an automated way to analyze large sets of tumor images.

We have scheduled regular maintenance for all the clusters in our supercluster infrastructure on Tuesday, Feb 12th from 9:00 am to 5:00 pm. We need this downtime to address stability issues with our job scheduler that have impacted our users.

User logins and access to the data on the cluster filesystems will be blocked. Job scheduler reservations will be put in place so that no jobs are running in the cluster queues for the duration of the downtime. If you submit jobs to your clusters before the downtime, make sure you request a wallclock time that lets your job finish before 9:00 am on Feb 12th. Otherwise, your job will stay queued and will only start to run after the downtime. User access to the data transfer node will also be turned off.
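
As a rough aid for picking a wallclock request, the arithmetic can be sketched in Python. The helper name `max_walltime` and the 15-minute safety margin are assumptions made for illustration; only the 9:00 am Feb 12th start time comes from this announcement.

```python
from datetime import datetime, timedelta

# Start of the maintenance window from the announcement: Feb 12, 9:00 am
DOWNTIME_START = datetime(2013, 2, 12, 9, 0)


def max_walltime(submit_time: datetime,
                 margin: timedelta = timedelta(minutes=15)) -> str:
    """Largest walltime (HH:MM:SS) that still ends before the downtime.

    Assumes the job starts running the moment it is submitted; any time
    spent waiting in the queue shortens the real budget further.
    """
    remaining = DOWNTIME_START - submit_time - margin
    if remaining <= timedelta(0):
        raise ValueError("no time left before the downtime")
    total = int(remaining.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


# Submitting at 5:00 pm on Feb 11 leaves 16 hours, minus the margin
print(max_walltime(datetime(2013, 2, 11, 17, 0)))  # prints 15:45:00
```

A job that requests more than this value would not fit before the reservation and would wait until the downtime is over.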

All the clusters in our supercluster are affected by this downtime, including Alsacc, ARES, Baldur, Cumulus, Explorer, Hbar, JBEI, JCAP, Lawrencium (Lr1, Lr2 & Lr3 nodes) with Co2seq, Matgen, Ganita, Nanotheory, Esd1, Esd2 condos, MHG, Musigny, Nano, Natgas, Voltaire, Vulcan & Yquem. We ask all our users to save their work and close their login shells before 9:00 am on Feb 12th.

We apologize for any inconvenience. After the downtime all cluster services will be restored as before. Email us at [email protected] if you have any questions or concerns.

As part of our ongoing quest to find the right suite of realtime collaboration tools for Berkeley Lab researchers, IT is testing Fuze Meeting with the Biosciences Directorate. As far as we know, the test will be one of the largest realtime, interactive video meetings ever attempted at LBNL, if not the largest.

Participants need to download the Fuze software in advance at https://www.fuzebox.com/products/download and install it on their computer or tablet.

For additional attendee information, consult your email.


Software Carpentry Boot Camp

We are offering the fourth installment of our two-day training session dedicated to teaching scientists how to be better computer users. It runs May 9-10, 2013, 9:00 AM-4:30 PM, in 54-130 (Pers Hall). The class will cover everything from the shell to beginning scientific programming.

To register for the boot camp go to: 

http://go.lbl.gov/sc-bootcamp-may9-10

For detailed information about the training sessions go to:

 http://software-carpentry.org/bootcamps/2013-05-lbl.html

There is no charge for these courses, which are supported as part of IT's commitment to help researchers compute and collaborate more effectively.  

 

Excel Classes

Excel for Scientists will be available again in May, and April will have courses in Advanced Functions and Using "What If" Analysis Tools and Macros. Check the training page for additional details.

www outage Wed January 23

There will be a 30 minute outage of www.lbl.gov between 6pm and 8pm on Wed January 23.   A small number of other services are also impacted and customers of those services have been notified.  Other websites, network connectivity, email, and all other services are not impacted by this outage.

 

We are glad to announce the availability of third-generation (Lr3) nodes in the Lawrencium cluster, the Lab's institutional scientific computing system available for LBNL PI use. We have recently added 108 new compute nodes, each equipped with dual-socket, eight-core Intel Sandy Bridge 2.6 GHz processors (16 cores per node) and 64 GB of 1600 MHz memory. They are connected with the latest high-performance, low-latency 56 Gb/s FDR InfiniBand interconnect, compared to the QDR 40 Gb/s and DDR 20 Gb/s interconnects in the earlier-generation nodes, and share the same user environment and storage as the Lr1 and Lr2 clusters.

Comparison of available Lawrencium nodes
Third generation (lr3) nodes - 16-core, 64 GB, FDR 56 Gb/s InfiniBand
Second generation (lr2) nodes - 12-core, 24 GB, QDR 40 Gb/s InfiniBand
First generation (lr1) nodes - 8-core, 16 GB, DDR 20 Gb/s InfiniBand

Any Lawrencium user can now access these new nodes by submitting jobs to the same routing queue as always ("lr_batch") and specifying the type of node to run on in the "-l" line of the PBS job script. For example:

1) To run on the new, third generation (lr3) nodes, please specify
#PBS -q lr_batch
#PBS -l nodes=X:ppn=Y:lr3

If you do not specify a type of node (:lr1, :lr2, or :lr3), the job will default to using the lr1 nodes.

Also, we have configured a 60-second default walltime. Any job submitted to the Lawrencium queues without a walltime request will run for only 60 seconds, so please make sure you specify a walltime in all your jobs.
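
To show how the queue, node type, and walltime fit together, here is a small Python sketch that renders the directive lines for a job script. The helper `pbs_header` is hypothetical and not a tool provided by HPC Services. The cores-per-node limits come from the comparison of node types in this announcement, and the walltime line assumes the usual `-l walltime=HH:MM:SS` PBS resource syntax.

```python
# Cores per node for each Lawrencium generation, per the node comparison
CORES_PER_NODE = {"lr1": 8, "lr2": 12, "lr3": 16}


def pbs_header(nodes: int, ppn: int, node_type: str, walltime: str) -> str:
    """Render the #PBS lines for a job on the lr_batch routing queue."""
    if node_type not in CORES_PER_NODE:
        raise ValueError(f"unknown node type: {node_type}")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    limit = CORES_PER_NODE[node_type]
    if not 1 <= ppn <= limit:
        raise ValueError(f"ppn must be between 1 and {limit} on {node_type}")
    return "\n".join([
        "#PBS -q lr_batch",
        f"#PBS -l nodes={nodes}:ppn={ppn}:{node_type}",
        # Without an explicit walltime the job would get the 60-second default
        f"#PBS -l walltime={walltime}",
    ])


print(pbs_header(2, 16, "lr3", "01:30:00"))
# #PBS -q lr_batch
# #PBS -l nodes=2:ppn=16:lr3
# #PBS -l walltime=01:30:00
```

The printed lines can be pasted at the top of a job script; the validation simply catches a ppn value larger than the chosen node type offers.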

We hope our users will make good use of these new enhanced resources and get more research done quickly.
Interested and new users can visit the HPC Services web site to learn more and request an account. Please email us at [email protected] if you have any questions. Enjoy.


IT is pleased to present a one day course on Agile Project Management for the Lab Community.   Information from the instructor is below.  

Please register for the course here.  There is no charge to attend.


For just about any project these days, uncertainty is certain. The customer will change their mind, a ‘must have’ requirement will be discovered, deadlines will move, or an unexpected technological issue will need to be solved. The agile approach to defining and executing a project allows a team to easily adjust to these changing conditions in order to produce the best work product in the best time frame possible.

This workshop compares the agile iterative project approach to a more traditional gated or “waterfall” approach.    What are the benefits of using an agile approach, as well as the costs and risks?   We will explore the key roles, responsibilities, interactions, and processes that make a successful agile project happen.

Attendees will take part in discussions and demonstrations covering:

      • The what and the why of Agile – the myths and truths of Agile
      • Lean, Kanban, and Scrum – the most common agile frameworks
      • Scrum in a Nutshell – the key roles, activities, and work products on a scrum team, the most common form of agile development
      • Working with agile requirements, specifications, and documentation
      • Estimation – how much work do we have to do?
      • Creating and applying an agile 'Definition of Done'
      • Communication on an agile team (even when everyone's not in the same room)
      • The power of team self-organization and continuous improvement

By the end of the workshop, attendees will have experienced a broad introduction to the world of Agile – its benefits, tradeoffs, costs, and risks.  Concrete ideas will be shared that attendees can immediately apply individually or on team projects. 

 

Instructor Bio

Chris Sims is a Certified Scrum Trainer (CST), agile coach, and recovering C++ developer who helps software development teams improve their productivity and happiness. Chris is the founder of Agile Learning Labs, as well as the Bay Area Agile Managers’ Support group. He is co-author of The Elements of Scrum and has published over 50 articles on agile topics at InfoQ. Even more of his writing can be found on Agile Learning Labs’ blog. Before starting Agile Learning Labs, Chris made a living in roles such as ScrumMaster, Product Owner, Engineering Manager, Project Manager, Software Engineer, Musician, and Auto Mechanic.

 

Operations Customers:

As you may know, occupants of Building 46 are being relocated due to the slide above McMillan Road. Workstation Support (MPSG) and the IT Helpdesk were both located in Building 46. We want to alert you to possible interruptions to service as a result of this relocation.

First, the IT Helpdesk has completed its move to a new facility.  We do not anticipate any interruption of service for the helpdesk.

For workstation support, which is a larger and more complex move, we expect that responses to trouble tickets will be impacted over the next two weeks.   We will make every effort to respond to high priority tickets, but please be judicious in marking a ticket “high priority.”  Normal priority tickets may be delayed until the relocation is complete (approximately 2 weeks).   In addition, computer installs and other maintenance will be rescheduled until after the move.  If you have a scheduled install you will receive separate notification about rescheduling.

Thank you for your patience during this process.   If you have any questions or concerns, please contact us at [email protected]


Additional information about the 46 relocation and the hillside is available here:  http://today.lbl.gov/14973/


IT Maintenance 12-27-12

Overview

As part of electrical upgrades to support the next generation of high-speed networking, there will be short disruptions to LBL's internet and local network connectivity on Dec 27th from 6:30 AM to 1:30 PM. During these outages, all services which rely on LBL networks will be unavailable, including LBL websites, email, collaborative tools, business applications, and remote access. The outages are anticipated at the beginning and end of the scheduled work.

The outage will not cause a loss of email, which will be queued at the sending location and delivered, albeit delayed, when our email systems come back online.

Services Impacted

The maintenance directly impacts the following services:

  • All Connectivity between LBL and the internet
  • Some internal subnets (see more below)
  • All services which rely on network connectivity, including Google Apps (gmail/calendar/etc), network fileshares, cluster computing, remote access, etc.

 

Why is Gmail impacted?

Gmail is impacted in two ways during this outage. The first is indirect: Gmail depends on LBL LDAP for authentication. Since Google's servers will not be able to communicate with LBL's servers, users will be unable to authenticate to Gmail. If you are already connected to Gmail when the outage begins, you will not be disconnected. However, you will not receive new email, nor will new emails you send be delivered, because of the second impact. Although Lab email is delivered to Gmail and you check it at Gmail, the email passes through Lab email routing systems for additional security filtering and list processing before being sent to Gmail. In summary, you may be able to remain logged in and read your Gmail during the outage, but mail sent to you and mail you send will not be delivered until the outage is over.

What will improve after the outage?

This electrical upgrade is necessary to support the eventual upgrade of our connection to the internet (via ESnet) to 100G, enabling future high-speed science data flows and enhanced worldwide collaboration. The actual upgrade will occur in early 2013; this outage is only to install the electrical connections necessary for the upgrade.

How can I get more information?

If you have any questions, feedback, or just want more information, please contact the IT Help Desk at http://help.lbl.gov, or 510-486-4357.

Where can I get the gritty details?

On the IT Outage Page

 


Berkeley Lab's HPC Services consultant Yong Qin won the FX10 Championship hosted by Kyushu University at SC12 last month.

The FX10 Championship is a competition for the performance efficiency of contestants' own code on 12 compute nodes of the Fujitsu PRIMEHPC FX10, a commercial version of the K computer (#3 on the November 2012 TOP500 list) equipped with the SPARC64(TM) IXfx processor and Tofu interconnect. Contestants submitted their codes to the Kyushu University staff, who compiled and profiled them to measure efficiency. The person with the highest efficiency wins.

According to Professor Keiichiro FUKAZAWA of Kyushu University, any code with an efficiency better than 10% is good. The application that Yong brought in was a highly optimized code for undulator radiation spectrum calculation, on which we collaborate with the Advanced Light Source (US) and Hiroshima Synchrotron Radiation Center (Japan). The code achieved an astonishing 53% efficiency. The second-place finisher reached only 20% efficiency.
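
For readers unfamiliar with the term, the following Python sketch shows one common way such an efficiency figure is computed. It assumes efficiency means the sustained floating-point rate divided by the theoretical peak of the nodes used. The contest's exact metric is not spelled out here, and the numbers in the example are hypothetical, not contest measurements.

```python
def efficiency(sustained_gflops: float, nodes: int,
               peak_gflops_per_node: float) -> float:
    """Fraction of theoretical peak performance that a run sustained."""
    peak = nodes * peak_gflops_per_node
    if peak <= 0:
        raise ValueError("peak performance must be positive")
    return sustained_gflops / peak


# Hypothetical figures: 12 nodes with a 200 GFLOPS peak each,
# and a code that sustains 1272 GFLOPS across them
print(f"{efficiency(1272.0, 12, 200.0):.0%}")  # prints 53%
```

Under this definition, a code at 10% efficiency would sustain only 240 GFLOPS on the same hypothetical machine.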

Yong attributes his win to his efforts to greatly reduce the memory footprint and to optimize the code with advanced parallelization techniques. It also helped that he developed this code to run on the newly available 37 TF, 108-node Lawrencium LR3 cluster, which is equipped with 16 Intel Sandy Bridge processor cores per node, the same number of cores as on the FX10 nodes.

At first glance, the contest organizers thought that Yong had written a benchmark type of code to use up the processors, but once Yong explained his methods, they declared him the overall winner. Next year, Yong hopes to do even better after he has had a chance to further optimize his code.