

Remote Attendees


Not attending


  • Post-mortem on what worked well and what could be improved in preparing for the all hands
  • Review points from the all hands discussion on 10/17
  • Review action items for next steps in developing data science program

Discussion items

2 min | Review agenda | Mike Barton
5 min | Introduction and motivation for getting involved | Jonathon (Jon) Bertsch
  • Started out in science and then moved into programming.
  • Worked with data in IMG, then moved to GOLD because of budget constraints.
  • Doing data analysis and web work
  • Has background from IMG and GOLD on how data moves around the JGI

Provide feed forward on the slides generated for the all hands.

  • What worked well in the slides?
  • What could be improved next time?
David E Gilbert
  • For presentations shown to the JGI, first consider why anyone in the room should care.
  • A subset of individuals will have an interest in the content, but expect many others not to be directly involved and not necessarily interested.

Specific points:

  • Explain what data science is; define what we mean from the beginning.
  • What are the data sources we are referring to?
  • Explain the motivation for putting out the survey in the first place.
  • The data presented begs further questions about what the specific skills and applications are.
  • Even in the data science discussion, too much data was presented.
  • Can we instead pull out specific stories and vignettes that people can relate to?
  • Some nuggets and quotes, but too many words on a slide. People will start to focus on reading the slides instead. Again, should focus on stories that people can relate to.


  • Focus on the key points
  • Drill down into the key points

Next steps

  • How can we prime people for a presentation at the next all hands?
  • Perhaps through the Inside JGI newsletter
  • Over the next few months, think about presenting the take-home messages from the survey
  • Focus on methods for getting people involved.
  • Possibly drive people to Confluence


10 min | Open discussion of preparing the slides for the all hands | Mike Barton


  • Shouldn't we get people involved already?
  • Need to focus on the next steps
  • Could we target the low-hanging fruit and work on those, such as creating documentation?


  • If there isn't a path forward and there is no upfront communication, it is difficult to get people involved
  • Two different and complementary efforts that we could use to drive
  • There are efforts being done, and we should organise around that


  • Notes pasted in the comments


  • Need to keep people motivated
  • Revise the slides and add a story about why the data science is important


  • Have a framework for the points that are most important for people
  • How to get access to basic data stores
  • Come up with a plan


  • Need one person in charge, who then decides on the directions that we want to move in
  • Then can come up with actionable initiatives that focus around these key points.
  • These initiatives could then recruit people to work on them.
  • The current group isn't large enough to implement all the initiatives ourselves; we'll need to recruit people to work on them.


10 min | Review points raised in the all hands meeting 10/17 | Mike Barton


  • Focus on the bash shell and the visualisation tools
  • Focus on simple things
  • People spend time documenting things
  • Sent an email to Leila about how we can determine documentation


  • People will ask the same person the same thing.
  • The same questions often get asked over and over again.
  • Importance of finding information was raised at the all hands.
  • Julie raised Slack as useful


  • Finds that on the NERSC Slack the same questions often get asked over and over again


  • We don't have to figure out the best tools and methods now.
  • One tool won't fit everyone; it's not necessary to determine what to do in this meeting.
  • Would say that the next steps should be action items
  • Spoke with other people and there are many solutions that we can work
10 min | Plan next steps for implementing data science at the JGI | Mike Barton

Bill and Simona

  • Will try to develop a next step on the educational part

Jon and Kecia

  • Take ownership of developing next steps for data sources


  • Meet regularly to develop a communications strategy for these initiatives


  • Already doing a lot of training.
  • Will get involved where possible
  • Try to leverage the existing NERSC documentation as part of these initiatives.


  • Each person should outline the overall components for an initiative
  • Then break them down into smaller components


  • Underpinning communications is the main point
  • Likes Tony's points about moving forward with Slack
  • Takes advantage of the environment within the intranet, prefers developing community within the intranet.
  • Drive people towards the intranet from there


5 min | Review action items and plan next meeting

  • Jon will be out of the office for the next couple of weeks

Action items

Notes from the all hands meeting held 2017/10/17

Bill's notes:

  • use of Slack is proposed for communicating what is happening.

  • people want to share their scripts and not just news on who does what. Especially with the purge policy, it is common for scripts to get lost.

  • visualization tools used widely, but there is little communication about who does what.
  • wikis - JGI has 4 wikis, but it is not certain that all are used as they are meant to be, since people don't know where to go for info. What we need is something like search.
  • everyone uses the bash shell widely.
  • ML is a controversial topic at JGI - statistical learning is popular, but ML tools are a black box. It is not clear why their predictions might fail.
  • Need for better documentation is agreed upon.
  • for documentation, a good idea would be to hold a meeting during the quarterly NERSC maintenance where people document their tasks and code (in a wiki).
  • Kaggle is a competition for training. But if everyone participates they will motivate one another.
  • Any training has to be relevant to work at JGI, or else people won't find time for it and will lose interest.

Tony's notes:

  • please don't use wikis for code examples or scripts, please use a git repository. Wikis are harder to maintain, you can't beat a 'git commit' once a script works as compared to going to browser, open page, log in, edit, copy/paste, save... Plus, good scripts will evolve, and it's important to have version control to know when it breaks for someone or when a new feature gets added.

  • regarding data competitions, Kaggle-style or otherwise: I'd be careful about how these are introduced. If they're done on an individual basis, there are bound to be a lot of people who won't participate because they're shy. Make it group-oriented, either by JGI groups or by ad-hoc groups or something like that. I can help with the gamification if you like.
  • Bill's slides mentioned making data easier to find in JAMO. What's needed there is a basic ontology that the JGI can adopt and adhere to. I'd like to work on that when I get some spare time, it's key to a lot of things that need doing.
  • For group education, how about signing up for Coursera courses as a team and scheduling classes? We could have one day for watching the week's videos (maybe over lunch), then a few days later a get-together to go over the homework. That gives people time to try them on their own and then learn as a crowd. Plus there's the course fora, which are an invaluable source of help - there's no reason for us to do this alone!