Measuring the Success of a SAMSI Program – My Experience at the Beyond Bioinformatics Transition Workshop

The following was written by Katerina Kechris, Associate Professor and Graduate Program Director, University of Colorado – Denver, School of Public Health.

Katerina Kechris

In mid-May 2015, working groups from the Beyond Bioinformatics Program gathered for the Bioinformatics Transition Workshop. This was the culmination of eight months of progress for over 10 working groups. The workshop topics were diverse, including epigenetics, microbial communities, evolutionary models, imaging genetics, next-generation sequencing errors, high-dimensional discrete data, multiple hypothesis testing and data integration. The diversity of these topics reflects the current state of research in the biomedical sciences, where technologies are advancing the study of biological mechanisms, structures, populations and disease. These technologies are generating high-dimensional and complex data structures, providing intriguing opportunities for statisticians, mathematicians and computer scientists to develop new models, methods and algorithms to answer important biological questions.

The Beyond Bioinformatics Transition Workshop attendees.

As a leader for one of the two Data Integration working groups, I was excited to hear about the activities from the other working groups during the workshop. I found their progress impressive, considering that many of the group members did not know each other until the Opening Workshop just eight months earlier. The transition workshop gave me the opportunity to reflect: How does one measure the success of a program year and a working group? There are the usual metrics of publications, conference presentations and grant proposals that will be documented in great detail for reports. But at the workshop I could see more qualitative and interpersonal measures of success. First, new collaborations were developed among researchers who would otherwise not have had the opportunity to meet and work together.

Personally, I enjoyed getting to know and working as a team with the other Data Integration working group leaders and members. Second, I was pleased to see great attendance and presentations at the workshop by students and post-docs. I know in several cases that the working group facilitated thesis and post-doctoral research projects for these junior investigators. Finally, I observed that there are ongoing plans to continue the working group efforts beyond the formal program year, which speaks to the positive aspects of the program. As for our working groups, it was such a pleasure to make new colleagues and see the evolution of how we approached the problem of data integration with very different perspectives and methods. I look forward to learning about the continuing progress of all groups.

Listening to a working group make its report.

Learning about the Human Microbiome

The following was written by Nur Majida Shahir, graduate student, Bioinformatics and Computational Biology at the University of North Carolina at Chapel Hill.

Nur Majida Shahir at the Microbiome workshop at SAMSI.

This past month, I had the opportunity to attend a workshop at the Statistical and Applied Mathematical Sciences Institute (SAMSI) on the human microbiome. While I was only able to attend the first day, the information and insight I gained over the course of the day was amazingly useful.

Dr. Susan Holmes gave the first talk of the morning session on Multi-Table Data Analysis. While the talk itself was interesting, the thing that stood out most to me was being introduced to an R package that was created by Dr. Holmes’ group called phyloseq. Prior to this workshop, the only downstream analysis program I knew of was Explicet, but after being exposed to the flexibility of phyloseq, I have a feeling that I may be using the latter more in my research.

Bill Shannon (L), Washington University in St. Louis, and Timothy Randolph (R), Fred Hutchinson Cancer Center

Dr. Vanni Bucci gave the second talk of the morning on predictive modeling of microbiome dynamics. In contrast to the previous talk, this was approaching the microbiome from an applied mathematics perspective with a focus on creating and using a minimal model to study microbiota dynamics in enteric infections. I found this talk particularly fascinating in part due to my background in mathematics as well as the fact that looking at the microbial community dynamics in the gut makes sense due to the transient nature of some of the flora seen in the gut.

Nur participates in her first breakout working group session.

After the morning sessions and lunch, I had my first experience with a working group breakout meeting at this workshop. On one hand, it was a good experience to hear what people were thinking with regard to various datasets and the analysis of those datasets. Many concepts and approaches were thrown around that I honestly hadn’t thought of. On the other hand, I found it disorienting because I had only a very superficial idea of what they were discussing. It would have been more beneficial if I had had access to the data, or at least the papers they were referencing, prior to the working group breakout meeting.

One of the things that I enjoyed about this workshop was the varied backgrounds of the presenters. While the majority of the presentations focused on statistical approaches and problems regarding the analysis of the microbiome, others approached the microbiome from a much more theoretical perspective, as seen in Dr. Giseon Heo’s talk, which, if I recall correctly, approached it from the perspective of knot theory.

People discussing their work at the poster session.

The poster session was held at the end of the day. Reflective of the talks, the poster session was fairly diverse content-wise as well, with posters ranging from the “standard data analysis + results + future directions” to more methodology-oriented work on how to approach the data. I personally enjoy poster sessions because they allow me to approach the material at my own pace and to interact with the presenter in a more direct manner.

All in all, I left the workshop very content. I’ve attended a few conferences where halfway through I’m utterly exhausted and dreading the next 4 hours. At this workshop, I felt that my time there was both well spent and informative.

Leading a SAMSI Working Group: One Size Does Not Fit All

The following was written by Paul Brooks, Associate Professor, Department of Statistical Sciences and Operations Research, at Virginia Commonwealth University.

I came to SAMSI to get “real” statisticians to look at data that my microbiologist/clinical colleagues were generating to discover new patterns related to human health. Whenever I work with colleagues in the biomedical arena and they learn that I am a quantitative person, they automatically assume that I am a statistician and can help them with all of their statistical needs. They lump all of us applied math people together (I was happy to return the favor in my previous sentence). The experience depicted in this video rings all too true:

I watched this with some of my colleagues and we laughed until we cried because we have had nearly the same conversation several times over the years.

Last year, we were working on a grant proposal that asked for outreach to the community to make data available to analysts and train people on best practices. I had heard of SAMSI from a statistician in my department who had been urging us all to get involved. Then I discovered the “research program” concept that SAMSI has and further, that a theme for this year was Beyond Bioinformatics. The stars aligned. After speaking with Snehalata Huzurbazar, Deputy Director of SAMSI at the time, we agreed that I could lead a working group during the Beyond Bioinformatics program. Several other things fell into place, and I am now a visiting research fellow at SAMSI and leading two working groups.

Coming into the year, I was excited about the concept of a working group adding all kinds of new expertise to our ongoing projects. But I am a bit of an anxious person. Perhaps a nicer way to put it is, I am a planner. I like to have my stuff together well ahead of time. I met some SAMSI postdocs over the summer at a conference and asked them for advice on a successful working group. They said it takes a leader to assign tasks and hold people accountable for getting things done. Okay, I would need to learn how to do that.

When I arrived at SAMSI, I attended the opening workshop on ecological modeling and spied on them and how their working groups formed. Some of the working groups that formed there were huge! There were 35+ people in most groups, and many people were in multiple groups. How was I going to find enough ideas for that many people to work on? One group seemed to have a pretty good approach: they first brainstormed general topics of interest, then they collected those to form article topics, then they began to work on what their paper titles would be, and even discussed target journals. That seemed pretty good for the first two days. I would try that. But of course it didn’t go quite as planned. Around 200 people attended the Beyond Bioinformatics opening workshop, and there were about 10 working groups proposed. The working groups formed after lunch on Thursday, but many of the people left right after lunch. The initial working group meetings were not as big as anticipated, and many of the working groups met in the same room.

MCDC group meeting during the Beyond Bioinformatics workshop.

My proposed working group, MCDC (Microbiome Community Dynamics and Complexity), was sort of split into two and one was merged with another working group and would be complementary to one or two others. Plans went out the window. One of the working groups started with about 5-6 people on site and about 25 people attending online via WebEx. For the first few meetings, we agreed on papers to read, and someone would lead a discussion of the analysis methods used. For a while, it seemed like only 2-3 of the 20-30 in attendance participated in the discussions. Some of these discussions were somewhat heated because of disagreements about appropriate modeling strategies. Perhaps people new to the field and graduate students/postdocs did not feel that they could weigh in on the issues yet. But they kept coming back.

The meetings reminded me of my early experiences with microbiome conferences. When I first started attending microbiome conferences, there was a group of postdocs who would microblog every slide from every presentation. They would describe what the speaker was presenting, then offer their opinion. They often disagreed with the presenter and with each other. Observing these discussions was incredibly helpful to me to understand a new field and to understand what the big questions were. Perhaps that’s the kind of service/entertainment we were providing with our working group meetings.

Fast forward to today. We still have 15-25 people at each meeting. Many more people are actively participating in the discussions than when we started. We have some subgroups who are working on different parts of a paper that we hope to write together. And we are laying the foundation for two additional papers to write together.

Each working group is a unique experience. Be flexible and know that different approaches to leading a working group can lead to fruitful collaboration.

It is Hard to Define What is Beyond Bioinformatics

The following blog entry was written by ClarLynda Williams-DeVane, Assistant Professor of Bioinformatics/Biostatistics, Department of Biology and Director of Bioinformatics Genomics and Computational Chemistry Core (BGCCC) at the Biotechnology Biomedical Research Institute (BBRI), North Carolina Central University; Building Interdisciplinary Research Careers in Women’s Health (BIRCWH), Duke University.


Dr. ClarLynda Williams-DeVane

Two weeks ago I participated in SAMSI’s Opening Workshop for the 2014-15 Program on Beyond Bioinformatics: Statistical and Mathematical Challenges. I was particularly interested in participating in this program because of the focus on data integration and large-scale data methodology. The focus of my research is large-scale data integration for complex women’s diseases. As an assistant professor at a smaller university, it was an amazing opportunity to spend a week thinking about and discussing current and developing methodology in my research area. The discussion of exploratory data analysis (EDA) methods in comparison or complement to Bayesian model-based methods was insightful and of great benefit, as I have these discussions often with my K-award mentorship team. The thought leaders in these areas all made very well-defined and well-supported arguments about which methodology was best given specific research questions.


Dr. Terry Speed, UC-Berkeley, and Walter and Eliza Hall Institute of Medical Research

Throughout the meeting, it was difficult for most speakers and attendees to define what it means to move beyond bioinformatics. Many of the speakers, and the discussions following their talks, exemplified moving beyond bioinformatics while discussing how to move from exploratory data analysis methods to more model-based analysis methods, which for me defines the need to move beyond bioinformatics. I appreciate the focus on mathematical and statistical approaches to problems. As a junior faculty member, I found the discussion about publishing in this area and developing clinically relevant methodologies very helpful. At the end of the workshop, as we broke into working groups, we continued our discussions of data integration. The working group process was a bit overwhelming as I attempted to find the appropriate fit. Through the various discussions on data integration, it was possible to find a working group that complemented my current research and to which I could be a major contributor. I am eagerly anticipating the next face-to-face meeting of my working group and seeing the outcome of the other working groups.

SAMSI-SAVI Workshop on Statistical Methods for Bioinformatics: December 2013

The following was written by Malay Bhattacharyya, Department of C.S.E., University of Kalyani, India

Malay Bhattacharyya

“Don’t give a talk, take a class.” This, I feel, was the driving force behind the SAMSI Workshop on Statistical Methods for Bioinformatics we had at the IISc campus, India during Dec 12-14, 2013. It included people from diverse backgrounds, encompassing biologists, statisticians, mathematicians, computer scientists, biostatisticians, biophysicists, biochemists, anthropologists, and many more.

The Department of Mathematics at IISc was a perfect, inspiring venue for this workshop, particularly for research discussions. It has chalk and boards everywhere, at every corner of the department! I even saw a catering person writing random equations (although trivial) on a board when no one was looking. The environment really motivates!

The very first talk of the workshop, by Varghese George of Georgia Regents University, made us feel like we were entering the revitalized world of epigenetics. The recent progress and future goals were very nicely introduced. Indranil Mukhopadhyay of ISI Kolkata and S. R. Deshmukh of University of Pune then gave comprehensive introductory talks about the basics of statistical tests, molecular biology, etc. and about expression profiling, respectively. Many of the later speakers also benefited from their efforts to lay a strong foundation of the preliminary concepts. They did not need to start from scratch.

The first day was so windy that I felt the air itself was keen to enter the lecture hall, a perfect learning platform! What I liked most was that not only did the respective speakers respond to questions, but everybody took pains to discuss and settle on the best answer for the tricky ones. The sessions were not too attendee-heavy, so everybody had a fair chance of asking questions. Again, the ratio of close to 2:3 between invited speakers and participants set up a real platform for face-to-face learning.

Naomi Altman of Pennsylvania State University gave an attractive talk on the recent obesity of high-throughput data. The reproducibility of research became a major issue of discussion, led by Prof. Altman’s talk. It is really a burning issue worldwide. There was common agreement among all the invited speakers, who strongly encouraged making the relevant source code available alongside publications.

How statistical analysis can help in some particular areas of plant genetics, especially in gene duplication problems, was thoroughly described by the SAMSI Deputy Director Snehalata Huzurbazar. It was also a real pleasure to have Ashis Sen Gupta of ISI Kolkata on stage to talk about his work on understanding circadian rhythms. It is a real challenge to build statistical models given the surprising fact that “the rhythm of the life is circular, although the life is itself a linear game.”

The next day we also had a great experience! Everybody was cracking jokes about “Friday the 13th,” which happened to be the second day of the workshop. But it was really an enjoyable day – not only for the banquet but also for the diversity and depth of the talks. It started with the talk by Nagasuma R. Chandra from the host institute, who gave a realistic overview of how to proceed step by step towards the modelling of disease prediction. She strongly believes that automating such systems will indeed speed up progress in this direction.

Olga Vitek from Purdue University gave us an all-inclusive glance at the immense scope of some promising areas of proteomics. N. Srinivasan of IISc then gave a fantastic account of his research on finding missing links between protein families using computational models.

T. S. Vasulu of ISI Kolkata and Paul Joyce (I love the way he explains complex things with funny examples) of University of Idaho gave nice introductions to statistical methods that can be applied to phylogenetic analysis and to the study of adaptive evolution, respectively. Switching between the laser pointer and the hard pointing stick was real fun for Prof. Joyce.


Every talk was made somewhat flexible based on the demands of the attendees. I remember one talk being stretched by half an hour or so to satisfactorily answer every question raised by the listeners. Still, the overall time frame for the entire day was well maintained. We also had a nice photo session on the terrace of the hosting department in a get-together mood.

The long walks (voluntarily avoiding cars) through the woods of the IISc campus with many of the speakers, while returning after the workshop days, gave me a pleasant chance to gain additional experience through informal discussions. Research is no longer an independent effort, but rather a collaborative competition.

The final day of the workshop started with a nice talk by Susan Holmes of Stanford University, who highlighted diverse facets of the Human Microbiome Project. Her idea of more on leaps less on slides was great, using the chalk and boards every time. The concluding talk by Sanghamitra Bandyopadhyay of ISI Kolkata, a very basic one bridging statistics, computer science and biology, detailed various robust computational models that can tackle multi-objective problems, which often occur in expression analysis and related areas.

The contributed talks by seven of us, mostly covering ongoing studies, benefited greatly from the expert speakers also being part of the audience. I enjoyed my contributed talk and received a couple of valuable suggestions. The take-home message was “do whatever you wish to do, but with a clear conception and full confidence.” We were unfortunate this time to miss the talk of K. Thangaraj of CCMB on the very last day, as he was absent due to some urgent commitments.

I hope that the slides of the talks will soon be available online. The logistics managed by Shruti and Sai were fabulous. The food was so diverse that I experienced the taste of all of India. The organizers took great pains to arrange a great banquet dinner on the second day. The round table discussions in the banquet session were really effective.

I feel like the SAMSI workshop proved to be a real SAMSI (such a mega statistical incident). Hats off, SAMSI!