DPDA Workshop: Reinforcing the Importance of Statistics and Applied Mathematics in Distributed Computing


Contributed by: Alexander Terenin, Statistics and Applied Mathematics PhD student, University of California – Santa Cruz

I am a PhD student in Statistics and Applied Mathematics at the University of California – Santa Cruz (UCSC). My research focuses on Bayesian statistics – specifically, Markov Chain Monte Carlo methods at scale in parallel and distributed environments for big data applications. I had heard about the workshop from a fellow graduate student in my department, and attending was a very natural choice given my area of research.

The Workshop…
On September 20 – 23, I had the privilege of attending the four-day Workshop on Distributed and Parallel Data Analysis (DPDA), hosted by the Statistical and Applied Mathematical Sciences Institute (SAMSI) at North Carolina State University in Raleigh, N.C. In this piece, I would like to reflect on what I observed there.

The workshop proceeded as workshops usually do: various speakers gave talks on different topics, interspersed with breaks that gave participants a moment to think about the talks and time to talk to one another about ideas. I was intrigued to see that the DPDA workshop had no parallel sessions – a format I much prefer because it brings together people who might otherwise never end up in the same room.


Participants at the 2016 DPDA Workshop network during one of the scheduled times of the series. Participants used these opportunities to network and collaborate on ideas.

Informative and Engaging Discussions…
A number of these talks and discussions stood out to me – I’ll highlight three of them, in order of occurrence.

Wotao Yin, a faculty member in UCLA’s Mathematics Department, gave a talk on “Asynchronous Parallel Coordinate Update Algorithms.” In this talk, he described a particular class of parallel versions of optimization algorithms – asynchronous iterative algorithms.

To understand what these are, let’s first back up and consider iterative algorithms: algorithms in which some sequence of steps is repeated until convergence. To take the next step, we need to have completed the previous one – so how can iterative algorithms be parallelized? It turns out that one way to do so is to make them asynchronous: a set of workers performs the iterative steps as fast as they can, communicating with each other as much as possible, with no control over the order in which the steps occur. The natural question is whether such a process can still converge. Sometimes it can. If the algorithm’s state space forms a box, and if individual steps shrink the box, then the algorithm will converge even if performed asynchronously. After recalling these results, Prof. Yin illustrated that certain coordinate ascent algorithms satisfy these conditions. I found this talk particularly interesting because I have written a paper about the asynchronous variant of Gibbs sampling, an algorithm for Bayesian computation, whose analysis is complicated but involves the same conditions. Seeing the same ideas used in a different context got me thinking about similarities and differences with my own work.
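To make the "shrinking box" idea concrete, here is a minimal sketch. It is not the algorithm from Prof. Yin's talk; it is a toy illustration that assumes a Jacobi-style coordinate update for a small, strictly diagonally dominant linear system, a standard case where each update contracts the error in the max norm. Asynchrony is only simulated in a single thread: each update reads values that may be a few steps out of date. The function name `async_jacobi`, the matrix, and the `max_delay` parameter are all hypothetical choices made for this example.

```python
import random


def async_jacobi(A, b, steps=5000, max_delay=5, seed=0):
    """Simulate asynchronous coordinate updates for solving A x = b.

    At each step one randomly chosen coordinate is recomputed, and every
    value it reads may be stale by up to max_delay steps, mimicking
    workers that do not wait for one another.
    """
    rng = random.Random(seed)
    n = len(b)
    x = [0.0] * n
    history = [list(x)]  # recent snapshots of the state, newest last
    for _ in range(steps):
        i = rng.randrange(n)  # no control over which coordinate goes next
        max_lag = len(history) - 1
        s = 0.0
        for j in range(n):
            if j != i:
                # Read coordinate j from a possibly outdated snapshot.
                lag = rng.randint(0, max_lag)
                s += A[i][j] * history[-1 - lag][j]
        x[i] = (b[i] - s) / A[i][i]  # Jacobi update for coordinate i
        history.append(list(x))
        if len(history) > max_delay + 1:
            history.pop(0)  # keep only snapshots within the delay bound
    return x


# Strictly diagonally dominant matrix (assumed for illustration): the
# off-diagonal row sums are smaller than the diagonal entries, so each
# update shrinks the max-norm "box" around the solution.
A = [[4.0, 1.0, -1.0],
     [1.0, 5.0, 2.0],
     [-1.0, 2.0, 6.0]]
x_true = [1.0, -2.0, 3.0]
b = [sum(A[i][j] * x_true[j] for j in range(3)) for i in range(3)]

x = async_jacobi(A, b)
error = max(abs(x[i] - x_true[i]) for i in range(3))
print("max-norm error:", error)
```

In this sketch the delay is bounded and every coordinate keeps being updated, which are the usual side conditions under which the box argument applies. If the diagonal dominance assumption is dropped, the same loop can fail to converge.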

Eric Xing, a faculty member in Carnegie Mellon’s Computer Science Department, gave a talk on “Strategies and Principles for Distributed Machine Learning.” His lecture described a variety of computational software environments used in big data settings, and how different implementation choices can yield vastly different levels of performance. This topic was interesting because it bridged the theory of statistical computation with software engineering considerations that end up having substantially more implications for performance than might be expected. For example, in a distributed setting, having a master node that manages and coordinates workers can yield different performance characteristics than a peer-to-peer model where all of the workers talk to each other – even if the exact same algorithm is used in both cases. Similar lines of thought have been highly relevant in my own work as well. Having written papers on performing Markov Chain Monte Carlo algorithms in two different parallel settings – compute clusters and graphics cards – I have learned that software engineering considerations are an inherent part of parallel computing and that it is important to study them.

I also found the discussion panel toward the end of the workshop to be particularly memorable. My PhD advisor at UCSC, David Draper, was on the panel, along with a number of distinguished faculty members from several universities; it was moderated by Sujit Ghosh, Deputy Director of SAMSI. Draper made the point that for the field of statistical computation to advance, “statisticians need to become better computer scientists, and computer scientists need to become better statisticians.” This point resonated with me because students in graduate programs in statistics, myself included, are largely not taught anything about high performance computing, whether in traditional supercomputer or Silicon Valley style hardware environments. I, however, have been fortunate to work in both settings, through an academic collaboration with Shawfeng Dong, an astrophysicist at UCSC, and through my time at eBay, Inc. – many statisticians have not had a comparable opportunity.

This makes statistical high performance computing a specialty area, which in my view has two discipline-wide consequences: (1) it’s easy for non-specialists to write code and design algorithms that scale poorly, and (2) the typical software stack that statisticians are taught and use in practice is filled with out-of-date tools and programmatic concepts that make coding and debugging unnecessarily difficult.

It was very interesting to hear similar ideas brought up and discussed as part of the panel. The experience was valuable because the panel emphasized the implications for statistical education, a topic I do not have many opinions about, because I am still a student. The discussion gave me the opportunity to think about our field as statisticians and applied mathematicians and about where our discipline is headed. This insight is important for a young person such as myself, because it tells me what to study and spend my time on throughout my graduate program.

Participants at the 2016 DPDA Workshop discuss various topics on distributed computing during the Workshop Reception and Poster session.


“Statisticians need to become better computer scientists, and computer scientists need to become better statisticians.”

A Good Experience Overall…
Overall, I found the workshop highly memorable. The points highlighted here merely scratch the surface of the topics I wanted to discuss. An honorable mention goes to the lecture by Han Liu, a faculty member at the Statistical Machine Learning Lab at Princeton University. Liu’s talk was called “Blessing of Massive Scale,” and he demonstrated that some problems become much easier when they are big. Faming Liang, a faculty member in the University of Florida’s Department of Biostatistics, spoke about “Bayesian Neural Networks for High Dimensional Variable Selection.” I found Liang’s treatment of Bayesian asymptotics interesting.

Finally, Samuel Franklin, of 360i: Digital Marketing Agency, presented a talk called “HDPA Growth Constraints in Digital Marketing.” The subject was surprisingly interesting for a talk that involved no mathematics. He called upon all of us in the room, the next generation of statisticians, engineers and applied mathematicians, to be champions for increased education on high performance computing, which foreshadowed some of what was later said in the panel.


Samuel Franklin, Vice President of Data Science at 360i, lectures on the importance of high speed computing as a resource for digital marketing strategies.

I was thankful that I had the opportunity to attend and listen to all of the wonderful perspectives that were offered on our field of study, as well as the opportunity to try North Carolina BBQ during one of the evenings. I would also like to thank SAMSI for compiling and sharing the approved lectures from this event online. For more information about the DPDA Workshop or simply to review what was presented, visit: www.samsi.info/dpda.

SAMSI Undergraduate Workshop inspires Student Growth


 Contributed by: Joanna Itzel Navarro, Statistics Undergraduate, University of California – Los Angeles

From May 22-26, 2016, I had the privilege of participating in the SAMSI (Statistical and Applied Mathematical Sciences Institute) Interdisciplinary Workshop for Undergraduate Students.

In my quest for statistical research, I learned about SAMSI after coming across a paper on Markov chain Monte Carlo (MCMC) methods written by the Deputy Director of SAMSI, Sujit K. Ghosh. A statistics alumnus from UCLA had previously mentioned SAMSI to me, so when I came across Dr. Ghosh’s paper, I was compelled to find out more about the program that both of them endorsed. A few months later, I found myself at SAMSI learning about random walks and the Metropolis-Hastings algorithm from Dr. Ghosh himself.

The SAMSI Experience…
The day after we arrived in North Carolina, the workshop commenced with a presentation by the Director of SAMSI, Dr. Richard Smith, on statistical reasoning in public and the complexity of small and large data sets. Throughout this first day of the workshop, we heard more data talks from different sources, each investigating questions related to several exciting and emerging areas of research. The research projects available to us ranged from the overall complex dynamic behavior of the brain and nervous system to measuring climate change through dolphin migration patterns. After the talks ended, the other students and I broke up into groups of 5-9 and were assigned to the research project we selected. Before the first day was over, we got to know our group members and learned how many different majors were represented among us. This miscellany of majors initially struck us as inexpedient, but throughout the week, we learned that bringing together minds from different backgrounds, qualifications, and experiences is key to effective problem-solving.

“When we found ourselves stumped, all it took was one group member to pose a provoking question or novel information to furnish the impetus that moved us forward.”

Reinforcing Effective Foundations in Statistics…
The following days entailed a wealth of R, MATLAB, presentations on giving effective presentations, and panels on graduate school programs and graduate school life. Additionally, we toured neighboring research institutions in North Carolina’s Research Triangle Park and reconnoitered the campus of NC State University.

Research Group Projects…
While our morning and afternoon activities varied, our evenings remained dutifully allotted for our research projects and group work.  After an eventful day, we came back every evening to find ourselves huddled around desks and ripe for our research projects.


– Joanna Itzel Navarro presents findings on her research group’s project at the SAMSI Interdisciplinary Workshop for Undergraduate Students, May 22-26, 2016. (photo provided by Navarro)

My research group was under the guidance of Duke’s newest, congenial statistics postdoctoral fellow, Dr. Adam Jaeger, and our research examined how various environmental factors predict behaviors of bottlenose dolphins in the Northern North Carolina Estuarine System (NNCES) stock in Roanoke Sound, North Carolina.  Furthermore, our research sought to discover how water temperature relates to the presence of dolphins and whether a change in the frequency of dolphins could be indicative of climate change.

Learning Through Diverse Perspectives…
The amalgam of majors in our group was certainly a recipe for a wide range of questions and approaches, and we noticed this especially in the beginning. This led us to adopt a multidisciplinary approach, and by the end of the program, we had molded ourselves into your quintessential, diverse research team. When we found ourselves stumped, all it took was one group member to pose a provoking question or novel information to furnish the impetus that moved us forward. We were all challenged to work out our differences and use our divergences as opportunities; we learned to anticipate alternative viewpoints and to expect that reaching a consensus would take effort and strong reasoning.

The End…


– Joanna Itzel Navarro listens to one of many lectures presented at SAMSI Interdisciplinary Workshop for Undergraduate Students, May 22-26, 2016. (photo provided by Navarro)

On the last day of the workshop, every group presented their research findings. The presentations were interactive and the questions were provoking. After a series of group photos and goodbyes, we all went our separate ways. This was not the end of things for us, though. Many of us remain connected. Whether through the Facebook group we’re all part of or through email, we continuously share with each other and let each other know about other opportunities.

Participating in this interdisciplinary workshop has highlighted the role of mathematical sciences, particularly statistics, in solving a gamut of important problems. Through the tours, presentations, group research, and interactions with erudite people from academia and industry, this workshop has imparted an educational experience that I cannot imagine receiving elsewhere. This was an indelible experience and a worthwhile way to spend my degrees of freedom.


– Group photo of students and mentors at the SAMSI Interdisciplinary Workshop for Undergraduate Students, May 22-26, 2016. (photo provided by Navarro)

SAMSI/Harvard Workshop on Environmental Health Data: A Lasting Impression – 9 Months in the Making

Contributed by Krista Coleman, MSM; Associate Director of Research Strategy and Development, Harvard T.H. Chan School of Public Health

I facilitated the ‘Introductions’ on the morning of Day 1 of the workshop on Statistical Methods and Analysis of Environmental Health Data in Mumbai, India, and I can’t express how satisfying it was to see nine months of planning come to life. Once everyone had provided a brief introduction including their name, professional title, institution, and area of research interest, I recall saying, “Well, it sounds like we’ve gathered the right group of researchers together!” That statement held true throughout the week as I watched existing colleagues reconnect, new collaborations form, and treasured friendships develop – all because we came together around the topic of India’s pressing public health challenges related to indoor and outdoor air pollution.


Dr. Francesca Dominici, Professor of Biostatistics and Senior Associate Dean of Research at the Harvard T.H. Chan School of Public Health speaking at the workshop.

This workshop was the product of identifying a unique opportunity, the pooling of ideas and resources, and strategic planning and dedication from the organizers at the Harvard T.H. Chan School of Public Health, SAMSI, and ISI-Kolkata. An incredible amount of care and attention went into the identification and selection of workshop participants – each of us leveraged the networks of our colleagues in the U.S. and India to recommend participants who would get the most out of their investment in the week, while also contributing to the benefit of others. Once we had a tentative roster, we worked with precision to create a program and recruit speakers that would meet the needs of all of those in attendance and seed collaborations. We were able to leverage the Harvard T.H. Chan School of Public Health’s India Research Center in Mumbai and, with an incredible amount of communication across time zones, plan and confirm all of the logistical arrangements for the workshop. Having never planned a workshop, let alone an international event, I found it quite an experience to invest so much of myself in watching the seed of an idea be nurtured along the way and blossom into a wildly successful effort!


Workshop attendees touring Mumbai.

In my role at the Harvard Chan School, I rarely get to so closely observe learning and research in action. It was such a gift to observe the lectures – watching researchers (spanning from students to professors) engage and learn from each other. I was amazed by how quickly the working groups began their collaborative efforts and was in awe of how much they were able to accomplish in just a few days – again, honoring the fact that we were all in the right place at the right time.


Dr. Prabhakaran Dorairaj of Public Health Foundation of India (PHFI) speaking at the workshop.

It’s my nature to set high expectations for projects I engage in, and having never done this before, I wanted it to be perfect. I can say with great confidence based on my own experience and from the feedback we received, that we exceeded our expectations in Mumbai. I’m deeply grateful for all of the contributions from the organizers, speakers, and participants. This wouldn’t have been a success without the engagement from all of those who attended. Thank you all for being part of such an incredibly rewarding experience!

Teamwork and Collegiality Key to Success of the SAMSI-SAVI Workshop

The following was written by Amrutasri Nori-Sarma, from Yale University.

“Coming together is a beginning; keeping together is progress; working together is success.” – Henry Ford

As a PhD candidate at Yale University’s School of Forestry and Environmental Studies, I spend much of my year designing and implementing my research projects in some of the most sensitive communities in urban India. Through the course of my fieldwork and data collection, I have learned to rely on the expertise of local community members if I want to achieve my research goals. These relationships can take a significant amount of time to build and nurture to a fruitful collaboration stage, which is why I was pleasantly surprised by how quickly the teamwork and collegiality came together in the first week of June, at the SAMSI-SAVI workshop in Mumbai.

Against the backdrop of the sweltering Mumbai summer, the workshop on Statistical Methods and Analysis of Environmental Health Data was an oasis in more ways than one. Leading participants from Indian and U.S. institutions came together for this inaugural workshop at the brand new Harvard centre in Mumbai to discuss cutting-edge methods in the statistical analysis of environmental health data.


Dr. Kalpana Balakrishnan speaking at the workshop.

For me, the best part about the workshop was the balance between methods-based talks (from Prof. Francesca Dominici and Prof. Donna Spiegelman and my own adviser Prof. Michelle Bell, among others) and summaries of the ongoing work in India (from Prof. Kalpana Balakrishnan, senior scientists in a variety of departments at the Public Health Foundation of India, and Dr. Mohan Thanikachalam). The interspersed talks provided a well-rounded picture of the ongoing work in India, as well as the critical research gaps that remain to be filled. This environment was further enhanced by the working group discussions around specific data sets that have been collected by our colleagues in India, which they shared with the groups for discussion and analysis.


The group at dinner.

Midway through the workshop, attendees were invited by Dr. Swati Piramal to join her for a conference dinner at the Piramal tower. The collaborative discussions continued in the ballroom over dinner and drinks, surrounded by Dr. Piramal’s beautiful art collection. I was able to use this dinner to catch up with my research collaborator Dr. Prakash Gupta, head of the Healis-Seksaria Institute, who is one of the pioneering health data scientists in India working with a cohort that he has been building for 20+ years. I’m excited about the possibility of other similar Indo-US collaborations, which might have their origins at this workshop…

I’ll be returning to India in September to continue my research, and will look forward to the opportunity to reconnect with other workshop participants during my trip!

What Does It Mean to be a Woman in Mathematics?

The following blog post was written by Jessica Matthews, Cooperative Institute for Climate and Satellites (CICS-NC).


Workshop for Women in Mathematics held April 6-8, 2016.

When offered the invitation to speak at SAMSI’s Opportunities Workshop for Women in Math Sciences, I gladly accepted. When it came time to actually prepare the presentation, I realized that I had never attended, let alone presented at, this type of workshop before. I am well versed in putting together a scientific presentation, but this was different. So I myself was faced with the opportunity to consider what it meant to be a woman in mathematics. I had the opening talk time slot, which inherently carries with it the pressure of setting the tone for the entire event. I chose to draw from my personal experiences and to discuss career possibilities beyond the classroom, skill sets I have found necessary (beyond math), and a few key challenges faced by women in our field. A spirited discussion regarding the pay gap and the importance of negotiation ensued. I enjoyed the free-flowing discussion, and felt that this open and welcoming atmosphere was present for the rest of our time together.

Throughout the two and a half days of the workshop, we had the privilege of hearing from a number of women who have successful careers in academia, industry, and government. They shared their lessons learned, fielded questions, and led discussions about career opportunities and challenges experienced. I cannot possibly capture a comprehensive account of all the great talks and conversations that took place in this workshop, so I provide merely a few personal highlights.

Amanda Golbeck (R) talking to a participant of the workshop.

Amanda Golbeck introduced the concept of viewing one’s career path as a jungle-gym rather than a ladder. We tend to have the ingrained view of the traditional (and linear) career path, while in reality, to maintain a healthy life–work balance, flexibility is required.  Another grain of wisdom she offered is that being a strong leader is important, but being a valuable team member is paramount. I think this is often forgotten in our power-hungry society, but the truth is that more can be accomplished via cooperation and we should value the cultivation of teamwork skills.

L-R: Ulrica Wilson, Lea Jenkins and Amanda Golbeck.

Drawing on her experiences at a historically black university, Ulrica Wilson offered a great explanation as to why having workshops such as this one is not only relevant, but important for increasing and maintaining diversity. When we take the time to create this space, we are able to stop focusing on what makes us different and just focus on the math—which is really what we were all drawn to when we chose this pursuit in the first place!

Marie Davidian gave a fascinating overview of notable women in the mathematical sciences, both in the past and the present. I was captivated with the story of the trailblazer Gertrude Cox, founding head of the (then-named) Department of Experimental Statistics at NCSU in 1941. Her recommendation for the position came in the way of a footnote appended to a letter containing a list of recommended male peers: “Of course if you would consider a woman for this position, I would recommend Gertrude Cox of my staff.” This truly puts into perspective how far the community has come with regard to gender equality.

The workshop attendees were energetic and engaged, which made the panel-led discussions and breakout sessions (not to mention breaks) both stimulating and fun. The participants were largely graduate students and early career scientists, who had plenty of thoughtful questions for the expert representatives from academia, industry, and government. Even though I may have been cast as one of the experts, I found that I learned a lot and left the workshop with a to-do list of actions I am interested in taking. In particular: joining a mentor network, engaging more in professional society events, and advocating for family leave benefits.

I am glad to have had this opportunity to consider the challenges, and solutions to those challenges, faced by women and minorities in the mathematical sciences. I’d like to thank SAMSI for hosting this event and allowing us to gather and reflect on both the progress that has been made, and the issues that remain. It is only through this type of directed intention that we may continue to move towards equality.

Statistical Methods and Analysis of Environmental Health Data

The following was written by SukhDev Mishra, Ph.D., Division of Bio-Statistics, National Institute of Occupational Health, Indian Council of Medical Research, Ahmedabad (India)


Statistical Methods and Analysis of Environmental Health Data Workshop group.

I was fortunate to attend the SAMSI workshop on Statistical Methods and Analysis of Environmental Health Data last week in Mumbai. It focused on various topics related to the statistical analysis of environmental health data, some of which covered the latest methodological developments in this field, particularly the first day’s opening lecture from Professor Joel Schwartz.

Time series data has proven to be critical in the assessment of the systematic impact of environmental factors on human health. Professor Francesca Dominici, a researcher with significant contributions in this area, was a very dynamic and enthusiastic co-leader for this workshop. She discussed at length the statistical principles and assumptions of multi-site time series analysis, along with careful interpretation of such data. Thanks to technological advances and the availability of regular measurements, time series data can be accessed and easily analyzed with the techniques elaborated by Professor Dominici, which will be integral to the success of my future studies.

Working Group 5 - Gene x Environment Interactions


The Gene x Environment Analysis & Epigenetics lecture given by Professor Bhramar Mukherjee provided very useful information on interaction, additive, and multiplicative models, citing practical applications that she developed in the area of environmental health. Her very creative way of teaching, blended with a great sense of humor, kept us so engaged that we wouldn’t blink for a second.

Spatial statistics is a critical part of environmental health data analysis, so it was helpful to have the basics covered well by Dr. Safraj Shahul Hameed and Dr. Brian Reich. Professor Donna Spiegelman presented a wonderful talk on measurement error, starting from statistical notation and building up to the complete logit function (being a statistician, I always love this part :) ). She put great effort into explaining the regression calibration method for MS/EVS and the associated algorithms. Interesting talk!

Working groups were engaged in different exercises that included working on different problems and real data sets provided by various participants, and coming up with new analyses and interpretations of the data. I worked on Exposure Modelling of Ambient and Household Air Pollution for Acute and Chronic Health Effects. I enjoyed working with my fellow WG colleagues: Kalpana Balakrishnan, Santu Ghosh, Donna Spiegelman, Kevin Lane, Joel Schwartz, Sourangsu Chowdhury, and Poonam Rathi. Fine scientific arguments during the process of analysis were the crux of our exercise; thanks to Joel, Kalpana, Donna and Kevin especially.

This is in no way a comprehensive description of this workshop, just my thoughts. I would also like to record here that I learned from each and every speaker and fellow participant. It was a gathering of great scientific minds and very inquisitive researchers. My understanding is that one of SAMSI’s objectives is to foster a culture of collaborative research among Indo-US researchers in the area of public health, and I could see that coming true as we collectively discussed ideas on how to continue our work in mutual scientific engagement. I hope these efforts result in great scientific endeavors for environmental health priorities in the time to come.


Enjoying afternoon tea.

One of the unique features of this workshop was meticulous planning by the team of organizers, be it scientific contents or overall execution by Professor Richard Smith, Professor Sujit Ghosh, Professor Francesca Dominici, and Ms. Krista Coleman whose scientific management and interaction with participants was very encouraging.

My earlier work experience was mainly in the pharmaceutical industry, as a biostatistician, and I consider myself a beginner in environmental health. This workshop has helped me gain more scientific perspective in this area by leaps and bounds.

Knowledge-sharing exercises of this kind may prove very helpful for researchers in the area of statistics and epidemiology to address India’s most pressing public health needs. Thank you SAMSI, Harvard, ISI-Kolkata and all of the other participating organizations for such a wonderful experience!

 

 

Postdoctoral Fellow Profile – Lucas Mentch


Lucas Mentch, SAMSI Postdoctoral Fellow.

Lucas Mentch was born in Indiana, Pennsylvania, a town in Western Pennsylvania just east of Pittsburgh, and grew up in central Pennsylvania near Harrisburg. While attending high school, he took a statistics course and decided to pursue the subject at Bucknell University in Lewisburg, PA, where he majored in mathematics. He asked his professors how to pursue a career in statistics, and many of them told him it was good to have a background in math first.

Lucas attended Cornell University for his graduate studies, where he obtained a Masters and PhD in statistics. He got interested in machine learning and has been looking at developing new statistical inference techniques in this context. He looks at big, messy datasets that are difficult to apply traditional statistical models to, and applies a learning algorithm to pick out large-scale patterns. Those algorithms are good for making predictions, but are difficult to use to assess the uncertainty of a prediction or where it comes from. Lucas’ research is trying to bridge the gap between machine learning and traditional statistical analysis.

While Lucas was at Cornell, he started thinking that criminology or forensics was a good area in which to try his new methods. “You’ve got either data from specific crimes or a crime database where you are trying to pick out raw patterns. You might be looking for other specific things, such as a specific time when crimes are committed, or certain areas in a city where crime occurs more often. I wanted to use machine learning to find those larger patterns, but also trying to see which variables are actually making a difference,” Lucas explained.

Lucas was alerted to the program at SAMSI by Len Stefanski, a professor at NC State and also by Benjamin Risk, another postdoc who is at SAMSI this year after finishing his degree at Cornell.  Ben is involved with the Challenges in Computational  Neuroscience program.

Lucas is involved with two working groups. The first is the Bias group. He remarked, “There has not been a lot of attention in the area of bias in the past. It has to do with how much of a forensic examiner’s case-specific knowledge is influencing what they conclude. So, for example, if they know a lot of details about a murder, is that influencing what they say?” The group is working with the Houston Crime Lab to set up blinding procedures where a case manager acts as an intermediary between the police and the analysts. The case manager screens the information before it gets to the analysts to ensure the tests are carried out in an unbiased fashion.

The other working group that Lucas is in is trying to assess the quality of latent pattern evidence. Fingerprints are taken from a crime scene using whatever means are available and then scanned into a system to be imported as images. But the quality of each scanner can be different, just as any piece of computer equipment or camera can be different. Different kinds of scanners distort fingerprints in different ways, and some scanners can produce a crisp image even when the fingerprint itself is very smudged.

“There’s been a lot of work on quality metrics for fingerprints. So you have a fingerprint and someone puts a number on how good the fingerprint is compared to others. One of the things our group is trying to do is to say ‘does it matter what type of scanner you use with the original fingerprint?’  Our group recently got some data and can already see that fingerprints scanned with one type of scanner are almost universally better than those taken with another type of scanner according to most existing quality metrics,” Lucas said. He explained that a good scan of a bad fingerprint can often get a higher score than a good fingerprint scanned with a bad scanner. The group is well into completing this project.

“One great thing about the SAMSI program is that I have been able to meet and interact with people in forensics. Most universities don’t have a Department of Forensics, so it would have been difficult to develop these relationships in a purely academic setting,” noted Lucas.

One of Lucas’ hobbies is to ride motorcycles.

When Lucas has time to himself, he loves to ride motorcycles and watch movies of all genres.

Next year Lucas will be back at the University of Pittsburgh. He took a year of leave to be able to participate in the SAMSI program. He will continue to collaborate with people from his working groups on the projects they have started.

Op-Ed in Post-Gazette: Why Forensic Analysis of Crime Scenes is not as Reliable as you Think

SAMSI was featured in an op-ed piece in the Post-Gazette written by Lucas Mentch, Maria Cuellar, William C. Thompson and Clifford Spiegelman, all of whom are participating in the SAMSI program on forensics this year.

The piece focuses on the Netflix mini-series, “Making a Murderer,” that raised questions about the actions and motives of law enforcement. Read their piece here.

My Experience at the Undergraduate Workshop Focusing on Forensics

The following was written by Briahnna Austin, an undergraduate student from the University of California, Riverside.

Briahnna Austin

Statistics is the interchange and communication of everyday information.

In February 2016, I was fortunate enough to attend my first SAMSI workshop. The topic was forensic science, and I was completely overjoyed and anxious, excited not only for the material I was going to engage with but also for the interesting people I was going to meet and talk with. Coming from an undergraduate biology background, and aspiring to go into graduate-level biostatistics, I have a particular fondness for interdisciplinary fields. I found exactly this kind of interdisciplinary material at SAMSI’s Forensic Science Workshop; the purpose of this workshop was to give insight into how statistics, mathematics, data, and scientific principles amalgamate to form what we call forensic science.

Upon my arrival I met a professor from Duke at the airport; this was one of the most amazing coincidences, since SAMSI has ties with Duke. I took it as a sign that the workshop had something important in store for me, which it did. On the first day of the workshop, I learned about comparative bullet analysis, retail sampling, and latent fingerprinting. The speakers highlighted the importance of decision-making and technique choices. In forensic science, there is a large toolkit to pull from, and this toolkit gets larger as technology grows, so it is our job as the statistician, investigator, or forensic scientist to make responsible and informed selections. During the first day, I was also able to see a forensic science lab; this is where movies and TV shows portray a lot of action going on, but it is different in the real world. Going to the forensic lab gave me a great opportunity to clear up assumptions and see what the real “CSI” does on a daily basis. The director of the crime lab showed my group around the facilities, and I kept hoping to see something scary or something crazy pop out of the wall, but no luck.

Lab workers at the Wake County Crime Lab.

During the next day of the workshop, I learned about the uniqueness fallacy, statistical reliability, and contextual/confirmation bias, as well as a Bayesian model for fingerprint statistics. This gave insight into how important reproducibility of work and professionalism are. In this field, it is essential to keep out biases, and ensuring statistical reliability can help guard against the types of bias we went over. The takeaway from both days was the idea of accountability for your work and passion for the field. Every speaker enjoyed his or her line of work. Their commitment to the field was inspiring and showed firsthand that forensic science is a collaborative effort in which open dialogue and communication are key to success.

Students listening to a lecture.

The last large takeaway I acquired from this workshop was regarding networking. One of my most vivid memories of the SAMSI workshop, besides the awesome food, is talking with the postdocs and the undergraduate students. At the end of the first day I was able to talk to postdocs, who helped steer me in the right direction for my educational future. I am glad SAMSI provided the time to network with the postdocs; they were very friendly and funny. I networked not only with the postdocs but with the students attending the workshop as well. The SAMSI workshop gave me the opportunity to make new friends. Moving forward in my education and career, I will be calling upon others for different aspects of STEM. Looking around the conference room and realizing these students will be the next set of forensic scientists, investigators, statisticians, and researchers, I saw how important it is that we are able to network with one another. I would definitely recommend this workshop to other students, and I encourage students to seek out other SAMSI opportunities as well. Lastly, do not forget to take many pictures; looking back, I realized how scenic Durham is and wish I had more pictures.

Reaching into an Abyss – Challenges in Computational Neuroscience and Graduate School

Students attending the Undergraduate Workshop at SAMSI.

The following was written by Praveen Suthaharan, an undergraduate student from North Carolina State University who recently attended the SAMSI Undergraduate Workshop on Computational Neuroscience.

Continually baffling researchers across the globe, the three pounds of matter that sit in our skulls hold many mysteries that have yet to be uncovered. Brain research, or neuroscience, is on the verge of revolutionizing our world. In the past few years, by taking advantage of advances in computing, many neuroscientists have delved into the brain, trying to unfold its hidden intricacies. I, too, aspire to be part of this rising era of computational neuroscience research.

I’m an undergrad majoring in Statistics and Neurobiology at North Carolina State University. I plan to pursue a PhD in Computational Neuroscience. My exposure to the coursework in Statistics and Neurobiology has made me curious about the areas of study that lie at the intersection of the two fields. This curiosity has led me to steadfastly chase the inevitable question: what IS computational neuroscience? This year’s SAMSI undergraduate workshop has served as a portal for me to explore this question.

It was a Saturday morning and I could see new prospects for my future as I stepped into SAMSI and grabbed my official name tag. My pulse quickened with excitement as I walked into the conference room to join a group of other dedicated and driven prospective scientists. The series of presentations started on a high note as Dr. Ciprian Crainiceanu began his talk with a tutorial on clinical brain imaging. Given the limited time, he provided a fast-paced yet comprehensive lecture on ‘neurohacking’ and on how brain images are coded into computable values for the purpose of monitoring and detecting changes in the brain. His presentation set the tone for our next presenter, Dr. Ana-Maria Staicu, who provided deep insight into the application of an interesting image processing technique (anisotropic diffusion) to multiple sclerosis, a well-known neurological disorder. At this very moment, as the momentum of wanting to think began to fade, I got distracted.

As the aroma of freshly baked bread hit my olfactory senses with a blast of pleasant sensation, I glanced at the time and knew it was lunchtime. As soon as we vacated the conference room, the students were told that a group photo would be taken. We all congregated outside of SAMSI like any group of young, excited individuals – confused, yet composed.

Riding the bus to Duke.

In the blink of an eye, we were all set to board the shuttle to the Center of Neuroimaging at Duke. There, we visited Dzirasa’s lab, where we were given an overview of the research and a tour of the facility. This visit has strengthened my interest in computational neuroscience research, and I look forward to applying to Duke for grad school.

Stephen Mague talks to the students about Dzirasa’s Lab at Duke.

On our way back to SAMSI, the desire to acquire more knowledge grew inside of me: I was eager to learn about the applications of the Fourier transform (FT) within neuroscience, to work interactively with brain data using various programming languages, and to attend the graduate school panel discussion. Benjamin Risk, a postdoc who works at SAMSI, engaged us with a tutorial on image reconstruction using the discrete Fourier transform (DFT). The ability to manipulate images through mathematical approaches was mind-blowing, especially knowing that these approaches have been invaluable to neuroscience research. Following Benjamin’s talk, Sarah Vallélian opened her presentation with a tutorial on computed tomography (CT). She discussed several useful signal processing techniques, including back-projection, filtered back-projection, and the Hilbert transform, and gave us the opportunity to work with CT data using some of these techniques. As much as the other students enjoyed these presentations, I believe the interactive activities (i.e., using R, MATLAB, and Python) were the best part of this workshop, allowing us to fiddle with the data and giving us the initial steps toward computational neuroscience research.

As the panel discussion about graduate studies commenced, I listened closely, absorbing a great deal of useful information from insightful graduate students. I have come to realize that research mirrors an abyss – it’s a never-ending path of glory. This appreciation of mine for research has now become my driving force to pursue graduate school. With that, the first day came to a close with an enticing dinner. The food formed a perfect combination of tastes that left me revitalized and extremely satisfied. SAMSI definitely knows how to treat prospective scientists!

Ezra Miller, Duke, giving a lecture at the workshop.

The next day ended with some more fascinating mathematical and statistical approaches to neuroscience as Dr. Laura Miller and Dr. Ezra Miller took the floor. In particular, Dr. Ezra Miller’s presentation on Topology for Statistical Analysis of Brain Artery Images gave me deeper insight into an interesting mathematical approach to neuroscience. As a matter of fact, his presentation motivated me to immerse myself in topology and its various applications to neuroscience.

With the end of my undergrad years just around the corner, new doors to success have opened with this amazing workshop. Not only did this workshop provide me with a new perspective on my research interests and grad school, but it has also given me the appreciation and audacity to reach into the abyss, knowing that it will lead me on a never-ending path of glory. After all, research, and computational neuroscience research in particular, is an abyss – a bottomless pit filled with incessantly approaching questions that permeate your mind with curiosity about the mysteries of the brain.

SAMSI has organized an incredible workshop that I would not think twice about attending in the future.